SpheriCity: Designing Trustworthy Conversational AI for Sustainability Decision Support

This paper introduces SpheriCity, a provenance-first conversational AI prototype designed to support trustworthy sensemaking and cross-document synthesis in sustainability decision-making. Feedback from experts highlights the critical role of transparent sourcing and workflow alignment in building user trust.

Original authors: Ahmed Qayyum, Madison Werner, Kathryn Youngblood, Jenna R. Jambeck, Tahiya Chowdhury

Published 2026-06-15

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written or endorsed by the authors. For technical accuracy, refer to the original paper.

The Big Problem: Too Many Books, Too Little Time

Imagine you are a city planner trying to figure out how to stop plastic waste. You have a library of 36 different reports, each running about 65 to 100 pages. These reports are full of data, charts, and policies from different cities around the world.

To answer just one question—like "How did City A handle plastic bags compared to City B?"—you would have to manually flip through hundreds of pages, cross-reference numbers, and try to connect dots that are scattered across different documents. It's like trying to build a puzzle where the pieces are hidden inside 36 different boxes, and you have to find them all by hand. It's slow, tiring, and easy to miss important connections.

The Proposed Solution: A "Super Librarian" AI

The researchers built a tool called SpheriCity. Think of it as a super-smart, super-fast librarian who has read all 36 of those thick reports and can chat with you about them.

You can ask it questions in plain English, like "What are the best ways to reduce plastic litter in coastal cities?" The AI scans the reports, finds the answers, and writes a summary for you.

But here's the catch: AI chatbots have often behaved like confident but forgetful students. They might give you a smooth-sounding answer that was made up (a "hallucination"), or one that didn't tell you where the information came from. In sustainability, where decisions affect the environment and public policy, you can't trust an answer if you can't check the homework.

How SpheriCity is Different: The "Receipt" Approach

The researchers realized that for experts (the city planners and scientists), trust is more important than speed. So, they designed SpheriCity with three special features:

  1. The "Receipt" (Provenance): Every time the AI makes a claim, it attaches a digital "receipt." It doesn't just say "City X did this"; it says "City X did this, according to Page 42 of the 2023 Report." This lets the expert instantly click and verify the source.
  2. The "Bullet Point" Summary (Structure): Instead of writing a long, confusing essay, the AI organizes answers into clear bullet points (like a shopping list). This makes it easy to scan, compare, and find missing information quickly.
  3. The "Scaffolding" (Templates): The system gives experts pre-made question templates. Instead of struggling to phrase a complex question, they can select a template like "Compare policies across cities," which helps the AI understand exactly what the expert needs.
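To make the "receipt" idea concrete, here is a minimal Python sketch of how a claim could carry its sources and be rendered as a bullet point. It is an illustration only, not SpheriCity's actual implementation: the names `Receipt`, `Claim`, and `render_answer` are hypothetical, and the sample claims reuse the placeholder "City X" example from above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receipt:
    """Where a claim came from: a (document, page) pair. Hypothetical structure."""
    document: str
    page: int


@dataclass(frozen=True)
class Claim:
    """One bullet point of an answer plus the receipts that back it."""
    text: str
    receipts: tuple[Receipt, ...] = ()

    @property
    def is_verifiable(self) -> bool:
        # A claim can only be checked if it has at least one receipt.
        return len(self.receipts) > 0


def render_answer(claims: list[Claim]) -> str:
    """Format claims as bullets, each followed by its sources or a warning."""
    lines = []
    for claim in claims:
        if claim.is_verifiable:
            sources = "; ".join(f"{r.document}, p. {r.page}" for r in claim.receipts)
            lines.append(f"- {claim.text} [{sources}]")
        else:
            lines.append(f"- {claim.text} [no source found: verify manually]")
    return "\n".join(lines)


# Placeholder claims for illustration, not content from the real reports.
answer = [
    Claim("City X did this.", (Receipt("2023 Report", 42),)),
    Claim("City Y did that."),
]
print(render_answer(answer))
```

Flagging the unsourced claim instead of hiding it is a design choice made for this sketch: it keeps the gap visible, so the expert knows exactly which bullet still needs checking.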

The Experiment: Testing with Real Experts

The team didn't just guess if this worked; they tested it with six real sustainability experts.

They gave the experts 13 different questions (like comparing recycling in India vs. the US) and asked them to rate the AI's answers. They tested three different "brain" strategies for the AI:

  • Vector Search: Looking for passages whose meaning is similar to the question, even if the exact words differ.
  • Graph Search: Looking for how ideas are connected (like a map of relationships).
  • Hybrid: A mix of both.
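The difference between the three strategies can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the retrieval pipeline from the paper: the passage names, the three-number "embeddings," the link graph, and the functions `vector_search`, `graph_search`, and `hybrid_search` are all invented for the example. Here the hybrid simply takes the closest passages by similarity and then adds whatever they are linked to, which is only one of several ways the two could be combined.

```python
import math

# Toy data, invented for illustration: passage id -> embedding vector.
EMBEDDINGS = {
    "bag_ban": [0.9, 0.1, 0.0],
    "bag_fee": [0.8, 0.3, 0.1],
    "recycling_rates": [0.1, 0.9, 0.2],
    "coastal_litter": [0.2, 0.2, 0.9],
}

# Toy graph: passage id -> ids of passages it is explicitly linked to.
GRAPH = {
    "bag_ban": ["coastal_litter"],
    "bag_fee": ["bag_ban"],
    "recycling_rates": [],
    "coastal_litter": ["bag_ban"],
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec, k=2):
    """Return the k passages whose embeddings are closest to the query."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda pid: cosine(query_vec, EMBEDDINGS[pid]),
        reverse=True,
    )
    return ranked[:k]


def graph_search(seed_ids):
    """Return passages linked to the seeds that are not seeds themselves."""
    found = []
    for pid in seed_ids:
        for neighbor in GRAPH.get(pid, []):
            if neighbor not in found and neighbor not in seed_ids:
                found.append(neighbor)
    return found


def hybrid_search(query_vec, k=2):
    """Vector hits first, then passages connected to those hits."""
    seeds = vector_search(query_vec, k)
    return seeds + graph_search(seeds)


query = [1.0, 0.2, 0.0]  # stands in for an embedded question about plastic bags
print(vector_search(query))  # ['bag_ban', 'bag_fee']
print(hybrid_search(query))  # ['bag_ban', 'bag_fee', 'coastal_litter']
```

In this toy setup, the coastal litter passage is not close to the question by similarity alone, but the hybrid still surfaces it because it is linked to the bag ban passage.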

The experts rated the answers on things like: Is this relevant? Is it accurate? Is it neutral? Is it deep enough?
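One way to compare strategies on these criteria is to average each criterion per strategy, as in the Python sketch below. Everything in it is an assumption made for illustration: the 1 to 5 scale, the criterion names, the `mean_scores` helper, and above all the numbers, which are placeholders and not results from the paper.

```python
from collections import defaultdict
from statistics import mean

# Criterion names paraphrased from the questions above (assumed labels).
CRITERIA = ("relevance", "accuracy", "neutrality", "depth")

# Placeholder ratings on an assumed 1-5 scale. NOT data from the paper.
ratings = [
    {"strategy": "vector", "relevance": 4, "accuracy": 3, "neutrality": 5, "depth": 2},
    {"strategy": "vector", "relevance": 2, "accuracy": 5, "neutrality": 3, "depth": 4},
    {"strategy": "hybrid", "relevance": 5, "accuracy": 4, "neutrality": 4, "depth": 3},
]


def mean_scores(rows):
    """Average each criterion separately for every retrieval strategy."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for criterion in CRITERIA:
            grouped[row["strategy"]][criterion].append(row[criterion])
    return {
        strategy: {c: mean(values) for c, values in by_criterion.items()}
        for strategy, by_criterion in grouped.items()
    }


summary = mean_scores(ratings)
for strategy, scores in summary.items():
    print(strategy, scores)
```

Keeping the criteria separate, instead of collapsing them into one overall score, makes it possible to see when an answer is relevant but inaccurate, or accurate but shallow.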

What They Found: What Experts Actually Care About

The results were surprising and very practical. Here is what the experts told them:

  • Trust comes from "Receipts," not "Fluency": The experts didn't care if the AI spoke perfectly or sounded like a human. They cared if they could verify the source. If an answer was smooth but had no page numbers, the experts immediately distrusted it. If it had a specific page citation, they trusted it, even if the writing was simple.
  • They want a "Thinking Partner," not a "Magic 8-Ball": The experts didn't want the AI to give them a single, final answer. They wanted the AI to help them think. They preferred answers that showed different angles, trade-offs, and evidence so they could do their own reasoning. They treated the AI as a starting point for exploration, not the final verdict.
  • Small Mistakes Break Trust: If the AI got a small detail wrong—like calling a state a city, or mixing up two different cities—it destroyed the expert's confidence in the entire answer. In high-stakes work, one error makes the whole thing look unreliable.
  • More Info isn't Always Better: Sometimes, the AI gave very long, detailed answers that sounded impressive. But if those answers included a tiny bit of made-up context or over-generalized, the experts rated them lower than shorter, safer answers. They preferred "partial but true" over "complete but risky."

The Takeaway

The paper concludes that to make AI useful for serious work like saving the planet, we need to stop trying to make AI sound like a perfect human. Instead, we should design AI to be a transparent, accountable assistant.

Think of it this way: You don't want a lawyer who gives you a confident speech but hides their evidence. You want a lawyer who hands you the file, points to the exact page, and says, "Here is the evidence. Now, let's decide what to do." That is what SpheriCity aims to be for sustainability experts.
