Case Study #3
A feature that flags when an AI is stating established fact versus drawing its own conclusions. Giving users a signal to pause, not just a response to accept.
Project
Concept Exploration
My Role
Solo UX Designer
Duration
2-Day Sprint
Methods & Tools
Secondary Research - AI Prototyping (Claude Code) - Figma
00 - The Process
How this came together
1
Asked a question to Claude
Prompted an AI about human psychology and the principle of least effort.
2
Noticed something was off
Pushed back on a specific claim. The AI admitted it had presented an inference as fact, with no indication that was the case.
3
Researched the problem
Used Claude and Perplexity to surface and validate secondary research: automation bias studies, hallucination cases, and overtrust literature. Cross-checked findings manually.
4
Designed a solution
Explored five approaches, eliminated four. Landed on inline tinting with a transparency drawer: low friction, accessible, and user-controlled.
5
Built the prototype
Used Claude Code to assist with prototyping: generating and iterating on HTML, CSS, and Figma frames through a conversation-driven design loop.
01 - Research
I asked an AI about human psychology. Then to check its own work.
47% of AI-generated citations
submitted by students had incorrect titles, dates, or authors, or were fully fabricated
University of Mississippi, 2024
= credible
Users rated AI-generated content just as credible as human-generated content, despite no difference in actual accuracy
Huschens et al., 2023
Mata v. Avianca
A lawyer filed a brief in US federal court that cited nonexistent cases generated by ChatGPT
2023 Legal Case
The problem is the design, not the model. Every choice in a standard AI chat interface signals "this is reliable."
Nothing signals "you might want to verify this one."
02 - The Idea
What if the interface just told you when it was guessing?

The feature is simple in principle: a toggle that shows users, inline, which parts of a response are established fact and which are the AI's own reasoning, without interrupting the reading experience for people who don't want it.
Best guess
Ideas the AI thinks are probably true but hasn't confirmed from direct evidence.
AI's thinking
Logical inferences the AI drew from established concepts.
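To make that concrete, here's a minimal markup sketch of how a response might encode the two categories. The data-flag values, class names, and example sentences are illustrative assumptions, not the prototype's actual schema.

```html
<!-- Hypothetical markup: unflagged text is established fact; flagged
     spans carry one of the two categories defined above. The title
     attribute stands in for the hover explanation; a shipped version
     would also need a tappable tooltip. -->
<article class="ai-response" data-transparency="on">
  <p>
    The principle of least effort is well documented in
    information-seeking research.
    <span class="flag" data-flag="best-guess"
          title="Best guess: probably true, not confirmed from direct evidence">
      It likely extends to how people treat AI answers.
    </span>
    <span class="flag" data-flag="reasoning"
          title="AI's thinking: inference drawn from established concepts">
      If verification costs effort, skipping it becomes the default.
    </span>
  </p>
</article>
```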
Layer 1 — Inline tinting. A subtle background color on flagged segments, with a bottom border for accessibility. Hover or tap to see why it was flagged.
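A rough CSS sketch of Layer 1, assuming the markup above. The tint only applies while transparency mode is on, and the bottom border duplicates the signal for readers who can't rely on color alone. The color values are placeholders.

```css
/* Layer 1: subtle tint plus a non-color cue. Both selectors are
   gated on the response's data-transparency attribute, so turning
   the mode off removes the tinting entirely. */
[data-transparency="on"] .flag[data-flag="best-guess"] {
  background: rgba(255, 196, 0, 0.15);      /* placeholder amber tint */
  border-bottom: 2px dotted rgba(180, 130, 0, 0.8);
}
[data-transparency="on"] .flag[data-flag="reasoning"] {
  background: rgba(0, 120, 255, 0.10);      /* placeholder blue tint */
  border-bottom: 2px dotted rgba(0, 90, 200, 0.8);
}
```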
Layer 2 — Transparency drawer. Slides in from the right. Shows a breakdown of the current response, including how many segments are best guesses versus AI reasoning, in plain language.
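And a sketch of the drawer's slide-in behavior, again with assumed class names: parked off-canvas by default, translated into view when opened.

```css
/* Layer 2: off-canvas drawer. Transform-based animation keeps the
   slide-in off the layout path; the .open class is toggled by the
   drawer's trigger. */
.transparency-drawer {
  position: fixed;
  top: 0;
  right: 0;
  width: 320px;
  height: 100vh;
  transform: translateX(100%);   /* parked off-screen to the right */
  transition: transform 200ms ease-out;
}
.transparency-drawer.open {
  transform: translateX(0);      /* slides into view */
}
```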
Pills persist when the mode is off. A passive signal that transparency information exists, without forcing it on every user.
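Finally, a sketch of the toggle wiring. The element ID and attribute names are hypothetical; the point is that only the tint styles are gated on the attribute, so the pills, styled outside those rules, persist either way.

```html
<!-- Hypothetical toggle: flips the data-transparency attribute that
     the Layer 1 styles key off. Pills are styled unconditionally, so
     they remain visible when the mode is off. -->
<button id="transparency-toggle" aria-pressed="true">Transparency</button>
<script>
  const toggle = document.getElementById('transparency-toggle');
  toggle.addEventListener('click', () => {
    const response = document.querySelector('.ai-response');
    const on = response.dataset.transparency === 'on';
    response.dataset.transparency = on ? 'off' : 'on';
    toggle.setAttribute('aria-pressed', String(!on));
  });
</script>
```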
03 - The Hard Calls
None of these were obvious
Each of these had a reasonable alternative. Here's why I went the way I did.
04 - A Question I Had to Answer
What if the classification isn't perfectly accurate?
The feature doesn't need to be perfectly accurate to be useful. Visible marking of uncertain text makes users slow down and read more carefully. That pause alone reduces blind acceptance.
Nutritional labels don't need to be perfectly accurate to improve eating decisions. Safety warnings don't need to prevent 100% of accidents to reduce harm. The presence of a signal, even an imperfect one, shifts the cognitive default from passive acceptance to active evaluation.
05 - What I Don't Know
Three things I'd need to test.
The concept is grounded in real evidence. But evidence isn't the same as validation. Three things need testing before this ships.
1
Is the classification accurate enough?
Classification will produce errors. The question is how often and in which direction. A mislabeled factual claim may do more harm than no label at all.
2
Does the language land?
5-second comprehension test. Risk: users interpret "best guess" as universal uncertainty, eroding trust in accurate responses too.
3
Does transparency mode actually shift behavior?
A/B study measuring follow-up search behavior. Risk: the feature creates awareness without changing behavior. Cosmetic transparency, not behavioral transparency.


