
AI UX Patterns

5 min read

Frontend

Streaming = progressive disclosure. Skeleton states, not spinners. User reads while it generates.

UX Eng

AI feels different. Design for uncertainty — confidence cues, editability, undo.

TPM

UX for AI isn't 'make it look like ChatGPT.' It's trust, clarity, and escape hatches.


TL;DR

  • Streaming beats "wait and show." Users engage faster. Perceived latency drops.
  • Design for uncertainty: loading states, confidence cues, and "edit / retry / fallback."
  • Never leave users stuck. Every AI output should have an escape hatch.

AI features feel different. They're non-deterministic, sometimes wrong, and often slow. UX has to compensate.

Streaming

Instead of: spinner → full response.

Do: show tokens as they arrive. User starts reading in 1–2 seconds. Total time might be the same, but time to first content matters.

Implementation: Use streaming APIs (OpenAI, Anthropic, etc.). Render incrementally. Handle partial JSON if you're parsing structured output — buffer until complete.
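The "buffer until complete" step can be sketched as a small accumulator: append each streamed chunk and attempt a parse, returning nothing until the JSON is whole. This is a minimal sketch; `createJsonBuffer` is an illustrative helper name, not part of any SDK.

```typescript
// Sketch: buffer streamed chunks, attempt a parse after each one.
// Naive for scalar roots (e.g. the number "123" streamed as "12" + "3"
// would parse early); fine for objects and arrays, which only parse
// once their closing bracket arrives.
function createJsonBuffer() {
  let buffer = '';
  return {
    // Returns the parsed value once the buffer is valid JSON, else null.
    push(chunk: string): unknown {
      buffer += chunk;
      try {
        return JSON.parse(buffer);
      } catch {
        return null; // Still partial; keep buffering.
      }
    },
  };
}
```

Feed each streamed delta into `push`; render a skeleton or typing indicator until it returns a value.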

Loading States

  • Skeleton. For structured output (forms, lists), show placeholder layout. Fills in as content arrives.
  • Typing indicator. "Thinking..." or subtle animation. Not a generic spinner — something that signals "AI is working."
  • Progress for long tasks. "Searching 10,000 documents..." or "Generating summary (step 2/3)." Users tolerate waits when they know what's happening.
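A skeleton state for structured output can be modeled as a fixed list of labeled slots that fill in as the stream delivers them, so the layout never jumps. A minimal sketch; the field names and shape are illustrative, not from any specific framework.

```typescript
// Sketch: merge streamed fields into a fixed skeleton.
// value === null means "render a placeholder shimmer for this row".
type Field = { label: string; value: string | null };

function buildSkeleton(
  labels: string[],
  received: Record<string, string>,
): Field[] {
  return labels.map((label) => ({
    label,
    value: received[label] ?? null, // null until the stream delivers it
  }));
}
```

Re-run `buildSkeleton` on every chunk: rows that have arrived show content, the rest keep their placeholders.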

Confidence Indicators

When the AI might be wrong:

  • Citation. "Based on: [Doc A], [Doc B]." User can verify.
  • Confidence score. "85% match" — use sparingly. Only if you have a real signal.
  • Disclaimer. "Double-check for critical decisions." Simple. Honest.

Don't over-engineer. A single "Verify important details" note often suffices.
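Choosing between these cues can be a simple precedence rule: citations when you have sources, a score only when the signal is strong, otherwise the plain disclaimer. A sketch under stated assumptions: the 0.8 threshold and the type names are illustrative, not recommendations.

```typescript
// Sketch: pick one confidence cue from the signals actually available.
type Cue =
  | { kind: 'citations'; sources: string[] }
  | { kind: 'score'; percent: number }
  | { kind: 'disclaimer'; text: string };

function pickCue(sources: string[], score?: number): Cue {
  if (sources.length > 0) return { kind: 'citations', sources };
  // Only surface a number when the signal is real and strong (threshold
  // here is an illustrative assumption).
  if (score !== undefined && score >= 0.8) {
    return { kind: 'score', percent: Math.round(score * 100) };
  }
  return { kind: 'disclaimer', text: 'Verify important details.' };
}
```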

Fallbacks

  • API timeout. "This is taking longer than usual. [Retry] [Cancel]"
  • Error / rate limit. "Something went wrong. [Try again] or [use traditional search]"
  • Empty or low-confidence answer. "I couldn't find a good answer. Here are manual options: ..."
  • User corrects output. "Thanks. We'll improve." Log the correction for prompt tuning.

Never dead-end. Every state has a next action.

Editability

AI-generated content should be editable. Pre-fill a form; user tweaks. Suggest text; user accepts or modifies. Treat AI as a draft, not a final product. That builds trust and catches errors.
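"AI as a draft" can be made explicit in state: track whether the current text came from the model or the user, and whether it has been accepted. A sketch; the state names are illustrative, not from a specific framework.

```typescript
// Sketch: AI output is a draft until the user edits or accepts it.
type Draft = { text: string; source: 'ai' | 'user'; accepted: boolean };

function aiDraft(text: string): Draft {
  return { text, source: 'ai', accepted: false };
}

function edit(draft: Draft, text: string): Draft {
  // A user edit overrides AI provenance, which matters for logging
  // and for deciding which outputs to show confidence cues on.
  return { ...draft, text, source: 'user' };
}

function accept(draft: Draft): Draft {
  return { ...draft, accepted: true };
}
```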

Retry and Regenerate

"Regenerate" button. Same input, new output. Sometimes the first response is bad; the second is fine. Give users the lever.

// Streaming = time to first content drops. User reads while it generates.
import OpenAI from 'openai';

const openai = new OpenAI();

async function* streamCompletion(prompt: string) {
  const stream = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: prompt }],
    stream: true,
  });
  for await (const chunk of stream) {
    const delta = chunk.choices[0]?.delta?.content;
    if (delta) yield delta;
  }
}
// Render incrementally. Handle partial JSON if parsing structured output.

Quick Check

Your AI feature times out sometimes. What's the right fallback UX?

Do This Next

  1. Audit one AI feature in your product. What happens on load? On error? On "AI got it wrong"?
  2. Add one improvement — streaming, skeleton, or fallback. Measure: does time-to-engagement improve?
  3. Add an escape hatch: retry, cancel, or fallback to a non-AI path. Test both the happy path and the failure path.