Back when I worked on conventional products, what I dreaded most was the boss bringing up user experience.

“Is this process smooth?” “Is this response natural?” “Is this feedback too stiff?”

These questions all sound reasonable, but it wasn’t until I actually worked on AI products that I realized: in large-model products, “experience” is simply not a reliable yardstick.

Never mind the experience; the AI’s output itself is not stable.

Nowadays, bosses who understand AI don’t ask “Is the experience good?” at all.

Instead, they ask:

“What’s the pass rate of your model?”


What is the pass rate? It measures how often the model gets things right

If you create an AI customer service, the boss won’t ask you, “Is this response natural?”

He will ask:

  • “If a user asks 100 questions, how many must the AI answer correctly, at minimum?”
  • “Are there any incorrect answers, and if so, is there any risk?”
  • “Can we guarantee that it won’t make things up at least 90% of the time?”

You need to be able to answer:

“We have an evaluation dataset. We ran 100 test cases, and the current model passes 87 of them.”

✅ If you can answer like this, you can go live.

❌ If you can’t answer, it means you haven’t evaluated the product’s capabilities at all.
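
To make the arithmetic behind that answer concrete, here is a minimal sketch of a pass rate calculation. The `pass_rate` helper and the list of results are hypothetical; the 87-out-of-100 figures simply mirror the example above and are not real measurements.

```python
def pass_rate(results: list[bool]) -> float:
    """Return the fraction of test cases that passed (0.0 to 1.0)."""
    if not results:
        raise ValueError("cannot compute a pass rate over zero test cases")
    return sum(results) / len(results)


# Hypothetical run mirroring the numbers in the text: 87 passes out of 100 cases.
results = [True] * 87 + [False] * 13
rate = pass_rate(results)
print(f"{sum(results)}/{len(results)} passed ({rate:.0%})")
```

The point is less the division than the fact that every case has a recorded pass or fail, so the number can be reproduced on demand.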


You think it’s an experience issue, but the boss is looking at delivery capability

When I used to work on product revamps, user experience was always the top priority.

However, large model products are not experience-oriented; they are metric-oriented.

You need to give the boss a basis for judgment, and what he looks at is:

  • Can you measure how well it works?
  • Can you locate the problem?
  • Can this model “consistently get things right”?

No matter how good the experience is, once an error occurs, the entire chain collapses.


In all fairness, experience is too elusive to base decisions on

Let me give an example.

Previously, we tested a Q&A system, and the experience seemed quite good, with responses that were quite “human-like”.

But when we later ran it against the evaluation set, we found that it missed the mark on the most critical questions.

On topics like return policies, customer service procedures, and paid subscription instructions, the AI just beat around the bush.

This won’t do, because users can accept “not sounding very natural”,

but they cannot accept “getting things wrong”.
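
One way to surface this kind of gap is to break the pass rate down by question category instead of reporting a single number. The sketch below assumes each test case is tagged with a category; the category names and results are made-up illustrative data, not the figures from the system described above.

```python
from collections import defaultdict

# Hypothetical evaluation results as (category, passed) pairs. Illustrative only.
results = [
    ("small_talk", True), ("small_talk", True), ("small_talk", True),
    ("return_policy", False), ("return_policy", False),
    ("subscription", False), ("subscription", True),
]


def pass_rate_by_category(results):
    """Group results by category and return {category: pass rate}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in results:
        totals[category] += 1
        passes[category] += int(passed)
    return {category: passes[category] / totals[category] for category in totals}


by_category = pass_rate_by_category(results)
# Print the weakest categories first so critical failures are hard to miss.
for category, rate in sorted(by_category.items(), key=lambda kv: kv[1]):
    print(f"{category:15s} {rate:.0%}")
```

A system can sound pleasant in casual exchanges and still score zero on the category that matters most, which is exactly what a per-category view makes visible.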


The truly reliable basis for decision-making is: metrics as the foundation, and user experience as a bonus

When developing AI products, the evaluation priorities are:

  1. First, have a test set and define what the expected output is for each scenario
  2. Then run the model against the test set and measure the pass rate
  3. Only then refine the experience: smoother, more user-friendly, and more fluid
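
The first two steps can be sketched as a tiny evaluation harness. Everything here is an assumption made for illustration: the test cases, the `stub_model` function standing in for a real model call, and the pass criterion (the answer must contain an expected phrase). A real product would likely need stricter or more nuanced checks.

```python
from typing import Callable

# Hypothetical test set: each case pairs a user question with a phrase the
# answer must contain. Real criteria would come from your own product rules.
TEST_SET = [
    {"question": "How long do I have to return an item?", "expected": "30 days"},
    {"question": "How do I cancel my subscription?", "expected": "account settings"},
    {"question": "Can I talk to a human agent?", "expected": "live agent"},
]


def stub_model(question: str) -> str:
    """Stand-in for a real model call; returns canned answers for the demo."""
    canned = {
        "How long do I have to return an item?": "You can return items within 30 days.",
        "How do I cancel my subscription?": "Thanks for asking! We value your feedback.",
        "Can I talk to a human agent?": "Yes, type 'agent' to reach a live agent.",
    }
    return canned.get(question, "")


def evaluate(model: Callable[[str], str], test_set: list[dict]) -> tuple[float, list[dict]]:
    """Run every case through the model; return the pass rate and the failed cases."""
    if not test_set:
        raise ValueError("test set is empty")
    failures = []
    for case in test_set:
        answer = model(case["question"])
        # Assumed pass criterion: case-insensitive substring match.
        if case["expected"].lower() not in answer.lower():
            failures.append({**case, "answer": answer})
    rate = (len(test_set) - len(failures)) / len(test_set)
    return rate, failures


rate, failures = evaluate(stub_model, TEST_SET)
print(f"pass rate: {rate:.0%}")
for failure in failures:
    print("FAILED:", failure["question"])
```

Returning the failed cases alongside the rate is a deliberate choice: the number tells you whether you can ship, and the failures tell you where to look.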

The boss is looking at whether you have “hard criteria”,

not whether the product has a “polished feel”.


People who still blindly believe that “AI should focus on user experience” most likely have never actually launched a product.

Because once you go live, ship a version, and start a canary release, you’ll know:

Experience adds polish, but metrics save your life.

In a world where model output is never certain, the only thing that brings peace of mind is being able to say: “Our current pass rate is 85%, the target is 92%, and this update has increased it by 3 percentage points.”

That is what people who understand models trust.
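
A statement like that is easy to generate from three numbers. The sketch below uses a hypothetical `pass_rate_report` helper; the previous pass rate of 82% is not stated in the text and is only implied by “85% after a 3-point increase”.

```python
def pass_rate_report(previous: float, current: float, target: float) -> str:
    """Summarize a pass rate against the last release and the target.

    All arguments are fractions between 0 and 1. Differences are reported in
    percentage points, not relative percent.
    """
    delta_pts = (current - previous) * 100
    gap_pts = (target - current) * 100
    return (
        f"Current pass rate: {current:.0%}, target: {target:.0%}, "
        f"change this update: {delta_pts:+.0f} pts, gap to target: {gap_pts:.0f} pts"
    )


# previous=0.82 is an assumption implied by the 3-point increase in the text.
report = pass_rate_report(previous=0.82, current=0.85, target=0.92)
print(report)
```

Reporting in percentage points keeps the statement unambiguous: going from 82% to 85% is a 3-point gain, not a “3% improvement”.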

Stop trying to impress your boss with “it feels okay”; that’s not reliable.


By Echo

Hi, I'm Echo — creator of the blog AI Blue-Haired Witch. I write daily about AI product breakdowns, practical toolchains, and global market trends. My focus: analyzing viral AI apps, replicating product logic, and exploring actionable paths for indie builders and small teams. Expect hands-on testing, clear perspectives, and product insights grounded in real use. My goal is to help you build with AI — whether it's launching a tool, growing your business, or making your first dollar online. Think of this blog as your "AI tools intel hub for going global", and think of me as your "blue-haired, slightly obsessive product witch" — here to guide you through the chaos.
