Improve results
Test with real prompts, diagnose weak answers, and improve the assistant without guessing.
Use real prompts
The best test questions are the ones real users already ask. Avoid demo prompts that only make the assistant look impressive.
Look for the failure type
- If the facts are wrong or incomplete, the right source is probably missing or outdated.
- If the facts are right but the answer feels wrong, better examples are needed.
- If similar prompts get very different answers, the project is trying to do too much at once.
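One rough way to spot the third failure type is to collect the answers to a few near-identical prompts and measure how much they differ. The sketch below uses plain text similarity from Python's standard library as a stand-in for "very different answers". The function name `consistency_score` is hypothetical, and text overlap is only a crude proxy: two answers can be worded differently and still agree.

```python
from difflib import SequenceMatcher
from itertools import combinations


def consistency_score(answers):
    """Return the mean pairwise text similarity (0.0 to 1.0) of a list of answers.

    Hypothetical helper: a low score suggests that similar prompts are
    producing very different answers and deserve a manual look.
    """
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0  # zero or one answer: nothing to disagree with
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in pairs]
    return sum(ratios) / len(ratios)
```

A low score is a prompt to read the answers side by side. It does not replace that reading.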
Improve in small steps
Change one thing at a time when possible:
- add a missing source
- remove a weak source
- add a better example
- tighten the assistant brief
After each change, retest the same prompts so you can tell what helped.
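The retest step can be sketched as a small script that runs a fixed prompt list before and after a change and reports which answers moved. This is a minimal sketch under assumptions: `ask` stands for whatever function sends a prompt to your assistant and returns its answer as text, and `run_prompts`, `changed_prompts`, and the `runs` folder are hypothetical names chosen for illustration.

```python
import json
from pathlib import Path


def run_prompts(prompts, ask, label, out_dir="runs"):
    """Run every prompt through `ask` and save the answers under `label`.

    `ask` is an assumed callable: it takes a prompt string and returns
    the assistant's answer as a string.
    """
    results = {prompt: ask(prompt) for prompt in prompts}
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    # Keep each run on disk so answers can be compared later.
    (folder / f"{label}.json").write_text(
        json.dumps(results, indent=2), encoding="utf-8"
    )
    return results


def changed_prompts(before, after):
    """Return the prompts whose answers differ between two runs."""
    return [prompt for prompt in before if before[prompt] != after.get(prompt)]
```

Run it once before a change and once after, using the same prompts both times. The prompts returned by `changed_prompts` are the ones to reread by hand; the script only shows that an answer changed, not whether it improved.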
What good progress looks like
The same hard prompts should get better over time, and the assistant should become easier to trust for one clear job.