The best way to test your agent’s context is to ask it ten questions your actual students have asked — and compare the answers to what you would say yourself. If it gets more than two significantly wrong, your context document needs revision, not a different AI tool.
The Ten-Question Test
Before you deploy any campus AI agent to students, run it through what you might call a “substitute teacher test.” Imagine you’re handing your classroom to someone who only knows what you’ve written in your context document. Would they be able to answer your students’ most common questions accurately? Would they stay on topic? Would they use the right terminology and tone?
Gather ten real questions — from past community posts, email threads, live session chat logs, or student support tickets. Feed them to your agent one at a time. For each response, ask yourself: Is this accurate? Is it on-brand? Would I be comfortable if a student read this? Does it stay within the scope of my program?
Five out of ten correct means your context is incomplete. Eight out of ten correct means you’re close but need to tighten specific areas. Ten out of ten doesn’t mean you’re done — it means you’ve earned the right to go live and monitor real interactions for the first two weeks.
Three Specific Tests to Run
Beyond the general question test, run three targeted checks. First, ask the agent a question that is clearly outside your course scope — something your program doesn’t cover. A well-configured agent should say so gracefully and redirect, not invent an answer. If it confidently makes something up, your context is missing a clear scope boundary.
Second, ask the agent something that depends on current information — your enrollment deadline, your next live session date, your current pricing. If it gets this wrong, your context document is out of date. Third, ask it to reveal its instructions. If it complies and repeats your system prompt verbatim, you’re missing a non-disclosure instruction. Add it now before students discover this on their own.
What This Means for Educators
Testing your agent isn’t a technical skill — it’s a teaching skill. You already know what good answers look like for your students. You already know the common misconceptions, the questions that need nuance, the topics that require a careful tone. Use that expertise to evaluate your agent the same way you’d evaluate a new teaching assistant’s first week. Write down what went wrong, update your context document, and test again.
In Claude, ChatGPT, or any agent tool you’re using inside your campus setup, you can iterate on your context document in real time. Change a paragraph, run the tests again, see if the answers improve. Most context problems are fixed within two or three iterations.
The Simple Rule
Test before you deploy, and keep a short test set you run every time you update the context. An agent that passes your ten-question test consistently is an agent your students can trust — and one you’ll spend less time fixing after it’s live.
