Field Note
"Don't Make Any Mistakes" Is the Mistake
Telling an AI system not to make mistakes is categorically similar to telling a new hire not to make mistakes. It names the desired absence of failure. It does not create the conditions for reliable execution.
There is a certain kind of instruction that feels serious because it names the consequence everyone fears: don't make any mistakes.
This is common in human organizations. It is also now common in AI work. We tell the model not to hallucinate, not to invent facts, not to break the codebase, not to miss edge cases, not to confidently walk off a cliff while describing the scenery in fluent prose. Fine. Put it in the prompt. I do some version of it too. But we should be clear about what that instruction is doing.
The New Hire Problem
Telling an AI system "don't make mistakes" is categorically similar to telling a new hire "don't make mistakes." It may express a preference. It may raise the emotional temperature. It may even create a little useful caution. What it does not do is transfer judgment, context, standards, or operating doctrine.
I learned a version of this as a second lieutenant in the Army. The Army does not hand a brand-new lieutenant a platoon and simply hope that youth, six months of officer training, and a can-do attitude will carry the day. The lieutenant is paired with a platoon sergeant for a reason. The lieutenant carries the commander's intent down to the tactical level. The platoon sergeant carries institutional memory, unit context, practical judgment, soldier knowledge, and the quiet archive of "we tried that once and it ended badly."
This pairing was not a bureaucratic choice, but a mechanism designed for survival. A lieutenant has to understand what the commander wants, what the mission requires, what the terrain allows, what the unit can actually do, and what tradeoffs are acceptable when the plan meets friction. The platoon sergeant helps keep that interpretation connected to reality. Command is not just authority. Command is interpretation under constraint.
A Failure of Leadership
So when someone says "don't make any mistakes," the problem is not that the sentence is false. Of course mistakes are bad. The problem is that the sentence does not do the work of leadership. It does not explain intent. It does not define priorities. It does not name constraints. It does not mark what can be traded and what cannot. It does not specify what evidence should be checked, what failure modes matter most, or when the subordinate should stop and escalate.
It pushes the hardest part of command downward and then pretends that intent has been communicated. The subordinate is left to infer the mission from vibes, fear, and whatever scraps of context happen to be available. In a human organization, that produces hesitation, overfitting to the boss, hidden errors, and a lot of theater around looking careful. In AI work, it produces the same thing with better grammar.
The model will often comply with the emotional shape of the instruction. It will sound more careful. It may add caveats. It may say it verified something. It may even improve a little at the margin. But sounding careful is not the same as being governed.
D.E.V.
That is why I have started to think about AI-paired software work through a D.E.V. frame: Direct. Execute. Verify.
Direct is not just writing a better prompt. It is a separate agentic function that works through dialectic with the human. Its job is to understand intent, interrogate ambiguity, surface tradeoffs, and translate the human's purpose into specific orders with constraints, standards, assumptions, and stop conditions. This is where "what are we actually trying to do?" becomes "here is the order that can be executed."
Execute is a different agentic function. Its job is to carry out the order. It writes the code, drafts the artifact, searches the repo, updates the page, generates the migration, or performs the bounded task. It should not have to rediscover strategic intent from scratch while also holding implementation detail in its head. It should execute against a shaped order, not improvise from a fog bank.
Verify is a third agentic function. Its job is not to admire the work. Its job is to inspect whether the execution matched the order. It checks facts, tests behavior, looks for regression, compares the output against the actual intent, and notices where the executing agent satisfied the literal instruction while missing the purpose.
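To make the separation concrete, here is a minimal sketch of the frame as three independent calls, each with its own narrow context. Everything in it is illustrative: the `call_model` and `parse_order` helpers, the `Order` fields, and the prompts are assumptions for the sake of the sketch, not a prescribed implementation.

```python
from dataclasses import dataclass

@dataclass
class Order:
    """What Direct hands to Execute: a shaped order, not a raw prompt."""
    intent: str                 # what we are actually trying to do
    constraints: list[str]      # what cannot be traded away
    standards: list[str]        # what acceptable work looks like
    assumptions: list[str]      # what Direct currently believes to be true
    stop_conditions: list[str]  # when Execute must halt and escalate

def call_model(system: str, user: str) -> str:
    """Hypothetical wrapper around whatever model API is in use."""
    raise NotImplementedError

def parse_order(raw: str) -> Order:
    """Hypothetical parser that maps a structured reply onto Order."""
    raise NotImplementedError

def direct(goal: str, context: str) -> Order:
    # Dialectic with the human happens here; this sketch compresses it
    # into one call that turns purpose into an executable order.
    raw = call_model(
        system="Interrogate ambiguity, surface tradeoffs, emit a complete order.",
        user=f"Goal: {goal}\nContext: {context}",
    )
    return parse_order(raw)

def execute(order: Order) -> str:
    # Execute sees the shaped order, not the strategic conversation behind it.
    return call_model(
        system="Carry out the order. Do not re-litigate intent.",
        user=(
            f"Intent: {order.intent}\nConstraints: {order.constraints}\n"
            f"Standards: {order.standards}\nStop if: {order.stop_conditions}"
        ),
    )

def verify(order: Order, artifact: str) -> str:
    # Verify starts from a fresh context: only the order and the artifact,
    # never the thread that produced the artifact.
    return call_model(
        system="Be skeptical. Judge the artifact against the order alone.",
        user=(
            f"Order: {order}\nArtifact:\n{artifact}\n"
            "List every place the execution satisfies the letter but misses the intent."
        ),
    )
```

The specific fields and prompts will vary; the design choice that matters is that each function gets its own context and its own posture.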
These layers are separated for two reasons. First, context layering prevents cognitive overload. The more you force one agent to hold strategy, implementation, edge cases, repo state, test behavior, factual accuracy, and self-critique inside one continuous thread, the more salience starts to smear. More context does not always mean better judgment. Sometimes it just means the important parts are now buried under a larger pile of plausible words.
Second, each layer requires a different cognitive posture. Direction requires judgment about purpose. Execution requires focus on production. Verification requires skepticism. When you collapse all three into one continuous conversation, the same context that generated the answer is asked to prosecute the answer. Sometimes that works. Often it produces a polite self-audit with no teeth. The model grades its own homework using the same local salience field that produced the mistake in the first place.
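The collapsed pattern looks roughly like this, assuming the hypothetical helpers from the sketch above; the fix is simply to call `verify(order, artifact)` from a fresh context instead.

```python
def self_audit(thread: list[dict]) -> str:
    """Anti-pattern: the thread that produced the answer is asked to prosecute
    the answer. The critique comes from the same context that made the mistake."""
    return call_chat(thread + [
        {"role": "user", "content": "Now double-check your own work carefully."}
    ])

def call_chat(messages: list[dict]) -> str:
    """Hypothetical chat-style wrapper, in the same spirit as call_model above."""
    raise NotImplementedError
```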
This is how you get fluent failure. The answer looks disciplined because the prompt demanded discipline. The code looks plausible because the model understands the shape of plausible code. The explanation sounds verified because the word "verify" appeared somewhere above it in the ritual text. But the actual control system is missing.
The lesson is not that prompts do not matter. They do. Words shape attention. Instructions change behavior. A better prompt can absolutely improve an output. The lesson is that prompting is not command.
Command requires intent, constraints, and verification. "Don't make any mistakes" is not intent. It is a fear wearing a uniform. And in AI work, just as in human organizations, fear is a terrible operating system.