When AI Sounds Confident — But Is Wrong

The Ghost in the Courtroom

In 2023, two New York lawyers — Steven Schwartz and Peter LoDuca of Levidow, Levidow & Oberman — filed a legal brief in the Southern District of New York. The brief cited six federal cases. All of them were fake.

ChatGPT had drafted the citations. The lawyers didn’t check. Judge P. Kevin Castel did.

The citations looked legitimate. Names of parties. Docket numbers. Snippets of reasoning. All fabricated. The lawyers were fined and formally reprimanded.

The problem wasn’t bad luck. It was structural. The model wasn’t lying. It was doing what it was built to do: predict the next likely sequence of words. Fluency over fidelity.

How Large Language Models Work

The breakthrough that unlocked modern AI writing came in 2017 with a paper from Google researchers: “Attention Is All You Need.” Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Łukasz Kaiser, and Illia Polosukhin. Eight names that changed the trajectory of machine learning.

Their architecture — the transformer — became the foundation for GPT, Claude, Gemini, and almost every other generative model.

The idea was simple but powerful. Instead of processing words in sequence, let every word look at every other word in context. Call that mechanism attention.

Here’s how it works.

The model breaks text into tokens. Each token is projected into three roles: Query, Key, and Value. Queries ask what they’re looking for. Keys signal what they have. Values carry the content. The model scores each Query against every Key, turns those scores into weights, and blends the Values accordingly. That blended signal is what feeds the prediction of the next token.
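Here is a minimal sketch of that matching step in NumPy. The matrices are random stand-ins for learned projections, so it shows the arithmetic rather than a trained model.

import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating, for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Score every Query against every Key, scale by the key width,
    # turn the scores into weights, and blend the Values with those weights.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores)
    return weights @ V, weights

# Toy setup: 4 tokens, 8-dimensional projections, all values random.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))

mixed, weights = attention(Q, K, V)
print(weights.round(2))  # each row sums to 1: how much each token attends to every other

Real models learn those projection matrices and add a mask so tokens cannot look ahead; the blending step is the same.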

Multi-head attention runs this in parallel across many perspectives. One head might track grammar. Another, semantics. Another, long-distance dependencies. Stack enough layers, and the system can generate fluid text across thousands of tokens.
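A sketch of the multi-head version under the same toy assumptions: random weights, no causal mask, and a single output mix at the end.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    # Project the token vectors, split each projection into n_heads slices,
    # run scaled dot-product attention per slice, then concatenate and mix.
    d_head = X.shape[-1] // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    heads = []
    for h in range(n_heads):
        s = slice(h * d_head, (h + 1) * d_head)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_head)
        heads.append(softmax(scores) @ V[:, s])
    return np.concatenate(heads, axis=-1) @ Wo

# Toy setup: 4 tokens, model width 16, 4 heads; every weight is a random stand-in.
rng = np.random.default_rng(1)
X = rng.normal(size=(4, 16))
Wq, Wk, Wv, Wo = (rng.normal(size=(16, 16)) for _ in range(4))
print(multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads=4).shape)  # (4, 16)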

But remember the core: it is still just prediction. Not proof.

Why Hallucinations Happen

Once you see the mechanics, the failure modes become clear.

Statistical autopilot. The model minimizes surprise, not factual error. “US v. Smith, 2019” looks more likely than “I don’t know.”
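A toy decoding step makes the point. The probabilities below are invented for illustration, but the mechanic is the real one: pick whichever continuation scores highest.

# Invented next-token probabilities after a prompt like "The controlling case is"
# (only the top few options shown). A greedy decoder takes the highest-scoring
# continuation, and an honest refusal rarely wins that contest in legal prose.
next_token_probs = {
    "United": 0.34,        # the start of a plausible-looking case name
    "Smith": 0.22,
    "See": 0.18,
    "I don't know": 0.04,  # honest, but statistically unlikely here
}

print(max(next_token_probs, key=next_token_probs.get))  # "United": fluent and unverified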

Nearest-neighbor illusion. Structured identifiers like case numbers or DOIs follow a pattern. The model knows the pattern, not the database. It fills in with a plausible number.

Fragment splicing. Attention can stitch the content of one training example to the label of another. The hybrid never existed.

Confidence bias. Reinforcement learning from human feedback punishes hedging and rewards specificity. Overconfidence is reinforced.

Attention dilution. In long prompts, weaker signals lose weight. The model remembers the shape of the answer but not the detail. It fills the gaps with invention.

Sparse prior art. For rare or novel queries, the model interpolates from nearby examples. The result is smooth, wrong text.

Why Retrieval Helps, But Doesn’t Solve It

Perplexity AI and others try to reduce hallucination by grounding answers in real-time search. That’s better than free-floating text generation, but the results are mixed.

Studies show only about a quarter of citations are fully correct. Another quarter are real but don’t support the claim. Retrieval trades hallucination for misalignment: a real source, cited for the wrong thing.

How To Reduce the Damage

Explicit uncertainty. Train models to output “unknown” instead of filling silence with fiction.

Verifier models. Run a second system to check if claims match sources before surfacing them.
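A crude sketch of where that check sits: before a claim is surfaced, require that most of its content words actually appear in the retrieved source. Production verifiers use trained entailment models rather than word overlap; the overlap test here is only a stand-in.

def supported(claim: str, source: str, threshold: float = 0.7) -> bool:
    # Stand-in for a verifier model: what fraction of the claim's content
    # words can be found in the source text?
    stop = {"the", "a", "an", "of", "to", "in", "and", "that", "is", "was"}
    words = {w.strip(".,").lower() for w in claim.split()} - stop
    hits = sum(1 for w in words if w in source.lower())
    return bool(words) and hits / len(words) >= threshold

source = "The court dismissed the complaint for lack of personal jurisdiction."
print(supported("The complaint was dismissed for lack of jurisdiction.", source))       # True
print(supported("The court awarded punitive damages of two million dollars.", source))  # False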

Confidence tagging. Flag answers with likelihood scores. Let the human reader see the uncertainty.
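A sketch of what the tag could look like, assuming access to per-token probabilities; the numbers below are invented. The score is the geometric mean of those probabilities, and below a floor the answer becomes an explicit "unknown", which is the explicit-uncertainty idea applied at inference time.

import math

def tag_with_confidence(answer: str, token_probs: list[float], floor: float = 0.6) -> dict:
    # Geometric mean of the per-token probabilities: one number the reader can see.
    # Below the floor, surface "unknown" instead of a fluent guess.
    score = math.exp(sum(math.log(p) for p in token_probs) / len(token_probs))
    return {"answer": answer if score >= floor else "unknown",
            "confidence": round(score, 2)}

# Invented probabilities for two four-token answers.
print(tag_with_confidence("Castel sanctioned both lawyers.", [0.92, 0.88, 0.95, 0.90]))
print(tag_with_confidence("US v. Smith, 2019.", [0.41, 0.35, 0.52, 0.30]))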

Chunked prompting. Break long questions into smaller steps to preserve attention.
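As a sketch: ask one bounded question per call and carry the earlier answers forward as context. The ask() function is a placeholder for whatever model call is actually in use; here it just echoes the question so the example runs on its own.

def ask(prompt: str) -> str:
    # Placeholder for a real model call; echoes the question so the sketch is runnable.
    return f"[model answer to: {prompt.splitlines()[-1]}]"

def answer_in_chunks(question_parts: list[str]) -> list[str]:
    # One bounded question per call, with earlier answers passed along as context,
    # so no single prompt has to carry every detail at once.
    answers: list[str] = []
    for part in question_parts:
        context = "\n".join(answers)
        answers.append(ask(f"Context so far:\n{context}\n\nQuestion: {part}"))
    return answers

parts = [
    "Which court heard the case?",
    "What sanctions did the judge impose?",
    "Summarize the ruling in two sentences, using only the answers above.",
]
for a in answer_in_chunks(parts):
    print(a)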

Database grounding. Keep the model inside bounded sources like PubMed, PACER, or the USPTO.
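A sketch of the bounded-source idea, with a small in-memory dictionary standing in for a real index like PubMed or PACER. The entries are illustrative, not real records; the point is the refusal path, where a citation that is not in the store never gets emitted.

# Illustrative stand-in for a curated source database; a real deployment
# would sit on top of the actual docket or literature index.
CASE_DB = {
    "1:19-cv-00123": "Example Shipping Co. v. Harbor Logistics LLC (S.D.N.Y.)",
}

def cite(docket: str) -> str:
    # Only citations that exist in the bounded store are allowed through;
    # anything else becomes an explicit miss instead of a fabricated case.
    if docket in CASE_DB:
        return f"{CASE_DB[docket]}, No. {docket}"
    return f"no source found for docket {docket}; citation withheld"

print(cite("1:19-cv-00123"))
print(cite("1:19-cv-04422"))  # pattern-plausible, but not in the store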

Human oversight. Especially in law, medicine, and finance. A person has to check. Always.

The Broader Lesson

Generative AI is not a knowledge system. It is a text synthesizer. The smoother the prose, the easier it is to mistake fluency for truth.

That courtroom in 2023 was a warning. Ghost citations are not rare glitches. They are the natural product of systems trained to predict language rather than verify fact.

The fix is scaffolding. Retrieval. Verification. Governance. Humans who know that confident language is not the same as knowledge.