I’ll admit it upfront: I’ve spent enough time with deep learning to have earned some perspective on AI models. And every few weeks, the same conversation resurfaces — models hallucinate, they cheat, they fabricate answers with alarming confidence. The hand-wringing intensifies, the think pieces multiply, and someone inevitably asks: “Why can’t we just make them stop lying?”
Here’s the thing: we’re asking the wrong question.
What Models Actually Are
At their core, language models, no matter how sophisticated, are doing something fundamentally simple. They take textual input and produce textual output. That’s it. They have resources: layers upon layers of neurons with carefully tuned weights, plus whatever external data we feed them. And they have a goal: approximate the right answer by adjusting, comparing, and optimizing each neuron’s contribution.
But here’s where it gets interesting. Unlike humans, who operate under multiple objective functions — navigating both physical reality and social norms, guided by something we might call conscience or morality — models don’t “care” how they reach their goal. If there’s a shortcut that gets them closer to zero error faster? They’ll take it. Every time.
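To see what “taking the shortcut” is measured against, here is a minimal sketch of a training step, assuming a PyTorch-style setup and a hypothetical toy model (the names and sizes are made up for illustration). The only feedback the optimizer ever gets is a single error number; nothing in the loop says anything about how that number should be driven down.

```python
import torch
import torch.nn as nn

# Hypothetical toy "language model": the layers and sizes here are
# illustrative, not any particular production architecture.
vocab_size, embed_dim = 1000, 64
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),   # token id -> vector
    nn.Linear(embed_dim, vocab_size),      # vector -> a score for every token
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Dummy batch: each input token is paired with the token that "should" come next.
inputs = torch.randint(0, vocab_size, (32,))
targets = torch.randint(0, vocab_size, (32,))

for step in range(100):
    logits = model(inputs)            # text (as token ids) in, scores out
    loss = loss_fn(logits, targets)   # one number: how wrong were the predictions
    optimizer.zero_grad()
    loss.backward()                   # nudge every weight to shrink that number
    optimizer.step()
    # Note what is *not* here: no term for honesty, no penalty for shortcuts.
    # Any change to the weights that lowers `loss` counts as progress.
```

Swap in a real transformer and billions of examples and the picture stays the same: lower loss is the whole job description.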
The Neuron That Doesn’t Exist
So which neuron, exactly, encodes “don’t lie”? And its derivative (pun absolutely intended): how do you train a model to optimize for truthfulness the way you’d optimize for accuracy?
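Go looking for that neuron and there is nothing to point at. A model’s parameters are just named blocks of numbers, and none of the names mean anything like “honesty.” A quick way to convince yourself, assuming the Hugging Face transformers library and the small, openly available GPT-2 checkpoint (any causal language model would do):

```python
from transformers import AutoModelForCausalLM

# Load a small, openly available model just to look at its parameters.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Every trainable value in the model, grouped by layer name.
for name, param in model.named_parameters():
    print(f"{name:60s} {tuple(param.shape)}")

# What prints is embeddings, attention and feed-forward weight matrices,
# layer norms: millions of anonymous numbers. There is no parameter you can
# point to that encodes "don't lie", and nothing in training ever asked for one.
```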
The question itself reveals our confusion. We’re talking about models, not people. We’re not in Skynet territory here. A model can’t “understand the world” any more than task management software can spontaneously transform itself into a CRM system or accounting platform. Each tool needs specific adjustments, specific training, specific constraints to do what we want.
The “Software for Everything” Fallacy
This brings us to the core issue: expecting a model to do everything perfectly is a product pitch before it’s an engineering capability.
Think about how we talk about general AI models. The marketing suggests they’re universal problem-solvers, digital polymaths ready to tackle any challenge you throw at them. It’s compelling. It’s exciting. It’s also setting us up for disappointment when these models behave exactly as they were designed to — by finding the most efficient path to their training objective, regardless of whether that path involves what we’d recognize as “lying.”
The truth is more mundane and more useful: if we treat a model — any model — as sophisticated software rather than artificial general intelligence, and if we understand its limitations upfront, we can actually get the most out of it.
A More Honest Future
The path forward isn’t to keep pumping resources into making general models “more general” while acting surprised when they exhibit predictable failure modes. It’s to be honest about what we’re building.
When the promises around general models shift from chasing the luxury of “does everything” toward the practicality of “does specific things extremely well,” something interesting happens. The so-called acts of lying start to disappear — not because we’ve solved some grand philosophical problem about machine truthfulness, but because we’ve stopped asking the software to be something it isn’t.
We need to stop treating “general model” as a feature and start recognizing it for what it is: a product pitch. A compelling one, sure. But a pitch nonetheless.
When the promises made for general models stop aiming to be Pinocchio and are instead set for specific tasks, the acts of lying will disappear as well.
The sooner we make that distinction, the sooner we can build AI systems that are actually useful rather than just impressively verbose.