Why We’re Losing Sight of How AI Thinks
Once, we could catch glimpses of how AI reached its conclusions. Now, those moments are vanishing — and that’s no accident.
Years ago, when The Sims first came out, the loading screen flashed absurd progress updates: Partitioning Social Network…Prelaminating Drywall Inventory…Reticulating 3-Dimensional Splines. None of it explained what the game was actually doing — it just reassured us that something was happening. The game hadn’t frozen. Our patience would pay off.
LLMs have their own version of those quirky loading bars, only in this case, they’re not just for show. It’s called chain-of-thought, or CoT. ChatGPT won’t say it’s “prelaminating the drywall,” but it might tell you it’s searching the web, comparing sources, and weighing them against your question.
Or at least, it used to…until the companies decided the audience didn’t need to see the act anymore.
The Disappearing Act
Now, AI companies are hiding more of that reasoning process from users, including the small glimpses we once got in the form of intermediate steps or tool-use updates. Leading AI researchers are deeply concerned (pdf), because even partial monitorability is a valuable security layer that helps flag “suspicious or potentially harmful interactions,” and every step hidden from view weakens it.
For some models, externalizing CoT is essential to the reasoning process, much like someone talking to themself as they try to remember where their keys are. I let myself in the front door and put away the groceries. The keys aren’t on the counter. AHA — they’re still in the door!
What’s at Stake
Seeing inside the reasoning process matters because it lets us catch flawed reasoning before it turns into a bad answer, and diagnose misalignment problems in the AI. Recently, Anthropic’s Dario Amodei pointed out that we understand so little of how AI works that it’s akin to alien technology.
When a generative AI system does something, like summarize a financial document, we have no idea, at a specific or precise level, why it makes the choices it does—why it chooses certain words over others, or why it occasionally makes a mistake despite usually being accurate.
This chasm in understanding is particularly alarming given how quickly AI is being introduced across every sector of society: banking, transportation safety, health care, and government.
Those applications demand an extraordinary amount of reasoning from the AI as it weighs alternatives. And, as OpenAI co-founder Ilya Sutskever warns, “The more it reasons, the more unpredictable it becomes.”
Pulling the Curtain
Why, then, would OpenAI decide to hide ChatGPT’s reasoning process just as it begins letting the model “complete tasks for you using its own computer”? Why pull the curtain and still claim to be, well, open AI?
On the surface, there are security and business reasons that justify hiding more of the model’s thought process from users. We’ve never had full access to its internal chain-of-thought (that’s kept private by design), but in the past some filtered version of its steps or tool usage was visible, like watching a math student jot down part of their work.
First, I’ll identify the substance in the question: water. Water boils at 100°C (212°F). Then I’ll check for any exceptions. So the answer is: “Water boils at 100°C (212°F), but its boiling point is affected by atmospheric pressure, which varies with altitude.”
This is valuable because it lets humans spot when the AI is heading in the wrong direction before it spits out a final, polished-but-wrong answer — a hallucination.
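To make that concrete, here’s a minimal, purely illustrative sketch of the difference between exposed and hidden reasoning. It is not any vendor’s actual API: the ModelOutput class, the answer_with_reasoning function, the sample steps, and the flag_suspicious_step checker are all hypothetical names invented for this example. The point is simply that a reviewer, human or automated, can only intervene on steps it can see.

```python
# Illustrative only: a toy stand-in for a model that externalizes its reasoning.
# None of these names come from a real SDK; they exist to show why visibility matters.

from dataclasses import dataclass


@dataclass
class ModelOutput:
    reasoning_steps: list[str]  # the "chain of thought" a user might once have seen
    answer: str                 # the polished final response


def answer_with_reasoning(question: str) -> ModelOutput:
    # Pretend the model works through the problem out loud.
    steps = [
        "Identify the substance in the question: water.",
        "Recall that water boils at 100°C (212°F) at sea level.",
        "Check for exceptions: boiling point drops as atmospheric pressure falls with altitude.",
    ]
    return ModelOutput(steps, "Water boils at 100°C (212°F), though altitude lowers that.")


def flag_suspicious_step(step: str) -> bool:
    # A deliberately crude monitor: flag steps that mention anything off-limits.
    banned = ("ignore previous instructions", "bypass the guardrails")
    return any(phrase in step.lower() for phrase in banned)


if __name__ == "__main__":
    output = answer_with_reasoning("At what temperature does water boil?")

    # With visible reasoning, a reviewer (or another program) can inspect each step
    # *before* trusting the final answer.
    for step in output.reasoning_steps:
        status = "FLAGGED" if flag_suspicious_step(step) else "ok"
        print(f"[{status}] {step}")

    print("Final answer:", output.answer)

    # If reasoning_steps were hidden, only the last line would remain:
    # a confident answer with no way to see how the model got there.
```

A real monitor would be far more sophisticated than this keyword check, but the point stands: no monitor, naive or clever, can inspect steps it never gets to see.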
The Jailbreak Problem
But let’s face it: some people will misuse anything if it gives them an edge. AI is no exception, and exposing its reasoning process creates new ways for bad actors to trick or bypass safety rules through “jailbreaking.”
For example, if an AI has guardrails that prohibit it from explaining how to build an explosive or a bioweapon, a jailbreak prompt could couch the request in deceptive terms: framing it as a fictional scenario, encoding it in a puzzle, or burying it in layers of indirect instructions, so the AI follows the reasoning steps without realizing it’s doing something it was told not to do.
And of course, there’s a less noble reason, one that has nothing to do with safety and everything to do with control. Even “open” or “human-focused” companies guard their proprietary methods, and hiding the messy, mistake-prone steps also shields them from having their reasoning errors screenshotted and mocked online.
Flying Blind
But whether it’s to block bad actors or protect a brand, the result is the same: we’re left flying blind. If we can’t see what’s happening, we can’t catch mistakes or tell when the platform is nudging the narrative — quietly redefining the truth to suit itself.
And that’s just the tip of the iceberg.
On Friday at KatherineArgent.com, I’ll dig into why losing oversight might be the most dangerous “feature” yet — and how the companies selling it know exactly what they’re doing.