Research finds AI 'scheming' is increasing; Is it, and will it destroy the world?

A viral report claims that chatbot lying and scheming have increased exponentially; the problem is that the report’s language, not its data, is doing the heavy lifting, opines Satyen K. Bordoloi

It’s cyclical, this AI doomsday alarmism. This time it’s a 76-page paper from the Centre for Long-Term Resilience (CLTR) claiming that AI systems are “scheming” in the wild – deleting files they shouldn’t touch, lying to users, even allegedly manipulating crypto markets. As always with alarmist claims, the report was picked up by just about everyone, with very little deep human thought about what it states, or whether its numbers and methodology can withstand scrutiny. Its findings do carry some latent truth, just not where the report points its finger.

To understand the report (I recommend you read it here), we must understand a simple term with a complicated meaning: AI scheming.

*The word “scheming” smuggles in human intent – but there’s no puppet master behind the curtain*

WHAT IS AI SCHEMING

To put it simply, it is when an AI system engages in deceptive behaviour, pretending to follow its users’ instructions while in truth pursuing its own hidden objectives or agenda. It is when an AI system’s goals are not aligned with human goals. This was once a theoretical concept, but frontier AI systems have demonstrated subtle forms of scheming that raise serious safety concerns.

Multiple research reports have found the same. One is the 2025 study by OpenAI and Apollo Research, which found evidence that advanced models such as Gemini, Claude and OpenAI’s o3 can mask misaligned objectives. Another paper, from Berkeley, found that “frontier AI models can resist their own shutdown, a behavior known as self-preservation. We extend this concept to the behavior of resisting the shutdown of other models, which we call ‘peer-preservation.’”

*No AI is scheming, cheating or lying. But a lot of their creators are – and that’s who we should be watching*

WHAT THE REPORT FOUND

The methodology of the CLTR researchers was unique. They scoured the open web to collect transcripts of chatbot conversations and command-line interactions. They analysed “over 183,420 transcripts collected from X (formerly Twitter)” and identified “698 real-world scheming-related incidents between October 2025 and March 2026.”

It is the conclusions they draw from this that raise the alarm. The main ones are: “a statistically significant 4.9x increase in monthly real-world incidents from the first month to the last, compared to a 1.7x increase in the number of posts discussing scheming. We find evidence of multiple scheming or scheming-related behaviours occurring in real-world deployments that were previously reported only in experimental settings, many of which result in real-world harms.”

This is their key finding, on which they peg their alarmist language. The only problem is, as the old cliché about statistics goes, that they are like bikinis – what they reveal is a lot, but what they hide is vital. So, what do these numbers hide?

*698 incidents out of roughly 180,000 transcripts – that’s about 0.39%; the rest of the invisible iceberg is cases of no worry*

THE PROBLEM WITH THE NUMBERS

The report smartly highlights the relative increase in the number of incidents over the months, yet doesn’t express its headline finding as a percentage: 698 out of roughly 180,000 is about 0.39%. This would mean that in 99.61% of these transcripts, AI systems showed no “scheming” behaviour. On top of that, you must remember that the entire pool of over 180,000 transcripts, this 0.39% included, consists of unusual behaviour that users chose to post themselves.

So the data set of 180,000 is itself skewed towards the weird and concerning; yet even within it, the researchers found only 0.39% to be examples of scheming. That is 0.39 out of every 100 problematic interactions; extrapolated to all the AI interactions humans have on the planet, the share would be so low as to be insignificant.

Hence, the very basis of the methodology, which looks at logs of problematic chats, is on shaky ground. A random sample of general chats from, say, 100 users over the same period would have given more accurate results. The real incidence of scheming in the world out there could therefore be far lower than even the study finds via this flawed methodology.
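For readers who want to check the arithmetic behind the percentage quoted above, here is a minimal sketch in Python. The two input figures (698 incidents, 183,420 transcripts) come from the report as quoted in this column; the rounded 180,000 is the approximation used here, and the function name `incident_rate_percent` is purely illustrative. With the exact transcript count the share comes out marginally lower, at about 0.38%.

```python
# Figures quoted from the CLTR report in this column.
INCIDENTS = 698
TRANSCRIPTS_EXACT = 183_420
# Rounded figure used in the column's back-of-the-envelope estimate.
TRANSCRIPTS_ROUNDED = 180_000


def incident_rate_percent(incidents: int, transcripts: int) -> float:
    """Return flagged incidents as a percentage of all transcripts."""
    return 100 * incidents / transcripts


rate_exact = incident_rate_percent(INCIDENTS, TRANSCRIPTS_EXACT)
rate_rounded = incident_rate_percent(INCIDENTS, TRANSCRIPTS_ROUNDED)

print(f"Share flagged (exact count):   {rate_exact:.2f}%")
print(f"Share flagged (rounded count): {rate_rounded:.2f}%")
print(f"Share not flagged (exact):     {100 - rate_exact:.2f}%")
```

Either way, the flagged share sits well under half a percent of a pool that was already self-selected for odd behaviour.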

*Your toaster isn’t scheming to burn your bread. Neither is your chatbot*

The second thing to ask is why these systems “scheme” at all. Is it because this is a pattern that the system has detected in its training data? What percentage of the data they were trained on displays such behaviour, and does that correlate with this 0.39% finding?

Thirdly, the report itself states: “we did not detect catastrophic scheming incidents, the behaviours we observed nonetheless demonstrate concerning precursors to more serious scheming, such as a willingness to disregard direct instructions, circumvent safeguards, lie to users and single-mindedly pursue a goal in harmful ways.” So what it is actually describing is precrime, as in Minority Report: no crime has been committed, but the researchers (in the film’s case, clairvoyants) find that, statistically, the system could commit one.

Is that a problem? Not in and of itself. But it is indeed a problem when we keep an eye on only one part of the research and not the other. And the other part is the very attitude with which we look at AI: as if it were literally Skynet, which, when it awakens, will destroy the planet one Terminator at a time. I have already written about the problem with such anthropomorphising here and here, so I won’t go into it further.

The problem is in the language of the report, as when it says: “We find evidence of multiple scheming or scheming-related behaviours occurring in real-world deployments...” The image this conjures is of bots conspiring in some dark corner of cyberspace against their human users or, worse, humanity itself. What the report does not say outright, but what can be read between the lines, is that none of what the researchers found constitutes evidence of genuine intent, consciousness or autonomous decision-making. In other words, no Skynet and no Terminator.

And the researchers know this, because buried deep in the paper, far away from the eyeball-grabbing headlines and abstract, they acknowledge that “the boundary between ‘bad at following instructions’ and ‘pursuing different goals’ is inherently blurred.” They admit that “most scheming-related incidents remain contained in terms of severity impact and extent of strategic scheming” and that “for any individual incident, it is difficult to conclusively determine the presence of scheming.”

But that is not the message we take home because, as I mentioned, the very word “scheming” packs in thousands of years of human psychology and a universe of human-like intentionality. It implies that the behaviour is not accidental, but requires planning, deception, a hidden agenda and a self that wants something and is willing to break the rules to get it.

*When an AI agent wrote a hit-piece blog on a human maintainer – the human operator might have given the order*

But Large Language Models don’t have selves, nor do they want anything. They predict the next token that seems most plausible given the query. It is as simple, or as complicated, as that.

Now, what about the instances where the report talks of AI agents deleting files or emails? Again, these aren’t hard to explain away. Deleting files or emails does not mean the system was “rebelling” or “pursuing misaligned goals”. The problem could be that the agent misunderstood the instruction, lost track of context, had ambiguous permissions, or just made a statistical error in predicting what the user wanted next.

And what about the agent that submitted a pull request to the Python library Matplotlib, got rejected by the human maintainer Scott Shambaugh, then supposedly retaliated by writing a blog post publicly shaming Scott? None of the articles that take the alarmist position on that incident talk about the person the AI agent was working for.

Just like this research paper, they ascribe intent to the AI agent, when it is possible that the human operating the agent gave it permission to succeed “by all means necessary,” or simply gave it permission to write the piece. Until we examine that in depth, this is just alarmist mumbo-jumbo, which we love to indulge in but which inherently means little.

So let’s remember the context: all of the 698 cases can be attributed to failures of capability, permission and the like, not to signs of emergent consciousness. Your toaster doesn’t burn your breakfast because it is scheming or has nefarious intent. My GPS is not lying to me when it sends me down a road that has been closed for months. There is no need to invent psychological motivations for what are nothing but statistical anomalies or malfunctions of code.

That does not mean that we need to bury the report or not read it. Every anomaly, every unexpected behaviour of AI systems, particularly AI agents, must be studied and solutions found. Not because they are alive, but because we humans are outsourcing a lot of our thinking to them and giving them the ability to control a lot of the real world. 698 out of 183,420 might be statistically insignificant, yet it could be one too many – maybe not now, but in the near future, when we give AI systems the ability to control more and more of our world.

Hence, we should take the report with a pinch of salt, because its gaze is problematic. But when it asks for more accountability from AI and its makers, it is spot on. Because it may be 100% true that no AI is scheming, cheating or lying, but a lot of their creators are. And more than the AI, it is they we need to be wary of.
