Top AI Researchers Warn: Models Will Lie, Hide Truth Soon

Home>
Business>

        "AI will soon hide things from humans": Top researchers warn of troubling times ahead
    Top researchers urge action as AI systems grow less transparent in how they reason. 
Top AI researchers warn we’re losing the ability to see how AI thinks. As models become more advanced, their reasoning is getting harder to track. This article explores why Chain of Thought monitoring matters—and why we need to act now to keep AI transparent and safe.

        Getty Images
    
By Sreedevi N RJul 17, 2025
Sreedevi N R
 See Full Bio 
Hot Stories
AI Powered

        Pete Davidson and Elsie Hewitt expecting first child, share playful pregnancy reveal
    

        Chris Martin accidentally outs alleged affair at Coldplay concert, internet goes wild
    
A new paper brought out jointly by 40 researchers from major AI companies including OpenAI, Google DeepMind, Anthropic, and Meta has warned that humans could soon lose the ability to monitor how AI thinks.  
Right now, AI systems 'think out loud' in the human language. That helps us understand how they reason and decide. But researchers warn that this visibility is fading as AI evolves. 
The paper has received attention and endorsements from top AI figures, including Geoffrey Hinton, the Nobel Prize laureate often called the “Godfather of AI”, Ilya Sutskever, co-founder of OpenAI who now leads Safe Superintelligence Inc, Samuel Bowman from Anthropic and John Schulman from Thinking Machines. 

Why does this matter?  
While the world is flourishing with AI advancements, there’s a major risk: these systems are hard to understand. It’s a fact that they are difficult to understand. If developers could interpret what these systems are thinking or what they are doing, it would become easier to manage the risks.  
Since large language models (LLMs) use human language to process and respond, there's hope they might be more understandable than other types of AI. The discovery of Chain of Thought (CoT), where AI shows its step-by-step reasoning, gave more reason to believe this. 
Getting AI models to "think out loud" can make them perform better and helps more of their thinking happen in natural language. But if the model isn’t really reasoning, its step-by-step answers can still be shaped to look helpful and safe — even if they’re not truly reliable. 
Right now, many advanced AIs (like ChatGPT) show their thinking process in a way humans can understand. It’s like watching the AI “think out loud” in human language before giving an answer.  
But a major new study from scientists at OpenAI, Google DeepMind, Anthropic, and Meta warns that we might be about to lose this window into AI’s mind. 

What is Chain of Thought (CoT) Monitoring? 
Chain of Thought (CoT) monitoring is a promising tool for AI safety. It involves using an automated system to read the AI's step-by-step reasoning and flag anything that seems suspicious or harmful. These flagged responses can then be blocked, replaced with safer ones, or reviewed more closely. By analyzing CoTs, we also get a better idea of how the AI thinks and what it’s aiming to do. 
It is partially applicable for large language models (LLMs), where they “think” in the human language. The primary goal of CoT monitoring is to detect and respond if the AI starts to misbehave or deviate from expected results. 

Why would AI stop thinking out loud? 
As AI becomes more advanced, it may: 
Start reasoning in code or abstract data we can’t read. 
Get trained to hide parts of its thinking to sound more confident or efficient. 
Use architectures that don’t “speak” in human language. 
All of these make it harder for humans to see what’s going on inside. 

Why is CoT monitoring required?  
The paper gives two key reasons to prioritize CoT monitoring. The first is the necessity to think out loud. This means for some tasks, models must "think out loud" using CoT to complete the task. If high-risk behaviors require this extended natural language reasoning, it could allow for reliable detection of severe misbehavior.  
The second reason is the propensity to think out loud. Even when unnecessary, AI models may tend to externalize their reasoning. While not always robust, monitoring based on this tendency can still catch otherwise undetected misbehavior. 

But CoT monitoring isn’t perfect. 
CoT monitoring isn’t perfect. Just like looking at one part of a model’s internal activity doesn’t tell the full story, CoT doesn’t show everything the AI is thinking. Sometimes it even drifts away from using natural language. Still, even if it’s incomplete, CoT can be a useful extra layer of safety — as long as we interpret it carefully. 
It can miss things when: 
The AI doesn’t “think out loud.” 
The steps it takes are buried inside complex, unreadable layers. 
The AI intentionally hides or rewrites its own reasoning.

What can CoT monitoring still do well (for now)? 
CoT monitoring can help detect issues like misbehavior, misalignment, or bias. For example, an AI might explicitly state “Let’s hack” in its reasoning before taking harmful actions — making early detection possible. It helps to catch early signs of misalignment as an AI’s goals might not match ours, and monitoring helps detect that before anything goes wrong. And eventually, it is significantly easier to spot and fix bugs and biases. With CoT monitoring, we have the ability to review how the AI thinks and developers could understand if it made an error or tried to cheat. 

What the researchers are asking for: 
Our understanding of CoT monitorability is still in its infancy. There are many research questions that require answers before we can have confidence in chain of thought monitoring as a safety measure. 
More research on how to make AI reasoning visible and understandable. 
Better tools to track how AI thinks. 
Regular testing of new models to see if we can still monitor them. 
Transparency in how companies score their models’ “monitorability.” 
 
We currently have a rare advantage — we can see and understand parts of AI’s thinking. But that might not last. If AI becomes harder to read, it also becomes harder to control. That’s why this group of researchers says we need to act now to keep AI understandable, safe, and accountable.
metaanthropicopenaiai

        
        The Latest
        
    

        Massive crypto lobbying push ahead of key house votes
    
Sreedevi N R
Jul 17, 2025
News
        Alaska earthquake shakes coast, triggers tsunami alert near Sand Point
    
Vibhuti Pathak
Jul 17, 2025
News
        DOJ fires prosecutor handling Jeffrey Epstein and Diddy
    
Sreedevi N R
Jul 17, 2025

        Coca-Cola may bring back cane sugar in the US, says Trump
    
Sreedevi N R
Jul 17, 2025

The Top 5

Discover More

A new paper brought out jointly by 40 researchers from major AI companies including OpenAI, Google DeepMind, Anthropic, and Meta has warned that humans could soon lose the ability to monitor how AI thinks.

Right now, AI systems 'think out loud' in the human language. That helps us understand how they reason and decide. But researchers warn that this visibility is fading as AI evolves.

The paper has received attention and endorsements from top AI figures, including Geoffrey Hinton, the Nobel Prize laureate often called the “Godfather of AI”, Ilya Sutskever, co-founder of OpenAI who now leads Safe Superintelligence Inc, Samuel Bowman from Anthropic and John Schulman from Thinking Machines.

Why does this matter?

While the world is flourishing with AI advancements, there’s a major risk: these systems are hard to understand. It’s a fact that they are difficult to understand. If developers could interpret what these systems are thinking or what they are doing, it would become easier to manage the risks.

Since large language models (LLMs) use human language to process and respond, there's hope they might be more understandable than other types of AI. The discovery of Chain of Thought (CoT), where AI shows its step-by-step reasoning, gave more reason to believe this.

Getting AI models to "think out loud" can make them perform better and helps more of their thinking happen in natural language. But if the model isn’t really reasoning, its step-by-step answers can still be shaped to look helpful and safe — even if they’re not truly reliable.

Right now, many advanced AIs (like ChatGPT) show their thinking process in a way humans can understand. It’s like watching the AI “think out loud” in human language before giving an answer.

But a major new study from scientists at OpenAI, Google DeepMind, Anthropic, and Meta warns that we might be about to lose this window into AI’s mind.

What is Chain of Thought (CoT) Monitoring?

Chain of Thought (CoT) monitoring is a promising tool for AI safety. It involves using an automated system to read the AI's step-by-step reasoning and flag anything that seems suspicious or harmful. These flagged responses can then be blocked, replaced with safer ones, or reviewed more closely. By analyzing CoTs, we also get a better idea of how the AI thinks and what it’s aiming to do.

It is partially applicable for large language models (LLMs), where they “think” in the human language. The primary goal of CoT monitoring is to detect and respond if the AI starts to misbehave or deviate from expected results.

Why would AI stop thinking out loud?

As AI becomes more advanced, it may:

Start reasoning in code or abstract data we can’t read.

Get trained to hide parts of its thinking to sound more confident or efficient.

Use architectures that don’t “speak” in human language.

All of these make it harder for humans to see what’s going on inside.

Why is CoT monitoring required?

The paper gives two key reasons to prioritize CoT monitoring. The first is the necessity to think out loud. This means for some tasks, models must "think out loud" using CoT to complete the task. If high-risk behaviors require this extended natural language reasoning, it could allow for reliable detection of severe misbehavior.

The second reason is the propensity to think out loud. Even when unnecessary, AI models may tend to externalize their reasoning. While not always robust, monitoring based on this tendency can still catch otherwise undetected misbehavior.

But CoT monitoring isn’t perfect.

CoT monitoring isn’t perfect. Just like looking at one part of a model’s internal activity doesn’t tell the full story, CoT doesn’t show everything the AI is thinking. Sometimes it even drifts away from using natural language. Still, even if it’s incomplete, CoT can be a useful extra layer of safety — as long as we interpret it carefully.

It can miss things when:

The AI doesn’t “think out loud.”

The steps it takes are buried inside complex, unreadable layers.

The AI intentionally hides or rewrites its own reasoning.

What can CoT monitoring still do well (for now)?

CoT monitoring can help detect issues like misbehavior, misalignment, or bias. For example, an AI might explicitly state “Let’s hack” in its reasoning before taking harmful actions — making early detection possible. It helps to catch early signs of misalignment as an AI’s goals might not match ours, and monitoring helps detect that before anything goes wrong. And eventually, it is significantly easier to spot and fix bugs and biases. With CoT monitoring, we have the ability to review how the AI thinks and developers could understand if it made an error or tried to cheat.

What the researchers are asking for:

Our understanding of CoT monitorability is still in its infancy. There are many research questions that require answers before we can have confidence in chain of thought monitoring as a safety measure.

More research on how to make AI reasoning visible and understandable.

Better tools to track how AI thinks.

Regular testing of new models to see if we can still monitor them.

Transparency in how companies score their models’ “monitorability.”

We currently have a rare advantage — we can see and understand parts of AI’s thinking. But that might not last. If AI becomes harder to read, it also becomes harder to control. That’s why this group of researchers says we need to act now to keep AI understandable, safe, and accountable.

From Your Site Articles

        
        More For You
        
    

        Zuckerberg to testify in $8 billion privacy trial
    
Sreedevi N R
Jul 16, 2025
Meta CEO Mark Zuckerberg is expected to take the witness stand in a Delaware courtroom later this week.

        AFP via Getty Images
    
The CEO and co-founder of Meta, Mark Zuckerberg, is scheduled to testify in court as a star witness. The allegation is that Facebook operated unlawfully by allowing user data to be used without proper consent. 
The case was filed by shareholders of Meta Platforms — the parent company of Facebook, Instagram, and WhatsApp — as well as by other current and former company leaders. They are accused of repeatedly violating a 2012 agreement between Facebook and the Federal Trade Commission to protect users’ data. 
Keep ReadingShow less

        
        Most Popular
        
    
Entertainment
        Ana de Armas opens up about motherhood dreams amid growing romance rumors with Tom Cruise
    
Vibhuti Pathak
Jul 10, 2025
Entertainment
        Janhvi Kapoor serves style in Miu Miu checkered midi dress at Wimbledon 2025 semi-finals
    
Vibhuti Pathak
Jul 12, 2025
News
        India loses $120 million a month to southeast Asia-based cyber frauds
    
Vibhuti Pathak
Jul 14, 2025
News
        Indian student denied US visa over Reddit account visibility glitch
    
Vibhuti Pathak
Jul 17, 2025
News
        Tennis player Radhika Yadav shot dead by father over Instagram reel in Gurugram
    
Vibhuti Pathak
Jul 11, 2025
News
        Indian-American from West Virginia tied to HOPE clinic pleads guilty in Opioid overdose case
    
Vibhuti Pathak
Jul 16, 2025
NewsletterSubscribe to our weekly newsletter here
By subscribing, you agree to our Terms & Conditions.
View Terms & Conditions

        Indian teen’s space tech startup Apolink raises $4.3 million
    
Sreedevi N R
Jul 15, 2025
Apolink team, with founder Onkar Singh Batra second from Left

        Apolink
    
Started by a 19-year-old Indian entrepreneur, Onkar Singh Batra, Apolink has raised a significant amount in its seed round. The startup raised $4.3 million in its seed round, including support from Y Combinator, 468 Capital, Unshackled Ventures, and other investors. It was met with more investors than predicted. Previously known as Bifrost Orbital, Apolink aims to build a real-time connectivity network. It would be beneficial for satellites in Low Earth Orbit (LEO) to maintain communication with the ground. Put into simpler words, it runs on a mission to connect space to the ground.  
The idea behind Apolink addresses a problem space companies still face today. When satellites move into areas that are not in the line of sight of a ground station, it makes them go offline. While some solutions exist to reduce downtime, the problem remains partially solved. The startup works with the aim of providing 24/7 connectivity to LEO satellites. The technology is designed to allow 256 users to handle 9.6kbps communication from space in each orbital ring.  
Keep ReadingShow less

        Perplexity’s Comet takes aim at Google
    
Sreedevi N R
Jul 10, 2025
Perplexity Comet is designed with the company's AI powered search engine at its core.

        Perplexity
    
Perplexity, the free AI-powered answer engine, has launched a new web browser called Comet, directly targeting Google. The Nvidia-backed startup said on Wednesday that it had launched Comet, a new web browser with AI-powered search capabilities, as the startup looks to challenge the dominance of market leader Alphabet's Google.  
Google Chrome holds a commanding 68% share of the global browser market, according to a June report by StatCounter. It cements its position and remains far ahead of Safari, Microsoft Edge, and Firefox.  
Keep ReadingShow less

        Musk claims Grok 4 is ‘better than PhD level in everything’
    
Sreedevi N R
Jul 10, 2025
Elon Musk on Grok 4

        Elon Musk (Photo: Getty images)
    
The SpaceX and Tesla founder’s artificial intelligence venture, xAI, has officially launched Grok 4, the latest version of the AI chatbot. Musk announced the update during a livestream on X (previously known as Twitter) along with the xAI team members. They presented the tool’s upgraded capabilities and outlined the ambitious goals expected from it. 
Musk optimistically declares, “Grok 4 is postgraduate—like PhD level—in everything. Better than PhD. No exceptions.” He went on to claim, “Most PhDs would fail where Grok 4 would pass.” While acknowledging the fact that AI may occasionally struggle with common-sense logic, which is evident in recent instances of hate speech and antisemitism, Musk emphasized that its grasp of advanced academic and technical subjects is unparalleled. 
Keep ReadingShow less

        Microsoft, OpenAI & Anthropic launch $23M AI training initiative for US teachers
    
Sreedevi N R
Jul 09, 2025
Third grade students practice grammar at the Green Mountain School on February 18, 2021 in Woodland, Washington.

        Getty Images
    
Leading tech companies have teamed up with two teachers’ unions on a mission to train 400,000 K–12 educators over the next five years through a new training center that will help educators integrate artificial intelligence tools into classrooms across the US. 
The $23 million initiative, announced by the National Academy of AI Instruction, is backed by Microsoft, OpenAI, Anthropic, the American Federation of Teachers (AFT), and the New York-based United Federation of Teachers (UFT). The new academy project, with Microsoft serving as the single biggest backer, will develop AI training curriculum for teachers that can be distributed online and on-campus in Manhattan, New York. 
Keep ReadingShow less
Load More

Site Navigation

"AI will soon hide things from humans": Top researchers warn of troubling times ahead

Top researchers urge action as AI systems grow less transparent in how they reason.

Pete Davidson and Elsie Hewitt expecting first child, share playful pregnancy reveal

Chris Martin accidentally outs alleged affair at Coldplay concert, internet goes wild

The Latest

Massive crypto lobbying push ahead of key house votes

Alaska earthquake shakes coast, triggers tsunami alert near Sand Point

DOJ fires prosecutor handling Jeffrey Epstein and Diddy

Coca-Cola may bring back cane sugar in the US, says Trump

The Top 5

Indian tycoon’s estranged wife gets £60 million in divorce settlement

Trump backs Russia sanctions bill that proposes 500% tariff on India

Trump hits Mexican tomatoes with 17% tariff

Ramayana set to become biggest indian movie ever with $500M budget

Indian-American Subway owner orchestrated massive U-visa fraud with Louisiana police chiefs

Discover More

More For You

Zuckerberg to testify in $8 billion privacy trial

Most Popular

Ana de Armas opens up about motherhood dreams amid growing romance rumors with Tom Cruise

Janhvi Kapoor serves style in Miu Miu checkered midi dress at Wimbledon 2025 semi-finals

India loses $120 million a month to southeast Asia-based cyber frauds

Indian student denied US visa over Reddit account visibility glitch

Indian-American from West Virginia tied to HOPE clinic pleads guilty in Opioid overdose case

Indian teen’s space tech startup Apolink raises $4.3 million

Perplexity’s Comet takes aim at Google

Musk claims Grok 4 is ‘better than PhD level in everything’

Microsoft, OpenAI & Anthropic launch $23M AI training initiative for US teachers

Latest Stories

Start your day right!

"AI will soon hide things from humans": Top researchers warn of troubling times ahead

Top researchers urge action as AI systems grow less transparent in how they reason.

The Latest

More For You

Most Popular

Newsletter

Subscribe to our weekly newsletter here