Anthropic researchers forced Claude to become deceptive — what they discovered could save us from rogue AI

Anthropic researchers reveal groundbreaking techniques to detect hidden objectives in AI systems, training Claude to conceal its true goals before successfully uncovering them through innovative auditing methods that could transform AI safety standards.
Read more at VentureBeat
Topics
-
Anthropic’s stealth enterprise coup: How Claude 3.7 is becoming the coding agent of choice
Anthropic is positioning Claude as the LLM that matters most for enterprise companies. Claude 3.7 Sonnet, released just two weeks ago, set new benchmark records for coding performance.VentureBeat - 5d -
Animal poo can be used to save endangered species from extinction, research finds
Some cells are still alive within the dung, and could be used to boost genetic diversity in certain species. Turning animal poo into offspring sounds like a zoo keeper’s conjuring trick, but it ...The Guardian - 9h -
Anthropic just launched a new platform that lets everyone in your company collaborate on AI — not just the tech team
Anthropic launches upgraded Console with team prompt collaboration tools and Claude 3.7 Sonnet's extended thinking controls.VentureBeat - Mar. 6 -
Nous Research just launched an API that gives developers access to AI models that OpenAI and Anthropic won’t build
Nous Research's new API for "unrestricted" Hermes 3 and DeepHermes-3 AI models features toggle-on reasoning and a developer-first approach.VentureBeat - 4d -
Inching towards AGI: How reasoning and deep research are expanding AI from statistical prediction to structured problem-solving
GUEST: AI has evolved at an astonishing pace. What seemed like science fiction just a few years ago is now an undeniable reality. Back in 2017, my firm launched an AI Center of Excellence. AI was ...VentureBeat - 4h -
Is daylight saving time bad for your health? A neurologist explains.
Researchers are discovering that "springing ahead" each March for daylight saving time is connected with serious negative health effects.CBS News - Mar. 8 -
Anthropic Just Reached a New Revenue Milestone
Anthropic’s Claude 3.7 Sonnet is a game-changer for the company, which is partly owned by Google.Inc. - 4d -
Forcing women back into the office will cost us millions
New research from the University of Toronto's Rotman School of Management highlights how in-person work environments expose women to higher levels of workplace bias and mistreatment compared to ...The Hill - 5d -
What would change if daylight saving time became permanent?
For the next eight months, most of us will be observing daylight saving time. But what if this became our permanent time?The Hill - Mar. 9 -
Democrats Demand Answers on DOGE’s Use of AI
Members of the House Oversight Committee sent dozens of requests to federal agencies on Wednesday about their use of AI software—and how Elon Musk could benefit.Wired - 4d -
Out of the lab and into the streets, researchers and doctors rally for science against Trump cuts
Researchers, doctors, their patients and supporters are venturing out of labs, hospitals and offices across the country to stand up to what they call an attack on life-saving science by the Trump ...ABC News - Mar. 7 -
Over half of American adults have used an AI chatbot, survey finds
Artificial intelligence technology is becoming increasingly integral to everyday life. 52% of U.S. adults have used AI large language models like ChatGPT.NBC News - 3d -
3 Ways AI Can Create More Tailored Marketing Strategies for Retail Brands
Personalized marketing at scale used to be impossible. With AI, it’s become a reality.Inc. - 5d -
What Products Could Europe Levy in Retaliation to Trump’s Tariffs?
The European Union wants to force the United States to the negotiating table with retaliatory tariffs on a range of American products, including some from Republican strongholds.The New York Times - 4d -
This retirement strategy is a 'game changer' for single-income, married couples, advisor says
If you're a single-income, married couple, you could use a spousal IRA to save more for retirement. Here's what to know.CNBC - Mar. 7 -
Champagne and Parmigiano under threat from Trump’s tariffs
Producers warn Europe’s finest foods and wine could become unaffordable for ordinary US consumersFinancial Times - 6d -
Will AI save the UK government £45bn a year?
Ministers believe that increased use of artificial intelligence can help fix the public finances but experts have their doubtsFinancial Times - 2d -
Meet the 21-year-old helping coders use AI to cheat in Google and other tech job interviews
As artificial intelligence becomes more advanced, employers are trying to build workarounds to prevent candidates from cheating in virtual job interviews.CNBC - 6d -
Jobs lost and lifesaving cures not discovered: Possible impacts of research cuts
Ripple effects of the Trump administration's crackdown on U.S. medical research promise to reach every corner of AmericaABC News - Mar. 6 -
What are smartphones stealing from us? When mine was taken away, I found out | Alexander Hurst
As a Paris film extra, I surrendered my device and discovered the extraordinary connections I miss while staring at my screen. A few Thursdays ago was a wrap. For my brief acting career, that is. ...The Guardian - 6d -
Researchers Propose a Better Way to Report Dangerous AI Flaws
After identifying major flaws in popular AI models, researchers are pushing for a new system to identify and report bugs.Wired - 3d -
Grab co-founder Anthony Tan says ‘humans who don’t embrace AI will be replaced by AI’
Grab co-founder and CEO Anthony Tan discussed how workers can use AI to become more productive, at Converge Live in Singapore.CNBC - 3d -
A Groundbreaking Ship That Sank in Lake Superior in 1892 Is Discovered
After searching for two years, researchers discovered the shipwreck of the Western Reserve, an early all-steel ship that broke apart in a gale in 1892 with a sole survivor.The New York Times - 2d -
Majority of Americans have used AI models like ChatGPT: Survey
A majority of Americans have used ChatGPT-like artificial intelligence (AI) models according to a new survey. In the survey from Elon University’s Imagining the Digital Future Center, 52 percent ...The Hill - 3d -
What you need to know about Manus, the new AI agentic system from China hailed as a second ‘DeepSeek moment’
Manus AI is designed as a multi-agent system, meaning it combines several AI models to handle tasks independently.VentureBeat - 6d -
It Isn’t Just Trump. America’s Whole Reputation Is Shot.
What happens when a superpower goes rogue.The New York Times - 3d -
Cerebras just announced 6 new AI datacenters that process 40M tokens per second — and it could be bad news for Nvidia
Cerebras Systems is challenging Nvidia with six new AI data centers across North America, promising 10x faster inference speeds and 7x cost reduction for companies using advanced AI models like ...VentureBeat - 5d -
Science Amid Chaos: What Worked During the Pandemic? What Failed?
As the coronavirus spread, researchers worldwide scrambled to find ways to keep people safe. Some efforts were misguided. Others saved millions of lives.The New York Times - 2d -
Could Angela Prichard Have Been Saved?
A woman repeatedly called police for help when she was attacked, stalked, and intimidated by her husband. It didn't stop him from killing her. "48 Hours" contributor Jonathan Vigliotti reports.CBS News - 11h -
How South Korea’s shipyards could save the US Navy
Asian shipbuilders see opportunity to fill gap as Washington tries to keep pace with China’s naval expansionFinancial Times - 2d -
Hugging Face co-founder Thomas Wolf just challenged Anthropic CEO’s vision for AI’s future — and the $130 billion industry is taking notice
Hugging Face co-founder Thomas Wolf challenges Anthropic CEO Dario Amodei's "compressed 21st century" vision, arguing AI systems are building "yes-men on servers" rather than the revolutionary ...VentureBeat - Mar. 6 -
What kind of jobs will be impacted by AI?
Last week, online furniture retailer Wayfair announced it would increase its use of generative artificial intelligence and cut 340 tech jobs. It reflects an increase in businesses and companies ...CBS News - 4d -
France has a nuclear umbrella. Could its European allies fit under it?
President Macron has aired the idea that France's deterrence force could be used to defend other European countries.BBC News - Mar. 6 -
The US Army Is Using ‘CamoGPT’ to Purge DEI From Training Materials
Developed to boost productivity and operational readiness, the AI is now being used to “review” diversity, equity, inclusion, and accessibility policies to align them with President Trump’s orders.Wired - Mar. 6 -
Sony Uses Horizon's Aloy To Demonstrate New AI Tech, And It's About As Impressive As You'd Expect
Sony is testing out new AI software, and it's using a beloved character--Aloy from the Horizon series--to show it off. A YouTube video narrated by Sharwin Raghoebardajal--a software engineering ...GameSpot - 6d -
China could literally rewrite history using AI — and control the future
This isn’t just another policy debate — it’s a defining moment that will shape global discourse and the balance of power for decades to come. The technology we allow to thrive will either spread ...The Hill - 4d -
Can Matchmaking Platforms Save Us From Dating App Fatigue?
Big Dating got singles hooked on convenience culture. But finding a partner is work—and a batch of matchmaking services think they’ve cracked the code for partnership.Wired - Mar. 7 -
Navarro: US in 'difficult transition from Bidenomics to Trumpnomics'
Peter Navarro, a senior trade adviser to President Trump, said Wednesday that the U.S. is “in a difficult transition from Bidenomics to Trumpnomics.” “Help us understand, what is the bigger picture ...The Hill - 4d
More from VentureBeat
-
How Stack Up is still saving lives of veterans through gaming after a decade
Stephen Machuga, CEO of charity Stack Up, believes video games can help veterans feel like they're part of society.VentureBeat - 2h -
Launching your first AI project with a grain of RICE: Weighing reach, impact, confidence and effort to create your roadmap
A new framework inspired by the RICE scoring model balances business value, time-to-market, scalability and risk for your first AI project.VentureBeat - 1d -
GFAL unveils beta for Diamond Dreams match-3 game with optional Web3 features
GFAL is launching Diamond Dreams, a match-3 puzzle game with a luxury theme and optional Web3 features.VentureBeat - 2d -
OpenAI’s strategic gambit: The Agents SDK and why it changes everything for enterprise AI
OpenAI's new API and Agents SDK consolidate a previously fragmented complex ecosystem into a unified, production-ready framework. For enterprise AI teams, the implications are potentially profound: ...VentureBeat - 2d
More in Tech
-
The 11 Best Amazon Echo and Alexa Speakers (2025): We've Tested Them All
We’ve rounded up our favorite speakers that let you talk to Alexa, from the best Echo speakers to third-party options like the Sonos Era.Wired - 8h -
Metro 2033 Makes Killing Easy And Atrocity Quiet
Metro 2033 is celebrating its 15-year anniversary today, March 16, 2025. Below, we examine how its subtle morality system helped to illustrate its broader points about humanity and violence. In ...GameSpot - 8h -
The Best 3-in-1 Apple Charging Stations (2025)
Keep your iPhone, Apple Watch, and AirPods topped up with these WIRED-tested docking systems.Wired - 9h