AI can fix bugs—but can’t find them: OpenAI’s study highlights limits of LLMs in software engineering

A new test from OpenAI researchers found that LLMs were unable to resolve some freelance coding tests, failing to earn full value.

Taking AI to the playground: LinkedIn combines LLMs, LangChain and Jupyter Notebooks to improve prompt engineering

LinkedIn's collaborative prompt engineering playground helps bridge the gap between engineers and product managers.
VentureBeat - Feb. 13
Replit and Anthropic’s AI just helped Zillow build production software—without a single engineer

Replit partners with Anthropic's Claude and Google Cloud to enable non-programmers to build enterprise software, as Zillow and others deploy AI-generated applications at scale, signaling a shift in ...
VentureBeat - 3d
‘Jekyll and Hyde Leadership’ Can Hurt Performance, According to a New Study. Here’s How to Fix It

Research shows that oscillating leadership creates confusion that leaves employees emotionally exhausted.
Inc. - 23h
AI can help fix potholes, but we need to get basics right first

Successful innovation abounds but a centralised approach to roll out tech across local government is required
Financial Times - Feb. 10
Wild fish can tell humans apart when they dress differently, study finds

Researchers say study, which involved training bream to follow a specific diver for treats, could change the way we treat fish. Wild fish can tell people apart – at least when they are wearing ...
The Guardian - 1d
Massive Stalker 2 Patch Has Over 1,700 Changes, Including Big Bug Fixes

Stalker 2's much-anticipated 1.2 patch is here, and it continues where the last few 1.1 updates left off, and then some. According to its announcement on Steam, Stalker's 1.2 patch has more than ...
GameSpot - Feb. 13
Out-analyzing analysts: OpenAI’s Deep Research pairs reasoning LLMs with agentic RAG to automate work — and replace jobs

OpenAI’s Deep Research pairs advanced reasoning LLMs with agentic RAG, delivering automated reports that rival human analysts — at a fraction of the cost. This breakthrough AI tool could redefine ...
VentureBeat - 2d
Software stocks are on fire. Here’s why the party can keep going this year.

Smaller software stocks have been on a tear this year, and the sector broadly stands to benefit from lower costs for AI.
MarketWatch - 3d
A Chinese paper finds ChatGPT — not DeepSeek — can generate stock-market returns. But the key insight doesn’t have anything to do with AI.

A new research paper produced in China finds that ChatGPT — and not domestic rival DeepSeek — can be used to forecast the stock market and the economy.
MarketWatch - 1d
