Latest in Beyond generic benchmarks: How Yourbench lets enterprises evaluate AI models against actual data
Beyond generic benchmarks: How Yourbench lets enterprises evaluate AI models against actual data
Hugging Face warned that Yourbench is compute-intensive, but this might be a price enterprises are willing to pay to evaluate models on their data. VentureBeat - 1d -
Google’s Gemini 2.5 Pro is the smartest model you’re not using – and 4 reasons it matters for enterprise AI
Gemini 2.5 Pro marks a significant leap forward for Google in the foundational model race – not just in benchmarks, but in usability. Based on early experiments, benchmark data, and hands-on ...VentureBeat - 5d -
The TAO of data: How Databricks is optimizing AI LLM fine-tuning without data labels
New approach flips the script on enterprise AI adoption by using input data you already have for fine-tuning instead of needing labelled data.VentureBeat - Mar. 27 -
This Tool Probes Frontier AI Models for Lapses in Intelligence
A new platform from data training company Scale AI will let artificial intelligence developers find their models’ weak spots.Wired - 1d -
Anthropic scientists expose how AI actually ‘thinks’ — and discover it secretly plans ahead and sometimes lies
Anthropic has developed a new method for peering inside large language models (LLMs) like Claude, revealing for the first time how these AI systems process information and make decisions. The ...VentureBeat - Mar. 27 -
AI lie detector: How HallOumi’s open-source approach to hallucination could unlock enterprise AI adoption
Oumi's open-source HallOumi tool helps enterprises combat AI hallucinations through sentence-level verification that provides confidence scores, citations and human-readable explanations.VentureBeat - 15h -
Pentagon watchdog to evaluate Hegseth's use of Signal
The Pentagon inspector general's office said it would evaluate Hegseth's use of Signal to discuss strikes against the Houthis.CBS News - 14h -
Credit where credit’s due: Inside Experian’s AI framework that’s changing financial access
Experian's enterprise AI framework offers valuable lessons for businesses seeking to scale beyond proof of concept.VentureBeat - 6d -
Zencoder’s ‘Coffee Mode’ is the future of coding: Hit a button and let AI write your unit tests
Zencoder launches powerful AI coding agents with "Coffee Mode" that outperform competitors on benchmarks while integrating with existing developer environments, allowing programmers to be more ...VentureBeat - 2d -
A new, enterprise-specific AI speech model is here: Jargonic from aiOla claims to best rivals at your business’s lingo
The model’s architecture integrates keyword spotting directly into the transcription process, allowing Jargonic to maintain accuracy...VentureBeat - 4d -
The tool integration problem that’s holding back enterprise AI (and how CoTools solves it)
CoTools uses hidden states and in-context learning to enable LLMs to use more than 1,000 tools very efficiently.VentureBeat - 1d -
Gartner forecasts gen AI spending to hit $644B in 2025: What it means for enterprise IT leaders
Gartner forecasts large growth in global AI spending as enterprises shift focus to commercial tools away from custom projects that often fail.VentureBeat - 3d -
This Crazy Instrument Lets Us Hear How Dinosaurs Might Have Sounded
Using 3D models of ancient skulls, Dinosaur Choir gets us closer than ever to understanding the noises that dinosaurs made.Wired - 6d -
Trump administration announces plans to build AI data centers on federal land
The Trump administration identified 16 sites for the development of artificial intelligence (AI) data centers Thursday on land owned by the Department of Energy. The centers comprise rows of ...The Hill - 19h -
Amazon's Nova AI agent launch puts it up against rivals OpenAI, Anthropic
Amazon on Monday released a new AI model that can take actions in a web browser on a user’s behalf, a move that puts it in more direct competition with OpenAI, Anthropic and other companies that ...NBC News - 3d -
🤖 AI vs Humans: Predicting the results of Premier League Matchday 30
Let us know your predictions in the comments! Before the international break, we pitted our special guest - electronic artist Turno - against an AI to see who could most accurately predict the ...Yahoo Sports - 3d -
Calling all fashion models … now AI is coming for you
As fashion brands create AI ‘twins’ with models’ permission, some believe this is just another form of exploitation. The impact of AI has been felt across industries from Hollywood to publishing – ...The Guardian - 4d -
Airplane Accidents Are Making People Re-Evaluate How They Fly With Infants
Recent airplane accidents have fueled concerns about whether young children are sufficiently protected on flights and prompted parents and caregivers to re-evaluate how, and even whether, they ...The New York Times - Mar. 28 -
An AI Image Generator’s Exposed Database Reveals What People Really Used It For
An unsecured database used by a generative AI app revealed prompts and tens of thousands of explicit images—some of which are likely illegal. The company deleted its websites after WIRED reached out.Wired - 4d -
Metroid Prime 4: Beyond On Switch 2 Will Let You Swap To Mouse Controls On The Fly
Metroid Prime 4: Beyond will be released across both the Nintendo Switch and the upcoming Switch 2, with the Switch 2 edition sporting enhanced visuals. There is one other reason the Switch 2 ...GameSpot - 19h -
Why businesses judge AI like humans — and what that means for adoption
Enterprises adopting AI aren’t just signing a “utility contract” for revenue growth; they’re entering an “emotional contract.”VentureBeat - 5d -
OpenAI to release open-source model as AI economics force strategic shift
OpenAI plans to release its first open-weight AI model since 2019 as economic pressures mount from competitors like DeepSeek and Meta, marking a significant strategic reversal for the company ...VentureBeat - 3d -
Venture Capital Has Never Been This Obsessed With AI, New Data Shows
More than 70 percent of U.S. venture capital went toward AI investments during the first quarter of 2025, according to PitchBook.Inc. - 1d -
Anthropic announces updates on security safeguards for its AI models
Anthropic on Monday announced updates to the "responsible scaling" policy for its AI.CNBC - 3d -
4/1: CBS Morning News
President Trump prepares to unveil reciprocal tariffs; How to tell if an image is AI-generated.CBS News - 2d -
AI can predict when an earthquake might strike — here’s how
AI-enabled earthquake forecasting needs to be in Trump’s AI action plan, now.The Hill - 1d -
UK tech scheme includes AI tool to mark homework as ministers weigh selling data
Public records may be monetised as part of National Data Library within a decade, says science secretary Peter Kyle. Financial Times - 4d -
Postecoglou on VAR: Might as well let AI be ref
Tottenham Hotspur manager Ange Postecoglou bemoaned the time-consuming VAR process that saw Pape Sarr's goal chopped off in their 1-0 defeat at Chelsea on Thursday.ESPN - 32m -
Gladia launches Solaria as AI-based multi-lingual speech recognition model for speech-to-text transcription
Gladia, an AI transcription and audio intelligence provider, launched Solaria, a next-gen automatic speech recognition (ASR) model designed to redefine real-time communications for call centers and ...VentureBeat - 2d -
Energy Department Invites AI Development at Los Alamos and Other Federal Lands
The DOE says that there are 16 federal sites that would be ideal places for tech companies to build AI data centers.Inc. - 16h -
Uplimit raises stakes in corporate learning with suite of AI agents that can train thousands of employees simultaneously
Uplimit launches AI learning agents that help enterprises boost employee skills with 94% completion rates while reducing training admin time by 75%, addressing the growing AI-driven skills gap.VentureBeat - 1d -
Always Wanted a Personal Assistant? This AI Startup Wants to Help
Yutori wants to bring AI into the workplace by creating a digital personal assistant that can actually do some work for you at the office. It just won’t bring you coffee.Inc. - 3d -
Immutable RavenQuest becomes most-streamed Web3 game with 1M-plus streams
Tavernlight Games has launched RavenQuest on Immutable as a next-generation MMORPG that is setting new benchmarks for Web3 gaming adoption.VentureBeat - 2d -
Anime lessons in the limits of AI
Generative images show us the risks of endowing the technology with magical powers. Financial Times - 3d -
Arizona Supreme Court taps AI avatars to make the judicial system more publicly accessible
Arizona's highest court has created a pair of AI-generated avatars to deliver news of all rulings issued by the justices. ABC News - Mar. 18 -
Amazon's AGI Lab Reveals Its First Work: Advanced AI Agents
Led by a former OpenAI executive, Amazon’s AI lab focuses on the decisionmaking capabilities of next-generation software agents—and borrows insights from physical robots.Wired - 4d -
$40B into the furnace: As OpenAI adds a million users an hour, the race for enterprise AI dominance hits a new gear
In a move that surprised the tech industry Monday, OpenAI said it has secured a monumental $40 billion funding round led by SoftBank, catapulting its valuation to an unprecedented $300 billion -- ...VentureBeat - 3d -
Brian Daboll on Travis Hunter: Takes a long time to evaluate, there's a lot of tape
Giants General Manager Joe Schoen said this week that signing quarterback Russell Wilson leaves the team open to taking a player at any position with the third overall pick in the draft, but ...Yahoo Sports - 2d -
Nintendo Switch 2 Will Let You Transfer System Data From Switch 1
Nintendo pulled back the curtain on its anticipated Switch successor, the Nintendo Switch 2, during its lengthy April Nintendo Direct presentation. In addition to giving fans a closer look at the ...GameSpot - 1d -
Amazon's Nova AI agent launch puts it up against rivals OpenAI, Anthropic
The new tool, called Nova Act, will compete with other agentic AI tools launched by the likes of OpenAI and Anthropic.CNBC - 3d