Latest in Benchmarks
Beyond benchmarks: How DeepSeek-R1 and o1 perform on real-world tasks
o1 is slightly better at reasoning, but DeepSeek-R1 provides much more detail about its reasoning, which is very useful to the user.
VentureBeat - 5h
European Central Bank cuts benchmark rate by a quarter percentage point to boost stagnant economy
The European Central Bank is cutting its key interest rate, a step to boost an economy that’s struggling to grow as consumers burned by inflation warily eye price tags and businesses try to chart a ...
ABC News - 1d
Turkey's central bank lowers benchmark interest rate to 45%
Turkey’s central bank has lowered its key interest rate by 2.5 percentage points to 45%.
ABC News - Jan. 23
China keeps benchmark lending rates unchanged as it contends with a weakening yuan
Beijing contends with a weakening yuan while awaiting policy clues from the incoming Donald Trump administration.
CNBC - Jan. 20
Google DeepMind researchers introduce new benchmark to improve LLM factuality, reduce hallucinations
Based on a new benchmark, Google DeepMind found Gemini 2.0 Flash to be the most factual LLM, with a score of 83.6%.
VentureBeat - Jan. 10
Self-invoking code benchmarks help you decide which LLMs to use for your programming tasks
LLMs are good at coding simple functions. But how good are they at calling their own functions to solve complex problems?
VentureBeat - Jan. 10
Active mutual funds struggle to beat large-cap stock benchmarks — again
Professional stock pickers in the mutual-fund industry had a tough time in 2024 beating indexes that passively track U.S. large-cap equities, according to BofA Global Research.
MarketWatch - Jan. 8