Latest in Llamav O1
-
Open-source DeepSeek-R1 uses pure reinforcement learning to match OpenAI o1 — at 95% less cost
The company developed DeepSeek-R1 by applying pure reinforcement learning on top of DeepSeek-V3-Base, and the model matched or beat o1 on some benchmarks.
VentureBeat - 1d
-
Microsoft’s new rStar-Math technique upgrades small models to outperform OpenAI’s o1-preview at math problems
Phi-4 and an rStar-Math paper suggest that compact, specialized models can provide powerful alternatives to the industry’s largest systems.
VentureBeat - Jan. 9