Latest in Llamav O1

Open-source DeepSeek-R1 uses pure reinforcement learning to match OpenAI o1 — at 95% less cost

DeepSeek developed DeepSeek-R1 by applying pure reinforcement learning on top of DeepSeek-V3-Base, and the resulting model matched or beat o1 on some benchmarks.
VentureBeat - 1d

Microsoft’s new rStar-Math technique upgrades small models to outperform OpenAI’s o1-preview at math problems

Phi-4 and an rStar-Math paper suggest that compact, specialized models can provide powerful alternatives to the industry’s largest systems.
VentureBeat - Jan. 9