Last update was over a year ago, so I hope (2025) gets added to the title:

> [2025/05/26] (Step 1 completed!) We release Mixture-of-Thoughts--a curated reasoning dataset of 350k verified traces distilled from R1. The dataset spans tasks in mathematics, coding, and science, and is designed to teach language models to reason step-by-step. We also provide a recipe to train OpenR1-Distill-7B, which replicates the reasoning capabilities of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B and marks the completion of step 1 in the Open R1 project.

Doesn't look like they managed to actually reproduce R1; they stopped at Step 1 of their 3-step plan.

If you really want to see fully open training pipelines for modern LLMs, Olmo and, to a lesser extent, Nemotron are what you should look at.

https://github.com/allenai/OLMo

https://github.com/NVIDIA-NeMo/Nemotron

Check out OpenThoughts. It has a widely used dataset, a model that beats DeepSeek's smaller reasoning models, and a paper that describes the data curation methodology in detail.

https://www.open-thoughts.ai/

What is the estimated cost these days to train something like this to conclusion?

"This will likely involve curating new, large-scale datasets for math, reasoning, and code." ... everybody likes to hand-wave on this.


Open Reproduction of DeepSeek-R1
