The DeepSeek-R1 work uses large-scale reinforcement learning (RL) to train reasoning models, starting without any supervised fine-tuning (SFT).

DeepSeek-R1-Zero, the initial model trained without SFT, has some limitations, such as poor readability and language mixing. To address these issues, the authors introduce DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL.

DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks.

https://arxiv.org/pdf/2501.12948

#deepseek #MachineLearning
Sounds like Donny's techfascist bffs hate competition @yogthos

"Donald Trump’s AI tsar has claimed there’s ‘substantial evidence’ that DeepSeek leaned on OpenAI’s models to develop its own technology." https://www.scmp.com/news/world/united-states-canada/article/3296667/microsoft-openai-investigate-chinas-deepseek-over-data-breach

At first I thought it said, perhaps correctly, 'learned on'😎

Re #TechFascists... a logical political progression from US #Libertarians, which describes most techies for the last couple of decades, or more. https://kafeneio.social/@heretical_i/113897993841616105
lol they're all very sour that their snake oil scam has been exposed for what it is
A half mil for a heads-up display helmet that only fits one. Why beg for "cost overruns" when you can simply bill it all at once? @yogthos 🤣