DeepSeek’s AI model R1 has become the first major large language model to undergo peer review, with researchers publishing details in Nature revealing the reasoning-focused system cost just $294,000 to train. The landmark study provides unprecedented transparency into how the Chinese startup created a model that rivals OpenAI’s offerings at a fraction of the cost, potentially reshaping expectations around AI development expenses and accessibility.

What you should know: The peer-reviewed paper confirms DeepSeek’s innovative approach to creating powerful AI without relying on competitor outputs.

  • R1 excels at reasoning tasks like mathematics and coding, competing directly with US-developed models while costing substantially less to develop.
  • The model has been downloaded 10.9 million times on Hugging Face, making it the most popular open-weight model on the platform.
  • Training R1 cost roughly $294,000, on top of about $6 million for the base model it was built from, far below the tens of millions typically spent on rival systems.

Technical breakthrough: DeepSeek used pure reinforcement learning rather than human-selected examples to teach R1 reasoning strategies.

  • The automated trial-and-error approach rewarded correct answers without prescribing specific reasoning tactics.
  • Using a technique called group relative policy optimization, the model boosted efficiency by scoring groups of its own attempts against one another rather than relying on a separate evaluator (see the sketch below).
  • This method allowed R1 to develop reasoning-like strategies independently, rather than copying human-prescribed approaches.

In plain English: Instead of teaching the AI how to think by showing it examples of human reasoning, DeepSeek let the AI figure out its own problem-solving methods through trial and error—like letting a student discover the best study techniques by experimenting rather than forcing them to follow a specific textbook approach.
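
To make the group-relative idea concrete, here is a minimal sketch in Python of how a GRPO-style training signal can be computed from a batch of sampled answers. This is an illustration under simplified assumptions, not DeepSeek's implementation; the correctness check, group size, and sample answers are hypothetical.

```python
import numpy as np

def correctness_reward(model_answer: str, reference: str) -> float:
    # Reward only the final answer; no credit or penalty for how the model reasoned.
    return 1.0 if model_answer.strip() == reference.strip() else 0.0

def grpo_advantages(rewards: list[float]) -> np.ndarray:
    # Group relative policy optimization scores each sampled attempt against the
    # mean and spread of its own group, rather than training a separate critic
    # model to supply a baseline.
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Hypothetical group of four sampled answers to one math prompt.
samples = ["42", "41", "42", "17"]
rewards = [correctness_reward(s, reference="42") for s in samples]
print(grpo_advantages(rewards))
# Attempts that beat the group average get positive weight in the policy update;
# the rest get negative weight, nudging the model toward whatever strategies worked.
```

In a full training loop these advantages would weight the policy update, but the point the paper stresses is that the only supervision is whether the final answer was right.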

Industry impact: The model has influenced virtually all reinforcement learning research in large language models throughout 2025.

  • “Almost all work in 2025 so far that conducts reinforcement learning in LLMs might have been inspired by R1 one way or another,” says Huan Sun, an AI researcher at Ohio State University.
  • Other researchers are now applying DeepSeek’s methods to improve existing models and extend reasoning capabilities beyond mathematics and coding.
  • Lewis Tunstall from Hugging Face, an AI community platform, describes R1 as having “kick-started a revolution” in AI reasoning approaches.

Addressing controversy: DeepSeek researchers explicitly denied training R1 on OpenAI model outputs, countering speculation that emerged after the model’s January release.

  • Media reports suggested OpenAI, the San Francisco-based company behind ChatGPT, believed DeepSeek had used their models’ outputs to accelerate R1’s development.
  • While R1’s base model was trained on web data that may include AI-generated content, researchers stated they didn’t copy reasoning examples from OpenAI models.
  • Independent replication attempts by other labs support DeepSeek’s claims that pure reinforcement learning alone can achieve high performance.

Setting precedent: R1 represents the first major language model to undergo rigorous peer review, establishing a new standard for AI transparency.

  • “This is a very welcome precedent,” says Tunstall, who reviewed the Nature paper, noting the importance of public evaluation for assessing AI system risks.
  • The peer-review process reduced anthropomorphizing language in the paper's descriptions and added technical clarifications about training data and safety measures.
  • Researchers hope other AI firms will follow suit with similar transparency practices.

Performance validation: Despite lower development costs, R1 remains highly competitive in scientific and reasoning tasks.

  • In ScienceAgentBench challenges involving data analysis and visualization, R1 ranked among the best models for balancing capability with cost-effectiveness.
  • The model was trained primarily on Nvidia H800 chips, which became restricted from sale to China under 2023 US export controls.
  • Current evidence suggests pure reinforcement learning can achieve very high performance without requiring access to competitor models.
