OpenAI releases first open-source models with Phi-like synthetic training

OpenAI has released its first open-source large language models, gpt-oss-120b and gpt-oss-20b, marking the company’s entry into the open-weight model space. While these models excel at certain benchmarks, they appear to follow the same synthetic data training approach as Microsoft’s Phi series, potentially prioritizing safety over real-world performance in what amounts to OpenAI’s version of “Phi-5.”

What you should know: These models demonstrate strong benchmark performance but show significant gaps in practical applications and out-of-domain knowledge.

  • The models perform well on technical benchmarks but struggle with tasks like SimpleQA and lack knowledge in areas like popular culture.
  • Early user reactions are mixed, with some praising their capabilities while others expressing disappointment on social media.
  • The author predicts these models will fall into the category of “performs much better on benchmarks than on real-world tasks.”

The Phi connection: Former Microsoft researcher Sebastien Bubeck, who developed the Phi model series, joined OpenAI at the end of 2024, suggesting a direct influence on these new models.

  • Phi models were trained exclusively on synthetic data—text generated by other language models or curated textbooks rather than scraped internet content.
  • This approach consistently produced impressive benchmark results but disappointing real-world performance across the Phi series.
  • Training on synthetic data allows developers to “teach to the test” by generating data that matches benchmark problems, inflating scores while reducing practical utility.

In plain English: Think of synthetic data training like studying for a test using only practice exams created by the test makers themselves. You’ll ace the official test but struggle with real-world problems that weren’t in those practice materials.
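To make the "teach to the test" concern concrete, here is a toy sketch (all names and templates are hypothetical, and this is not OpenAI's or Microsoft's actual pipeline): if the synthetic-data generator draws from the same templates a benchmark uses, the resulting training set mirrors the evaluation distribution, so benchmark scores overstate general ability.

```python
import random

# Hypothetical illustration of "teaching to the test" with synthetic data.
# The generator samples from the same question templates a benchmark uses,
# so a model trained on this data sees the eval distribution directly.

BENCHMARK_TEMPLATES = [
    ("What is {a} + {b}?", lambda a, b: str(a + b)),
    ("What is {a} * {b}?", lambda a, b: str(a * b)),
]

def generate_synthetic_pair(rng):
    """Produce one (question, answer) pair from a benchmark-style template."""
    template, solve = rng.choice(BENCHMARK_TEMPLATES)
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return template.format(a=a, b=b), solve(a, b)

def build_dataset(n, seed=0):
    """Build a deterministic synthetic training set of n pairs."""
    rng = random.Random(seed)
    return [generate_synthetic_pair(rng) for _ in range(n)]

if __name__ == "__main__":
    for question, answer in build_dataset(5):
        print(question, "->", answer)
```

A model trained only on pairs like these would score well on the matching benchmark yet know nothing outside the templates, which is the pattern the author describes for the Phi series.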

Why synthetic data matters for safety: OpenAI likely chose this approach to minimize risks associated with releasing open-source models.

  • Once released, open-source models can be fine-tuned by anyone to remove safety guardrails, creating permanent liability for the company.
  • Training on controlled synthetic data makes it easier to produce models that decline harmful requests and avoid learning problematic behaviors.
  • The author notes that “the main use-case for fine-tuning small language models is for erotic role-play,” highlighting safety concerns for companies releasing open models.

Strategic positioning: Unlike Meta, OpenAI doesn't need its open-source models to be exceptionally useful, since its primary business relies on closed-source offerings.

  • The company needed models that could beat Chinese open-source competitors on benchmarks while avoiding potential scandals.
  • This release allows OpenAI to claim participation in the open-source space without cannibalizing its core business model.
  • The synthetic data approach provides a safety buffer that traditional training methods cannot offer.

Source: OpenAI's new open-source model is basically Phi-5
