
OpenAI launched two open-source artificial intelligence (AI) models on Tuesday. This marks the San Francisco-based AI firm’s first contribution to the open community since 2019, when GPT-2 was open-sourced. The two new models, dubbed gpt-oss-120b and gpt-oss-20b, are said to offer comparable performance to the o3 and o3-mini models. Built on the mixture-of-experts (MoE) architecture, the company says these AI models have undergone rigorous safety training and evaluation. The open weights of these models are available to download via Hugging Face.
In a post on X (formerly Twitter), OpenAI CEO Sam Altman announced the release of these models, highlighting that “gpt-oss-120b performs about as well as o3 on challenging health issues.” Notably, both the models are currently hosted on OpenAI’s Hugging Face listing, and individuals can download and locally run the available open weights.
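For readers who want to try the weights on their own hardware, the sketch below shows one possible way to do so with the Hugging Face `transformers` library. It is a minimal example, not OpenAI's official instructions; the repository ID, memory requirements, and library versions should be verified against the Hugging Face listing before use.

```python
# Minimal sketch: download and run the 20B open weights locally via transformers.
# Assumes recent transformers/accelerate versions and enough memory for the model;
# the "openai/gpt-oss-20b" repository ID is as shown on OpenAI's Hugging Face listing.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",   # weights are downloaded on first use
    device_map="auto",            # spread layers across available hardware
)

messages = [
    {"role": "user", "content": "Summarise the mixture-of-experts idea in two sentences."}
]
print(generator(messages, max_new_tokens=200)[0]["generated_text"])
```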
On its website, OpenAI explains that these models are compatible with the company’s Responses application programming interface (API) and can work within agentic workflows. These models also support tool use such as web search or Python code execution. With native reasoning, the models also display transparent chain-of-thought (CoT), which can be adjusted to focus on either high-quality responses or low-latency outputs.
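Because the models follow the Responses API format, a locally hosted copy can in principle be queried with the standard OpenAI Python SDK. The sketch below assumes an OpenAI-compatible server (for example, one started with vLLM or a similar runtime) is already serving the model at a local URL; the endpoint address, model name, and reasoning setting are illustrative, not prescribed by OpenAI.

```python
# Sketch: query a locally served gpt-oss model through the OpenAI Python SDK.
# Assumes an OpenAI-compatible server is already running at the URL below;
# the base_url, api_key and model name here are placeholders for illustration.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

response = client.responses.create(
    model="openai/gpt-oss-120b",
    reasoning={"effort": "low"},  # trade chain-of-thought depth for lower latency
    input="Explain in one paragraph what tool calling means for an AI model.",
)
print(response.output_text)
```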
Coming to the architecture, these models are built on the MoE design to reduce the number of active parameters for processing efficiency. The gpt-oss-120b activates 5.1 billion parameters per token, while gpt-oss-20b activates 3.6 billion parameters per token. The former has a total of 117 billion parameters and the latter has 21 billion parameters. Both models support a context length of 1,28,000 tokens.
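To put those figures in perspective, only a small fraction of each model's weights is active for any given token. The quick calculation below uses only the numbers quoted above.

```python
# Share of parameters active per token, using the figures quoted in the article.
models = {
    "gpt-oss-120b": {"total": 117e9, "active_per_token": 5.1e9},
    "gpt-oss-20b":  {"total": 21e9,  "active_per_token": 3.6e9},
}

for name, p in models.items():
    share = p["active_per_token"] / p["total"] * 100
    print(f"{name}: {share:.1f}% of parameters active per token")
# gpt-oss-120b: ~4.4% active; gpt-oss-20b: ~17.1% active
```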
These open-source AI models were trained on a mostly English-language text dataset. The company focused on Science, Technology, Engineering, and Mathematics (STEM) fields, coding, and general knowledge. In the post-training stage, OpenAI used reinforcement learning (RL)-based fine-tuning.
Benchmark performance of the open-source OpenAI models
Photo Credit: OpenAI
Based on the company’s internal testing, gpt-oss-120b outperforms o3-mini on competition coding (Codeforces), general problem solving (MMLU and Humanity’s Last Exam), and tool calling (TauBench). But in general, these models marginally fall short of o3 and o3-mini on other benchmarks such as GPQA Diamond.
OpenAI highlights that these models have undergone extensive safety training. In the pre-training stage, the company filtered out harmful data relating to chemical, biological, radiological, and nuclear (CBRN) threats. The AI firm also said that it used specific techniques to ensure the model refuses unsafe prompts and is protected against prompt injections.
Despite being open-source, OpenAI claims that the models have been trained in a way that they cannot be fine-tuned by a bad actor to produce harmful outputs.