Meta Llama 4 Scout and Maverick AI Models With MoE Architecture Released


Meta launched the first artificial intelligence (AI) models in the Llama 4 family on Saturday. The Menlo Park-based tech giant released two models, Llama 4 Scout and Llama 4 Maverick, with native multimodal capabilities to the open community. The company says these are its first open models built with a Mixture-of-Experts (MoE) architecture. Compared to their predecessors, they come with larger context windows and better power efficiency. Alongside, Meta also previewed Llama 4 Behemoth, the largest AI model in the family unveiled so far.

In a blog post, the tech giant detailed its new AI models. Just like the previous Llama models, Llama 4 Scout and Llama 4 Maverick are open-source AI models and can be downloaded via Meta's Hugging Face listing or the dedicated Llama website. Starting today, users can also try the Llama 4 AI models in WhatsApp, Messenger, Instagram Direct, and on the Meta.AI website.

Llama 4 Scout is a 17-billion-active-parameter model with 16 experts, while the Maverick model comes with 17 billion active parameters and 128 experts. Scout is said to be able to run on a single Nvidia H100 GPU. Additionally, the company claimed that the previewed Llama 4 Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several benchmarks. Meta said the Behemoth model, with 288 billion active parameters and 16 experts, was not released as it is still being trained.


The MoE architecture in the Llama 4 AI models
Photo Credit: Meta

Coming to the architecture, the Llama 4 models are built on an MoE design. The MoE architecture activates only a fraction of the total parameters based on the input prompt, which makes it more compute-efficient for both training and inference. In the pre-training phase, Meta also used new techniques such as early fusion, to integrate text and vision tokens simultaneously, and MetaP, to set critical model hyper-parameters and initialisation scales.
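The core idea behind MoE, activating only a few "expert" sub-networks per token, can be illustrated with a minimal sketch. This is a generic top-k routing example, not Meta's actual implementation; all function names, dimensions, and the top_k value are illustrative:

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, top_k=1):
    """Illustrative MoE layer: route a token through only the top-k experts.

    Llama 4's real routing details are not public beyond the blog post;
    this only shows why most parameters stay inactive per token.
    """
    logits = x @ gate_w                       # router scores, one per expert
    top = np.argsort(logits)[-top_k:]         # indices of the top-k experts
    exp_scores = np.exp(logits[top] - logits[top].max())
    weights = exp_scores / exp_scores.sum()   # softmax over the selected experts
    # Only the chosen experts run; the other expert weights are untouched.
    return sum(w * (x @ expert_ws[i]) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, num_experts = 8, 16                        # 16 experts, as in Scout
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, num_experts))
expert_ws = [rng.normal(size=(d, d)) for _ in range(num_experts)]

y = moe_forward(x, gate_w, expert_ws, top_k=1)
print(y.shape)  # (8,)
```

With top_k=1, only one of the 16 expert matrices is multiplied per token, which is the source of the compute savings the company describes.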

For post-training, Meta chose to start the process with lightweight supervised fine-tuning (SFT), followed by online reinforcement learning (RL) and lightweight direct preference optimisation (DPO). This sequence was chosen so as not to over-constrain the model. The researchers also performed SFT on only 50 percent of the data, keeping the "harder" examples.
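DPO, the last step in that pipeline, tunes the model directly on preference pairs instead of training a separate reward model. A minimal sketch of the standard DPO loss for a single pair follows; the log-probability inputs and the beta value are illustrative, and nothing here reflects Meta's actual training settings:

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimisation loss for one preference pair.

    Pushes the policy to prefer the chosen response over the rejected one,
    measured relative to a frozen reference model.
    """
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# The loss shrinks as the policy favours the chosen response
# more strongly than the reference model does.
better = dpo_loss(logp_chosen=-1.0, logp_rejected=-5.0, ref_chosen=-2.0, ref_rejected=-2.0)
worse = dpo_loss(logp_chosen=-5.0, logp_rejected=-1.0, ref_chosen=-2.0, ref_rejected=-2.0)
print(better < worse)  # True
```

Being "lightweight", as Meta describes it, this stage nudges preferences without the heavier machinery of a learned reward model plus RL.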

Based on internal testing, the company claimed that the Maverick model outperforms Gemini 2.0 Flash, DeepSeek v3.1, and GPT-4o on the MMMU (image reasoning), ChartQA (image understanding), GPQA Diamond (reasoning and knowledge), and MTOB (long context) benchmarks.

On the other hand, the Scout model is said to outperform Gemma 3, Mistral 3.1, and Gemini 2.0 on the MMMU, ChartQA, MMLU (reasoning and knowledge), GPQA Diamond, and MTOB benchmarks.

Meta has also taken steps to make the AI models safer in both the pre-training and post-training processes. In pre-training, the researchers used data filtering methods to ensure harmful data was not added to the models' knowledge base. In post-training, they added open-source safety tools such as Llama Guard and Prompt Guard to protect the models from external attacks. Additionally, the researchers stress-tested the models internally and have allowed red-teaming of the Llama 4 Scout and Maverick models.

Notably, the models are available to the open community under a permissive Llama 4 licence. It permits both academic and commercial usage of the models; however, Meta does not allow companies with more than 700 million monthly active users to access its AI models.
