Meta just dropped Llama-4, but it does NOT look good...

# Meta’s Llama 4 Release: Impressive Claims, Mixed Reality

Meta recently released its Llama 4 family of AI models. Despite some impressive technical specifications, the reception has been mixed, and the release has sparked both excitement and controversy in the AI community.

## The Llama 4 Family: Massive but Specialized

Meta introduced three new mixture-of-experts (MoE) models:

– **Llama 4 Scout**: 109 billion parameters with 17 billion active parameters and 16 experts
– **Llama 4 Maverick**: 400 billion parameters with 17 billion active parameters and 128 experts
– **Llama 4 Behemoth**: 2 trillion parameters with 288 billion active parameters and 16 experts (still in training)
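To put these numbers side by side, the short sketch below computes what share of each model's parameters is active for a single token. The parameter counts are the ones listed above; the dictionary layout and the `active_fraction` helper are illustrative names, not anything from Meta's tooling.

```python
# Parameter counts in billions, copied from the model list above.
MODELS = {
    "Llama 4 Scout": {"total_b": 109, "active_b": 17, "experts": 16},
    "Llama 4 Maverick": {"total_b": 400, "active_b": 17, "experts": 128},
    "Llama 4 Behemoth": {"total_b": 2000, "active_b": 288, "experts": 16},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters used to process one token."""
    return active_b / total_b


for name, spec in MODELS.items():
    share = active_fraction(spec["total_b"], spec["active_b"])
    print(f"{name}: {share:.1%} of parameters active per token")
```

Scout and Maverick use the same 17 billion active parameters, so Maverick's larger total mostly adds capacity rather than per-token compute.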

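Unlike traditional dense models, where every parameter is used for each token, MoE models route each token through only a subset of their experts, so only a fraction of the parameters is active at any time. This lets a model store more knowledge in its total parameter count while keeping the compute per token closer to that of a much smaller dense model.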
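The routing idea can be illustrated with a toy example. This is a generic top-k routing sketch in plain Python, not Meta's implementation: the source does not describe how Llama 4 routes tokens, so `top_k=1`, the random router scores, and the scalar "experts" are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k=1):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def moe_layer(token, experts, router_scores, top_k=1):
    """Run only the selected experts and mix their outputs by router weight."""
    weights = route_token(router_scores, top_k)
    return sum(w * experts[i](token) for i, w in weights.items())


# 16 toy "experts"; each one just scales its input by a different factor.
experts = [lambda x, k=k: x * (k + 1) for k in range(16)]

# Hypothetical router scores for a single token (a real router learns these).
rng = random.Random(0)
scores = [rng.gauss(0, 1) for _ in experts]

selected = route_token(scores, top_k=1)
print(f"experts run for this token: {sorted(selected)} of {len(experts)}")
print("layer output:", moe_layer(2.0, experts, scores, top_k=1))
```

Only the selected experts are ever called, which is where the compute saving comes from: the other experts' parameters exist in memory but do no work for that token.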

## Headline Features

– **Massive context window**: Llama 4 Scout supports a 10 million token context window, roughly 78 times larger than the 128K-token windows common among open models
– **Multimodal capabilities**: Built from the ground up with text and image understanding (but can’t generate images)
– **High benchmark scores**: Meta claims an impressive Elo score of 1417 on LM Arena
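For readers unfamiliar with Elo-style scores, the sketch below shows how a rating gap translates into an expected head-to-head result under the standard Elo formula. How LM Arena actually computes its ratings is not covered in the source, so treat the formula as an assumption about the general scale; the opponent ratings are hypothetical values chosen only for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B (a win counts 1, a tie counts 0.5)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


CLAIMED_RATING = 1417  # the figure quoted above
HYPOTHETICAL_OPPONENTS = [1300, 1400, 1417]  # illustrative values only

for opponent in HYPOTHETICAL_OPPONENTS:
    p = elo_expected_score(CLAIMED_RATING, opponent)
    print(f"vs {opponent}: expected score {p:.2f}")
```

Under this formula, equal ratings give an expected score of 0.5, and small rating gaps translate into only modest preference margins.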

## The Controversies

Despite the impressive specifications, several issues have emerged:

1. **Poor instruction following**: Independent testers found Llama 4 models frequently failing to follow even basic instructions, such as counting letters in a word

2. **Benchmark discrepancies**: Significant gaps between Meta’s claimed benchmark results and independent testing

3. **Misleading marketing**: Some community members felt the marketing around “active parameters” was intentionally misleading

4. **Context quality issues**: On Fiction.LiveBench (which tests comprehension rather than search ability), performance declined dramatically after just 400 tokens

5. **Different benchmark models**: Meta acknowledged that its LM Arena results came from a “version optimized for conversations” rather than from the models it actually released

## Current Strengths

Despite the criticisms, Llama 4 does excel in vision understanding capabilities, performing well on vision
