CO/AI
Friday · June 19, 2026 · Issue No. 900
Video

Meta just dropped Llama-4, but it does NOT look good…


# Meta’s Llama 4 Release: Impressive Claims, Mixed Reality

Meta recently released its Llama 4 family of AI models, but the reception has been mixed despite some impressive technical specifications. The release has sparked both excitement and controversy in the AI community.

## The Llama 4 Family: Massive but Specialized

Meta introduced three new mixture-of-experts (MoE) models:

– **Llama 4 Scout**: 109 billion total parameters, 17 billion active parameters, and 16 experts
– **Llama 4 Maverick**: 400 billion total parameters, 17 billion active parameters, and 128 experts
– **Llama 4 Behemoth**: 2 trillion total parameters, 288 billion active parameters, and 16 experts (still in training)

Unlike traditional dense models, where every parameter is used for every token, MoE models route each token through only a fraction of their parameters. This lets a model store more knowledge in its total parameter count while keeping the compute per token closer to that of a much smaller model.
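To make the "active parameters" idea concrete, here is a minimal sketch that computes the share of each model's parameters used per token from the figures listed above, plus a toy top-k router. The router is a generic illustration of MoE routing, not Meta's implementation; the names `active_fraction` and `route_token` and the example gate scores are hypothetical.

```python
import math

# Parameter counts in billions, taken from the list above: (total, active).
MODELS = {
    "Llama 4 Scout": (109, 17),
    "Llama 4 Maverick": (400, 17),
    "Llama 4 Behemoth": (2000, 288),
}


def active_fraction(total_b, active_b):
    """Share of a model's parameters used for any single token."""
    return active_b / total_b


def route_token(gate_logits, k=1):
    """Toy top-k router (hypothetical, not Meta's code).

    Picks the k experts with the highest gate scores and returns
    (expert_index, weight) pairs whose weights sum to 1.
    """
    top = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:k]
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


if __name__ == "__main__":
    for name, (total_b, active_b) in MODELS.items():
        print(f"{name}: {active_fraction(total_b, active_b):.1%} active per token")
    # Four hypothetical experts; only the two best-scoring ones run.
    print(route_token([0.1, 2.0, -1.0, 0.5], k=2))
```

By these figures, Maverick uses only a few percent of its parameters on any one token, while Scout and Behemoth use roughly one seventh.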

## Headline Features

– **Massive context window**: Llama 4 Scout supports a 10 million token context window, about 78 times larger than the windows of most open models
– **Multimodal capabilities**: Built from the ground up with text and image understanding (but can’t generate images)
– **High benchmark scores**: Meta claims an impressive Elo score of 1417 on LM Arena
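The "78 times" figure can be sanity-checked with a line of arithmetic. The article does not say which baseline it compares against; the sketch below assumes a 128,000-token window, which is an assumption chosen because it reproduces the stated ratio.

```python
SCOUT_CONTEXT_TOKENS = 10_000_000   # from the article
ASSUMED_BASELINE_TOKENS = 128_000   # assumption: baseline is not stated in the article


def context_ratio(window_tokens, baseline_tokens):
    """How many times larger one context window is than another."""
    return window_tokens / baseline_tokens


print(context_ratio(SCOUT_CONTEXT_TOKENS, ASSUMED_BASELINE_TOKENS))  # 78.125
```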

## The Controversies

Despite the impressive specifications, several issues have emerged:

1. **Poor instruction following**: Independent testers found Llama 4 models frequently failing to follow even basic instructions, such as counting letters in a word

2. **Benchmark discrepancies**: Significant gaps between Meta’s claimed benchmark results and independent testing

3. **Misleading marketing**: Some community members felt the marketing around “active parameters” was intentionally misleading

4. **Context quality issues**: On Fiction.LiveBench (which tests comprehension rather than search ability), performance declined dramatically after just 400 tokens

5. **Different benchmark models**: Meta acknowledged using a “version optimized for conversations” for LM Arena benchmarks, not the actual released models
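Letter counting, mentioned in the first item, is easy to verify mechanically, which is part of why it is a popular spot check. A minimal sketch of such a check follows; the function names and the sample word are hypothetical, and the article does not describe how the independent testers actually ran their tests.

```python
def count_letter(word, letter):
    """Ground-truth count of a letter in a word, ignoring case."""
    return word.lower().count(letter.lower())


def check_model_answer(word, letter, model_answer):
    """True if the number the model reported matches the real count."""
    return model_answer == count_letter(word, letter)


# Hypothetical case: a model claims "strawberry" contains two r's.
print(count_letter("strawberry", "r"))           # 3
print(check_model_answer("strawberry", "r", 2))  # False
```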

## Current Strengths

Despite the criticisms, Llama 4 does excel in vision understanding capabilities, performing well on vision
