Unmodified Llama 4 Maverick ranks below rivals following Meta cheating allegations

Meta recently released Llama 4, a new family of large language models consisting of Scout, Maverick, and Behemoth. On the LMArena leaderboard, Llama 4 Maverick (Llama-4-Maverick-03-26-Experimental) placed 2nd, beating models like OpenAI's GPT-4o and Google's Gemini 2.0 Flash and trailing only Gemini 2.5 Pro.

But cracks soon began to show as users noticed differences in behavior between the Maverick used in benchmarks and the one available to the public. This led to accusations that Meta was cheating, which prompted a response from a Meta executive on X.

LMArena acknowledged that Meta failed to abide by its policies, apologized to the public, and issued a policy update.

Now, the unmodified release version of the model (Llama-4-Maverick-17B-128E-Instruct) has been added to LMArena, and it ranks 32nd. For the record, older models like Claude 3.5 Sonnet, released last June, and Gemini-1.5-Pro-002, released last September, rank higher.

Llama 4 Maverick Unmodified ranking on LMArena

In a statement to TechCrunch, a Meta spokesperson said that Llama-4-Maverick-03-26-Experimental was specially tuned for chat and performed well on LMArena, adding that the company is "excited" to see what developers will build now that an open source version of Llama 4 has been released.
