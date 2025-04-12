

Unmodified Llama 4 Maverick ranks below rivals following Meta cheating allegations

Neowin

Meta recently released Llama 4, a new family of large language models consisting of Scout, Maverick, and Behemoth. In LMArena's benchmark rankings, Llama 4 Maverick (Llama-4-Maverick-03-26-Experimental) placed second, beating models like OpenAI's GPT-4o and Google's Gemini 2.0 Flash and trailing only Gemini 2.5 Pro.

But cracks soon began to show as users noticed differences in behavior between the Maverick used in benchmarks and the one available to the public. This led to accusations that Meta was cheating, which prompted a response from a Meta executive on X.

LMArena acknowledged that Meta failed to abide by its policies, apologized to the public, and issued a policy update.

Now, the unmodified release version of the model (Llama-4-Maverick-17B-128E-Instruct) has been added to LMArena, and it ranks 32nd. For the record, older models like Claude 3.5 Sonnet, released in June 2024, and Gemini-1.5-Pro-002, released in September 2024, rank higher.

Llama 4 Maverick Unmodified ranking on LMArena

In a statement to TechCrunch, a Meta spokesperson said that Llama-4-Maverick-03-26-Experimental was specially tuned for chat and performed well on LMArena, adding that the company is "excited" to see what developers will build now that an open source version of Llama 4 has been released.
