Ars Technica AI July 23, 2024 TERMINATOR

Meta makes GPT-4-class Llama 3.1 405B downloadable

Meta has released Llama 3.1 405B, an open-weights large language model it says rivals top AI systems in several capability areas. The release challenges closed model vendors, but it also renews a dispute over whether Meta should call restricted open-weights models “open source.”

A frontier-level open-weights model becoming downloadable modestly increases access to powerful AI capabilities, though the story is mostly about release and competition rather than concrete harm.

Meta’s Llama 3.1 405B puts a major AI capability shift in plain view: a GPT-4-class large language model that people can download for free and run on their own hardware. That does not mean it is a desktop-friendly tool. Meta says the model can run on a “single server node,” which still places it well above ordinary PC equipment.

What Meta Released

Llama 3.1 405B is the largest member of Meta’s new Llama 3.1 model lineup. The “405B” refers to 405 billion parameters, the numerical values in a neural network that store learned information.

In general, more parameters can support stronger performance because a larger neural network may make richer connections between concepts. The tradeoff is compute: larger models require more computing power to run.
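A back-of-the-envelope calculation helps show why 405 billion parameters put the model beyond ordinary PC equipment. The Python sketch below multiplies the parameter count from the article by an assumed number of bytes per parameter. The storage formats and function names are illustrative assumptions, not details from Meta, and the estimate covers the weights only, ignoring the extra memory a model needs while it runs.

```python
# Rough, weights-only memory estimate for a 405B-parameter model.
# Assumption: each parameter is stored at a fixed width; runtime memory
# (activations, attention cache) is deliberately ignored.

PARAMS = 405e9  # parameter count stated in the article

# Hypothetical storage formats and their bytes per parameter.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1, "int4": 0.5}


def weight_memory_gb(params, bytes_per_param):
    """Return the size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def estimate_all(params=PARAMS):
    """Return a dict mapping each assumed format to its weight size in GB."""
    return {name: weight_memory_gb(params, width)
            for name, width in BYTES_PER_PARAM.items()}


if __name__ == "__main__":
    for name, gigabytes in estimate_all().items():
        print(f"{name}: {gigabytes:,.1f} GB")
```

At 16 bits per parameter, the weights alone come to roughly 810 GB, which is far more memory than a consumer graphics card offers.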

Meta describes Llama 3.1 405B as “the first openly available model that rivals the top AI models when it comes to state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation.” Mark Zuckerberg also calls 405B “the first frontier-level open source AI model.”

The word “frontier” matters here. In the AI industry, a frontier model is meant to push current capability boundaries. Meta is placing Llama 3.1 405B in the same conversation as OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 1.5 Pro.

Why 405B Is Being Compared With Top Closed Models

Meta published benchmark charts suggesting that 405B comes close to GPT-4 Turbo, GPT-4o, and Claude 3.5 Sonnet on tests including MMLU, GSM8K, and HumanEval. Those benchmarks cover undergraduate-level knowledge, grade-school math, and coding, respectively.

Even so, benchmark results are not the whole story. These measures are not necessarily scientifically sound, and they do not capture the subjective experience of using a conversational AI model.

That gap is important for readers who do not live inside AI evaluation charts. A model can score well on a test and still feel different in everyday use, depending on how it responds, follows instructions, reasons through ambiguity, or handles conversation.

In the absence of Chatbot Arena data, Meta has also provided its own human evaluations of 405B outputs. Those results appear to show the model holding its own against GPT-4 Turbo and Claude 3.5 Sonnet. Early comments after the model leaked on 4chan yesterday also seem to match the idea that 405B is roughly equivalent to GPT-4.

The Training Scale Behind The Model

Llama 3.1 405B required a large training effort. Meta trained it on more than 15 trillion tokens of data scraped from the web, which was then parsed, filtered, and annotated by Llama 2.

The compute involved was also substantial. Meta used more than 16,000 H100 GPUs to train the model. Training time at that scale is expensive, and it takes a company with very deep pockets to pay for it.
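To get a feel for that scale, the figures above can be plugged into a common rule of thumb that estimates training compute as about 6 floating-point operations per parameter per token. That rule is an approximation that does not come from Meta's announcement, and the sustained per-GPU throughput below is a hypothetical placeholder rather than a measured value, so the Python sketch should be read as an order-of-magnitude illustration only.

```python
# Order-of-magnitude training compute estimate.
# Assumption: the "6 * parameters * tokens" rule of thumb for dense
# transformer training (forward plus backward pass). Not from the article.

PARAMS = 405e9    # parameters, from the article
TOKENS = 15e12    # "over 15 trillion" tokens, so treat as a lower bound
NUM_GPUS = 16_000  # "more than 16,000" H100s, also a lower bound

# Hypothetical sustained throughput per GPU in FLOP/s (placeholder value).
ASSUMED_FLOPS_PER_GPU = 4e14

SECONDS_PER_DAY = 86_400


def training_flops(params, tokens):
    """Approximate total training FLOPs using the 6 * N * D rule of thumb."""
    return 6 * params * tokens


def training_days(total_flops, num_gpus, flops_per_gpu):
    """Approximate wall-clock days if every GPU sustains flops_per_gpu."""
    seconds = total_flops / (num_gpus * flops_per_gpu)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    total = training_flops(PARAMS, TOKENS)
    days = training_days(total, NUM_GPUS, ASSUMED_FLOPS_PER_GPU)
    print(f"Estimated training compute: {total:.2e} FLOPs")
    print(f"Estimated duration under the assumed throughput: {days:.0f} days")
```

Changing the assumed throughput shifts the duration proportionally, which is why the result is only useful as a rough sense of scale.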

Llama 3.1 is not only the 405B model. Meta also upgraded its smaller 8B and 70B models. Those versions now include multilingual support and an extended context length of 128,000 tokens.

Context length is roughly a model’s working memory capacity. Tokens are chunks of data that large language models use to process information. A longer context can allow a model to work with larger inputs in a single interaction.
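As a rough illustration of what a 128,000-token limit means in practice, the Python sketch below estimates whether a piece of text would fit in the context window. It assumes about four characters per token, a crude heuristic for English text; a model's real tokenizer will give different counts, and the function names here are hypothetical.

```python
import math

# Context length for the Llama 3.1 models, as stated in the article.
CONTEXT_LIMIT = 128_000


def rough_token_count(text, chars_per_token=4.0):
    """Estimate tokens from character count.

    Assumption: roughly 4 characters per token. Real tokenizers vary by
    language and content, so treat the result as a ballpark figure.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text, reserved_for_output=0, limit=CONTEXT_LIMIT):
    """Return True if the estimated input plus reserved output tokens fit."""
    return rough_token_count(text) + reserved_for_output <= limit


if __name__ == "__main__":
    document = "word " * 50_000  # 250,000 characters of filler text
    print(rough_token_count(document), fits_in_context(document))
```

Reserving part of the window for the model's reply, through the `reserved_for_output` argument, reflects the fact that input and output share the same token budget.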

What Developers Can Use It For

Meta says 405B is useful for long-form text summarization, multilingual conversational agents, and coding assistants. It also highlights another use: creating synthetic data used to train future AI language models.

That last point is notable because the Llama 3.1 license is the first from Meta to officially permit using outputs from Llama models to improve other AI models.

For developers and companies, the practical appeal is control. An open-weights model can be downloaded, run, and fine-tuned rather than accessed only through a hosted subscription product or token-priced API.

That directly challenges companies such as OpenAI, which keep model weights private and sell access through products like ChatGPT or through API usage. Meta’s release therefore affects not only model capability, but also the business model around AI access.

The Open Source Dispute

The release also brings terminology into focus. Llama 3.1 405B is an open-weights model, meaning the trained neural network files can be downloaded, run, or fine-tuned.

That is not the same thing as “open source” in the traditional sense defined by the Open Source Initiative. The AI industry has not settled on a universal term for releases that provide code or weights with restrictions, or releases that do not provide training data.

Zuckerberg published a 2,300-word essay titled “Open Source AI Is the Path Forward.” In it, he argues for customizable AI models, user control, better data security, higher cost-efficiency, and better future-proofing compared with vendor-locked systems.

There is also a competitive reason for Meta’s position. Zuckerberg says open releases benefit Meta because he does not want companies like his to pay a toll for AI capabilities, comparing that risk to the “taxes” Apple levies on developers through its App Store.

Independent AI researcher Simon Willison objected to the wording even while liking Zuckerberg’s essay otherwise. “I see Zuck’s prominent misuse of ‘open source’ as a small-scale act of cultural vandalism,” Willison told Ars Technica. “Open source should have an agreed meaning. Abusing the term weakens that meaning which makes the term less generally useful, because if someone says ‘it’s open source,’ that no longer tells me anything useful. I have to then dig in and figure out what they’re actually talking about.”

The Llama 3.1 models are available through Meta’s own website and Hugging Face. Both require contact information and agreement to a license and acceptable use policy. That means Meta technically retains the legal ability to pull the rug out from under use of Llama 3.1 or its outputs at any time.