TechCrunch AI, December 6, 2024

Lower-cost Llama 3.3 70B raises Meta's open AI stakes

Meta has announced Llama 3.3 70B, a text-only generative AI model that it says can deliver Llama 3.1 405B-level performance at lower cost. The launch strengthens Meta's push around open models, while also highlighting licensing limits, regulatory pressure and rising compute demands.

Meta is adding another model to its Llama lineup, and the message is clear: it wants more performance to be easier and cheaper to run. The new model, Llama 3.3 70B, is positioned as a text-only system that can match the performance of Meta's much larger Llama 3.1 405B model at a lower cost.

The announcement matters because Meta has made Llama central to both its public AI strategy and its own products. Llama 3.3 70B is not just another technical update; it is part of a broader effort to make Meta's generative AI models widely used by developers, businesses and its own AI assistant.

What Meta says Llama 3.3 70B improves

Ahmad Al-Dahle, VP of generative AI at Meta, announced Llama 3.3 70B in a post on X. He described it as a new 70B model that delivers the performance of the company's 405B model while being easier and more cost-efficient to run.

"By leveraging the latest advancements in post-training techniques ... this model improves core performance at a significantly lower cost," Al-Dahle wrote.
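To give a rough sense of why a 70B model is easier to run than a 405B one, the sketch below estimates the memory needed just to hold each model's weights. It assumes 2 bytes per parameter (16-bit weights), which is an illustrative assumption and not a figure from Meta's announcement; the function name and constant are hypothetical, and real serving costs also depend on activations, the KV cache, batch size and hardware.

```python
# Rough weight-memory estimate for the two model sizes named in the article.
# ASSUMPTION: 2 bytes per parameter (16-bit weights). This is illustrative only
# and ignores activations, the KV cache and any quantization.
BYTES_PER_PARAM = 2  # assumed precision, not taken from Meta's announcement


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Return the approximate size of the model weights in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


llama_33_70b_gb = weight_memory_gb(70e9)     # Llama 3.3 70B
llama_31_405b_gb = weight_memory_gb(405e9)   # Llama 3.1 405B
size_ratio = llama_31_405b_gb / llama_33_70b_gb

print(f"70B weights:  ~{llama_33_70b_gb:.0f} GB")
print(f"405B weights: ~{llama_31_405b_gb:.0f} GB")
print(f"Ratio:        ~{size_ratio:.1f}x")
```

Under that assumption the larger model needs nearly six times the memory for weights alone, which is the kind of gap behind the "easier and more cost-efficient to run" claim.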

According to the source article, Al-Dahle also shared a chart showing Llama 3.3 70B beating Google's Gemini 1.5 Pro, OpenAI's GPT-4o and Amazon's newly released Nova Pro across a number of industry benchmarks. One of those benchmarks was MMLU (Massive Multitask Language Understanding), which tests a model's knowledge and reasoning across dozens of academic and professional subjects.

A Meta spokesperson said by email that the model should bring improvements in areas such as math, general knowledge, instruction following and app use. Those areas are practical signals for developers because they affect whether a model can follow user requests, reason through common tasks and support application workflows.

Llama 3.3 70B is available for download from Hugging Face and from other sources, including the official Llama website. That availability fits Meta's larger approach: pushing Llama as an open model family that can be used and commercialized across many kinds of applications.

The open model strategy still has limits

Meta often frames Llama around openness, but the source article makes clear that the model family is not open in the strictest possible sense. Meta's terms restrict how some developers can use Llama models.

The most concrete limit concerns very large platforms. Platforms with more than 700 million monthly users must request a special license before using Llama models under Meta's terms.

For many users and developers, that distinction may not be decisive. Meta says Llama has accumulated more than 650 million downloads, showing that the model family has already reached a broad audience despite those constraints.

The appeal is straightforward. If a model can deliver strong performance while costing less to run, it can widen the range of teams that are able to build with it. That is especially important in generative AI, where training and serving models require major investment in servers, data centers and network infrastructure.

Meta is also using Llama inside its own products

Llama is not only a developer-facing project. Meta uses Llama internally as the foundation for Meta AI, the company's AI assistant.

According to Meta CEO Mark Zuckerberg, Meta AI is powered entirely by Llama models and now has nearly 600 million monthly active users. Zuckerberg has also claimed that Meta AI is on track to become the most-used AI assistant in the world.

That gives the Llama project two roles at once. It is a model family that outside developers can download and build with, and it is also the technical base for a major consumer-facing assistant inside Meta's own ecosystem.

This dual role increases the stakes for each new Llama release. Improvements in model efficiency can matter to outside users, but they can also matter to Meta's own ability to operate AI features at scale.

Openness brings policy and security complications

The source article also shows the difficult side of Meta's open release strategy. In November, a report alleged that Chinese military researchers had used a Llama model to create a defense chatbot.

Meta responded by making its Llama models available to U.S. defense contractors. That response illustrates the complicated position Meta is in: broader availability can accelerate adoption, but it can also raise questions about who uses the models and for what purpose.

Regulation is another pressure point. Meta has said it is concerned about complying with the AI Act, the EU law that establishes a regulatory framework for AI. The company called the law's implementation too unpredictable for its open release strategy.

Meta is also facing issues tied to the GDPR, the EU's privacy law. The source article states that Meta trains AI models on public data from Instagram and Facebook users who have not opted out, and that in Europe this data is subject to GDPR guarantees.

Earlier this year, EU regulators asked Meta to pause training on European user data while they reviewed the company's GDPR compliance. Meta agreed to halt that training, while also backing an open letter calling for a modern interpretation of GDPR that does not reject progress.

Compute needs keep rising

Even as Meta promotes a lower-cost Llama model, the company is still spending heavily on AI infrastructure. The source article notes that Meta announced Wednesday it would build a $10 billion AI data center in Louisiana, described as the largest AI data center Meta has ever built.

Zuckerberg said on Meta's Q2 2024 earnings call that training the next major set of Llama models, Llama 4, will require 10x more compute than training Llama 3. Meta has also procured a cluster of more than 100,000 Nvidia GPUs for model development, a scale that rivals the resources of competitors such as xAI.

The financial pressure is already visible. Meta's capital expenditures rose nearly 33% to $8.5 billion in Q2 2024, up from $6.4 billion a year earlier, driven by spending on servers, data centers and network infrastructure.
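The "nearly 33%" figure can be checked directly from the two capital-expenditure numbers quoted above. The short sketch below does that arithmetic; the function and variable names are illustrative, not from the source.

```python
def percent_change(old: float, new: float) -> float:
    """Return the percentage change from old to new."""
    return (new - old) / old * 100


# Figures as quoted in the article, in billions of US dollars.
capex_year_earlier = 6.4
capex_q2_2024 = 8.5

capex_increase = percent_change(capex_year_earlier, capex_q2_2024)
print(f"Capex increase: {capex_increase:.1f}%")
```

The result lands just under 33%, consistent with the article's wording.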

That context makes Llama 3.3 70B more than a model announcement. It reflects the core tension in generative AI: companies want more capable systems, but the cost of building and running them keeps climbing. Meta's bet is that a more efficient Llama model can strengthen adoption now while the company prepares for even larger infrastructure demands ahead.