TechCrunch AI July 19, 2024 NEUTRAL

Why OpenAI is making GPT-4o mini smaller and cheaper

OpenAI is launching GPT-4o mini, a smaller version of its latest AI model. It is designed to be faster and more affordable than the full model, with current support for text and images and future plans for video and audio.

WTF Index NEUTRAL

◄ Terminator 0 Idiocracy 1 ►

This is mainly a routine product launch about a cheaper smaller model, with only a mild dependency angle from wider everyday AI use.

Why OpenAI is making GPT-4o mini smaller and cheaper

OpenAI is shrinking its latest AI model into a more lightweight option. The new model, GPT-4o mini, is built for developers who want AI features without taking on the cost and complexity of a full flagship system.

The move matters because not every AI task needs the largest available model. For many websites and apps, the practical question is simpler: can the model handle frequent, straightforward work quickly and affordably?

A smaller model for everyday AI workloads

GPT-4o mini is described as a pared down version of GPT-4o, OpenAI's latest flagship model. OpenAI announced GPT-4o back in May, and the “o” stands for “omni.” That name points to the model's intended ability to understand speech and video, as well as text.

The mini version narrows that idea into a smaller package. Instead of positioning it as the biggest or most capable model, OpenAI is making GPT-4o mini available for use cases where speed, cost, and scale are the main concerns.

Small AI models are meant to be faster and more affordable than full versions. That makes them especially useful for simple, high-volume tasks, where a developer may need to run many AI interactions but does not necessarily need the full capabilities of a flagship model each time.

Why cost matters for smaller developers

The clearest audience for GPT-4o mini is smaller developers. These teams may want to add AI to a website or app, but they may not have a large budget for ongoing AI costs.

For that kind of developer, a smaller model can change the calculation. If an AI feature is too expensive to run at volume, it may never make it into the product. If the model is cheaper and fast enough for the job, lightweight AI features become easier to consider.

TechCrunch frames GPT-4o mini as particularly relevant for simple, high-volume work. That could include AI tasks that are repeated often inside a product experience, as long as the task does not require the full model.

The key point is not that every developer should use the smallest model available. It is that OpenAI is adding a lower-cost option for cases where the full version may be more than the job requires.

What GPT-4o mini can do now

GPT-4o mini currently supports text and images. That gives it a narrower set of capabilities than the broader “omni” direction associated with GPT-4o, but it still covers core AI interactions that many apps and websites rely on.

OpenAI says it will add video and audio capabilities in the future. That would bring the mini model closer to the broader multimodal direction of GPT-4o, while keeping the focus on a smaller and more affordable model.

For now, the practical distinction is simple:

GPT-4o is the latest flagship model announced back in May.
GPT-4o mini is the smaller version now being launched.
GPT-4o mini currently supports text and images, with video and audio planned for later.

That staged rollout also shows how OpenAI is separating model size from model direction. The company is not only building larger systems. It is also creating smaller variants that can serve different product needs.

Replacing GPT 3.5 Turbo as the smallest option

GPT-4o mini is supposed to be more than 60% cheaper than GPT 3.5 Turbo. It is also replacing GPT 3.5 Turbo as OpenAI's smallest model.

That replacement is important because GPT 3.5 Turbo had been the lower-end option in OpenAI's lineup. By putting GPT-4o mini in that role, OpenAI is making the mini model the new entry point for developers looking for the smallest model from the company.

OpenAI is also pointing to performance. GPT-4o mini scores better than competing small models on the MMLU, which TechCrunch identifies as an industry benchmark for reasoning.

Cost alone would not be enough if the model could not handle useful work. The MMLU claim is part of the argument that GPT-4o mini is not only cheaper, but also competitive among small models on reasoning.

What this signals about AI products

GPT-4o mini reflects a broader product reality: AI models are being shaped for different levels of need. The full model may be the right choice for harder or richer interactions. A smaller model may be better for fast, frequent, lightweight features.

For developers, the decision is likely to come down to the task. If the work is simple and happens at high volume, a cheaper small model can be more practical. If the product depends on broader capabilities, the full model may still be the better fit.

OpenAI's launch of GPT-4o mini gives developers another option inside that tradeoff. It keeps the GPT-4o name connected to the company's latest model family, while offering a version meant for speed, affordability, and everyday use inside websites and apps.