OpenAI introduces Flex processing for cheaper, slower AI tasks

Reuters

OpenAI has unveiled a new option called Flex processing, an API service designed to provide more affordable AI model usage in exchange for slower response times and occasional resource unavailability.

This new feature, available in beta for OpenAI's recently released o3 and o4-mini reasoning models, aims to cater to lower-priority and non-production tasks such as model evaluations, data enrichment, and asynchronous workloads.
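Based on OpenAI's public API conventions, opting into Flex is reported to come down to selecting a service tier on the request. A minimal sketch of what such a request body might look like, assuming a `service_tier` field on the Chat Completions-style endpoint and the model name `o3` (the exact SDK surface may differ):

```python
import json

# Hypothetical request body for a Flex-tier call, assuming the API
# accepts a "service_tier" field alongside the usual parameters.
payload = {
    "model": "o3",
    "service_tier": "flex",  # opt into cheaper, slower Flex processing
    "messages": [
        {"role": "user", "content": "Classify the sentiment of this review: ..."}
    ],
}

print(json.dumps(payload, indent=2))
```

Because Flex trades speed for price, callers would typically pair a request like this with longer client-side timeouts and retry logic for the occasional resource-unavailability errors the beta documentation warns about.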

Flex processing cuts API costs in half, making it an attractive option for businesses and developers seeking to reduce spending on non-urgent AI tasks. For instance, the cost of using the o3 model through Flex is $5 per million input tokens (approximately 750,000 words) and $20 per million output tokens, compared to the standard prices of $10 and $40, respectively. For the o4-mini model, Flex pricing drops to $0.55 per million input tokens and $2.20 per million output tokens, down from $1.10 and $4.40.
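The halved rates compound quickly at batch scale. A small sketch of the arithmetic, using only the per-million-token prices quoted above (the workload sizes are illustrative):

```python
# Prices in USD per million tokens, as quoted above: (input, output).
PRICES = {
    "o3":      {"standard": (10.00, 40.00), "flex": (5.00, 20.00)},
    "o4-mini": {"standard": (1.10, 4.40),   "flex": (0.55, 2.20)},
}

def job_cost(model: str, tier: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a job at the given service tier."""
    in_price, out_price = PRICES[model][tier]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

# Example: a batch evaluation pushing 10M input and 2M output tokens through o3.
standard = job_cost("o3", "standard", 10_000_000, 2_000_000)  # 100 + 80 = 180.0
flex = job_cost("o3", "flex", 10_000_000, 2_000_000)          # 50 + 40 = 90.0
print(standard, flex)  # → 180.0 90.0
```

Since both the input and output rates are halved, the total for any workload is exactly half the standard price, regardless of the input/output mix.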

This move comes as OpenAI faces increased competition from rival AI companies, such as Google, which recently launched its Gemini 2.5 Flash reasoning model. Gemini 2.5 Flash offers performance comparable to DeepSeek's R1 at a lower input token cost, underscoring the industry's shift toward more budget-friendly AI options.

Additionally, OpenAI announced that developers in usage tiers 1-3 will need to complete a new ID verification process to access the o3 model. The verification is part of OpenAI's effort to keep its services from being misused by bad actors and to prevent violations of its usage policies.

With the introduction of Flex processing, OpenAI is positioning itself to remain competitive in the rapidly evolving AI landscape, offering cost-effective solutions for developers working with non-critical tasks while maintaining the integrity of its services.
