OpenAI’s new API lets you build real-time voice apps — at a substantial premium

OpenAI expands developer offerings with real-time voice API: The company’s annual developer day introduced several new features, with the centerpiece being a real-time application programming interface (API) for voice interactions, albeit at a premium price point.

Real-time voice capabilities and pricing structure: OpenAI’s new API enables developers to create applications with fluid, real-time conversations between users and language models.

  • The real-time API is based on the GPT-4o large language model, which costs $2.50 per million input tokens and $10 per million output tokens for text-only interactions.
  • Text tokens in the real-time API cost twice as much: $5 per million input tokens and $20 per million output tokens.
  • Audio tokens carry a much higher premium: $100 per million input tokens and $200 per million output tokens.
  • OpenAI estimates that this pricing translates to approximately $0.06 per minute of audio input and $0.24 per minute of audio output for standard voice conversations.
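To make the arithmetic concrete, here is a minimal Python sketch that uses only the rates quoted above. The function names are illustrative, not part of any OpenAI library, and the tokens-per-minute figure is simply what the quoted prices imply, not an official number.

```python
# Real-time API rates quoted in the article, in US dollars per million tokens.
REALTIME_RATES = {
    "text_in": 5.00,
    "text_out": 20.00,
    "audio_in": 100.00,
    "audio_out": 200.00,
}

# OpenAI's per-minute estimates as quoted in the article, in US dollars.
AUDIO_IN_PER_MINUTE = 0.06
AUDIO_OUT_PER_MINUTE = 0.24


def token_cost(token_counts):
    """Dollar cost for a dict of token counts keyed like REALTIME_RATES."""
    return sum(
        REALTIME_RATES[kind] * count / 1_000_000
        for kind, count in token_counts.items()
    )


def audio_cost(minutes_in, minutes_out):
    """Dollar cost of audio alone, using the quoted per-minute estimates."""
    return minutes_in * AUDIO_IN_PER_MINUTE + minutes_out * AUDIO_OUT_PER_MINUTE


def implied_tokens_per_minute(dollars_per_minute, dollars_per_million):
    """Token rate implied by a per-minute price and a per-million-token price."""
    return dollars_per_minute / dollars_per_million * 1_000_000
```

Under these figures, a hypothetical call in which the user and the model each speak for five minutes costs about `audio_cost(5, 5)`, or roughly $1.50 in audio alone. The quoted prices also imply about 600 audio tokens per minute of input and about 1,200 per minute of output.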

Potential applications and cost-saving measures: The company showcased various use cases for real-time voice interactions while also introducing methods to reduce costs for developers.

  • Example applications include automated health coaches and language tutors that can engage in real-time conversations with users.
  • To help offset the higher costs, OpenAI introduced prompt caching, which reuses tokens from previously submitted inputs and halves the price of those cached GPT-4o input text tokens.
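The effect of the caching discount can be sketched in a few lines of Python. This assumes the 50% reduction applies only to the portion of the input that is served from the cache. How many tokens actually hit the cache depends on the application, so `cached_tokens` is a hypothetical input here.

```python
# GPT-4o text input rate quoted in the article, in US dollars per million tokens.
GPT4O_INPUT_PER_MILLION = 2.50
# Cached input tokens are billed at half price, per the article.
CACHE_DISCOUNT = 0.5


def input_cost(total_tokens, cached_tokens=0):
    """Dollar cost of text input when some tokens are served from the cache."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_tokens = total_tokens - cached_tokens
    cached_rate = GPT4O_INPUT_PER_MILLION * CACHE_DISCOUNT
    dollars_times_million = (
        uncached_tokens * GPT4O_INPUT_PER_MILLION + cached_tokens * cached_rate
    )
    return dollars_times_million / 1_000_000
```

With one million input tokens, the cost falls from $2.50 with no cache hits to $1.25 if every token is cached. A workload where half the tokens are cached lands in between, at about $1.88.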

LLM distillation and fine-tuning enhancements: OpenAI also unveiled new tools to help developers create more efficient and specialized models.

  • The LLM distillation service allows developers to use data from larger models to train smaller ones, streamlining a previously complex process.
  • Developers can now fine-tune models with image data, enabling more specific applications in various domains.
  • Food delivery service Grab demonstrated a practical application of image fine-tuning, using it to improve the mapping operations behind its delivery routes.

Pricing for new services: OpenAI provided detailed pricing information for its new offerings, maintaining a premium pricing structure.

  • Image fine-tuning is priced at $3.75 per million input tokens and $15 per million output tokens, matching standard fine-tuning rates.
  • Training image models comes at a higher cost of $25 per million tokens.
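As a rough budgeting aid, the sketch below combines the three fine-tuning rates quoted above into a single project estimate. The token counts in the usage note are hypothetical, and the function is an illustration, not an official calculator.

```python
# Fine-tuning rates quoted in the article, in US dollars per million tokens.
TRAINING_PER_MILLION = 25.00
FINE_TUNED_INPUT_PER_MILLION = 3.75
FINE_TUNED_OUTPUT_PER_MILLION = 15.00


def fine_tune_project_cost(training_tokens, input_tokens, output_tokens):
    """Dollar cost of one training run plus later use of the fine-tuned model."""
    dollars_times_million = (
        training_tokens * TRAINING_PER_MILLION
        + input_tokens * FINE_TUNED_INPUT_PER_MILLION
        + output_tokens * FINE_TUNED_OUTPUT_PER_MILLION
    )
    return dollars_times_million / 1_000_000
```

For example, a hypothetical project that trains on 2 million tokens and then processes 10 million input tokens and 1 million output tokens would cost about $102.50. The quoted inference rates are also 1.5 times the base GPT-4o text rates of $2.50 and $10 per million tokens.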

Broader implications for AI development: OpenAI’s new features make AI models more accessible and customizable for developers, but the premium pricing may limit widespread adoption.

  • The introduction of real-time voice capabilities could lead to more natural and engaging AI interactions across various industries.
  • However, the high costs associated with these new features may limit their use to larger companies or well-funded projects, potentially creating a divide in AI application development.
  • The emphasis on fine-tuning and distillation services suggests a trend towards more specialized and efficient AI models, which could lead to a wider range of targeted AI applications in the future.