×
OpenAI enhances Realtime API with new voices and lower prices
Written by
Published on
Join our daily newsletter for breaking news, product launches and deals, research breakdowns, and other industry-leading AI coverage
Join Now

OpenAI expands Realtime API capabilities: OpenAI has introduced significant updates to its Realtime API, enhancing its speech-to-speech functionality and reducing costs for developers.

  • The update includes five new voices for speech-to-speech applications, with names like Ash, Verse, and Ballad, the latter having a British accent.
  • These new voices are described as more expressive and steerable compared to previous versions.
  • The native speech-to-speech feature boasts low latency and nuanced output, potentially revolutionizing the development of voice assistants.

Cost reduction through prompt caching: OpenAI has implemented a pricing strategy that significantly lowers the cost of using the Realtime API.

  • Cached text inputs now receive a 50% discount.
  • Cached audio inputs are discounted by an impressive 80%.
  • This pricing model leverages prompt caching, which keeps frequently requested contexts and prompts in the model’s memory, reducing token generation and associated costs.
  • Prior to this update, pricing was set at $0.06 per minute of audio input and $0.24 per minute of audio output.

Current state and limitations of the Realtime API: While the API offers promising features, it’s important to note its current status and potential challenges.

  • The Realtime API is currently in beta, indicating ongoing development and potential for further improvements.
  • Client-side authentication is not yet available, which may impact certain implementation scenarios.
  • Real-time audio processing may face issues due to network conditions, potentially affecting the user experience in some cases.

Implications for enterprise applications: The enhancements to the Realtime API could have significant implications for businesses looking to implement advanced voice-based systems.

  • The improvements in speech-to-speech AI technology could enable enterprises to build more sophisticated and responsive real-time voice response systems.
  • The cost reductions make it more feasible for a wider range of businesses to implement these advanced voice technologies.

Past controversies and ethical considerations: OpenAI’s history with AI voices highlights the need for careful consideration in this technology’s development and use.

  • The company previously paused the use of a voice similar to Scarlett Johansson’s, indicating sensitivity to potential ethical issues surrounding AI-generated voices.
  • This history underscores the importance of responsible development and deployment of AI voice technologies.

Looking ahead: Potential impact and adoption: The updates to the Realtime API represent a significant step forward in making advanced AI voice technology more accessible and affordable.

  • The combination of improved functionality and reduced costs could accelerate the adoption of AI-powered voice assistants across various industries.
  • As the technology continues to evolve, it will be crucial to monitor its impact on user experiences, privacy considerations, and the broader implications for human-AI interaction.
OpenAI expands Realtime API with new voices and cuts prices for developers

Recent News

NiCE rebrands as human-centered AI platform to augment customer service

New CEO Scott Russell steers away from automation hype toward measurable outcomes.

We tested 3 AI travel agents and here’s what actually works

Ideal for brainstorming trips, but you'll still need to handle the actual bookings.