×
OpenAI enhances Realtime API with new voices and lower prices
Written by
Published on
Join our daily newsletter for breaking news, product launches and deals, research breakdowns, and other industry-leading AI coverage
Join Now

OpenAI expands Realtime API capabilities: OpenAI has introduced significant updates to its Realtime API, enhancing its speech-to-speech functionality and reducing costs for developers.

  • The update includes five new voices for speech-to-speech applications, with names like Ash, Verse, and Ballad, the latter having a British accent.
  • These new voices are described as more expressive and steerable compared to previous versions.
  • The native speech-to-speech feature boasts low latency and nuanced output, potentially revolutionizing the development of voice assistants.

Cost reduction through prompt caching: OpenAI has implemented a pricing strategy that significantly lowers the cost of using the Realtime API.

  • Cached text inputs now receive a 50% discount.
  • Cached audio inputs are discounted by an impressive 80%.
  • This pricing model leverages prompt caching, which keeps frequently requested contexts and prompts in the model’s memory, reducing token generation and associated costs.
  • Prior to this update, pricing was set at $0.06 per minute of audio input and $0.24 per minute of audio output.

Current state and limitations of the Realtime API: While the API offers promising features, it’s important to note its current status and potential challenges.

  • The Realtime API is currently in beta, indicating ongoing development and potential for further improvements.
  • Client-side authentication is not yet available, which may impact certain implementation scenarios.
  • Real-time audio processing may face issues due to network conditions, potentially affecting the user experience in some cases.

Implications for enterprise applications: The enhancements to the Realtime API could have significant implications for businesses looking to implement advanced voice-based systems.

  • The improvements in speech-to-speech AI technology could enable enterprises to build more sophisticated and responsive real-time voice response systems.
  • The cost reductions make it more feasible for a wider range of businesses to implement these advanced voice technologies.

Past controversies and ethical considerations: OpenAI’s history with AI voices highlights the need for careful consideration in this technology’s development and use.

  • The company previously paused the use of a voice similar to Scarlett Johansson’s, indicating sensitivity to potential ethical issues surrounding AI-generated voices.
  • This history underscores the importance of responsible development and deployment of AI voice technologies.

Looking ahead: Potential impact and adoption: The updates to the Realtime API represent a significant step forward in making advanced AI voice technology more accessible and affordable.

  • The combination of improved functionality and reduced costs could accelerate the adoption of AI-powered voice assistants across various industries.
  • As the technology continues to evolve, it will be crucial to monitor its impact on user experiences, privacy considerations, and the broader implications for human-AI interaction.
OpenAI expands Realtime API with new voices and cuts prices for developers

Recent News

Baidu reports steepest revenue drop in 2 years amid slowdown

China's tech giant Baidu saw revenue drop 3% despite major AI investments, signaling broader challenges for the nation's technology sector amid economic headwinds.

How to manage risk in the age of AI

A conversation with Palo Alto Networks CEO about his approach to innovation as new technologies and risks emerge.

How to balance bold, responsible and successful AI deployment

Major companies are establishing AI governance structures and training programs while racing to deploy generative AI for competitive advantage.