OpenAI enhances Realtime API with new voices and lower prices

OpenAI expands Realtime API capabilities: OpenAI has introduced significant updates to its Realtime API, enhancing its speech-to-speech functionality and reducing costs for developers.

The update includes five new voices for speech-to-speech applications, among them Ash, Verse, and Ballad, the last of which has a British accent.
These new voices are described as more expressive and steerable compared to previous versions.
The native speech-to-speech feature offers low latency and nuanced output, which could make responsive voice assistants easier to build.
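As a rough illustration of how a developer might select one of these voices, the sketch below builds a JSON event that sets the session voice. It assumes the `session.update` event shape from the beta Realtime API documentation and assumes that voice identifiers are the lowercase forms of the names above. The helper name `build_voice_update` is hypothetical, and the event format may change while the API is in beta.

```python
import json

# Voices named in the article. Lowercase identifiers are an assumption.
ARTICLE_VOICES = ("ash", "verse", "ballad")


def build_voice_update(voice: str) -> str:
    """Return a JSON string for an event that sets the session voice.

    The event shape ("session.update" with a nested "voice" field) is assumed
    from the beta documentation and is not confirmed by the article.
    """
    if voice not in ARTICLE_VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    event = {"type": "session.update", "session": {"voice": voice}}
    return json.dumps(event)


if __name__ == "__main__":
    # A server-side connection would send this string to the API.
    print(build_voice_update("ballad"))
```

Only the three voices named in the article are listed here; the remaining two would be added to `ARTICLE_VOICES` in the same way.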

Cost reduction through prompt caching: OpenAI has implemented a pricing strategy that significantly lowers the cost of using the Realtime API.

Cached text inputs now receive a 50% discount.
Cached audio inputs are discounted by 80%.
This pricing model relies on prompt caching, which retains frequently repeated contexts and prompts so that the model does not have to reprocess the same input on every request, lowering the associated costs.
Prior to this update, pricing was set at $0.06 per minute of audio input and $0.24 per minute of audio output.
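To make the discounts concrete, here is a minimal arithmetic sketch of how a cached share of input changes its cost. It uses the 50% and 80% discounts quoted above. Treating the pre-update $0.06 per minute audio input figure as the base rate is an illustrative assumption, since actual billing is token-based and the share of input that hits the cache varies by application. The function name `effective_input_cost` is hypothetical.

```python
# Cache discounts quoted in the article. The keys are labels for this sketch.
CACHE_DISCOUNT = {"text": 0.50, "audio": 0.80}


def effective_input_cost(base_cost: float, cached_fraction: float, kind: str) -> float:
    """Return the input cost after discounting the cached share.

    base_cost: cost of the input with no caching at all
    cached_fraction: share of the input served from the cache (0.0 to 1.0)
    kind: "text" or "audio"
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    discount = CACHE_DISCOUNT[kind]
    uncached_part = base_cost * (1.0 - cached_fraction)
    cached_part = base_cost * cached_fraction * (1.0 - discount)
    return uncached_part + cached_part


if __name__ == "__main__":
    # Assumption: 10 minutes of audio input at $0.06 per minute, half of it cached.
    base = 10 * 0.06
    print(round(effective_input_cost(base, 0.5, "audio"), 4))
```

Under these assumptions, half of the input is billed at full price and the other half at one fifth of it, so the total falls from $0.60 to about $0.36.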

Current state and limitations of the Realtime API: While the API offers promising features, it’s important to note its current status and potential challenges.

The Realtime API is currently in beta, indicating ongoing development and potential for further improvements.
Client-side authentication is not yet available, so browser and mobile applications would likely need to relay requests through a backend server.
Real-time audio processing depends on network conditions, so latency or dropouts may degrade the user experience in some cases.

Implications for enterprise applications: The enhancements to the Realtime API could have significant implications for businesses looking to implement advanced voice-based systems.

The improvements in speech-to-speech AI technology could enable enterprises to build more sophisticated and responsive real-time voice response systems.
The cost reductions make it more feasible for a wider range of businesses to implement these advanced voice technologies.

Past controversies and ethical considerations: OpenAI’s history with AI voices highlights the need for careful consideration in this technology’s development and use.

The company previously paused the use of a voice similar to Scarlett Johansson’s, indicating sensitivity to potential ethical issues surrounding AI-generated voices.
This history underscores the importance of responsible development and deployment of AI voice technologies.

Looking ahead, potential impact and adoption: The updates to the Realtime API represent a significant step forward in making advanced AI voice technology more accessible and affordable.

The combination of improved functionality and reduced costs could accelerate the adoption of AI-powered voice assistants across various industries.
As the technology continues to evolve, it will be crucial to monitor its impact on user experiences, privacy considerations, and the broader implications for human-AI interaction.

OpenAI expands Realtime API with new voices and cuts prices for developers

VentureBeat
