Perfectly imperfect: AI voice companions evolve beyond ChatGPT with unsettling realism

A new conversational AI called Sesame is raising eyebrows with its uncannily human-like speech patterns, complete with hesitations, self-corrections, and natural interruptions. Unlike traditional AI assistants that simply convert text to speech, Sesame’s breakthrough Conversational Speech Model (CSM) generates speech in a way that mirrors authentic human conversation, potentially marking a significant shift in how we interact with AI systems.

The big picture: Sesame represents a departure from conventional AI voice assistants by deliberately incorporating human imperfections rather than striving for polished perfection.

How it works: Sesame’s Conversational Speech Model combines text and audio processing into a single unified system, enabling more natural speech generation.

  • Unlike the voice modes of ChatGPT and Gemini, which the source article describes as generating text first and then converting it to speech, Sesame creates speech directly, with human-like pauses, tonal shifts, and filler words.
  • The system can interrupt conversations, apologize for interruptions, and even change its “mind” mid-sentence, mirroring natural human speech patterns.

Key features: The AI demonstrates sophisticated conversational abilities that go beyond traditional voice assistants.

  • It produces natural chuckles when saying something mildly amusing.
  • The system incorporates thoughtful pauses before responding to questions.
  • It seamlessly handles interruptions in both directions, creating more authentic dialogue.

Why this matters: Sesame’s ability to replicate human speech imperfections so accurately raises important questions about the future of AI-human interactions and the increasing difficulty of distinguishing between human and AI voices.

Looking ahead: Sesame remains a niche technology for now, but its development points to a future in which callers may need to verify whether the voice on the other end of a phone call is human or AI.

I tried the most realistic AI voice companion ever created - if ChatGPT or Gemini ever gets this good, reality is in trouble
