TL;DR

  • OpenAI Voice Engine is a text-to-speech model that can clone a realistic voice from just a 15-second audio sample
  • It produces emotive, natural-sounding speech despite using a small model and only a single 15-second reference sample
  • Access has remained in limited preview since its 2024 announcement due to responsible AI concerns around voice cloning and impersonation
  • Approved testers must obtain clear consent from voice providers and inform listeners that voices are AI-generated
  • As of 2026, the technology is restricted to approved partners and researchers rather than general availability

About

OpenAI’s Voice Engine is a text-to-speech tool that can create realistic voices from just a 15-second audio sample. Notably, it does this with a small model and a single sample, yet the resulting voices are emotive and realistic. To ensure responsible use, testers must get clear consent from voice providers, must not let individual users create their own voices, and must inform listeners that the voices are AI-generated.

Status & Access

Voice Engine has remained in limited preview since its 2024 announcement. OpenAI has been cautious about broader deployment because of the risks of synthetic voice generation, particularly voice cloning used for impersonation.

As of 2026, access is still restricted to approved partners and researchers, with clear consent and transparency requirements.

YouTube

OpenAI Introducing: A New Era of Human-like AI Voices