
OpenAI Voice Engine: AI Technology Generating Natural-Sounding Speech to be Released Soon

OpenAI shared preliminary insights and results from a small-scale preview of a model called Voice Engine, which uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker.


By ETV Bharat Tech Team

Published : Mar 30, 2024, 9:27 AM IST


Hyderabad: OpenAI, the maker of ChatGPT, is entering the voice assistant space and showing off a new technology that it says can replicate a person’s voice, but the company won't yet release it publicly due to safety concerns.

OpenAI unveiled its Voice Engine technology on Friday, about a week after filing a trademark application for the name with the U.S. Patent and Trademark Office (USPTO). According to OpenAI, the technology can reproduce someone’s voice from only 15 seconds of recorded speech. The company plans to preview the technology with early testers, but won’t release it to the public at this time due to the risk of misuse.

"We recognize that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year," the San Francisco company said in a statement.

"We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build," the company added.

Authorities in New Hampshire are investigating a batch of robocalls that targeted thousands of voters in the run-up to the state’s presidential primary and featured an artificial intelligence-generated voice resembling that of President Joe Biden. Some startups already offer voice cloning technology, some of it available to the general public and some only to select business customers such as entertainment studios.

OpenAI said, however, that the early Voice Engine testers have agreed not to impersonate a person without their consent and to disclose that the voices are AI-generated. "Finally, we have implemented a set of safety measures, including watermarking to trace the origin of any audio generated by Voice Engine, as well as proactive monitoring of how it's being used," OpenAI said.

The company, best known for its chatbot and the image-generator DALL-E, took a similar approach in announcing but not widely releasing its video-generator Sora.


Copyright © 2024 Ushodaya Enterprises Pvt. Ltd., All Rights Reserved.