OpenAI focuses on audio AI – new hardware in sight

Audio is said to be the focus of OpenAI's announced but so far mysterious hardware. The internal team is being massively restructured.

Image: Jony Ive behind an OpenAI logo (Thrive Studios / Shutterstock.com)


Everything at OpenAI seems to revolve around audio. The internal team responsible for audio features and models has reportedly been expanded over the past two months. This suggests that the hardware the company has announced will be an audio-based device, which is not surprising.

As The Information reports, OpenAI has restructured internally to assign new teams to the development of audio models. They are all said to be working on an "audio-first" personal device. The mysterious but long-announced device is expected to be released in about a year.

ChatGPT can, of course, already speak with users, as can other AI chatbots. How this works varies. One option is a text-based model that processes the input and generates the output, then passes that output to a second model that converts the text to audio. Alternatively, a multimodal model can process audio directly, without this conversion step.

A person familiar with the development has reportedly told The Information that the model OpenAI has used for audio so far performs worse than the pure text model. Handing the text off to a second model for speech output takes time, which slows down the conversation. However, the source also reportedly speaks of initial successes with a new model specialized in audio.


Other companies are also focusing on audio. The motto seems to be: move away from the screen. Google, for example, is planning an audio search engine with Audio Overviews. For now, speaking with a computer works particularly well with smart glasses. Meta has had its Ray-Ban and Oakley versions on the market for a long time, while Google is lagging a bit behind with its new Glasses. In the meantime, several startups have also entered the market, for example with the Rokid Glasses and those from Even Realities.

However, OpenAI says its device is intended to be more than just glasses. It is assumed that, in addition to audio, it will also focus on uninterrupted operation. "Always on" is said to make AI hardware truly helpful – at least that's how Silicon Valley envisions it. Meta recently acquired the startup "Limitless", which also relies on a permanently listening pendant.

(emw)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.