ElevenLabs CEO: Voice is the next interface for AI


ElevenLabs co-founder and CEO Mati Staniszewski says voice is becoming the next major interface for AI – the way people will increasingly interact with machines as models move beyond text and screens.

Speaking at Web Summit in Doha, Staniszewski told TechCrunch that voice models like those developed by ElevenLabs have recently moved beyond simply mimicking human speech, including emotion and intonation, to working in tandem with the reasoning capabilities of large language models. The result, he argued, is a shift in how people interact with technology.

In the years ahead, he said, “hopefully all our phones will go back in our pockets, and we can immerse ourselves in the real world around us, with voice as the mechanism that controls technology.”

That vision fueled ElevenLabs' $500 million raise this week at an $11 billion valuation, and it is increasingly shared across the AI industry. OpenAI and Google have both made voice a central focus of their next-generation models, while Apple appears to be quietly building voice-adjacent, always-on technologies through acquisitions like Q.ai. As AI spreads into wearables, cars, and other new hardware, control is becoming less about tapping screens and more about speaking, making voice a key battleground for the next phase of AI development.

Iconiq Capital general partner Seth Pierrepont echoed that view onstage at Web Summit, arguing that while screens will continue to matter for gaming and entertainment, traditional input methods like keyboards are starting to feel “outdated.”

And as AI systems become more agentic, Pierrepont said, the interaction itself will also change, with models gaining the guardrails, integrations, and context needed to respond with less explicit prompting from users.

Staniszewski pointed to that agentic shift as one of the biggest changes underway. Rather than requiring users to spell out every instruction, he said, future voice systems will increasingly rely on persistent memory and context built up over time, making interactions feel more natural and demanding less effort from users.


That evolution, he added, will influence how voice models are deployed. While high-quality audio models have largely lived in the cloud, Staniszewski said ElevenLabs is working toward a hybrid approach that blends cloud and on-device processing, a move aimed at supporting new hardware, including headphones and other wearables, where voice becomes a constant companion rather than a feature users deliberately switch on.

ElevenLabs is already partnering with Meta to bring its voice technology to products including Instagram and Horizon Worlds, the company’s virtual reality platform. Staniszewski said he would also be open to working with Meta on its Ray-Ban smart glasses as voice-driven interfaces expand into new form factors. 

But as voice becomes more persistent and embedded in everyday hardware, it raises serious concerns around privacy, surveillance, and how much personal data voice-based systems will store as they move closer to users' daily lives, an area where companies like Google have already faced accusations of abuse.


