ElevenLabs launches 'Scribe': stand-alone speech-to-text model for global languages

Reuters

By Elnur Mirzazada

February 27, 2025 20:00

AI startup ElevenLabs, known for its cutting-edge audio-generation technology, has unveiled its first stand-alone speech-to-text model, named Scribe.

The company, fresh off a $180 million funding round that valued it at $3.3 billion, is now expanding its technology portfolio to compete in the speech-to-text market.

Scribe supports over 99 languages at launch. More than 25 of them earn an “excellent” accuracy rating, defined as a word error rate of less than 5%. That group includes English, with a claimed accuracy rate of 97%, as well as French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese. The remaining languages are sorted into high, good, and moderate accuracy tiers based on their word error rates.
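For readers unfamiliar with the metric, word error rate (WER) is conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. The function names are illustrative, the sketch skips the text normalization that real benchmarks usually apply, and it is not ElevenLabs' evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic program).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def is_excellent(wer: float) -> bool:
    """The article's "excellent" tier: word error rate below 5%."""
    return wer < 0.05


# One wrong word out of four reference words gives a WER of 25%.
print(word_error_rate("the cat sat down", "the cat sat town"))
```

If accuracy is read as one minus WER, which is an assumption rather than something the company spells out here, the 97% figure claimed for English would correspond to a WER of roughly 3%.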

According to benchmark tests using the FLEURS and Common Voice datasets, Scribe outperformed competitors such as Google Gemini 2.0 Flash and OpenAI’s Whisper Large V3 across multiple languages. ElevenLabs had previously developed a speech-to-text component for its AI conversational agent platform, but Scribe is the first dedicated, stand-alone speech-to-text model the company has released.

CEO Mati Staniszewski told TechCrunch last month, “We want to understand what’s being said by you in a conversation better. We are working on ways to move away from only generating content and understanding and transcribing speech.” He noted that while many consider speech-to-text a solved problem, performance for many languages remains suboptimal. “We think we can build better speech detection models because we have in-house teams to annotate data and give us quick feedback,” Staniszewski added.

In addition to transcription, Scribe offers speaker diarization to identify who is speaking, word-level timestamps for precise subtitle generation, and automatic tagging of sound events such as audience laughter. The model currently works only on pre-recorded audio. ElevenLabs plans to release a low-latency real-time version in the near future, which would extend its use to meeting transcription and live voice note-taking.
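To show why word-level timestamps and speaker labels matter for subtitles, the Python sketch below groups timed words into subtitle cues in the widely used SRT format. The `Word` record and its fields are a hypothetical input shape chosen for illustration and do not represent Scribe's actual response format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical per-word record: text, start/end in seconds, speaker label."""
    text: str
    start: float
    end: float
    speaker: str


def srt_time(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT files."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(words: list[Word], max_words: int = 7) -> str:
    """Start a new cue when the speaker changes or the cue reaches max_words."""
    cues: list[list[Word]] = []
    current: list[Word] = []
    for word in words:
        if current and (len(current) >= max_words
                        or word.speaker != current[0].speaker):
            cues.append(current)
            current = []
        current.append(word)
    if current:
        cues.append(current)

    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(w.text for w in cue)
        timing = f"{srt_time(cue[0].start)} --> {srt_time(cue[-1].end)}"
        blocks.append(f"{index}\n{timing}\n{cue[0].speaker}: {text}")
    return "\n\n".join(blocks) + "\n"


words = [
    Word("Hello", 0.0, 0.4, "A"),
    Word("there", 0.5, 0.9, "A"),
    Word("Hi", 1.2, 1.5, "B"),
]
print(to_srt(words))
```

Each cue takes its start time from its first word and its end time from its last, which is the precision that per-word timing makes possible.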

Scribe is priced at $0.40 per hour of transcribed audio, although some rival services offer lower prices with different feature sets. For ElevenLabs, the launch extends its reach beyond audio generation into the speech-to-text market.
