Mistral unveils Voxtral, open-source AI audio model

Technology Jul 15, 2025

French AI startup Mistral just launched Voxtral, its first family of open audio models.
Built for businesses but open to all, Voxtral promises high-quality speech recognition without the price tag or headaches of closed systems.

API beats OpenAI Whisper at less than half the price

Voxtral comes in two flavors: Small (24B parameters) for big jobs, and Mini (3B parameters) if you want to run things locally.
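According to Mistral, the dedicated transcription API, Voxtral Mini Transcribe, beats OpenAI Whisper at less than half the price.
Voxtral can transcribe audio files up to 30 minutes long and handle understanding tasks, such as summaries and question answering, on files up to 40 minutes long.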
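
For recordings longer than the 30-minute transcription limit, one option is to split the audio into consecutive segments before sending it. The sketch below only does the arithmetic; the function name and the fixed-length splitting strategy are illustrative assumptions, not anything Mistral prescribes, and a real pipeline would probably cut on pauses so words are not split.

```python
# Per-file transcription limit cited in the article: 30 minutes.
MAX_TRANSCRIPTION_SECONDS = 30 * 60


def chunk_spans(total_seconds: int, max_seconds: int = MAX_TRANSCRIPTION_SECONDS):
    """Return (start, end) offsets in seconds covering the whole recording.

    Hypothetical helper: fixed-length cuts, each at most max_seconds long.
    """
    if total_seconds < 0 or max_seconds <= 0:
        raise ValueError("total_seconds must be >= 0 and max_seconds must be > 0")

    spans = []
    start = 0
    while start < total_seconds:
        end = min(start + max_seconds, total_seconds)
        spans.append((start, end))
        start = end
    return spans


if __name__ == "__main__":
    # A 95-minute recording needs four segments: three full ones and a 5-minute tail.
    for start, end in chunk_spans(95 * 60):
        print(f"{start:>5}s to {end:>5}s")
```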

Supports 8 major languages; API costs $0.001 per minute

Voxtral supports eight major languages including English, Hindi, French, and Spanish.
The API is priced at $0.001 per minute, and you can try it for free on Hugging Face or in Mistral's chatbot.
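
To put the $0.001-per-minute figure in perspective, here is a rough cost estimate in Python. It assumes a flat rate billed on exact duration; whether the service rounds up to whole minutes or applies other tiers is not stated here, so treat the result as a ballpark.

```python
# Rate quoted in the article; actual billing rules (rounding, tiers) are an assumption.
PRICE_PER_MINUTE_USD = 0.001


def estimate_cost_usd(duration_seconds: float,
                      price_per_minute: float = PRICE_PER_MINUTE_USD) -> float:
    """Estimate transcription cost at a flat per-minute rate, with no rounding."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds / 60 * price_per_minute


if __name__ == "__main__":
    hours = 1000
    cost = estimate_cost_usd(hours * 3600)
    print(f"{hours} hours of audio: about ${cost:.2f}")
```

At that rate, 1,000 hours of audio (60,000 minutes) works out to roughly $60.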

Takes on ElevenLabs Scribe, GPT-4o-mini

Powered by Mistral Small 3.1 (known for its speed), Voxtral handles tasks like summarizing audio or answering questions from voice commands.
It's taking on ElevenLabs Scribe and GPT-4o-mini by being cheaper, offering open models, and giving users full control over deployment.