Voxtral Small 24B
Voxtral Small 24B is Mistral's lightweight multimodal model that processes both text and audio with a 32K token context window.
API Pricing
Cheapest on OpenRouter — 27% below avg| Provider | Input / 1M | Output / 1M | Updated |
|---|---|---|---|
| $0.100 | $0.300 | 4/14/2026 | |
| $0.176 | $0.410 | 4/13/2026 |
Prices updated daily. Last check: 4/14/2026
Model Details
General
- Creator
- Mistral
- Family
- Voxtral
- Tier
- Lightweight
- Context Window
- 32K
- Modalities
- Text, Audio
Capabilities
- Tool Calling
- No
- Open Source
- No
Strengths & Limitations
- Supports both text and audio input modalities
- 32,000 token context window for substantial input capacity
- 24B parameter size balances capability with efficiency
- Lightweight tier positioning for cost-effective deployment
- Part of Mistral's Voxtral family with consistent API interfaces
- Enables audio transcription and analysis workflows
- Suitable for high-volume multimodal processing
- No tool calling or function execution support
- Proprietary model with weights not publicly available
- Smaller parameter count than flagship multimodal alternatives
- Limited to text and audio modalities only
- Lightweight tier may have reduced reasoning capabilities compared to larger models
Key Features
About Voxtral Small 24B
Common Use Cases
Voxtral Small 24B is well-suited for applications requiring efficient audio and text processing at scale. Common use cases include audio transcription services, voice-to-text applications, podcast analysis, customer service call processing, and content moderation for audio platforms. Its lightweight architecture makes it appropriate for high-volume scenarios where audio understanding is needed but computational budgets are constrained. The model works well for building voice interfaces, analyzing recorded meetings, and processing multimedia content where both spoken and written elements need to be understood together.
Frequently Asked Questions
How much does Voxtral Small 24B cost per million tokens?
Voxtral Small 24B pricing varies by provider and usage patterns. As a lightweight multimodal model, costs will differ for text versus audio processing. Check the pricing table above for current rates across all providers offering this model.
What is Voxtral Small 24B best used for?
Voxtral Small 24B excels at audio transcription, voice analysis, and applications requiring both text and audio understanding. Its lightweight design makes it ideal for high-volume scenarios like customer service call analysis, podcast processing, and voice interface development where efficiency is important.
Does Voxtral Small 24B support tool calling or function execution?
No, Voxtral Small 24B does not support tool calling or function execution capabilities. It focuses specifically on text and audio processing tasks. If you need function calling alongside multimodal capabilities, you would need to consider other models in Mistral's lineup or alternative providers.