Pixtral Large
Pixtral Large is Mistral's flagship multimodal model. It accepts text and image inputs and has a 131K-token context window.
API Pricing
| Provider | Input / 1M | Output / 1M | Speed | TTFT | Updated |
|---|---|---|---|---|---|
| | $2.00 | $6.00 | 59.0 t/s | 498ms | 4/14/2026 |
Prices updated daily. Last check: 4/14/2026
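The Speed and TTFT columns can be combined into a rough response-time estimate: time to first token plus output tokens divided by output speed. The sketch below uses the 59.0 t/s and 498ms figures from the table. The function name and the sample token count are illustrative assumptions, and real latency will vary with load, provider, and prompt size.

```python
# Figures taken from the pricing table above (4/14/2026 row).
OUTPUT_TOKENS_PER_SECOND = 59.0
TTFT_SECONDS = 0.498


def estimate_response_seconds(
    output_tokens: int,
    tokens_per_second: float = OUTPUT_TOKENS_PER_SECOND,
    ttft_seconds: float = TTFT_SECONDS,
) -> float:
    """Hypothetical helper: rough wall-clock time for one streamed response."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    # Wait for the first token, then stream the rest at a steady rate.
    return ttft_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Illustrative request: a 590-token answer.
    print(f"{estimate_response_seconds(590):.1f} s")
```

This treats output speed as constant, which is a simplification; it is meant for sizing timeouts and rough budgets, not for benchmarking.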
Model Details
General
- Creator: Mistral
- Family: Pixtral
- Tier: Flagship
- Context Window: 131K
- Modalities: Text, Image
Capabilities
- Tool Calling: No
- Open Source: No
Strengths & Limitations
Strengths
- Flagship-tier capabilities from Mistral for complex multimodal tasks
- 131,072-token context window supports lengthy documents with images
- Accepts both text and image inputs
- Output speed of roughly 53 to 59 tokens per second across the measurements reported on this page
- Time to first token of roughly 450 to 500 ms
- Positions Mistral competitively in the flagship multimodal model segment
Limitations
- No tool calling support, which limits agentic applications
- Proprietary model with no open source weights available
- Text and image modalities only, with no audio or video support
- Smaller context window than some competing flagship models
About Pixtral Large
Common Use Cases
Pixtral Large serves applications that need both visual and textual analysis. Typical uses include processing complex documents that combine charts, diagrams, and text; visual content moderation and analysis; multimodal research; and detailed image captioning or visual question answering. The 131K context window allows analysis of lengthy reports with embedded images, or of multiple images alongside extensive text. Organizations typically deploy it for high-value use cases where its multimodal reasoning justifies the cost of a flagship model, rather than for high-volume applications better suited to lighter alternatives.
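When combining long text with several images, it helps to check the total against the 131,072-token window before sending a request. The sketch below is a minimal budget check. The function names, the per-image token counts, and the assumption that reserved output tokens share the same window are all illustrative; actual image token counts depend on the provider's tokenizer and image resolution.

```python
from typing import Sequence

# Context window from the listing above (131K = 131,072 tokens).
CONTEXT_WINDOW_TOKENS = 131_072


def remaining_context(
    text_tokens: int,
    image_tokens: Sequence[int],
    reserved_output_tokens: int = 0,
) -> int:
    """Hypothetical helper: tokens left after text, images, and reserved output.

    A negative result means the request would exceed the window.
    """
    used = text_tokens + sum(image_tokens) + reserved_output_tokens
    return CONTEXT_WINDOW_TOKENS - used


def fits_in_context(
    text_tokens: int,
    image_tokens: Sequence[int],
    reserved_output_tokens: int = 0,
) -> bool:
    """True when the combined request stays within the context window."""
    return remaining_context(text_tokens, image_tokens, reserved_output_tokens) >= 0


if __name__ == "__main__":
    # Illustrative report: 90,000 text tokens plus eight images at an
    # assumed 2,500 tokens each, leaving room for a 4,096-token answer.
    images = [2_500] * 8
    print(fits_in_context(90_000, images, reserved_output_tokens=4_096))
    print(remaining_context(90_000, images, reserved_output_tokens=4_096))
```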
Frequently Asked Questions
How much does Pixtral Large cost per million tokens?
As of the 4/14/2026 check, the pricing table above lists $2.00 per 1M input tokens and $6.00 per 1M output tokens. Pricing can vary by provider and may differ for text versus image processing, so check the table for current rates.
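To turn per-million rates into a per-request figure, multiply each token count by its rate and divide by one million. The sketch below uses the $2.00 input and $6.00 output rates from the pricing table. The function name and the sample token counts are illustrative, and it assumes image inputs have already been converted to an input token count by the provider.

```python
from decimal import Decimal

# Rates from the pricing table above (USD per 1M tokens, 4/14/2026).
INPUT_USD_PER_MILLION = Decimal("2.00")
OUTPUT_USD_PER_MILLION = Decimal("6.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of a single request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Illustrative request: 100,000 input tokens and 2,000 output tokens.
    print(estimate_cost_usd(100_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many requests.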
What is Pixtral Large best used for?
Pixtral Large excels at complex multimodal tasks requiring analysis of both text and images, such as processing documents with charts and diagrams, visual content analysis, detailed image captioning, and applications where sophisticated cross-modal reasoning is needed. Its flagship-tier capabilities and 131K context window make it suitable for high-value applications rather than high-volume use cases.
Does Pixtral Large support tool calling or function execution?
No, Pixtral Large does not support tool calling or function execution capabilities. It focuses on multimodal understanding and generation tasks with text and image inputs, but cannot interact with external tools or APIs through structured function calls.