Flagship | Xiaomi

MiMo v2 Pro

MiMo v2 Pro is Xiaomi's flagship text-only language model with a 1M token context window for complex reasoning and long-form content tasks.

Context: 1.0M
Tier: Flagship
Input: from $1.00 / 1M tokens (across 1 provider)

API Pricing

Provider      Input / 1M   Output / 1M   Speed      TTFT    Updated
(not named)   $1.00        $3.00         60.5 t/s   2.4 s   4/14/2026

Prices updated daily. Last check: 4/14/2026

Model Details

General

Creator
Xiaomi
Family
MiMo
Tier
Flagship
Context Window
1.0M
Modalities
Text

Capabilities

Tool Calling
No
Open Source
No

Strengths & Limitations

Strengths

  • 1,048,576 token context window supports very long text inputs
  • Generation speed of roughly 60 to 67 tokens per second (the pricing table above lists 60.5 t/s)
  • Flagship tier model designed for complex reasoning tasks
  • Developed by Xiaomi, adding competition to the LLM market
  • Text-focused design may provide optimized performance for language tasks
  • Large context capacity allows entire books or large codebases to be processed in a single request

Limitations

  • No tool calling or function execution capabilities
  • Text-only input: no support for images or other modalities
  • Proprietary model with no open source availability
  • 2.4 second time to first token is slower than many competing models
  • Limited ecosystem and integrations compared to established model providers

Key Features

1,048,576 token context window
Text input and output processing
Streaming response generation
Large-scale document comprehension
Extended conversation memory
Complex reasoning over long contexts
Multi-turn dialogue support
Long-form content generation

About MiMo v2 Pro

MiMo v2 Pro is Xiaomi's flagship language model in the MiMo family, designed for complex text processing and reasoning tasks. It is a proprietary model from the Chinese technology company and represents Xiaomi's entry into the large language model market alongside its consumer electronics and mobile devices. The model has a 1,048,576 token context window (1M tokens), which lets it take in very long documents, codebases, or conversations in a single request.

MiMo v2 Pro handles text input and output only. It does not accept images or other modalities, and it does not support tool calling. Its reported generation speed is approximately 67 tokens per second (the pricing table above lists 60.5 t/s), with a time to first token of 2.4 seconds. Within the flagship tier it competes with other large-context models, although it lacks features that are common among them, such as function calling and multimodal input. Its primary strength is text processing over very long contexts, which makes it suitable for large document analysis and extended reasoning chains.
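As a rough illustration of what these speed figures mean in practice, the sketch below estimates wall-clock time for a streamed response as time to first token plus output tokens divided by generation speed. The defaults come from the pricing table on this page (60.5 t/s, 2.4 s); the function name is hypothetical, and the model is simplified: it assumes a constant generation rate and does not account for TTFT growing with very long prompts.

```python
def estimate_response_seconds(
    output_tokens: int,
    tokens_per_second: float = 60.5,  # speed listed in the pricing table
    ttft_seconds: float = 2.4,        # time to first token listed on this page
) -> float:
    """Estimate total response time, assuming a constant generation rate."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    # Waiting time before the first token, then steady streaming.
    return ttft_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical completion lengths, for illustration only.
    for n in (100, 1_000, 5_000):
        print(f"{n} tokens: ~{estimate_response_seconds(n):.1f} s")
```

Passing `tokens_per_second=67` reproduces the estimate under the higher of the two speed figures quoted on this page.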

Common Use Cases

MiMo v2 Pro is designed for applications requiring extensive context understanding and complex text processing. Its 1M token context window makes it particularly suitable for analyzing large documents, legal contracts, research papers, or entire codebases where maintaining context across long passages is critical. The model works well for extended creative writing projects, comprehensive document summarization, and complex reasoning tasks that require synthesizing information from lengthy sources. Organizations dealing with large-scale text analysis, content research, or applications requiring deep understanding of extensive written materials would benefit from MiMo v2 Pro's capabilities. However, users requiring multimodal processing, tool integration, or faster response times may need to consider alternative models with those specific features.
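Before sending a large document, it can help to check whether it is likely to fit in the 1,048,576 token window. The sketch below is a rough pre-check only: it assumes about 4 characters per token, a common rule of thumb for English text and not a property of MiMo's tokenizer, and it reserves a hypothetical output budget. All names and the default budget are illustrative assumptions.

```python
import math

CONTEXT_WINDOW_TOKENS = 1_048_576   # context window stated on this page
ASSUMED_CHARS_PER_TOKEN = 4.0       # ASSUMPTION: rough heuristic, not MiMo's tokenizer


def estimate_tokens(text: str, chars_per_token: float = ASSUMED_CHARS_PER_TOKEN) -> int:
    """Approximate the token count of `text` from its character length."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text: str, reserved_output_tokens: int = 4_096) -> bool:
    """Return True if the estimated prompt plus the reserved output fits the window."""
    if reserved_output_tokens < 0:
        raise ValueError("reserved_output_tokens must be non-negative")
    return estimate_tokens(text) + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample = "word " * 50_000  # 250,000 characters of placeholder text
    print(estimate_tokens(sample), fits_in_context(sample))
```

For anything close to the limit, a real token count from the provider is the safer check, since character-based estimates can be off by a wide margin for code or non-English text.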

Frequently Asked Questions

How much does MiMo v2 Pro cost per million tokens?

As of the last price check (4/14/2026), the one listed provider charges $1.00 per 1M input tokens and $3.00 per 1M output tokens. Prices are updated daily and can vary by provider, so check the pricing table above for current rates.
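To turn the per-million rates into a per-request figure, a minimal sketch follows. It uses the rates from the pricing table on this page ($1.00 input, $3.00 output per 1M tokens); the function name and the example token counts are hypothetical, and actual billing may differ by provider.

```python
from decimal import Decimal

# Rates from the pricing table on this page (USD per 1M tokens).
INPUT_USD_PER_M = Decimal("1.00")
OUTPUT_USD_PER_M = Decimal("3.00")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_USD_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_USD_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 200,000 prompt tokens and 2,000 completion tokens.
    print(estimate_cost_usd(200_000, 2_000))
```

At these rates, filling the entire 1,048,576 token context costs a little over one dollar in input tokens per request, so repeated long-context calls add up quickly.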

What is MiMo v2 Pro best used for?

MiMo v2 Pro excels at tasks requiring extensive context understanding, such as analyzing large documents, processing entire codebases, extended creative writing, and complex reasoning over long text sequences. Its 1M token context window makes it ideal for applications where maintaining context across very long passages is essential.

Does MiMo v2 Pro support function calling or multimodal inputs?

No, MiMo v2 Pro is a text-only model that does not support function calling, tool use, or multimodal inputs like images. It focuses exclusively on text processing and generation, optimized for handling very long text contexts rather than external integrations or visual content.