Gemini 3.1 Flash Lite
Gemini 3.1 Flash Lite is Google's lightweight multimodal model offering fast inference across text, image, audio, and video with a 1M token context window.
API Pricing
| Provider | Input / 1M | Output / 1M | Speed | TTFT | Updated |
|---|---|---|---|---|---|
| | $0.250 | $1.50 | 199 t/s | 7.8s | 4/10/2026 |
| | $0.250 | $1.50 | 199 t/s | 7.8s | 4/10/2026 |
Prices updated daily. Last check: 4/14/2026
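As a rough illustration of how the listed rates turn into a per-request cost, the sketch below applies the table's $0.250 per 1M input tokens and $1.50 per 1M output tokens to a pair of token counts. The function name and the example token counts are hypothetical, and the rates are only as current as the table above.

```python
# Rates copied from the pricing table above (USD per 1M tokens).
# They may change; treat them as a snapshot, not a guarantee.
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token document summarized into 2,000 tokens.
    print(f"${estimate_cost_usd(200_000, 2_000):.4f}")
```

Because output tokens cost six times as much as input tokens at these rates, long generated responses can dominate the bill even when the prompt is much larger.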
Model Details
General
- Creator: Google
- Family: Gemini
- Tier: Lightweight
- Context Window: 1M tokens
- Modalities: Text, Image, Audio, Video
Capabilities
- Tool Calling: No
- Open Source: No
- Aliases: gemini-3-1-flash-lite-preview
Strengths & Limitations
Strengths
- Supports four input modalities: text, image, audio, and video
- Large 1 million token context window for processing lengthy documents
- Fast output generation at 199 tokens per second
- Lightweight design optimized for speed and efficiency
- Part of Google's current-generation Gemini 3.1 model family
- Multimodal capabilities in a cost-optimized package
Limitations
- No tool calling or function execution capabilities
- Positioned as a lightweight tier with reduced reasoning compared to Gemini 3.1 Pro
- Proprietary model with no open-source availability
- Time to first token of 7.8 seconds is slower than some competitors
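To see how the 7.8 second time to first token and the 199 tokens per second output speed combine, here is a minimal sketch that estimates total response time for a given output length. It assumes the listed figures hold and that generation proceeds at a constant rate, which real deployments will not match exactly; the function name is hypothetical.

```python
# Figures taken from the listing above; actual latency varies by provider and load.
TTFT_SECONDS = 7.8
OUTPUT_TOKENS_PER_SECOND = 199.0


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate wall-clock time to receive a full response (hypothetical helper).

    Simplifying assumption: a fixed wait for the first token, then a
    constant generation rate for every output token.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 500, 2_000):
        print(f"{n} output tokens: about {estimate_response_seconds(n):.1f} s")
```

Under these assumptions, the time to first token dominates short responses: generating 199 tokens adds only about one second on top of the 7.8 second wait.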
Common Use Cases
Gemini 3.1 Flash Lite is designed for applications that need fast multimodal processing but not tool calling or maximum reasoning capability. Its combination of speed, a 1M token context window, and support for text, image, audio, and video makes it suitable for content analysis workflows, document processing with mixed media, rapid prototyping of multimodal applications, and high-throughput scenarios where cost efficiency matters. Typical tasks include summarizing long documents with embedded images and running basic analysis on video content. Its 7.8 second time to first token makes it a better fit for throughput-oriented workloads than for latency-sensitive interactive use. Organizations that need multimodal capabilities at scale, rather than complex reasoning or agentic workflows, are the intended audience.
Frequently Asked Questions
How much does Gemini 3.1 Flash Lite cost per million tokens?
According to the pricing table above, Gemini 3.1 Flash Lite costs $0.250 per 1M input tokens and $1.50 per 1M output tokens, with rates listed as updated on 4/10/2026. Pricing can vary by provider and usage type, so check the table for current rates.
What is Gemini 3.1 Flash Lite best used for?
Gemini 3.1 Flash Lite excels at fast multimodal processing tasks including document analysis with images, basic video content processing, and high-volume applications where speed and cost efficiency are priorities over maximum reasoning capability.
Does Gemini 3.1 Flash Lite support tool calling and function execution?
No, Gemini 3.1 Flash Lite does not include tool calling capabilities. For applications requiring function execution or API integrations, consider Gemini 3.1 Pro or other models that specifically support structured tool calling.