Gemini 3 Pro
Gemini 3 Pro is Google's flagship multimodal model supporting text, image, video, and audio inputs with a 1M token context window.
API Pricing
Cheapest on Google Cloud: 40% below average

| Provider | Input / 1M | Output / 1M | Speed | TTFT | Updated |
|---|---|---|---|---|---|
| | $1.00 | $6.00 | 134 t/s | 28.7s | 4/13/2026 |
| | $2.00 | $12.00 | 134 t/s | 28.7s | 4/14/2026 |
| | $2.00 | $12.00 | 134 t/s | 28.7s | 4/13/2026 |
Prices updated daily. Last check: 4/14/2026
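The "40% below average" figure can be checked against the listed prices. The following is a minimal sketch, assuming the average is a plain arithmetic mean over the three table rows (the page does not say how it is computed); the `rows` structure and the `percent_below_average` helper are illustrative names, not part of any provider API.

```python
from statistics import mean

# Prices in USD per 1M tokens, copied from the three table rows.
# The rows are identified by position only.
rows = [
    {"input": 1.00, "output": 6.00},
    {"input": 2.00, "output": 12.00},
    {"input": 2.00, "output": 12.00},
]


def percent_below_average(prices):
    """Return how far the cheapest price sits below the mean, in percent.

    Assumption: "average" means the unweighted mean of the listed prices.
    """
    avg = mean(prices)
    return (avg - min(prices)) / avg * 100


input_discount = percent_below_average([r["input"] for r in rows])
output_discount = percent_below_average([r["output"] for r in rows])

print(f"input:  {input_discount:.0f}% below average")
print(f"output: {output_discount:.0f}% below average")
```

Under that assumption, both the input and the output price of the cheapest row come out at 40% below the mean, which matches the banner above the table.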
Model Details
General
- Creator: Google
- Family: Gemini
- Tier: Flagship
- Context Window: 1M tokens
- Knowledge Cutoff: Jun 2025
- Modalities: Text, Image, Video, Audio
Capabilities
- Tool Calling: Yes
- Open Source: No
- Subtypes: Chat Completion, Code Generation
Strengths & Limitations
- Supports four modalities: text, image, video, and audio input
- 1 million token context window for processing extensive documents
- Tool calling functionality with structured outputs
- Output speed of about 134 tokens per second, as listed in the pricing table
- Knowledge cutoff of June 2025
- Code generation capabilities alongside chat completion
- Video input support distinguishes it from many competing models
- Proprietary model with no open-source weights available
- Time to first token of 28.7 seconds, as listed in the pricing table, is slower than some competitors
- No longer the newest model in Google's lineup now that Gemini 3.1 Pro is available
- Multimodal capabilities may come with higher computational costs
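To see what the speed and time-to-first-token figures mean in practice, here is a rough sketch of an end-to-end latency estimate. It uses the 134 t/s and 28.7 s values from the pricing table and assumes a simple additive model (wait for the first token, then stream at a constant rate); real latency also depends on network conditions, load, and prompt size. The function name and the 1,000-token response length are illustrative.

```python
def estimated_response_seconds(output_tokens, ttft_seconds, tokens_per_second):
    """Estimate total response time as TTFT plus steady-state streaming time.

    Assumption: output streams at a constant rate once the first token arrives.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


# Hypothetical 1,000-token response using the table's speed and TTFT figures.
total = estimated_response_seconds(1000, ttft_seconds=28.7, tokens_per_second=134)
print(f"{total:.1f} s")  # prints "36.2 s"
```

In this estimate the time to first token dominates: streaming 1,000 tokens takes about 7.5 seconds, while waiting for the first token takes 28.7 seconds.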
About Gemini 3 Pro
Common Use Cases
Gemini 3 Pro is designed for complex multimodal applications requiring flagship-level performance across diverse input types. Its video and audio processing capabilities make it suitable for multimedia content analysis, educational applications involving varied media formats, and research tasks requiring understanding of visual and auditory information. The 1 million token context window enables processing of extensive documents, lengthy conversations, and large codebases. Organizations use it for sophisticated AI agents that need to interpret and reason about multiple data types simultaneously, complex content moderation involving video and audio, and applications requiring deep understanding of multimedia educational or training materials.
Frequently Asked Questions
How much does Gemini 3 Pro cost per million tokens?
In the pricing table above, Gemini 3 Pro costs between $1.00 and $2.00 per million input tokens and between $6.00 and $12.00 per million output tokens, depending on the provider. Rates can also differ by pricing type (standard vs. batch), and multimodal inputs may be priced separately, so check the table for current rates across all providers.
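As a rough illustration of how per-million rates translate into a per-request figure, the sketch below prices one hypothetical request (200,000 input tokens, 2,000 output tokens) at the lowest and highest rates in the pricing table. It assumes flat text-token pricing and ignores batch discounts and any separate multimodal rates; `request_cost_usd` is an illustrative helper, not a provider API.

```python
def request_cost_usd(input_tokens, output_tokens, input_per_million, output_per_million):
    """Cost of one request in USD, given per-1M-token input and output rates."""
    return (
        input_tokens * input_per_million + output_tokens * output_per_million
    ) / 1_000_000


# Hypothetical request: a long document in, a short summary out.
input_tokens, output_tokens = 200_000, 2_000

# Rates (USD per 1M tokens) taken from the cheapest and the other table rows.
cheapest = request_cost_usd(input_tokens, output_tokens, 1.00, 6.00)
other = request_cost_usd(input_tokens, output_tokens, 2.00, 12.00)

print(f"cheapest row: ${cheapest:.3f}")
print(f"other rows:   ${other:.3f}")
```

For long-context workloads like this one, input tokens account for most of the cost, so the input rate matters more than the output rate when comparing providers.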
What is Gemini 3 Pro best used for?
Gemini 3 Pro excels at complex multimodal tasks requiring understanding of text, images, video, and audio. Its 1M token context window makes it ideal for extensive document analysis, multimedia content processing, educational applications with varied media types, and AI agents that need to reason across multiple input modalities simultaneously.
How does Gemini 3 Pro compare to Gemini 3.1 Pro?
Gemini 3.1 Pro is the newer model in Google's flagship tier, likely offering improved capabilities over Gemini 3 Pro. Both models share similar multimodal capabilities and large context windows, but Gemini 3.1 Pro represents Google's latest advancements in the Gemini family. The choice depends on whether you need the latest capabilities or whether Gemini 3 Pro's features already meet your requirements.