
Gemini 3 Pro

Author: Google

Gemini 3 Pro is Google's flagship multimodal model supporting text, image, video, and audio inputs with a 1M token context window.

Context: 1.0M
Tier: Flagship
Knowledge: Jun 2025
Tools: Supported
Modalities: text, image, video, audio
Input from $1.00 / 1M tokens across 2 providers


API Pricing

Cheapest on Google Cloud (Batch): 40% below average

Provider	Input / 1M	Output / 1M	Cached / 1M	Updated
Google Cloud (Batch)	$1.00	$6.00	-	7/9/2026
OpenRouter	$2.00	$12.00	$0.200	7/13/2026
Google Cloud	$2.00	$12.00	-	7/12/2026

Prices updated daily. Last check: Jul 13, 2026
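
As a rough illustration of how the per-1M-token rates above translate into a per-request cost, here is a minimal Python sketch. The rates are copied from the table as a snapshot and will drift over time. The names `Rate`, `RATES`, `request_cost`, and `discount_vs_average` are hypothetical helpers, not part of any provider SDK. The page does not say how the "40% below" figure is computed; comparing the cheapest rate with the mean of the three listed rows is one reading that reproduces it, for both input and output prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """USD per 1M tokens, copied from the pricing table."""
    input_per_m: float
    output_per_m: float


# Snapshot of the listed rates; providers may change them at any time.
RATES = {
    "Google Cloud (Batch)": Rate(1.00, 6.00),
    "OpenRouter": Rate(2.00, 12.00),
    "Google Cloud": Rate(2.00, 12.00),
}


def request_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given per-1M-token rates."""
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000


def discount_vs_average(prices: list[float]) -> float:
    """Fraction by which the lowest price sits below the mean of all prices.

    Assumption: this is one way to reproduce the '40% below' figure;
    the page does not document its own method.
    """
    mean = sum(prices) / len(prices)
    return 1 - min(prices) / mean


if __name__ == "__main__":
    for name, rate in RATES.items():
        print(f"{name}: ${request_cost(rate, 200_000, 4_000):.3f}")
    input_prices = [r.input_per_m for r in RATES.values()]
    print(f"Cheapest input rate: {discount_vs_average(input_prices):.0%} below mean")
```

At these rates, a request with 200,000 input tokens and 4,000 output tokens costs about $0.224 at the batch rate and $0.448 at the standard rate. Cached-input pricing and any separate multimodal pricing are not modeled here.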

Performance & Benchmarks

Source: Artificial Analysis

Intelligence: 39.6 / 100
Math: 95.7 / 100

Reasoning & Knowledge

MMLU-Pro: 89.8%
GPQA Diamond: 90.8%
Humanity's Last Exam: 37.2%

Coding

LiveCodeBench: 91.7%
SciCode: 56.1%

Math

AIME 2025: 95.7%

Agentic & Tool Use

Terminal-Bench Hard: 41.7%
τ²-bench: 87.1%

Instruction & Long Context

IFBench: 70.4%
Long-Context Reasoning: 70.7%

Benchmarks measured Jul 2026. Scores are independent evaluations, not vendor-reported.

Model Details

General

Creator: Google
Family: Gemini
Tier: Flagship
Context Window: 1.0M
Knowledge Cutoff: Jun 2025
Modalities: Text, Image, Video, Audio

Capabilities

Tool Calling: Yes
Open Source: No
Subtypes: Chat Completion, Code Generation

Strengths & Limitations

Strengths

Supports four modalities: text, image, video, and audio input
1 million token context window for processing extensive documents
Tool calling functionality with structured outputs
Output speed of 139 tokens per second
Relatively recent knowledge cutoff of June 2025
Code generation capabilities alongside chat completion
Video input support distinguishes it from many competing models

Limitations

Proprietary model with no open-source weights available
Time to first token of 20.4 seconds is slower than some competitors
No longer the newest model in Google's lineup now that Gemini 3.1 Pro is available
Multimodal capabilities may come with higher computational costs
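
To see how the two speed figures listed above combine, the following Python sketch estimates end-to-end response time as time to first token plus generation time. It assumes throughput stays constant at the listed 139 tokens per second once output begins, which is a simplification; the function name `estimated_response_seconds` is hypothetical.

```python
# Figures taken from the listing; real latency varies by provider and load.
TTFT_SECONDS = 20.4
OUTPUT_TOKENS_PER_SECOND = 139.0


def estimated_response_seconds(
    output_tokens: int,
    ttft: float = TTFT_SECONDS,
    tokens_per_second: float = OUTPUT_TOKENS_PER_SECOND,
) -> float:
    """Estimate wall-clock time for a response of `output_tokens` tokens.

    Assumption: a fixed wait before the first token, then constant throughput.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft + output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"{n} tokens: ~{estimated_response_seconds(n):.1f} s")
```

Under this model the 20.4-second wait dominates short responses: a 1,000-token answer spends roughly 7 seconds generating and about 28 seconds in total.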

Key Features

• 1 million token context window
• Multimodal input (text, image, video, audio)
• Tool calling with structured output
• Chat completion
• Code generation
• Streaming responses
• Video processing capabilities
• Function calling API

About Gemini 3 Pro

Gemini 3 Pro is a flagship-tier model in Google's Gemini family, developed by Google DeepMind for complex multimodal tasks. It has a 1 million token context window, accepts text, image, video, and audio inputs, supports tool calling, and covers chat completion and code generation. Measured output speed is 139 tokens per second, with a time to first token of 20.4 seconds. Its knowledge cutoff is June 2025. Gemini 3 Pro competes directly with other flagship models such as Claude Opus 4.6 and GPT-5.4. The large context window and broad modality support make it suitable for complex document analysis, multimedia content processing, and applications that combine several input types.

Common Use Cases

Gemini 3 Pro is designed for complex multimodal applications that need flagship-level performance across diverse input types. Its video and audio processing suits multimedia content analysis, educational and training applications that mix media formats, and research tasks involving visual and auditory information. The 1 million token context window allows processing of extensive documents, lengthy conversations, and large codebases. Organizations use it for AI agents that interpret and reason over multiple data types at once and for content moderation involving video and audio.
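
For the long-document use case, a quick pre-flight check can indicate whether a set of texts is likely to fit in the context window. The Python sketch below is only an approximation: it assumes about 4 characters per token, a common rule of thumb for English text rather than a property of Gemini's tokenizer, and it treats the listed "1.0M" window as exactly 1,000,000 tokens. The helper names and the reserved output budget are hypothetical.

```python
import math

CONTEXT_WINDOW_TOKENS = 1_000_000  # "1.0M" as listed; exact limit is an assumption
ASSUMED_CHARS_PER_TOKEN = 4.0      # rough heuristic, not the real tokenizer


def estimated_tokens(text: str, chars_per_token: float = ASSUMED_CHARS_PER_TOKEN) -> int:
    """Roughly estimate the token count of `text` from its character length."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    texts: list[str],
    reserved_output_tokens: int = 8_192,
    window: int = CONTEXT_WINDOW_TOKENS,
) -> bool:
    """Return True if the estimated input plus reserved output fits the window."""
    total_input = sum(estimated_tokens(t) for t in texts)
    return total_input + reserved_output_tokens <= window


if __name__ == "__main__":
    docs = ["lorem ipsum " * 50_000, "dolor sit amet " * 20_000]
    print(fits_in_context(docs))
```

For anything close to the limit, a provider's own token-counting facility would be more reliable than this character-based estimate.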

Frequently Asked Questions

How much does Gemini 3 Pro cost per million tokens?

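Gemini 3 Pro pricing varies by provider and by pricing type (standard vs. batch). In the table above, standard pricing is $2.00 per 1M input tokens and $12.00 per 1M output tokens on both Google Cloud and OpenRouter, while Google Cloud batch pricing is $1.00 input and $6.00 output. OpenRouter also lists cached input at $0.200 per 1M tokens. Multimodal inputs may be priced separately, and rates change, so check the table for current figures.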

What is Gemini 3 Pro best used for?

Gemini 3 Pro excels at complex multimodal tasks requiring understanding of text, images, video, and audio. Its 1M token context window makes it ideal for extensive document analysis, multimedia content processing, educational applications with varied media types, and AI agents that need to reason across multiple input modalities simultaneously.

How does Gemini 3 Pro compare to Gemini 3.1 Pro?

Gemini 3.1 Pro is the newer model in Google's flagship tier and likely offers improved capabilities over Gemini 3 Pro. Both models share similar multimodal capabilities and large context windows. The choice depends on whether you need the latest capabilities or whether Gemini 3 Pro's features already meet your requirements.