LightweightGoogle

Gemini 3.1 Flash Image

Gemini 3.1 Flash Image is Google's lightweight multimodal model optimized for fast text and image processing tasks with a 65K token context window.

Context 66K
Tier Lightweight
Modalities text, image
Input from
$0.250 / 1M tokens
across 2 providers

API Pricing

Cheapest on Google Cloud 33% below avg
ProviderInput / 1MOutput / 1MSpeedTTFTUpdated
$0.250$1.50203 t/s7.8s4/13/2026
$0.500$3.00203 t/s7.8s4/14/2026

Prices updated daily. Last check: 4/14/2026

Model Details

General

Creator
Google
Family
Gemini
Tier
Lightweight
Context Window
66K
Modalities
Text, Image

Capabilities

Tool Calling
No
Open Source
No
Aliases
gemini-3-1-flash-image-preview

Strengths & Limitations

  • Supports both text and image input modalities in a single request
  • 65,536 token context window allows processing multiple images with substantial text
  • Lightweight architecture designed for faster inference speeds
  • Part of Google's Gemini 3.1 family with consistent API integration
  • Optimized for high-volume visual understanding tasks
  • Preview access through gemini-3-1-flash-image-preview alias
  • No tool calling or function execution capabilities
  • Proprietary model with no open source weights available
  • Smaller context window compared to flagship Gemini 3.1 Pro
  • Lightweight tier may have reduced reasoning capabilities versus Pro models
  • Limited to text and image modalities only

Key Features

Text and image multimodal processing
65,536 token context window
Streaming response support
Batch processing capabilities
REST API integration
Preview model access
Multi-image input support
Google Cloud Platform integration

About Gemini 3.1 Flash Image

Gemini 3.1 Flash Image is Google's lightweight multimodal model within the Gemini family, positioned below the flagship Gemini 3.1 Pro for applications requiring fast processing of both text and images. As a tier-optimized model, it prioritizes speed and efficiency over maximum capability, making it suitable for high-volume visual understanding tasks. The model supports both text and image inputs with a 65,536 token context window, enabling processing of multiple images alongside substantial text content in a single request. Unlike some models in the Gemini family, it does not include tool calling capabilities, keeping the architecture focused on core text and vision understanding tasks. Gemini 3.1 Flash Image serves applications where visual understanding needs to be balanced with cost and latency constraints. It provides an alternative to heavier multimodal models when maximum reasoning capability is less critical than processing speed and throughput for image analysis workflows.

Common Use Cases

Gemini 3.1 Flash Image is designed for applications requiring fast visual understanding at scale, such as content moderation systems that need to analyze images with accompanying text, e-commerce platforms processing product images with descriptions, or document analysis workflows involving scanned images and OCR tasks. Its lightweight architecture makes it suitable for real-time applications like chatbots with image upload capabilities, automated image captioning services, or visual search systems where processing speed and cost efficiency are more important than maximum reasoning depth. The 65K context window supports batch processing of multiple images in educational content analysis, social media monitoring, or automated quality control systems.

Frequently Asked Questions

How much does Gemini 3.1 Flash Image cost per million tokens?

Gemini 3.1 Flash Image pricing varies by provider and includes separate rates for text and image tokens. Check the pricing table above for current rates across all providers offering this model.

What is Gemini 3.1 Flash Image best used for?

Gemini 3.1 Flash Image excels at fast visual understanding tasks like content moderation, document analysis with images, product catalog processing, and real-time chatbots with image capabilities. Its lightweight design prioritizes speed over maximum reasoning capability.

Does Gemini 3.1 Flash Image support tool calling or function execution?

No, Gemini 3.1 Flash Image does not include tool calling capabilities. It focuses specifically on text and image understanding tasks. For function calling with multimodal inputs, consider Gemini 3.1 Pro or other models in the family with tool support.