
Grok 4

Grok 4 is xAI's flagship multimodal model with text and image capabilities, featuring a 256K token context window for complex reasoning tasks.

Context 256K
Tier Flagship
Modalities text, image
Input from $3.00 / 1M tokens across 1 provider

API Pricing

Provider    Input / 1M    Output / 1M    Speed      TTFT      Updated
            $3.00         $15.00        205 t/s    345 ms    4/14/2026

Prices updated daily. Last check: 4/14/2026
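At the listed rates ($3.00 per 1M input tokens, $15.00 per 1M output tokens), the cost of a single request is straightforward to estimate. A minimal sketch:

```python
# Estimate a Grok 4 request cost from the rates in the pricing table above.
INPUT_PER_M = 3.00    # USD per 1M input tokens
OUTPUT_PER_M = 15.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# Example: a 50K-token document summarized into a 2K-token answer.
cost = request_cost(50_000, 2_000)
print(f"${cost:.2f}")  # 50K * $3/1M + 2K * $15/1M = $0.15 + $0.03 = $0.18
```

Rates vary by provider and may change daily, so treat this as an estimate against the table's current snapshot.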

Model Details

General

Creator
xAI
Family
Grok
Tier
Flagship
Context Window
256K
Modalities
Text, Image

Capabilities

Tool Calling
No
Open Source
No

Strengths & Limitations

Strengths

  • Large 256K token context window for processing extensive documents
  • Multimodal support for both text and image inputs
  • Flagship-tier reasoning capabilities within the Grok family
  • Competitive output speed of 165.2 tokens per second
  • Developed by xAI with a focus on factual accuracy
  • Suitable for complex analysis tasks requiring long context retention

Limitations

  • No tool calling or function calling support
  • Proprietary model with no open-source availability
  • Longer time to first token (3.1 seconds) than some competitors
  • Limited to text and image modalities
  • Newer entrant with less ecosystem integration than established models

Key Features

256K token context window
Text input and generation
Image input processing
Multimodal reasoning capabilities
Streaming response support
Extended conversation memory
Document analysis and summarization
Visual content understanding
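The multimodal and streaming features above can be combined in one chat request. The sketch below builds such a request payload, assuming xAI's OpenAI-compatible Chat Completions format; the model id `grok-4` and the data-URL image encoding are assumptions to verify against xAI's API documentation.

```python
import base64

def build_multimodal_request(prompt: str, image_bytes: bytes,
                             model: str = "grok-4",  # assumed model id
                             stream: bool = True) -> dict:
    """Build a Chat Completions payload mixing a text prompt with one image."""
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "stream": stream,  # Grok 4 supports streamed responses
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    }

payload = build_multimodal_request("Describe this chart.", b"\x89PNG...")
print(payload["model"], len(payload["messages"][0]["content"]))
```

Sending the payload requires an API key and an HTTP client pointed at xAI's endpoint; only the request construction is shown here.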

About Grok 4

Grok 4 is xAI's flagship large language model, representing the most advanced offering in the Grok family. As a proprietary model developed by Elon Musk's xAI, Grok 4 positions itself as a direct competitor to other flagship models in the market. The model supports both text and image inputs, making it a multimodal system capable of understanding and reasoning across different data types.

Technically, Grok 4 features a substantial 256,000 token context window, allowing it to process lengthy documents, maintain extended conversations, and work with large codebases. The model demonstrates solid performance metrics, with an output speed of 165.2 tokens per second and a time to first token of 3.1 seconds. Its multimodal capabilities enable users to combine text prompts with image analysis in a single request.

Grok 4 is designed for applications requiring sophisticated reasoning and multimodal understanding. Unlike some competing flagship models, it does not currently support tool calling functionality, which may limit its suitability for certain agentic workflows compared to models like Claude Opus 4.6 or GPT-5.4.

Common Use Cases

Grok 4 is well-suited for complex analytical tasks that require processing large amounts of text or combining text with visual information. Its 256K context window makes it effective for document analysis, research synthesis, and long-form content creation. The multimodal capabilities enable use cases like image analysis with detailed text explanations, visual data interpretation, and content creation that incorporates both textual and visual elements. Organizations needing sophisticated reasoning over extended contexts, such as legal document review, academic research, or comprehensive report generation, can leverage Grok 4's flagship-tier capabilities. However, workflows requiring tool integration or function calling may need to consider alternative models.
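For long-document use cases like those above, it helps to check up front whether a document plausibly fits the 256K-token window. A rough sketch, using the common ~4 characters/token heuristic for English text (an approximation, not xAI's actual tokenizer):

```python
# Rough check that a document fits Grok 4's 256K-token context window,
# leaving headroom for the prompt and the model's reply.
CONTEXT_WINDOW = 256_000
CHARS_PER_TOKEN = 4  # heuristic for English text, not an exact tokenizer

def fits_in_context(text: str, reserved_tokens: int = 8_000) -> bool:
    """Return True if `text` likely fits alongside `reserved_tokens` of overhead."""
    estimated_tokens = len(text) // CHARS_PER_TOKEN
    return estimated_tokens + reserved_tokens <= CONTEXT_WINDOW

print(fits_in_context("word " * 100_000))  # ~500K chars ≈ 125K tokens -> True
```

Documents that fail this check can be chunked or summarized in stages before a final pass.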

Frequently Asked Questions

How much does Grok 4 cost per million tokens?

Grok 4 pricing varies by provider and pricing type (standard vs batch). Check the pricing table above for current rates across all providers.

What is Grok 4 best used for?

Grok 4 excels at complex reasoning tasks that require large context windows and multimodal understanding. It's particularly effective for document analysis, research synthesis, image analysis with detailed explanations, and long-form content creation that combines text and visual elements.

Does Grok 4 support tool calling or function calling?

No, Grok 4 does not currently support tool calling or function calling capabilities. Users requiring these features for agentic workflows should consider alternative flagship models that offer tool integration.