
GPT-4.1

Author: OpenAI

GPT-4.1 is OpenAI's flagship multimodal model with a 1M token context window for complex reasoning, coding, and image analysis tasks.

Context 1.0M

Tier Flagship

Knowledge Jun 2024

Tools Supported

Modalities text, image

Input from $2.00 / 1M tokens, across 2 providers


API Pricing

Provider	Input / 1M	Output / 1M	Cached / 1M	Speed	TTFT	Updated
OpenRouter	$2.00	$8.00	$0.500	128 t/s	653ms	7/13/2026
Microsoft Azure	$2.00	$8.00	$0.500	128 t/s	653ms	7/11/2026

Prices updated daily. Last check: Jul 13, 2026
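The per-million rates in the table can be turned into a per-request estimate with simple arithmetic. The following is a minimal sketch, not a billing tool: the function name `estimate_cost` is hypothetical, and it assumes that cached tokens are a subset of the input tokens and are billed at the "Cached / 1M" rate instead of the input rate. Actual provider billing rules may differ.

```python
# Rates copied from the pricing table above (USD per 1M tokens).
INPUT_PER_M = 2.00
OUTPUT_PER_M = 8.00
CACHED_PER_M = 0.50


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the portion of input_tokens billed at the
    cached rate; the remainder is billed at the standard input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PER_M
        + cached_tokens * CACHED_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# 100k input tokens and 2k output tokens, nothing cached.
print(f"{estimate_cost(100_000, 2_000):.3f}")  # prints 0.216

# Same request with 80k of the input tokens served from cache.
print(f"{estimate_cost(100_000, 2_000, cached_tokens=80_000):.3f}")  # prints 0.096
```

Because output tokens cost four times as much as input tokens at these rates, long generations tend to dominate the bill even when the prompt is large.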

Performance & Benchmarks

Source: Artificial Analysis

Intelligence

19.4 / 100

Math

34.7 / 100

Output Speed

128 t/s

Latency (TTFT)

653ms
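The output speed and time-to-first-token figures above can be combined into a rough end-to-end response time. This is a back-of-the-envelope sketch under simplifying assumptions: the function name `estimate_response_seconds` is hypothetical, and it assumes tokens stream at a constant rate after the first token, ignoring network overhead and load-dependent variation.

```python
# Figures copied from the benchmark summary above.
OUTPUT_TOKENS_PER_SECOND = 128.0
TTFT_SECONDS = 0.653  # 653 ms time to first token


def estimate_response_seconds(output_tokens: int) -> float:
    """Approximate wall-clock time to receive a full response.

    Assumption: total time = TTFT + output_tokens / output speed.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SECOND


# A 1,000-token answer: 0.653 s + 1000 / 128 s.
print(f"{estimate_response_seconds(1_000):.2f}")  # prints 8.47
```

For short answers the time to first token dominates, while for long generations the per-token streaming rate matters far more.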

Reasoning & Knowledge

MMLU-Pro: 80.6%
GPQA Diamond: 66.6%
Humanity's Last Exam: 4.6%

Coding

LiveCodeBench: 45.7%
SciCode: 38.1%

Math

AIME 2025: 34.7%
AIME: 43.7%
MATH-500: 91.3%

Agentic & Tool Use

Terminal-Bench Hard: 13.6%
τ²-bench: 47.1%

Instruction & Long Context

IFBench: 43.0%
Long-Context Reasoning: 61.0%

Benchmarks measured Jul 2026. Scores are independent evaluations, not vendor-reported.

Model Details

General

Creator: OpenAI
Family: GPT
Tier: Flagship
Context Window: 1.0M
Knowledge Cutoff: Jun 2024
Modalities: Text, Image

Capabilities

Tool Calling: Yes
Open Source: No
Subtypes: Chat Completion, Code Generation

Strengths & Limitations

Strengths

1 million token context window supports extremely long documents and conversations
Multimodal capabilities process both text and image inputs
Tool calling support enables integration with external APIs and systems
128 tokens per second output speed for responsive interactions
June 2024 knowledge cutoff provides relatively recent training data
Chat completion and code generation optimizations
Flagship-tier reasoning and problem-solving capabilities

Limitations

Proprietary model with no access to weights or local deployment
653ms time to first token, which is higher than some competing models
No audio input or output modalities supported
Knowledge cutoff predates some recent developments and events
Requires API access rather than self-hosted options

Key Features

• 1 million token context window
• Text and image input processing
• Tool calling with external API integration
• Chat completion optimization
• Code generation capabilities
• Structured output formatting
• Streaming response delivery
• Function calling support

About GPT-4.1

GPT-4.1 is OpenAI's flagship model in the GPT family. It is multimodal, accepting both text and image inputs while keeping the conversational and reasoning abilities of the GPT lineage. Its 1 million token context window lets it work across very long documents and conversations. It supports tool calling for integration with external systems and APIs, and handles both chat completion and code generation tasks. With a knowledge cutoff of June 2024, it incorporates more recent information than earlier GPT models. The benchmarks above report an output speed of 128 tokens per second and a time to first token of 653 milliseconds. GPT-4.1 is designed for complex reasoning tasks, advanced coding projects, and multimodal workflows that combine textual and visual information. It competes with flagship-tier models from other providers.

Common Use Cases

GPT-4.1 is suited for complex reasoning tasks that require processing large amounts of context, such as analyzing lengthy documents, conducting multi-step research, and maintaining coherent conversations across extended interactions. Its multimodal capabilities make it effective for workflows combining text and visual analysis, including document processing with embedded images, visual content description, and image-based reasoning tasks. The tool calling functionality enables sophisticated agent workflows and API integrations for business automation. Its flagship-tier capabilities support advanced coding projects, technical documentation analysis, and complex problem-solving scenarios where nuanced understanding and reasoning are essential.

Frequently Asked Questions

How much does GPT-4.1 cost per million tokens?

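As of the last price check (Jul 13, 2026), both listed providers, OpenRouter and Microsoft Azure, charge $2.00 per 1M input tokens, $8.00 per 1M output tokens, and $0.50 per 1M cached tokens. Pricing can vary by provider and by whether you use standard or batch processing, so check the pricing table above for current rates.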

What is GPT-4.1 best used for?

GPT-4.1 excels at complex reasoning tasks requiring large context windows, such as analyzing lengthy documents, advanced coding projects, and multimodal workflows that combine text and image processing. Its tool calling capabilities also make it suitable for building AI agents and automated systems.

How does GPT-4.1's 1M token context window compare to other flagship models?

GPT-4.1's 1 million token context window is competitive with other flagship models and is large enough to hold very long documents, codebases, and conversations in a single request. This makes it a good fit for comprehensive document analysis and extended reasoning chains. A large window does not guarantee uniform accuracy across it: the independent Long-Context Reasoning score listed above is 61.0%, so results on very long inputs are worth validating for your use case.