LightweightAnthropic

Claude Haiku 4.5

Name: Claude Haiku 4.5
Availability: InStock
Author: Anthropic

Claude Haiku 4.5 is Anthropic's lightweight model optimized for speed and efficiency, with vision capabilities and a 200K token context window.

Context 200K

Tier Lightweight

Knowledge Feb 2025

Tools Supported

Modalities text, image

Input from

$0.500 / 1M tokens

across 4 providers

Compare Prices Model Page →API Docs

API Pricing

Cheapest on Anthropic — 42% below avg

Provider	Input / 1M	Output / 1M	Cached / 1M	Speed	TTFT	Updated
AnthropicBatch	$0.500	$2.50	$0.050	98.0 t/s	612ms	7/12/2026
Amazon AWSBatch	$0.550	$2.75	-	98.0 t/s	612ms	7/13/2026
Deep Infra	$1.00	$5.00	-	98.0 t/s	612ms	7/13/2026
Anthropic	$1.00	$5.00	$0.100	98.0 t/s	612ms	7/12/2026
OpenRouter	$1.00	$5.00	$0.100	98.0 t/s	612ms	7/13/2026
Amazon AWS	$1.10	$5.50	-	98.0 t/s	612ms	7/13/2026

Prices updated daily. Last check: Jul 13, 2026

Performance & Benchmarks

Source: Artificial Analysis →

Intelligence

23.7 / 100

Math

39.0 / 100

Output Speed

98.0 t/s

Latency (TTFT)

612ms

Reasoning & Knowledge

MMLU-Pro80.0%
GPQA Diamond64.6%
Humanity's Last Exam4.3%

Coding

LiveCodeBench51.1%
SciCode34.4%

Math

AIME 202539.0%

Agentic & Tool Use

Terminal-Bench Hard27.3%
τ²-bench32.5%

Instruction & Long Context

IFBench42.0%
Long-Context Reasoning43.7%

Benchmarks measured Jul 2026. Scores are independent evaluations, not vendor-reported.

Model Details

General

Creator: Anthropic
Family: Haiku
Tier: Lightweight
Context Window: 200K
Knowledge Cutoff: Feb 2025
Modalities: Text, Image

Capabilities

Tool Calling: Yes
Open Source: No
Subtypes: Chat Completion

Strengths & Limitations

Strengths

200K token context window enables processing of long documents
Vision input support for analyzing images alongside text
Tool calling functionality with structured output capabilities
Knowledge cutoff of February 2025 provides recent training data
Optimized for fast inference speeds in the Claude family
Multimodal capabilities despite being the lightweight tier
Streaming response support for real-time applications

Limitations

Proprietary model with no open-source weights available
Lower reasoning capabilities compared to Sonnet and Opus tiers
Reduced performance on complex analytical tasks
Limited compared to flagship models for advanced coding work
May struggle with highly technical or specialized domains

Key Features

•200K token context window

•Vision input (image analysis)

•Tool calling with structured output

•Streaming responses

•Multimodal chat completion

•Function calling capabilities

•JSON mode output formatting

•Batch API support

About Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic's lightweight model in the Claude family, positioned as the fast and efficient option below the more capable Sonnet and Opus tiers. As part of Anthropic's fourth-generation Claude models, Haiku 4.5 represents the speed-optimized variant designed for high-volume applications where rapid response times are prioritized. The model supports both text and image inputs through its 200,000 token context window, enabling multimodal conversations and document analysis. Claude Haiku 4.5 includes tool calling capabilities, allowing it to interact with external APIs and execute functions. With a knowledge cutoff of February 2025, it has relatively current training data. Despite being the lightweight tier, it maintains vision processing abilities that were not available in earlier Haiku generations. Haiku 4.5 serves applications requiring fast inference speeds at scale, such as content moderation, customer support automation, and real-time chat applications. While it sacrifices some reasoning depth compared to Sonnet and Opus variants, it delivers significantly faster response times, making it suitable for latency-sensitive deployments where good-enough performance is preferred over maximum capability.

Common Use Cases

Claude Haiku 4.5 is well-suited for high-volume applications where speed and cost efficiency are priorities. Its fast inference makes it ideal for customer support chatbots, content moderation systems, and real-time messaging applications. The vision capabilities enable document processing workflows, image captioning, and visual content analysis at scale. With tool calling support, it can power simple automation tasks and API integrations. The 200K context window allows for processing lengthy documents, transcripts, or conversations while maintaining the quick response times needed for interactive applications. It's particularly valuable for organizations that need good language model performance across many concurrent users without the latency or cost overhead of more powerful tiers.

Frequently Asked Questions

How much does Claude Haiku 4.5 cost per million tokens?

Claude Haiku 4.5 pricing varies by provider and usage type (standard vs batch processing). As the lightweight tier in Anthropic's model family, it's positioned as the most cost-effective option. Check the pricing table above for current rates across all available providers.

What is Claude Haiku 4.5 best used for?

Claude Haiku 4.5 excels in high-volume, latency-sensitive applications like customer support automation, content moderation, and real-time chat systems. Its vision capabilities make it suitable for document processing and image analysis workflows, while tool calling enables simple automation tasks. It's the optimal choice when you need good performance at scale rather than maximum reasoning capability.

How does Claude Haiku 4.5 compare to other models in the Claude family?

Haiku 4.5 is the fastest and most cost-effective model in the Claude family, trading some reasoning capability for superior speed and efficiency. Unlike earlier Haiku versions, it includes vision input support and maintains the 200K context window found in higher tiers. It's less capable than Sonnet for complex analysis or Opus for advanced reasoning, but delivers much faster response times for straightforward tasks.