Llama 4 Scout
Llama 4 Scout is Meta's flagship multimodal model with text and image input capabilities, featuring a 327K token context window for complex reasoning tasks.
API Pricing
Cheapest on Deep Infra — 32% below avg| Provider | Input / 1M | Output / 1M | Speed | TTFT | Updated |
|---|---|---|---|---|---|
| $0.080 | $0.300 | 136 t/s | 498ms | 4/4/2026 | |
| $0.080 | $0.300 | 136 t/s | 498ms | 4/14/2026 | |
| $0.085 | $0.330 | 136 t/s | 498ms | 4/14/2026 | |
| $0.110 | $0.340 | 136 t/s | 498ms | 4/14/2026 | |
| $0.170 | $0.660 | 136 t/s | 498ms | 4/14/2026 | |
| $0.180 | $0.590 | 136 t/s | 498ms | 4/14/2026 |
Prices updated daily. Last check: 4/14/2026
Model Details
General
- Creator
- Meta
- Family
- Llama
- Tier
- Flagship
- Context Window
- 328K
- Modalities
- Text, Image
Capabilities
- Tool Calling
- No
- Open Source
- No
Strengths & Limitations
- Large 327,680 token context window supports extensive document processing
- Native multimodal support for both text and image inputs
- Fast token generation at 133.36 tokens per second
- Relatively quick time-to-first-token at 473 milliseconds
- Flagship-tier reasoning capabilities across modalities
- Suitable for complex document analysis with visual components
- Meta's most advanced model architecture to date
- No tool calling or function execution capabilities
- Proprietary model with no open-source weights available
- Limited to text and image modalities only
- Does not support structured output modes like JSON
- Newer model family with less deployment history than established alternatives
Key Features
About Llama 4 Scout
Common Use Cases
Llama 4 Scout is designed for sophisticated multimodal applications requiring deep understanding of both textual and visual content. Its large context window makes it particularly effective for analyzing lengthy documents that include charts, diagrams, or images, such as research papers, technical manuals, or financial reports. The model excels at tasks like visual question answering, document summarization with image content, educational content creation, and complex reasoning across multiple data types. Its flagship-tier capabilities make it suitable for enterprise applications requiring nuanced understanding of mixed-media content, though the absence of tool calling limits its effectiveness for agentic workflows that require external API integration.
Frequently Asked Questions
How much does Llama 4 Scout cost per million tokens?
Llama 4 Scout pricing varies by provider and service type. Check the pricing table above for current rates across all available providers offering this model.
What is Llama 4 Scout best used for?
Llama 4 Scout excels at multimodal tasks requiring understanding of both text and images, such as document analysis with visual components, research paper summarization, educational content creation, and complex reasoning across mixed media. Its large 327K context window makes it particularly effective for processing extensive documents containing charts, diagrams, or other visual elements.
Can Llama 4 Scout call functions or use tools?
No, Llama 4 Scout does not support tool calling or function execution capabilities. It focuses on multimodal reasoning and content generation but cannot integrate with external APIs or execute structured function calls, limiting its use in agentic workflows that require external tool integration.