ERNIE 4.5 VL 424B
ERNIE 4.5 VL 424B is Baidu's flagship multimodal model. It has 424 billion parameters, accepts text and image inputs, and offers a 123K-token context window.
API Pricing
| Provider | Input / 1M tokens | Output / 1M tokens | Updated |
|---|---|---|---|
|  | $0.420 | $1.25 | 4/14/2026 |
Prices updated daily. Last check: 4/14/2026
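To make the per-million rates concrete, the following is a minimal sketch that estimates the cost of a single request from the two rates in the table above. The function name `estimate_cost_usd` and the example token counts are hypothetical. The sketch also assumes that image inputs are billed as ordinary input tokens, which the listing does not confirm.

```python
from decimal import Decimal

# Rates copied from the pricing table above (USD per 1M tokens, as of 4/14/2026).
INPUT_USD_PER_MILLION = Decimal("0.420")
OUTPUT_USD_PER_MILLION = Decimal("1.25")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed rates.

    Hypothetical helper: assumes every input token (text or image) is
    billed at the same input rate, which may not hold for every provider.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 100,000 input tokens and 2,000 output tokens.
    print(estimate_cost_usd(100_000, 2_000))
```

At the listed rates, that hypothetical request works out to $0.042 for input plus $0.0025 for output, or $0.0445 in total. `Decimal` is used instead of `float` so that small per-request amounts do not pick up binary rounding noise when summed over many requests.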
Model Details
General
- Creator: Baidu
- Family: ERNIE
- Tier: Flagship
- Context Window: 123K
- Modalities: Text, Image
Capabilities
- Tool Calling: No
- Open Source: No
Strengths & Limitations
Strengths
- 424 billion parameters provide substantial capacity for complex reasoning tasks
- Vision-language understanding of both text and image inputs in a single model
- 123,000-token context window handles lengthy documents
- Flagship tier within the ERNIE model family
- Developed by Baidu with a focus on Chinese language capabilities
Limitations
- No tool calling or function execution capabilities
- Proprietary model with no open-source availability
- Smaller context window than some competing flagship models
- Limited to text and image modalities
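As a rough illustration of what the 123,000-token limit means in practice, the sketch below checks whether a request fits in the context window. The helper name `fits_in_context` is hypothetical. The sketch assumes that the prompt and the generated output share the same window, and that token counts come from the provider's own tokenizer, which is not shown here. How image inputs count toward the limit depends on the provider and is not covered.

```python
# Context window from the listing above, in tokens.
CONTEXT_WINDOW_TOKENS = 123_000


def fits_in_context(prompt_tokens: int, reserved_output_tokens: int = 0) -> bool:
    """Return True if the prompt plus reserved output fits in the window.

    Hypothetical helper: assumes input and output tokens draw on the
    same 123,000-token budget.
    """
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Hypothetical document of 120,000 tokens with 2,000 tokens kept for the answer.
    print(fits_in_context(120_000, reserved_output_tokens=2_000))
```

Reserving output tokens up front avoids sending a prompt that fits on its own but leaves no room for the response.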
About ERNIE 4.5 VL 424B
Common Use Cases
ERNIE 4.5 VL 424B suits enterprise applications that analyze text and images together. Its 424B parameter count and flagship tier make it appropriate for complex reasoning, analysis of documents that contain visual elements, and content generation. It is most relevant to organizations that work in Chinese-language contexts, or that want a single model to handle mixed text and image workloads that do not depend on tool calling.
Frequently Asked Questions
How much does ERNIE 4.5 VL 424B cost per million tokens?
The pricing table above lists $0.420 per million input tokens and $1.25 per million output tokens as of 4/14/2026. Pricing varies by provider and may differ for text versus image inputs, so check the table for current rates.
What is ERNIE 4.5 VL 424B best used for?
ERNIE 4.5 VL 424B is best used for multimodal tasks that need both text and image understanding, such as analysis of documents with visual elements, content creation involving images, and complex reasoning. Its 424B parameters make it suitable for enterprise applications.
Does ERNIE 4.5 VL 424B support tool calling or function execution?
No, ERNIE 4.5 VL 424B does not support tool calling or function execution capabilities. It focuses on text and image processing without external tool integration features.