GLM-4.6
GLM-4.6 is Zhipu's lightweight text model with tool calling support and a 128K token context window, optimized for efficient performance.
API Pricing
Cheapest on OpenRouter — 71% below avg| Provider | Input / 1M | Output / 1M | Speed | TTFT | Updated |
|---|---|---|---|---|---|
| $0.100 | $0.100 | 27.3 t/s | 1.2s | 4/14/2026 | |
| $0.600 | $2.20 | 27.3 t/s | 1.2s | 4/14/2026 |
Prices updated daily. Last check: 4/14/2026
Model Details
General
- Creator
- Zhipu
- Family
- GLM
- Tier
- Lightweight
- Context Window
- 128K
- Modalities
- Text
Capabilities
- Tool Calling
- Yes
- Open Source
- No
- Subtypes
- Chat Completion
- Aliases
- glm-4-32b
Strengths & Limitations
- Tool calling support enables API integrations and structured workflows
- 128K token context window accommodates substantial document processing
- 28.25 tokens per second output speed for responsive applications
- Lightweight design reduces computational requirements compared to flagship models
- Chat completion format supports conversational applications
- Efficient time to first token at 1,193 milliseconds
- Text-only modality lacks image or multimodal input support
- Proprietary model with no open source availability
- Lightweight tier may have reduced reasoning capabilities compared to flagship models
- Limited to chat completion format without other generation modes
Key Features
About GLM-4.6
Common Use Cases
GLM-4.6 is well-suited for applications requiring efficient text processing with tool integration capabilities. Its lightweight design makes it appropriate for high-volume chat applications, customer service bots, and automated workflows that need to call external APIs. The 128K context window supports document analysis, content summarization, and multi-turn conversations with substantial history. Organizations building cost-effective language applications that don't require the advanced reasoning of frontier models can leverage GLM-4.6 for reliable performance in production environments where response speed and resource efficiency are priorities.
Frequently Asked Questions
How much does GLM-4.6 cost per million tokens?
GLM-4.6 pricing varies by provider and pricing type. Check the pricing table above for current rates across all providers offering this model.
What is GLM-4.6 best used for?
GLM-4.6 excels at efficient text processing tasks including chat applications, customer service automation, and workflows requiring tool calling capabilities. Its lightweight design and 128K context window make it suitable for document processing and multi-turn conversations where performance efficiency is important.
Does GLM-4.6 support image inputs or multimodal capabilities?
No, GLM-4.6 is a text-only model that supports chat completions but does not process images or other multimodal inputs. It focuses on text generation with tool calling functionality for API integrations.