
GLM-4.7

GLM-4.7 is Zhipu's flagship text-only model with a 128K token context window, designed for complex reasoning and tool-calling applications.

Context 128K
Tier Flagship
Tools Supported
Input from
$0.070 / 1M tokens
across 3 providers

API Pricing

Cheapest on Amazon AWS (77% below avg)

Provider     Input / 1M   Output / 1M   Speed      TTFT    Updated
Amazon AWS   $0.070       $0.400        77.0 t/s   693ms   4/14/2026
(unknown)    $0.390       $1.75         77.0 t/s   693ms   4/14/2026
(unknown)    $0.450       $2.00         77.0 t/s   693ms   4/14/2026

Prices updated daily. Last check: 4/14/2026
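Estimating spend from the table above is simple arithmetic: token count divided by one million, times the per-million rate, summed for input and output. A minimal sketch using the cheapest listed rates ($0.070 input / $0.400 output per 1M tokens); the example token counts are illustrative:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD for one request, given per-1M-token rates."""
    return (input_tokens / 1_000_000) * input_rate \
         + (output_tokens / 1_000_000) * output_rate

# Example: 10K input + 2K output tokens at the cheapest listed rates.
cost = request_cost(10_000, 2_000, input_rate=0.070, output_rate=0.400)
print(f"${cost:.4f}")  # roughly $0.0015 at these rates
```

At these prices even long-context workloads stay inexpensive; the input rate dominates only when prompts are much larger than completions.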

Model Details

General

Creator
Zhipu
Family
GLM
Tier
Flagship
Context Window
128K
Modalities
Text

Capabilities

Tool Calling
Yes
Open Source
No
Subtypes
Chat Completion

Strengths & Limitations

Strengths

  • 128,000 token context window for processing lengthy documents
  • Tool calling support enables integration with external APIs and services
  • 87.72 tokens per second output speed for responsive interactions
  • Flagship-tier capabilities for complex reasoning tasks
  • Part of the established GLM model family with a proven track record
  • Competitive inference speed for real-time applications
  • Supports the chat completion format for conversational applications

Limitations

  • Text-only input; no image or multimodal capabilities
  • Proprietary model with no open-source availability
  • 1,380ms time to first token is slower than some competitors
  • Smaller context window than some flagship models offering 200K+ tokens
  • Limited benchmark data available for comprehensive evaluation

Key Features

128,000 token context window
Tool calling with external API integration
Chat completion interface
Text generation and analysis
Streaming response support
Function calling capabilities
Complex reasoning processing
Agent-based application support
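The chat completion and streaming features above can be exercised from any HTTP client. A minimal sketch of the request payload, assuming GLM-4.7 is served through an OpenAI-style chat completions endpoint (the model identifier and message format here are illustrative assumptions, not confirmed API details; check your provider's documentation):

```python
import json

def build_chat_request(user_message: str, stream: bool = False) -> dict:
    """Build an OpenAI-style chat-completion payload for GLM-4.7.

    The "glm-4.7" model id is an assumption for illustration; providers
    may expose the model under a different string.
    """
    return {
        "model": "glm-4.7",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "stream": stream,  # streaming responses are listed as supported
    }

payload = build_chat_request("Summarize this specification.", stream=True)
body = json.dumps(payload)  # POST this to the provider's chat endpoint
```

With `stream=True`, responses would arrive as incremental chunks rather than a single completion, which is what makes the model usable for responsive interactive applications.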

About GLM-4.7

GLM-4.7 is Zhipu's flagship model in the GLM family, representing the company's most advanced text generation capabilities. As a proprietary model from the Chinese AI company Zhipu, GLM-4.7 sits at the top of their model lineup for complex reasoning tasks. The model features a 128,000 token context window and supports tool calling functionality, enabling it to interact with external APIs and services. GLM-4.7 processes text-only inputs and generates responses at 87.72 tokens per second with a time to first token of 1,380 milliseconds, according to Artificial Analysis benchmarks. GLM-4.7 competes in the flagship model category alongside other advanced language models, positioning itself as Zhipu's answer to complex reasoning, coding, and agent-based applications that require sophisticated tool integration capabilities.

Common Use Cases

GLM-4.7 is designed for sophisticated applications requiring advanced reasoning and tool integration. Its flagship-tier capabilities make it suitable for complex coding assistance, research analysis, and agent-based systems that need to interact with external APIs. The 128K context window enables processing of lengthy documents, technical specifications, and multi-turn conversations with substantial context retention. Organizations building AI assistants, automated workflows, or applications requiring function calling will find GLM-4.7's tool integration capabilities particularly valuable for creating interactive systems that can query databases, call APIs, and perform multi-step reasoning tasks.
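The function-calling workflow described above generally starts with declaring tools in the request. A sketch of such a payload in the common OpenAI-style `tools` format, assuming GLM-4.7 accepts this schema (the `get_weather` function and its parameters are hypothetical, purely for illustration):

```python
def build_tool_request(user_message: str) -> dict:
    """Build a chat request that declares one callable tool.

    Both the tool definition and the "glm-4.7" model id are illustrative
    assumptions; the exact schema may differ per provider.
    """
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
    return {
        "model": "glm-4.7",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [weather_tool],
        "tool_choice": "auto",  # let the model decide whether to call it
    }

req = build_tool_request("What's the weather in Beijing?")
```

In a full agent loop, the application would execute whatever tool call the model returns, append the result as a tool message, and re-invoke the model, repeating until it produces a final answer.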

Frequently Asked Questions

How much does GLM-4.7 cost per million tokens?

GLM-4.7 pricing varies by provider and pricing type (standard vs batch). Check the pricing table above for current rates across all providers.

What is GLM-4.7 best used for?

GLM-4.7 excels at complex reasoning tasks, coding assistance, and agent-based applications requiring tool calling capabilities. Its 128K context window makes it well-suited for document analysis, research tasks, and multi-turn conversations with substantial context retention.

Does GLM-4.7 support image input or multimodal capabilities?

No, GLM-4.7 is text-only and does not support image input or other modalities. It focuses exclusively on text generation, reasoning, and tool calling functionality for text-based applications.