LightweightAlibaba

Qwen 3 30B

Name: Qwen 3 30B
Availability: InStock
Author: Alibaba

Qwen 3 30B is Alibaba's lightweight text-only model with a 262K token context window, optimized for efficient text processing tasks.

Context 262K

Tier Lightweight

Input from

$0.117 / 1M tokens

across 4 providers

Compare Prices

API Pricing

Cheapest on IO.NET — 7% below avg

Provider	Input / 1M	Output / 1M	Cached / 1M	Speed	TTFT	Updated
IO.NET	$0.117	$1.14	$0.059	153 t/s	971ms	6/18/2026
Deep Infra	$0.120	$0.500	-	153 t/s	971ms	7/13/2026
OpenRouter	$0.120	$0.500	-	153 t/s	971ms	7/13/2026
Together AI	$0.150	$1.50	-	153 t/s	971ms	6/18/2026

Prices updated daily. Last check: Jul 13, 2026

Performance & Benchmarks

Source: Artificial Analysis →

Intelligence

9.1 / 100

Math

66.3 / 100

Output Speed

153 t/s

Latency (TTFT)

971ms

Reasoning & Knowledge

MMLU-Pro77.7%
GPQA Diamond65.9%
Humanity's Last Exam6.8%

Coding

LiveCodeBench51.5%
SciCode30.4%

Math

AIME 202566.3%
AIME72.7%
MATH-50097.5%

Agentic & Tool Use

Terminal-Bench Hard6.1%
τ²-bench10.2%

Instruction & Long Context

IFBench33.1%
Long-Context Reasoning22.7%

Benchmarks measured Jul 2026. Scores are independent evaluations, not vendor-reported.

Model Details

General

Creator: Alibaba
Family: Qwen
Tier: Lightweight
Context Window: 262K
Modalities: Text

Capabilities

Tool Calling: No
Open Source: No
Aliases: qwen3-next-80b-a3b-thinking, qwen3-next-80b-a3b-instruct, qwen3-next-80b-a3b, qwen3-30b-a3b-thinking-2507, qwen3-30b-a3b-instruct-2507

Strengths & Limitations

Strengths

Large 262,144 token context window for processing lengthy documents
Fast generation speed at 73.23 tokens per second
Lightweight architecture for efficient resource utilization
Part of Alibaba's established Qwen model family
Reasonable time to first token at 1,210 milliseconds
Text-focused design optimized for language tasks
Multiple deployment variants available through aliases

Limitations

No tool calling or function execution capabilities
Text-only modality without image or audio support
Proprietary model with no open source availability
Lightweight tier positioning limits complex reasoning capabilities
No multimodal input processing

Key Features

•262,144 token context window

•Text input and output processing

•Streaming response generation

•Multiple model variants (instruct and thinking modes)

•Batch processing support

•API-based deployment

•Chinese and multilingual text support

•Document-length context handling

About Qwen 3 30B

Qwen 3 30B is a lightweight model from Alibaba's Qwen family, positioned as an efficient option for text-only processing tasks. As part of the Qwen 3 generation, it represents Alibaba's approach to balancing capability with computational efficiency in the lightweight tier. The model features a 262,144 token context window and processes text-only inputs. Performance benchmarks show it generates approximately 73 tokens per second with a time to first token of 1,210 milliseconds. The model operates as a proprietary offering without tool calling capabilities, focusing on core text understanding and generation tasks. Qwen 3 30B serves applications requiring efficient text processing without the computational overhead of larger models. Its extended context window makes it suitable for document analysis and longer conversations while maintaining faster response times compared to flagship-tier models in the Qwen family.

Common Use Cases

Qwen 3 30B is designed for applications requiring efficient text processing with extended context support. Its lightweight architecture makes it suitable for high-volume text classification, content summarization, document analysis, and conversational applications where speed and efficiency are priorities over complex reasoning. The large context window enables processing of lengthy documents, research papers, or extended conversations without context truncation. Organizations needing cost-effective text processing for customer support, content moderation, or document processing workflows can leverage its balance of capability and efficiency.

Frequently Asked Questions

How much does Qwen 3 30B cost per million tokens?

Qwen 3 30B pricing varies by provider and pricing type (standard vs batch). Check the pricing table above for current rates across all providers.

What is Qwen 3 30B best used for?

Qwen 3 30B excels at efficient text processing tasks including document analysis, content summarization, conversational applications, and high-volume text classification. Its 262K context window and fast generation speed make it suitable for applications requiring extended context understanding without the computational cost of larger models.

Does Qwen 3 30B support tool calling or multimodal inputs?

No, Qwen 3 30B is a text-only model without tool calling capabilities or support for images, audio, or other modalities. It focuses specifically on text understanding and generation tasks with optimized performance for these core language processing functions.