LightweightAlibaba

Qwen 3.5 9B

Name: Qwen 3.5 9B
Availability: InStock
Author: Alibaba

Qwen 3.5 9B is Alibaba's lightweight multimodal model supporting text, image, and video inputs with a 256K token context window.

Context 256K

Tier Lightweight

Modalities text, image, video

Input from

$0.065 / 1M tokens

across 3 providers

Compare Prices

API Pricing

Cheapest on OpenRouter — 42% below avg

Provider	Input / 1M	Output / 1M	Updated
OpenRouter	$0.065	$0.260	7/13/2026
Deep Infra	$0.100	$0.150	7/13/2026
Together AI	$0.170	$0.250	7/13/2026

Prices updated daily. Last check: Jul 13, 2026

Performance & Benchmarks

Source: Artificial Analysis →

Intelligence

20.3 / 100

Coding

23.5 / 100

Reasoning & Knowledge

GPQA Diamond78.6%
Humanity's Last Exam8.6%

Coding

SciCode27.7%

Agentic & Tool Use

Terminal-Bench Hard18.2%
Terminal-Bench v2.121.3%
τ²-bench85.1%
τ-bench Banking4.1%

Instruction & Long Context

IFBench37.8%
Long-Context Reasoning38.0%

Benchmarks measured Jul 2026. Scores are independent evaluations, not vendor-reported.

Model Details

General

Creator: Alibaba
Family: Qwen
Tier: Lightweight
Context Window: 256K
Modalities: Text, Image, Video

Capabilities

Tool Calling: No
Open Source: No
Aliases: qwen3-5-flash-02-23

Strengths & Limitations

Strengths

Multimodal support for text, image, and video inputs
Large 256K token context window for processing lengthy documents
Fast inference speed at 114.53 output tokens per second
Quick response initiation with 289ms time to first token
Lightweight architecture suitable for high-throughput applications
Video understanding capabilities beyond standard text and image models
Efficient resource utilization compared to larger models in family

Limitations

No tool calling or function execution capabilities
Proprietary model with no open-source availability
Limited to 9B parameters compared to larger Qwen family models
Lightweight tier may have reduced reasoning capabilities versus flagship models

Key Features

•256K token context window

•Text input and generation

•Image processing and understanding

•Video content analysis

•Streaming response support

•Multi-language text processing

•Fast inference optimization

•Batch processing capabilities

About Qwen 3.5 9B

Qwen 3.5 9B is a lightweight multimodal model developed by Alibaba as part of the Qwen family. Positioned as an efficient option within the Qwen lineup, this model balances capability with speed for applications requiring moderate complexity processing. The model features a 256K token context window and supports multimodal inputs including text, image, and video content. With benchmark performance showing 114.53 output tokens per second and a time to first token of 289ms, Qwen 3.5 9B demonstrates competitive inference speeds. The model operates as a proprietary system without open-source availability and does not include tool calling functionality, focusing instead on core multimodal understanding and generation tasks.

Common Use Cases

Qwen 3.5 9B is well-suited for applications requiring multimodal content processing at scale, including document analysis with embedded images, video content summarization, and educational content creation. Its lightweight architecture and fast inference speeds make it appropriate for real-time applications like customer service chatbots that need to handle mixed media inputs, content moderation systems processing images and videos, and automated transcription services. The large context window supports processing of lengthy documents with multimedia elements, while the efficient performance characteristics enable deployment in cost-sensitive environments where high throughput is prioritized over maximum model capability.

Frequently Asked Questions

How much does Qwen 3.5 9B cost per million tokens?

Qwen 3.5 9B pricing varies by provider and pricing type (standard vs batch). Check the pricing table above for current rates across all providers.

What is Qwen 3.5 9B best used for?

Qwen 3.5 9B excels at multimodal content processing tasks including document analysis with images, video summarization, and real-time applications requiring fast inference speeds. Its lightweight architecture makes it suitable for high-throughput scenarios where efficiency is prioritized.

Does Qwen 3.5 9B support tool calling and function execution?

No, Qwen 3.5 9B does not include tool calling or function execution capabilities. It focuses on multimodal understanding and generation tasks across text, image, and video inputs without external tool integration.