Qwen 3.5 27B

Qwen 3.5 27B is Alibaba's lightweight multimodal model supporting text, image, and video inputs with a 262K token context window.

Context 262K
Tier Lightweight
Modalities text, image, video
Input from $0.195 / 1M tokens across 1 provider

API Pricing

Provider | Input / 1M | Output / 1M | Speed | TTFT | Updated
(not listed) | $0.195 | $1.56 | 90.5 t/s | 1.5s | 4/14/2026

Prices updated daily. Last check: 4/14/2026

Model Details

General

Creator
Alibaba
Family
Qwen
Tier
Lightweight
Context Window
262K
Modalities
Text, Image, Video

Capabilities

Tool Calling
No
Open Source
No

Strengths & Limitations

Strengths

  • Accepts three input types: text, image, and video
  • Large 262K token context window for lengthy multimodal content
  • Output speed of roughly 90 tokens per second (90.5 to 91.58 t/s in the measurements on this page)
  • Time to first token of roughly 1.3 to 1.5 seconds
  • Lightweight tier positioning enables cost-effective scaling

Limitations

  • No tool calling or function execution support
  • Proprietary model with no open source availability
  • Lightweight tier may have reduced reasoning capabilities compared to flagship models
  • Only one API provider listed, a smaller ecosystem than major frontier models

Key Features

262,144 token context window
Text input processing
Image input support
Video input processing
Multimodal content understanding
Streaming response generation

About Qwen 3.5 27B

Qwen 3.5 27B is a lightweight-tier model in Alibaba's Qwen family, positioned in the more accessible segment of the lineup to balance capability with efficiency for high-volume applications. It has a 262,144 token context window and accepts text, image, and video inputs. Measured performance is 91.58 output tokens per second with a time to first token of 1,325 milliseconds. The model does not support tool calling and is not available as open source. It targets multimodal understanding at scale: video input combined with this inference speed suits content analysis, moderation, and other high-throughput multimodal tasks.
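The throughput and time-to-first-token figures above can be combined into a rough response-time estimate. The sketch below assumes a simple model in which total time is the time to first token plus output length divided by output rate. The function name is hypothetical, and real latency will vary with provider load, prompt size, and input modality.

```python
# Figures quoted on this page; treat them as point-in-time measurements.
TTFT_SECONDS = 1.325              # time to first token (1,325 ms)
OUTPUT_TOKENS_PER_SECOND = 91.58  # measured output rate


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough end-to-end time for a streamed reply (hypothetical helper).

    Assumes a constant generation rate after the first token arrives.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SECOND


if __name__ == "__main__":
    for n in (100, 500, 2_000):
        print(f"{n:>5} output tokens: about {estimate_response_seconds(n):.1f} s")
```

Under this model a 500-token reply takes a little under 7 seconds, most of it spent generating rather than waiting for the first token.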

Common Use Cases

Qwen 3.5 27B is well suited to high-volume multimodal applications that process video alongside text and images. Its lightweight positioning fits content moderation systems, video analysis pipelines, and multimodal search, where cost efficiency matters. The 262K token context window leaves room for lengthy video transcripts or extensive multimodal documents in a single request. Teams running batch workflows over large datasets can use its video input without paying flagship-tier prices.
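When batching long transcripts or documents, it can help to check a request against the 262,144 token window before sending it. The sketch below is illustrative only: the helper names are hypothetical, the token counts are assumed to come from whatever tokenizer or estimate the caller already has, and it assumes that reserved output tokens count against the same window, which may not hold for every provider.

```python
CONTEXT_WINDOW_TOKENS = 262_144  # context window listed on this page


def remaining_tokens(prompt_tokens: int, reserved_output_tokens: int) -> int:
    """Tokens left in the window after the prompt and the reserved reply.

    Assumption: prompt and output share one window. A negative result
    means the request does not fit.
    """
    return CONTEXT_WINDOW_TOKENS - prompt_tokens - reserved_output_tokens


def fits_context(prompt_tokens: int, reserved_output_tokens: int) -> bool:
    """True when the request fits inside the context window."""
    return remaining_tokens(prompt_tokens, reserved_output_tokens) >= 0


if __name__ == "__main__":
    # Hypothetical batch: (label, estimated prompt tokens)
    batch = [("short clip transcript", 40_000), ("full-day recording", 260_000)]
    for label, prompt_tokens in batch:
        status = "fits" if fits_context(prompt_tokens, 8_000) else "needs splitting"
        print(f"{label}: {status}")
```

Requests that do not fit would need to be split or summarized upstream; how best to chunk video or image inputs depends on how the provider tokenizes them.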

Frequently Asked Questions

How much does Qwen 3.5 27B cost per million tokens?

As of 4/14/2026, the one provider listed in the pricing table above charges $0.195 per 1M input tokens and $1.56 per 1M output tokens. Pricing varies by provider and may differ for text versus image/video inputs, so check the table for current rates.
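As a worked example of the per-token arithmetic, the sketch below estimates the cost of a single request from the rates in the pricing table ($0.195 input and $1.56 output per 1M tokens as of 4/14/2026). The function name is hypothetical, and the calculation assumes every input token is billed at the listed input rate; image and video inputs may be metered differently.

```python
# Rates in USD per 1M tokens, taken from the pricing table on this page.
INPUT_USD_PER_MILLION = 0.195
OUTPUT_USD_PER_MILLION = 1.56


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one request in USD (hypothetical helper)."""
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 2,000-token reply.
    cost = estimate_cost_usd(200_000, 2_000)
    print(f"Estimated cost: ${cost:.4f}")
```

For this example the input side contributes about $0.039 and the output side about $0.003, so long prompts dominate the bill even though output tokens cost eight times as much each.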

What is Qwen 3.5 27B best used for?

Qwen 3.5 27B excels at multimodal applications requiring video processing capabilities, particularly for high-volume use cases like content moderation, video analysis, and multimodal search. Its lightweight tier positioning makes it cost-effective for scaling multimodal understanding across large datasets.

Does Qwen 3.5 27B support tool calling or function execution?

No, Qwen 3.5 27B does not support tool calling or function execution capabilities. It focuses on multimodal understanding and generation tasks across text, image, and video inputs without external tool integration.