
Step 3.5 Flash

Step 3.5 Flash is Stepfun's lightweight model designed for fast text generation, featuring a 262K token context window and high throughput performance.

Context 262K
Tier Lightweight
Input from $0.100 / 1M tokens, across 1 provider

API Pricing

Provider   Input / 1M   Output / 1M   Speed     TTFT     Updated
—          $0.100       $0.300       248 t/s   709 ms   4/14/2026

Prices updated daily. Last check: 4/14/2026
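At the listed rates, per-request cost is simple to estimate. A minimal sketch (the rates below are copied from the table above and may change; the 50K/1K example is illustrative):

```python
# Rough cost estimate for Step 3.5 Flash at the listed rates.
# Rates change; check the pricing table above for current values.

INPUT_PER_M = 0.100   # USD per 1M input tokens
OUTPUT_PER_M = 0.300  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

# Example: summarizing a 50K-token document into a 1K-token summary
print(round(estimate_cost(50_000, 1_000), 4))  # → 0.0053
```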

Model Details

General

Creator
Stepfun
Family
Step
Tier
Lightweight
Context Window
262K
Modalities
Text

Capabilities

Tool Calling
No
Open Source
No

Strengths & Limitations

  • Fast token generation at 169.5 tokens per second
  • Large 262K token context window for processing lengthy documents
  • Quick response initiation with 812ms time-to-first-token
  • Lightweight architecture optimized for speed over complexity
  • Suitable for high-throughput text processing workflows
  • Streamlined feature set reduces overhead for basic text tasks
  • No tool calling or function execution capabilities
  • Text-only modality: no image or multimodal input support
  • Proprietary model with no open-source weights available
  • Positioned as the lightweight tier within the Step family
  • Fewer advanced reasoning capabilities than flagship Step models

Key Features

262,144 token context window
Text input and output processing
Streaming response generation
Fast token generation (169.5 tokens/second)
Quick response initiation (812ms TTFT)
Lightweight model architecture
High-throughput text processing
API access through multiple providers
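The benchmark figures above translate into a simple end-to-end latency estimate: time-to-first-token plus generation time for the remaining tokens. A rough planning sketch using the quoted numbers (real latency varies by provider and load):

```python
# Back-of-envelope response-time estimate from the benchmark figures
# quoted above (812 ms TTFT, 169.5 tokens/s). Treat as approximate.

TTFT_S = 0.812        # time to first token, seconds
TOKENS_PER_S = 169.5  # sustained generation speed

def estimated_latency(output_tokens: int) -> float:
    """Seconds until a response of `output_tokens` finishes streaming."""
    return TTFT_S + output_tokens / TOKENS_PER_S

# Example: a 500-token reply
print(f"{estimated_latency(500):.2f}s")  # → 3.76s
```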

About Step 3.5 Flash

Step 3.5 Flash is a lightweight language model developed by Stepfun, positioned as the fast-generation option within the Step model family. It emphasizes speed and efficiency over maximum capability, making it suitable for applications that require quick responses and high throughput.

The model supports a 262,144-token context window, allowing it to process substantial amounts of text in a single request. Performance benchmarks show Step 3.5 Flash generating tokens at 169.5 tokens per second with a time-to-first-token of 812 milliseconds.

The model handles text-only interactions and does not include tool calling, keeping its feature set streamlined for core text generation tasks. Step 3.5 Flash targets use cases where response speed and processing efficiency are prioritized over complex reasoning or multimodal capabilities. Its combination of a large context window and fast generation makes it practical for content processing, summarization, and other text-heavy workflows where quick turnaround is essential.
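Before sending a long document, it is worth checking whether it plausibly fits in the 262,144-token window. A minimal sketch, assuming a rough 4-characters-per-token heuristic for English text (this is an assumption, not Stepfun's tokenizer; use the provider's tokenizer for exact counts):

```python
# Quick fit check against the 262,144-token context window.
# CHARS_PER_TOKEN is a rough English-text heuristic (assumption),
# not Stepfun's actual tokenizer.

CONTEXT_WINDOW = 262_144
CHARS_PER_TOKEN = 4  # rough heuristic

def fits_in_context(text: str, reserved_output: int = 4_096) -> bool:
    """Approximate whether `text` plus a reserved output budget fits."""
    approx_tokens = len(text) / CHARS_PER_TOKEN
    return approx_tokens + reserved_output <= CONTEXT_WINDOW

# ~600K characters ≈ 150K tokens: comfortably inside the window
print(fits_in_context("hello " * 100_000))  # → True
```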

Common Use Cases

Step 3.5 Flash is designed for applications requiring fast text generation and high throughput processing. Its large context window combined with quick generation speeds makes it well-suited for document summarization, content processing pipelines, customer service automation, and real-time text analysis. The model works effectively for workflows that need to process substantial text volumes quickly, such as content moderation, text classification at scale, or generating responses in chat applications where speed is prioritized. Its lightweight nature makes it cost-effective for high-volume deployments where complex reasoning capabilities are not required.

Frequently Asked Questions

How much does Step 3.5 Flash cost per million tokens?

Step 3.5 Flash pricing varies by provider and may include different rates for input and output tokens. Check the pricing table above for current rates across all available providers offering this model.

What is Step 3.5 Flash best used for?

Step 3.5 Flash excels at high-throughput text processing tasks where speed is important. With its 169.5 tokens per second generation rate and large 262K context window, it's ideal for document summarization, content processing pipelines, customer service automation, and real-time text analysis where quick responses matter more than complex reasoning.

Does Step 3.5 Flash support tool calling or multimodal inputs?

No, Step 3.5 Flash is designed as a streamlined text-only model without tool calling capabilities or support for images and other modalities. This focused approach contributes to its fast performance characteristics and makes it suitable for pure text generation tasks.