Reka Flash 3

Reka Flash 3 is Reka's lightweight text model, designed for fast inference. It has a 65,536-token context window and targets speed-focused applications.

Context: 66K (65,536 tokens)
Tier: Lightweight
Input from $0.100 / 1M tokens across 1 provider

API Pricing

Provider      Input / 1M   Output / 1M   Speed      TTFT   Updated
(not named)   $0.100       $0.200        87.0 t/s   1.2s   4/14/2026

Prices updated daily. Last check: 4/14/2026
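
The per-token rates above translate into a request or batch cost with simple arithmetic. The following is a minimal sketch, assuming the listed rates ($0.100 input, $0.200 output per 1M tokens) apply uniformly; the function name and the example workload are hypothetical and not part of any Reka or provider API.

```python
# Rates copied from the pricing table (USD per 1M tokens).
# They are updated daily on the listing, so treat them as a snapshot.
INPUT_USD_PER_M = 0.100
OUTPUT_USD_PER_M = 0.200


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost for a given number of input and output tokens."""
    return (
        input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    ) / 1_000_000


# Hypothetical workload: 10,000 requests, each with 2,000 input tokens
# and 500 output tokens.
total = estimate_cost_usd(10_000 * 2_000, 10_000 * 500)
print(f"${total:.2f}")  # prints $3.00
```

Because output tokens cost twice as much as input tokens at these rates, workloads with long generations are more sensitive to the output rate than to the input rate.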

Model Details

General

Creator: Reka
Family: Reka
Tier: Lightweight
Context Window: 66K (65,536 tokens)
Modalities: Text

Capabilities

Tool Calling: No
Open Source: No

Strengths & Limitations

Strengths

  • High throughput at 85.37 output tokens per second for fast text generation
  • 65,536-token context window allows processing of substantial documents
  • Lightweight architecture optimized for speed and efficiency
  • Time to first token of 1,268 ms, acceptable for interactive applications
  • Text-focused design without the complexity of multimodal processing
  • Suitable for high-volume batch processing scenarios

Limitations

  • No tool calling or function execution capabilities
  • Text-only input, with no support for images or other modalities
  • Proprietary model with no open source weights available
  • Lightweight tier limits advanced reasoning compared to flagship models
  • Longer time to first token than some speed-optimized competitors

Key Features

65,536 token context window
Text input and output processing
Streaming response support
Batch processing capabilities
REST API access
Lightweight inference optimization
High-throughput text generation

About Reka Flash 3

Reka Flash 3 is a lightweight text generation model developed by Reka, positioned as the speed-optimized offering within the Reka model family. As a lightweight tier model, it prioritizes fast inference and efficiency over maximum capability, which suits applications where response time is critical. The model has a 65,536-token context window and handles text input and output only, with no multimodal capabilities or tool calling. Measured performance is 85.37 output tokens per second with a time to first token of 1,268 milliseconds, a profile that favors throughput-oriented workloads. Reka Flash 3 competes with other lightweight models in scenarios where cost efficiency and speed matter more than advanced reasoning. The model is proprietary and available only through API access.
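
The two measured figures above can be combined into a rough end-to-end latency estimate: time to first token plus generation time at the measured throughput. This is a minimal sketch, assuming both figures stay constant regardless of prompt length and load (real latency will vary); the function name is hypothetical.

```python
# Figures quoted in the text above; assumed constant for this estimate.
TTFT_SECONDS = 1.268               # time to first token (1,268 ms)
OUTPUT_TOKENS_PER_SECOND = 85.37   # measured output throughput


def estimate_response_seconds(output_tokens: int) -> float:
    """Estimate total response time: TTFT plus time to generate the output."""
    return TTFT_SECONDS + output_tokens / OUTPUT_TOKENS_PER_SECOND


# A 500-token reply takes roughly 7 seconds under these assumptions.
for n in (50, 500, 2_000):
    print(f"{n} output tokens: about {estimate_response_seconds(n):.1f} s")
```

For short replies the fixed time to first token dominates, while for long replies the throughput term dominates, which is why the model fits batch and long-generation workloads better than very short interactive turns.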

Common Use Cases

Reka Flash 3 is designed for applications requiring fast, cost-effective text processing at scale. Its lightweight architecture and high throughput make it well suited to content generation, document summarization, text classification, and customer service chatbots where speed matters more than complex reasoning. The 65,536-token context window allows longer documents to be processed in a single request. Organizations running high-volume text processing workloads, automated content workflows, or real-time chat applications can benefit from its speed-optimized design. Users who need tool use or multimodal input should consider higher-tier alternatives.
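
When sending long documents, it helps to check the token budget before making a request. The sketch below assumes the 65,536-token window is shared between the prompt and the generated output, which is a common convention but is not confirmed by this listing. It takes token counts as inputs because the model's tokenizer is not described here; the function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 65_536  # context window stated in the listing


def max_prompt_tokens(reserved_output_tokens: int) -> int:
    """Return how many prompt tokens remain after reserving room for output.

    Assumes prompt and output share one window (an assumption, see above).
    """
    if not 0 <= reserved_output_tokens <= CONTEXT_WINDOW_TOKENS:
        raise ValueError("reserved_output_tokens must be within the window")
    return CONTEXT_WINDOW_TOKENS - reserved_output_tokens


def fits_in_context(prompt_tokens: int, reserved_output_tokens: int) -> bool:
    """Check whether a prompt plus the reserved output fits in the window."""
    return prompt_tokens <= max_prompt_tokens(reserved_output_tokens)


print(max_prompt_tokens(4_000))        # 61536
print(fits_in_context(60_000, 4_000))  # True
print(fits_in_context(64_000, 4_000))  # False
```

A document that does not fit would need to be split or summarized in stages before being sent.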

Frequently Asked Questions

How much does Reka Flash 3 cost per million tokens?

As of the last check (4/14/2026), the one listed provider charges $0.100 per 1M input tokens and $0.200 per 1M output tokens. Prices are updated daily, so check the pricing table above for current rates.

What is Reka Flash 3 best used for?

Reka Flash 3 excels at high-volume text processing tasks requiring fast response times, including content generation, document processing, text classification, and chatbot applications where speed and cost efficiency are prioritized over advanced reasoning capabilities.

Does Reka Flash 3 support tool calling or function execution?

No, Reka Flash 3 does not support tool calling or function execution. It focuses exclusively on text generation tasks. Users needing tool integration should consider higher-tier models in the Reka family or other providers offering function calling capabilities.