
GPT-OSS-20B

GPT-OSS-20B is OpenAI's open-source lightweight model with 20 billion parameters, offering 128K context and fast inference for high-volume applications.

Context 128K
Tier Lightweight
Knowledge Jan 2025
License Open Source
Input from $0.030 / 1M tokens across 4 providers

API Pricing

Cheapest on OpenRouter: 42% below average
Provider | Input / 1M | Output / 1M | Speed   | TTFT  | Updated
—        | $0.030     | $0.140      | 208 t/s | 443ms | 4/14/2026
—        | $0.035     | $0.150      | 208 t/s | 443ms | 4/14/2026
—        | $0.050     | $0.200      | 208 t/s | 443ms | 4/14/2026
—        | $0.070     | $0.300      | 208 t/s | 443ms | 4/14/2026
—        | $0.075     | $0.300      | 208 t/s | 443ms | 4/14/2026

Prices updated daily. Last check: 4/14/2026
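To make the per-1M-token rates above concrete, the sketch below estimates the cost of a single request at the cheapest listed rate ($0.030 input / $0.140 output per 1M tokens). The token counts in the example are hypothetical.

```python
# Estimate request cost from the per-1M-token rates in the pricing table.
INPUT_RATE = 0.030 / 1_000_000   # USD per input token (cheapest listed rate)
OUTPUT_RATE = 0.140 / 1_000_000  # USD per output token (cheapest listed rate)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 2,000-token prompt that produces a 500-token reply.
cost = request_cost(2_000, 500)
print(f"${cost:.6f}")  # well under a tenth of a cent per request
```

At these rates, even a million such requests per day stays in the low hundreds of dollars, which is the economics behind the "high-volume applications" positioning.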

Model Details

General

Creator
OpenAI
Family
GPT-OSS
Tier
Lightweight
Context Window
128K
Knowledge Cutoff
Jan 2025
Modalities
Text

Capabilities

Tool Calling
No
Open Source
Yes
Subtypes
Chat Completion

Strengths & Limitations

  • Open-source model weights available for self-hosting and customization
  • Fast inference speed at 222.5 tokens per second output rate
  • 128K token context window supports long document processing
  • Low time to first token at 389ms enables responsive applications
  • January 2025 knowledge cutoff provides recent training data
  • 20B parameter size balances capability with computational efficiency
  • Text-focused architecture optimized for chat completion tasks
  • No tool calling or function execution capabilities
  • Text-only modality lacks image or multimodal input support
  • Lightweight tier positioning means less capable than frontier models
  • Smaller parameter count than larger models in competing families
  • Self-hosting requires significant infrastructure and technical expertise

Key Features

128,000 token context window
Open-source model weights and architecture
Chat completion API compatibility
Streaming response support
20 billion parameter architecture
High-speed inference optimization
January 2025 knowledge cutoff
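Since the model advertises chat completion API compatibility and streaming support, a request body follows the familiar Chat Completions shape. A minimal sketch of building one — the model identifier `gpt-oss-20b` and any endpoint URL vary by provider and are assumptions here, so check your provider's documentation:

```python
import json

def build_chat_request(user_message: str, stream: bool = True) -> str:
    """Build a Chat Completions-style request body as a JSON string."""
    payload = {
        "model": "gpt-oss-20b",  # hypothetical provider-specific model id
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "stream": stream,      # streaming responses are supported by the model
        "max_tokens": 512,
    }
    return json.dumps(payload)

body = build_chat_request("Summarize this document in three bullet points.")
```

The resulting string is what you would POST to the provider's chat completions endpoint with a standard HTTP client.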

About GPT-OSS-20B

GPT-OSS-20B is OpenAI's open-source language model in the GPT-OSS family, positioned as a lightweight tier option with 20 billion parameters. This represents OpenAI's entry into open-source models, making their technology accessible for self-hosting and customization. The model supports a 128,000 token context window and is optimized for text-only chat completion tasks. Performance benchmarks show 222.5 output tokens per second with a 389ms time to first token, indicating strong inference speed characteristics. The model has a January 2025 knowledge cutoff, providing relatively current training data.

GPT-OSS-20B targets use cases where organizations need OpenAI-quality language understanding but require model ownership, customization capabilities, or cost control through self-hosting. Its lightweight architecture makes it suitable for applications requiring fast responses at scale while maintaining competitive language capabilities.

Common Use Cases

GPT-OSS-20B is designed for organizations requiring high-volume text processing with model ownership and customization capabilities. Its lightweight architecture and fast inference make it suitable for customer service chatbots, content moderation, document summarization, and text classification tasks where response speed matters. The open-source nature enables fine-tuning for domain-specific applications, compliance requirements, or cost optimization through self-hosting. The 128K context window supports applications processing long documents, legal texts, or extended conversations while maintaining computational efficiency.
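Even a 128K window needs budgeting when processing very long documents. A minimal sketch of splitting input to fit the context window, assuming the common rough heuristic of about 4 characters per token (real counts require the model's tokenizer, so treat this as an estimate only):

```python
CONTEXT_WINDOW = 128_000  # tokens, per the model specs above
CHARS_PER_TOKEN = 4       # rough heuristic; use a real tokenizer for accuracy

def estimate_tokens(text: str) -> int:
    """Crude token-count estimate based on character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def chunk_for_context(text: str, reserved_for_output: int = 4_000) -> list[str]:
    """Split text into chunks that fit the window minus an output budget."""
    budget_chars = (CONTEXT_WINDOW - reserved_for_output) * CHARS_PER_TOKEN
    return [text[i:i + budget_chars] for i in range(0, len(text), budget_chars)]

# A ~1M-character document splits into a handful of window-sized chunks.
chunks = chunk_for_context("x" * 1_000_000)
```

Reserving part of the window for the model's output is the key design point: a summarization request that fills all 128K tokens with input leaves no room for the summary.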

Frequently Asked Questions

How much does GPT-OSS-20B cost per million tokens?

GPT-OSS-20B pricing varies by provider and deployment method, with some providers offering hosted inference while others support self-hosting the open-source weights. Check the pricing table above for current rates across all available providers.

What is GPT-OSS-20B best used for?

GPT-OSS-20B excels at high-volume text processing applications like customer service automation, content moderation, document analysis, and text classification. Its open-source nature makes it ideal for organizations needing model customization, compliance requirements, or cost control through self-hosting.

Can I fine-tune or modify GPT-OSS-20B for my specific use case?

Yes, GPT-OSS-20B provides open-source model weights that can be fine-tuned and customized for specific domains or applications. This enables organizations to adapt the model for specialized tasks, compliance requirements, or performance optimization while maintaining full control over the deployment.
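Fine-tuning pipelines typically consume chat-formatted training examples. A minimal sketch of preparing such data as JSONL in the common messages layout — the exact schema a given training framework expects may differ, so this is an illustrative format rather than a specification:

```python
import json

# Hypothetical domain-specific examples for a ticket-classification fine-tune.
examples = [
    {
        "messages": [
            {"role": "system", "content": "Classify support tickets."},
            {"role": "user", "content": "My invoice shows the wrong amount."},
            {"role": "assistant", "content": "billing"},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "Classify support tickets."},
            {"role": "user", "content": "The app crashes on launch."},
            {"role": "assistant", "content": "technical"},
        ]
    },
]

def to_jsonl(records) -> str:
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(r) for r in records)

jsonl = to_jsonl(examples)
```

Each line pairs a prompt with the desired completion; the assistant turn is what the fine-tuned model learns to produce.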