
GPT-OSS-20B

GPT-OSS-20B is OpenAI's open-source lightweight model with 20 billion parameters, offering 128K context and fast inference for high-volume applications.

Context 128K
Tier Lightweight
Knowledge Jan 2025
License Open Source
Input from $0.030 / 1M tokens across 4 providers

API Pricing

Cheapest on OpenRouter: 42% below average
Provider | Input / 1M | Output / 1M | Speed   | TTFT  | Updated
—        | $0.030     | $0.140      | 208 t/s | 443ms | 4/14/2026
—        | $0.035     | $0.150      | 208 t/s | 443ms | 4/14/2026
—        | $0.050     | $0.200      | 208 t/s | 443ms | 4/14/2026
—        | $0.070     | $0.300      | 208 t/s | 443ms | 4/14/2026
—        | $0.075     | $0.300      | 208 t/s | 443ms | 4/14/2026

Prices updated daily. Last check: 4/14/2026
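To make the per-1M-token rates above concrete, the sketch below estimates the cost of a single request at the cheapest listed rate ($0.030 input / $0.140 output per 1M tokens). The token counts in the example are hypothetical.

```python
# Estimate request cost from the per-1M-token rates in the pricing table.
INPUT_RATE = 0.030 / 1_000_000   # USD per input token (cheapest listed rate)
OUTPUT_RATE = 0.140 / 1_000_000  # USD per output token (cheapest listed rate)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 2,000-token prompt that produces a 500-token reply.
cost = request_cost(2_000, 500)
print(f"${cost:.6f}")  # well under a tenth of a cent per request
```

At these rates, even a million such requests per day stays in the low hundreds of dollars, which is the economics behind the "high-volume applications" positioning.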

Model Details

General

Creator
OpenAI
Family
GPT-OSS
Tier
Lightweight
Context Window
128K
Knowledge Cutoff
Jan 2025
Modalities
Text

Capabilities

Tool Calling
No
Open Source
Yes
Subtypes
Chat Completion

Strengths & Limitations

  • Open-source model weights available for self-hosting and customization
  • Fast inference speed at 222.5 tokens per second output rate
  • 128K token context window supports long document processing
  • Low time to first token at 389ms enables responsive applications
  • January 2025 knowledge cutoff provides recent training data
  • 20B parameter size balances capability with computational efficiency
  • Text-focused architecture optimized for chat completion tasks
  • No tool calling or function execution capabilities
  • Text-only modality lacks image or multimodal input support
  • Lightweight tier positioning means less capable than frontier models
  • Smaller parameter count than larger models in competing families
  • Self-hosting requires significant infrastructure and technical expertise

Key Features

128,000 token context window
Open-source model weights and architecture
Chat completion API compatibility
Streaming response support
20 billion parameter architecture
High-speed inference optimization
January 2025 knowledge cutoff
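Since the model advertises chat completion API compatibility and streaming support, a request body follows the familiar Chat Completions shape. A minimal sketch of building one — the model identifier `gpt-oss-20b` and any endpoint URL vary by provider and are assumptions here, so check your provider's documentation:

```python
import json

def build_chat_request(user_message: str, stream: bool = True) -> str:
    """Build a Chat Completions-style request body as a JSON string."""
    payload = {
        "model": "gpt-oss-20b",  # hypothetical provider-specific model id
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "stream": stream,      # streaming responses are supported by the model
        "max_tokens": 512,
    }
    return json.dumps(payload)

body = build_chat_request("Summarize this document in three bullet points.")
```

The resulting string is what you would POST to the provider's chat completions endpoint with a standard HTTP client.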

About GPT-OSS-20B

GPT-OSS-20B is OpenAI's open-source language model in the GPT-OSS family, positioned as a lightweight tier option with 20 billion parameters. This represents OpenAI's entry into open-source models, making their technology accessible for self-hosting and customization. The model supports a 128,000 token context window and is optimized for text-only chat completion tasks. Performance benchmarks show 222.5 output tokens per second with a 389ms time to first token, indicating strong inference speed characteristics. The model has a January 2025 knowledge cutoff, providing relatively current training data.

GPT-OSS-20B targets use cases where organizations need OpenAI-quality language understanding but require model ownership, customization capabilities, or cost control through self-hosting. Its lightweight architecture makes it suitable for applications requiring fast responses at scale while maintaining competitive language capabilities.

Common Use Cases

GPT-OSS-20B is designed for organizations requiring high-volume text processing with model ownership and customization capabilities. Its lightweight architecture and fast inference make it suitable for customer service chatbots, content moderation, document summarization, and text classification tasks where response speed matters. The open-source nature enables fine-tuning for domain-specific applications, compliance requirements, or cost optimization through self-hosting. The 128K context window supports applications processing long documents, legal texts, or extended conversations while maintaining computational efficiency.
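Even a 128K window needs budgeting when processing very long documents. A minimal sketch of splitting input to fit the context window, assuming the common rough heuristic of about 4 characters per token (real counts require the model's tokenizer, so treat this as an estimate only):

```python
CONTEXT_WINDOW = 128_000  # tokens, per the model specs above
CHARS_PER_TOKEN = 4       # rough heuristic; use a real tokenizer for accuracy

def estimate_tokens(text: str) -> int:
    """Crude token-count estimate based on character length."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def chunk_for_context(text: str, reserved_for_output: int = 4_000) -> list[str]:
    """Split text into chunks that fit the window minus an output budget."""
    budget_chars = (CONTEXT_WINDOW - reserved_for_output) * CHARS_PER_TOKEN
    return [text[i:i + budget_chars] for i in range(0, len(text), budget_chars)]

# A ~1M-character document splits into a handful of window-sized chunks.
chunks = chunk_for_context("x" * 1_000_000)
```

Reserving part of the window for the model's output is the key design point: a summarization request that fills all 128K tokens with input leaves no room for the summary.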

Frequently Asked Questions

How much does GPT-OSS-20B cost per million tokens?

GPT-OSS-20B pricing varies by provider and deployment method, with some providers offering hosted inference while others support self-hosting the open-source weights. Check the pricing table above for current rates across all available providers.

What is GPT-OSS-20B best used for?

GPT-OSS-20B excels at high-volume text processing applications like customer service automation, content moderation, document analysis, and text classification. Its open-source nature makes it ideal for organizations needing model customization, compliance requirements, or cost control through self-hosting.

Can I fine-tune or modify GPT-OSS-20B for my specific use case?

Yes, GPT-OSS-20B provides open-source model weights that can be fine-tuned and customized for specific domains or applications. This enables organizations to adapt the model for specialized tasks, compliance requirements, or performance optimization while maintaining full control over the deployment.
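Fine-tuning pipelines typically consume chat-formatted training examples. A minimal sketch of preparing such data as JSONL in the common messages layout — the exact schema a given training framework expects may differ, so this is an illustrative format rather than a specification:

```python
import json

# Hypothetical domain-specific examples for a ticket-classification fine-tune.
examples = [
    {
        "messages": [
            {"role": "system", "content": "Classify support tickets."},
            {"role": "user", "content": "My invoice shows the wrong amount."},
            {"role": "assistant", "content": "billing"},
        ]
    },
    {
        "messages": [
            {"role": "system", "content": "Classify support tickets."},
            {"role": "user", "content": "The app crashes on launch."},
            {"role": "assistant", "content": "technical"},
        ]
    },
]

def to_jsonl(records) -> str:
    """Serialize records as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(r) for r in records)

jsonl = to_jsonl(examples)
```

Each line pairs a prompt with the desired completion; the assistant turn is what the fine-tuned model learns to produce.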