Nemotron 3 Super 120B
Nemotron 3 Super 120B is NVIDIA's flagship 120-billion-parameter language model, with a 262K-token context window for complex text processing tasks.
API Pricing
Cheapest on Deep Infra: 14% below average
| Provider | Input / 1M | Output / 1M | Speed | TTFT | Updated |
|---|---|---|---|---|---|
| | $0.100 | $0.500 | 154 t/s | 687ms | 4/4/2026 |
| | $0.100 | $0.500 | 154 t/s | 687ms | 4/14/2026 |
| | $0.150 | $0.650 | 154 t/s | 687ms | 4/14/2026 |
Prices updated daily. Last check: 4/14/2026
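The "14% below average" figure can be sanity-checked against the listed rates. The sketch below assumes the comparison is against the simple mean of the three input prices in the table; the page does not say how it computes the figure, so the helper `percent_below_average` is purely illustrative.

```python
# Input prices per 1M tokens, copied from the three rows of the pricing table.
input_prices = [0.100, 0.100, 0.150]


def percent_below_average(prices):
    """Return how far the cheapest price sits below the mean, in percent."""
    average = sum(prices) / len(prices)
    return (average - min(prices)) / average * 100


print(f"{percent_below_average(input_prices):.1f}% below average")  # prints "14.3% below average"
```

Running the same function on the output prices (0.500, 0.500, 0.650) gives roughly 9%, so under this assumption the headline figure most plausibly refers to input pricing.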
Model Details
General
- Creator: NVIDIA
- Family: Nemotron
- Tier: Flagship
- Context Window: 262K
- Modalities: Text
Capabilities
- Tool Calling: No
- Open Source: No
Strengths & Limitations
Strengths
- Large 262K token context window supports processing of long documents and lengthy conversations
- 120 billion parameters provide substantial model capacity
- Output rate of 164.34 tokens per second
- Time to first token of 744ms
- Developed by NVIDIA, with potential optimization for its hardware ecosystem
- Flagship tier within the Nemotron model family
Limitations
- No tool calling or function execution capabilities
- Text-only modality: no image, audio, or video input support
- Proprietary model with no open source availability
- Limited API features compared to models with structured output modes
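The throughput and time-to-first-token figures in the list above can be combined into a rough response-time estimate. This is a simplified sketch: it assumes a constant streaming rate and ignores queueing, network overhead, and the effect of prompt length on time to first token. The function name and the 1,000-token response length are hypothetical.

```python
def estimated_response_seconds(output_tokens, ttft_ms, tokens_per_second):
    """Rough wall-clock estimate: wait for the first token, then stream the rest."""
    return ttft_ms / 1000 + output_tokens / tokens_per_second


# 744 ms and 164.34 t/s are the figures quoted in the list above;
# 1,000 output tokens is an assumed response length.
seconds = estimated_response_seconds(1_000, 744, 164.34)
print(f"{seconds:.1f} s")  # prints "6.8 s"
```

For long generations the streaming term dominates, so the time to first token matters mainly for short, interactive replies.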
About Nemotron 3 Super 120B
Common Use Cases
Nemotron 3 Super 120B targets applications that need long-context language processing. Its 262K token context window suits document analysis, legal document review, academic research processing, and lengthy technical documentation. With 120B parameters and flagship-tier positioning, it is aimed at reasoning over extended text, such as content summarization, research synthesis, and detailed text analysis. Organizations already working within NVIDIA's ecosystem may find it a natural fit for text-heavy applications.
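Before sending a long document, it helps to estimate whether it fits the context window. The sketch below is a rough pre-check only: the 4-characters-per-token ratio is an assumed rule of thumb for English text, not this model's tokenizer, and `262_000` is a conservative reading of "262K" (the exact limit may differ). The function name and the reserved output budget are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 262_000  # "262K" from the model details; exact limit may differ
CHARS_PER_TOKEN = 4              # assumed rule of thumb, not the model's real tokenizer


def fits_in_context(text, reserved_output_tokens=4_000):
    """Estimate whether text plus a reserved output budget fits the window."""
    estimated_tokens = len(text) // CHARS_PER_TOKEN + 1
    return estimated_tokens + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


print(fits_in_context("x" * 400_000))    # about 100K estimated tokens -> True
print(fits_in_context("x" * 1_200_000))  # about 300K estimated tokens -> False
```

For anything near the limit, counting tokens with the provider's actual tokenizer is the safer choice, since character-based estimates can be off by a wide margin for code or non-English text.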
Frequently Asked Questions
How much does Nemotron 3 Super 120B cost per million tokens?
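Nemotron 3 Super 120B pricing varies by provider. In the pricing table above, input ranges from $0.100 to $0.150 per million tokens and output from $0.500 to $0.650 per million tokens, as of the last check on 4/14/2026. Check the table for current rates across all available providers.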
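Per-million-token rates translate into a per-request cost with simple arithmetic. The sketch below uses the lowest and highest rates from the pricing table; the workload (a 200,000-token document summarized into 2,000 tokens) and the function name are hypothetical.

```python
def request_cost(input_tokens, output_tokens, input_per_million, output_per_million):
    """Cost in USD for one request at per-million-token rates."""
    total = input_tokens * input_per_million + output_tokens * output_per_million
    return total / 1_000_000


# Hypothetical workload: a 200,000-token document summarized into 2,000 tokens.
cheapest = request_cost(200_000, 2_000, 0.100, 0.500)
priciest = request_cost(200_000, 2_000, 0.150, 0.650)
print(f"${cheapest:.4f} to ${priciest:.4f} per request")  # prints "$0.0210 to $0.0313 per request"
```

For long-context workloads like this one, input tokens account for nearly all of the cost, so the input rate is the number to compare across providers.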
What is Nemotron 3 Super 120B best used for?
Nemotron 3 Super 120B excels at tasks requiring extensive context understanding and complex text processing. Its 262K token context window makes it ideal for document analysis, research synthesis, legal document review, and processing lengthy technical materials where maintaining context across extended passages is crucial.
Does Nemotron 3 Super 120B support tool calling or multimodal inputs?
No, Nemotron 3 Super 120B focuses exclusively on text processing and does not support tool calling, function execution, or multimodal inputs like images or audio. It is designed as a pure language model for text-based applications requiring substantial context and processing capacity.