GB300 GPU
The NVIDIA GB300 Grace Blackwell Ultra Superchip pairs two B300 (Blackwell Ultra) GPUs with a Grace CPU over NVLink-C2C, offering 576 GB of HBM3e memory and massive compute for AI reasoning and trillion-parameter models.

Cloud Pricing
Cheapest on Verda — 48% below average.

| Provider | GPUs | Price / hr | Updated | Source |
|---|---|---|---|---|
| | 2× GPU | $2.80 | 4/8/2026 | |
| | 1× GPU | $2.80 | 4/8/2026 | |
| | 4× GPU | $2.80 | 4/8/2026 | |
| | 1× GPU | $7.99 | 4/8/2026 | |
| | 2× GPU | $7.99 | 4/8/2026 | |
| | 4× GPU | $7.99 | 4/8/2026 | |
Prices updated daily. Last check: 4/8/2026
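The "48% below average" figure can be reproduced from the two per-GPU rates listed above ($2.80 and $7.99), assuming the advertised prices are per GPU-hour and these are the only two tiers:

```python
# Sketch: verify the "48% below average" claim from the pricing table.
# Assumes $2.80 and $7.99 are per-GPU-hour rates and the only two tiers.
rates = [2.80, 7.99]  # $ per GPU-hour
avg = sum(rates) / len(rates)
discount = (avg - min(rates)) / avg  # fraction below the average rate
print(f"average: ${avg:.2f}/GPU-hr; cheapest rate is {discount:.0%} below average")
```

With those assumptions, the average works out to about $5.40/GPU-hr and the $2.80 rate sits 48% below it, matching the headline figure.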
Performance
Strengths & Limitations
Strengths:
- 576 GB of HBM3e memory enables processing of extremely large models and datasets
- 10,000 TFLOPS of FP16 performance delivers exceptional throughput for AI inference workloads
- Fifth-generation NVLink with 130 TB/s of aggregate bandwidth (in the NVL72 rack configuration) provides high-speed GPU-to-GPU communication
- Enhanced FP4 Tensor Core density offers a 1.5x improvement over standard Blackwell GPUs
- 16 TB/s memory bandwidth supports high-throughput data processing applications
- AI reasoning inference optimization specifically targets test-time scaling workloads
- Integrated ConnectX-8 SuperNIC provides advanced networking capabilities

Limitations:
- 2,700 W TDP requires specialized liquid cooling infrastructure and significant power capacity
- Rack-scale NVL72 architecture limits deployment flexibility compared to individual GPU configurations
- Ultra-performance tier positioning makes it overkill for standard training or basic inference tasks
- High power consumption may limit deployment in power-constrained data centers
- The liquid cooling requirement increases infrastructure complexity and maintenance needs
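To put the 576 GB capacity in concrete terms, here is a back-of-the-envelope estimate of the largest dense model whose weights alone would fit, at common inference precisions (KV cache and activations excluded, so real headroom is smaller):

```python
# Sketch: largest dense model (weights only) that fits in 576 GB of HBM,
# at assumed storage sizes of FP16 = 2 B, FP8 = 1 B, FP4 = 0.5 B per parameter.
HBM_GB = 576
for name, bytes_per_param in [("FP16", 2), ("FP8", 1), ("FP4", 0.5)]:
    params_billions = HBM_GB * 1e9 / bytes_per_param / 1e9
    print(f"{name}: ~{params_billions:.0f}B parameters")
```

Roughly 288B parameters at FP16, 576B at FP8, and about 1.15T at FP4, which is why FP4 density is what brings trillion-parameter inference into reach on a single superchip.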
About GB300
Common Use Cases
The GB300 is designed for hyperscale AI factory applications that require maximum computational throughput and memory capacity. Its 576 GB memory and AI Reasoning Inference capabilities make it well-suited for large language model inference, real-time video generation, and test-time scaling workloads where models need extensive memory for processing complex reasoning tasks. The rack-scale architecture and high interconnect bandwidth support distributed inference across multiple models simultaneously, making it appropriate for cloud providers offering premium AI services or research institutions running large-scale AI experiments that demand the highest available performance tier.
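For LLM inference specifically, memory bandwidth sets a hard ceiling on single-stream decode throughput: each generated token must read every weight once, so tokens/s cannot exceed bandwidth divided by model size in bytes. A sketch with a hypothetical 400B-parameter model stored in FP4 (the model size and precision are illustrative assumptions, not a GB300 benchmark):

```python
# Sketch: bandwidth ceiling on single-stream decode throughput.
# tokens/s <= memory bandwidth / bytes of weights read per token.
bandwidth_bytes_s = 16e12     # 16 TB/s GB300 memory bandwidth
model_bytes = 400e9 * 0.5     # hypothetical 400B params in FP4 = 200 GB
tokens_per_s = bandwidth_bytes_s / model_bytes
print(f"upper bound: ~{tokens_per_s:.0f} tokens/s per stream")
```

That works out to an 80 tokens/s ceiling for this example; batching amortizes the weight reads across streams, which is how rack-scale deployments recover much higher aggregate throughput.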
Full Specifications
Hardware
- Manufacturer: NVIDIA
- Architecture: Blackwell
- TDP: 2,700 W
- Max Power: 2,800 W
Memory & Performance
- VRAM: 576 GB
- Memory Bandwidth: 16,000 GB/s
- FP32: 200 TFLOPS
- FP16: 10,000 TFLOPS
- Release: 2025
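The FP16 compute and memory bandwidth figures above imply a compute-to-bandwidth ratio (the roofline "ridge point"): how many FLOPs a kernel must perform per byte loaded before it becomes compute-bound rather than bandwidth-bound.

```python
# Sketch: roofline ridge point from the listed specs.
fp16_flops_s = 10_000e12   # 10,000 TFLOPS FP16
bandwidth_bytes_s = 16_000e9   # 16,000 GB/s
ridge_point = fp16_flops_s / bandwidth_bytes_s
print(f"ridge point: ~{ridge_point:.0f} FLOPs per byte")
```

At 625 FLOPs/byte, only very large matrix multiplications saturate the FP16 units; low-arithmetic-intensity workloads like single-stream decode remain bandwidth-bound.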
Frequently Asked Questions
How much does a GB300 cost per hour in the cloud?
GB300 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers offering this ultra-performance tier GPU.
What is the GB300 best used for?
The GB300 excels at AI factory applications, large-scale inference workloads, real-time video generation, and test-time scaling inference. Its 576 GB memory capacity and AI Reasoning Inference optimizations make it ideal for processing the largest language models and complex reasoning tasks that require maximum memory and computational throughput.
How does the GB300 compare to the H100 for inference workloads?
The GB300 offers significantly higher memory capacity (576 GB vs 80 GB HBM3 on the H100) and 1.5x denser FP4 Tensor Cores than standard Blackwell, a number format the Hopper-based H100 lacks entirely. The Blackwell Ultra architecture adds optimizations for AI reasoning inference, test-time scaling, and real-time processing that were not available in the previous-generation H100, though the GB300 requires liquid cooling and consumes substantially more power.