GB200 GPU
The NVIDIA GB200 Grace Blackwell Superchip pairs two Blackwell B200 GPUs with an Arm-based Grace CPU over the 900 GB/s NVLink-C2C interconnect, offering massive memory capacity and bandwidth for trillion-parameter AI models and HPC workloads.

Strengths & Limitations
- 384GB VRAM capacity supports very large models and datasets (see the sizing sketch after this list)
- 16,000 GB/s memory bandwidth enables high-throughput data processing
- 9,000 TFLOPS FP16 performance for AI and ML workloads
- Second-generation Transformer Engine with FP4 precision support
- Fifth-generation NVIDIA NVLink with 1.8 TB/s of per-GPU bandwidth, scaling to 130 TB/s aggregate across an NVL72 domain
- 72-GPU NVLink domain capability for massive parallel processing
- Blackwell architecture optimizations targeted at transformer-based AI workloads
- 2700W TDP requires specialized power and cooling infrastructure
- Rack-scale configuration limits deployment flexibility
- Liquid cooling requirement increases infrastructure complexity
- High memory capacity may be excessive for smaller workloads
- Limited to NVIDIA CUDA ecosystem and compatible software
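To put the 384GB figure in perspective, here is a minimal back-of-the-envelope sketch in plain Python. The bytes-per-parameter values are standard for each precision, but the calculation counts model weights only; real workloads also need memory for activations, optimizer state, and KV cache.

```python
# Rough sizing sketch: how many model parameters fit in 384GB of HBM,
# counting weights only (no activations, optimizer state, or KV cache).
VRAM_GB = 384

BYTES_PER_PARAM = {
    "fp16/bf16": 2.0,
    "fp8": 1.0,
    "fp4": 0.5,
}

for precision, nbytes in BYTES_PER_PARAM.items():
    # 1 GB holds 1e9 bytes, so GB / (bytes per param) = billions of params.
    max_params_b = VRAM_GB / nbytes
    print(f"{precision:>9}: ~{max_params_b:.0f}B parameters of weights")
```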
About GB200
Common Use Cases
The GB200 is designed for large-scale AI training and inference workloads that require substantial memory capacity and high inter-GPU bandwidth. Its 384GB VRAM makes it suitable for training and serving large language models, while the 72-GPU NVLink domain capability supports massive parallel training runs. The high memory bandwidth and FP4 precision support through the Transformer Engine optimize it for transformer-based models and large-scale inference serving. HPC applications requiring high memory capacity and low-latency GPU-to-GPU communication also benefit from its specifications.
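As a hedged illustration of what scale-out training across an NVLink domain looks like from the framework side, the sketch below uses PyTorch's DistributedDataParallel with the NCCL backend, which routes collectives over NVLink where available. The model and hyperparameters are placeholders rather than anything GB200-specific; a trillion-parameter run would shard the model (e.g., FSDP or tensor parallelism) instead of replicating it.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun supplies RANK/LOCAL_RANK/WORLD_SIZE; NCCL routes collectives
    # over NVLink automatically when the GPUs in a job share a domain.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model: a real LLM run would shard the model (FSDP or
    # tensor parallelism) instead of fully replicating it as DDP does.
    model = torch.nn.Linear(4096, 4096).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    x = torch.randn(8, 4096, device=f"cuda:{local_rank}")
    loss = model(x).square().mean()
    loss.backward()  # gradients are all-reduced across ranks via NCCL
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with, e.g., `torchrun --nproc_per_node=4 train.py`; `torchrun` sets the `RANK`/`LOCAL_RANK`/`WORLD_SIZE` environment variables the script reads.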
Full Specifications
Hardware
- Manufacturer: NVIDIA
- Architecture: Blackwell
- TDP: 2700W
Memory & Performance
- VRAM: 384GB
- Memory Bandwidth: 16,000 GB/s
- FP32: 160 TFLOPS
- FP16: 9,000 TFLOPS
- BF16: 4,500 TFLOPS
- FP8: 9,000 TFLOPS
- Release: 2024
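If you have a node in hand, the memory figures are easy to sanity-check. A small sketch using PyTorch's device-property API follows; reported names and totals vary by SKU, driver, and how the superchip is exposed to the OS.

```python
import torch

# List each visible GPU with its HBM capacity. On a GB200 superchip the two
# B200 GPUs typically appear as separate devices; 384GB is the combined total.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, "
          f"{props.total_memory / 1e9:.0f} GB HBM, "
          f"{props.multi_processor_count} SMs")
```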
Frequently Asked Questions
How much does a GB200 cost per hour in the cloud?
GB200 pricing varies by provider, region, and commitment level, and on-demand rates change frequently, so compare current listings across providers before committing.
What is the GB200 best used for?
The GB200 is optimized for large language model training and inference, massive-scale AI workloads, and HPC applications requiring high memory capacity. Its 384GB VRAM and 72-GPU NVLink domain capability make it particularly suitable for training very large models and high-throughput inference serving.
How does the GB200 compare to the H100 for AI workloads?
The GB200 offers significantly higher memory capacity (384GB vs 80GB), newer Blackwell architecture optimizations, and second-generation Transformer Engine with FP4 precision support. It also provides enhanced NVLink connectivity for larger GPU domains, making it better suited for very large models and scale-out training compared to the previous-generation H100.
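To make the capacity gap concrete, here is a small hedged sketch: it assumes FP8 weights at one byte per parameter, an illustrative 405B-parameter model, and counts weight storage only (no KV cache or activations).

```python
import math

PARAMS_B = 405         # illustrative model size, in billions of parameters
BYTES_PER_PARAM = 1.0  # FP8 weights

# Weight footprint in GB: billions of params * bytes/param (at 1e9 bytes/GB).
weights_gb = PARAMS_B * BYTES_PER_PARAM

for name, capacity_gb in [("H100 (80GB)", 80), ("GB200 (384GB)", 384)]:
    units = math.ceil(weights_gb / capacity_gb)
    print(f"{name}: {units} device(s) just to hold {weights_gb:.0f}GB of weights")
```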