
GB200 GPU

The NVIDIA GB200 Grace Blackwell Superchip connects two Blackwell B200 GPUs to a Grace CPU via the NVLink-C2C interconnect, offering massive memory capacity and bandwidth for trillion-parameter AI models and HPC workloads.

VRAM: 384GB
TDP: 2700W
From $10.50/hr across 1 provider

Cloud Pricing

Provider    GPUs      Price / hr    Updated
—           4× GPU    $10.50        4/7/2026

Listings are sourced either direct from the provider or via a marketplace.

Prices updated daily. Last check: 4/8/2026
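As a back-of-envelope check on the listed rate, a short script can turn the hourly price into a total run cost. The $10.50/hr figure is the 4-GPU listing above; any other rate or commitment discount is an assumption you would substitute in.

```python
# Back-of-envelope cloud cost estimate for a GB200 instance.
# The default rate is the listed 4-GPU instance price above;
# adjust for your provider, region, and commitment level.
def training_cost(hours: float, rate_per_hour: float = 10.50) -> float:
    """Total on-demand cost for a run of `hours` at `rate_per_hour`."""
    return hours * rate_per_hour

# A week-long (168 h) run on one 4-GPU instance:
print(f"${training_cost(168):,.2f}")  # → $1,764.00
```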

Performance

FP16: 9,000 TFLOPS
FP32: 160 TFLOPS
BF16: 4,500 TFLOPS
FP8: 9,000 TFLOPS
Memory bandwidth: 16,000 GB/s
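Taken together, the listed compute and bandwidth figures imply a roofline "ridge point": the arithmetic intensity (FLOPs per byte moved) a kernel needs to be compute-bound rather than memory-bound. A minimal sketch using the FP16 and bandwidth numbers above:

```python
# Roofline ridge point from the listed figures: a kernel needs at least
# this many FLOPs per byte of memory traffic to be compute-bound rather
# than memory-bound on this GPU.
PEAK_FP16_TFLOPS = 9_000   # listed FP16 throughput
MEM_BW_GBPS = 16_000       # listed memory bandwidth

ridge_flops_per_byte = (PEAK_FP16_TFLOPS * 1e12) / (MEM_BW_GBPS * 1e9)
print(f"{ridge_flops_per_byte:.1f} FLOPs/byte")  # → 562.5 FLOPs/byte
```

Large matrix multiplications clear this threshold easily; bandwidth-bound operations such as elementwise ops and single-token decoding do not, which is why the 16,000 GB/s figure matters as much as the TFLOPS number for inference serving.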

Strengths & Limitations

Strengths:

  • 384GB VRAM capacity supports very large models and datasets
  • 16,000 GB/s memory bandwidth enables high-throughput data processing
  • 9,000 TFLOPS FP16 performance for AI and ML workloads
  • Second-generation Transformer Engine with FP4 precision support
  • Fifth-generation NVIDIA NVLink with up to 130 TB/s of aggregate GPU-to-GPU bandwidth in a 72-GPU domain
  • 72-GPU NVLink domain capability for massive parallel processing
  • Blackwell architecture optimizations for current AI workloads

Limitations:

  • 2700W TDP requires specialized power and cooling infrastructure
  • Rack-scale configuration limits deployment flexibility
  • Liquid cooling requirement increases infrastructure complexity
  • 384GB memory capacity may be excessive for smaller workloads
  • Limited to the NVIDIA CUDA ecosystem and compatible software
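To make the "very large models" claim concrete, a rough weights-only estimate shows how many parameters fit in 384GB at common precisions. This ignores KV cache, activations, and framework overhead, so practical limits are lower; the bytes-per-parameter values are the standard sizes for FP16, FP8, and FP4.

```python
# Rough upper bound on model size whose weights fit in the listed
# 384 GB of VRAM. Ignores KV cache, activations, and framework
# overhead -- real deployable limits are lower.
VRAM_GB = 384

def max_params_billions(bytes_per_param: float) -> float:
    """Largest weights-only model (in billions of params) that fits."""
    return VRAM_GB * 1e9 / bytes_per_param / 1e9

for label, bpp in [("FP16", 2), ("FP8", 1), ("FP4", 0.5)]:
    print(f"{label}: ~{max_params_billions(bpp):.0f}B params")
# FP16 ~192B, FP8 ~384B, FP4 ~768B (weights only)
```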

Key Features

Second-generation Transformer Engine
FP4 precision support
Fifth-generation NVIDIA NVLink
NVLink Switch System
HBM3E memory technology
72-GPU NVLink domain
Liquid cooling system
Blackwell architecture optimizations

About GB200

The NVIDIA GB200 is a data center GPU based on the Blackwell architecture, NVIDIA's current-generation platform for large-scale AI and HPC workloads. It delivers 384GB of VRAM with 16,000 GB/s of memory bandwidth, positioning it as a high-capacity solution for memory-intensive applications. With a 2700W TDP, it is designed for rack-scale deployments that require liquid cooling infrastructure.

Key technical specifications include 9,000 TFLOPS of FP16 performance and FP4 precision support through the second-generation Transformer Engine. Fifth-generation NVIDIA NVLink enables up to 130 TB/s of low-latency GPU-to-GPU communication within 72-GPU NVLink domains, and the substantial memory capacity and bandwidth suit applications requiring large model storage and high-throughput data processing.

In cloud deployments, the GB200 targets large language model training and inference, massive-scale AI workloads, and high-performance computing applications that benefit from its extensive memory capacity and inter-GPU connectivity. The rack-scale configuration and liquid cooling requirements make it best suited to specialized data center environments optimized for high-density compute workloads.

Common Use Cases

The GB200 is designed for large-scale AI training and inference workloads that require substantial memory capacity and high inter-GPU bandwidth. Its 384GB VRAM makes it suitable for training and serving large language models, while the 72-GPU NVLink domain capability supports massive parallel training runs. The high memory bandwidth and FP4 precision support through the Transformer Engine optimize it for transformer-based models and large-scale inference serving. HPC applications requiring high memory capacity and low-latency GPU-to-GPU communication also benefit from its specifications.
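As an illustration of what the 72-GPU NVLink domain implies for aggregate memory, the arithmetic below assumes (per the specs above) that the 384GB figure covers the superchip's two GPUs, i.e. 192GB per GPU. This is a back-of-envelope estimate, not a vendor figure.

```python
# Aggregate HBM across a 72-GPU NVLink domain, assuming the listed
# 384 GB covers the GB200 superchip's two GPUs (192 GB per GPU).
# Illustrative arithmetic only -- not a vendor specification.
GPUS_IN_DOMAIN = 72
GB_PER_GPU = 384 / 2

total_tb = GPUS_IN_DOMAIN * GB_PER_GPU / 1000
print(f"~{total_tb:.1f} TB of pooled HBM")  # → ~13.8 TB
```

Pooled memory on this scale is what makes single-job training of trillion-parameter models practical within one NVLink domain, without falling back to slower inter-node networking for every gradient exchange.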

Full Specifications

Hardware

Manufacturer: NVIDIA
Architecture: Blackwell
TDP: 2700W

Memory & Performance

VRAM: 384GB
Memory bandwidth: 16,000 GB/s
FP32: 160 TFLOPS
FP16: 9,000 TFLOPS
BF16: 4,500 TFLOPS
FP8: 9,000 TFLOPS
Release: 2024

Frequently Asked Questions

How much does a GB200 cost per hour in the cloud?

GB200 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.

What is the GB200 best used for?

The GB200 is optimized for large language model training and inference, massive-scale AI workloads, and HPC applications requiring high memory capacity. Its 384GB VRAM and 72-GPU NVLink domain capability make it particularly suitable for training very large models and high-throughput inference serving.

How does the GB200 compare to the H100 for AI workloads?

The GB200 offers significantly higher memory capacity (384GB vs 80GB), newer Blackwell architecture optimizations, and second-generation Transformer Engine with FP4 precision support. It also provides enhanced NVLink connectivity for larger GPU domains, making it better suited for very large models and scale-out training compared to the previous-generation H100.