
B200 GPU

The B200 pushes the boundaries of AI model scale and performance, enabling computations that were previously impractical, with potentially better total cost of ownership and energy efficiency than scaling out older-generation GPUs for the same task.

VRAM: 192 GB
TDP: 1000 W
From $2.25/hr across 14 providers

Cloud Pricing

Cheapest on Packet AI (68% below average)
GPUs     Price / hr   Commitment   Updated
1× GPU   $2.25        -            3/30/2026
2× GPU   $2.45        -            4/1/2026
1× GPU   $2.79        -            4/1/2026
2× GPU   $2.99        -            4/1/2026
1× GPU   $2.99        -            3/12/2026
8× GPU   $2.99        -            3/9/2026
1× GPU   $3.50        -            3/10/2026
1× GPU   $3.75        -            3/22/2026
8× GPU   $4.11        -            3/31/2026
1× GPU   $4.49        3 mo         3/30/2026
1× GPU   $4.49        6 mo         3/30/2026
8× GPU   $4.50        -            3/31/2026
4× GPU   $4.84        -            4/1/2026
1× GPU   $4.89        -            4/1/2026
4× GPU   $4.89        -            3/31/2026
8× GPU   $4.89        -            3/31/2026
2× GPU   $4.98        -            4/1/2026
2× GPU   $5.29        -            4/1/2026
1× GPU   $5.49        -            3/11/2026
1× GPU   $5.50        -            3/11/2026
8× GPU   $5.74        -            4/1/2026
1× GPU   $5.82        -            4/1/2026
4× GPU   $5.85        -            4/1/2026
1× GPU   $5.98        -            4/1/2026
1× GPU   $6.08        -            4/1/2026
1× GPU   $7.15        1 mo         4/1/2026
1× GPU   $7.49        -            4/1/2026
8× GPU   $8.60        -            4/1/2026
4× GPU   $9.95        -            3/31/2026
8× GPU   $9.95        -            3/31/2026
8× GPU   $22.32       36 mo        3/31/2026
8× GPU   $23.92       24 mo        3/31/2026
8× GPU   $27.92       12 mo        3/31/2026

Prices updated daily. Last check: 4/1/2026
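
When comparing rows, note that multi-GPU listings may be priced per GPU or per node depending on the provider. Below is a minimal sketch of normalizing listings to an effective per-GPU rate and comparing against an average; the sample prices are copied from the table above, but the per-node flags are assumptions, not provider-confirmed.

```python
# Sketch: normalize listings to an effective per-GPU hourly rate and
# compare against a sample average. Whether a multi-GPU listing is
# priced per GPU or per node varies by provider; the booleans below
# are illustrative assumptions.

listings = [
    (1, 2.25, False),   # 1x GPU, on-demand
    (8, 4.11, False),   # 8x GPU, assumed priced per GPU
    (8, 22.32, True),   # 8x GPU, 36 mo commitment, assumed priced per node
]

def per_gpu_rate(gpus: int, price: float, per_node: bool) -> float:
    """Effective $/GPU-hr; divide by GPU count when the price covers the node."""
    return price / gpus if per_node else price

rates = [per_gpu_rate(*row) for row in listings]
avg = sum(rates) / len(rates)
cheapest = min(rates)
print(f"cheapest ${cheapest:.2f}/GPU-hr, "
      f"{100 * (1 - cheapest / avg):.0f}% below this sample's average")
```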

Performance

FP16: 4500 TFLOPS
FP32: 80 TFLOPS
BF16: 2250 TFLOPS
FP8: 4500 TFLOPS
INT8: 9000 TOPS
Memory bandwidth: 8000 GB/s
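
One way to read the compute and bandwidth figures together is a back-of-envelope roofline ridge point: peak throughput divided by memory bandwidth gives the arithmetic intensity a kernel needs to be compute-bound rather than bandwidth-bound. A minimal sketch using the numbers listed above:

```python
# Roofline ridge point from the spec figures above. A kernel needs at
# least this many FLOPs per byte moved from HBM to be compute-bound.

peak_fp16_tflops = 4500      # TFLOPS, FP16 (as listed above)
hbm_bandwidth_gbs = 8000     # GB/s

flops_per_byte = (peak_fp16_tflops * 1e12) / (hbm_bandwidth_gbs * 1e9)
print(f"ridge point ~= {flops_per_byte:.0f} FLOPs/byte")  # ~562
```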

Strengths & Limitations

Strengths

  • 192 GB VRAM capacity supports large models and datasets (see the sizing sketch below)
  • 8,000 GB/s memory bandwidth enables efficient data processing
  • 4,500 TFLOPS FP16 performance for AI training workloads
  • 9,000 TOPS INT8 performance optimized for inference tasks
  • NVIDIA NVLink 5.0 provides high-speed multi-GPU scaling
  • FP8 and FP4 Tensor Core support for mixed-precision computing
  • Blackwell architecture includes current-generation features

Limitations

  • 1,000 W TDP requires substantial power and cooling infrastructure
  • Superseded by the newer GB300 series for absolute peak performance
  • High power consumption may limit deployment density
  • Enterprise-focused design may be overkill for smaller workloads
  • Requires the NVIDIA software ecosystem for optimal utilization
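
As a rough illustration of what the 192 GB capacity buys, the sketch below estimates weight memory by precision. It counts parameter bytes only and ignores activations, optimizer state, KV cache, and framework overhead; the 180B-parameter figure is illustrative, not a specific model.

```python
# Rough VRAM sizing by precision, weights only. Real deployments also
# need room for activations, KV cache, and framework overhead.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1, "fp4": 0.5}

def weight_gb(params_billion: float, dtype: str) -> float:
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 1e9  # GB

# An illustrative 180B-parameter model fits in 192 GB at FP8,
# but not at FP16.
print(f"{weight_gb(180, 'fp8'):.0f} GB (fp8)")    # 180 GB
print(f"{weight_gb(180, 'fp16'):.0f} GB (fp16)")  # 360 GB
```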

Key Features

NVIDIA NVLink 5.0
Tensor Cores with FP8 support
FP4 Tensor Core operations
HBM3e memory subsystem
NVIDIA Mission Control compatibility
NVIDIA AI Enterprise support
Blackwell architecture compute units
Multi-precision format support
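
For the FP8 Tensor Core path, NVIDIA's Transformer Engine is one common on-ramp from PyTorch. The sketch below runs a single linear layer under FP8 autocast; it assumes a Transformer Engine installation, and the recipe settings shown are illustrative defaults rather than tuned values (the exact API names are version-dependent).

```python
# Minimal sketch of an FP8 GEMM via NVIDIA Transformer Engine.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# HYBRID: E4M3 for forward activations/weights, E5M2 for gradients.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

layer = te.Linear(4096, 4096, bias=True, params_dtype=torch.bfloat16).cuda()
x = torch.randn(16, 4096, device="cuda", dtype=torch.bfloat16)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)  # GEMM executes in FP8 with per-tensor scaling factors
```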

About B200

The NVIDIA B200 is a data center GPU built on the Blackwell architecture, NVIDIA's current-generation compute platform for AI and HPC workloads. Within NVIDIA's enterprise lineup it sits below the newer GB300 series, but it remains a capable accelerator for demanding computational tasks.

The B200 pairs 192 GB of HBM3e with 8,000 GB/s of memory bandwidth, and delivers 4,500 TFLOPS of FP16 and 9,000 TOPS of INT8 performance. NVLink 5.0 provides high-speed GPU-to-GPU communication for multi-GPU deployments, and the Tensor Cores support both FP8 and FP4 precision formats. With a TDP of 1,000 watts, the B200 requires the robust power and cooling infrastructure typical of enterprise data centers.

In cloud deployments, the B200 serves workloads that need substantial memory capacity and compute throughput, particularly large-scale AI training and inference. The combination of high VRAM and memory bandwidth suits applications whose datasets or models exceed the memory constraints of lower-tier accelerators.
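
To make the multi-GPU scaling concrete, the sketch below runs the collective pattern NVLink accelerates: an NCCL all-reduce across the GPUs in one node. It assumes a PyTorch/NCCL stack launched via torchrun; the script name in the comment is illustrative.

```python
# NCCL all-reduce across local GPUs. Launch with e.g.
#   torchrun --nproc_per_node=8 allreduce_demo.py
import os
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Each GPU contributes its rank; NCCL routes the reduction over the
# NVLink/NVSwitch fabric when the GPUs share one.
t = torch.full((1024,), float(dist.get_rank()), device="cuda")
dist.all_reduce(t, op=dist.ReduceOp.SUM)

if dist.get_rank() == 0:
    print(t[0].item())  # sum of ranks 0 + 1 + ... + (world_size - 1)

dist.destroy_process_group()
```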

Common Use Cases

The B200 is designed for large-scale AI training and inference workloads that demand substantial memory capacity and compute throughput. Its 192 GB of VRAM suits training large language models, processing extensive recommendation-system datasets, and running memory-intensive scientific computing applications. High INT8 throughput and FP8 Tensor Core support make it effective for inference, while its FP16 capability handles training workloads. Organizations deploying chatbots, large language models, or complex AI pipelines benefit from this combination of memory and compute, particularly when workloads exceed the capabilities of lower-tier accelerators.
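
One concrete place the 192 GB capacity matters in LLM serving is the KV cache. Below is a rough sizing sketch; the model shape is an illustrative Llama-70B-like configuration (layer count, KV head count, head dimension, and context length are assumptions, not measurements).

```python
# Rough KV-cache sizing: 2 (K and V) * layers * kv_heads * head_dim
# * bytes, per token per sequence.

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                seq_len: int, batch: int, bytes_per: int = 2) -> float:
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per / 1e9

# 80 layers, 8 KV heads (GQA), head_dim 128, 32k context, batch 16, FP16:
# the cache alone approaches the card's 192 GB capacity.
print(f"{kv_cache_gb(80, 8, 128, 32_768, 16):.0f} GB")  # ~172 GB
```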

Full Specifications

Hardware

Manufacturer: NVIDIA
Architecture: Blackwell
Release: 2024
TDP: 1000 W

Memory & Performance

VRAM: 192 GB
Memory bandwidth: 8000 GB/s
FP64: 40 TFLOPS
FP32: 80 TFLOPS
FP16: 4500 TFLOPS
BF16: 2250 TFLOPS
FP8: 4500 TFLOPS
INT8: 9000 TOPS

Frequently Asked Questions

How much does a B200 cost per hour in the cloud?

B200 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
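
As a rough rule of thumb, multiplying an hourly rate by about 730 hours gives a monthly figure for an always-on instance. A minimal sketch using sample prices from the table above:

```python
# Hourly-to-monthly conversion for an always-on instance (~730 hr/month).
HOURS_PER_MONTH = 730
for rate in (2.25, 4.49, 7.49):  # sample $/GPU-hr figures from the table
    print(f"${rate:.2f}/hr -> ${rate * HOURS_PER_MONTH:,.0f}/month per GPU")
```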

What is the B200 best used for?

The B200 excels at large-scale AI training and inference workloads requiring substantial memory capacity. Its 192 GB VRAM and high-bandwidth memory subsystem make it well-suited for large language models, recommendation systems, and memory-intensive scientific computing applications.

How does the B200 compare to the newer GB300 series?

The B200 offers substantial compute performance, with 4,500 TFLOPS FP16 and 192 GB of VRAM, while the GB300 series represents NVIDIA's newer Blackwell Ultra platform with improved performance and efficiency characteristics. The B200 remains capable for current workloads; the GB300 series provides higher absolute performance for the most demanding applications.