B200 GPU
The B200 pushes the boundaries of AI model scale and performance, enabling computations that were previously impractical, with potentially better total cost of ownership and energy efficiency than scaling out older-generation GPUs for the same task.

Cloud Pricing
Cheapest on Packet AI — 68% below average. Prices updated daily. Last check: 4/1/2026
Performance
Strengths & Limitations
Strengths
- 192 GB VRAM capacity supports large models and datasets
- 8,000 GB/s memory bandwidth enables efficient data processing
- 4,500 TFLOPS FP16 performance for AI training workloads
- 9,000 TOPS INT8 performance optimized for inference tasks
- NVIDIA NVLink 5.0 provides high-speed multi-GPU scaling
- FP8 and FP4 Tensor Core support for mixed-precision computing
- Current-generation Blackwell architecture features
Limitations
- 1,000 W TDP requires substantial power and cooling infrastructure
- Superseded by the newer GB300 series for absolute peak performance
- High power consumption may limit deployment density
- Enterprise-focused design may be overkill for smaller workloads
- Requires the NVIDIA software ecosystem for optimal utilization
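The VRAM point above can be made concrete with a back-of-envelope training-memory estimate. This is only a sketch: it assumes BF16 weights and gradients with an Adam-style optimizer (FP32 master copy plus two moment tensors, roughly 12 bytes per parameter) and ignores activations and framework overhead, so real usage will be higher.

```python
# Rough estimate of GPU memory needed to train a model, to judge whether
# it fits in the B200's 192 GB. Illustrative only: real usage also depends
# on activations, framework overhead, and the parallelism strategy.

def training_memory_gb(params_billion: float,
                       weight_bytes: int = 2,      # BF16 weights
                       optimizer_bytes: int = 12   # Adam: FP32 master + 2 moments
                       ) -> float:
    """Weights + gradients + optimizer state, in GB."""
    per_param = weight_bytes + weight_bytes + optimizer_bytes  # w + grad + opt
    return params_billion * per_param  # 1e9 params * bytes / 1e9 bytes-per-GB

B200_VRAM_GB = 192

for size in (7, 13, 70):
    need = training_memory_gb(size)
    verdict = "fits" if need <= B200_VRAM_GB else "needs sharding"
    print(f"{size}B model: ~{need:.0f} GB -> {verdict} on one B200")
```

By this estimate a 7B model (~112 GB) trains on a single B200, while 13B and larger already spill past 192 GB and need multi-GPU sharding or memory-saving techniques.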
Common Use Cases
The B200 is designed for large-scale AI training and inference workloads that require substantial memory capacity and compute throughput. Its 192 GB VRAM makes it suitable for training large language models, processing extensive recommendation system datasets, and running memory-intensive scientific computing applications. The high INT8 performance and FP8 Tensor Core support optimize it for AI inference scenarios, while the substantial FP16 capability handles training workloads effectively. Organizations deploying chatbots, large language models, or complex AI pipelines benefit from the B200's combination of memory capacity and computational performance, particularly when workloads exceed the capabilities of lower-tier accelerators.
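For inference in particular, the memory bandwidth figure sets a hard ceiling on single-stream decode speed, since each generated token must stream every weight from VRAM at least once. A rough sketch using the 8,000 GB/s figure quoted above (model sizes and precisions are illustrative, and real throughput will be lower):

```python
# Memory-bandwidth ceiling on single-stream LLM decode: each token read
# touches every weight once, so tokens/s <= bandwidth / model size in bytes.
# Back-of-envelope only; ignores KV-cache traffic and kernel efficiency.

B200_BW_GBPS = 8000  # 8,000 GB/s memory bandwidth (from the specs above)

def decode_tokens_per_s_upper_bound(params_billion: float,
                                    bytes_per_param: float) -> float:
    model_gb = params_billion * bytes_per_param
    return B200_BW_GBPS / model_gb

# 70B parameters at FP8 (1 byte/param): 8000 / 70 ≈ 114 tokens/s ceiling
print(round(decode_tokens_per_s_upper_bound(70, 1)))
```

This is why FP8 (and FP4) support matters for inference: halving the bytes per parameter roughly doubles the bandwidth-bound token rate, independent of compute throughput.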
Full Specifications
Hardware
- Manufacturer: NVIDIA
- Architecture: Blackwell
- TDP: 1000 W
Memory & Performance
- VRAM: 192 GB
- Memory Bandwidth: 8,000 GB/s
- FP32: 80 TFLOPS
- FP16: 4,500 TFLOPS
- BF16: 2,250 TFLOPS
- FP8: 4,500 TFLOPS
- FP64: 40 TFLOPS
- INT8: 9,000 TOPS
- Release: 2024
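One way to read the bandwidth and FLOPS figures together is a simple roofline ridge point: the arithmetic intensity (FLOPs per byte moved) above which a kernel is compute-bound rather than bandwidth-bound. A back-of-envelope sketch using the FP16 and bandwidth numbers from the table:

```python
# Roofline ridge point from the specs above: the arithmetic intensity at
# which a kernel shifts from bandwidth-bound to compute-bound. Illustrative
# back-of-envelope math using the quoted peak figures.

PEAK_FP16_TFLOPS = 4500   # peak FP16 throughput
MEM_BW_GBPS = 8000        # memory bandwidth, GB/s

ridge_flops_per_byte = (PEAK_FP16_TFLOPS * 1e12) / (MEM_BW_GBPS * 1e9)
print(f"FP16 ridge point: ~{ridge_flops_per_byte:.0f} FLOPs/byte")
```

Kernels below roughly 560 FLOPs/byte (e.g. batch-1 LLM decode) are limited by the 8,000 GB/s memory system rather than by Tensor Core throughput, which is why the large bandwidth figure matters as much as the FLOPS numbers.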
Frequently Asked Questions
How much does a B200 cost per hour in the cloud?
B200 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
What is the B200 best used for?
The B200 excels at large-scale AI training and inference workloads requiring substantial memory capacity. Its 192 GB VRAM and high-bandwidth memory subsystem make it well-suited for large language models, recommendation systems, and memory-intensive scientific computing applications.
How does the B200 compare to the newer GB300 series?
The B200 offers substantial compute performance with 4,500 TFLOPS FP16 and 192 GB of VRAM, while the GB300 series represents NVIDIA's newer Blackwell Ultra architecture with improved performance and efficiency. The B200 remains capable for current workloads, but the GB300 series provides higher absolute performance for the most demanding applications.