H100 NVL GPU
The H100 NVL is optimized for large language model inference, featuring 94 GB of HBM3 memory per GPU and high NVLink bandwidth; in its original dual-GPU, NVLink-bridged configuration, a pair provides 188 GB combined.

Cloud Pricing
Cheapest on Latitude.sh: 70% below average. Prices updated daily. Last check: 4/8/2026
Strengths & Limitations
Strengths
- 94 GB HBM3 memory capacity supports large model training and inference
- Transformer Engine with FP8 support optimizes transformer-based AI workloads (see the sketch after this list)
- 400W TDP provides better power efficiency than higher-wattage H100 variants
- Multi-Instance GPU (MIG) enables workload partitioning and multi-tenancy
- 3,958 GB/s memory bandwidth facilitates high-throughput data processing
- Fourth-generation Tensor Cores deliver 835 TFLOPS of dense FP16 performance
- Built-in Confidential Computing capabilities enable secure processing scenarios

Limitations
- 400W power consumption requires substantial cooling and power infrastructure
- Previous-generation architecture compared to the current GB300 Blackwell Ultra lineup
- High-end specifications may be excessive for smaller AI models or basic compute tasks
- Limited to Hopper architecture capabilities versus newer architectural improvements
- Premium positioning makes it cost-inefficient for workloads that don't require the full capability set
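As a minimal sketch of how the FP8 path is typically used, the snippet below runs a single linear layer under FP8 autocast with NVIDIA's Transformer Engine library. It assumes a Hopper-class GPU and an installed transformer_engine package; the layer and batch sizes are illustrative, not tied to any particular model.

```python
# Minimal FP8 sketch with NVIDIA Transformer Engine; requires a Hopper-class
# GPU and the transformer_engine package. Sizes are illustrative.
import torch
import transformer_engine.pytorch as te

# te.Linear runs its matrix multiplies in FP8 when wrapped in fp8_autocast.
layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(8, 4096, device="cuda", dtype=torch.bfloat16)

with te.fp8_autocast(enabled=True):
    y = layer(x)

print(y.shape)  # torch.Size([8, 4096])
```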
Common Use Cases
The H100 NVL is suited to large language model training and inference, where its 94 GB memory capacity and Transformer Engine optimization benefit transformer-based architectures. Its substantial compute capability makes it appropriate for high-performance computing applications that need significant parallel processing power, while the Multi-Instance GPU feature lets cloud providers partition a card across multiple tenants. Built-in Confidential Computing capabilities make it suitable for secure AI processing, and its 400W power profile fits data centers that face thermal constraints but still need substantial AI compute.
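As a rough illustration of how far the 94 GB capacity goes for LLM inference, the back-of-the-envelope sizing below adds model weights to an assumed KV-cache budget. The 70B-parameter FP8 example is a hypothetical workload, not a benchmark.

```python
# Back-of-the-envelope: does a given model fit in the H100 NVL's 94 GB?
# The model figures below are illustrative assumptions, not benchmarks.
GIB = 1024**3

def inference_footprint_gib(params_billions, bytes_per_param, kv_cache_gib):
    """Approximate VRAM need: weights plus KV cache, ignoring activations/overhead."""
    weights_gib = params_billions * 1e9 * bytes_per_param / GIB
    return weights_gib + kv_cache_gib

# Hypothetical 70B-parameter model quantized to FP8 (1 byte/param), ~10 GiB KV cache.
needed = inference_footprint_gib(params_billions=70, bytes_per_param=1, kv_cache_gib=10)
print(f"~{needed:.0f} GiB needed vs 94 GB on card")  # ~75 GiB: fits on a single GPU
```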
Full Specifications
Hardware
- Manufacturer: NVIDIA
- Architecture: Hopper
- CUDA Cores: 14,592
- Tensor Cores: 456
- TDP: 400W
Memory & Performance
- VRAM: 94 GB
- Memory Bandwidth: 3,958 GB/s
- FP32: 67 TFLOPS
- FP16 (Tensor Core, dense): 835 TFLOPS
- FP64: 34 TFLOPS
- Release: 2023
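For sanity-checking these figures on a live instance, here is a short sketch using NVIDIA's NVML Python bindings; it assumes the nvidia-ml-py package (imported as pynvml) is installed and an NVIDIA driver is present.

```python
# Query the installed GPU via NVIDIA's NVML bindings (pip install nvidia-ml-py).
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

name = pynvml.nvmlDeviceGetName(handle)                       # e.g. "NVIDIA H100 NVL"
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)                  # totals reported in bytes
power_mw = pynvml.nvmlDeviceGetPowerManagementLimit(handle)   # milliwatts

print(name)
print(f"{mem.total / 1024**3:.0f} GiB VRAM")                  # 94 GB shows as roughly 93 GiB
print(f"{power_mw / 1000:.0f} W power limit")                 # 400 W default for the NVL
pynvml.nvmlShutdown()
```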
Frequently Asked Questions
How much does an H100 NVL cost per hour in the cloud?
H100 NVL pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
What is the H100 NVL best used for?
The H100 NVL excels at large language model training and inference, leveraging its 94 GB memory capacity and Transformer Engine optimization. It's also well-suited for high-performance computing workloads, accelerated data analytics, and scenarios requiring Confidential Computing capabilities.
How does the H100 NVL compare to the H100 SXM?
The H100 NVL operates at a 400W TDP versus up to 700W for the SXM variant, making it more power-efficient while sharing the same Hopper architecture; it also carries more memory, 94 GB of HBM3 versus 80 GB on the H100 SXM. The NVL's NVLink bridge provides 600 GB/s of bandwidth versus 900 GB/s on SXM variants, a trade-off between power efficiency and maximum interconnect performance.
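To put the memory-bandwidth figure in context, below is a rough bandwidth-bound ceiling for single-GPU autoregressive decoding, where each generated token must stream every weight from HBM once. The model size is an illustrative assumption.

```python
# Rough ceiling on single-GPU decode throughput: each generated token streams
# every weight from HBM once, so tokens/s <= bandwidth / model_size_in_bytes.
bandwidth_gb_s = 3958   # H100 NVL memory bandwidth (GB/s)
model_gb = 70           # hypothetical 70B-parameter model in FP8 (~70 GB)

ceiling = bandwidth_gb_s / model_gb
print(f"~{ceiling:.0f} tokens/s bandwidth-bound ceiling")  # ~57 tokens/s
```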