
H100 NVL GPU

The H100 NVL is optimized for large language model inference, featuring 94GB of HBM3 memory per GPU and high NVLink bandwidth; it originally shipped as an NVLink-bridged dual-card pair.

VRAM 94GB
CUDA Cores 14,592
Tensor Cores 456
TDP 400W
From $0.40/hr across 4 providers

Cloud Pricing

Cheapest on Latitude.sh (70% below average)
GPUs      Price / hr   Updated
8× GPU    $0.40        4/6/2026
8× GPU    $0.43        4/6/2026
4× GPU    $0.79        4/6/2026
4× GPU    $0.87        4/6/2026
1× GPU    $1.40        4/8/2026
2× GPU    $1.58        4/8/2026
2× GPU    $1.74        4/8/2026
1× GPU    $1.95        4/6/2026
1× GPU    $2.59        4/8/2026

Prices updated daily. Last check: 4/8/2026
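
The hourly rates above can be turned into a rough monthly budget. A minimal sketch, assuming the listed price is per GPU per hour (a common convention for aggregator tables; confirm with each provider) and a ~730-hour month:

```python
# Hypothetical monthly-cost estimate from the pricing table above.
# Assumes the listed rate is per GPU per hour -- an assumption, not
# something the table itself states.

def monthly_cost(price_per_gpu_hr: float, gpus: int, hours: float = 730.0) -> float:
    """On-demand cost of running an instance continuously for ~one month."""
    return price_per_gpu_hr * gpus * hours

# Cheapest listed rate: $0.40/hr on an 8x GPU instance.
print(round(monthly_cost(0.40, 8), 2))  # prints 2336.0
```

Under these assumptions, the cheapest 8× instance runs about $2,336/month on demand; committed-use or spot pricing would change the picture.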

Performance

FP16
835 TFLOPS
FP32
67 TFLOPS
Bandwidth
3,958 GB/s
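
These two numbers together determine when a kernel is compute-bound versus memory-bound. A back-of-envelope roofline calculation, using only the figures quoted above:

```python
# Roofline "ridge point": how many FLOPs a kernel must perform per byte
# moved before FP16 compute, rather than memory bandwidth, becomes the
# bottleneck. Derived purely from the spec numbers above.

fp16_flops = 835e12    # 835 TFLOPS FP16
mem_bw_bytes = 3958e9  # 3,958 GB/s memory bandwidth

ridge_point = fp16_flops / mem_bw_bytes  # FLOPs per byte
print(round(ridge_point))  # prints 211
```

Kernels with arithmetic intensity below roughly 211 FLOPs per byte (which includes most single-batch LLM decoding) are limited by the 3,958 GB/s bandwidth rather than by peak FP16 throughput.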

Strengths & Limitations

Strengths

  • 94 GB HBM3 memory capacity supports large model training and inference
  • Transformer Engine with FP8 support optimizes transformer-based AI workloads
  • 400W TDP provides better power efficiency than higher-wattage H100 variants
  • Multi-Instance GPU (MIG) enables workload partitioning and multi-tenancy
  • 3,958 GB/s memory bandwidth facilitates high-throughput data processing
  • Fourth-generation Tensor Cores deliver 835 TFLOPS of FP16 performance
  • Built-in Confidential Computing capabilities enable secure processing scenarios

Limitations

  • 400W power consumption requires substantial cooling and power infrastructure
  • Previous-generation architecture compared to the current GB300 Blackwell Ultra lineup
  • High-end specifications may be excessive for smaller AI models or basic compute tasks
  • Limited to Hopper architecture capabilities versus newer architectural improvements
  • Premium positioning makes it cost-inefficient for workloads not requiring the full capability set

Key Features

Transformer Engine with FP8 precision
Fourth-generation Tensor Cores
Multi-Instance GPU (MIG)
HBM3 memory technology
NVIDIA Hopper architecture
Built-in Confidential Computing
NVLink interconnect support
PCIe Gen5 interface

About H100 NVL

The H100 NVL is NVIDIA's Hopper-architecture GPU for data center deployment, positioned as a more power-efficient variant of the H100 line than the SXM model. It now sits in NVIDIA's previous-generation lineup, having been succeeded by the GB300 Blackwell Ultra series.

The NVL variant carries 94 GB of HBM3 memory and pairs fourth-generation Tensor Cores with the Transformer Engine for AI workloads. It delivers 835 TFLOPS of FP16 performance through its 14,592 CUDA cores and 456 Tensor Cores, backed by 3,958 GB/s of memory bandwidth. A 400W TDP makes it more power-efficient than higher-wattage variants while retaining substantial compute capability, and it supports Multi-Instance GPU (MIG) for workload isolation plus built-in Confidential Computing for secure processing.

In cloud deployments, the H100 NVL is used for large language model training and inference, where its large VRAM and Transformer Engine optimization pay off. Its lower power draw than SXM variants suits deployments with thermal and power constraints, while still delivering performance for AI training, high-performance computing, and accelerated data analytics.

Common Use Cases

The H100 NVL is suited to large language model training and inference, where its 94 GB memory capacity and Transformer Engine optimization benefit transformer-based architectures. Its compute capability fits high-performance computing applications that need significant parallel processing power, and the Multi-Instance GPU feature lets cloud providers partition resources across multiple tenants. Built-in Confidential Computing makes it suitable for secure AI processing, and its balanced power profile works well in data centers that face thermal constraints but still require substantial AI compute.
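
For the LLM use case, a quick sanity check is whether a model's weights alone fit in the 94 GB of VRAM. A minimal sketch, where the parameter counts and byte widths are illustrative assumptions (real deployments also need headroom for KV cache and activations):

```python
# Rough sizing check: do a model's weights fit in 94 GB of VRAM?
# Model sizes below are hypothetical examples, not benchmarks, and
# the estimate ignores KV cache and activation memory.

VRAM_GB = 94

def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory needed for weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return params_billion * bytes_per_param  # 1e9 params * bytes / 1e9

for params_b, bpp in [(70, 2.0), (70, 1.0), (180, 1.0)]:
    need = weights_gb(params_b, bpp)
    verdict = "fits" if need <= VRAM_GB else "needs multiple GPUs"
    print(f"{params_b}B @ {bpp} bytes/param: {need:.0f} GB -> {verdict}")
```

By this estimate a 70B-parameter model fits in FP8/INT8 (70 GB) but not in FP16 (140 GB), which is one reason the FP8 Transformer Engine matters on this card.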

Full Specifications

Hardware

Manufacturer
NVIDIA
Architecture
Hopper
CUDA Cores
14,592
Tensor Cores
456
TDP
400W

Memory & Performance

VRAM
94GB
Memory Bandwidth
3,958 GB/s
FP32
67 TFLOPS
FP16
835 TFLOPS
FP64
34 TFLOPS
Release
2023

Frequently Asked Questions

How much does an H100 NVL cost per hour in the cloud?

H100 NVL pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.

What is the H100 NVL best used for?

The H100 NVL excels at large language model training and inference, leveraging its 94 GB memory capacity and Transformer Engine optimization. It's also well-suited for high-performance computing workloads, accelerated data analytics, and scenarios requiring Confidential Computing capabilities.

How does the H100 NVL compare to the H100 SXM?

The H100 NVL operates at a 400W TDP versus the SXM variant's 700W, making it more power-efficient while sharing the same Hopper architecture; it also offers more memory per GPU (94 GB of HBM3 versus 80 GB on the SXM). The NVL's NVLink bridge delivers 600 GB/s of bandwidth versus 900 GB/s on SXM variants, a trade-off between power efficiency and maximum interconnect performance.
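
To put the NVLink gap in concrete terms, an idealized comparison (assuming no protocol overhead, which real transfers never achieve) of the time to move a full 94 GB of HBM3 contents over each link:

```python
# Idealized illustration of the 600 vs 900 GB/s NVLink difference:
# time to transfer the card's full 94 GB of memory over each link.
# Assumes perfect bandwidth utilization, so real times would be longer.

size_gb = 94
for bw_gbs in (600, 900):
    ms = size_gb / bw_gbs * 1000
    print(f"{bw_gbs} GB/s: {ms:.0f} ms")  # prints "600 GB/s: 157 ms" then "900 GB/s: 104 ms"
```

The absolute gap is small per transfer, but for multi-GPU training with frequent gradient synchronization the 50% higher SXM bandwidth compounds, which is the trade-off the answer above describes.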