Data Center

Gaudi 2 GPU

Intel Gaudi 2 is an AI training and inference accelerator designed for efficient deep learning workloads.

VRAM 96GB
TDP 600W
From
$1.21/hr
across 1 provider

Cloud Pricing

Provider   GPUs     Price / hr   Updated    Source
-          8× GPU   $1.21        4/8/2026   Direct from provider

Prices updated daily. Last check: 4/8/2026

Performance

FP16
432 TFLOPS
Bandwidth
2,450 GB/s
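
The two figures above together determine when a workload is compute-bound versus memory-bound on this part. A quick roofline-style calculation, using only the numbers quoted in this section:

```python
# Roofline "ridge point" from the Gaudi 2 figures quoted above: the
# arithmetic intensity (FLOPs per byte moved) at which a kernel shifts
# from memory-bound to compute-bound on this accelerator.

peak_fp16_flops = 432e12   # 432 TFLOPS FP16, from the spec above
mem_bandwidth = 2450e9     # 2,450 GB/s memory bandwidth, from the spec above

ridge_point = peak_fp16_flops / mem_bandwidth
print(f"ridge point ~ {ridge_point:.0f} FLOP/byte")
# Kernels with lower arithmetic intensity than this are limited by
# memory bandwidth rather than compute throughput.
```

The result (roughly 176 FLOP/byte) is why the page emphasizes memory bandwidth: most inference kernels sit well below that intensity and are bandwidth-limited.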

Strengths & Limitations

Strengths

  • 96GB of high-bandwidth memory provides substantial capacity for large model training and inference
  • 2,450 GB/s memory bandwidth supports memory-intensive AI workloads
  • 432 TFLOPS FP16 performance delivers competitive compute throughput
  • Open standard Ethernet networking eliminates proprietary interconnect requirements
  • PCIe Gen5 connectivity enables integration with standard server architectures
  • Multiple form factors (PCIe and mezzanine) offer deployment flexibility
  • High-efficiency architecture designed for cost-effective AI scaling

Limitations

  • 600W TDP requires substantial power and cooling infrastructure
  • Ethernet-based interconnect may have higher latency than proprietary solutions such as NVLink
  • Smaller software ecosystem than more established GPU platforms
  • May be overkill for smaller AI models or inference-only workloads
  • Released in 2022, making it a previous-generation offering in the rapidly evolving AI accelerator market

Key Features

Intel Gaudi architecture optimized for AI training and inference
Open standard Ethernet networking for multi-node scaling
PCIe Gen5 connectivity
High-efficiency compute architecture
96GB high-bandwidth memory subsystem
Multiple precision support including FP16
Seamless server integration capabilities
Standard Ethernet interconnect support

About Gaudi 2

The Intel Gaudi 2 is an AI accelerator built on Intel's Gaudi architecture, positioned as an alternative to NVIDIA's data center GPUs in the AI training and inference market. Released in 2022, the Gaudi 2 represents Intel's entry into high-performance AI computing, designed to integrate with standard Ethernet networking infrastructure rather than proprietary interconnects. As a previous-generation offering in Intel's AI accelerator lineup, it competes in the high-performance AI training segment.

The Gaudi 2 features 96GB of high-bandwidth memory with 2,450 GB/s of memory bandwidth and delivers 432 TFLOPS of FP16 performance. Key technical differentiators include its use of open standard Ethernet networking for multi-node scaling and PCIe Gen5 connectivity. The accelerator operates with a 600W TDP and is available in both PCIe and mezzanine form factors for flexible server integration.

In cloud deployments, the Gaudi 2 targets large language model training, multi-modal AI workloads, and enterprise retrieval-augmented generation (RAG) applications. Its Ethernet-based scaling approach allows cloud providers to leverage existing network infrastructure while delivering high AI compute performance for training and inference workloads that require substantial memory capacity and bandwidth.

Common Use Cases

The Gaudi 2 is well-suited for large language model training and inference, multi-modal AI applications, and enterprise RAG deployments that require substantial memory capacity and bandwidth. With 96GB of memory and 432 TFLOPS of FP16 performance, it handles memory-intensive workloads effectively. The Ethernet-based scaling makes it particularly attractive for organizations with existing network infrastructure that want to avoid proprietary interconnect investments. Its high memory bandwidth of 2,450 GB/s supports applications where model parameters and activations exceed typical GPU memory limits, making it suitable for training large transformer models and running inference on models with significant memory footprints.
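
A quick way to gauge whether a given model fits in the 96GB on a single device is to estimate its FP16 weight footprint. The parameter counts below are illustrative examples, not figures from this page:

```python
# Back-of-envelope check of whether a model's FP16 weights fit in the
# Gaudi 2's 96GB of memory. Parameter counts are illustrative
# assumptions; KV cache, activations, and (for training) optimizer
# state add substantially on top of the weight footprint.

VRAM_GB = 96
BYTES_PER_PARAM_FP16 = 2  # 2 bytes per FP16 weight

def weights_gb(params_billion: float) -> float:
    """Approximate FP16 weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM_FP16 / 1e9

for params in (7, 13, 34, 70):
    gb = weights_gb(params)
    verdict = "fits" if gb < VRAM_GB else "needs sharding"
    print(f"{params}B params -> ~{gb:.0f} GB FP16 weights ({verdict})")
```

Under these assumptions, weights up to roughly 45B parameters fit on one device for inference, while larger models (or training runs, which need several times the weight footprint) require multi-device sharding over the Ethernet fabric.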

Full Specifications

Hardware

Manufacturer
Intel
Architecture
Gaudi
TDP
600W

Memory & Performance

VRAM
96GB
Memory Bandwidth
2,450 GB/s
FP16
432 TFLOPS
Release
2022

Frequently Asked Questions

How much does a Gaudi 2 cost per hour in the cloud?

Gaudi 2 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
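
Turning an hourly rate into a job budget is simple multiplication. Note that whether a listed rate like the $1.21/hr above is billed per accelerator or per 8× node varies by provider, so treat this as a sketch:

```python
# Rough cloud-cost estimate from an hourly rate. Whether the listed
# rate applies per accelerator or per multi-GPU node depends on the
# provider's billing model -- check before budgeting.

def run_cost(rate_per_hour: float, hours: float, units: int = 1) -> float:
    """Total cost for `units` instances billed at `rate_per_hour` for `hours`."""
    return rate_per_hour * hours * units

# Example: a 48-hour job on 8 accelerators, each billed at $1.21/hr:
print(f"${run_cost(1.21, 48, 8):,.2f}")
```

Longer commitments and reserved capacity typically bring the effective hourly rate down from on-demand pricing.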

What is the Gaudi 2 best used for?

The Gaudi 2 excels at large language model training and inference, multi-modal AI workloads, and enterprise RAG applications. Its 96GB memory capacity and high bandwidth make it particularly suitable for memory-intensive AI workloads that exceed typical GPU memory limits.

How does Gaudi 2's Ethernet networking compare to NVIDIA's NVLink?

Gaudi 2 uses open standard Ethernet for multi-node communication, allowing integration with existing network infrastructure without proprietary hardware requirements. While this may result in higher latency compared to dedicated interconnects like NVLink, it provides cost-effective scalability and leverages standard networking equipment for AI cluster deployment.