Data Center

Gaudi 2 GPU

Intel Gaudi 2 is an AI training and inference accelerator designed for efficient deep learning workloads.

VRAM 96GB
TDP 600W
From
$1.21/hr
across 1 provider

Cloud Pricing

Provider   GPUs     Price / hr   Updated    Source
-          8× GPU   $1.21        4/8/2026   Direct from provider

Prices updated daily. Last check: 4/8/2026

Performance

FP16
432 TFLOPS
Bandwidth
2,450 GB/s
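
The two figures above together determine when a workload is compute-bound versus memory-bound on this part. A quick roofline-style calculation, using only the numbers quoted in this section:

```python
# Roofline "ridge point" from the Gaudi 2 figures quoted above: the
# arithmetic intensity (FLOPs per byte moved) at which a kernel shifts
# from memory-bound to compute-bound on this accelerator.

peak_fp16_flops = 432e12   # 432 TFLOPS FP16, from the spec above
mem_bandwidth = 2450e9     # 2,450 GB/s memory bandwidth, from the spec above

ridge_point = peak_fp16_flops / mem_bandwidth
print(f"ridge point ~ {ridge_point:.0f} FLOP/byte")
# Kernels with lower arithmetic intensity than this are limited by
# memory bandwidth rather than compute throughput.
```

The result (roughly 176 FLOP/byte) is why the page emphasizes memory bandwidth: most inference kernels sit well below that intensity and are bandwidth-limited.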

Strengths & Limitations

Strengths

  • 96GB of high-bandwidth memory provides substantial capacity for large model training and inference
  • 2,450 GB/s memory bandwidth supports memory-intensive AI workloads
  • 432 TFLOPS FP16 performance delivers competitive compute throughput
  • Open standard Ethernet networking eliminates proprietary interconnect requirements
  • PCIe Gen5 connectivity enables integration with standard server architectures
  • Multiple form factors (PCIe and mezzanine) offer deployment flexibility
  • High-efficiency architecture designed for cost-effective AI scaling

Limitations

  • 600W TDP requires substantial power and cooling infrastructure
  • Ethernet-based interconnect may have higher latency than proprietary solutions such as NVLink
  • Smaller software ecosystem than more established GPU platforms
  • May be overkill for smaller AI models or inference-only workloads
  • Released in 2022, making it a previous-generation offering in the rapidly evolving AI accelerator market

Key Features

Intel Gaudi architecture optimized for AI training and inference
Open standard Ethernet networking for multi-node scaling
PCIe Gen5 connectivity
High-efficiency compute architecture
96GB high-bandwidth memory subsystem
Multiple precision support including FP16
Seamless server integration capabilities
Standard Ethernet interconnect support

About Gaudi 2

The Intel Gaudi 2 is an AI accelerator built on Intel's Gaudi architecture, positioned as an alternative to NVIDIA's data center GPUs in the AI training and inference market. Released in 2022, the Gaudi 2 represents Intel's entry into high-performance AI computing, designed to integrate with standard Ethernet networking infrastructure rather than proprietary interconnects. As a previous-generation offering in Intel's AI accelerator lineup, it competes in the high-performance AI training segment.

The Gaudi 2 features 96GB of high-bandwidth memory with 2,450 GB/s of memory bandwidth and delivers 432 TFLOPS of FP16 performance. Key technical differentiators include its use of open standard Ethernet networking for multi-node scaling and PCIe Gen5 connectivity. The accelerator operates with a 600W TDP and is available in both PCIe and mezzanine form factors for flexible server integration.

In cloud deployments, the Gaudi 2 targets large language model training, multi-modal AI workloads, and enterprise retrieval-augmented generation (RAG) applications. Its Ethernet-based scaling approach allows cloud providers to leverage existing network infrastructure while delivering high AI compute performance for training and inference workloads that require substantial memory capacity and bandwidth.

Common Use Cases

The Gaudi 2 is well-suited for large language model training and inference, multi-modal AI applications, and enterprise RAG deployments that require substantial memory capacity and bandwidth. With 96GB of memory and 432 TFLOPS of FP16 performance, it handles memory-intensive workloads effectively. The Ethernet-based scaling makes it particularly attractive for organizations with existing network infrastructure that want to avoid proprietary interconnect investments. Its high memory bandwidth of 2,450 GB/s supports applications where model parameters and activations exceed typical GPU memory limits, making it suitable for training large transformer models and running inference on models with significant memory footprints.
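
A quick way to gauge whether a given model fits in the 96GB on a single device is to estimate its FP16 weight footprint. The parameter counts below are illustrative examples, not figures from this page:

```python
# Back-of-envelope check of whether a model's FP16 weights fit in the
# Gaudi 2's 96GB of memory. Parameter counts are illustrative
# assumptions; KV cache, activations, and (for training) optimizer
# state add substantially on top of the weight footprint.

VRAM_GB = 96
BYTES_PER_PARAM_FP16 = 2  # 2 bytes per FP16 weight

def weights_gb(params_billion: float) -> float:
    """Approximate FP16 weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * BYTES_PER_PARAM_FP16 / 1e9

for params in (7, 13, 34, 70):
    gb = weights_gb(params)
    verdict = "fits" if gb < VRAM_GB else "needs sharding"
    print(f"{params}B params -> ~{gb:.0f} GB FP16 weights ({verdict})")
```

Under these assumptions, weights up to roughly 45B parameters fit on one device for inference, while larger models (or training runs, which need several times the weight footprint) require multi-device sharding over the Ethernet fabric.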

Full Specifications

Hardware

Manufacturer
Intel
Architecture
Gaudi
TDP
600W

Memory & Performance

VRAM
96GB
Memory Bandwidth
2,450 GB/s
FP16
432 TFLOPS
Release
2022

Frequently Asked Questions

How much does a Gaudi 2 cost per hour in the cloud?

Gaudi 2 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
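
Turning an hourly rate into a job budget is simple multiplication. Note that whether a listed rate like the $1.21/hr above is billed per accelerator or per 8× node varies by provider, so treat this as a sketch:

```python
# Rough cloud-cost estimate from an hourly rate. Whether the listed
# rate applies per accelerator or per multi-GPU node depends on the
# provider's billing model -- check before budgeting.

def run_cost(rate_per_hour: float, hours: float, units: int = 1) -> float:
    """Total cost for `units` instances billed at `rate_per_hour` for `hours`."""
    return rate_per_hour * hours * units

# Example: a 48-hour job on 8 accelerators, each billed at $1.21/hr:
print(f"${run_cost(1.21, 48, 8):,.2f}")
```

Longer commitments and reserved capacity typically bring the effective hourly rate down from on-demand pricing.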

What is the Gaudi 2 best used for?

The Gaudi 2 excels at large language model training and inference, multi-modal AI workloads, and enterprise RAG applications. Its 96GB memory capacity and high bandwidth make it particularly suitable for memory-intensive AI workloads that exceed typical GPU memory limits.

How does Gaudi 2's Ethernet networking compare to NVIDIA's NVLink?

Gaudi 2 uses open standard Ethernet for multi-node communication, allowing integration with existing network infrastructure without proprietary hardware requirements. While this may result in higher latency compared to dedicated interconnects like NVLink, it provides cost-effective scalability and leverages standard networking equipment for AI cluster deployment.