MI355X GPU
The AMD Instinct MI355X is the next-generation CDNA 4 accelerator with 288GB HBM3E memory, targeting AI training and large-scale inference.

Cloud Pricing
Cheapest on Vultr: 11% below average.

| Provider | GPUs | Price / hr | Updated | Source |
|---|---|---|---|---|
|  | 1× GPU | $2.59 | 4/2/2026 |  |
|  | 8× GPU | $2.59 | 4/7/2026 |  |
|  | 1× GPU | $2.95 | 4/7/2026 |  |
|  | 1× GPU | $3.45 | 4/2/2026 |  |
Prices updated daily. Last check: 4/8/2026
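The "below average" figure can be reproduced from the listed rates. A minimal sketch, using the per-GPU-hour prices from the table above (provider names omitted, as in the table):

```python
# Sanity-check the "11% below avg" claim from the listed rates.
prices = [2.59, 2.59, 2.95, 3.45]  # $ per GPU-hour, from the pricing table

avg = sum(prices) / len(prices)
cheapest = min(prices)
discount_pct = (1 - cheapest / avg) * 100  # how far below average the cheapest rate sits

print(f"avg=${avg:.2f}/hr, cheapest=${cheapest:.2f}/hr, {discount_pct:.0f}% below avg")
```

The cheapest listed rate works out to roughly 10.5% below the table average, which rounds to the 11% shown.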
Performance
Strengths & Limitations
Strengths:
- 288GB HBM3E memory capacity supports large model training without partitioning
- 8TB/s memory bandwidth enables efficient data movement for memory-intensive workloads
- MXFP4 support delivers 10.1 PFLOPS peak performance for AI inference
- Enhanced sparse matrix calculation performance benefits certain AI and scientific workloads
- FP64 performance of 78.6 TFLOPS serves HPC applications requiring double precision
- 7 Infinity Fabric links enable high-bandwidth multi-GPU scaling
- CDNA 4 architecture provides dedicated AI and HPC optimizations
Limitations:
- 1400W typical board power requires substantial cooling infrastructure
- OAM form factor limits deployment to specialized server platforms
- AMD ROCm ecosystem has fewer pre-optimized frameworks than CUDA
- Ultra-tier pricing makes it cost-prohibitive for smaller workloads
- More limited cloud availability than NVIDIA alternatives
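The bandwidth and compute figures above can be combined into a rough machine-balance estimate: peak FLOPS divided by memory bandwidth gives the arithmetic intensity (FLOPs per byte) a kernel must sustain before compute, rather than memory bandwidth, becomes the bottleneck. A sketch using the spec-sheet peak numbers:

```python
# Machine-balance sketch from the listed specs (peak, illustrative numbers).
BANDWIDTH_TB_S = 8.0  # HBM3E memory bandwidth, TB/s

peak_tflops = {
    "FP64": 78.6,
    "MXFP4": 10_100.0,  # 10.1 PFLOPS expressed in TFLOPS
}

for fmt, tflops in peak_tflops.items():
    # TFLOPS divided by TB/s reduces to FLOPs per byte moved.
    balance = tflops / BANDWIDTH_TB_S
    print(f"{fmt}: ~{balance:.1f} FLOPs/byte to become compute-bound")
```

At over a thousand FLOPs per byte for MXFP4, only dense GEMM-heavy kernels approach the compute roof; many inference stages stay bandwidth-bound, which is why the 8TB/s figure matters as much as the peak FLOPS.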
About MI355X
Common Use Cases
The MI355X is designed for large-scale AI training and inference workloads that benefit from its 288GB memory capacity, including training transformer models, large language models, and computer vision networks that exceed the memory limits of smaller accelerators. The 8TB/s memory bandwidth and MXFP4 support make it effective for high-throughput inference serving. In HPC environments, the 78.6 TFLOPs FP64 performance and sparse matrix optimizations suit computational fluid dynamics, molecular dynamics simulations, and other scientific computing applications requiring both high memory capacity and compute precision.
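As a rough illustration of the "without partitioning" point, whether a model fits in 288GB can be estimated from its parameter count and precision. A sketch under a common rule of thumb (bytes per parameter times parameter count, plus an assumed ~20% overhead for activations and KV cache; the model sizes are hypothetical examples, not benchmarks):

```python
VRAM_GB = 288  # MI355X HBM3E capacity

def fits(params_billions: float, bytes_per_param: float, overhead: float = 1.2) -> bool:
    """Estimate whether a model fits on one GPU for inference.

    Assumes memory ~= params * bytes_per_param, inflated by an assumed
    20% overhead for activations and KV cache.
    """
    needed_gb = params_billions * bytes_per_param * overhead
    return needed_gb <= VRAM_GB

for size_b in (70, 180, 405):
    fp16 = "fits" if fits(size_b, 2) else "needs partitioning"
    fp8 = "fits" if fits(size_b, 1) else "needs partitioning"
    print(f"{size_b}B params: FP16 {fp16}, FP8 {fp8}")
```

Under these assumptions a 70B model fits comfortably at FP16, and a ~180B model fits at FP8, sizes that would require partitioning on 80GB-class accelerators.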
Full Specifications
Hardware
- Manufacturer
- AMD
- Architecture
- CDNA 4
- TDP
- 1400W
Memory & Performance
- VRAM
- 288GB
- Memory Bandwidth
- 8000 GB/s
- FP16
- 1800 TFLOPS
- Release
- 2025
Frequently Asked Questions
How much does an MI355X cost per hour in the cloud?
MI355X pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
What is the MI355X best used for?
The MI355X excels at large-scale AI training and inference workloads requiring substantial memory capacity, particularly transformer models and LLMs that benefit from the 288GB HBM3E memory. It also serves HPC applications needing high FP64 performance and memory bandwidth for scientific simulations.
How does the MI355X compare to NVIDIA's H100 for AI workloads?
The MI355X offers 288GB of memory versus the H100's 80GB, providing advantages for large model training. However, the H100 benefits from broader CUDA ecosystem support and more mature AI framework optimizations. Performance comparisons depend on specific workloads and framework optimization levels.
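The capacity difference can be made concrete by counting the minimum number of GPUs needed just to hold a model's FP16 weights (weights only; activations, KV cache, and optimizer state are ignored, and the model sizes are illustrative):

```python
import math

def min_gpus(params_billions: float, vram_gb: int, bytes_per_param: int = 2) -> int:
    """Minimum GPUs to hold FP16 weights alone (no activations/KV cache)."""
    weights_gb = params_billions * bytes_per_param
    return math.ceil(weights_gb / vram_gb)

for model_b in (70, 405):
    mi355x = min_gpus(model_b, 288)  # MI355X: 288GB
    h100 = min_gpus(model_b, 80)     # H100 SXM: 80GB
    print(f"{model_b}B FP16 weights: MI355X x{mi355x}, H100 x{h100}")
```

By this weights-only estimate, a 405B-parameter model spans 3 MI355X versus 11 H100 GPUs, though real deployments need extra headroom and the comparison says nothing about throughput.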