A40 GPU
The A40 combines professional visualization and AI acceleration in a single GPU, supporting virtual workstations and rendering workloads alongside inference tasks. It offers a good balance of memory and compute for mixed graphics and AI use cases.

Cloud Pricing
Cheapest on Vast.ai — 72% below the average price. Prices updated daily. Last check: May 13, 2026.
Strengths & Limitations
Strengths
- 48GB GDDR6 memory with ECC enables training of large AI models and complex simulations
- 336 third-generation Tensor Cores provide 149.7 TFLOPS of FP16 performance for AI workloads
- Second-generation RT Cores deliver hardware-accelerated ray tracing for professional graphics
- 696 GB/s memory bandwidth supports memory-intensive applications
- Third-generation NVLink at 112.5 GB/s enables multi-GPU scaling
- PCIe Gen 4 support provides modern system compatibility
- NVIDIA vGPU software support enables virtualization and multi-user scenarios
Limitations
- 300W power consumption requires robust cooling and power infrastructure
- The Ampere architecture is generations behind NVIDIA's newer Hopper and Blackwell offerings
- Dual-slot form factor may limit density in space-constrained deployments
- Lacks specialized features found in newer data center GPUs like Transformer Engine
- May be overkill for basic inference workloads that don't require 48GB VRAM
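The 48GB figure that runs through the points above can be sanity-checked with simple arithmetic: a model's weight footprint is roughly parameter count × bytes per parameter, plus runtime overhead. A minimal sketch, where the 20% overhead margin for activations and CUDA context is an illustrative assumption, not an NVIDIA figure:

```python
def fits_in_vram(params_billions: float, bytes_per_param: float,
                 vram_gb: float = 48.0, overhead: float = 1.2) -> bool:
    """Rough check: model weights plus an overhead margin vs. available VRAM."""
    weights_gb = params_billions * bytes_per_param  # 1e9 params * bytes ≈ GB
    return weights_gb * overhead <= vram_gb

# A 30B-parameter model in FP16 (2 bytes/param) ≈ 60 GB of weights: too big.
print(fits_in_vram(30, 2))   # False
# The same model quantized to INT8 (1 byte/param) ≈ 30 GB: fits with headroom.
print(fits_in_vram(30, 1))   # True
```

This is only a weights-level estimate; KV caches and batch size push real usage higher.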
About A40
Common Use Cases
The A40 is well-suited for AI model training and data science workflows that require substantial VRAM capacity, particularly models with large parameter counts or extensive datasets that benefit from the 48GB memory buffer. Professional graphics applications, including CAD, content creation, and scientific visualization, leverage the RT Cores and high memory bandwidth. Virtual workstation deployments benefit from vGPU software support, enabling multiple concurrent users. The combination of traditional compute performance and AI acceleration makes it appropriate for mixed workloads in research environments and development workflows that span both graphics and machine learning requirements.
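For the inference side of these mixed workloads, the 696 GB/s memory bandwidth sets a useful upper bound on autoregressive decoding speed, since generating each token must stream the full weight set from VRAM. A back-of-envelope sketch, assuming batch size 1, weights-only traffic, and no KV-cache or kernel overhead (all simplifying assumptions):

```python
def max_tokens_per_sec(params_billions: float, bytes_per_param: float,
                       bandwidth_gbs: float = 696.0) -> float:
    """Bandwidth-bound ceiling: tokens/s <= bandwidth / model size in memory."""
    model_gb = params_billions * bytes_per_param
    return bandwidth_gbs / model_gb

# A 7B model in FP16 occupies ~14 GB, so 696 / 14 ≈ 49.7 tokens/s
# is the single-stream ceiling; real throughput will be lower.
print(round(max_tokens_per_sec(7, 2), 1))   # 49.7
```

Batching amortizes the weight reads across requests, which is why serving frameworks push batch sizes up on bandwidth-limited cards like this one.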
Full Specifications
Hardware
- Manufacturer: NVIDIA
- Architecture: Ampere
- CUDA Cores: 10,752
- Tensor Cores: 336
- RT Cores: 84
- Process Node: 8nm
- TDP: 300W
Memory & Performance
- VRAM: 48GB
- Memory Interface: 384-bit
- Memory Bandwidth: 696 GB/s
- FP32: 37.4 TFLOPS
- FP16: 149.7 TFLOPS
- INT8: 299.3 TOPS
- Release: 2020
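The throughput figures in the table are internally consistent and can be cross-checked: FP32 TFLOPS ≈ CUDA cores × 2 ops per FMA × boost clock, and the Tensor Core FP16 and INT8 rates are 4× and 8× the FP32 rate. The ~1,740 MHz boost clock below is an assumption taken from NVIDIA's public A40 datasheet, not a value stated on this page:

```python
cuda_cores = 10_752
boost_clock_ghz = 1.740          # assumed boost clock (not listed above)

# FP32 rate: each CUDA core retires one FMA (2 floating-point ops) per cycle.
fp32_tflops = cuda_cores * 2 * boost_clock_ghz / 1_000
print(round(fp32_tflops, 1))      # 37.4, matching the table

print(round(fp32_tflops * 4, 1))  # 149.7 — FP16 Tensor Core TFLOPS
print(round(fp32_tflops * 8, 1))  # 299.3 — INT8 TOPS
```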
Frequently Asked Questions
How much does an A40 cost per hour in the cloud?
A40 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
What is the A40 best used for?
The A40 excels at AI model training requiring large memory capacity, professional graphics workloads with ray tracing, virtual workstations, and data science applications. Its 48GB VRAM and Tensor Cores make it particularly suitable for training large neural networks, while RT Cores accelerate professional visualization tasks.
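The training claim above can be made concrete with a common rule of thumb: mixed-precision training with the Adam optimizer needs roughly 16 bytes per parameter (2 for FP16 weights, 2 for FP16 gradients, and 12 for the FP32 master weights and the two Adam moments), before counting activations. A rough sketch under that assumption:

```python
def training_footprint_gb(params_billions: float,
                          bytes_per_param: float = 16.0) -> float:
    """Weights + gradients + Adam state for mixed-precision training
    (~16 bytes/param), excluding activation memory."""
    return params_billions * bytes_per_param

# ~2.5B parameters ≈ 40 GB of state: near the practical ceiling for 48 GB
# once activations are added.
print(training_footprint_gb(2.5))   # 40.0
# A 7B model needs ~112 GB of state: multi-GPU (e.g. NVLink) territory.
print(training_footprint_gb(7))     # 112.0
```

Techniques such as gradient checkpointing, LoRA, or 8-bit optimizers lower these numbers substantially, which is how larger models are fine-tuned on a single 48GB card in practice.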
How does the A40 compare to modern data center GPUs like the H100?
The A40 offers 48GB VRAM compared to the H100's 80GB, and lacks the H100's Transformer Engine and fourth-generation Tensor Cores. The H100 provides significantly higher AI performance with specialized features for transformer models, while the A40 combines AI capabilities with professional graphics features like RT Cores that the H100 omits.