L40 GPU
The L40 targets visual computing and AI inference, supporting media processing, real-time rendering, and computer vision tasks in one GPU. It helps consolidate workloads that previously required separate hardware.

Cloud Pricing
Cheapest on IO.NET, 33% below the average rate. Prices are updated daily; last checked 4/27/2026.
Performance
Strengths & Limitations
Strengths:
- 48GB GDDR6 memory with ECC provides substantial capacity for large models and datasets
- Fourth-generation Tensor Cores deliver 362 INT8 TOPS for AI inference workloads
- Third-generation RT Cores enable hardware-accelerated ray tracing for rendering applications
- 8K AV1 encode and decode support for high-resolution video processing
- 300W TDP offers reasonable power efficiency for its performance tier
- Enterprise-grade design optimized for 24x7 continuous operation
- Secure Boot with an internal root of trust for security-sensitive environments
Limitations:
- 300W power draw requires robust cooling and power infrastructure
- Not a dedicated compute accelerator; lacks some enterprise features found in the H100 and A100
- Ada Lovelace is a previous-generation architecture compared to newer Blackwell designs
- May be overkill for basic compute tasks that don't need the large memory capacity
- Limited to PCIe Gen 4, with no high-speed GPU-to-GPU interconnect such as NVLink
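The PCIe Gen 4 limitation can be put in concrete terms with a back-of-the-envelope sketch. A Gen 4 x16 link peaks at roughly 32 GB/s per direction in theory; the ~25 GB/s effective rate below is an assumption for illustration, not a measured L40 figure:

```python
# Rough host-to-GPU transfer-time estimate over PCIe Gen 4 x16.
# 32 GB/s is the theoretical per-direction peak; ~25 GB/s is an assumed
# practical throughput, not a measured L40 number.
PCIE4_X16_PEAK_GBPS = 32.0
PCIE4_X16_EFFECTIVE_GBPS = 25.0  # assumption

def transfer_seconds(payload_gb: float, rate_gbps: float) -> float:
    """Time to move `payload_gb` gigabytes at `rate_gbps` GB/s."""
    return payload_gb / rate_gbps

# Filling the full 48GB of VRAM from host memory:
best_case = transfer_seconds(48, PCIE4_X16_PEAK_GBPS)       # 1.5 s
typical = transfer_seconds(48, PCIE4_X16_EFFECTIVE_GBPS)    # ~1.9 s
print(f"48GB over PCIe Gen 4 x16: {best_case:.1f}s peak, {typical:.2f}s typical")
```

For single-GPU inference this cost is paid once at model load; it matters more for multi-GPU workloads that would otherwise rely on a faster GPU-to-GPU fabric.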
Common Use Cases
The L40 is well-suited for professional workloads that demand large memory capacity and mixed compute requirements. Its 48GB of ECC memory makes it appropriate for training medium to large AI models, running inference on memory-intensive models, and supporting virtualized workstation environments where multiple users share GPU resources. The combination of RT Cores and substantial VRAM supports 3D rendering, architectural visualization, and content creation workflows. Data science applications benefit from the large memory for processing extensive datasets, while the enterprise-grade design supports 24x7 cloud deployment scenarios requiring reliability and security features.
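As a rough guide to what fits in 48GB, model weights alone take parameters × bytes per parameter. A minimal sketch, assuming a 20% overhead factor for activations, runtime allocations, and fragmentation (the factor is an assumption; real usage varies widely by workload):

```python
def model_vram_gb(params_billion: float, bytes_per_param: int,
                  overhead: float = 1.2) -> float:
    """Estimated VRAM (GB) to hold a model's weights plus a fudge factor.

    `overhead` (assumed 20%) covers activations, KV cache slack, and
    runtime allocations; actual consumption depends on the workload.
    """
    return params_billion * bytes_per_param * overhead

L40_VRAM_GB = 48

for billions in (7, 13, 30):
    need = model_vram_gb(billions, bytes_per_param=2)  # FP16/BF16 weights
    fits = "fits" if need <= L40_VRAM_GB else "does not fit"
    print(f"{billions}B params @ FP16: ~{need:.0f}GB -> {fits} in {L40_VRAM_GB}GB")
```

By this estimate a 13B-parameter model in FP16 fits comfortably, while a 30B model would need quantization (e.g. INT8) to run on a single card.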
Full Specifications
Hardware
- Manufacturer: NVIDIA
- Architecture: Ada Lovelace
- CUDA Cores: 18,176
- Tensor Cores: 568
- RT Cores: 142
- Process Node: TSMC 4N (4nm)
- TDP: 300W
Memory & Performance
- VRAM: 48GB GDDR6 with ECC
- Memory Interface: 384-bit
- Memory Bandwidth: 864 GB/s
- FP32: 90.5 TFLOPS
- FP16: 181.05 TFLOPS
- FP64: 1.41 TFLOPS
- INT8: 362 TOPS
- Release: 2022
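The bandwidth figure follows directly from the memory interface: a 384-bit bus at GDDR6's 18 Gbps per pin (the per-pin data rate is an assumption consistent with the listed 864 GB/s, not a number stated above) gives 384/8 × 18 = 864 GB/s. Dividing peak FP32 throughput by bandwidth also yields a naive roofline balance point:

```python
# Derive memory bandwidth from bus width and per-pin data rate, then the
# compute/bandwidth balance point (naive roofline arithmetic).
BUS_WIDTH_BITS = 384
GDDR6_GBPS_PER_PIN = 18   # assumed per-pin data rate implied by 864 GB/s
FP32_TFLOPS = 90.5

bandwidth_gb_s = BUS_WIDTH_BITS / 8 * GDDR6_GBPS_PER_PIN   # 864 GB/s
balance_flops_per_byte = FP32_TFLOPS * 1e12 / (bandwidth_gb_s * 1e9)

print(f"Bandwidth: {bandwidth_gb_s:.0f} GB/s")
print(f"Kernels below ~{balance_flops_per_byte:.0f} FLOPs/byte "
      f"are bandwidth-bound on this card")
```

The ~105 FLOPs/byte balance point is a rough rule of thumb: dense matrix multiplies typically sit above it (compute-bound), while memory-heavy operations such as attention over long sequences often sit below it.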
Frequently Asked Questions
How much does an L40 cost per hour in the cloud?
L40 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
What is the L40 best used for?
The L40 excels at workloads requiring large memory capacity, including AI model training and inference with substantial datasets, 3D rendering and visualization, virtualized workstation environments, and professional content creation. Its 48GB ECC memory and enterprise-grade design make it suitable for professional applications that need reliability and substantial VRAM.
How does the L40 compare to dedicated data center GPUs like the H100?
The L40 offers 48GB of memory compared to the H100's 80GB, and lacks the H100's specialized features like NVLink interconnect and Transformer Engine optimizations. However, the L40 includes RT Cores for ray tracing and video encoding capabilities that the H100 lacks, making it more versatile for mixed professional workloads rather than pure AI compute.