L40 GPU

The L40 targets visual computing and AI inference, supporting media processing, real-time rendering, and computer vision tasks in one GPU. It helps consolidate workloads that previously required separate hardware.

VRAM 48GB
CUDA Cores 18,176
Tensor Cores 568
TDP 300W
Process 4nm
From $0.66/hr across 9 providers
L40 GPU Cloud Pricing

Cheapest on IO.NET (33% below average)
Config            Price / hr   Updated
1×, 2×, 4×        $0.66/hr     4/22/2026
1×, 2×, 4×        $0.67/hr     4/26/2026
1×, 2×, 4×, 8×    $0.69/hr     4/26/2026
1×                $0.84/hr     4/26/2026
1×                $0.95/hr     4/24/2026
1×, 2×            $0.99/hr     4/26/2026
1×, 2×, 4×, 8×    $1.00/hr     4/26/2026
1×, 2×            $1.09/hr     4/26/2026
4×, 8×            $1.10/hr     4/26/2026
8×                $1.25/hr     4/25/2026
1×, 2×, 4×, 8×    $1.49/hr     4/26/2026
4×                $1.49/hr     4/26/2026
Prices updated daily. Last check: 4/27/2026
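To put the hourly rates above in perspective, here is a small sketch (the rates come from the table; the 24/7 utilization is an assumption for illustration) converting an hourly price into a monthly figure:

```python
# Estimate the monthly cost of one L40 at a given hourly rate.
# Rates are from the pricing table above; round-the-clock usage is assumed.

def monthly_cost(rate_per_hr: float, hours_per_day: float = 24, days: int = 30) -> float:
    """Cost of running one GPU for a month at the given rate."""
    return rate_per_hr * hours_per_day * days

cheapest = monthly_cost(0.66)  # cheapest listed rate
priciest = monthly_cost(1.49)  # most expensive listed rate
print(f"24/7 for 30 days: ${cheapest:.2f} to ${priciest:.2f}")
```

Lower utilization scales the figure linearly, so a workload that only runs 8 hours a day costs a third of the 24/7 number.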

Performance

FP16
181.05 TFLOPS
FP32
90.5 TFLOPS
INT8
362 TOPS
Bandwidth
864 GB/s
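The compute and bandwidth figures above determine whether a given kernel is compute- or bandwidth-limited. A minimal roofline-style sketch, using only the FP16 and bandwidth numbers listed here (the example intensities are illustrative):

```python
# Roofline-style check using the L40 figures above.
# A kernel whose arithmetic intensity (FLOPs per byte moved) falls below
# the machine balance is limited by the 864 GB/s bandwidth, not compute.

FP16_TFLOPS = 181.05
BANDWIDTH_GBS = 864

machine_balance = (FP16_TFLOPS * 1e12) / (BANDWIDTH_GBS * 1e9)  # FLOPs per byte

def attainable_tflops(intensity_flops_per_byte: float) -> float:
    """Peak achievable TFLOPS for a kernel with the given arithmetic intensity."""
    return min(FP16_TFLOPS, intensity_flops_per_byte * BANDWIDTH_GBS / 1e3)

print(f"machine balance: {machine_balance:.0f} FLOPs/byte")
print(f"intensity 10  -> {attainable_tflops(10):.2f} TFLOPS (bandwidth-bound)")
print(f"intensity 500 -> {attainable_tflops(500):.2f} TFLOPS (compute-bound)")
```

The balance works out to roughly 210 FLOPs per byte, which is why low-intensity operations like large-batch inference with long KV caches tend to be bandwidth-bound on this class of GPU.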

Strengths & Limitations

  • 48GB GDDR6 memory with ECC provides substantial capacity for large models and datasets
  • Fourth-generation Tensor Cores deliver 362 INT8 TOPS for AI inference workloads
  • Third-generation RT Cores enable hardware-accelerated ray tracing for rendering applications
  • 8K AV1 encode and decode support for high-resolution video processing
  • 300W TDP offers reasonable power efficiency for the performance tier
  • Enterprise-grade design optimized for 24x7 continuous operation
  • Secure Boot with internal root of trust for security-sensitive environments
  • 300W power consumption requires robust cooling and power infrastructure
  • Not a dedicated data center accelerator, lacking some enterprise features found in H100 or A100
  • Ada Lovelace architecture is a previous generation compared to newer Blackwell designs
  • May be overkill for basic compute tasks that don't require the large memory capacity
  • Limited to PCIe Gen 4 interconnect without high-speed GPU-to-GPU communication options

Key Features

NVIDIA Ada Lovelace Architecture
Fourth-generation Tensor Cores
Third-generation RT Cores
8K AV1 encode and decode
48GB GDDR6 memory with ECC
Secure Boot with internal root of trust
PCIe Gen 4 interface
Dual-slot form factor

About L40

The NVIDIA L40 is a professional graphics and compute GPU based on the Ada Lovelace architecture, positioned as a versatile workstation and data center solution. Built on a 4nm manufacturing process, the L40 bridges the gap between consumer gaming GPUs and dedicated data center accelerators, offering 48GB of GDDR6 memory with ECC for professional workloads that require substantial memory capacity and reliability.

The L40 features 18,176 CUDA cores and 568 fourth-generation Tensor Cores, delivering 90.5 TFLOPS of FP32 performance and 181.05 TFLOPS of FP16 performance. With 864 GB/s of memory bandwidth across a 384-bit memory interface, the GPU provides sufficient throughput for memory-intensive applications. The Ada Lovelace architecture introduces third-generation RT Cores for hardware-accelerated ray tracing and supports 8K AV1 encoding and decoding.

In cloud deployments, the L40 serves workloads requiring large memory footprints combined with professional-grade reliability, including AI model training with medium to large datasets, 3D rendering pipelines, virtualized desktop infrastructure, and multi-tenant workstation environments where the 48GB memory capacity can be shared across multiple users or applications.

Common Use Cases

The L40 is well-suited for professional workloads that demand large memory capacity and mixed compute requirements. Its 48GB of ECC memory makes it appropriate for training medium to large AI models, running inference on memory-intensive models, and supporting virtualized workstation environments where multiple users share GPU resources. The combination of RT Cores and substantial VRAM supports 3D rendering, architectural visualization, and content creation workflows. Data science applications benefit from the large memory for processing extensive datasets, while the enterprise-grade design supports 24x7 cloud deployment scenarios requiring reliability and security features.
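As a rough illustration of what the 48GB capacity allows, here is a sketch estimating the largest FP16 model that fits for inference. The 20% overhead factor for activations, KV cache, and CUDA context is an assumed rule of thumb, not an NVIDIA figure:

```python
# Rough estimate of the model size that fits in the L40's 48GB at FP16.
# OVERHEAD is an assumed allowance for activations, KV cache, and CUDA
# context; real headroom depends heavily on batch size and sequence length.

VRAM_GB = 48
BYTES_PER_PARAM_FP16 = 2
OVERHEAD = 1.2  # assumed, not a spec value

def max_params_billions(vram_gb: float = VRAM_GB) -> float:
    usable_bytes = vram_gb * 1e9 / OVERHEAD
    return usable_bytes / BYTES_PER_PARAM_FP16 / 1e9

print(f"~{max_params_billions():.0f}B parameters at FP16")
```

Under these assumptions, a model of roughly 20B parameters fits for FP16 inference; quantizing to INT8 or FP8 would roughly double that.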

Full Specifications

Hardware

Manufacturer
NVIDIA
Architecture
Ada Lovelace
CUDA Cores
18,176
Tensor Cores
568
RT Cores
142
Process Node
4nm
TDP
300W

Memory & Performance

VRAM
48GB
Memory Interface
384-bit
Memory Bandwidth
864 GB/s
FP32
90.5 TFLOPS
FP16
181.05 TFLOPS
FP64
1.41 TFLOPS
INT8
362 TOPS
Release
2022

Frequently Asked Questions

How much does an L40 cost per hour in the cloud?

L40 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.

What is the L40 best used for?

The L40 excels at workloads requiring large memory capacity, including AI model training and inference with substantial datasets, 3D rendering and visualization, virtualized workstation environments, and professional content creation. Its 48GB ECC memory and enterprise-grade design make it suitable for professional applications that need reliability and substantial VRAM.

How does the L40 compare to dedicated data center GPUs like the H100?

The L40 offers 48GB of memory compared to the H100's 80GB, and lacks the H100's specialized features like NVLink interconnect and Transformer Engine optimizations. However, the L40 includes RT Cores for ray tracing and video encoding capabilities that the H100 lacks, making it more versatile for mixed professional workloads rather than pure AI compute.