L40 GPU
The L40 targets visual computing and AI inference, supporting media processing, real-time rendering, and computer vision tasks in one GPU. It helps consolidate workloads that previously required separate hardware.

Cloud Pricing
Cheapest on IO.NET, 33% below the average rate. Prices are updated daily; last checked 4/27/2026.
Performance
Strengths & Limitations
Strengths:
- 48GB GDDR6 memory with ECC provides substantial capacity for large models and datasets
- Fourth-generation Tensor Cores deliver 362 INT8 TOPS for AI inference workloads
- Third-generation RT Cores enable hardware-accelerated ray tracing for rendering applications
- 8K AV1 encode and decode support for high-resolution video processing
- 300W TDP offers reasonable power efficiency for its performance tier
- Enterprise-grade design optimized for 24x7 continuous operation
- Secure Boot with an internal root of trust for security-sensitive environments
Limitations:
- 300W power draw requires robust cooling and power infrastructure
- Not a dedicated compute accelerator; lacks some enterprise features found in the H100 and A100
- Ada Lovelace is a previous-generation architecture compared to newer Blackwell designs
- May be overkill for basic compute tasks that don't need the large memory capacity
- Limited to PCIe Gen 4, with no high-speed GPU-to-GPU interconnect such as NVLink
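The PCIe Gen 4 limitation can be put in concrete terms with a back-of-the-envelope sketch. A Gen 4 x16 link peaks at roughly 32 GB/s per direction in theory; the ~25 GB/s effective rate below is an assumption for illustration, not a measured L40 figure:

```python
# Rough host-to-GPU transfer-time estimate over PCIe Gen 4 x16.
# 32 GB/s is the theoretical per-direction peak; ~25 GB/s is an assumed
# practical throughput, not a measured L40 number.
PCIE4_X16_PEAK_GBPS = 32.0
PCIE4_X16_EFFECTIVE_GBPS = 25.0  # assumption

def transfer_seconds(payload_gb: float, rate_gbps: float) -> float:
    """Time to move `payload_gb` gigabytes at `rate_gbps` GB/s."""
    return payload_gb / rate_gbps

# Filling the full 48GB of VRAM from host memory:
best_case = transfer_seconds(48, PCIE4_X16_PEAK_GBPS)       # 1.5 s
typical = transfer_seconds(48, PCIE4_X16_EFFECTIVE_GBPS)    # ~1.9 s
print(f"48GB over PCIe Gen 4 x16: {best_case:.1f}s peak, {typical:.2f}s typical")
```

For single-GPU inference this cost is paid once at model load; it matters more for multi-GPU workloads that would otherwise rely on a faster GPU-to-GPU fabric.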
Common Use Cases
The L40 is well-suited for professional workloads that demand large memory capacity and mixed compute requirements. Its 48GB of ECC memory makes it appropriate for training medium to large AI models, running inference on memory-intensive models, and supporting virtualized workstation environments where multiple users share GPU resources. The combination of RT Cores and substantial VRAM supports 3D rendering, architectural visualization, and content creation workflows. Data science applications benefit from the large memory for processing extensive datasets, while the enterprise-grade design supports 24x7 cloud deployment scenarios requiring reliability and security features.
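As a rough guide to what fits in 48GB, model weights alone take parameters × bytes per parameter. A minimal sketch, assuming a 20% overhead factor for activations, runtime allocations, and fragmentation (the factor is an assumption; real usage varies widely by workload):

```python
def model_vram_gb(params_billion: float, bytes_per_param: int,
                  overhead: float = 1.2) -> float:
    """Estimated VRAM (GB) to hold a model's weights plus a fudge factor.

    `overhead` (assumed 20%) covers activations, KV cache slack, and
    runtime allocations; actual consumption depends on the workload.
    """
    return params_billion * bytes_per_param * overhead

L40_VRAM_GB = 48

for billions in (7, 13, 30):
    need = model_vram_gb(billions, bytes_per_param=2)  # FP16/BF16 weights
    fits = "fits" if need <= L40_VRAM_GB else "does not fit"
    print(f"{billions}B params @ FP16: ~{need:.0f}GB -> {fits} in {L40_VRAM_GB}GB")
```

By this estimate a 13B-parameter model in FP16 fits comfortably, while a 30B model would need quantization (e.g. INT8) to run on a single card.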
Full Specifications
Hardware
- Manufacturer: NVIDIA
- Architecture: Ada Lovelace
- CUDA Cores: 18,176
- Tensor Cores: 568
- RT Cores: 142
- Process Node: TSMC 4N (4nm)
- TDP: 300W
Memory & Performance
- VRAM: 48GB GDDR6 with ECC
- Memory Interface: 384-bit
- Memory Bandwidth: 864 GB/s
- FP32: 90.5 TFLOPS
- FP16: 181.05 TFLOPS
- FP64: 1.41 TFLOPS
- INT8: 362 TOPS
- Release: 2022
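The bandwidth figure follows directly from the memory interface: a 384-bit bus at GDDR6's 18 Gbps per pin (the per-pin data rate is an assumption consistent with the listed 864 GB/s, not a number stated above) gives 384/8 × 18 = 864 GB/s. Dividing peak FP32 throughput by bandwidth also yields a naive roofline balance point:

```python
# Derive memory bandwidth from bus width and per-pin data rate, then the
# compute/bandwidth balance point (naive roofline arithmetic).
BUS_WIDTH_BITS = 384
GDDR6_GBPS_PER_PIN = 18   # assumed per-pin data rate implied by 864 GB/s
FP32_TFLOPS = 90.5

bandwidth_gb_s = BUS_WIDTH_BITS / 8 * GDDR6_GBPS_PER_PIN   # 864 GB/s
balance_flops_per_byte = FP32_TFLOPS * 1e12 / (bandwidth_gb_s * 1e9)

print(f"Bandwidth: {bandwidth_gb_s:.0f} GB/s")
print(f"Kernels below ~{balance_flops_per_byte:.0f} FLOPs/byte "
      f"are bandwidth-bound on this card")
```

The ~105 FLOPs/byte balance point is a rough rule of thumb: dense matrix multiplies typically sit above it (compute-bound), while memory-heavy operations such as attention over long sequences often sit below it.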
Frequently Asked Questions
How much does an L40 cost per hour in the cloud?
L40 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
What is the L40 best used for?
The L40 excels at workloads requiring large memory capacity, including AI model training and inference with substantial datasets, 3D rendering and visualization, virtualized workstation environments, and professional content creation. Its 48GB ECC memory and enterprise-grade design make it suitable for professional applications that need reliability and substantial VRAM.
How does the L40 compare to dedicated data center GPUs like the H100?
The L40 offers 48GB of memory compared to the H100's 80GB, and lacks the H100's specialized features like NVLink interconnect and Transformer Engine optimizations. However, the L40 includes RT Cores for ray tracing and video encoding capabilities that the H100 lacks, making it more versatile for mixed professional workloads rather than pure AI compute.