L4 GPU

The NVIDIA L4 is a versatile data center GPU optimized for AI inference, video processing, and graphics workloads.

VRAM 24GB
CUDA Cores 7,424
Tensor Cores 232
TDP 72W
From $0.32/hr across 7 providers
Cloud Pricing

Cheapest on Seeweb: 59% below average
GPUs      Commitment   Price / hr   Updated
1× GPU    6 mo         $0.32        3/30/2026
1× GPU    -            $0.34        4/4/2026
1× GPU    3 mo         $0.34        3/30/2026
1× GPU    1 mo         $0.36        3/30/2026
1× GPU    -            $0.44        4/6/2026
1× GPU    -            $0.44        4/8/2026
1× GPU    -            $0.77        4/8/2026
2× GPU    -            $0.80        4/8/2026
4× GPU    -            $0.80        4/8/2026
1× GPU    -            $0.80        4/8/2026
1× GPU    -            $0.87        4/8/2026
2× GPU    -            $0.87        4/8/2026
4× GPU    -            $0.87        4/8/2026
8× GPU    -            $0.87        4/8/2026
8× GPU    -            $0.92        4/8/2026
4× GPU    -            $0.93        4/8/2026
2× GPU    -            $0.95        4/8/2026
1× GPU    -            $0.99        4/8/2026
4× GPU    -            $1.15        4/8/2026
8× GPU    -            $1.67        4/8/2026

Prices updated daily. Last check: 4/8/2026
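As a quick sanity check on the rates above, an hourly price converts to a monthly figure roughly as follows. This is a sketch only: it assumes a 720-hour month and continuous use, and real bills depend on provider billing granularity, commitments, and discounts.

```python
# Rough monthly cost estimate from an hourly cloud rate.
# 720 hours/month is a common approximation; actual billing varies.

HOURS_PER_MONTH = 24 * 30  # 720

def monthly_cost(hourly_rate: float, gpus: int = 1) -> float:
    """Estimated cost of running `gpus` L4 instances non-stop for a month."""
    return round(hourly_rate * gpus * HOURS_PER_MONTH, 2)

print(monthly_cost(0.32))  # cheapest listed 1x rate -> 230.4
print(monthly_cost(0.87))  # a mid-range listed rate -> 626.4
```

At the cheapest listed rate, a single always-on L4 works out to roughly $230/month before any committed-use discount.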

Performance

FP16
121 TFLOPS
FP32
30.29 TFLOPS
BF16
121 TFLOPS
FP8
121 TFLOPS
Bandwidth
300 GB/s
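The ratio of peak FP16 compute to memory bandwidth gives a rough roofline-style breakeven point: kernels performing fewer FLOPs per byte moved than this figure are memory-bandwidth-bound on the L4. This is a back-of-envelope calculation using only the peak numbers quoted above, not a measured result.

```python
# Roofline-style breakeven for the L4: FLOPs a kernel must perform per
# byte of memory traffic before compute, not bandwidth, is the limit.

FP16_FLOPS = 121e12  # 121 TFLOPS peak FP16
BANDWIDTH = 300e9    # 300 GB/s memory bandwidth

breakeven = FP16_FLOPS / BANDWIDTH  # FLOPs per byte
print(f"{breakeven:.0f} FLOPs/byte")  # ~403
```

Batch-1 LLM token generation sits at roughly 2 FLOPs per weight byte, far below this breakeven, which is one reason the L4's 300 GB/s bandwidth, rather than its Tensor Core throughput, often sets the ceiling for low-batch inference.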

Strengths & Limitations

  • 24GB GPU memory enables deployment of larger AI models that require substantial memory buffers
  • 72-watt maximum power consumption allows high-density server deployments with minimal cooling requirements
  • Fourth-generation Tensor Cores provide hardware acceleration for modern AI inference workloads
  • Single-slot low-profile form factor fits in space-constrained server chassis
  • 300 GB/s memory bandwidth supports memory-intensive video processing and inference tasks
  • Ada Lovelace architecture delivers improved energy efficiency compared to previous generation server GPUs
  • PCIe Gen4 x16 interface provides 64 GB/s host connectivity for data transfer
  • Entry-tier positioning limits raw compute performance compared to higher-end data center GPUs
  • 232 Tensor Cores may be insufficient for training large models or compute-intensive inference workloads
  • Single GPU configuration lacks multi-GPU interconnect technologies like NVLink
  • Ada Lovelace architecture represents previous generation compared to newer Blackwell-based offerings
  • Limited to PCIe connectivity without high-speed GPU-to-GPU communication capabilities
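The memory-capacity points above can be made concrete with a rough fit check: model weights alone occupy parameter count times bytes per parameter, and the KV cache, activations, and framework overhead need headroom on top. The 4 GB headroom figure below is an illustrative assumption, not a specification.

```python
# Back-of-envelope check of whether a model's weights fit in the L4's
# 24 GB VRAM. Counts weights only; the headroom constant is a guess
# covering KV cache, activations, and framework overhead.

L4_VRAM_GB = 24

def weights_gb(params_billions: float, bytes_per_param: int) -> float:
    """Weight footprint in GB (1B params at 2 bytes each = 2 GB)."""
    return params_billions * bytes_per_param

def fits_on_l4(params_billions: float, bytes_per_param: int,
               headroom_gb: float = 4.0) -> bool:
    return weights_gb(params_billions, bytes_per_param) + headroom_gb <= L4_VRAM_GB

print(fits_on_l4(7, 2))   # 7B model in FP16: 14 GB weights -> True
print(fits_on_l4(13, 2))  # 13B in FP16: 26 GB weights -> False
print(fits_on_l4(13, 1))  # 13B quantized to 8-bit: 13 GB -> True
```

The same arithmetic shows why FP8 support matters on this card: halving bytes per parameter roughly doubles the model size that fits in the 24GB buffer.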

Key Features

Fourth-generation Tensor Cores
NVIDIA Ada Lovelace architecture
CV-CUDA acceleration
NVIDIA Deep Learning Super Sampling 3
PCIe Gen4 x16 interface
Single-slot low-profile form factor
Hardware video encoding and decoding
CUDA Compute Capability 8.9

About L4

The NVIDIA L4 is an entry-tier data center GPU based on the Ada Lovelace architecture, positioned as an energy-efficient solution for AI inference and video processing workloads. Released in 2023, the L4 serves as a successor to the T4 in NVIDIA's server GPU lineup, offering improved performance while maintaining a low 72-watt power envelope suitable for space-constrained deployments.

The L4 features 24GB of GPU memory with 300 GB/s memory bandwidth, 7,424 CUDA cores, and 232 fourth-generation Tensor Cores. Its standout characteristic is the combination of substantial memory capacity with minimal power consumption, achieved through the Ada Lovelace architecture's manufacturing efficiency. The GPU delivers 121 TFLOPS of FP16 performance and 30.29 TFLOPS of FP32 performance while fitting in a single-slot, low-profile form factor.

In cloud deployments, the L4 typically serves high-throughput inference scenarios where memory capacity matters more than raw compute power. Its 24GB memory buffer enables deployment of larger language models and computer vision workloads that would exceed the memory limits of smaller inference GPUs, while its low power draw allows dense server configurations.
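One practical consequence of the PCIe Gen4 x16 interface worth noting: even filling the entire 24GB VRAM takes well under a second at the bus's theoretical peak. This is a sketch at the 64 GB/s theoretical figure; real-world PCIe throughput is typically 20-30% lower.

```python
# Time to stage data from host memory into the L4's VRAM over
# PCIe Gen4 x16 at its theoretical 64 GB/s (real throughput is lower).

PCIE_GBPS = 64  # theoretical peak for Gen4 x16

def load_time_s(size_gb: float, bus_gbps: float = PCIE_GBPS) -> float:
    return size_gb / bus_gbps

print(f"{load_time_s(24):.3f} s")  # filling all 24 GB: 0.375 s at peak
```

The takeaway is that cold-starting a model onto the L4 is dominated by disk or network I/O, not the PCIe hop, even without NVLink.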

Common Use Cases

The L4 is well-suited for AI inference deployments requiring substantial memory capacity within power-constrained environments. Its 24GB memory buffer makes it appropriate for deploying medium-sized language models, computer vision applications processing high-resolution imagery, and video analytics workloads that benefit from keeping large datasets in GPU memory. The low 72-watt power envelope and compact form factor make it suitable for edge AI deployments, telecommunications infrastructure, and cloud providers seeking to maximize inference throughput per rack unit while minimizing cooling costs.
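The rack-density claim can be roughed out directly from the 72-watt TDP. The 20% host-overhead factor below is an illustrative assumption covering CPU, fans, and PSU losses attributed per GPU, not a vendor figure.

```python
# Rough estimate of how many L4s fit in a rack power budget,
# given the 72 W TDP. The overhead fraction is an assumption.

L4_TDP_W = 72

def gpus_per_budget(budget_w: float, overhead: float = 0.20) -> int:
    """Upper-bound L4 count for a power budget, with per-GPU host overhead."""
    per_gpu_w = L4_TDP_W * (1 + overhead)
    return int(budget_w // per_gpu_w)

print(gpus_per_budget(10_000))  # a 10 kW rack budget -> 115
```

For comparison, a single 700W-class training GPU consumes the power budget of roughly eight to nine L4s, which is the trade the L4 makes: many low-power inference engines per rack instead of peak per-device compute.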

Full Specifications

Hardware

Manufacturer
NVIDIA
Architecture
Ada Lovelace
CUDA Cores
7,424
Tensor Cores
232
TDP
72W

Memory & Performance

VRAM
24GB
Memory Bandwidth
300 GB/s
FP32
30.29 TFLOPS
FP16
121 TFLOPS
BF16
121 TFLOPS
FP8
121 TFLOPS
Release
2023

Frequently Asked Questions

How much does an L4 cost per hour in the cloud?

L4 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.

What is the L4 best used for?

The L4 excels at AI inference tasks requiring substantial GPU memory, video processing workloads, and deployments where power efficiency is critical. Its 24GB memory capacity makes it suitable for medium-sized language models and computer vision applications, while its 72-watt power draw enables dense server configurations.

How does the L4 compare to the T4 for inference workloads?

The L4 offers 24GB GPU memory compared to the T4's 16GB, fourth-generation Tensor Cores versus the T4's first-generation cores, and improved energy efficiency through the Ada Lovelace architecture. Both maintain similar low-power profiles, but the L4 provides better performance per watt and can handle larger models due to increased memory capacity.