L4 GPU
The NVIDIA L4 is a versatile data center GPU optimized for AI inference, video processing, and graphics workloads.

Cloud Pricing
Cheapest on Seeweb, 59% below average. Prices updated daily. Last check: 4/8/2026
Performance
Strengths & Limitations
- 24GB GPU memory enables deployment of larger AI models that require substantial memory buffers
- 72-watt maximum power consumption allows high-density server deployments with minimal cooling requirements
- Fourth-generation Tensor Cores provide hardware acceleration for modern AI inference workloads
- Single-slot low-profile form factor fits in space-constrained server chassis
- 300 GB/s memory bandwidth supports memory-intensive video processing and inference tasks
- Ada Lovelace architecture delivers improved energy efficiency compared to previous generation server GPUs
- PCIe Gen4 x16 interface provides 64 GB/s of bidirectional host connectivity for data transfer
- Entry-tier positioning limits raw compute performance compared to higher-end data center GPUs
- 232 Tensor Cores may be insufficient for training large models or compute-intensive inference workloads
- Single GPU configuration lacks multi-GPU interconnect technologies like NVLink
- Ada Lovelace architecture represents previous generation compared to newer Blackwell-based offerings
- Limited to PCIe connectivity without high-speed GPU-to-GPU communication capabilities
About L4
Common Use Cases
The L4 is well-suited for AI inference deployments requiring substantial memory capacity within power-constrained environments. Its 24GB memory buffer makes it appropriate for deploying medium-sized language models, computer vision applications processing high-resolution imagery, and video analytics workloads that benefit from keeping large datasets in GPU memory. The low 72-watt power envelope and compact form factor make it suitable for edge AI deployments, telecommunications infrastructure, and cloud providers seeking to maximize inference throughput per rack unit while minimizing cooling costs.
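As a rough illustration of the memory-capacity point above, the sketch below estimates whether a model's weights fit in the L4's 24GB buffer from parameter count and weight precision. The 20% overhead margin for activations and runtime buffers is an assumption for illustration, not a measured figure:

```python
def fits_in_vram(params_billion: float, bytes_per_param: int,
                 vram_gb: float = 24.0, overhead: float = 0.2) -> bool:
    """Rough check: model weights plus an assumed overhead margin vs. VRAM.

    Uses the convenient identity that 1 billion parameters at N bytes each
    occupy about N GB. The 20% overhead factor is an illustrative assumption.
    """
    weights_gb = params_billion * bytes_per_param
    return weights_gb * (1 + overhead) <= vram_gb

# A 13B-parameter model in FP16 (2 bytes/param) needs ~26 GB of weights alone:
print(fits_in_vram(13, 2))  # does not fit in 24 GB
# The same model quantized to 8-bit (1 byte/param) needs ~15.6 GB with overhead:
print(fits_in_vram(13, 1))  # fits
```

This is why the page describes the L4 as suited to medium-sized language models: 7B-class models fit comfortably in FP16, while 13B-class models generally require quantization.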
Full Specifications
Hardware
- Manufacturer
- NVIDIA
- Architecture
- Ada Lovelace
- CUDA Cores
- 7,424
- Tensor Cores
- 232
- TDP
- 72W
Memory & Performance
- VRAM
- 24GB
- Memory Bandwidth
- 300 GB/s
- FP32
- 30.29 TFLOPS
- FP16
- 121 TFLOPS
- BF16
- 121 TFLOPS
- FP8
- 242 TFLOPS
- Release
- 2023
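The FP32 figure in the table can be sanity-checked from the core count: peak FP32 throughput is CUDA cores × 2 FLOPs per clock (one fused multiply-add) × boost clock. The ~2.04 GHz boost clock used here is an assumption based on public L4 listings, since the table above does not include clock speeds:

```python
cuda_cores = 7424
boost_clock_ghz = 2.04          # assumed boost clock; not listed in the spec table
flops_per_core_per_clock = 2    # one fused multiply-add counts as 2 FLOPs

peak_fp32_tflops = cuda_cores * flops_per_core_per_clock * boost_clock_ghz / 1e3
print(f"{peak_fp32_tflops:.2f} TFLOPS")  # prints "30.29 TFLOPS", matching the table
```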
Frequently Asked Questions
How much does an L4 cost per hour in the cloud?
L4 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
What is the L4 best used for?
The L4 excels at AI inference tasks requiring substantial GPU memory, video processing workloads, and deployments where power efficiency is critical. Its 24GB memory capacity makes it suitable for medium-sized language models and computer vision applications, while its 72-watt power draw enables dense server configurations.
How does the L4 compare to the T4 for inference workloads?
The L4 offers 24GB GPU memory compared to the T4's 16GB, fourth-generation Tensor Cores versus the T4's second-generation Turing cores, and improved energy efficiency through the Ada Lovelace architecture. Both maintain similar low-power profiles, but the L4 provides better performance per watt and can handle larger models due to its increased memory capacity.
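To make the performance-per-watt comparison concrete, this sketch divides dense FP16 Tensor Core throughput by TDP for both cards. The T4 figures (~65 TFLOPS at 70 W) are approximate values from NVIDIA's published T4 specs, not from the table above:

```python
# Dense FP16 Tensor Core throughput per watt, L4 vs. T4.
cards = {
    "L4": {"fp16_tflops": 121, "tdp_w": 72},
    "T4": {"fp16_tflops": 65, "tdp_w": 70},  # approximate T4 datasheet values
}

for name, spec in cards.items():
    ratio = spec["fp16_tflops"] / spec["tdp_w"]
    print(f"{name}: {ratio:.2f} TFLOPS/W")  # L4 ≈ 1.68, T4 ≈ 0.93
```

By this rough measure the L4 delivers close to twice the FP16 inference throughput per watt of the T4 at a nearly identical power envelope.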