L4 GPU
The NVIDIA L4 is a versatile data center GPU optimized for AI inference, video processing, and graphics workloads.

Cloud Pricing
Cheapest on Seeweb, 59% below average. Prices updated daily. Last check: 4/8/2026
Performance
Strengths & Limitations
- 24GB GPU memory enables deployment of larger AI models that require substantial memory buffers
- 72-watt maximum power consumption allows high-density server deployments with minimal cooling requirements
- Fourth-generation Tensor Cores provide hardware acceleration for modern AI inference workloads
- Single-slot low-profile form factor fits in space-constrained server chassis
- 300 GB/s memory bandwidth supports memory-intensive video processing and inference tasks
- Ada Lovelace architecture delivers improved energy efficiency compared to previous generation server GPUs
- PCIe Gen4 x16 interface provides 64 GB/s of bidirectional host connectivity for data transfer
- Entry-tier positioning limits raw compute performance compared to higher-end data center GPUs
- 232 Tensor Cores may be insufficient for training large models or compute-intensive inference workloads
- Single GPU configuration lacks multi-GPU interconnect technologies like NVLink
- Ada Lovelace architecture represents previous generation compared to newer Blackwell-based offerings
- Limited to PCIe connectivity without high-speed GPU-to-GPU communication capabilities
About L4
Common Use Cases
The L4 is well-suited for AI inference deployments requiring substantial memory capacity within power-constrained environments. Its 24GB memory buffer makes it appropriate for deploying medium-sized language models, computer vision applications processing high-resolution imagery, and video analytics workloads that benefit from keeping large datasets in GPU memory. The low 72-watt power envelope and compact form factor make it suitable for edge AI deployments, telecommunications infrastructure, and cloud providers seeking to maximize inference throughput per rack unit while minimizing cooling costs.
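As a rough illustration of the memory-capacity point above, the sketch below estimates whether a model's weights fit in the L4's 24GB buffer from parameter count and weight precision. The 20% overhead margin for activations and runtime buffers is an assumption for illustration, not a measured figure:

```python
def fits_in_vram(params_billion: float, bytes_per_param: int,
                 vram_gb: float = 24.0, overhead: float = 0.2) -> bool:
    """Rough check: model weights plus an assumed overhead margin vs. VRAM.

    Uses the convenient identity that 1 billion parameters at N bytes each
    occupy about N GB. The 20% overhead factor is an illustrative assumption.
    """
    weights_gb = params_billion * bytes_per_param
    return weights_gb * (1 + overhead) <= vram_gb

# A 13B-parameter model in FP16 (2 bytes/param) needs ~26 GB of weights alone:
print(fits_in_vram(13, 2))  # does not fit in 24 GB
# The same model quantized to 8-bit (1 byte/param) needs ~15.6 GB with overhead:
print(fits_in_vram(13, 1))  # fits
```

This is why the page describes the L4 as suited to medium-sized language models: 7B-class models fit comfortably in FP16, while 13B-class models generally require quantization.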
Full Specifications
Hardware
- Manufacturer
- NVIDIA
- Architecture
- Ada Lovelace
- CUDA Cores
- 7,424
- Tensor Cores
- 232
- TDP
- 72W
Memory & Performance
- VRAM
- 24GB
- Memory Bandwidth
- 300 GB/s
- FP32
- 30.29 TFLOPS
- FP16
- 121 TFLOPS
- BF16
- 121 TFLOPS
- FP8
- 242 TFLOPS
- Release
- 2023
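The FP32 figure in the table can be sanity-checked from the core count: peak FP32 throughput is CUDA cores × 2 FLOPs per clock (one fused multiply-add) × boost clock. The ~2.04 GHz boost clock used here is an assumption based on public L4 listings, since the table above does not include clock speeds:

```python
cuda_cores = 7424
boost_clock_ghz = 2.04          # assumed boost clock; not listed in the spec table
flops_per_core_per_clock = 2    # one fused multiply-add counts as 2 FLOPs

peak_fp32_tflops = cuda_cores * flops_per_core_per_clock * boost_clock_ghz / 1e3
print(f"{peak_fp32_tflops:.2f} TFLOPS")  # prints "30.29 TFLOPS", matching the table
```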
Frequently Asked Questions
How much does an L4 cost per hour in the cloud?
L4 pricing varies by provider, region, and commitment level. Check the pricing table above for current rates across all providers.
What is the L4 best used for?
The L4 excels at AI inference tasks requiring substantial GPU memory, video processing workloads, and deployments where power efficiency is critical. Its 24GB memory capacity makes it suitable for medium-sized language models and computer vision applications, while its 72-watt power draw enables dense server configurations.
How does the L4 compare to the T4 for inference workloads?
The L4 offers 24GB GPU memory compared to the T4's 16GB, fourth-generation Tensor Cores versus the T4's second-generation Turing cores, and improved energy efficiency through the Ada Lovelace architecture. Both maintain similar low-power profiles, but the L4 provides better performance per watt and can handle larger models due to its increased memory capacity.
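To make the performance-per-watt comparison concrete, this sketch divides dense FP16 Tensor Core throughput by TDP for both cards. The T4 figures (~65 TFLOPS at 70 W) are approximate values from NVIDIA's published T4 specs, not from the table above:

```python
# Dense FP16 Tensor Core throughput per watt, L4 vs. T4.
cards = {
    "L4": {"fp16_tflops": 121, "tdp_w": 72},
    "T4": {"fp16_tflops": 65, "tdp_w": 70},  # approximate T4 datasheet values
}

for name, spec in cards.items():
    ratio = spec["fp16_tflops"] / spec["tdp_w"]
    print(f"{name}: {ratio:.2f} TFLOPS/W")  # L4 ≈ 1.68, T4 ≈ 0.93
```

By this rough measure the L4 delivers close to twice the FP16 inference throughput per watt of the T4 at a nearly identical power envelope.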