Mid-tier | Data Center

A16 GPU


The NVIDIA A16 is a multi-GPU card with 4 GPUs designed for graphics-intensive virtual desktop infrastructure (VDI) and cloud gaming.

VRAM: 64GB
CUDA Cores: 2,560
TDP: 250W

From $0.51/hr across 2 providers


Cloud Pricing

Cheapest on Vultr: 7% below average

Provider	Config	Price / hr	Updated
Vultr	1×2×	$0.51/hr	5/23/2026
Vultr	8×	$0.51/hr	5/23/2026
Vultr	4×	$0.51/hr	5/23/2026
Sesterce	1×2×	$0.56/hr	5/23/2026
Sesterce	8×	$0.56/hr	5/23/2026
Sesterce	4×	$0.56/hr	5/23/2026
Vultr	16×	$0.57/hr	5/23/2026
Sesterce	16×	$0.63/hr	5/23/2026


Prices updated daily. Last check: May 23, 2026
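
The "7% below avg" figure can be checked against the table. The sketch below assumes the average is an unweighted mean over all eight listed rows, which is a guess at how the page computes it; the function name `cheapest_vs_average` is illustrative only.

```python
from statistics import mean

# (provider, config, USD per hour), copied from the pricing table above.
LISTINGS = [
    ("Vultr", "1×2×", 0.51),
    ("Vultr", "8×", 0.51),
    ("Vultr", "4×", 0.51),
    ("Sesterce", "1×2×", 0.56),
    ("Sesterce", "8×", 0.56),
    ("Sesterce", "4×", 0.56),
    ("Vultr", "16×", 0.57),
    ("Sesterce", "16×", 0.63),
]


def cheapest_vs_average(listings):
    """Return (cheapest listing, average price, percent below average)."""
    # Assumption: every row counts equally toward the average.
    average = mean(price for _, _, price in listings)
    cheapest = min(listings, key=lambda row: row[2])
    percent_below = (average - cheapest[2]) / average * 100
    return cheapest, average, percent_below


cheapest, average, percent_below = cheapest_vs_average(LISTINGS)
print(
    f"{cheapest[0]} at ${cheapest[2]:.2f}/hr, "
    f"average ${average:.4f}/hr, {percent_below:.1f}% below average"
)
```

Under that assumption the average works out to about $0.551/hr, and the cheapest listing sits roughly 7.5% below it, which is consistent with the rounded figure shown above.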

Performance

FP16: 35 TFLOPS
FP32: 17.5 TFLOPS
Memory Bandwidth: 800 GB/s

Strengths & Limitations

Strengths

64GB total VRAM with ECC support across four GPU dies provides substantial memory capacity for multi-user scenarios
Quad-GPU board design enables support for up to 64 concurrent users per card
Advanced video encoding with H.265, VP9, and AV1 codec support delivers efficient streaming capabilities
Third-generation Tensor Cores provide AI acceleration capabilities for inference workloads
Second-generation RT Cores enable hardware-accelerated ray tracing for graphics applications
PCI Express Gen 4 x16 interface provides high-bandwidth host connectivity
Passive thermal cooling design reduces acoustic noise and cooling complexity

Limitations

250W TDP per card adds up in high-density deployments: eight cards, for example, amount to 2 kW of TDP
2021 release date means it predates newer architectures like Ada Lovelace and Hopper with improved efficiency
Mid-tier performance positioning limits suitability for high-end compute or training workloads
Specialized VDI focus makes it potentially overkill for basic virtualization needs
Ampere architecture lacks features found in newer GPU generations, such as the FP8 Tensor Core support introduced with Ada Lovelace and Hopper

Key Features

• NVIDIA Virtual PC (vPC) support
• NVIDIA RTX Virtual Workstation (vWS) support
• Third-generation Tensor Cores
• Second-generation RT Cores
• H.265, VP9, and AV1 video codec support
• GDDR6 memory with error-correcting code (ECC)
• PCI Express Gen 4 support
• Passive thermal cooling design

About A16

The NVIDIA A16 is a mid-tier server GPU based on the Ampere architecture, designed specifically for virtual desktop infrastructure and multi-user environments. Released in 2021, it uses a quad-GPU board design: four separate GPU dies on a single card, each with 16GB of ECC-protected GDDR6 memory, for a total of 64GB of VRAM. This design makes it a specialized product within NVIDIA's data center lineup, distinct from the compute-focused H100 and the newer GB300 series.

Across its four GPUs, the A16 provides 2,560 CUDA cores, 35 TFLOPS of FP16 performance, 17.5 TFLOPS of FP32 performance, and 800 GB/s of memory bandwidth. The card includes second-generation RT Cores for ray tracing, third-generation Tensor Cores for AI acceleration, and video encoding support for the H.265, VP9, and AV1 codecs. With a 250W TDP and passive cooling, its power requirements are modest compared with high-end compute GPUs.

In cloud deployments, the A16 serves virtual desktop infrastructure providers and organizations that need high-density graphics virtualization. Support for up to 64 concurrent users per board makes it particularly suitable for VDI environments, virtual workstations, and scenarios where many users need dedicated graphics resources without the computational power of larger data center GPUs.
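
Because the headline specifications are board-level totals, it can help to break them down per GPU and per user. The sketch below assumes the totals divide evenly across the four GPUs and that 64 users share the board equally; the `BoardSpec` class and `per_gpu` helper are hypothetical names for illustration, not part of any NVIDIA tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoardSpec:
    """Board-level totals as quoted on this page."""
    gpus: int
    vram_gb: float
    bandwidth_gb_s: float
    fp32_tflops: float
    fp16_tflops: float
    max_users: int


A16 = BoardSpec(gpus=4, vram_gb=64, bandwidth_gb_s=800,
                fp32_tflops=17.5, fp16_tflops=35, max_users=64)


def per_gpu(spec: BoardSpec) -> dict:
    """Split board totals evenly across GPUs (an assumption, not a datasheet value)."""
    return {
        "vram_gb": spec.vram_gb / spec.gpus,
        "bandwidth_gb_s": spec.bandwidth_gb_s / spec.gpus,
        "fp32_tflops": spec.fp32_tflops / spec.gpus,
        "fp16_tflops": spec.fp16_tflops / spec.gpus,
        "users": spec.max_users / spec.gpus,
    }


share = per_gpu(A16)
# Memory available to each user when the board is at its 64-user maximum.
vram_per_user_gb = A16.vram_gb / A16.max_users

for name, value in share.items():
    print(f"{name}: {value:g}")
print(f"vram_per_user_gb at full density: {vram_per_user_gb:g}")
```

On these assumptions each GPU contributes 16GB of memory and 200 GB/s of bandwidth, and a fully loaded board leaves about 1GB of VRAM per user, which is why the maximum density suits light desktop workloads better than heavy graphics work.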

Common Use Cases

The A16 is optimized for virtual desktop infrastructure deployments where multiple users require dedicated graphics resources. Its quad-GPU design and 64GB total VRAM make it well-suited for VDI providers serving knowledge workers, designers using CAD applications, or organizations running graphics-rich virtual desktops. The card's video encoding capabilities and support for up to 64 concurrent users make it effective for virtual workstation environments, remote work scenarios, and educational institutions requiring scalable graphics virtualization. The Tensor Cores also enable AI inference workloads in virtualized environments, though the A16 is not positioned for large-scale training tasks.
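
For a rough sense of how many boards a deployment might need, the sketch below sizes by frame buffer alone. It assumes each user gets a fixed slice of one GPU's 16GB, that a user cannot span GPUs, and that the 64-user ceiling applies per board; the per-user sizes are hypothetical inputs, and real density also depends on the vGPU profiles and licensing in use.

```python
import math

GPUS_PER_BOARD = 4
VRAM_PER_GPU_GB = 16
MAX_USERS_PER_BOARD = 64  # ceiling quoted on this page


def users_per_board(vram_per_user_gb: float) -> int:
    """Users one board can host when each user gets a fixed VRAM slice."""
    if vram_per_user_gb <= 0:
        raise ValueError("vram_per_user_gb must be positive")
    # A user's slice must fit on a single GPU, so size per GPU first.
    per_gpu = math.floor(VRAM_PER_GPU_GB / vram_per_user_gb)
    return min(per_gpu * GPUS_PER_BOARD, MAX_USERS_PER_BOARD)


def boards_needed(users: int, vram_per_user_gb: float) -> int:
    """Smallest number of boards that can host `users` at the given slice size."""
    capacity = users_per_board(vram_per_user_gb)
    if capacity == 0:
        raise ValueError("per-user VRAM exceeds a single GPU's 16GB")
    return math.ceil(users / capacity)


# Hypothetical example: 500 users at two different per-user sizes.
for size_gb in (1, 2):
    print(f"{size_gb}GB per user: {users_per_board(size_gb)} users/board, "
          f"{boards_needed(500, size_gb)} boards for 500 users")
```

Doubling the per-user slice from 1GB to 2GB halves the users per board from 64 to 32, so the same 500 users would need 16 boards instead of 8 under this simple model.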

Full Specifications

Hardware

Manufacturer: NVIDIA
Architecture: Ampere
CUDA Cores: 2,560
TDP: 250W

Memory & Performance

VRAM: 64GB
Memory Bandwidth: 800 GB/s
FP32: 17.5 TFLOPS
FP16: 35 TFLOPS
Release: 2021

Frequently Asked Questions

How much does an A16 cost per hour in the cloud?

A16 pricing varies by provider, region, and commitment level. As of May 23, 2026, the rates listed in the pricing table above range from $0.51/hr on Vultr to $0.63/hr for the 16× configuration on Sesterce.
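
To turn an hourly rate into a budget figure, multiply it by the hours of use. The sketch below assumes continuous use over a 30-day month and on-demand billing with no discounts; `estimated_cost` is an illustrative helper, and the rates are the lowest Vultr and Sesterce prices from the table.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month of continuous use


def estimated_cost(rate_per_hour: float, hours: float, units: int = 1) -> float:
    """Cost in USD for `units` billed instances running for `hours`."""
    if rate_per_hour < 0 or hours < 0 or units < 0:
        raise ValueError("rate, hours, and units must be non-negative")
    return round(rate_per_hour * hours * units, 2)


vultr = estimated_cost(0.51, HOURS_PER_MONTH)
sesterce = estimated_cost(0.56, HOURS_PER_MONTH)
print(f"Vultr: ${vultr:.2f}/month, Sesterce: ${sesterce:.2f}/month")
```

At those rates a single instance running around the clock comes to roughly $367 per month on Vultr and $403 on Sesterce, before any region or commitment adjustments.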

What is the A16 best used for?

The A16 excels in virtual desktop infrastructure and multi-user graphics virtualization scenarios. Its quad-GPU design supports up to 64 concurrent users, making it ideal for VDI deployments, virtual workstations, and organizations requiring scalable graphics resources for remote work or educational environments.

How does the A16 compare to other virtualization-focused GPUs?

The A16's unique quad-GPU board design with 64GB total VRAM provides higher user density than single-GPU solutions like the T4, while its specialized VDI features distinguish it from compute-focused cards like the A100. The Ampere architecture offers better performance per user than previous-generation virtualization GPUs, though newer architectures provide improved efficiency.