GPU Models
Browse 66 GPU models — compare specs, pricing, and cloud availability
H100 SXM
ultra · 80GB · Hopper · The H100 SXM targets large-scale AI training workloads, particularly for language models up to 70 billion parameters where its 80GB memory capacity and high memory bandwidth prove essential. Its 990 TFLOPS FP16 performance and Transformer Engine make it well-suited for training and fine-tuning transformer-based models, while the substantial CUDA core count supports traditional HPC simulations and scientific computing. The MIG capability enables cloud providers to partition the GPU for multiple concurrent workloads, making it valuable for multi-tenant AI inference serving and development environments.
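As a quick check of whether a cloud instance has MIG enabled, the mode can be queried at runtime through NVML. A minimal sketch using the pynvml bindings (the nvidia-ml-py package is an assumed dependency; any NVML wrapper would work):

```python
import pynvml

# Probe MIG mode on GPU 0. MIG is only reported on supporting parts
# (A100/H100 class); other GPUs raise "not supported".
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
try:
    current, _pending = pynvml.nvmlDeviceGetMigMode(handle)
    print("MIG enabled" if current else "MIG disabled")
except pynvml.NVMLError_NotSupported:
    print("This GPU does not support MIG")
finally:
    pynvml.nvmlShutdown()
```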
A100 SXM
ultra · 80GB · Ampere · The A100 SXM is well-suited for AI training workloads requiring substantial memory capacity, particularly large language models and computer vision tasks that benefit from the 80GB memory configuration. Deep learning inference applications with high throughput requirements can leverage the 312 TFLOPS FP16 performance and 624 TOPS INT8 capability. High-performance computing applications in scientific research, financial modeling, and data analytics benefit from the combination of CUDA cores and memory bandwidth. Multi-tenant cloud environments can utilize MIG technology to partition the GPU into smaller instances, maximizing resource utilization while maintaining workload isolation.
H200
ultra · 141GB · Hopper · The H200 is designed for memory-intensive AI workloads, particularly large language model training and inference where the 141GB HBM3E memory capacity enables handling of models that exceed the memory limits of previous generations. Its high memory bandwidth of 4.8TB/s makes it suitable for generative AI applications, recommendation systems, and natural language processing tasks that require rapid access to large datasets. The substantial Tensor Core count and FP16 performance capabilities also position it for AI training workflows, scientific computing applications, and high-performance computing tasks that can leverage its memory subsystem advantages.
L40S
high · 48GB · Ada Lovelace · The L40S is well-suited for organizations requiring combined AI and graphics capabilities in cloud environments. Its 48GB memory capacity and Transformer Engine make it effective for large language model inference, generative AI applications, and medium-scale training workloads. The inclusion of RT Cores and DLSS 3 support enables professional rendering, architectural visualization, and content creation workflows. The GPU's 24/7 data center design makes it appropriate for production AI inference services, while its dual-purpose nature serves environments running NVIDIA Omniverse for collaborative 3D workflows alongside AI applications.
B200
ultra · 192GB · Blackwell · The B200 is designed for large-scale AI training and inference workloads that require substantial memory capacity and compute throughput. Its 192 GB VRAM makes it suitable for training large language models, processing extensive recommendation system datasets, and running memory-intensive scientific computing applications. The high INT8 performance and FP8 Tensor Core support optimize it for AI inference scenarios, while the substantial FP16 capability handles training workloads effectively. Organizations deploying chatbots, large language models, or complex AI pipelines benefit from the B200's combination of memory capacity and computational performance, particularly when workloads exceed the capabilities of lower-tier accelerators.
RTX A6000
high · 48GB · Ampere · The RTX A6000 addresses professional workloads requiring substantial memory capacity and mixed graphics-compute capabilities. Its 48GB GDDR6 memory and ECC support make it suitable for large-scale CAD modeling, architectural visualization, and scientific simulation where data integrity matters. The combination of RT cores and Tensor cores enables real-time ray tracing in content creation pipelines alongside AI-accelerated rendering workflows. In machine learning contexts, the substantial memory capacity supports training of moderately sized models or inference on large models that exceed the memory limits of consumer GPUs, while NVLink allows two cards to be paired for a combined 96GB for even larger workloads.
A100 PCIe
ultra · 40GB · Ampere · The A100 PCIe is suited for AI training workloads requiring substantial memory capacity, particularly for natural language processing models, computer vision training, and recommendation systems that benefit from the 40GB memory buffer. Its Multi-Instance GPU capability makes it effective for inference serving scenarios where multiple smaller models can run simultaneously on partitioned resources. High-performance computing applications including scientific simulations, computational fluid dynamics, and molecular modeling leverage its FP32 and FP64 compute capabilities, while the PCIe form factor ensures compatibility with existing data center infrastructure without requiring specialized NVLink fabric investments.
L40
high · 48GB · Ada Lovelace · The L40 is well-suited for professional workloads that demand large memory capacity and a mix of graphics and compute. Its 48GB of ECC memory makes it appropriate for training medium to large AI models, running inference on memory-intensive models, and supporting virtualized workstation environments where multiple users share GPU resources. The combination of RT Cores and substantial VRAM supports 3D rendering, architectural visualization, and content creation workflows. Data science applications benefit from the large memory when processing extensive datasets, while the enterprise-grade design supports 24/7 cloud deployment scenarios requiring reliability and security features.
RTX 6000 Ada
high · 48GB · Ada Lovelace · The RTX 6000 Ada targets professional visualization, 3D rendering, and content creation workflows that benefit from its 48GB memory capacity and graphics acceleration features. Its combination of CUDA cores, RT Cores, and substantial VRAM makes it well-suited for architectural visualization, product design, video editing, and moderate-scale AI development. The ECC memory support and professional drivers also make it appropriate for technical computing applications requiring data integrity, while the AV1 encoding capabilities support modern video streaming and content creation pipelines.
L4
entry · 24GB · Ada Lovelace · The L4 is well-suited for AI inference deployments requiring substantial memory capacity within power-constrained environments. Its 24GB memory buffer makes it appropriate for deploying medium-sized language models, computer vision applications processing high-resolution imagery, and video analytics workloads that benefit from keeping large datasets in GPU memory. The low 72-watt power envelope and compact form factor make it suitable for edge AI deployments, telecommunications infrastructure, and cloud providers seeking to maximize inference throughput per rack unit while minimizing cooling costs.
Browse by Category
Explore GPUs grouped by what matters for your workload
By Manufacturer
Filter by GPU maker
By Architecture
Filter by chip generation
By VRAM
Filter by memory capacity
By Performance
Filter by compute tier
By Type
Datacenter vs consumer
Frequently Asked Questions
What is a cloud GPU and how does GPU cloud computing work?
A cloud GPU is a graphics processing unit hosted in a remote datacenter that you can rent by the hour. Instead of purchasing hardware, you access GPU compute through a cloud provider's API or virtual machine. This lets you scale up for training large AI models or running inference workloads, and scale down when you're done — paying only for the time you use.
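The billing model is simple multiplication: price per GPU-hour, times GPU count, times wall-clock hours. A back-of-the-envelope sketch (the rates below are hypothetical placeholders, not quotes from any provider):

```python
# Hypothetical on-demand rates in USD per GPU-hour (placeholders only).
HOURLY_RATE_USD = {"H100": 3.50, "A100-80GB": 1.80, "L4": 0.40}

def run_cost(gpu: str, num_gpus: int, hours: float) -> float:
    """On-demand cost of a run: rate per GPU-hour x GPUs x hours."""
    return HOURLY_RATE_USD[gpu] * num_gpus * hours

print(f"${run_cost('H100', 8, 12):,.2f}")  # 8x H100 for 12 hours -> $336.00
```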
How many GPU models are available for cloud rental?
We track 66 GPU models across 39 cloud providers. These span NVIDIA, AMD, and Intel hardware, covering architectures including Ampere, Blackwell, Hopper, Ada Lovelace, and Turing. Each GPU has different VRAM, compute throughput, and pricing — use the filters above to narrow down what fits your workload.
What GPU do I need for machine learning training?
It depends on model size. For training models under 7B parameters, a 24 GB GPU like the RTX 4090 or A10 is sufficient. For 7B–70B parameter models, you need 48–80 GB VRAM — the L40S, A100, or H100 are common choices. For training models above 70B parameters, you need multiple ultra-tier GPUs (H100, H200, MI300X) with NVLink interconnects for multi-GPU communication.
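Those rules of thumb are easy to encode. A small helper, purely as a sketch of the thresholds above:

```python
def suggest_gpu(params_billion: float) -> str:
    """Map model size (billions of parameters) to a training GPU class,
    following the rough thresholds described above."""
    if params_billion < 7:
        return "24 GB card (e.g. RTX 4090 or A10)"
    if params_billion <= 70:
        return "48-80 GB card (e.g. L40S, A100, or H100)"
    return "multiple ultra-tier GPUs with NVLink (H100, H200, MI300X)"

print(suggest_gpu(13))   # 48-80 GB card (e.g. L40S, A100, or H100)
print(suggest_gpu(180))  # multiple ultra-tier GPUs with NVLink (...)
```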
What is the difference between NVIDIA, AMD, and Intel GPUs for AI?
NVIDIA dominates with the CUDA ecosystem and widest cloud availability. Their Hopper (H100/H200) and Blackwell (B200/GB200) architectures lead in training throughput. AMD Instinct accelerators (MI300X) compete on VRAM capacity (192 GB) and price-performance using the ROCm software stack. Intel offers Gaudi accelerators optimized for deep learning and Max Series (Xe HPC) for scientific computing, both at competitive price points.
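In practice, the vendor split matters less at the framework level than it looks: PyTorch's ROCm build exposes AMD GPUs through the same torch.cuda namespace, so one script can detect either vendor. A minimal sketch (Intel Gaudi uses its own framework plugin and is not covered here):

```python
import torch

# PyTorch's ROCm build reuses the torch.cuda API, so this runs unchanged
# on NVIDIA (CUDA) and AMD (ROCm) installs.
if torch.cuda.is_available():
    backend = "ROCm" if torch.version.hip else "CUDA"
    print(f"{backend} device: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA/ROCm-visible GPU")
```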
What does GPU architecture mean and why does it matter?
GPU architecture refers to the chip design generation — it determines compute capabilities, supported precision formats, memory type, and power efficiency. For example, NVIDIA's Hopper architecture introduced FP8 Transformer Engines for faster AI training, while Blackwell added FP4 for doubled inference throughput. Newer architectures generally offer better performance per watt and per dollar. Browse by architecture to compare GPUs within the same generation.
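One practical consequence: the architecture generation is visible at runtime as the CUDA compute capability, which is a quick way to confirm which generation a cloud instance actually gave you. A minimal sketch with PyTorch:

```python
import torch

# Compute capability -> architecture generation (per NVIDIA's CUDA docs):
# 8.0/8.6 Ampere, 8.9 Ada Lovelace, 9.0 Hopper.
ARCH = {(8, 0): "Ampere", (8, 6): "Ampere",
        (8, 9): "Ada Lovelace", (9, 0): "Hopper"}

major, minor = torch.cuda.get_device_capability(0)
print(f"sm_{major}{minor}: {ARCH.get((major, minor), 'unknown or newer')}")
```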
How much VRAM do I need?
VRAM determines how large a model you can fit in GPU memory. At FP16 precision, each billion parameters requires roughly 2 GB of VRAM. So a 7B model needs ~14 GB, a 13B model needs ~26 GB, and a 70B model needs ~140 GB. Quantization (INT8, INT4) can halve or quarter these requirements. For inference with quantized models, 24 GB handles up to 30B parameters, 48 GB handles up to 70B, and 80 GB+ is needed for larger models at full precision.
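That arithmetic is worth encoding as a sanity check. A rough estimator for weight memory only (activations, KV cache, and optimizer state add more on top):

```python
def weight_vram_gb(params_billion: float, bits: int = 16) -> float:
    """Rough GPU memory for model weights alone:
    parameters x (bits / 8) bytes per parameter."""
    return params_billion * bits / 8

print(weight_vram_gb(7))      # 14.0 GB at FP16
print(weight_vram_gb(70))     # 140.0 GB at FP16
print(weight_vram_gb(70, 4))  # 35.0 GB at INT4
```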
What are GPU performance tiers (entry, mid, high, ultra)?
Performance tiers group GPUs by their compute capability and typical use case. Entry-tier GPUs (T4, RTX 4060) suit prototyping and light inference. Mid-tier GPUs (A10G, RTX 3080) handle production inference and fine-tuning. High-tier GPUs (L40S, RTX 4090) serve demanding inference and medium-scale training. Ultra-tier GPUs (H100, A100 80GB, MI300X) are built for large-scale distributed training and high-throughput serving.
What is the difference between server GPUs and consumer GPUs?
Server GPUs (A100, H100, L40S) use ECC memory for data integrity, support NVLink for multi-GPU scaling, have higher sustained power budgets, and include enterprise driver support. Consumer GPUs (RTX 4090, RTX 3090) use GDDR memory, lack NVLink, and are designed for desktop use — but they can still be cost-effective for inference and smaller training jobs where ECC and multi-GPU interconnects aren't required.
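The ECC difference is observable at runtime: server parts report an ECC mode through NVML, while consumer GeForce cards typically return "not supported". A minimal sketch using the pynvml bindings (nvidia-ml-py assumed installed):

```python
import pynvml

# Probe ECC mode on GPU 0; consumer cards usually raise NotSupported.
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
try:
    current, _pending = pynvml.nvmlDeviceGetEccMode(handle)
    print("ECC enabled" if current else "ECC supported but disabled")
except pynvml.NVMLError_NotSupported:
    print("No ECC mode (likely a consumer GPU)")
finally:
    pynvml.nvmlShutdown()
```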