Modal

Serverless GPUs for AI workloads with per-second billing

Inference specialist · 🇺🇸 US

Last reviewed May 8, 2026

Modal is a serverless GPU platform that lets developers run Python functions, jobs and inference endpoints on NVIDIA GPUs with per-second billing and scale-to-zero.

GPU models: 7
Starting price: $0.01 / hour

Available GPUs

Hourly on-demand pricing.

Prices last updated: June 22, 2026

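| GPU Model    | Memory | GPUs | Price / hr | Updated   |
|--------------|--------|------|------------|-----------|
| A10          | 24GB   | 1×   | $0.011/hr  | 6/22/2026 |
| A100 SXM     | 80GB   | 1×   | $0.020/hr  | 6/22/2026 |
| B200         | 192GB  | 1×   | $0.104/hr  | 6/22/2026 |
| H100 SXM     | 80GB   | 1×   | $0.066/hr  | 6/22/2026 |
| H200         | 141GB  | 1×   | $0.076/hr  | 6/22/2026 |
| L40S         | 48GB   | 1×   | $0.018/hr  | 6/22/2026 |
| RTX PRO 6000 | 96GB   | 1×   | $0.045/hr  | 6/22/2026 |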
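
One way to compare the listed GPUs is by price per GB of memory per hour. The sketch below copies the memory sizes and hourly prices from the table above as they are listed; the dictionary name `LISTED` and the helper `price_per_gb_hour` are hypothetical names chosen for illustration, and the ranking says nothing about compute throughput.

```python
# Values copied from the listing above: name -> (memory in GB, USD per hour).
LISTED = {
    "A10": (24, 0.011),
    "A100 SXM": (80, 0.020),
    "B200": (192, 0.104),
    "H100 SXM": (80, 0.066),
    "H200": (141, 0.076),
    "L40S": (48, 0.018),
    "RTX PRO 6000": (96, 0.045),
}


def price_per_gb_hour(memory_gb: float, usd_per_hour: float) -> float:
    """Hourly price divided by memory size, in USD per GB per hour."""
    return usd_per_hour / memory_gb


# GPU names ordered from cheapest to most expensive per GB of memory.
ranked = sorted(LISTED, key=lambda name: price_per_gb_hour(*LISTED[name]))

for name in ranked:
    memory_gb, usd_per_hour = LISTED[name]
    rate = price_per_gb_hour(memory_gb, usd_per_hour)
    print(f"{name:<13} {rate:.6f} USD per GB-hour")
```

A memory-normalized price is only a rough screening metric; it is most useful for workloads that are limited by model size rather than by raw compute.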

Pros & Cons

Advantages

  • Serverless model removes idle-instance costs
  • Per-second billing across the full GPU range
  • Strong fit for inference, batch jobs and ML pipelines

Limitations

  • Long-running, fixed-instance training is not the primary use case
  • Cold starts and storage limits require some application design
  • No bare-metal access; workloads run inside Modal's runtime

Key Features

Serverless GPUs

Run Python functions on NVIDIA GPUs without provisioning instances; cold starts in seconds

Per-Second Billing

Pay for actual GPU runtime at sub-minute granularity, with scale-to-zero by default
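
To illustrate why billing granularity matters for short jobs, here is a minimal sketch that compares a per-second charge with a hypothetical scheme that rounds runtime up to whole hours. Both function names are assumptions made for this example, the rate is the listed H100 SXM hourly price from this page, and the hour-rounded scheme is only a point of comparison, not a description of any specific provider.

```python
import math


def per_second_cost(runtime_s: float, usd_per_hour: float) -> float:
    """Charge only for the seconds actually used."""
    return runtime_s * usd_per_hour / 3600


def hourly_rounded_cost(runtime_s: float, usd_per_hour: float) -> float:
    """Hypothetical comparison: round runtime up to whole hours."""
    return math.ceil(runtime_s / 3600) * usd_per_hour


# A 90-second inference burst at the listed H100 SXM rate.
runtime_s = 90
rate = 0.066  # USD per hour, as listed on this page

print(f"per-second billing: ${per_second_cost(runtime_s, rate):.5f}")
print(f"hour-rounded:       ${hourly_rounded_cost(runtime_s, rate):.5f}")
```

The gap between the two figures grows with the number of short, separate bursts, which is the situation scale-to-zero is meant to serve.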

Container-Native

Define environments in code, with automatic image building and caching

Wide GPU Catalog

From T4 and L4 through A100, L40S, H100, H200 and B200

Pricing Options

| Option                    | Details                                                             |
|---------------------------|---------------------------------------------------------------------|
| Per-Second GPU Billing    | Charged per second of GPU runtime, with scale-to-zero when idle     |
| Free Tier                 | Monthly free credits for experimentation and personal projects      |
| Team and Enterprise Plans | Volume commitments and enterprise support for production deployments |

Availability & Support

Regions

Multi-region availability across North America and Europe

Support

Documentation, community forum, and enterprise support for paid plans

Getting Started

  1. Install the SDK

     Run `pip install modal` and authenticate via the CLI

  2. Define a function

     Decorate a Python function with the desired GPU and image specification

  3. Run or deploy

     Invoke locally or deploy as a long-lived endpoint or scheduled job
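
The three steps above can be sketched as a single file. This is a minimal illustration rather than an official quickstart: the app name `gpu-hello`, the function name `describe_gpu`, and the file name `hello_gpu.py` are hypothetical, and the choice of the L40S GPU and the `torch` package is an assumption made for the example. It requires the `modal` package and an authenticated account (for instance via `modal setup`), so it will not run offline.

```python
import modal

# Hypothetical app name, chosen for this example.
app = modal.App("gpu-hello")

# Container image defined in code; torch is an illustrative dependency.
image = modal.Image.debian_slim().pip_install("torch")


@app.function(gpu="L40S", image=image)
def describe_gpu() -> str:
    # Imported inside the function because torch exists only in the remote image.
    import torch

    return torch.cuda.get_device_name(0)


@app.local_entrypoint()
def main():
    # .remote() runs the function in Modal's cloud rather than on this machine.
    print(describe_gpu.remote())
```

Assuming the file is saved as `hello_gpu.py`, `modal run hello_gpu.py` should execute it once, and `modal deploy hello_gpu.py` should register it as a persistent app.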

Compare Providers

Find the best prices for the same GPUs from other providers

  • Oracle Cloud: 7 shared GPUs with Modal
  • Sesterce: 7 shared GPUs with Modal
  • CoreWeave: 6 shared GPUs with Modal