Deep Infra Deep Infra

Deep Infra offers serverless AI APIs and dedicated GPU rentals with fast SSH access and low hourly pricing across flagship NVIDIA accelerators.

Key Features

Serverless Model APIs
OpenAI-compatible endpoints for 100+ models with autoscaling and pay-per-token billing
Dedicated GPU Rentals
B200 instances with SSH access spin up in about 10 seconds and bill hourly
Custom LLM Deployments
Deploy your own Hugging Face models onto dedicated A100, H100, H200, or B200 GPUs
Transparent GPU Pricing
Published per-GPU rates: A100 $0.89/hr, H100 $1.69/hr, H200 $1.99/hr, B200 $2.49/hr promo
Inference-Optimized Hardware
All hosted models run on H100 or A100 hardware tuned for low latency

Provider Comparison

Advantages

  • Simple OpenAI-compatible API alongside controllable GPU rentals
  • Competitive hourly rates for flagship NVIDIA GPUs including B200 promo pricing
  • Fast provisioning with SSH access for dedicated instances
  • Supports custom deployments in addition to hosted public models

Limitations

  • Region list is not clearly published in the public marketing pages
  • Primarily focused on inference and GPU rentals rather than broader cloud services
  • B200 promo pricing is time-limited per site note

Available GPUs

GPU Modelโ†‘
Memory
Hourly Price
A100 SXM
80GB$1.50/hr
H100
80GB$2.40/hr
H200
141GB$3.00/hr

Compute Services

Serverless Inference

Hosted model APIs with autoscaling on H100/A100 hardware.

Features

  • OpenAI-compatible REST API surface
  • Runs 100+ public models with pay-per-token pricing
  • Autoscaling for low latency without manual instance management

Dedicated GPU Instances

On-demand GPU nodes with SSH access for custom workloads.

Pricing Options

OptionDetails
Serverless pay-per-tokenOpenAI-compatible inference APIs with pay-per-request billing on H100/A100 hardware
Dedicated GPU hourly ratesPublished pricing: A100 $0.89/hr, H100 $1.69/hr, H200 $1.99/hr, B200 $2.49/hr promo (then $4.49/hr)
B200 GPU rentalsSSH-accessible B200 nodes with flexible hourly billing and promo pricing noted on the site

Getting Started

1

Create an account

Sign up (GitHub-supported) and open the Deep Infra dashboard

2

Enable billing

Add a payment method to unlock GPU rentals and API usage

3

Pick a GPU option

Choose serverless APIs or dedicated A100, H100, H200, or B200 instances

4

Launch and connect

Start instances with SSH access or call the OpenAI-compatible API endpoints

5

Monitor usage

Track spend and instance status from the dashboard and shut down when idle

Compare Providers

Find the best prices for the same GPUs from other providers

CoreWeave logo

CoreWeave

3 shared GPUs with Deep Infra

RunPod logo

RunPod

3 shared GPUs with Deep Infra

Amazon AWS logo

Amazon AWS

3 shared GPUs with Deep Infra