Replicate

Run open-source models at scale

Model marketplace | 🇺🇸 US | Tags: inference, models, marketplace

Last reviewed Mar 14, 2026

Replicate is a platform for running machine learning models in the cloud, offering thousands of open-source models with simple API access and pay-per-use pricing.

LLM Models: 6

LLM API Pricing

Pay-per-token pricing. Prices are shown per 1M tokens unless a row gives a per-image (/img) or per-second (/sec) unit.

Prices last updated: May 19, 2026

Creator              Input price    Output price    Updated
Black Forest Labs    $0.025/img     -               5/19/2026
Recraft              $0.040/img     -               5/19/2026
Black Forest Labs    $0.040/img     -               5/19/2026
WaveSpeed            $0.090/sec     -               5/19/2026
Ideogram             $0.090/img     -               5/19/2026
WaveSpeed            $0.250/sec     -               5/19/2026
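
As a rough illustration of how the per-unit prices above turn into a bill, the sketch below multiplies a unit price by a number of units (images, or seconds of output). The function name estimate_cost and the example volumes are hypothetical; only the unit prices come from the table.

```python
from decimal import Decimal


def estimate_cost(unit_price, units):
    """Return the estimated cost in USD for `units` images or seconds.

    `unit_price` is passed as a string (for example "0.025") so that
    Decimal keeps the exact value instead of a binary float approximation.
    """
    return Decimal(unit_price) * Decimal(str(units))


if __name__ == "__main__":
    # Hypothetical workloads using unit prices listed in the table.
    images = estimate_cost("0.025", 1000)  # 1,000 images at $0.025/img
    video = estimate_cost("0.090", 8)      # 8 seconds at $0.090/sec
    print(f"1000 images: ${images:.2f}")
    print(f"8 seconds:   ${video:.2f}")
```

Because the charge scales linearly with volume, a quick estimate like this can help gauge high-volume spend before committing to a workload.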

Pros & Cons

Advantages

  • Largest selection of open-source models on one platform
  • Simple pay-per-prediction pricing with no minimum
  • Easy deployment of custom models via Cog
  • Active community contributing new models daily

Limitations

  • Cold start latency for less popular models
  • Pricing can be unpredictable for high-volume use
  • Less optimized than specialized inference providers

Key Features

Vast Model Library

Access thousands of open-source models including LLMs, image generators, and more

Simple API

Consistent REST API across all models with webhooks for async processing
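
A minimal sketch of what an asynchronous request with a webhook might look like, using only the Python standard library. The endpoint path, the Bearer authorization header, and the webhook and webhook_events_filter fields reflect Replicate's public HTTP API as commonly documented, but treat them as assumptions and check the current API reference. The helper name build_prediction_request, the version string, and the callback URL are hypothetical. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Assumed endpoint for creating predictions; verify against the API reference.
API_URL = "https://api.replicate.com/v1/predictions"


def build_prediction_request(token, version, model_input, webhook_url=None):
    """Build (but do not send) a POST request that creates a prediction."""
    body = {"version": version, "input": model_input}
    if webhook_url:
        # Ask the service to call back when the prediction finishes,
        # instead of polling for the result.
        body["webhook"] = webhook_url
        body["webhook_events_filter"] = ["completed"]
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_prediction_request(
        token="YOUR_API_TOKEN",              # placeholder
        version="MODEL_VERSION_ID",          # placeholder
        model_input={"prompt": "a watercolor fox"},
        webhook_url="https://example.com/replicate-hook",  # hypothetical
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the built request to urllib.request.urlopen would send it; keeping construction separate makes the payload easy to inspect first.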

Custom Model Hosting

Deploy your own models using Cog containerization

Serverless Scaling

Automatic scaling with cold-start optimization

Pricing Options

Option                Details
Pay-per-prediction    Charged per model run based on compute time and hardware
Free tier             Limited free predictions for new users

Availability & Support

Regions

US-based infrastructure with global CDN

Support

Documentation, Discord community, email support

Getting Started

  1. Create an account

    Sign up at replicate.com with GitHub or email

  2. Get API token

    Copy your API token from account settings

  3. Run a prediction

    Use the API or Python client to run any model
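
The steps above can be sketched in Python roughly as follows. This assumes the official client is installed (pip install replicate) and that the token from account settings has been exported as the REPLICATE_API_TOKEN environment variable, which the client reads by default. The helper names get_api_token and run_prediction, the model reference "owner/model-name", and the prompt are placeholders, not values from this page.

```python
import os


def get_api_token(env=os.environ):
    """Return the API token, assuming it is stored in REPLICATE_API_TOKEN."""
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "Set REPLICATE_API_TOKEN to the token copied from account settings"
        )
    return token


def run_prediction(model_ref, model_input):
    """Run a model through the Python client and return its output."""
    get_api_token()  # fail early with a clear message if the token is missing
    import replicate  # imported lazily so get_api_token works without the package

    return replicate.run(model_ref, input=model_input)


if __name__ == "__main__":
    # "owner/model-name" is a placeholder; substitute a real model reference.
    output = run_prediction("owner/model-name", {"prompt": "an astronaut riding a horse"})
    print(output)
```

Checking the token before the call surfaces a missing or empty variable immediately rather than as an authentication error from the service.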