What GPU types does Perplexity offer?

Perplexity offers various GPU types including . Check the pricing table above for current availability and pricing.

How do I get started with Perplexity?

Create an account, Generate API key, Choose a model, Make first API call

What are Perplexity's main advantages?

Perplexity's main advantages include: Unique search-augmented generation with real-time web data, Built-in citations reduce hallucination risk, OpenAI-compatible API for easy integration, Competitive pricing on open-source model hosting, No need to build your own RAG pipeline for web-grounded answers, Fast inference speeds on Sonar models.

What are Perplexity's limitations?

Perplexity's main limitations include: Smaller model selection than general-purpose platforms, Search-augmented models have higher per-request costs, Less mature enterprise offering compared to larger providers, Limited fine-tuning options.

Perplexity

Search-augmented AI models with real-time web grounding

Inference specialist🇺🇸 USinferencesearchgrounding

Last reviewed Mar 14, 2026

Perplexity is an AI company offering inference APIs powered by their Sonar models, which combine large language models with real-time web search. Their API provides grounded, citation-backed responses alongside standard chat completion capabilities using popular open-source models.

Visit Perplexity Documentation

LLM Models

From / 1M input

$0.25

LLM API Pricing

Pay-per-token pricing. Prices shown per 1M tokens.

Prices last updated: July 7, 2026

Model	Creator	Context	Input/1M	Output/1M	Updated
GPT-5 mini	OpenAI	200K	$0.250	$2.00	7/6/2026
Gemini 3.1 Flash Lite	Google	1.0M	$0.250	$1.50	7/6/2026
Sonar	Perplexity	127K	$1.00	$1.00	7/7/2026
Sonar Reasoning Pro	Perplexity	128K	$2.00	$8.00	7/7/2026
Sonar Deep Research	Perplexity	128K	$2.00	$8.00	7/7/2026
GPT-5.4	OpenAI	128K	$2.50	$15.00	7/6/2026
Sonar Pro	Perplexity	200K	$3.00	$15.00	7/7/2026

Pros & Cons

Advantages

Unique search-augmented generation with real-time web data
Built-in citations reduce hallucination risk
OpenAI-compatible API for easy integration
Competitive pricing on open-source model hosting
No need to build your own RAG pipeline for web-grounded answers
Fast inference speeds on Sonar models

Limitations

Smaller model selection than general-purpose platforms
Search-augmented models have higher per-request costs
Less mature enterprise offering compared to larger providers
Limited fine-tuning options

Key Features

Sonar Search Models

Proprietary models combining LLM reasoning with real-time web search and citations

Online Grounding

Responses include inline citations and sources from live web data

OpenAI-Compatible API

Drop-in replacement using standard chat completions format

Open-Source Models

Access to Llama and other open-source models alongside Sonar

Structured Outputs

JSON mode and structured output support for reliable parsing

Search Domain Filtering

Restrict or focus web search to specific domains for targeted results

Pricing Options

Option	Details
Pay-per-token	Per million token pricing with separate input and output rates
Search pricing	Per-request pricing for search-augmented Sonar queries

Availability & Support

Regions

Global availability via cloud infrastructure

Support

API documentation, Discord community, email support

Getting Started

1
Create an account
Sign up at perplexity.ai and navigate to API settings
2
Generate API key
Create an API key from your account settings
3
Choose a model
Select from Sonar (search-augmented) or standard open-source models
4
Make first API call
Use the OpenAI-compatible chat completions endpoint