
Perplexity
Search-augmented AI models with real-time web grounding
Perplexity is an AI company offering inference APIs powered by their Sonar models, which combine large language models with real-time web search. Their API provides grounded, citation-backed responses alongside standard chat completion capabilities using popular open-source models.
We're actively tracking prices for Perplexity. Check back soon, or browse other providers with current pricing.
Pros & Cons
Advantages
- Unique search-augmented generation with real-time web data
- Built-in citations reduce hallucination risk
- OpenAI-compatible API for easy integration
- Competitive pricing on open-source model hosting
- No need to build your own RAG pipeline for web-grounded answers
- Fast inference speeds on Sonar models
Limitations
- Smaller model selection than general-purpose platforms
- Search-augmented models have higher per-request costs
- Less mature enterprise offering compared to larger providers
- Limited fine-tuning options
Key Features
Sonar Search Models
Proprietary models combining LLM reasoning with real-time web search and citations
Online Grounding
Responses include inline citations and sources from live web data
OpenAI-Compatible API
Drop-in replacement using standard chat completions format
Open-Source Models
Access to Llama and other open-source models alongside Sonar
Structured Outputs
JSON mode and structured output support for reliable parsing
Search Domain Filtering
Restrict or focus web search to specific domains for targeted results
Pricing Options
| Option | Details |
|---|---|
| Pay-per-token | Per million token pricing with separate input and output rates |
| Search pricing | Per-request pricing for search-augmented Sonar queries |
Availability & Support
Regions
Global availability via cloud infrastructure
Support
API documentation, Discord community, email support
Getting Started
- 1
Create an account
Sign up at perplexity.ai and navigate to API settings
- 2
Generate API key
Create an API key from your account settings
- 3
Choose a model
Select from Sonar (search-augmented) or standard open-source models
- 4
Make first API call
Use the OpenAI-compatible chat completions endpoint