Perplexity
Perplexity is an AI company offering inference APIs powered by their Sonar models, which combine large language models with real-time web search. Their API provides grounded, citation-backed responses alongside standard chat completion capabilities using popular open-source models.
Key Features
- Sonar Search Models
- Proprietary models combining LLM reasoning with real-time web search and citations
- Online Grounding
- Responses include inline citations and sources from live web data
- OpenAI-Compatible API
- Drop-in replacement using standard chat completions format
- Open-Source Models
- Access to Llama and other open-source models alongside Sonar
- Structured Outputs
- JSON mode and structured output support for reliable parsing
- Search Domain Filtering
- Restrict or focus web search to specific domains for targeted results
Provider Comparison
Advantages
- Unique search-augmented generation with real-time web data
- Built-in citations reduce hallucination risk
- OpenAI-compatible API for easy integration
- Competitive pricing on open-source model hosting
- No need to build your own RAG pipeline for web-grounded answers
- Fast inference speeds on Sonar models
Limitations
- Smaller model selection than general-purpose platforms
- Search-augmented models have higher per-request costs
- Less mature enterprise offering compared to larger providers
- Limited fine-tuning options
Compute Services
Pricing Options
| Option | Details |
|---|---|
| Pay-per-token | Per million token pricing with separate input and output rates |
| Search pricing | Per-request pricing for search-augmented Sonar queries |
Getting Started
1
Create an account
Sign up at perplexity.ai and navigate to API settings
2
Generate API key
Create an API key from your account settings
3
Choose a model
Select from Sonar (search-augmented) or standard open-source models
4
Make first API call
Use the OpenAI-compatible chat completions endpoint