Perplexity Perplexity

Perplexity is an AI company offering inference APIs powered by their Sonar models, which combine large language models with real-time web search. Their API provides grounded, citation-backed responses alongside standard chat completion capabilities using popular open-source models.

Key Features

Sonar Search Models
Proprietary models combining LLM reasoning with real-time web search and citations
Online Grounding
Responses include inline citations and sources from live web data
OpenAI-Compatible API
Drop-in replacement using standard chat completions format
Open-Source Models
Access to Llama and other open-source models alongside Sonar
Structured Outputs
JSON mode and structured output support for reliable parsing
Search Domain Filtering
Restrict or focus web search to specific domains for targeted results

Provider Comparison

Advantages

  • Unique search-augmented generation with real-time web data
  • Built-in citations reduce hallucination risk
  • OpenAI-compatible API for easy integration
  • Competitive pricing on open-source model hosting
  • No need to build your own RAG pipeline for web-grounded answers
  • Fast inference speeds on Sonar models

Limitations

  • Smaller model selection than general-purpose platforms
  • Search-augmented models have higher per-request costs
  • Less mature enterprise offering compared to larger providers
  • Limited fine-tuning options

Compute Services

Pricing Options

OptionDetails
Pay-per-tokenPer million token pricing with separate input and output rates
Search pricingPer-request pricing for search-augmented Sonar queries

Getting Started

1

Create an account

Sign up at perplexity.ai and navigate to API settings

2

Generate API key

Create an API key from your account settings

3

Choose a model

Select from Sonar (search-augmented) or standard open-source models

4

Make first API call

Use the OpenAI-compatible chat completions endpoint