# Together AI
Together AI is an AI-native cloud platform engineered for developers building with open-source and frontier AI models. The platform provides serverless inference, fine-tuning, and GPU clusters with industry-leading performance optimizations.
## Key Features

- 100+ Open-Source Models: Access to Llama, DeepSeek, Qwen, and other leading open-source models
- Serverless Inference: Pay-per-token API with OpenAI-compatible endpoints (see the example after this list)
- Fine-Tuning Platform: LoRA and full fine-tuning with proprietary optimizations
- GPU Clusters: Instant self-service or reserved dedicated clusters with H100, H200, and B200 access
- Batch API: 50% cost reduction for non-urgent inference workloads
- Code Interpreter: Execute LLM-generated code in sandboxed environments
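Because the serverless endpoints are OpenAI-compatible, existing OpenAI client code can usually be pointed at Together by swapping the base URL and API key. The snippet below is a minimal sketch, assuming the `openai` Python package is installed, a `TOGETHER_API_KEY` environment variable is set, and that the example model identifier is still available in the catalog; substitute any model you prefer.

```python
import os
from openai import OpenAI  # pip install openai

# Point the standard OpenAI client at Together's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
)

# Example model name; browse the Together model catalog for alternatives.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Summarize what serverless inference means."}],
)

print(response.choices[0].message.content)
```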
## Provider Comparison

### Advantages
- Up to 3.5x faster inference and 2.3x faster training than alternative platforms
- Competitive pricing with 50% batch API discount
- Wide selection of 100+ open-source models
- OpenAI-compatible APIs for easy migration
- Research leadership with FlashAttention contributions
- Global data center network across 25+ cities
### Limitations
- Primarily focused on open-source models
- GPU cluster pricing requires custom quotes for reserved capacity
- Smaller ecosystem compared to major cloud providers
## Compute Services

### Pricing Options
| Option | Details |
|---|---|
| Serverless pay-per-token | From $0.06 per 1M tokens for small models up to $3.50 per 1M tokens for 405B models |
| Batch API | 50% discount for non-urgent inference workloads |
| Fine-tuning | $0.48-$3.20 per 1M tokens depending on model size |
| GPU Clusters | $2.20-$5.50 per GPU-hour for instant clusters; custom pricing for reserved capacity |
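As a rough illustration of how the serverless rates and the Batch API discount in the table combine, the sketch below estimates the cost of a hypothetical workload. The per-token rate and the token count are made-up example values, not published prices for any specific model; check the pricing page for actual rates.

```python
# Hypothetical cost estimate for a serverless inference workload.
# Rates and token counts below are illustrative only.

PRICE_PER_M_TOKENS = 0.88  # hypothetical $/1M tokens for a mid-sized model
BATCH_DISCOUNT = 0.50      # Batch API: 50% off non-urgent workloads


def estimate_cost(total_tokens: int, batch: bool = False) -> float:
    """Return the estimated USD cost for a given number of tokens."""
    rate = PRICE_PER_M_TOKENS * (0.5 if batch else 1.0)
    return total_tokens / 1_000_000 * rate


if __name__ == "__main__":
    tokens = 250_000_000  # e.g. 250M tokens of offline evaluation traffic
    print(f"Real-time: ${estimate_cost(tokens):.2f}")            # $220.00
    print(f"Batch API: ${estimate_cost(tokens, batch=True):.2f}")  # $110.00
```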
## Getting Started

1. Create an account: Sign up at together.ai
2. Get an API key: Generate an API key from your dashboard
3. Choose a model: Browse 100+ models for chat, code, images, video, and audio
4. Make API calls: Use the OpenAI-compatible endpoints or the Together SDK (see the sketch below)
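As a minimal sketch of step 4 using the Together Python SDK rather than the OpenAI client, the snippet below assumes the `together` package is installed (`pip install together`), `TOGETHER_API_KEY` is set in the environment, and that the example model identifier is still listed in the catalog.

```python
import os
from together import Together  # pip install together

# The client can also read TOGETHER_API_KEY from the environment;
# it is passed explicitly here for clarity.
client = Together(api_key=os.environ["TOGETHER_API_KEY"])

# Example model name; pick any chat model from the Together catalog.
response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Give one fun fact about open-source LLMs."}],
)

print(response.choices[0].message.content)
```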