Together AI Together AI

Together AI is the AI Native Cloud platform engineered for developers building with open-source and frontier AI models. They provide serverless inference, fine-tuning, and GPU clusters with industry-leading performance optimizations.

Key Features

100+ Open-Source Models
Access to Llama, DeepSeek, Qwen, and other leading open-source models
Serverless Inference
Pay-per-token API with OpenAI-compatible endpoints
Fine-Tuning Platform
LoRA and full fine-tuning with proprietary optimizations
GPU Clusters
Instant self-service or reserved dedicated clusters with H100, H200, B200 access
Batch API
50% cost reduction for non-urgent inference workloads
Code Interpreter
Execute LLM-generated code in sandboxed environments

Provider Comparison

Advantages

  • 3.5x faster inference and 2.3x faster training than alternatives
  • Competitive pricing with 50% batch API discount
  • Wide selection of 100+ open-source models
  • OpenAI-compatible APIs for easy migration
  • Research leadership with FlashAttention contributions
  • Global data center network across 25+ cities

Limitations

  • Primarily focused on open-source models
  • GPU cluster pricing requires custom quotes for reserved capacity
  • Smaller ecosystem compared to major cloud providers

Compute Services

Pricing Options

OptionDetails
Serverless pay-per-tokenStarting at $0.06/1M tokens for small models up to $3.50/1M for 405B models
Batch API50% discount for non-urgent inference workloads
Fine-tuning$0.48-$3.20 per 1M tokens depending on model size
GPU Clusters$2.20-$5.50/hour per GPU for instant clusters, custom pricing for reserved

Getting Started

1

Create an account

Sign up at together.ai

2

Get API key

Generate an API key from your dashboard

3

Choose a model

Browse 100+ models for chat, code, images, video, and audio

4

Make API calls

Use OpenAI-compatible endpoints or Together SDK