Loading Comparison
Fetching pricing data and provider information...
Loading Comparison
Fetching pricing data and provider information...
Compare GPU and LLM inference API pricing between Amazon AWS and Together AI. Find the best rates for AI training, inference, and ML workloads.
Provider 1
Provider 2
Average Price Difference: $2.97/hour between comparable GPUs
| GPU Model ↑ | Amazon AWS Price | Together AI Price | Price Diff ↕ | Sources |
|---|---|---|---|---|
A10 24GB VRAM • Amazon AWS | Not Available | — | ||
A10 24GB VRAM • | ||||
A100 SXM 80GB VRAM • Amazon AWSTogether AI | 8x GPU | 2x GPU | ↑+$1.45(+111.8%) | |
A100 SXM 80GB VRAM • $2.74/hour 8x GPU configuration Updated: 4/18/2026 $1.30/hour 2x GPU configuration Updated: 4/18/2026 ★Best Price Price Difference:↑+$1.45(+111.8%) | ||||
B200 192GB VRAM • Together AI | Not Available | — | ||
B200 192GB VRAM • | ||||
H100 SXM 80GB VRAM • Amazon AWSTogether AI | 8x GPU | 2x GPU | ↑+$4.88(+244.9%) | |
H100 SXM 80GB VRAM • $6.88/hour 8x GPU configuration Updated: 4/18/2026 $2.00/hour 2x GPU configuration Updated: 4/18/2026 ★Best Price Price Difference:↑+$4.88(+244.9%) | ||||
H200 141GB VRAM • Amazon AWSTogether AI | 8x GPU | ↑+$5.32(+205.5%) | ||
H200 141GB VRAM • $7.91/hour 8x GPU configuration Updated: 4/18/2026 $2.59/hour Updated: 3/30/2026 ★Best Price Price Difference:↑+$5.32(+205.5%) | ||||
L4 24GB VRAM • Amazon AWS | Not Available | — | ||
L4 24GB VRAM • | ||||
L40 40GB VRAM • Together AI | Not Available | — | ||
L40 40GB VRAM • | ||||
L40S 48GB VRAM • Amazon AWSTogether AI | ↓$0.24(11.4%) | |||
L40S 48GB VRAM • $1.86/hour Updated: 4/18/2026 ★Best Price $2.10/hour Updated: 4/18/2026 Price Difference:↓$0.24(11.4%) | ||||
Tesla T4 16GB VRAM • Amazon AWS | Not Available | — | ||
Tesla T4 16GB VRAM • | ||||
A10 24GB VRAM • Amazon AWS | Not Available | — | ||
A10 24GB VRAM • | ||||
A100 SXM 80GB VRAM • Amazon AWSTogether AI | 8x GPU | 2x GPU | ↑+$1.45(+111.8%) | |
A100 SXM 80GB VRAM • $2.74/hour 8x GPU configuration Updated: 4/18/2026 $1.30/hour 2x GPU configuration Updated: 4/18/2026 ★Best Price Price Difference:↑+$1.45(+111.8%) | ||||
B200 192GB VRAM • Together AI | Not Available | — | ||
B200 192GB VRAM • | ||||
H100 SXM 80GB VRAM • Amazon AWSTogether AI | 8x GPU | 2x GPU | ↑+$4.88(+244.9%) | |
H100 SXM 80GB VRAM • $6.88/hour 8x GPU configuration Updated: 4/18/2026 $2.00/hour 2x GPU configuration Updated: 4/18/2026 ★Best Price Price Difference:↑+$4.88(+244.9%) | ||||
H200 141GB VRAM • Amazon AWSTogether AI | 8x GPU | ↑+$5.32(+205.5%) | ||
H200 141GB VRAM • $7.91/hour 8x GPU configuration Updated: 4/18/2026 $2.59/hour Updated: 3/30/2026 ★Best Price Price Difference:↑+$5.32(+205.5%) | ||||
L4 24GB VRAM • Amazon AWS | Not Available | — | ||
L4 24GB VRAM • | ||||
L40 40GB VRAM • Together AI | Not Available | — | ||
L40 40GB VRAM • | ||||
L40S 48GB VRAM • Amazon AWSTogether AI | ↓$0.24(11.4%) | |||
L40S 48GB VRAM • $1.86/hour Updated: 4/18/2026 ★Best Price $2.10/hour Updated: 4/18/2026 Price Difference:↓$0.24(11.4%) | ||||
Tesla T4 16GB VRAM • Amazon AWS | Not Available | — | ||
Tesla T4 16GB VRAM • | ||||
Explore how these providers compare to other popular GPU cloud services
Compare Amazon AWS with another leading provider
Compare Amazon AWS with another leading provider
Compare Amazon AWS with another leading provider
Compare Amazon AWS with another leading provider
Compare Amazon AWS with another leading provider
Compare Amazon AWS with another leading provider
Extensive network of data centers across multiple regions worldwide
Flexible pricing model with no upfront commitments required
Comprehensive security tools and compliance certifications
Automatically adjust resources based on demand
Extensive ecosystem of services that work seamlessly together
Comprehensive suite of tools for development, deployment, and management
Access to Llama, DeepSeek, Qwen, and other leading open-source models
Pay-per-token API with OpenAI-compatible endpoints
LoRA and full fine-tuning with proprietary optimizations
Instant self-service or reserved dedicated clusters with H100, H200, B200, GB200, GB300 access
50% cost reduction for non-urgent inference workloads
Execute LLM-generated code in sandboxed environments
Virtual servers in the cloud with a wide range of instance types.
Fully managed container orchestration service.
Managed Kubernetes service for container orchestration.
Pay for compute capacity by the second with no long-term commitments.
Use spare EC2 capacity at up to 90% off the On-Demand price.
Save up to 72% compared to On-Demand pricing with a 1 or 3-year commitment.
Save up to 72% on compute usage with a 1 or 3-year commitment to a consistent amount of usage.
Per-token pricing scales based on model size, from small open-source models to 405B parameter frontier models
50% discount for non-urgent inference workloads
Per-token pricing for LoRA and full fine-tuning based on model size and dataset
Hourly GPU pricing for instant self-service clusters
Custom pricing for reserved capacity with significant discounts for longer commitments
Single-tenant GPU instances with guaranteed performance
Create an AWS account to access the cloud platform.
Select from EC2, Lambda, or container services based on your workload needs.
Configure and launch your first compute instance or container.
Configure security groups and access controls for your resources.
Use AWS CloudWatch and Compute Optimizer to monitor performance and reduce costs.
Sign up at together.ai
Generate an API key from your dashboard
Browse 100+ models for chat, code, images, video, and audio
Use OpenAI-compatible endpoints or Together SDK
30+ regions and 100+ availability zones worldwide.
Basic (free), Developer, Business, Enterprise support plans with varying response times and features. Extensive documentation, forums, and training resources.
Global data center network across 25+ cities with frontier hardware including GB300, GB200, B200, H200, H100
Documentation, community Discord, email support, and expert support for reserved cluster customers