
Google Cloud
Enterprise cloud with advanced AI/ML services
Last reviewed Mar 14, 2026
Google Cloud Platform (GCP) offers GPU instances with flexible pricing and integration with Google's AI and machine learning tools. It is a major cloud provider known for its work on Kubernetes, AI/ML, and data analytics.
Available GPUs
Hourly on-demand pricing.
Prices last updated: April 20, 2026
| GPU Model | Memory | GPUs | Price / hr | Updated | Source |
|---|---|---|---|---|---|
| Tesla T4 | 16GB | 1× | $0.35/hr | 3/31/2026 | |
| Tesla V100 | 32GB | 1× | $2.48/hr | 3/31/2026 | |
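To turn the hourly figures into a monthly budget number, the arithmetic can be sketched as below. This is a rough estimate only: the 730-hour month and the `utilization` parameter are assumptions introduced for illustration, and the source does not say whether the listed prices include the host VM.

```python
# Hourly on-demand prices copied from the table above (USD per GPU-hour).
HOURLY_PRICE_USD = {
    "Tesla T4": 0.35,
    "Tesla V100": 2.48,
}

# Assumption: an average month of 730 hours (8,760 hours / 12).
HOURS_PER_MONTH = 730


def monthly_cost(gpu_model: str, gpu_count: int = 1, utilization: float = 1.0) -> float:
    """Estimate the monthly on-demand GPU cost in USD.

    utilization is the fraction of the month the instance runs (0.0 to 1.0).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    hourly = HOURLY_PRICE_USD[gpu_model]
    return round(hourly * gpu_count * HOURS_PER_MONTH * utilization, 2)


if __name__ == "__main__":
    for model in HOURLY_PRICE_USD:
        print(f"{model}: ${monthly_cost(model):,.2f} per month at full utilization")
```

Running an instance for only part of the month scales the estimate linearly, for example `monthly_cost("Tesla T4", utilization=0.5)`.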
LLM API Pricing
Pay-per-token pricing. Prices shown per 1M tokens.
Prices last updated: April 21, 2026
| Model | Creator | Context | Input/1M | Output/1M | Updated |
|---|---|---|---|---|---|
| | OpenAI | 128K | $0.090 | $0.360 | 4/3/2026 |
| | Mistral | 32K | $0.100 | $0.300 | 4/1/2026 |
| | | 1.0M | $0.100 | $0.400 | 4/19/2026 |
| | | 1.0M | $0.150 | $0.600 | 4/21/2026 |
| | | 1.0M | $0.250 | $1.50 | 4/19/2026 |
| | | 1.0M | $0.300 | $2.50 | 4/21/2026 |
| | | 66K | $0.500 | $3.00 | 4/21/2026 |
| | | 1.0M | $0.500 | $3.00 | 4/21/2026 |
| | DeepSeek | 64K | $0.600 | $1.70 | 4/3/2026 |
| | Anthropic | 200K | $1.00 | $5.00 | 4/3/2026 |
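Per-1M-token prices translate into a per-request cost by scaling each token count by its rate. The sketch below shows that calculation; the function name and the example token counts are hypothetical, and the rates are taken from the $1.00 / $5.00 row of the table.

```python
def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Return the USD cost of one request under per-1M-token pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 input tokens and 500 output tokens.
    cost = request_cost(2_000, 500, input_price_per_m=1.00, output_price_per_m=5.00)
    print(f"Estimated cost per request: ${cost:.4f}")
```

Because output tokens are priced several times higher than input tokens in most rows, response length tends to dominate the cost of short prompts.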
Pros & Cons
Advantages
- Flexible pricing options, including sustained use discounts
- Strong AI and machine learning tools (Vertex AI)
- Good integration with other Google services
- Cutting-edge Kubernetes implementation (GKE)
- Competitive pricing, especially for sustained use
- Strong global network infrastructure
- Innovative AI/ML and data analytics services
Limitations
- Limited availability in some regions compared to AWS
- Complexity in managing resources
- Support can be costly
- Steeper learning curve for some services
Key Features
- Compute Engine: Scalable virtual machines with a wide range of machine types, including GPUs.
- Google Kubernetes Engine (GKE): Managed Kubernetes service for deploying and managing containerized applications.
- Cloud Functions: Event-driven serverless compute platform.
- Cloud Run: Fully managed serverless platform for containerized applications.
- Vertex AI: Unified ML platform for building, deploying, and managing ML models.
- Preemptible VMs: Short-lived compute instances at a significant discount, suitable for fault-tolerant workloads.
- Cloud Storage: Scalable and durable object storage.
- Persistent Disk: Block storage for Compute Engine instances.
- Cloud Load Balancing: High-performance, scalable load balancing.
- Virtual Private Cloud (VPC): Software-defined networking for your cloud resources.
Compute Services
Compute Engine
Offers customizable virtual machines running in Google's data centers.
Google Kubernetes Engine (GKE)
Managed Kubernetes service for running containerized applications.
- Automated Kubernetes operations
- Integration with Google Cloud services
- Advanced cluster management features
Cloud Functions
Serverless compute platform for running code in response to events.
- Automatic scaling and high availability
- Pay only for the compute time consumed
- Supports multiple programming languages
Cloud Run
Fully managed serverless platform for deploying and scaling containerized applications.
- Runs stateless containers on a fully managed environment
- Automatic scaling and high availability
- Pay only for the resources used
Inference Services
Vertex AI
Access to Google's Gemini models and other foundation models through a fully managed platform with enterprise security, MLOps tools, and Google Cloud integration.
- Gemini Models: Access Google's latest Gemini Pro and Flash models with multimodal capabilities
- Model Garden: Curated collection of open-source and Google-developed models
- Grounding: Connect models to Google Search or your own data for accurate responses
- Context Caching: Cache large context windows for cost savings on repeated queries
Pricing Models
- Pay-per-token: Standard per-token pricing for input and output
- Context Caching: Reduced rates for cached context in long conversations
- Provisioned Throughput: Reserved capacity for predictable performance
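To see why cached context can matter for repeated queries over the same large prompt, the comparison can be sketched as follows. The cached rate used here is a hypothetical placeholder, not a published price, and the sketch ignores any cache storage fees and all output-token costs.

```python
def input_cost_without_cache(
    context_tokens: int, queries: int, input_price_per_m: float
) -> float:
    """USD cost of re-sending the full context at the normal input rate each time."""
    return context_tokens * queries * input_price_per_m / 1_000_000


def input_cost_with_cache(
    context_tokens: int,
    queries: int,
    input_price_per_m: float,
    cached_price_per_m: float,
) -> float:
    """USD cost when the first query pays the normal rate and later ones the cached rate."""
    if queries < 1:
        return 0.0
    first = context_tokens * input_price_per_m
    repeats = context_tokens * (queries - 1) * cached_price_per_m
    return (first + repeats) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a 500,000-token context queried 20 times.
    # $0.30 per 1M input tokens appears in the pricing table;
    # the $0.075 cached rate is an assumed value for illustration only.
    plain = input_cost_without_cache(500_000, 20, 0.30)
    cached = input_cost_with_cache(500_000, 20, 0.30, 0.075)
    print(f"Without caching: ${plain:.4f}")
    print(f"With caching:    ${cached:.4f}")
```

The saving grows with the number of repeated queries, so caching is most relevant for long documents or conversations that are revisited many times.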
Pricing Options
| Option | Details |
|---|---|
| On-Demand | Pay for compute capacity per hour or per second, with no long-term commitments. |
| Sustained Use Discounts | Automatic discounts for running instances for a significant portion of the month. |
| Committed Use Discounts | Save up to 57% with a 1-year or 3-year commitment to a minimum level of resource usage. |
| Preemptible VMs | Save up to 80% for fault-tolerant workloads that can be interrupted. |
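The "up to" percentages in the table give a best-case floor on the hourly rate rather than a guaranteed price. A minimal sketch of that calculation, using only the maximum savings quoted above, is shown below. Sustained use discounts are left out because the table gives no percentage for them, and the function name is hypothetical.

```python
# Maximum savings quoted in the pricing options table, as fractions.
MAX_SAVINGS = {
    "On-Demand": 0.0,
    "Committed Use Discounts": 0.57,
    "Preemptible VMs": 0.80,
}


def best_case_hourly(on_demand_hourly: float, option: str) -> float:
    """Return the lowest hourly rate implied by the quoted maximum saving."""
    return on_demand_hourly * (1.0 - MAX_SAVINGS[option])


if __name__ == "__main__":
    v100_on_demand = 2.48  # USD per hour, from the GPU table
    for option in MAX_SAVINGS:
        rate = best_case_hourly(v100_on_demand, option)
        print(f"{option}: ${rate:.3f}/hr (best case)")
```

Actual discounts depend on the resource type, region, and commitment term, so these figures are a lower bound on what you might pay.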
Availability & Support
Regions
40+ regions and 120+ zones worldwide.
Support
Basic (free), Standard, Enhanced, and Premium support plans. Comprehensive documentation, community forums, and training resources.
Getting Started
1. Create a Google Cloud project: Set up a project in the Google Cloud Console.
2. Enable billing: Set up a billing account to pay for resource usage.
3. Choose a compute service: Select Compute Engine, GKE, Cloud Functions, or Cloud Run based on your needs.
4. Create and configure an instance: Launch a VM instance, configure a Kubernetes cluster, or deploy a function/application.
5. Manage resources: Use the Cloud Console, command-line tools, or APIs to manage your resources.
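For the command-line route, launching a GPU VM might look like the sketch below, which only assembles and prints a `gcloud compute instances create` command rather than running it. The instance name, zone, and machine type are hypothetical choices; GPU availability varies by zone, so check it before use. The command also assumes a project with billing enabled is already configured, and it does not install NVIDIA drivers.

```python
import shlex


def build_create_command(
    instance_name: str,
    zone: str,
    machine_type: str = "n1-standard-4",
    gpu_type: str = "nvidia-tesla-t4",
    gpu_count: int = 1,
) -> list:
    """Build the argument list for creating a Compute Engine VM with GPUs attached."""
    return [
        "gcloud", "compute", "instances", "create", instance_name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--accelerator=type={gpu_type},count={gpu_count}",
        # GPU instances cannot live-migrate, so they must terminate on host maintenance.
        "--maintenance-policy=TERMINATE",
    ]


if __name__ == "__main__":
    # Hypothetical name and zone, for illustration only.
    command = build_create_command("gpu-demo-vm", "us-central1-a")
    print(shlex.join(command))
```

Printing the command first makes it easy to review before pasting it into a shell or passing the list to `subprocess.run`.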