Google Cloud

Enterprise cloud with advanced AI/ML services

Classical hyperscaler | 🇺🇸 US | inference | enterprise | multimodal

Last reviewed Mar 14, 2026

Google Cloud Platform (GCP) offers GPU instances with flexible pricing and integration with Google's AI and machine learning tools. It is a major cloud provider known for its work on Kubernetes, AI/ML, and data analytics.

GPU Models: 2 (from $0.16 / hour)
LLM Models: 26 (from $0.04 / 1M input tokens)

Available GPUs

Hourly on-demand pricing.

Prices last updated: April 20, 2026

GPU Model    Memory  GPUs  Price / hr  Updated
Tesla T4     16GB    1×    $0.35/hr    3/31/2026
Tesla V100   32GB    1×    $2.48/hr    3/31/2026

LLM API Pricing

Pay-per-token pricing. Prices shown per 1M tokens.

Prices last updated: April 21, 2026

Model  Creator    Context  Input/1M  Output/1M  Updated
-      OpenAI     128K     $0.090    $0.360     4/3/2026
-      Mistral    32K      $0.100    $0.300     4/1/2026
-      Google     1.0M     $0.100    $0.400     4/19/2026
-      Google     1.0M     $0.150    $0.600     4/21/2026
-      Google     1.0M     $0.250    $1.50      4/19/2026
-      Google     1.0M     $0.300    $2.50      4/21/2026
-      Google     66K      $0.500    $3.00      4/21/2026
-      Google     1.0M     $0.500    $3.00      4/21/2026
-      DeepSeek   64K      $0.600    $1.70      4/3/2026
-      Anthropic  200K     $1.00     $5.00      4/3/2026
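
Because prices are quoted per 1M tokens, the cost of a single request is each token count multiplied by its rate and divided by one million. The sketch below shows that arithmetic. The function name `request_cost` and the example token counts are illustrative assumptions, not part of any Google Cloud API; the two rates are taken from the $0.300 / $2.50 row of the table above.

```python
from decimal import Decimal

TOKENS_PER_PRICE_UNIT = 1_000_000  # the table quotes prices per 1M tokens


def request_cost(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Return the USD cost of one request under pay-per-token pricing.

    Prices are passed as strings so Decimal keeps them exact.
    """
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_1m) / TOKENS_PER_PRICE_UNIT
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_1m) / TOKENS_PER_PRICE_UNIT
    return input_cost + output_cost


# Hypothetical request: 12,000 input tokens and 800 output tokens,
# priced at $0.300 input and $2.50 output per 1M tokens.
cost = request_cost(12_000, 800, "0.300", "2.50")
print(f"${cost:.4f}")  # prints $0.0056
```

Output tokens cost several times more than input tokens in most rows, so long generations tend to dominate the bill even when prompts are large.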

Pros & Cons

Advantages

  • Flexible pricing options, including sustained use discounts
  • Strong AI and machine learning tools (Vertex AI)
  • Good integration with other Google services
  • Cutting-edge Kubernetes implementation (GKE)
  • Competitive pricing, especially for sustained use
  • Strong global network infrastructure
  • Innovative AI/ML and data analytics services

Limitations

  • Limited availability in some regions compared to AWS
  • Complexity in managing resources
  • Support can be costly
  • Steeper learning curve for some services

Key Features

Compute Engine

Scalable virtual machines with a wide range of machine types, including GPUs.

Google Kubernetes Engine (GKE)

Managed Kubernetes service for deploying and managing containerized applications.

Cloud Functions

Event-driven serverless compute platform.

Cloud Run

Fully managed serverless platform for containerized applications.

Vertex AI

Unified ML platform for building, deploying, and managing ML models.

Preemptible VMs

Short-lived compute instances (they run for at most 24 hours and can be reclaimed at any time) offered at a significant discount, suitable for fault-tolerant workloads.

Cloud Storage

Scalable and durable object storage.

Persistent Disk

Block storage for Compute Engine instances.

Cloud Load Balancing

High-performance, scalable load balancing.

Virtual Private Cloud (VPC)

Software-defined networking for your cloud resources.

Compute Services

Compute Engine

Offers customizable virtual machines running in Google's data centers.

Google Kubernetes Engine (GKE)

Managed Kubernetes service for running containerized applications.

  • Automated Kubernetes operations
  • Integration with Google Cloud services
  • Advanced cluster management features

Cloud Functions

Serverless compute platform for running code in response to events.

  • Automatic scaling and high availability
  • Pay only for the compute time consumed
  • Supports multiple programming languages

Cloud Run

Fully managed serverless platform for deploying and scaling containerized applications.

  • Runs stateless containers on a fully managed environment
  • Automatic scaling and high availability
  • Pay only for the resources used
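
As a rough illustration of what a "stateless container" looks like from the application side, here is a minimal sketch of an HTTP service that could be packaged into a container image. It assumes the Cloud Run convention that the platform passes the listening port in the `PORT` environment variable (falling back to 8080 here). The names `HelloHandler` and `make_server` are hypothetical, and only the Python standard library is used.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET with a fixed body and keeps no state between requests."""

    def do_GET(self):
        body = b"Hello from a stateless container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=None):
    """Build a server bound to the given port, or to $PORT (default 8080)."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("0.0.0.0", port), HelloHandler)


# Container entrypoint (not executed here because it blocks):
#     make_server().serve_forever()
```

Since the handler stores nothing between requests, the platform can start or stop copies of the container freely, which is what makes automatic scaling possible.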

Inference Services

Vertex AI

Access to Google's Gemini models and other foundation models through a fully managed platform with enterprise security, MLOps tools, and Google Cloud integration.

  • Gemini Models: Access Google's latest Gemini Pro and Flash models with multimodal capabilities
  • Model Garden: Curated collection of open-source and Google-developed models
  • Grounding: Connect models to Google Search or your own data for accurate responses
  • Context Caching: Cache large context windows for cost savings on repeated queries

Pricing Models

  • Pay-per-token: Standard per-token pricing for input and output
  • Context Caching: Reduced rates for cached context in long conversations
  • Provisioned Throughput: Reserved capacity for predictable performance

Pricing Options

Option                   Details
On-Demand                Pay for compute capacity per hour or per second, with no long-term commitments.
Sustained Use Discounts  Automatic discounts for running instances for a significant portion of the month.
Committed Use Discounts  Save up to 57% with a 1-year or 3-year commitment to a minimum level of resource usage.
Preemptible VMs          Save up to 80% for fault-tolerant workloads that can be interrupted.
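
To get a feel for what these options mean in practice, the sketch below turns the hourly GPU prices listed under Available GPUs into monthly figures and applies the "up to" percentages as best-case discounts. It assumes a 730-hour month and assumes the maximum discount applies in full to GPU instances, which this page does not confirm, so the discounted numbers are lower bounds rather than quotes. The function name `monthly_cost` is illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average month (24 * 365 / 12)

# On-demand hourly prices from the Available GPUs table.
ON_DEMAND_PER_HOUR = {"Tesla T4": 0.35, "Tesla V100": 2.48}

# "Up to" figures from the pricing options; real discounts may be smaller.
MAX_DISCOUNT = {"on_demand": 0.0, "committed_use": 0.57, "preemptible": 0.80}


def monthly_cost(hourly_price, discount=0.0, hours=HOURS_PER_MONTH):
    """Return the cost of running one instance for `hours` at a fractional discount."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be a fraction between 0 and 1")
    return hourly_price * (1.0 - discount) * hours


for gpu, price in ON_DEMAND_PER_HOUR.items():
    for option, discount in MAX_DISCOUNT.items():
        print(f"{gpu:<11} {option:<14} ${monthly_cost(price, discount):,.2f}/month")
```

Under these assumptions a full month of on-demand Tesla V100 time comes to about $1,810, against roughly $362 at the best-case preemptible rate, which is why interruptible capacity is attractive for workloads that can checkpoint and resume.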

Availability & Support

Regions

40+ regions and 120+ zones worldwide.

Support

Basic (free), Standard, Enhanced, and Premium support plans. Comprehensive documentation, community forums, and training resources are also available.

Getting Started

  1. Create a Google Cloud project

    Set up a project in the Google Cloud Console.

  2. Enable billing

    Set up a billing account to pay for resource usage.

  3. Choose a compute service

    Select Compute Engine, GKE, Cloud Functions, or Cloud Run based on your needs.

  4. Create and configure an instance

    Launch a VM instance, configure a Kubernetes cluster, or deploy a function/application.

  5. Manage resources

    Use the Cloud Console, command-line tools, or APIs to manage your resources.
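
For readers who prefer the command line, the five steps above map roughly onto the `gcloud` CLI as sketched below, taking the Compute Engine path with a single Tesla T4. The script only builds and prints the commands and does not run them. The project ID, billing account ID, instance name, zone, machine type, and image are placeholder assumptions to adapt, and GPU availability varies by zone. NVIDIA driver installation is not shown.

```python
import shlex


def getting_started_commands(project_id, billing_account,
                             zone="us-central1-a", instance="gpu-demo"):
    """Return gcloud argument lists for the five steps (all values are placeholders)."""
    return [
        # 1. Create a Google Cloud project.
        ["gcloud", "projects", "create", project_id],
        # 2. Enable billing by linking the project to a billing account.
        ["gcloud", "billing", "projects", "link", project_id,
         f"--billing-account={billing_account}"],
        # 3. Choose a compute service: enable the Compute Engine API.
        ["gcloud", "services", "enable", "compute.googleapis.com",
         f"--project={project_id}"],
        # 4. Create a VM with one T4 attached. GPU instances cannot
        #    live-migrate, hence the TERMINATE maintenance policy.
        ["gcloud", "compute", "instances", "create", instance,
         f"--project={project_id}", f"--zone={zone}",
         "--machine-type=n1-standard-4",
         "--accelerator=type=nvidia-tesla-t4,count=1",
         "--maintenance-policy=TERMINATE",
         "--image-family=debian-12", "--image-project=debian-cloud"],
        # 5. Manage resources: list the instances in the project.
        ["gcloud", "compute", "instances", "list", f"--project={project_id}"],
    ]


# Hypothetical identifiers, for display only.
for argv in getting_started_commands("my-demo-project", "000000-000000-000000"):
    print(shlex.join(argv))
```

Keeping each command as an argument list makes it easy to pass to `subprocess.run` later without worrying about shell quoting.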

Compare Providers

Find the best prices for the same GPUs from other providers

IO.NET

2 shared GPUs with Google Cloud

Theta EdgeCloud

2 shared GPUs with Google Cloud

RunPod

1 shared GPU with Google Cloud