Gemini 3.5 Flash

Google · Active · Updated May 22, 2026

Google's next-generation fast model, combining very fast inference with competitive reasoning and a 1M-token context window.

Input Price: $1.50 per million tokens
Output Price: $9.00 per million tokens
Context Window: 1,048,576 tokens
Max Output: 16,384 tokens

Technical Specifications

Provider: Google
Release Date: May 15, 2026
Pricing Type: per token
Input Price: $1.50 / 1M tokens
Output Price: $9.00 / 1M tokens
Cached Input: $0.15 / 1M tokens
Context Window: 1,048,576 tokens
Max Output: 16,384 tokens
Input Modalities: text, image, audio
Output Modalities: text
Status: active
Availability: api, web_app
Latency: very fast
Rate Limit: 30,000 RPM
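
The per-token prices listed above can be turned into a per-request estimate. The following is a minimal sketch, not an official billing formula: the function name `estimate_cost` is hypothetical, and it assumes that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the standard input rate.

```python
from decimal import Decimal

# Prices in USD per one million tokens, copied from the specifications above.
INPUT_PER_M = Decimal("1.50")
OUTPUT_PER_M = Decimal("9.00")
CACHED_INPUT_PER_M = Decimal("0.15")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of a single request (hypothetical helper).

    Assumption: `cached_tokens` counts the part of `input_tokens` that is
    billed at the cached rate; the remainder is billed at the input rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")

    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # 100,000 input tokens and 2,000 output tokens, with and without caching.
    print(estimate_cost(100_000, 2_000))
    print(estimate_cost(100_000, 2_000, cached_tokens=80_000))
```

Under these assumptions, a long prompt that is mostly served from cache costs noticeably less than the same prompt sent fresh, because the cached rate is one tenth of the input rate.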

Capability Scores

Coding
80
Reasoning
78
Math
77
Image
70
Speed
97

Overview

Gemini 3.5 Flash is Google's latest speed-optimized model, pairing very fast inference (speed score: 97/100) with a 1,048,576-token context window. It achieves significantly better reasoning and coding scores than its predecessor, Gemini 2.5 Flash, narrowing the gap with frontier models while pricing stays at $1.50 per million input tokens and $9.00 per million output tokens. It is a strong choice for high-throughput applications that need both speed and quality.

Pros

  • Fastest inference among all models (speed: 97/100)
  • 1M context window at a budget-friendly price
  • Near-frontier reasoning at a fraction of the cost

Cons

  • Moderate coding performance compared to frontier models
  • Text-only output: no audio or image generation
  • Not designed for complex multi-step agentic tasks

Use Cases

Real-time content generation and moderation
High-volume data processing with long-context understanding
Cost-effective customer service and chat applications
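
For the high-volume use cases above, the listed rate limit of 30,000 RPM sets a floor on how quickly a batch can finish. The sketch below is back-of-the-envelope arithmetic only: the helper names are hypothetical, and it assumes the limit applies uniformly per minute and ignores request latency, token-based limits, and retries.

```python
# Rate limit in requests per minute, copied from the specifications.
RATE_LIMIT_RPM = 30_000


def min_batch_minutes(n_requests: int, rpm: int = RATE_LIMIT_RPM) -> int:
    """Lower bound, in whole minutes, to send `n_requests` at `rpm` (hypothetical helper)."""
    if n_requests < 0 or rpm <= 0:
        raise ValueError("n_requests must be >= 0 and rpm must be > 0")
    # Integer ceiling division avoids floating-point rounding.
    return -(-n_requests // rpm)


def min_request_interval_seconds(rpm: int = RATE_LIMIT_RPM) -> float:
    """Average spacing between requests that stays within `rpm`."""
    if rpm <= 0:
        raise ValueError("rpm must be > 0")
    return 60 / rpm


if __name__ == "__main__":
    print(min_batch_minutes(1_000_000))
    print(min_request_interval_seconds())
```

At the listed limit, one million requests need at least 34 minutes, which corresponds to an average of 500 requests per second.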

Frequently Asked Questions about Gemini 3.5 Flash

How much does Gemini 3.5 Flash cost?
Gemini 3.5 Flash costs $1.50 per million input tokens and $9.00 per million output tokens. Cached input is $0.15 per million tokens.
What is the context window of Gemini 3.5 Flash?
Gemini 3.5 Flash has a 1,048,576 token context window, with a maximum output of 16,384 tokens.
Is Gemini 3.5 Flash good for coding?
Gemini 3.5 Flash scores 80/100 on coding, which is moderate compared to frontier models.
What modalities does Gemini 3.5 Flash support?
Gemini 3.5 Flash supports text, image, and audio input, and produces text output.
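
The context window and maximum output figures quoted in the answers above can be checked before a request is sent. This is a minimal sketch with a hypothetical helper, `output_budget`. It assumes that prompt tokens and output tokens share the same context window, which this page does not state, and that the prompt token count is already known.

```python
# Limits in tokens, copied from the specifications.
CONTEXT_WINDOW = 1_048_576
MAX_OUTPUT = 16_384


def output_budget(prompt_tokens: int, requested_output: int) -> int:
    """Return how many output tokens can be requested (hypothetical helper).

    Assumption: the prompt and the generated output must fit in the
    context window together, and output is additionally capped at MAX_OUTPUT.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")

    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt does not leave room for any output")

    return min(requested_output, MAX_OUTPUT, remaining)


if __name__ == "__main__":
    # A short prompt is limited by the output cap, a very long one by the window.
    print(output_budget(1_000, 50_000))
    print(output_budget(1_048_000, 4_096))
```

For most prompts the 16,384-token output cap is the binding limit; the window only becomes the constraint when the prompt comes within 16,384 tokens of the full 1,048,576.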