Gemini 2.5 Flash

Google

Frontier Model, Vision Model, Proprietary

Context

Knowledge Cutoff
Jan 01, 2025
Window
1M

Pricing (per 1M tokens)

Input
$0.15
Output
$0.60
Blended 3:1
$0.2625
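
The blended figure can be reproduced from the two listed prices. The following is a minimal sketch, assuming "Blended 3:1" means three input tokens for every output token; the function name `blended_price` is hypothetical and not part of any SDK.

```python
def blended_price(input_price: float, output_price: float, ratio: float = 3.0) -> float:
    """Weighted average price per 1M tokens.

    Assumes `ratio` input tokens are consumed for every 1 output token.
    """
    return (ratio * input_price + output_price) / (ratio + 1.0)


# Listed prices per 1M tokens: $0.15 input, $0.60 output.
price = blended_price(0.15, 0.60)
print(f"${price:.4f} per 1M tokens")  # $0.2625 per 1M tokens
```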

Capabilities

Speed
279 t/s
Input
Output
Reasoning tokens

Latency

TTFT
0.31 s
500 token response
2.10 s
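
The 500-token response time is consistent with a simple model: time to first token plus generation time at the listed speed. This sketch treats the TTFT as 0.31 seconds, the reading under which the listed numbers agree; the function name `estimated_response_time` is hypothetical.

```python
def estimated_response_time(ttft_s: float, tokens: int, tokens_per_s: float) -> float:
    """Total response time: time to first token plus streaming time."""
    return ttft_s + tokens / tokens_per_s


# Listed figures: TTFT 0.31 s (assumed unit), 279 tokens/s, 500-token response.
total = estimated_response_time(0.31, 500, 279)
print(f"{total:.2f} s")  # 2.10 s
```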

Benchmarks

Intelligence
●●●○○
Math
●●●●
Coding
●●○○○
MMLU Pro
80.9%
GPQA
68.3%
HLE
5.1%
SciCode
29.1%
AIME
50.0%
MATH 500
93.2%
LiveCodeBench
49.5%
HumanEval
95.1%

Gemini 2.5 Flash is Google's latency-first Gemini variant: a sparse MoE transformer that handles text, images, audio, and video with a 1M-token context window while streaming roughly 280 tokens/s. It delivers GPT-4-class reasoning and coding scores (e.g., 49.5% LiveCodeBench, 68.3% GPQA) at a fraction of the compute of Gemini 2.5 Pro.

Developers reach for Flash when they need high-throughput chat, real-time multimodal analysis, or agentic function calling on a tight budget. Served through the Gemini API (via Google AI Studio and Vertex AI), it costs $0.15 / $0.60 per million input/output tokens and is one of the cheapest paths to Gemini-level quality: no downloadable weights, just a fast API.
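
To make the budget claim concrete, here is a rough cost estimate for a batch of requests at the listed prices. It is only a sketch: the workload numbers are made up for illustration, the function name `workload_cost` is hypothetical, and it counts only the input and output tokens you pass in.

```python
INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens (listed price)
OUTPUT_PRICE_PER_M = 0.60  # USD per 1M output tokens (listed price)


def workload_cost(requests: int, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for `requests` calls with fixed per-call token counts."""
    input_cost = requests * input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = requests * output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Illustrative workload: 100,000 requests, 2,000 input and 500 output tokens each.
cost = workload_cost(100_000, 2_000, 500)
print(f"${cost:.2f}")  # $60.00
```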