Command Palette

Search for a command to run...

Gemini 2.5 Flash (Reasoning)

Google

Frontier ModelReasoningVision ModelProprietary

Context

Release Date
May 20, 2025
Knowledge Cutoff
Jan 01, 2025
Window
1M

PricingPer 1M tokens

Input
$0.15
Output
$3.5
Blended 3:1
$0.9875

Capabilities

Speed
346 t/s
Input
Output
Reasoning tokens

Latency

TTFT
13.55 ms
500 token response
15.00 s

Benchmarks

Reasoning
●●●○○
Math
●●●●●
Coding
●●●○○
MMLU Pro
83.2%
GPQA
79.0%
HLE
11.1%
SciCode
39.4%
AIME
82.3%
MATH 500
98.1%
LiveCodeBench
69.5%
HumanEval
96.2%

Gemini 2.5 Flash (Reasoning mode) is Google’s latency-first, sparse MoE transformer that pushes GPT-4-level logic and code skills while streaming ~300 tokens/s. It accepts text, images, audio, or video in contexts over one million tokens, yet costs just $0.15 / $0.60 per million input/output tokens through Google AI Studio.

Developers reach for Flash when they need fast, cheap reasoning at scale—think live code review, long-context Q&A, or agentic function calls that can’t wait. You get near-Pro accuracy on benchmarks like GPQA (82 %) and LiveCodeBench (55 %) without paying Pro-tier compute, making it the pragmatic choice for high-throughput, multimodal apps.