Gemini 2.5 Flash (Reasoning)

Gemini 2.5 Flash (Reasoning mode) is Google’s latency-first, sparse MoE transformer that pushes GPT-4-level logic and code skills while streaming ~300 tokens/s. It accepts text, images, audio, or video in contexts over one million tokens, yet costs just $0.15 / $0.60 per million input/output tokens through Google AI Studio.

Developers reach for Flash when they need fast, cheap reasoning at scale—think live code review, long-context Q&A, or agentic function calls that can’t wait. You get near-Pro accuracy on benchmarks like GPQA (82 %) and LiveCodeBench (55 %) without paying Pro-tier compute, making it the pragmatic choice for high-throughput, multimodal apps.

Gemini 2.5 Flash (Reasoning)

Context

PricingPer 1M tokens

Capabilities

Latency

Benchmarks

Command Palette

Gemini 2.5 Flash (Reasoning)

Context

PricingPer 1M tokens

Capabilities

Latency

Benchmarks