Gemini 2.5 Flash is Google's latency-first Gemini variant: a sparse MoE transformer that handles text, images, audio, and video with a 1 M-token context window while streaming ~280 tokens/s. It delivers GPT-4-class reasoning and coding scores (e.g., 55 % LiveCodeBench, 82 % GPQA) at a fraction of the compute of Gemini 2.5 Pro.
Developers reach for Flash when they need high-throughput chat, real-time multimodal analysis, or agentic function-calling without breaking the budget. Served exclusively through Google AI Studio, it costs $0.15 / $0.60 per million input/output tokens and is the cheapest path to Gemini-level quality—no weights, just a fast API.