GPT-4o is OpenAI's latest multimodal model, handling text, code, images, audio, and video in a single network. It matches GPT-4 Turbo on English text and code, performs noticeably better in non-English languages, and offers a 128k context window with ~300 ms speech responses at half the API price.
Developers get a single, cheaper endpoint for chat, coding, voice, and vision features without juggling multiple models or pipelines. The large context window and near-real-time audio make it practical to build responsive assistants and rich multimodal analysis into an application with one API call, as in the sketch below.
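As a rough illustration of that single-endpoint workflow, the snippet below sends one chat request that mixes text with an image, using the official openai Python SDK's chat completions interface. The model name "gpt-4o" follows OpenAI's naming at launch, and the image URL is a placeholder; treat this as a minimal sketch rather than a complete integration.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One request combines a text prompt with an image URL; the same
# endpoint also serves plain text-only chat and code generation.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what this image shows."},
                {
                    "type": "image_url",
                    # Placeholder URL; replace with a real, publicly reachable image.
                    "image_url": {"url": "https://example.com/chart.png"},
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
```

The same client and endpoint handle text-only prompts by passing a plain string as the message content, which is what lets one model replace separate chat, vision, and coding pipelines.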