Mistral Small 3.1

Mistral

Vision Model · Open

Release Date: Mar 17, 2025
Context Window: 128k

Pricing (per 1M tokens)

Input: $0.10
Output: $0.30
Blended 3:1 (see note below): $0.15
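The blended figure is simply a weighted average of the listed input and output prices at an assumed 3:1 input-to-output token ratio; a quick check:

```python
# Blended price per 1M tokens at an assumed 3:1 input:output token ratio.
input_price, output_price = 0.10, 0.30  # USD per 1M tokens, from the table above

blended = (3 * input_price + 1 * output_price) / 4
print(f"${blended:.2f}")  # $0.15, matching the listed blended price
```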

Capabilities

Speed: 153 tokens/s

Latency

Time to First Token (TTFT): 0.26 s
500-token response: 3.52 s
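As a sanity check, the two latency figures line up with the quoted 153 tokens/s throughput if the 500-token time is roughly TTFT plus decode time (an assumption about how the measurement is taken):

```python
# Rough consistency check: total latency ≈ TTFT + tokens / throughput.
# Assumes the 3.52 s figure is measured this way (not confirmed by the page).
ttft_s = 0.26          # time to first token, in seconds
throughput_tps = 153   # output tokens per second
tokens = 500

estimated_total_s = ttft_s + tokens / throughput_tps
print(f"{estimated_total_s:.2f} s")  # ~3.53 s, close to the listed 3.52 s
```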

Benchmarks

Intelligence: ●●○○○
Math: ●●○○○
Coding: ○○○○

MMLU Pro: 65.9%
GPQA: 45.4%
HLE: 4.8%
SciCode: 26.5%
AIME: 9.3%
MATH 500: 70.7%
LiveCodeBench: 21.2%
HumanEval: 85.9%

Mistral Small 3.1 is a 24-billion-parameter, Apache-2.0-licensed model that natively handles text and images, supports 20+ languages, and accepts up to 128k tokens of context. It outperforms Gemma 3 and GPT-4o Mini on core text, vision, and multilingual benchmarks while decoding at roughly 150 tokens/s.

A single RTX 4090 or a Mac with 32 GB of RAM can host it, so you avoid cloud lock-in and per-token fees. Strong instruction following, low-latency function calling, and easy fine-tuning make it a solid open-source base for chatbots, on-device assistants, and multimodal agent workflows.
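If you serve the weights behind an OpenAI-compatible endpoint (for example with vLLM or llama.cpp), a text-plus-image request is a few lines of client code. This is a minimal sketch: the local base_url, the mistralai/Mistral-Small-3.1-24B-Instruct-2503 model ID, and the image URL are assumptions about your deployment, not details taken from this page.

```python
# Minimal sketch: querying a locally hosted Mistral Small 3.1 through an
# OpenAI-compatible server (e.g. vLLM). The base_url, model ID, and image URL
# are placeholders -- adjust them to match your own deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local endpoint
    api_key="not-needed-for-local",       # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-3.1-24B-Instruct-2503",  # assumed model ID
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this chart in two sentences."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    max_tokens=500,
)

print(response.choices[0].message.content)
```

Because the local server speaks the same protocol as hosted APIs, moving between on-device inference and a managed endpoint is mostly a base_url change.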