Mistral Small 3
Mistral · Open

Release Date: Jan 30, 2025
Context Window: 32,000 tokens

Pricing (per 1M tokens)

Input: $0.10
Output: $0.30
Blended (3:1): $0.15
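The blended figure follows from weighting input and output prices at a 3:1 input-to-output token ratio; a quick arithmetic check:

```python
# Blended $/1M tokens, assuming a 3:1 input:output token mix (as listed above)
input_price = 0.10   # $ per 1M input tokens
output_price = 0.30  # $ per 1M output tokens

blended = (3 * input_price + 1 * output_price) / 4
print(f"${blended:.2f}")  # → $0.15
```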

Capabilities

Speed: 161 output t/s

Latency

TTFT: 0.27 s
500-token response: 3.39 s
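The end-to-end latency is roughly consistent with time-to-first-token plus generation time at the listed output speed. A back-of-envelope sketch (assuming the TTFT figure is in seconds):

```python
# Rough end-to-end latency: TTFT + (tokens generated / output speed)
ttft_s = 0.27      # time to first token, seconds
speed_tps = 161    # output tokens per second
tokens = 500       # response length

total_s = ttft_s + tokens / speed_tps
print(f"{total_s:.2f} s")  # close to the listed 3.39 s
```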

Benchmarks

Intelligence: ●●○○○
Math: ●●○○○
Coding: ●○○○○

MMLU Pro: 65.2%
GPQA: 46.2%
HLE: 4.1%
SciCode: 23.6%
AIME: 8.0%
MATH 500: 71.5%
LiveCodeBench: 25.2%
HumanEval: 85.4%

Mistral Small 3 is a 24B-parameter language model built for speed: ~150 tokens/s, a 32k-token context window, and 81% MMLU, all released under the permissive Apache 2.0 license in both base and instruct variants.

For developers, it's an open, low-latency alternative to mid-size proprietary LLMs. Run it locally on a single RTX 4090 or a MacBook, fine-tune it into a domain expert, or use the cloud API at $0.10/$0.30 per million input/output tokens to power chatbots, function-calling agents, or edge workloads without vendor lock-in.
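For the cloud-API route, a minimal sketch of a chat request is shown below. The endpoint URL and the `mistral-small-latest` model name are assumptions here; check Mistral's current API documentation before use. The snippet only constructs the request and makes no network call.

```python
import json
import urllib.request

# Assumed endpoint and model name for Mistral's OpenAI-style chat API
API_URL = "https://api.mistral.ai/v1/chat/completions"
payload = {
    "model": "mistral-small-latest",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 100,
}

def build_request(api_key: str) -> urllib.request.Request:
    """Construct the HTTP request; sending it requires a valid API key."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_request("YOUR_API_KEY")  # replace with a real key, then urlopen(req)
print(req.get_full_url())
```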