Benched.ai
Llama 3.1 Nemotron Instruct 70B
NVIDIA
Open
Context
Release Date: Oct 15, 2024
Knowledge Cutoff: Dec 01, 2023
Window: 128k tokens
Pricing (per 1M tokens)
Input: $0.12
Output: $0.30
Blended 3:1: $0.165
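The blended figure is consistent with a weighted average of the input and output prices at a 3:1 ratio. A minimal sketch of that arithmetic follows, assuming the ratio means three input tokens for every output token. The function name `blended_price` is hypothetical and not taken from the site.

```python
def blended_price(input_price: float, output_price: float,
                  input_parts: int = 3, output_parts: int = 1) -> float:
    """Weighted average price per 1M tokens for a given input:output token ratio."""
    total_parts = input_parts + output_parts
    return (input_parts * input_price + output_parts * output_price) / total_parts


# Prices as listed above, in USD per 1M tokens.
blended = blended_price(0.12, 0.30)
print(f"${blended:.3f}")
```

Changing `input_parts` and `output_parts` gives a rough per-1M-token cost for workloads with a different mix of prompt and completion tokens.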
Capabilities
Speed: 41 t/s
Input
Output
Reasoning tokens
Latency
TTFT: 0.53 s
500 token response: 12.68 s
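The 500-token figure can be sanity-checked as time to first token plus generation time at the listed speed. The sketch below is an approximation under stated assumptions: it treats the TTFT value of 0.53 as seconds, assumes a constant decode rate of 41 t/s, and uses a hypothetical helper name, `estimate_response_time`. How the site actually measures this figure is not stated.

```python
def estimate_response_time(ttft_s: float, tokens: int, tokens_per_s: float) -> float:
    """Estimate end-to-end seconds: time to first token plus steady-state decoding."""
    if tokens_per_s <= 0:
        raise ValueError("tokens_per_s must be positive")
    return ttft_s + tokens / tokens_per_s


# Assumed inputs: TTFT in seconds, 500 output tokens, listed speed of 41 t/s.
estimate = estimate_response_time(ttft_s=0.53, tokens=500, tokens_per_s=41.0)
print(f"{estimate:.2f} s")  # close to, but not exactly, the listed 12.68 s
```

The estimate lands within a few hundredths of a second of the listed value. The small gap is plausibly due to rounding in the displayed speed.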
Benchmarks
Intelligence: ●●○○○
Math: ●●○○○
Coding: ●○○○○
MMLU Pro: 69.0%
GPQA: 46.5%
HLE: 4.6%
SciCode: 23.3%
AIME: 24.7%
MATH 500: 73.3%
LiveCodeBench: 16.9%
HumanEval: 81.5%