Leaderboard

No reasoning

This leaderboard covers the no-reasoning setting. Models are ranked by the performance of their generated game-playing programs after compilation and execution across the games in this benchmark.

Model leaderboard

One row per model; Min–Max is the score range across that model's evaluated entries at this reasoning level. Admitted entrants without match history remain in the table with a zero score until their first evaluation.
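The aggregation described above is simple to reproduce. Below is a minimal sketch in Python, assuming hypothetical inputs: `results` as (model, score) pairs on a 0–100 scale, one per evaluated entry, and `admitted_models` as the full list of entrants. The function name and record shapes are illustrative, not the benchmark's actual API; only the averaging, Min–Max range, and zero-score default follow the rules stated above.

```python
from collections import defaultdict

def build_leaderboard(results, admitted_models):
    """Aggregate per-entry scores into one leaderboard row per model.

    results: iterable of (model_name, score) pairs, one per evaluated
             entry, with scores on a 0-100 scale.
    admitted_models: all admitted entrants; those with no match history
             yet are kept in the table with a zero score.
    Returns rows of (model, avg, min, max, entries), best average first.
    """
    scores = defaultdict(list)
    for model, score in results:
        scores[model].append(score)

    rows = []
    for model in admitted_models:
        entry_scores = scores.get(model)
        if entry_scores:
            avg = sum(entry_scores) / len(entry_scores)
            rows.append((model, avg, min(entry_scores),
                         max(entry_scores), len(entry_scores)))
        else:
            # Admitted but not yet evaluated: zero score until first match.
            rows.append((model, 0.0, 0.0, 0.0, 0))

    # Rank by average score, highest first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

Fed the 12 evaluated entries for GPT-5.4 in the table below, for example, this would reproduce its 79.9 average and 41.7 – 100.0 range.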

Reasoning level: None · Games: 8 · Build: Preview
No reasoning leaderboard for DuelLab Benchmark
Model                           Avg score   Min–Max         Entries
GPT-5.4                         79.9        41.7 – 100.0    12
Claude Sonnet 4.6               76.5        47.0 – 92.8     15
Gemini 3.1 Pro Preview          75.8        62.5 – 84.8     4
Claude Opus 4.6                 68.9        0.0 – 100.0     19
MiMo-V2-Pro                     66.8        0.9 – 97.7      18
GLM-5                           66.1        28.1 – 88.9     16
GPT-5.3 Codex                   65.4        23.1 – 100.0    23
GPT-5.4 Nano                    64.6        13.6 – 94.0     8
Kimi K2.5                       63.3        19.1 – 89.8     15
GPT-5.2                         61.8        27.9 – 93.4     18
MiMo-V2-Omni                    59.8        26.0 – 93.3     11
Gemini 3 Flash Preview          59.0        0.0 – 96.2      12
Nemotron 3 Super                56.4        41.2 – 73.6     11
DeepSeek V3.2                   53.8        6.3 – 100.0     15
GPT-5 Mini                      52.1        30.4 – 100.0    19
Gemini 2.5 Flash                51.5        0.0 – 100.0     7
Seed 2.0 Mini                   47.7        19.7 – 87.8     5
Gemini 3.1 Flash Lite Preview   42.5        17.7 – 87.8     10
GPT-5.2 Codex                   41.6        23.7 – 54.6     5
Qwen3 Max Thinking              40.6        8.0 – 82.3      10
GPT-5 Nano                      38.0        0.0 – 92.2      22
GPT-5.4 Mini                    35.7        0.0 – 77.1      9
Mistral Small 2603              34.1        0.0 – 83.5      7
Step 3.5 Flash                  33.4        4.8 – 51.7      7
Qwen3.5 122B A10B               32.6        0.0 – 77.3      10
Trinity Large Preview           32.5        2.1 – 85.7      15
Minimax M2.5                    25.8        6.9 – 46.0      4