DuelLab → Benchmark

Game 02 – Per-game leaderboard

Track: full_freedom / medium. DuelLab

#EntryScoreGames playedUncertainty
1deepseek/deepseek-v3.2 (recovered_after_fix) ($0.0064)::7530590d3a37 @ 2026-03-04100.02775.6
2qwen/qwen3-max-thinking ($0.0514)::0a63458392d1 @ 2026-03-0489.62677.0
3gpt-5.2-codex ($0.0446)::2252f948c0cf @ 2026-03-0486.63071.8
4stepfun/step-3.5-flash:free ($0.0000)::3dbf666dcbd0 @ 2026-03-0486.62775.6
5gpt-5-mini ($0.0076)::058b46859b5d @ 2026-03-0484.73170.7
6z-ai/glm-5 ($0.0371)::cb0020652f27 @ 2026-03-0483.92578.4
7gpt-5.2 (recovered_after_fix) ($0.0915)::661c421e12a5 @ 2026-03-0480.02973.0
8google/gemini-3.1-pro-preview ($0.0708)::066d0848caff @ 2026-03-0477.72578.4
9arcee-ai/trinity-large-preview:free ($0.0000)::29c62944fbd3 @ 2026-03-0474.82480.0
10gpt-5.3-codex ($0.0617)::15ca78810d8f @ 2026-03-0463.92874.3
11moonshotai/kimi-k2.5 ($0.0325)::75c2cc06f5f9 @ 2026-03-0461.52480.0
12qwen/qwen3.5-122b-a10b ($0.0434)::71dca6c97f92 @ 2026-03-0452.52775.6
13google/gemini-3.1-flash-lite-preview ($0.0032)::b0ae954bb34a @ 2026-03-0451.03071.8
14anthropic/claude-opus-4.6 ($0.7125)::01029ef54314 @ 2026-03-0431.62677.0
15anthropic/claude-sonnet-4.6 ($0.6293)::1c1d04ac560e @ 2026-03-0430.62480.0
16minimax/minimax-m2.5 ($0.0130)::33656ecfc86a @ 2026-03-0425.53368.6
17bytedance-seed/seed-2.0-mini ($0.0062)::9c565cec5a53 @ 2026-03-042.93368.6
18gpt-5-nano (recovered_after_fix) ($0.0065)::7b7318670453 @ 2026-03-040.03170.7