Sigmabench Update: GPT-5.2-Codex is now live, ranking third overall by Sigmascore.
Running in Codex CLI, GPT-5.2-Codex ties GPT-5.1-Codex-Max on accuracy (45.9% vs. 44.3%) and consistency (57.6% vs. 58.3%) but is significantly faster (speed 14.8% vs. 12.5%).
As of 15 January 2026
Latest Leaderboard
We compare the latest agents on Sigmascore, accuracy, consistency, and speed, grouping them by relative performance for each metric.
| Agent | Model | Sigmascore | Accuracy | Consistency | Speed |
|---|---|---|---|---|---|
| Cursor CLI | Composer-1 | 42.9% | 39.1% | 51.2% | 39.5% |
| Cursor CLI | Grok Code Fast 1 | 37.8% | 34.4% | 48.5% | 32.4% |
| Codex CLI | GPT-5.2-Codex (new) | 33.9% | 45.9% | 57.6% | 14.8% |
| Gemini CLI | Gemini 3 Flash Preview | 33.4% | 38.0% | 54.8% | 17.9% |
| Codex CLI | GPT-5.1-Codex-Max | 31.8% | 44.3% | 58.3% | 12.5% |
| Gemini CLI | Gemini 3 Pro Preview | 30.3% | 36.0% | 53.2% | 14.6% |
| Claude Code | Claude 4.5 Opus | 30.1% | 40.9% | 52.4% | 12.8% |
| Codex CLI | GPT-5.1-Codex-Mini | 29.3% | 40.0% | 51.5% | 12.2% |
| Codex CLI | GPT-5.1-Codex | 26.3% | 40.2% | 57.1% | 7.9% |
| Claude Code | Claude Haiku 4.5 | 24.6% | 26.2% | 33.0% | 17.2% |
| Gemini CLI | Gemini 2.5 Flash | 23.0% | 17.9% | 37.9% | 18.1% |
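The page does not say how Sigmascore is derived from the three metrics. As an observation only, not a documented definition, the published Sigmascore values are close to the geometric mean of accuracy, consistency, and speed. The sketch below checks that guess against the leaderboard rows above. The names `ROWS` and `sigmascore_gaps` are hypothetical, and the geometric-mean formula is an assumption.

```python
from statistics import geometric_mean

# (agent, model, sigmascore, accuracy, consistency, speed), all in percent,
# copied from the leaderboard rows above.
ROWS = [
    ("Cursor CLI", "Composer-1", 42.9, 39.1, 51.2, 39.5),
    ("Cursor CLI", "Grok Code Fast 1", 37.8, 34.4, 48.5, 32.4),
    ("Codex CLI", "GPT-5.2-Codex", 33.9, 45.9, 57.6, 14.8),
    ("Gemini CLI", "Gemini 3 Flash Preview", 33.4, 38.0, 54.8, 17.9),
    ("Codex CLI", "GPT-5.1-Codex-Max", 31.8, 44.3, 58.3, 12.5),
    ("Gemini CLI", "Gemini 3 Pro Preview", 30.3, 36.0, 53.2, 14.6),
    ("Claude Code", "Claude 4.5 Opus", 30.1, 40.9, 52.4, 12.8),
    ("Codex CLI", "GPT-5.1-Codex-Mini", 29.3, 40.0, 51.5, 12.2),
    ("Codex CLI", "GPT-5.1-Codex", 26.3, 40.2, 57.1, 7.9),
    ("Claude Code", "Claude Haiku 4.5", 24.6, 26.2, 33.0, 17.2),
    ("Gemini CLI", "Gemini 2.5 Flash", 23.0, 17.9, 37.9, 18.1),
]


def sigmascore_gaps(rows):
    """Return (model, published, estimate, gap) for each row.

    ASSUMPTION: the estimate is the geometric mean of the three metrics.
    The source page does not confirm this formula.
    """
    result = []
    for _agent, model, published, accuracy, consistency, speed in rows:
        estimate = geometric_mean([accuracy, consistency, speed])
        result.append((model, published, estimate, abs(published - estimate)))
    return result


if __name__ == "__main__":
    for model, published, estimate, gap in sigmascore_gaps(ROWS):
        print(f"{model:<24} published={published:5.1f} "
              f"estimate={estimate:6.2f} gap={gap:.2f}")
```

Because the published metrics are rounded to one decimal place, small gaps between the estimate and the published score are expected even if the guess is right. A geometric mean would also explain why a low speed score pulls an otherwise accurate agent down the ranking.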
Not yet listed (available on request): OpenCode, Warp, Amp.