Will Palmer

17th December 2025

Cursor Composer-1 is 4x Faster and Remains #1 Overall

On our first full real-world evaluation run, Cursor CLI (Composer-1) ranks #1 on Sigmabench, leading the field in overall score and showing standout performance across complex software engineering tasks.

| Metric | Cursor (Composer-1) |
| --- | --- |
| Sigmascore | 42.9% |
| Accuracy | 39.1% |
| Consistency | 50.9% |
| Speed | 39.5% |

What we measured:

Sigmascore — the overall measure of an agent’s real-world coding performance
Accuracy — how often outputs meet quality thresholds
Consistency — how often outputs remain useful even when not fully completing a task
Speed — how quickly tasks are completed
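As a rough illustration of the "how often" wording in the Accuracy and Consistency definitions, the sketch below treats each as the share of runs whose quality score clears a threshold. The per-run quality scores, both threshold values, and the `rate` helper are hypothetical and are not taken from Sigmabench; the methodology page is the reference for how the metrics are actually computed.

```python
def rate(quality_scores: list[float], threshold: float) -> float:
    """Return the fraction of runs whose quality score is at or above threshold."""
    if not quality_scores:
        raise ValueError("need at least one run")
    return sum(q >= threshold for q in quality_scores) / len(quality_scores)


# Hypothetical per-run quality scores in [0, 1]; not real benchmark data.
runs = [0.9, 0.4, 0.75, 0.2, 0.6]

# Assumed thresholds: a stricter bar for "meets quality thresholds",
# a looser one for "still useful even if the task was not fully completed".
accuracy = rate(runs, threshold=0.7)      # 2 of 5 runs -> 0.4
consistency = rate(runs, threshold=0.4)   # 4 of 5 runs -> 0.8

print(f"accuracy={accuracy:.1%} consistency={consistency:.1%}")
```

Under these assumptions Consistency can never be lower than Accuracy, because every run that clears the stricter bar also clears the looser one.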

Each score is assigned a tier according to how close it is to the other agents' scores relative to the margin of error. The top-scoring group of agents is in Tier 1, the next-best group is in Tier 2, and so on.
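One way to picture tiering is the minimal sketch below. It assumes a tier is the set of agents whose scores fall within a fixed margin of error of that tier's top score; this grouping rule, the `assign_tiers` helper, and the 3-point margin are assumptions made for illustration, not Sigmabench's published procedure. The Accuracy scores are the ones reported in this article.

```python
def assign_tiers(scores: dict[str, float], margin: float) -> dict[str, int]:
    """Group agents into tiers, best first.

    An agent joins the current tier if its score is within `margin`
    of that tier's top score; otherwise it opens the next tier.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    tiers: dict[str, int] = {}
    tier = 0
    tier_top = None
    for agent, score in ranked:
        if tier_top is None or tier_top - score > margin:
            tier += 1
            tier_top = score
        tiers[agent] = tier
    return tiers


# Accuracy scores (percent) as reported in this article.
accuracy = {"Cursor CLI": 39.1, "Codex CLI": 44.3, "Claude Code": 43.1}

# The 3.0-point margin is a hypothetical value chosen for the example.
print(assign_tiers(accuracy, margin=3.0))
# {'Codex CLI': 1, 'Claude Code': 1, 'Cursor CLI': 2}
```

With this assumed margin, Codex CLI and Claude Code share Tier 1 on Accuracy and Cursor CLI lands in Tier 2, which matches the tiers stated in the article; a different margin or grouping rule could give a different split.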

Cursor (Composer-1) is the overall leader on the Sigmabench benchmark:

It’s the only Tier 1 agent on Sigmascore
It’s the only Tier 1 agent on Speed
It’s Tier 2 on both Accuracy and Consistency

See our methodology for additional details.

Comparisons

Cursor CLI (Composer-1) vs Codex CLI (GPT-5.1-Codex-Max)

Codex is the more accurate agent in this comparison (+5.2 points), and Consistency is statistically tied. Cursor, however, leads by a large margin of 27 percentage points in Speed, representing about a 4x speed advantage.

| Metric | Cursor CLI | Codex CLI |
| --- | --- | --- |
| Sigmascore | 42.9% | 30.2% |
| Accuracy | 39.1% | 44.3% |
| Consistency | 50.9% | 49.9% |
| Speed | 39.5% | 12.5% |
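The percentage-point gaps quoted for this comparison can be recomputed directly from the reported scores. The short sketch below does only that arithmetic; the `point_gaps` helper is a name introduced here for illustration, and a positive value means Cursor CLI scores higher.

```python
def point_gaps(a: dict[str, float], b: dict[str, float]) -> dict[str, float]:
    """Return a[metric] - b[metric] in percentage points, rounded to one decimal."""
    return {metric: round(a[metric] - b[metric], 1) for metric in a}


# Scores (percent) as reported in this article.
cursor = {"Sigmascore": 42.9, "Accuracy": 39.1, "Consistency": 50.9, "Speed": 39.5}
codex = {"Sigmascore": 30.2, "Accuracy": 44.3, "Consistency": 49.9, "Speed": 12.5}

gaps = point_gaps(cursor, codex)
for metric, gap in gaps.items():
    print(f"{metric}: {gap:+.1f} points")
# Sigmascore: +12.7 points
# Accuracy: -5.2 points
# Consistency: +1.0 points
# Speed: +27.0 points
```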

Cursor CLI (Composer-1) vs Claude Code (Opus 4.5)

Claude Code (Opus 4.5) and Cursor CLI are statistically tied on Consistency, while Claude Code leads by a full tier in Accuracy. Cursor again leads by a large margin in Speed, 24.2 percentage points, representing about a 4x speed advantage.

| Metric | Cursor CLI | Claude Code |
| --- | --- | --- |
| Sigmascore | 42.9% | 32.0% |
| Accuracy | 39.1% | 43.1% |
| Consistency | 50.9% | 49.8% |
| Speed | 39.5% | 15.3% |

Key Insights

Speed advantage: Cursor CLI (Composer-1) is 4x faster on average than both Claude Code (Opus 4.5) and Codex CLI (GPT-5.1-Codex-Max).
Cursor CLI (Composer-1) trails the accuracy leader, Codex CLI (GPT-5.1-Codex-Max), by about 5 percentage points and is statistically tied on Consistency, but its speed advantage is large enough to keep the #1 Sigmascore rank overall.

Benchmarks are read-only and SOC 2-compliant.