You won't know which agent works best for you unless you benchmark your codebase.
Will Palmer
17th December 2025
Cursor Composer-1 is 4x Faster and Remains #1 Overall

On our first full real-world evaluation run, Cursor CLI (Composer-1) ranks #1 on Sigmabench, leading the field in overall score and showing standout performance across complex software engineering tasks.


Cursor (Composer-1)

| Metric      | Score |
|-------------|-------|
| Sigmascore  | 42.9% |
| Accuracy    | 39.1% |
| Consistency | 51.2% |
| Speed       | 39.5% |



What we measured:

  • Sigmascore — the overall measure of an agent’s real-world coding performance
  • Accuracy — how often outputs meet quality thresholds
  • Consistency — how often outputs remain useful even when not fully completing a task
  • Speed — how quickly tasks are completed

Each score is assigned a tier based on how close it is to the other agents' scores relative to the margin of error. The top-scoring group of agents is in Tier 1, the next-best group is in Tier 2, and so on.
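
The tiering idea can be illustrated with a minimal sketch. This is not the Sigmabench implementation: the grouping rule (an agent stays in the current tier while it is within the margin of that tier's top score) and the 3.0-point margin are assumptions made purely for illustration, and only the three agents discussed in this post are included.

```python
from typing import Dict


def assign_tiers(scores: Dict[str, float], margin: float) -> Dict[str, int]:
    """Group agents into tiers by score.

    Assumed rule (hypothetical, not taken from the Sigmabench methodology):
    walking down the ranking, an agent stays in the current tier while its
    score is within `margin` points of that tier's top score; otherwise it
    opens the next tier.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    tiers: Dict[str, int] = {}
    tier = 0
    tier_top = None
    for agent, score in ranked:
        if tier_top is None or tier_top - score > margin:
            tier += 1          # gap exceeds the margin: start a new tier
            tier_top = score   # this agent becomes the new tier's top score
        tiers[agent] = tier
    return tiers


# Scores reported in this post, in percent.
sigmascore = {"Cursor CLI": 42.9, "Codex CLI": 31.8, "Claude Code": 30.1}
accuracy = {"Cursor CLI": 39.1, "Codex CLI": 44.3, "Claude Code": 40.9}

HYPOTHETICAL_MARGIN = 3.0  # assumption: the real margin of error is not given here

sigmascore_tiers = assign_tiers(sigmascore, HYPOTHETICAL_MARGIN)
accuracy_tiers = assign_tiers(accuracy, HYPOTHETICAL_MARGIN)
print(sigmascore_tiers)
print(accuracy_tiers)
```

With these assumed inputs, Cursor CLI is alone in Tier 1 on Sigmascore and falls in Tier 2 on Accuracy. A different margin or grouping rule would produce different tiers.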

Cursor (Composer-1) is the overall leader on the Sigmabench benchmark:

  • It’s the only Tier 1 agent on Sigmascore
  • It’s the only Tier 1 agent on Speed
  • It’s Tier 2 on both Accuracy and Consistency

See our methodology for additional details.

Comparisons

Cursor CLI (Composer-1) vs Codex CLI (GPT-5.1-Codex-Max)

Codex CLI is both more accurate (+5.2 percentage points) and more consistent (+7.1 percentage points) than Cursor CLI in this comparison. Cursor CLI, however, leads on Speed by a large margin of 27 percentage points, representing about a 4x speed advantage.



| Metric      | Cursor CLI | Codex CLI |
|-------------|------------|-----------|
| Sigmascore  | 42.9%      | 31.8%     |
| Accuracy    | 39.1%      | 44.3%     |
| Consistency | 51.2%      | 58.3%     |
| Speed       | 39.5%      | 12.5%     |



Cursor CLI (Composer-1) vs Claude Code (Opus 4.5)

Claude Code (Opus 4.5) and Cursor CLI (Composer-1) are statistically tied on both Accuracy and Consistency. Cursor CLI once again leads on Speed by a large margin of about 27 percentage points (39.5% vs 12.8%), representing about a 4x speed advantage.



| Metric      | Cursor CLI | Claude Code |
|-------------|------------|-------------|
| Sigmascore  | 42.9%      | 30.1%       |
| Accuracy    | 39.1%      | 40.9%       |
| Consistency | 51.2%      | 52.4%       |
| Speed       | 39.5%      | 12.8%       |
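
The percentage-point gaps quoted in both comparisons can be checked directly from the published scores. The sketch below is illustrative only; the `point_gaps` helper is a hypothetical name, and it works on the scores as reported here. It does not attempt to reproduce the 4x figure, since this post does not include the underlying timing data.

```python
# Scores reported in this post, in percent.
scores = {
    "Cursor CLI": {"Sigmascore": 42.9, "Accuracy": 39.1, "Consistency": 51.2, "Speed": 39.5},
    "Codex CLI": {"Sigmascore": 31.8, "Accuracy": 44.3, "Consistency": 58.3, "Speed": 12.5},
    "Claude Code": {"Sigmascore": 30.1, "Accuracy": 40.9, "Consistency": 52.4, "Speed": 12.8},
}


def point_gaps(baseline: str, other: str) -> dict:
    """Return baseline-minus-other gaps per metric, in percentage points."""
    return {
        metric: round(scores[baseline][metric] - scores[other][metric], 1)
        for metric in scores[baseline]
    }


vs_codex = point_gaps("Cursor CLI", "Codex CLI")
vs_claude = point_gaps("Cursor CLI", "Claude Code")

for other, gaps in (("Codex CLI", vs_codex), ("Claude Code", vs_claude)):
    print(f"Cursor CLI vs {other}:")
    for metric, gap in gaps.items():
        # Positive values mean Cursor CLI scores higher on that metric.
        print(f"  {metric}: {gap:+.1f} points")
```

Negative values mark the metrics where the other agent leads, such as Accuracy and Consistency against Codex CLI.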

Key Insights

  • Speed advantage: Cursor CLI (Composer-1) is 4x faster on average than both Claude Code (Opus 4.5) and Codex CLI (GPT-5.1-Codex-Max).

  • Accuracy and consistency gap: Cursor CLI (Composer-1) trails the accuracy and consistency leader, Codex CLI (GPT-5.1-Codex-Max), by 5–7 percentage points, but its speed advantage is large enough to keep it at #1 on Sigmascore overall.

