You won't know which agent works best for you unless you benchmark your codebase.
Nicolas Maquet
19th December 2025
Flash Beats Pro: Gemini CLI (Gemini 3 Flash Preview) takes #2 on Sigmabench

Gemini CLI users rejoice: Gemini 3 Flash Preview is now available. Let’s take a look at the numbers!


Gemini CLI (Gemini 3 Flash Preview)

Sigmascore    33.4%
Accuracy      38.0%
Consistency   54.8%
Speed         17.9%



What we measured:

  • Sigmascore — the overall measure of an agent’s real-world coding performance
  • Accuracy — how often outputs meet quality thresholds
  • Consistency — how often outputs remain useful even when not fully completing a task
  • Speed — how quickly tasks are completed

Each score is assigned a tier based on how close it is to the other scores relative to the margin of error. The top-scoring group of agents is in Tier 1, the next-best group is in Tier 2, and so on.
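To make the idea of tiers concrete, here is a minimal sketch of one way such a grouping could work. It is not Sigmabench's actual method: the rule (a score joins the current tier if it is within the margin of that tier's best score), the function name `assign_tiers`, the agent names, the scores, and the margin are all assumptions made up for illustration.

```python
def assign_tiers(scores, margin):
    """Group scores into tiers, best first.

    Hypothetical rule (an assumption, not Sigmabench's documented method):
    a score joins the current tier if it is within `margin` points of that
    tier's best score; otherwise it opens a new, lower tier.
    """
    tiers = {}
    tier = 0
    tier_best = None
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for name, score in ranked:
        if tier_best is None or tier_best - score > margin:
            tier += 1
            tier_best = score
        tiers[name] = tier
    return tiers


# Made-up agents, scores, and margin, for illustration only.
example_scores = {"agent_a": 40.0, "agent_b": 38.5, "agent_c": 33.0, "agent_d": 32.0}
example_tiers = assign_tiers(example_scores, margin=2.0)
print(example_tiers)
```

With these made-up numbers, `agent_a` and `agent_b` share Tier 1 because they sit within 2.0 points of each other, while `agent_c` and `agent_d` form Tier 2.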

Gemini CLI (Gemini 3 Flash Preview) ranks #2 on the Sigmabench benchmark:

  • It outperforms Gemini 3 Pro Preview on every Sigmabench metric, and beats Gemini 2.5 Flash on Sigmascore, Accuracy, and Consistency while roughly matching its Speed (17.9% vs 18.1%)
  • It’s Tier 2 in Accuracy, statistically tied with the likes of Claude Code (Opus 4.5)
  • It’s Tier 1 in Consistency, which only the Codex family has achieved thus far
  • It’s Tier 2 in Speed, second only to Cursor CLI (Composer-1)
See our methodology for additional details.

Comparisons

Gemini CLI: Gemini 3 Flash Preview vs Gemini 2.5 Flash

For Gemini CLI users, Gemini 3 Flash Preview is a huge improvement over Gemini 2.5 Flash. Speed is essentially unchanged (17.9% vs 18.1%), while Accuracy more than doubles (38.0% vs 17.9%) and Consistency rises from 37.9% to 54.8%.



Metric        Gemini 3 Flash Preview   Gemini 2.5 Flash
Sigmascore    33.4%                    23.0%
Accuracy      38.0%                    17.9%
Consistency   54.8%                    37.9%
Speed         17.9%                    18.1%



Gemini CLI: Gemini 3 Flash Preview vs Gemini 3 Pro Preview

Surprisingly, Gemini 3 Flash Preview outperforms Gemini 3 Pro Preview on every Sigmabench metric. Both models were tested on the same Gemini CLI version (0.21.1), released on 18 December 2025. We also observed significantly more timeouts when running Gemini 3 Pro Preview (77 for Pro vs 43 for Flash), which hurts both its Accuracy and Consistency scores.

It’s also worth mentioning that our benchmarking of Gemini CLI has run into a known issue with loop detection. We will update our benchmark results once this issue is resolved.



Metric        Gemini 3 Flash Preview   Gemini 3 Pro Preview
Sigmascore    33.4%                    30.3%
Accuracy      38.0%                    36.0%
Consistency   54.8%                    53.2%
Speed         17.9%                    14.6%
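
For readers who want the gaps in percentage points, the short sketch below computes them from the Flash vs Pro figures in the table above. The variable and function names are illustrative only, and the snippet says nothing about statistical significance, which depends on the margin of error described in the methodology.

```python
# Scores (in %) copied from the Flash vs Pro comparison table in this post.
flash_3 = {"Sigmascore": 33.4, "Accuracy": 38.0, "Consistency": 54.8, "Speed": 17.9}
pro_3 = {"Sigmascore": 30.3, "Accuracy": 36.0, "Consistency": 53.2, "Speed": 14.6}


def point_gaps(a, b):
    """Return a - b per metric, in percentage points, rounded to one decimal."""
    return {metric: round(a[metric] - b[metric], 1) for metric in a}


gaps = point_gaps(flash_3, pro_3)
for metric, gap in gaps.items():
    print(f"{metric}: {gap:+.1f} points")

# True only if Flash leads Pro on every metric listed.
flash_leads_everywhere = all(gap > 0 for gap in gaps.values())
```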

Key Insights

  • Gemini 3 Flash Preview is a big step forward for Gemini CLI users. Its Tier 1 Consistency places it alongside the Codex family, while maintaining strong Accuracy and competitive Speed, resulting in a #2 overall Sigmascore on Sigmabench.

  • Gemini 3 Flash Preview is a clear upgrade over prior Gemini models. It significantly improves Accuracy and Consistency over Gemini 2.5 Flash, and even outperforms Gemini 3 Pro Preview across all metrics, despite Pro’s positioning as a higher-tier model.

  • Fewer timeouts and more useful partial outputs give Gemini CLI (Gemini 3 Flash Preview) a practical edge, making it one of the most dependable choices for iterative, CLI-based coding workflows today.

