Gemini CLI users rejoice: Gemini 3 Flash Preview is now available. Let’s take a look at the numbers!
Gemini CLI (Gemini 3 Flash Preview) scores: 33.4%, 38.0%, 54.8%, 17.9%
What we measured:
Each score is assigned a tier based on how close it is to the other scores relative to the margin of error. The top-scoring group of agents is in Tier 1, the next-best group is in Tier 2, and so on.
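To make the tiering idea concrete, here is a minimal Python sketch. It is not Sigmabench's actual procedure: the grouping rule (an agent stays in the current tier while its score is within the margin of error of that tier's top score), the function name `assign_tiers`, and the example scores and margin are all assumptions made for illustration.

```python
from typing import Dict, Optional


def assign_tiers(scores: Dict[str, float], margin: float) -> Dict[str, int]:
    """Group agents into tiers, best first.

    Assumed rule (not confirmed by Sigmabench): an agent joins the current
    tier when its score is within `margin` of that tier's top score;
    otherwise it opens the next tier.
    """
    tiers: Dict[str, int] = {}
    tier = 0
    tier_top: Optional[float] = None
    # Walk agents from highest to lowest score.
    for agent, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        if tier_top is None or tier_top - score > margin:
            tier += 1          # gap exceeds the margin: start a new tier
            tier_top = score   # this agent becomes the new tier's top score
        tiers[agent] = tier
    return tiers


# Hypothetical scores and margin of error, for illustration only.
example = {"agent_a": 55.0, "agent_b": 53.5, "agent_c": 48.0, "agent_d": 47.0}
result = assign_tiers(example, margin=2.0)
print(result)
```

With these made-up inputs, `agent_a` and `agent_b` land in Tier 1 because they are within 2.0 points of each other, while `agent_c` and `agent_d` form Tier 2. Other rules are possible, such as comparing each agent with its nearest neighbour instead of the tier leader, and they can produce different groupings.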
Gemini CLI (Gemini 3 Flash Preview) ranks #2 on the Sigmabench benchmark:
For Gemini CLI users, Gemini 3 Flash Preview is a huge improvement over Gemini 2.5 Flash. It’s still just as fast, but it’s a lot more accurate and consistent.
Gemini 3 Flash Preview vs Gemini 2.5 Flash: 33.4% vs 23.0%, 38.0% vs 17.9%, 54.8% vs 37.9%, 17.9% vs 18.1%
Surprisingly, Gemini 3 Flash Preview outperforms Gemini 3 Pro Preview on every Sigmabench metric. Note that both models were tested on the same Gemini CLI version (0.21.1), released on 18 December 2025. Also note that we observed a significantly higher number of timeouts when running Gemini 3 Pro Preview (77 for Pro vs 43 for Flash), which hurts both its Accuracy and Consistency scores.
It’s also worth mentioning that our benchmarking of Gemini CLI ran into a known issue with loop detection. We will update our benchmark results once this issue is resolved.
Gemini 3 Flash Preview vs Gemini 3 Pro Preview: 33.4% vs 30.3%, 38.0% vs 36.0%, 54.8% vs 53.2%, 17.9% vs 14.6%
Gemini 3 Flash Preview is a big step forward for Gemini CLI users. Its Tier 1 Consistency places it alongside the Codex family, while maintaining strong Accuracy and competitive Speed, resulting in a #2 overall Sigmascore on Sigmabench.
Gemini 3 Flash Preview is a clear upgrade over prior Gemini models. It significantly improves Accuracy and Consistency over Gemini 2.5 Flash, and even outperforms Gemini 3 Pro Preview across all metrics, despite Pro’s positioning as a higher-tier model.
Lower timeout rates and more dependable partial outputs give Gemini CLI (Gemini 3 Flash Preview) a practical edge, making it one of the most dependable choices for iterative, CLI-based coding workflows today.