You won't know which agent works best for you unless you benchmark your codebase.
Nicolas Maquet
30th January 2026

OpenCode + Kimi K2.5: Open Source, Open Weight Agentic Coding is Here

Moonshot AI shook the AI agent industry this week with the release of Kimi K2.5, an open-weight model with astounding performance. The OpenCode team immediately added support for it and is giving it high praise.

Dax Raad, one of the lead developers of OpenCode, recently shared his experience with Kimi K2.5:


Faster and cheaper than Claude 4.5 Opus while not being quite as smart? Let’s find out!
We benchmarked OpenCode (Kimi K2.5) and compared it to Claude Code (Claude 4.5 Opus).

What we measured:

  • Sigmascore — the overall measure of an agent’s real-world coding performance
  • Accuracy — how often outputs meet quality thresholds
  • Consistency — how often outputs remain useful even when not fully completing a task
  • Speed — how quickly tasks are completed

Each score is assigned a tier based on how close the scores are relative to the margin of error: agents whose scores are statistically indistinguishable from one another share a tier (see the sketch below). The top-scoring group of agents is in Tier 1, the next-best group is in Tier 2, and so on.
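
As a rough illustration, here is a minimal sketch of how such a tiering scheme could work. The grouping rule (agents within the margin of error of the current tier’s leader share that tier) is our own assumption, not the actual Sigmabench implementation:

```python
def assign_tiers(scores: dict[str, float], margin: float) -> dict[str, int]:
    """Illustrative tier assignment: walk the agents from best to worst
    score; an agent joins the current tier while it is within `margin`
    of that tier's top score, otherwise it opens a new tier."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    tiers: dict[str, int] = {}
    tier, tier_top = 0, None
    for agent, score in ranked:
        if tier_top is None or tier_top - score > margin:
            tier += 1          # score falls outside the margin: new tier
            tier_top = score
        tiers[agent] = tier
    return tiers

# Example with the Sigmascores from the table below and a hypothetical
# 2-point margin of error: both agents land in Tier 1.
print(assign_tiers({"OpenCode": 32.7, "Claude Code": 32.0}, margin=2.0))
```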

See our methodology for additional details.

OpenCode (Kimi K2.5) vs Claude Code (Claude 4.5 Opus)

The two agents have quite different performance profiles. Claude Code (Claude 4.5 Opus) is more accurate and more consistent (by a full tier each), but OpenCode (Kimi K2.5) is a lot faster (by two full tiers).

The Sigmabench benchmark has been carefully calibrated so that tiers represent meaningful and perceptible differences in performance. Dax’s empirical observation that Kimi K2.5 is “so fast” while Opus 4.5 is “a bit smarter” matches the Sigmabench results closely.


Metric         OpenCode (Kimi K2.5)   Claude Code (Claude 4.5 Opus)
Sigmascore     32.7%                  32.0%
Accuracy       39.4%                  43.1%
Consistency    46.8%                  49.8%
Speed          18.9%                  15.3%

Median Runtime Comparison

OpenCode (Kimi K2.5) is about 22% faster (315s vs 405s) than Claude Code (Claude 4.5 Opus).
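
The speedup figure is simply the relative difference between the two median runtimes; a quick sanity check:

```python
opencode_s, claude_s = 315, 405               # median runtimes in seconds
speedup = (claude_s - opencode_s) / claude_s  # fraction of time saved
print(f"OpenCode is {speedup:.0%} faster")    # -> OpenCode is 22% faster
```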

At Tier 4, OpenCode (Kimi K2.5) will definitely feel a lot faster than Claude Code, Codex CLI, and Gemini CLI using their respective flagship models (all at Tier 6).

For reference, the chart below includes Cursor CLI (Composer 1) and its blazing-fast Tier 1 speed.



Benchmarking Cost Comparison

The chart below shows our best estimate for the cost of running Sigmabench on each agent/model combination. The difference in inference costs is staggering.

Running OpenCode (Kimi K2.5) is 88% cheaper than Claude Code (Claude 4.5 Opus) at API prices.

Compared to Codex CLI and Gemini CLI running their flagship models, OpenCode (Kimi K2.5) is 73% less expensive.



Exact benchmarking costs are difficult to report due to failures, retries, timeouts, and so on.
The figures above should be accurate to within ±5%.
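
To estimate costs for your own workloads, the same tokens-times-price arithmetic applies. The sketch below is a minimal example; the PRICES values are placeholder assumptions, not actual rate cards, so substitute the providers’ published per-million-token prices:

```python
# Estimate a benchmark run's API cost from token usage.
# NOTE: these per-1M-token prices are hypothetical placeholders,
# not the providers' actual rates.
PRICES = {
    "kimi-k2.5":       {"input": 0.60,  "output": 2.50},
    "claude-4.5-opus": {"input": 15.00, "output": 75.00},
}

def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one run, given token counts and per-1M-token prices."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a run consuming 40M input and 5M output tokens per agent.
for model in PRICES:
    print(model, f"${run_cost(model, 40_000_000, 5_000_000):,.2f}")
```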

A New Benchmark for Open Source Agentic Coding

The pairing of OpenCode and Kimi K2.5 represents a new benchmark for what’s possible with open source software and open weight AI models.

Agents from the top AI labs still lead on raw intelligence, but the gap is now narrow enough that speed, cost, and access to open weights may become decisive factors for more and more use cases. For many real-world coding workflows, OpenCode (Kimi K2.5) offers a compelling trade-off that simply didn’t exist before.

This release marks a shift from open weight models being “good for the price” to being genuinely competitive on raw capability. If this trajectory continues, the distinction between proprietary and open agentic coding systems will increasingly come down to ecosystem and ergonomics rather than performance alone.

