What is Kimi K2.7-Code?

Kimi K2.7-Code is a 1 trillion-parameter mixture-of-experts (MoE) coding model released by Moonshot AI on June 12, 2026 under a Modified MIT license. It activates 32B parameters per token across 384 experts, supports a 256K-token context window and multimodal inputs via MoonViT, and requires thinking mode; it cannot run in non-thinking mode.

How does Kimi K2.7-Code compare to K2.6?

K2.7-Code is a coding-specialist refresh of K2.6 on the same 1T MoE backbone. Moonshot reports +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, +31.5% on MLS Bench Lite, and ~10% gains on agentic suites (MCP Mark, MCP Atlas, Kimi Claw 24/7). Reasoning token usage drops ~30%. For general writing and conversation, Moonshot still recommends K2.6.

How does Kimi K2.7-Code compare to Claude Opus 4.8 and Fable 5?

On Moonshot's published tables, K2.7-Code trails Opus 4.8 on Kimi Code Bench v2 (62.0 vs 67.4) and Program Bench (53.6 vs 63.8) but beats Opus on MCP Mark Verified (81.1 vs 76.4) — a human-checked MCP tool-use benchmark. Community agent rankings often place K2.7 above Opus 4.8 but below Fable 5. Fable 5 remains offline for general users as of June 30, 2026 (Day 18 of the US export ban).

What does Kimi K2.7-Code cost on the API?

Kimi API pricing: $0.95 per million input tokens (cache miss), $0.19 per million on cache hit, $4.00 per million output tokens, 262,144-token context. Kimi Code subscription plans start at $15/month (Moderato) for terminal and IDE plugin access. See platform.kimi.ai for current rates.

Can I self-host Kimi K2.7-Code?

Yes. Weights are on Hugging Face under Modified MIT. Moonshot documents deployment with vLLM, SGLang, and KTransformers. A 1T MoE with 32B active parameters requires multi-GPU server hardware. The commonly quoted range of 2–8× A100/H100 class GPUs for production inference assumes quantization and expert offloading, because all 1T parameters must be stored somewhere even though only 32B are active per token. Verify the model card for exact memory specs.

Why is K2.7-Code significant during the Fable 5 ban?

US export controls suspended Fable 5 for all general users on June 12 — the same day K2.7-Code launched. International developers locked out of Anthropic's frontier model gained an open-weight coding alternative with no geographic restrictions, Modified MIT licensing, and MCP-native agent benchmarks. See our Day 18 Fable status hub for the latest ban timeline.

Kimi K2.7-Code: 1T MoE Open Coding Model — Full Guide (2026) | explainx.ai Blog

Update — July 17, 2026: Kimi K3 tops nextjs.org/evals and Arena Frontend Code — surpassing Fable 5; open weights promised July 27. K2.7 remains the open-weight coding model with weights public today — see also K3 local desktop prep and K3 mobile guide. Main API hub: Kimi K3 guide.

Update — July 16, 2026 (evening): Kimi K3 has launched — 2.8T parameters, open-weight timeline, API live at platform.kimi.ai.

Update — July 3, 2026: Kimi K2.7 Code is generally available in GitHub Copilot — the first open-weight model in Copilot's model picker, hosted on Microsoft Azure. Pro/Pro+/Max rollout first; Business and Enterprise require admin policy enablement. Full setup guide: Kimi K2.7 in GitHub Copilot.

Update — June 30, 2026: Eighteen days into the Fable 5 export ban, Kimi K2.7-Code remains one of the most cited open-weight alternatives in developer threads — especially for long-horizon agent coding and MCP tool use. This guide expands on architecture, benchmarks, pricing, and how to run it in production.

TL;DR: On June 12, 2026 — the same day the US government suspended Claude Fable 5 — Moonshot AI released Kimi K2.7-Code: a 1T-parameter MoE coding model with 32B active parameters, 256K context, and open weights under Modified MIT. Vendor benchmarks show +21.8% on Kimi Code Bench v2 vs K2.6 and ~30% fewer reasoning tokens. Notably, MCP Mark Verified scores 81.1% vs Opus 4.8 at 76.4% on Moonshot's published table. API: $0.95/M input (cache miss), $4/M output.

Spec	Value
Release	June 12, 2026 (Moonshot announcement)
Architecture	MoE — 1T total, 32B active, 384 experts, 8 selected + 1 shared per token
Context	256K (262,144 tokens)
License	Modified MIT (open weights on Hugging Face)
Thinking mode	Required — non-thinking requests fall back to K2.6
API input	$0.95/M cache miss · $0.19/M cache hit
API output	$4.00/M
Best for	Long-horizon agent coding, MCP workflows, self-hosted SWE agents
Not for	General chat — use K2.6 instead

What Moonshot AI shipped — and why the timing mattered

Kimi K2.7-Code is not a bigger model. It is a coding-specialist execution refresh of the K2 MoE backbone Moonshot has iterated since early 2026.

Milestone	Date	Notes
K2 series	Early 2026	1T MoE foundation established
Kimi K2.6	April 20, 2026	General-purpose K2 refresh
Kimi K2.7-Code	June 12, 2026	Coding + agent specialization
Fable 5 global suspension	June 12, 2026	Same day — US export control

Moonshot's pitch: real software engineering is long-horizon — refactors across files, multi-step debugging, agent sessions that run for hours. K2.7-Code retrains the reward model and data pipeline around those workflows rather than single-shot completion.

The model is multimodal via MoonViT — a 400M-parameter vision encoder — supporting text, image, and video input alongside code. For teams piping UI screenshots or architecture diagrams into agent loops, that matters.

Official resource page: kimi.com/resources/kimi-k2-7-code.

Architecture — what 1T MoE actually means here

K2.7-Code shares the K2 MoE stack. The numbers from Moonshot's model card:

Parameter	Value
Total parameters	1 trillion
Activated per token	32 billion
Experts	384 (8 selected + 1 shared per token)
Layers	61 (1 dense + 60 MoE)
Attention	Multi-head Latent Attention (MLA), 64 heads, hidden dim 7168
MoE expert hidden dim	2048 per expert
Activation	SwiGLU
Vocabulary	160K tokens
Vision	MoonViT (400M params)

Why MoE for coding agents: you get frontier-scale total capacity without activating every parameter on every token — important when agent transcripts balloon with tool outputs, diffs, and stderr logs. The 256K context window (262,144 tokens) is sized for repository-scale traces, not just single-file edits.
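To make the "8 selected + 1 shared" figure concrete, here is a minimal routing sketch in plain Python. It is an illustration only: the softmax gate over the selected experts, the random logits, and the `route_token` name are assumptions, since the source does not describe Moonshot's actual gating function.

```python
import math
import random

NUM_ROUTED_EXPERTS = 384   # from the architecture table
TOP_K = 8                  # routed experts selected per token
SHARED_EXPERT = "shared"   # always-on expert; the label is illustrative


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k routed experts and normalise their gate weights.

    Softmax over the selected logits is an assumption for illustration,
    not a description of the real K2 router.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return {i: e / total for i, e in zip(chosen, exps)}


rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_ROUTED_EXPERTS)]  # stand-in router output
gates = route_token(logits)
active_experts = list(gates) + [SHARED_EXPERT]

print(len(active_experts))  # 9 experts touch this token: 8 routed + 1 shared
print(32e9 / 1e12)          # 0.032, i.e. about 3.2% of total parameters are active
```

The point of the sketch is the ratio: every token pays compute for roughly 3.2% of the parameters, while the full 1T still has to be stored.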

Thinking mode is mandatory. K2.7-Code does not support non-thinking inference. On Kimi Code and the Kimi API, requests with thinking disabled are automatically routed to K2.6. Plan token budgets accordingly — "thinking" tokens are part of the cost model, though Moonshot claims ~30% fewer reasoning tokens vs K2.6 on the same tasks.

Benchmarks — coding, agentic, and the honesty filter

Moonshot evaluated K2.7-Code vs K2.6, GPT-5.5 (Codex xhigh), and Claude Opus 4.8 (Claude Code xhigh). Methodology: K2.7/K2.6 via Kimi Code CLI, thinking enabled, temperature 1.0, top-p 0.95, 262K context. Full details on the Hugging Face model card.

Coding benchmarks

Benchmark	K2.6	K2.7-Code	GPT-5.5	Opus 4.8	Δ vs K2.6
Kimi Code Bench v2	50.9	62.0	69.0	67.4	+21.8%
Program Bench	48.3	53.6	69.1	63.8	+11.0%
MLS Bench Lite	26.7	35.1	35.5	42.8	+31.5%

Kimi Code Bench v2 — Moonshot in-house coding suite (absolute score 62.0 for K2.7).
Program Bench — program synthesis evaluation.
MLS Bench Lite — multi-language, multi-step coding.

K2.7 trails Opus 4.8 and GPT-5.5 on all three Moonshot-run coding benchmarks, but it narrows the gap that K2.6 left, especially on MLS Bench Lite, where it nearly matches GPT-5.5 (35.1 vs 35.5).
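The Δ column is a relative improvement over K2.6, not a difference in percentage points. A short check, using only the scores from the table above, reproduces the published figures; the `relative_gain_pct` helper is our own naming.

```python
# Scores copied from the coding benchmark table: (K2.6, K2.7-Code).
CODING_SCORES = {
    "Kimi Code Bench v2": (50.9, 62.0),
    "Program Bench": (48.3, 53.6),
    "MLS Bench Lite": (26.7, 35.1),
}


def relative_gain_pct(old, new):
    """Relative improvement of new over old, in percent."""
    return (new - old) / old * 100.0


gains = {
    name: round(relative_gain_pct(old, new), 1)
    for name, (old, new) in CODING_SCORES.items()
}

for name, gain in gains.items():
    old, new = CODING_SCORES[name]
    print(f"{name}: +{gain}% relative, +{new - old:.1f} points absolute")
```

Reading the two numbers side by side matters: +31.5% on MLS Bench Lite is an 8.4-point move from a low base.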

Agentic benchmarks — where K2.7 stands out

Benchmark	K2.6	K2.7-Code	GPT-5.5	Opus 4.8	What it measures
Kimi Claw 24/7 Bench	42.9	46.9	52.8	50.4	Persistent multi-day agent coworking (SWE, ML, recruiting, etc.)
MCP Atlas	69.4	76.0	79.4	81.3	Broad MCP tool orchestration
MCP Mark Verified	72.8	81.1	92.9	76.4	Human-verified MCP tasks (Notion, GitHub, FS, Postgres, Playwright)

MCP Mark Verified is the headline for agent builders: K2.7-Code at 81.1% vs Opus 4.8 at 76.4% on Moonshot's table — a benchmark focused on correct tool invocation through Model Context Protocol servers. That aligns with the June 2026 MCP wave (X hosted MCP, Claude Code MCP guide, explainx.ai's /mcp-servers directory).

GPT-5.5 still leads MCP Mark at 92.9% — but K2.7 is the open-weight option in this comparison set.
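A quick head-to-head over the agentic table shows where K2.7-Code actually leads. The sketch below uses only the published scores; the `margins` and `wins` helpers are illustrative names.

```python
# Scores copied from the agentic benchmark table.
AGENTIC_SCORES = {
    "Kimi Claw 24/7 Bench": {"K2.6": 42.9, "K2.7-Code": 46.9, "GPT-5.5": 52.8, "Opus 4.8": 50.4},
    "MCP Atlas": {"K2.6": 69.4, "K2.7-Code": 76.0, "GPT-5.5": 79.4, "Opus 4.8": 81.3},
    "MCP Mark Verified": {"K2.6": 72.8, "K2.7-Code": 81.1, "GPT-5.5": 92.9, "Opus 4.8": 76.4},
}


def margins(model, rival):
    """Point difference (model minus rival) on each benchmark."""
    return {
        bench: round(scores[model] - scores[rival], 1)
        for bench, scores in AGENTIC_SCORES.items()
    }


def wins(model, rival):
    """Benchmarks where model scores strictly higher than rival."""
    return [bench for bench, margin in margins(model, rival).items() if margin > 0]


print(margins("K2.7-Code", "Opus 4.8"))
print(wins("K2.7-Code", "Opus 4.8"))  # ['MCP Mark Verified']
print(wins("K2.7-Code", "GPT-5.5"))   # []
```

On these vendor-reported numbers, the lead over Opus 4.8 is specific to MCP Mark Verified; Opus stays ahead on the other two agentic suites.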

What is still missing from public leaderboards

As of late June 2026, Moonshot did not ship launch-day scores on cross-vendor suites like SWE-bench Verified, SWE-bench Pro, Terminal-Bench 2.0, LiveCodeBench, or Aider Polyglot. Community reproductions typically land within 1–2 weeks of open-weight drops — check DevThrottle scoreboards and independent evaluators before treating vendor tables as field position.

Rule: vendor benchmarks measure delta and direction. Your eval on your repo is the number that matters.

Reasoning efficiency — the 30% token cut

Reasoning models often overthink — burning thousands of tokens on problems that do not need deep deliberation. Moonshot claims K2.7-Code cuts thinking-token usage ~30% vs K2.6 while scoring higher on the same benchmarks.

For production agent loops, that compounds:

Workload	Effect of 30% fewer reasoning tokens
Interactive Kimi Code sessions	Faster turn-around, less waiting on chain-of-thought
API agent runs	Lower bill per completed task (thinking tokens are billed)
Long-horizon traces	More room inside 256K for actual code and tool output

Pair with automatic context caching on the Kimi API ($0.19/M cache hit vs $0.95/M cache miss) when agent harnesses reuse system prompts and repo context across turns.
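The "more room inside 256K" effect is simple arithmetic. The sketch below assumes a hypothetical trace (60,000 reasoning tokens on K2.6, 150,000 tokens of prompts, code, and tool output) and applies the vendor-reported 30% cut; measure your own traces before relying on it.

```python
CONTEXT_WINDOW = 256 * 1024   # 262,144 tokens, as stated in the spec table
REASONING_CUT = 0.30          # vendor-reported reduction vs K2.6


def headroom_after_cut(reasoning_tokens_k26, other_tokens):
    """Free tokens in the window before and after the claimed reasoning cut.

    Both arguments are hypothetical measurements from your own traces,
    not figures published by Moonshot.
    """
    reasoning_k27 = round(reasoning_tokens_k26 * (1 - REASONING_CUT))
    before = CONTEXT_WINDOW - other_tokens - reasoning_tokens_k26
    after = CONTEXT_WINDOW - other_tokens - reasoning_k27
    return before, after


before, after = headroom_after_cut(reasoning_tokens_k26=60_000, other_tokens=150_000)
print(before, after)  # 52144 70144
```

In this made-up trace the cut frees 18,000 tokens, which is room for several more tool outputs or diffs before the window fills.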

Pricing — API, subscriptions, and vs closed frontiers

Kimi API (usage-based)

Model	Input (cache hit)	Input (cache miss)	Output	Context
kimi-k2.7-code	$0.19 / 1M	$0.95 / 1M	$4.00 / 1M	262,144

Source: Moonshot K2.7 resource page. Prices exclude tax; verify platform.kimi.ai for updates.
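A small estimator makes the cache-hit discount tangible. It uses the three published rates; the function name and the assumption that thinking tokens are counted as output tokens at the output rate are ours, so confirm the billing rule on platform.kimi.ai.

```python
# Published per-million-token rates for kimi-k2.7-code (USD).
PRICE_PER_M = {"input_cache_hit": 0.19, "input_cache_miss": 0.95, "output": 4.00}


def estimate_cost_usd(input_tokens, cache_hit_tokens, output_tokens):
    """Estimate one request's cost in USD.

    Assumes thinking tokens are included in output_tokens and billed at
    the output rate; that is an assumption, not a documented rule.
    """
    if not 0 <= cache_hit_tokens <= input_tokens:
        raise ValueError("cache_hit_tokens must be between 0 and input_tokens")
    cache_miss_tokens = input_tokens - cache_hit_tokens
    total = (
        cache_hit_tokens * PRICE_PER_M["input_cache_hit"]
        + cache_miss_tokens * PRICE_PER_M["input_cache_miss"]
        + output_tokens * PRICE_PER_M["output"]
    )
    return total / 1_000_000


# Hypothetical agent turn: 200K input tokens, 8K output tokens.
cold = estimate_cost_usd(200_000, cache_hit_tokens=0, output_tokens=8_000)
warm = estimate_cost_usd(200_000, cache_hit_tokens=180_000, output_tokens=8_000)
print(f"cold cache: ${cold:.4f}, warm cache: ${warm:.4f}")
```

With 90% of the input served from cache, the example turn costs about $0.085 instead of $0.222, which is why harnesses that keep system prompts and repo context stable across turns benefit most.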

Rough comparison: Opus-class closed models often run $15–75/M output depending on tier. At $4/M output and sub-dollar input, K2.7-Code is priced for high-volume agent workloads — the same profile OpenRouter Fusion targets for multi-model routing.

Kimi Code subscription plans (terminal + IDE)

For developers using Kimi Code directly (terminal and IDE plugins, K2.7 as default model):

Plan	Monthly (annual billing)	Best for
Moderato	$15	Regular coding workflows, weekly refreshed quotas
Allegretto	$31	Larger weekly limits, higher concurrency
Allegro	$79	Intensive development, complex projects
Vivace	$159	Largest weekly quotas, big codebases

Each tier includes weekly refreshed usage limits and rising concurrency caps. K2.7-Code is the default model in Kimi Code with thinking enabled.

How to access K2.7-Code

1. Kimi Code (fastest path)

Kimi Code — Moonshot's coding agent surface. K2.7-Code is the default model. Setup instructions on the page. Good for trying the model before committing to self-host infrastructure.

2. Kimi API

platform.kimi.ai: REST API for agents, IDEs, and custom harnesses. Model id: kimi-k2.7-code. The same thinking-only constraint applies: requests with thinking disabled are routed to K2.6.

3. Hugging Face weights (self-host)

Download open weights from Moonshot's Hugging Face repository (verify exact repo name on release — search moonshotai/Kimi-K2.7-Code). License: Modified MIT — read the README for commercial restrictions before shipping a product.

Inference stacks Moonshot documents:

vLLM — production serving, continuous batching
SGLang — structured generation, MoE-friendly paths
KTransformers — optimized MoE inference

Hardware reality check: 1T MoE with 32B active parameters is not a laptop model. Enterprise guides in our Fable alternatives post cite 2–8× A100/H100 or 4× RTX 4090 as starting points for MoE-class open weights — adjust for quantization (AWQ/GPTQ) and your concurrency target.
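A back-of-envelope estimate helps size the hardware. The sketch below counts weights only, assuming 80 GB cards and uniform precision; it ignores KV cache, activations, and quantization overhead, and real checkpoints often mix precisions, so treat the output as a floor rather than a spec.

```python
import math

TOTAL_PARAMS = 1_000_000_000_000  # 1T total parameters; every expert must be stored
GPU_MEMORY_GB = 80                # assumption: 80 GB A100/H100 variant


def weight_memory_gb(bits_per_param):
    """Weights-only footprint in decimal gigabytes."""
    return TOTAL_PARAMS * bits_per_param / 8 / 1e9


def min_gpus_for_weights(bits_per_param, gpu_memory_gb=GPU_MEMORY_GB):
    """Smallest GPU count whose combined memory holds the weights alone."""
    return math.ceil(weight_memory_gb(bits_per_param) / gpu_memory_gb)


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(bits):.0f} GB, >= {min_gpus_for_weights(bits)} GPUs")
```

Under these assumptions even 4-bit weights need about 500 GB, so the smaller configurations quoted above only work with expert offloading to CPU memory or disk, at a throughput cost.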

4. Third-party routers

K2.7-Code appears on aggregators like OpenRouter and Fireworks (verify availability at publish time). Useful when you already route through OpenRouter Fusion and want one API key for GLM-5.2, Kimi, and Qwen.

K2.7-Code vs K2.6 — which Kimi should you use?

Moonshot's own guidance is clear:

Use case	Model
Software engineering agents, refactors, MCP tool loops, repo-scale context	K2.7-Code
General writing, analysis, conversation, non-coding tasks	K2.6

K2.7 is a specialist — not a drop-in replacement for every K2.6 workload. If your harness sends thinking: false, you are already on K2.6 whether you intended to or not.
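Because the fallback is silent, a harness can guard against it before sending anything. This is a sketch of the documented routing rule, not Moonshot's implementation: the `kimi-k2.6` id, the config keys, and the assumption that thinking defaults to enabled are all hypothetical.

```python
K27_MODEL_ID = "kimi-k2.7-code"  # model id given in this guide
K26_MODEL_ID = "kimi-k2.6"       # hypothetical id, used only for illustration


def effective_model(requested_model, thinking_enabled):
    """Model that would serve the request under the documented fallback rule."""
    if requested_model == K27_MODEL_ID and not thinking_enabled:
        return K26_MODEL_ID
    return requested_model


def assert_no_silent_fallback(harness_config):
    """Raise if the harness config would be rerouted to a different model.

    The "model" and "thinking" keys are hypothetical config names; the
    default of thinking=True when the key is absent is also an assumption.
    """
    requested = harness_config["model"]
    served = effective_model(requested, harness_config.get("thinking", True))
    if served != requested:
        raise ValueError(f"requested {requested} but would be served by {served}")


assert_no_silent_fallback({"model": K27_MODEL_ID, "thinking": True})  # passes quietly
```

Running a check like this in CI keeps benchmark runs honest: a harness that disables thinking would otherwise report K2.6 results under the K2.7 label.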

How K2.7-Code fits the open-weight landscape (June 2026)

With Fable 5 offline for 18 days, teams compare open coders on three axes: benchmarks, license, and agent harness fit.

Model	License	Strength (vendor/community)	Fable-ban relevance
Kimi K2.7-Code	Modified MIT	Long-horizon agents, MCP tool use, 256K context	Strong international API + self-host path
GLM-5.2	MIT	BridgeBench reasoning, Zhipu open-weight cadence	Same week as ban; fully open
DeepSeek V4-Pro	Open weights	Cost-efficient coding, strong public bench culture	Popular API alternative
Sakana Fugu Ultra	API	Orchestrates public models — no restricted weights needed	SWE-bench Pro 73.7 without Fable in pool
Opus 4.8	Closed API	Anthropic's official Fable fallback	Available globally; not Fable-class

Many teams run GLM-5.2 for reasoning-heavy tickets and K2.7 for MCP-heavy agent loops — or route both through OpenRouter. See Claude Code vs Codex vs Gemini vs GLM-5.2 for harness-level comparisons.

Community reception — signals, not verdicts

Developer reaction on X after launch was strongly positive. Treat these as pointers for your eval queue, not production decisions:

Jun Song (Qwen ambassador, local LLM ecosystem) — informal agent ranking after hands-on tests:

Fable > Kimi-2.7 > Opus-4.8 = GLM-5.2 > GPT5.5 > Minimax-M3

xjdr (AI infrastructure): "k2.7 has been extremely impressive so far (as was k2.6 before it). Fantastic job Moonshot team."

Noctus: "Some of the things it pulls off in a single shot are absurd."

POM: "GLM 5.2 and Kimi 2.7 are another impressive leap forward—feeling around GPT5.4/Claude 4.6-level."

International Cyber Digest (Fable ban context):

"While the US is restricting and banning its frontier AI models, the Chinese are open-sourcing theirs."

Community rank ≠ SWE-bench Verified. Run 500 real tickets from your backlog before switching production defaults.

Modified MIT license — what to verify

Open weights under Modified MIT are more permissive than CC-BY-NC, but Modified means Moonshot added terms beyond standard MIT. Before commercial deployment:

Read the full license in the Hugging Face repo README
Check attribution requirements
Confirm redistribution rules for fine-tunes and derivatives
Align with your legal/compliance team if you ship customer-facing products

Self-hosting removes per-token API cost but adds hardware, ops, and security ownership — same trade-off as any open MoE at trillion-parameter scale.

The Fable 5 context — Day 18 and counting

K2.7-Code launched on June 12, 2026 — the same evening Anthropic disabled Fable 5 and Mythos 5 globally. As of June 30 (Day 18):

Fable 5 remains offline for general users (live status)
Mythos 5 partially restored for US Annex A critical-infrastructure orgs only
Axios reporting (June 27) suggested a near-term Fable lift — unconfirmed after the weekend
International developers in Europe, the UK, and India still have no bilateral restoration path

K2.7-Code does not "replace" Fable for every workload: community tests ranked Fable above K2.7 while Fable was still available. It does provide an unrestricted open-weight coding frontier while negotiations to bring the Anthropic models back online continue.

Same-week agent infrastructure: Cursor for iOS and X hosted MCP — many teams pair Kimi/GLM for model inference with Cursor/MCP for harness and data plane. June 30: Meituan shipped LongCat-2.0 — 1.6T MoE, 48B active, Terminal-Bench 70.8 — another open-weight entry in the same ladder.

Production evaluation checklist

If you are benchmarking K2.7-Code against Opus 4.8 or a pre-ban Fable baseline:

Task distribution — agent loop completion rate, tool-call accuracy, single-shot generation on your repo layout
MCP harness — if you use Notion/GitHub/Postgres tools, run the same suite you would on Claude Code with MCP
Context utilization — does coherence hold at 80K, 120K, 200K tokens on your longest real traces?
Thinking token budget — measure tokens per solved task vs K2.6 and vs Opus; the 30% claim is vendor-reported
Latency — API p95 under your concurrency; self-hosted throughput with your GPU config
License — Modified MIT compliance for your shipping model
Fallback — keep Opus 4.8 or OpenRouter Fusion routing for regression comparison
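The first, second, and fourth checklist items reduce to three numbers per model. The sketch below shows one way to compute them from your own run logs; the `TaskRun` fields and the sample records are hypothetical, and "thinking tokens per solved task" is defined here as all thinking tokens spent divided by tasks solved.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    """One agent attempt at one ticket; field names are illustrative."""
    task_id: str
    solved: bool
    tool_calls: int
    tool_calls_correct: int
    thinking_tokens: int


def summarize(runs):
    """Completion rate, tool-call accuracy, and thinking tokens per solved task."""
    solved_count = sum(1 for r in runs if r.solved)
    total_calls = sum(r.tool_calls for r in runs)
    total_thinking = sum(r.thinking_tokens for r in runs)
    return {
        "completion_rate": solved_count / len(runs) if runs else 0.0,
        "tool_call_accuracy": (
            sum(r.tool_calls_correct for r in runs) / total_calls if total_calls else 0.0
        ),
        # Failed attempts still cost tokens, so they count in the numerator.
        "thinking_tokens_per_solved_task": (
            total_thinking / solved_count if solved_count else float("inf")
        ),
    }


# Made-up sample data for illustration only.
sample_runs = [
    TaskRun("T-1", True, 10, 9, 4_000),
    TaskRun("T-2", True, 5, 5, 2_000),
    TaskRun("T-3", False, 8, 4, 6_000),
    TaskRun("T-4", True, 2, 2, 3_000),
]
summary = summarize(sample_runs)
print(summary)
```

Run the same summary for K2.7-Code, K2.6, and Opus 4.8 on an identical ticket set; comparing the third number across models is how you test the 30% claim on your own workload.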

Bottom line

Kimi K2.7-Code is a credible open-weight frontier coding model — not because Moonshot says so, but because the architecture (1T MoE, 256K, mandatory thinking), agentic benchmark direction (especially MCP Mark vs Opus), pricing, and license form a coherent package for teams building long-horizon software agents.

It launched the same day as the Fable 5 ban — and eighteen days later, that ban is still driving international developers toward open Chinese coders while US negotiations continue.

Do not trust our summary or Moonshot's tables alone. Download the weights or hit the API. Run your eval. The benchmark that matters is the one on your codebase.

Kimi K2.7-Code specs, benchmarks, and pricing accurate as of June 30, 2026 per Moonshot's official resource page. Verify Hugging Face repo, API rates, and license terms before production deployment.

Related posts

Kimi K2.7 Code in GitHub Copilot: First Open-Weight Model

LongCat-2.0: Meituan's 1.6T MoE Open Model Trained on AI ASIC Superpods

Cohere North Mini Code: Open-Source Agentic Coding Model (Apache 2.0)
