MoonshotAI: Kimi K2 0711

Kimi-K2-0711

Kimi K2 0711 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32B active per forward pass. Optimized for agentic tasks, it delivers advanced capabilities in reasoning, tool use, and code synthesis. The model achieves strong results across benchmarks, excelling in coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool use (Tau2, AceBench). It supports long-context inference up to 128K tokens and is trained with a novel stack that includes the MuonClip optimizer for stable, large-scale MoE training.
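For orientation, below is a minimal sketch of calling the model through an OpenAI-compatible chat completions endpoint. The base URL, model identifier, and API-key environment variable are assumptions; substitute the values for whichever provider actually hosts the model.

```python
# Minimal sketch: querying Kimi K2 via an OpenAI-compatible endpoint.
# base_url, model name, and MOONSHOT_API_KEY are assumptions; adjust for your provider.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.moonshot.ai/v1",   # assumed OpenAI-compatible endpoint
    api_key=os.environ["MOONSHOT_API_KEY"],  # assumed environment variable for the key
)

response = client.chat.completions.create(
    model="kimi-k2-0711-preview",             # assumed model identifier
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a linked list."},
    ],
    temperature=0.6,
    max_tokens=1024,
)

print(response.choices[0].message.content)
```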

Creator: Moonshot AI
Release Date: July 2025
License: Modified MIT License
Context Window: 131,072 tokens (128K)
Image Input Support: No
Open Source (Weights): Yes
Parameters: 1,000B total, 32B active at inference time
Model Weights: available on Hugging Face (moonshotai/Kimi-K2-Instruct)
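Because the weights are openly released, prompt lengths can be checked against the context window locally. The sketch below assumes the Hugging Face repository id moonshotai/Kimi-K2-Instruct and that its tokenizer ships custom code (hence trust_remote_code); adjust both if they differ.

```python
# Sketch: counting prompt tokens against the context window using the open weights.
# The repository id is an assumption; trust_remote_code loads the repo's custom tokenizer code.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "moonshotai/Kimi-K2-Instruct",  # assumed Hugging Face repository id
    trust_remote_code=True,
)

prompt = "Summarize the trade-offs of Mixture-of-Experts inference."
n_tokens = len(tokenizer.encode(prompt))
context_window = 131_072  # 128K-token context window

print(f"{n_tokens} tokens used, {context_window - n_tokens} remaining")
```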

Performance Benchmarks

| Benchmark | Metric | Kimi K2 Instruct | DeepSeek-V3-0324 | Qwen3-235B-A22B (non-thinking) | Claude Sonnet 4 (w/o extended thinking) | Claude Opus 4 (w/o extended thinking) | GPT-4.1 | Gemini 2.5 Flash Preview (05-20) |
|---|---|---|---|---|---|---|---|---|
| Coding Tasks | | | | | | | | |
| LiveCodeBench v6 (Aug 24 – May 25) | Pass@1 | 53.7 | 46.9 | 37.0 | 48.5 | 47.4 | 44.7 | 44.7 |
| OJBench | Pass@1 | 27.1 | 24.0 | 11.3 | 15.3 | 19.6 | 19.5 | 19.5 |
| MultiPL-E | Pass@1 | 85.7 | 83.1 | 78.2 | 88.6 | 89.6 | 86.7 | 85.6 |
| SWE-bench Verified (Agentless Coding) | Single Patch w/o Test (Acc) | 51.8 | 36.6 | 39.4 | 50.2 | 53.0 | 40.8 | 32.6 |
| SWE-bench Verified (Agentic Coding) | Single Attempt (Acc) | 65.8 | 38.8 | 34.4 | 72.7* | 72.5* | 54.6 | - |
| SWE-bench Verified (Agentic Coding) | Multiple Attempts (Acc) | 71.6 | - | - | 80.2 | 79.4* | - | - |
| SWE-bench Multilingual (Agentic Coding) | Single Attempt (Acc) | 47.3 | 25.8 | 20.9 | 51.0 | - | 31.5 | - |
| TerminalBench | Inhouse Framework (Acc) | 30.0 | - | - | 35.5 | 43.2 | 8.3 | - |
| TerminalBench | Terminus (Acc) | 25.0 | 16.3 | 6.6 | 30.3 | - | 16.8 | - |
| Aider-Polyglot | Acc | 60.0 | 55.1 | 61.8 | 56.4 | 70.7 | 52.4 | 44.0 |
| Tool Use Tasks | | | | | | | | |
| Tau2 retail | Avg@4 | 70.6 | 69.1 | 57.0 | 75.0 | 81.8 | 74.8 | 64.3 |
| Tau2 airline | Avg@4 | 56.5 | 39.0 | 26.5 | 55.5 | 60.0 | 54.5 | 42.5 |
| Tau2 telecom | Avg@4 | 65.8 | 32.5 | 22.1 | 45.2 | 57.0 | 38.6 | 16.9 |
| AceBench | Acc | 76.5 | 72.7 | 70.5 | 76.2 | 75.6 | 80.1 | 74.5 |
| Math & STEM Tasks | | | | | | | | |
| AIME 2024 | Avg@64 | 69.6 | 59.4* | 40.1* | 43.4 | 48.2 | 46.5 | 61.3 |
| AIME 2025 | Avg@64 | 49.5 | 46.7 | 24.7* | 33.1* | 33.9* | 37.0 | 46.6 |
| MATH-500 | Acc | 97.4 | 94.0* | 91.2* | 94.0 | 94.4 | 92.4 | 95.4 |
| HMMT 2025 | Avg@32 | 38.8 | 27.5 | 11.9 | 15.9 | 15.9 | 19.4 | 34.7 |
| CNMO 2024 | Avg@16 | 74.3 | 74.7 | 48.6 | 60.4 | 57.6 | 56.6 | 75.0 |
| PolyMath-en | Avg@4 | 65.1 | 59.5 | 51.9 | 52.8 | 49.8 | 54.0 | 49.9 |
| ZebraLogic | Acc | 89.0 | 84.0 | 37.7* | 73.7 | 59.3 | 58.5 | 57.9 |
| AutoLogi | Acc | 89.5 | 88.9 | 83.3 | 89.8 | 86.1 | 88.2 | 84.1 |
| GPQA-Diamond | Avg@8 | 75.1 | 68.4* | 62.9* | 70.0* | 74.9* | 66.3 | 68.2 |
| SuperGPQA | Acc | 57.2 | 53.7 | 50.2 | 55.7 | 56.5 | 50.8 | 49.6 |
| Humanity's Last Exam (Text Only) | - | 4.7 | 5.2 | 5.7 | 5.8 | 7.1 | 3.7 | 5.6 |
| General Tasks | | | | | | | | |
| MMLU | EM | 89.5 | 89.4 | 87.0 | 91.5 | 92.9 | 90.4 | 90.1 |
| MMLU-Redux | EM | 92.7 | 90.5 | 89.2 | 93.6 | 94.2 | 92.4 | 90.6 |
| MMLU-Pro | EM | 81.1 | 81.2* | 77.3 | 83.7 | 86.6 | 81.8 | 79.4 |
| IFEval | Prompt Strict | 89.8 | 81.1 | 83.2* | 87.6 | 87.4 | 88.0 | 84.3 |
| Multi-Challenge | Acc | 54.1 | 31.4 | 34.0 | 46.8 | 49.0 | 36.4 | 39.5 |
| SimpleQA | Correct | 31.0 | 27.7 | 13.2 | 15.9 | 22.8 | 42.3 | 23.3 |
| Livebench | Pass@1 | 76.4 | 72.4 | 67.6 | 74.8 | 74.6 | 69.8 | 67.8 |
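Several metrics in the table (Pass@1, Avg@4, Avg@64) denote scores averaged over repeated samples per problem. The sketch below shows one common way such scores are computed from per-sample pass/fail outcomes; the results data is hypothetical and real evaluation harnesses differ in detail.

```python
# Sketch of how Avg@k / Pass@1-style scores are commonly computed.
# The results structure below is hypothetical; actual harnesses differ.
from statistics import mean

def avg_at_k(samples_per_problem: list[list[bool]], k: int) -> float:
    """Average correctness over the first k samples of each problem,
    then average across problems (Pass@1 is the k=1 special case)."""
    per_problem = [mean(samples[:k]) for samples in samples_per_problem]
    return 100.0 * mean(per_problem)

# Hypothetical results: each inner list holds pass/fail outcomes for one problem.
results = [
    [True, True, False, True],
    [False, False, True, False],
    [True, True, True, True],
]

print(f"Pass@1: {avg_at_k(results, 1):.1f}")  # correctness of the first sample
print(f"Avg@4:  {avg_at_k(results, 4):.1f}")  # mean over 4 samples per problem
```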