|  | --- | 
					
						
						|  | license: other | 
					
						
						|  | license_name: modified-mit | 
					
						
						|  | library_name: transformers | 
					
						
						|  | --- | 
					
						
						|  | <div align="center"> | 
					
						
						|  | <picture> | 
					
						
|  | <img src="figures/kimi-logo.png" width="30%" alt="Kimi K2: Open Agentic Intelligence"> | 
					
						
						|  | </picture> | 
					
						
						|  | </div> | 
					
						
						|  | <hr> | 
					
						
						|  |  | 
					
						
						|  | <div align="center" style="line-height:1"> | 
					
						
						|  | <a href="https://www.kimi.com" target="_blank"><img alt="Chat" src="https://img.shields.io/badge/🤖%20Chat-Kimi%20K2-ff6b6b?color=1783ff&logoColor=white"/></a> | 
					
						
						|  | <a href="https://github.com/moonshotai/Kimi-K2"><img alt="github" src="https://img.shields.io/badge/🤖%20Github-Kimi%20K2-ff6b6b?color=1783ff&logoColor=white"/></a> | 
					
						
						|  | <a href="https://www.moonshot.ai" target="_blank"><img alt="Homepage" src="https://img.shields.io/badge/Homepage-Moonshot%20AI-white?logo=Kimi&logoColor=white"/></a> | 
					
						
						|  | </div> | 
					
						
						|  |  | 
					
						
						|  | <div align="center" style="line-height: 1;"> | 
					
						
						|  | <a href="https://huggingface.co/moonshotai" target="_blank"><img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Moonshot%20AI-ffc107?color=ffc107&logoColor=white"/></a> | 
					
						
						|  | <a href="https://twitter.com/kimi_moonshot" target="_blank"><img alt="Twitter Follow" src="https://img.shields.io/badge/Twitter-Kimi.ai-white?logo=x&logoColor=white"/></a> | 
					
						
						|  | <a href="https://discord.gg/TYU2fdJykW" target="_blank"><img alt="Discord" src="https://img.shields.io/badge/Discord-Kimi.ai-white?logo=discord&logoColor=white"/></a> | 
					
						
						|  | </div> | 
					
						
						|  | <div align="center" style="line-height: 1;"> | 
					
						
						|  | <a href="https://huggingface.co/moonshotai/Kimi-K2-Instruct-0905/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/badge/License-Modified_MIT-f5de53?&color=f5de53"/></a> | 
					
						
						|  | </div> | 
					
						
						|  |  | 
					
						
						|  | <p align="center"> | 
					
						
						|  | <b>📰  <a href="https://moonshotai.github.io/Kimi-K2/">Tech Blog</a></b>     |     <b>📄  <a href="https://github.com/MoonshotAI/Kimi-K2/blob/main/tech_report.pdf">Paper</a></b> | 
					
						
						|  | </p> | 
					
						
						|  |  | 
					
						
						|  |  | 
					
						
						|  | ## 1. Model Introduction | 
					
						
						|  |  | 
					
						
						|  | Kimi K2-Instruct-0905 is the latest, most capable version of Kimi K2. It is a state-of-the-art mixture-of-experts (MoE) language model, featuring 32 billion activated parameters and a total of 1 trillion parameters. | 
					
						
						|  |  | 
					
						
						|  | ### Key Features | 
					
						
						|  | - Enhanced agentic coding intelligence: Kimi K2-Instruct-0905 demonstrates significant improvements in performance on public benchmarks and real-world coding agent tasks. | 
					
						
						|  | - Improved frontend coding experience: Kimi K2-Instruct-0905 offers advancements in both the aesthetics and practicality of frontend programming. | 
					
						
						|  | - Extended context length: Kimi K2-Instruct-0905’s context window has been increased from 128k to 256k tokens, providing better support for long-horizon tasks. | 
					
						
						|  |  | 
					
						
						|  |  | 
					
						
						|  | ## 2. Model Summary | 
					
						
						|  |  | 
					
						
						|  | <div align="center"> | 
					
						
						|  |  | 
					
						
						|  |  | 
					
						
						|  | | | | | 
					
						
						|  | |:---:|:---:| | 
					
						
						|  | | **Architecture** | Mixture-of-Experts (MoE) | | 
					
						
						|  | | **Total Parameters** | 1T | | 
					
						
						|  | | **Activated Parameters** | 32B | | 
					
						
						|  | | **Number of Layers** (Dense layer included) | 61 | | 
					
						
						|  | | **Number of Dense Layers** | 1 | | 
					
						
						|  | | **Attention Hidden Dimension** | 7168 | | 
					
						
						|  | | **MoE Hidden Dimension** (per Expert) | 2048 | | 
					
						
						|  | | **Number of Attention Heads** | 64 | | 
					
						
						|  | | **Number of Experts** | 384 | | 
					
						
						|  | | **Selected Experts per Token** | 8 | | 
					
						
						|  | | **Number of Shared Experts** | 1 | | 
					
						
						|  | | **Vocabulary Size** | 160K | | 
					
						
						|  | | **Context Length** | 256K | | 
					
						
						|  | | **Attention Mechanism** | MLA | | 
					
						
						|  | | **Activation Function** | SwiGLU | | 
					
						
						|  | </div> | 
					
						
						|  |  | 
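As a rough illustration of how sparse the model is, the figures in the table can be turned into ratios. This is only back-of-the-envelope arithmetic: "1T" and "32B" are rounded values, and the sketch assumes the shared expert runs for every token in addition to the routed experts, which the table does not state explicitly.

```python
# Figures copied from the summary table above ("1T" and "32B" are rounded).
total_params = 1e12        # Total Parameters
activated_params = 32e9    # Activated Parameters
num_experts = 384          # Number of Experts
selected_experts = 8       # Selected Experts per Token
shared_experts = 1         # Number of Shared Experts

# Share of all parameters that take part in a single forward pass.
activated_fraction = activated_params / total_params

# Share of the routed experts chosen for each token.
routed_fraction = selected_experts / num_experts

# Assumption: the shared expert is always active alongside the routed ones.
experts_per_token = selected_experts + shared_experts

print(f"activated parameters: {activated_fraction:.1%}")
print(f"routed experts used per token: {routed_fraction:.1%}")
print(f"experts evaluated per token: {experts_per_token}")
```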
					
						
						|  | ## 3. Evaluation Results | 
					
						
						|  |  | 
					
						
						|  | | Benchmark              | Metric | K2-Instruct-0905 | K2-Instruct-0711 | Qwen3-Coder-480B-A35B-Instruct    | GLM-4.5    | DeepSeek-V3.1 | Claude-Sonnet-4 | Claude-Opus-4 | | 
					
						
						|  | |------------------------|--------|------------------|------------------|--------|--------|--------|-----------------|---------------| | 
					
						
						|  | | SWE-Bench verified     | ACC    | 69.2 ± 0.63      | 65.8             | 69.6*  | 64.2*  | 66.0*  | 72.7*            | 72.5*          | | 
					
						
						|  | | SWE-Bench Multilingual | ACC    | 55.9 ± 0.72      | 47.3             | 54.7*  | 52.7   | 54.5*  | 53.3*           | -             | | 
					
						
						|  | | Multi-SWE-Bench        | ACC    | 33.5 ± 0.28      | 31.3             | 32.7   | 31.7   | 29.0   | 35.7            | -             | | 
					
						
						|  | | Terminal-Bench         | ACC    | 44.5 ± 2.03      | 37.5             | 37.5*  | 39.9*  | 31.3*  | 36.4*           | 43.2*         | | 
					
						
						|  | | SWE-Dev                | ACC    | 66.6 ± 0.72      | 61.9             | 64.7   | 63.2   | 53.3   | 67.1            | -             | | 
					
						
						|  |  | 
					
						
						|  |  | 
					
						
						|  | All K2-Instruct-0905 numbers are reported as mean ± std over five independent, full-test-set runs. | 
					
						
						|  | Before each run we prune the repository so that every Git object unreachable from the target commit disappears; this guarantees the agent sees only the code that would legitimately be available at that point in history. | 
					
						
						|  |  | 
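The "mean ± std" format used in the table can be reproduced with a few lines of Python. This is a minimal sketch: the scores below are made-up placeholders rather than real run results, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def summarize_runs(scores: list[float]) -> str:
    """Format per-run accuracies as 'mean ± std'."""
    mean = statistics.mean(scores)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(scores)
    return f"{mean:.1f} ± {std:.2f}"


# Hypothetical accuracies from five full-test-set runs (not real data).
hypothetical_scores = [60.0, 61.0, 62.0, 63.0, 64.0]
summary = summarize_runs(hypothetical_scores)
print(summary)
```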
					
						
						|  | Except for Terminal-Bench (Terminus-2), every result was produced with our in-house evaluation harness. The harness is derived from SWE-agent, but we clamp the context windows of the Bash and Edit tools and rewrite the system prompt to match the task semantics. All baseline figures denoted with an asterisk (*) are excerpted directly from their official report or public leaderboard; the remaining metrics were evaluated by us under conditions identical to those used for K2-Instruct-0905. | 
					
						
						|  |  | 
					
						
						|  | For SWE-Dev we go one step further: we overwrite the original repository files and delete any test file that exercises the functions the agent is expected to generate, eliminating any indirect hints about the desired implementation. | 
					
						
						|  |  | 
					
						
						|  |  | 
					
						
						|  | ## 4. Deployment | 
					
						
|  | > [!NOTE] | 
					
						
|  | > Kimi K2's API is available at https://platform.moonshot.ai with OpenAI- and Anthropic-compatible endpoints. | 
					
						
						|  | > | 
					
						
|  | > The Anthropic-compatible API maps temperature by `real_temperature = request_temperature * 0.6` for better compatibility with existing applications. | 
					
						
						|  |  | 
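The temperature mapping in the note above can be written out as two small helpers. This is a sketch of the stated formula only; the function and constant names are hypothetical and are not part of any API.

```python
# Scale factor stated in the note for the Anthropic-compatible API.
ANTHROPIC_TEMPERATURE_SCALE = 0.6


def effective_temperature(request_temperature: float) -> float:
    """Temperature the model actually samples with for a given request value."""
    return request_temperature * ANTHROPIC_TEMPERATURE_SCALE


def request_temperature_for(target_temperature: float) -> float:
    """Request value to send so that the model samples at the target temperature."""
    return target_temperature / ANTHROPIC_TEMPERATURE_SCALE


print(effective_temperature(1.0))
print(request_temperature_for(0.6))
```

Under this mapping, a request temperature of 1.0 on the Anthropic-compatible endpoint corresponds to a real temperature of 0.6.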
					
						
|  | Our model checkpoints are stored in block-fp8 format and are available on [Hugging Face](https://huggingface.co/moonshotai/Kimi-K2-Instruct). | 
					
						
						|  |  | 
					
						
|  | We currently recommend running Kimi-K2 on the following inference engines: | 
					
						
						|  |  | 
					
						
						|  | * vLLM | 
					
						
						|  | * SGLang | 
					
						
						|  | * KTransformers | 
					
						
						|  | * TensorRT-LLM | 
					
						
						|  |  | 
					
						
						|  | Deployment examples for vLLM and SGLang can be found in the [Model Deployment Guide](docs/deploy_guidance.md). | 
					
						
						|  |  | 
					
						
						|  | --- | 
					
						
						|  |  | 
					
						
						|  | ## 5. Model Usage | 
					
						
						|  |  | 
					
						
						|  | ### Chat Completion | 
					
						
						|  |  | 
					
						
						|  | Once the local inference service is up, you can interact with it through the chat endpoint: | 
					
						
						|  |  | 
					
						
```python
from openai import OpenAI


def simple_chat(client: OpenAI, model_name: str):
    # Default system prompt plus a single user turn
    messages = [
        {"role": "system", "content": "You are Kimi, an AI assistant created by Moonshot AI."},
        {"role": "user", "content": [{"type": "text", "text": "Please give a brief self-introduction."}]},
    ]
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        stream=False,
        temperature=0.6,  # recommended sampling temperature
        max_tokens=256,
    )
    print(response.choices[0].message.content)
```
					
						
						|  |  | 
					
						
						|  | > [!NOTE] | 
					
						
						|  | > The recommended temperature for Kimi-K2-Instruct-0905 is `temperature = 0.6`. | 
					
						
						|  | > If no special instructions are required, the system prompt above is a good default. | 
					
						
						|  |  | 
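To see what such a chat request looks like on the wire, the sketch below builds the same payload using only the Python standard library. It assumes the local service exposes the usual OpenAI-compatible `/v1/chat/completions` route. The base URL, port, and helper names are illustrative assumptions, so adjust them to your deployment.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model_name: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": model_name,
        "messages": [
            {"role": "system", "content": "You are Kimi, an AI assistant created by Moonshot AI."},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0.6,  # recommended temperature from the note above
        "max_tokens": 256,
        "stream": False,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Assumed local endpoint; the host and port depend on how the server was started.
example_request = build_chat_request(
    "http://localhost:8000/v1", "moonshotai/Kimi-K2-Instruct-0905", "Please give a brief self-introduction."
)
```

With a server running, `print(send_chat_request(example_request))` would print the reply.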
					
						
						|  | --- | 
					
						
						|  |  | 
					
						
						|  | ### Tool Calling | 
					
						
						|  |  | 
					
						
						|  | Kimi-K2-Instruct-0905 has strong tool-calling capabilities. | 
					
						
|  | To enable them, pass the list of available tools in each request; the model then decides autonomously when and how to invoke them. | 
					
						
						|  |  | 
					
						
						|  | The following example demonstrates calling a weather tool end-to-end: | 
					
						
						|  |  | 
					
						
```python
import json

from openai import OpenAI


# Your tool implementation
def get_weather(city: str) -> dict:
    return {"weather": "Sunny"}


# Tool schema definition
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Retrieve current weather information. Call this when the user asks about the weather.",
        "parameters": {
            "type": "object",
            "required": ["city"],
            "properties": {
                "city": {
                    "type": "string",
                    "description": "Name of the city"
                }
            }
        }
    }
}]

# Map tool names to their implementations
tool_map = {
    "get_weather": get_weather
}


def tool_call_with_client(client: OpenAI, model_name: str):
    messages = [
        {"role": "system", "content": "You are Kimi, an AI assistant created by Moonshot AI."},
        {"role": "user", "content": "What's the weather like in Beijing today? Use the tool to check."}
    ]
    finish_reason = None
    # Keep querying until the model stops requesting tool calls
    while finish_reason is None or finish_reason == "tool_calls":
        completion = client.chat.completions.create(
            model=model_name,
            messages=messages,
            temperature=0.6,
            tools=tools,          # tool list defined above
            tool_choice="auto"
        )
        choice = completion.choices[0]
        finish_reason = choice.finish_reason
        if finish_reason == "tool_calls":
            # Record the assistant turn that requested the tool calls
            messages.append(choice.message)
            for tool_call in choice.message.tool_calls:
                tool_call_name = tool_call.function.name
                tool_call_arguments = json.loads(tool_call.function.arguments)
                tool_function = tool_map[tool_call_name]
                tool_result = tool_function(**tool_call_arguments)
                print("tool_result:", tool_result)

                # Return the tool output to the model
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "name": tool_call_name,
                    "content": json.dumps(tool_result)
                })
    print("-" * 100)
    print(choice.message.content)
```
					
						
						|  |  | 
					
						
						|  | The `tool_call_with_client` function implements the pipeline from user query to tool execution. | 
					
						
						|  | This pipeline requires the inference engine to support Kimi-K2’s native tool-parsing logic. | 
					
						
						|  | For more information, see the [Tool Calling Guide](docs/tool_call_guidance.md). | 
					
						
						|  |  | 
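The example above looks up tools with `tool_map[tool_call_name]` and parses arguments directly, so an unknown tool name or malformed arguments would raise an exception. One possible way to harden that step is sketched below. The `run_tool_call` helper and the error payload format are hypothetical illustrations, not part of Kimi-K2's or the client library's API.

```python
import json


def get_weather(city: str) -> dict:
    return {"weather": "Sunny"}


tool_map = {"get_weather": get_weather}


def run_tool_call(name: str, arguments_json: str, tool_map: dict) -> str:
    """Run one tool call and always return a JSON string for the tool message."""
    tool_function = tool_map.get(name)
    if tool_function is None:
        # Assumption: report the problem back to the model instead of raising.
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        arguments = json.loads(arguments_json)
        result = tool_function(**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        # Covers malformed JSON as well as missing or unexpected parameters.
        return json.dumps({"error": f"invalid arguments for {name}: {exc}"})
    return json.dumps(result)


print(run_tool_call("get_weather", '{"city": "Beijing"}', tool_map))
print(run_tool_call("get_time", "{}", tool_map))
```

The returned string can be used as the `content` of the `"role": "tool"` message, so the model sees the error and can retry or explain it.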
					
						
						|  | --- | 
					
						
						|  |  | 
					
						
						|  | ## 6. License | 
					
						
						|  |  | 
					
						
						|  | Both the code repository and the model weights are released under the [Modified MIT License](LICENSE). | 
					
						
						|  |  | 
					
						
						|  | --- | 
					
						
						|  |  | 
					
						
						|  | ## 7. Third Party Notices | 
					
						
						|  |  | 
					
						
						|  | See [THIRD PARTY NOTICES](THIRD_PARTY_NOTICES.md) | 
					
						
						|  |  | 
					
						
						|  | --- | 
					
						
						|  |  | 
					
						
|  | ## 8. Contact Us | 
					
						
						|  |  | 
					
						
						|  | If you have any questions, please reach out at [[email protected]](mailto:[email protected]). | 
					
						
						|  |  |