# LIMO: Less Is More for Reasoning

## 📌 Table of Contents

- [Overview](#overview)
- [Key Results](#key-results)
- [Model Zoo](#model-zoo)
- [Datasets](#datasets)
- [Quick Start](#quick-start)
- [License](#license)
- [Citation](#citation)

## Overview

LIMO challenges the conventional wisdom in mathematical reasoning by demonstrating that models can achieve superior performance with far fewer but higher-quality training samples. Our approach:

- 🎯 Achieves SOTA with only 817 carefully curated training samples
- 🌐 Shows strong generalization across diverse problem types
- 🔬 Provides comprehensive evaluation on 10 benchmarks
- 📚 Releases high-quality datasets and evaluation tools

## Key Results

| Model | AIME24 | MATH500 | Training Samples |
|-------|--------|---------|------------------|
| LIMO (Ours) | **57.1%** | **94.8%** | 817 |
| Previous SOTA | 6.5% | 59.2% | 100k+ |

<details>
<summary>Click to see more detailed results</summary>

| Benchmark | LIMO | Previous SOTA | Improvement (pp) |
|-----------|------|---------------|------------------|
| AIME24 | **57.1%** | 6.5% | +50.6 |
| MATH500 | **94.8%** | 59.2% | +35.6 |
| AMC23 | **92.0%** | 40.6% | +51.4 |
| OlympiadBench | **66.8%** | 36.7% | +30.1 |
| CHMath | **75.4%** | 11.2% | +64.2 |
| Gaokao | **81.0%** | 49.4% | +31.6 |
| Kaoyan | **73.4%** | 32.7% | +40.7 |
| GradeSchool | **76.2%** | 36.2% | +40.0 |
| Minerva | 44.9% | **47.1%** | -2.2 |
| GPQA | 66.7% | **73.3%** | -6.6 |

</details>

## Model Zoo

Our LIMO model is available on Hugging Face 🤗:

| Model | Backbone | Size | Link |
|-------|----------|------|------|
| LIMO | [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) | 32B | [🤗](https://huggingface.co/GAIR/LIMO) |

## Datasets

We release our datasets through Hugging Face 🤗:

| Dataset | Description | Size | Link |
|---------|-------------|------|------|
| LIMO | Training set used to train the LIMO model | 817 samples | [🤗](https://huggingface.co/datasets/GAIR/LIMO) |

Note: We are gradually releasing additional datasets mentioned in our paper, including those used for comparative experiments, to facilitate reproducibility and further analysis by the research community. Stay tuned!
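
You can load the training set with the 🤗 `datasets` library. A minimal sketch (the `train` split name is an assumption; check the dataset card for the exact splits and fields):

```python
from datasets import load_dataset

# Load the LIMO training set from the Hugging Face Hub
# (assumes a standard "train" split; see the dataset card for the schema)
limo = load_dataset("GAIR/LIMO", split="train")

print(len(limo))       # expected: 817 samples
print(limo[0].keys())  # inspect the available fields of one sample
```
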
## Quick Start

Our model is fine-tuned on [Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) and is compatible with most mainstream frameworks, such as [HF Transformers](https://github.com/huggingface/transformers), [vLLM](https://github.com/vllm-project/vllm), and [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM).

<details>
<summary>Start with HF Transformers</summary>

```bash
# Install required packages (accelerate is needed for device_map="auto")
pip install transformers accelerate
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Initialize model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "GAIR/LIMO",
    torch_dtype="auto",
    trust_remote_code=True,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMO", trust_remote_code=True)

# Prepare input messages (we use the following system prompt and template during both training and inference)
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is the result of 1+1?"}
]

# Format input using the chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Tokenize input
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate response
outputs = model.generate(
    **inputs,
    max_new_tokens=32768,
    temperature=0.7,
    top_p=0.95,
    do_sample=True
)

# Decode only the newly generated tokens and print the response
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
```

</details>

<details>
<summary>Start with vLLM</summary>

```bash
# Install required packages
pip install vllm
```

```python
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

# Initialize the model
llm = LLM(
    model="GAIR/LIMO",
    tensor_parallel_size=4,  # adjust based on available GPUs
    trust_remote_code=True,
    swap_space=60,
    gpu_memory_utilization=0.96,
)

# Prepare input messages (we use the following system prompt and template during both training and inference)
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is the result of 1+1?"}
]

# Set up the tokenizer and format the input using the chat template
tokenizer = AutoTokenizer.from_pretrained("GAIR/LIMO", trust_remote_code=True)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Configure generation parameters
sampling_params = SamplingParams(
    temperature=0.7,
    max_tokens=32768,
    top_p=0.95,
)

# Generate and print the response
output = llm.generate(text, sampling_params)
print(output[0].outputs[0].text)
```

</details>
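
vLLM can also serve LIMO behind an OpenAI-compatible HTTP API. A minimal sketch (the `vllm serve` entry point follows recent vLLM releases; on older versions use `python -m vllm.entrypoints.openai.api_server --model GAIR/LIMO`, and adjust `--tensor-parallel-size` to your hardware):

```bash
# Launch an OpenAI-compatible server for LIMO (listens on port 8000 by default)
vllm serve GAIR/LIMO --tensor-parallel-size 4
```

Once the server is up, any OpenAI-compatible client can query it at `http://localhost:8000/v1`.
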
## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Citation

```bibtex
@misc{ye2025limoreasoning,
      title={LIMO: Less is More for Reasoning},
      author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
      year={2025},
      eprint={2502.03387},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2502.03387},
}
```