|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: |
|
|
- Qwen/Qwen2.5-7B-Instruct |
|
|
--- |
|
|
|
|
|
# AutoL2S-7B |
|
|
|
|
|
This is the official model repository for **AutoL2S-7B**, a model fine-tuned for efficient reasoning based on [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct/tree/main). |
|
|
|
|
|
## 💡 Overview
|
|
|
|
|
AutoL2S automatically switches between short and long reasoning paths based on the complexity of the input question.

Auto Long-Short Reasoning (AutoL2S) is a dynamic, model-agnostic framework that enables LLMs to compress their generated reasoning paths according to the complexity of the reasoning question. AutoL2S establishes a learned paradigm in which the LLM itself decides when longer reasoning is necessary and when shorter reasoning suffices. The model is trained on data annotated with our proposed method, which includes both long and short CoT paths and a special \<EASY\> token (\<specialLong\> in the implementation); the \<EASY\> token indicates when the model can skip generating a lengthy CoT reasoning path. This annotation strategy enhances the LLMs' ability to generate shorter CoT reasoning paths with improved quality after training.
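
For intuition only, an annotated training pair under this scheme might look like the sketch below; the field names and example texts are illustrative assumptions, not the actual AutoL2S training data:

```python
# Purely illustrative sketch of the annotation scheme described above; the
# field names and texts are assumptions, not the real AutoL2S data format.
annotated_examples = [
    {
        "question": "What is 15% of 40?",
        # Easy case: the <EASY> marker teaches the model that a short CoT
        # path suffices here.
        "response": "<EASY> 15% of 40 is 0.15 * 40 = 6.",
    },
    {
        "question": "Find all real solutions of x^4 - 5x^2 + 4 = 0.",
        # Hard case: keep the full long CoT path so the model learns when
        # extended reasoning is worth the extra tokens.
        "response": "Let y = x^2; then y^2 - 5y + 4 = (y - 1)(y - 4) = 0, "
                    "so x is one of -2, -1, 1, 2.",
    },
]
```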
|
|
|
|
|
This repository contains: |
|
|
|
|
|
- Model weights |
|
|
- Configuration files |
|
|
- Inference scripts in the `examples/` directory
|
|
|
|
|
<p align="left"> |
|
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f9bb2dd5575ad6914756ce/dVpIjeIaU8Hv1M5z5VWYS.png" width="40%" style="display:inline-block; margin-right: 10px;" /> |
|
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/66f9bb2dd5575ad6914756ce/qxHTE-ZGTpxVjmkIX6Fk-.png" width="40%" style="display:inline-block;" /> |
|
|
</p> |
|
|
|
|
|
--- |
|
|
## 🧩 Dependencies
|
|
We recommend using the model with [vLLM](https://github.com/vllm-project/vllm). |
|
|
The code has been tested with: |
|
|
|
|
|
``` |
|
|
vLLM == 0.6.2 |
|
|
``` |
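
If vLLM is not installed yet, the tested version can be obtained with pip:

```bash
pip install vllm==0.6.2
```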
|
|
|
|
|
--- |
|
|
## 🚀 How to Use
|
|
|
|
|
Run the inference example: |
|
|
|
|
|
```bash |
|
|
cd examples |
|
|
python run_inference.py |
|
|
``` |
|
|
|
|
|
Alternatively, **download `examples/prefixLLM.py` and `examples/template.py` from this repository and place them in your working directory**, then run the following:
|
|
|
|
|
```python |
|
|
from vllm import SamplingParams
from prefixLLM import PrefixLLM
from template import SYSTEM_PROMPT, SHORT_TRIGGER

llm = PrefixLLM(model="amandaa/AutoL2S-7b")
max_tokens, temp = 32768, 0.7

# Routing pass: stop generation as soon as the model emits the special
# <specialLong> token, which signals that this question needs long reasoning.
sampling_params_route = SamplingParams(
    max_tokens=max_tokens, temperature=temp,
    stop=["<specialLong>"], include_stop_str_in_output=True,
)
# Forced-think pass: used when the routing pass requests the long CoT path.
sampling_params_force_think = SamplingParams(max_tokens=max_tokens, temperature=temp)

question = "Convert the point $(0,3)$ in rectangular coordinates to polar coordinates. Enter your answer in the form $(r,\\theta),$ where $r > 0$ and $0 \\le \\theta < 2 \\pi.$"
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": question},
]

responses = llm.route_chat(
    messages=messages, sampling_params_route=sampling_params_route,
    sampling_params_force_think=sampling_params_force_think,
    use_tqdm=True, trigger_word=SHORT_TRIGGER,
)

# The short-reasoning trigger is prefilled rather than generated, so prepend
# it when printing the final response.
print(SHORT_TRIGGER + responses[0].outputs[0].text)
|
|
``` |
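
For intuition, the sketch below shows the two-stage routing we would expect `route_chat` to perform, inferred from the parameter names above; the actual logic lives in `examples/prefixLLM.py`, and the trigger word is prefilled into the assistant turn to bias the model toward the short path:

```python
# Illustrative sketch of the two-stage routing behind route_chat, inferred
# from the parameter names above; see examples/prefixLLM.py for the real code.
def route_chat_sketch(llm, messages, params_route, params_force_think):
    # Stage 1 (routing): generate with the short-path prefix; generation
    # stops early if the model emits <specialLong> (the stop string above).
    first = llm.chat(messages, params_route)
    text = first[0].outputs[0].text
    if not text.endswith("<specialLong>"):
        return first  # the model judged the question easy; short CoT suffices
    # Stage 2 (forced thinking): the model requested long reasoning, so
    # regenerate without the stop string to produce the full CoT path.
    return llm.chat(messages, params_force_think)
```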
|
|
|
|
|
--- |
|
|
|
|
|
|
|
|
## 📖 Citation
|
|
|
|
|
If you use this model in your work, please consider citing: |
|
|
|
|
|
```bibtex |
|
|
@article{luo2025autol2s, |
|
|
title={AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models}, |
|
|
author={Luo, Feng and Chuang, Yu-Neng and Wang, Guanchu and Le, Hoang Anh Duy and Zhong, Shaochen and Liu, Hongyi and Yuan, Jiayi and Sui, Yang and Braverman, Vladimir and Chaudhary, Vipin and others}, |
|
|
journal={arXiv preprint arXiv:2505.22662}, |
|
|
year={2025} |
|
|
} |
|
|
``` |