---
license: apache-2.0
base_model:
- Qwen/Qwen3-32B
pipeline_tag: text-generation
library_name: transformers
---
<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->
<!-- markdownlint-disable no-duplicate-header -->

# Light-IF-32B

<div align="center">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/64eeb81ad0ceda46832e0160/b2_eQV04B8xSdYJZnB2FD.png" width="95%" alt="Light-IF-32B" />
</div>
<hr>
<div align="center" style="line-height: 1;">
  🤗 <a href="https://huggingface.co/qihoo360/Light-IF-32B">Hugging Face</a>&nbsp;&nbsp; | &nbsp;&nbsp; 📑 <a href="https://arxiv.org/abs/2508.03178">Paper Link</a> &nbsp;&nbsp; | &nbsp;&nbsp; 📑 <a href="https://zhuanlan.zhihu.com/p/1936535948360918628">Blog</a> &nbsp;&nbsp; | &nbsp;&nbsp; 📑 <a href="https://github.com/Qihoo360/Light-IF">GitHub</a>
</div>

## Evaluation

|Model|SuperCLUE|IFEval|CFBench|IFBench|
| ---- | ---- | ---- | ---- | ---- |
|Qwen3-32B|0.234|0.877|0.823|0.384|
|Qwen3-235B-A22B|0.244|0.882|0.834|0.423|
|Qwen3-235B-A22B-Thinking-2507|0.434|0.916|0.843|0.475|
|DeepSeek-R1-0528|0.436|0.863|0.827|0.415|
|Doubao-seed-1-6-thinking-250615|0.362|0.832|0.820|0.477|
|Doubao-seed-1-6-thinking-250715|0.345|0.856|0.840|0.366|
|ChatGPT-4o-latest|0.260|0.836|0.807|0.365|
|Deepseek-v3-250324|0.306|0.859|0.833|0.405|
|Doubao-1.5-pro-32k-250115|0.285|0.889|0.797|0.375|
|Kimi-K2|0.227|0.921|0.820|0.395|
|GLM-4.5|0.395|0.893|0.833|0.466|
| [**Light-IF-32B (ours)** 🤗](https://huggingface.co/qihoo360/Light-IF-32B) |**0.575**|**0.938**|**0.850**|**0.575**|


## Introduction
**Instruction following** is a core ability of large language models (LLMs), but performance remains inconsistent, especially on complex tasks.

We identify **lazy reasoning** during the thinking stage as a key cause of poor instruction adherence.

To address this, we propose a framework that promotes rigorous reasoning through **previewing and self-checking**.

Our method begins by generating instruction data with **complex constraints**, filtering out samples that are too easy or too difficult. We then use rejection sampling to build a small but high-quality dataset for model adaptation.
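
The filtering and rejection-sampling steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the pass-rate criterion, and the `low`/`high` thresholds are all assumptions made for the example.

```python
from typing import Callable, Dict, List


def pass_rate(responses: List[str], is_correct: Callable[[str], bool]) -> float:
    """Fraction of sampled responses that satisfy the constraint checker."""
    if not responses:
        return 0.0
    return sum(is_correct(r) for r in responses) / len(responses)


def filter_by_difficulty(
    samples: Dict[str, List[str]],
    is_correct: Callable[[str], bool],
    low: float = 0.1,   # hypothetical: below this, the prompt counts as too difficult
    high: float = 0.9,  # hypothetical: above this, the prompt counts as too easy
) -> List[str]:
    """Keep prompts whose pass rate lies within [low, high]."""
    return [p for p, rs in samples.items() if low <= pass_rate(rs, is_correct) <= high]


def rejection_sample(responses: List[str], is_correct: Callable[[str], bool]) -> List[str]:
    """Keep only the responses that pass the checker."""
    return [r for r in responses if is_correct(r)]


# Toy constraint: a response must contain at most three words.
checker = lambda r: len(r.split()) <= 3
toy = {
    "easy": ["a b", "c d", "e"],
    "hard": ["a b c d", "a b c d e", "x y z w"],
    "medium": ["a b", "a b c d", "a"],
}
kept_prompts = filter_by_difficulty(toy, checker)
kept_responses = rejection_sample(toy["medium"], checker)
print(kept_prompts, kept_responses)
```

In this toy run only the `medium` prompt survives, because the other two are solved always or never.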

Training involves entropy-preserving supervised fine-tuning (**Entropy-SFT**) and token-wise entropy-adaptive reinforcement learning (**TEA-RL**), guided by rule-based multidimensional rewards.
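
To make "rule-based multidimensional rewards" concrete, here is a small sketch in which each constraint is scored by its own rule and the scores are then combined. The specific rules, the binary scoring, and the mean aggregation are assumptions for illustration; the reward design actually used in training is described in the paper.

```python
from typing import Callable, Dict

Rule = Callable[[str], bool]


def rule_rewards(response: str, rules: Dict[str, Rule]) -> Dict[str, float]:
    """Score each constraint separately: 1.0 if the rule passes, else 0.0."""
    return {name: 1.0 if rule(response) else 0.0 for name, rule in rules.items()}


def aggregate(rewards: Dict[str, float]) -> float:
    """Combine per-dimension rewards; a plain mean is assumed here."""
    return sum(rewards.values()) / len(rewards) if rewards else 0.0


# Hypothetical constraints for a toy instruction.
rules: Dict[str, Rule] = {
    "max_20_words": lambda r: len(r.split()) <= 20,
    "exactly_3_lines": lambda r: len([l for l in r.splitlines() if l.strip()]) == 3,
    "mentions_entropy": lambda r: "entropy" in r.lower(),
}

response = "Entropy stays high.\nRewards follow rules.\nChecks run last."
scores = rule_rewards(response, rules)
print(scores, aggregate(scores))
```

Keeping the per-rule scores separate makes it easy to see which constraint a response violated before the scores are collapsed into one number.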

This approach encourages models to plan ahead and verify their outputs, fostering more generalizable reasoning abilities.

Experiments show consistent improvements across model sizes. Notably, our 32B model outperforms both larger open-source models like **DeepSeek-R1** and closed-source models like **ChatGPT-4o** on challenging instruction-following benchmarks.

<p align="left">The overall framework of the proposed method:</p>
<p align="left">
  <img src="https://cdn-uploads.huggingface.co/production/uploads/64eeb81ad0ceda46832e0160/CPa2Eq6a3o4O9ItzcQgqx.png" alt="The overall framework of the proposed method" width="600"/>
</p>

## Quickstart

The following code snippet illustrates how to use the model to generate content from a given input.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "qihoo360/Light-IF-32B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

prompt = "Please help me write a poem with a total of 15 lines and no more than 300 words. The poem should be divided into 4 stanzas, each beginning with a **highlighted subtitle**."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True 
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

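# Split the output into the thinking content and the final answer.
try:
    # Find the position just after the last token 151668 (</think>), like a list "rindex".
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    # No </think> token was generated, so treat the whole output as the answer.
    index = 0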

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
```

**thinking content:**
<div style="max-height:300px; overflow-y:auto; border:1px solid #ccc; padding:10px;">
Got it, let's tackle this request. First, I need to remember all the requirements: 15 lines total, 4 stanzas, each stanza starts with a highlighted subtitle, no more than 300 words. 

First, I should figure out how to split 15 lines into 4 stanzas. Since 4 stanzas, the most even split would be maybe 4, 4, 4, 3 lines. That adds up to 15. Let's check: 4+4+4+3=15. Perfect. 

Next, the subtitles. They need to be highlighted, so maybe italics or bold. The user said "highlighted subtitle," so I'll use italics for consistency. The subtitles should be meaningful and set the theme of each stanza. Let's think of a cohesive theme—maybe something natural, like a journey through the day or seasons, or growth. Let's go with "growth" since it's easy to split into stanzas. 

First stanza: Maybe "seed" or "beginning." Let's say *Beneath the Soil* as the first subtitle. That's a strong start for growth. 

Second stanza: Next step in growth—*Sprout Through Cracks*. That's good, continues the growth theme, adds imagery. 

Third stanza: Maybe *Bend, But Not Break*—growth often involves resilience. 

Fourth stanza: *Bloom Into Sun*—culmination, reaching the light. 

Now, I need to make sure each stanza starts with the subtitle, then the lines. Let's draft the first stanza: 

*Beneath the Soil*  
a heartbeat of green,  
wrapped tight in dark,  
waiting for the world to breathe.  

Wait, that's 4 lines (subtitle plus 3 lines). Wait, no: the user said each stanza begins with a highlighted subtitle. Does the subtitle count as a line? The original request says "total of 15 lines" and "4 stanzas, each beginning with a highlighted subtitle." So probably, the subtitles are part of the lines, so each stanza's first line is the subtitle, then the rest. Wait, but the example: if the first stanza is 4 lines, then subtitle (1) + 3 lines. Let's check: 4 stanzas, total lines: 15. Let's split: 4 stanzas: maybe stanza 1: 4 lines, stanza 2: 4, stanza 3: 4, stanza 4: 3? Wait, 4+4+4+3=15. That's a possible split. 

But need to make sure the flow is natural. Let's adjust the number of lines per stanza: for example, stanza 1: 4 lines (subtitle + 3), stanza 2: 4 (subtitle + 3), stanza 3: 4, stanza 4: 3. 

Let's try drafting stanza 1: *Beneath the Soil*  
a heartbeat of green,  
wrapped tight in dark,  
waiting for the world to breathe.  

That's 4 lines. Good. 

Stanza 2: *Sprout Through Cracks*  
tip tilts toward light,  
pushing through stone and time,  
a thin stem holding sky.  

Another 4 lines. 

Stanza 3: *Bend, But Not Break*  
when wind screams like fear,  
I arch, then rise again,  
stronger where I was torn.  

4 lines. 

Stanza 4: *Bloom Into Sun*  
petals open like questions,  
each one asking to stay,  
here.  

3 lines. 

Wait, total lines: 4+4+4+3=15. Perfect. Now, check the word count. Let's count: 

Stanza 1: 22 words (approximately: "Beneath the Soil" is 3, then 3 lines: 22-3=19? Wait, need to count exactly. Let's do it: 

Beneath the Soil: 3 words.  
a heartbeat of green: 4  
wrapped tight in dark: 4  
waiting for the world to breathe: 6  
Total stanza 1: 3+4+4+6=17 words. 

Stanza 2: 3+4+4+5=16 (wait: *Sprout Through Cracks*: 3 words.  
tip tilts toward light: 4  
pushing through stone and time: 5  
a thin stem holding sky: 5. Wait, no: 3+4+5+5=17? Wait, no: 3+4+5+5? Wait, no, each line is a separate count. Let's do it again: 

Stanza 1: 3 (subtitle) + 4 +4 +6 = 17. 

Stanza 2: 3 (subtitle) + 4 +5 +5 = 17? Wait, no: "tip tilts toward light" is 4 words. "pushing through stone and time" is 5 words. "a thin stem holding sky" is 5 words. So 3+4+5+5=17. 

Stanza 3: *Bend, But Not Break* (4) + 5 (when wind screams like fear) + 5 (I arch, then rise again) + 5 (stronger where I was torn) = 4+5+5+5=19. 

Stanza 4: *Bloom Into Sun* (3) + 4 (petals open like questions) + 5 (each one asking to stay) + 1 (here) = 3+4+5+1=13. 

Total words: 17+17+19+13=66 words. Way under 300. Good. 

Now, check the flow: the theme of growth from seed to sprout to resilience to bloom. Makes sense. 

But maybe adjust for more imagery or emotional depth. Let's revise some lines for better rhythm or impact. 

Stanza 1: *Beneath the Soil*  
a heartbeat of green,  
wrapped tight in dark,  
waiting for the world to breathe.  

Good. 

Stanza 2: *Sprout Through Cracks*  
tip tilts toward the faint,  
pushing through stone and time,  
a fragile stem holding sky.  

Better: "faint" instead of "light" for more subtlety, "fragile" adds depth. 

Stanza 3: *Bend, But Not Break*  
when wind screams like rage,  
I arch, then rise again,  
stronger where I was torn.  

"Rage" instead of "fear" to make the wind more destructive. 

Stanza 4: *Bloom Into Sun*  
petals open like hearts,  
each one asking to be seen,  
here.  

"Like hearts" adds emotion, "asking to be seen" is more vivid than "stay". 

Now, check word count again: 

Stanza 1: 17 (same) 

Stanza 2: "faint" is 1 word instead of 5 (wait, no: "toward the faint" is 3 words, so total stanza 2: 3 (subtitle) + 5 (tip tilts toward the faint) + 5 (pushing through stone and time) + 5 (a fragile stem holding sky) = 3+5+5+5=18. 

Stanza 3: 4 (subtitle) + 5 (when wind screams like rage) + 5 (I arch, then rise again) + 5 (stronger where I was torn) = 4+5+5+5=19. 

Stanza 4: 3 (subtitle) + 4 (petals open like hearts) + 6 (each one asking to be seen) + 1 (here) = 3+4+6+1=14. 

Total: 17+18+19+14=68 words. Still under. 

Now, check the rhythm. Each line is roughly similar in length, maybe iambic tetrameter or free verse. Since the user didn't specify rhythm, free verse is fine, but should flow naturally. 

Another revision: maybe make some lines more concise or impactful. For example, stanza 1: "waiting for the world to breathe" is good. 

Stanza 2: "toward the faint" could be "toward faint light" to be more specific. So: tip tilts toward faint light, (5 words). Then stanza 2 word count: 3+5+5+5=18. 

Stanza 3: "when wind screams like rage" is 5 words. Maybe "when winds roar like rage" for more intensity. So: 5 words. Then stanza 3 word count: 4+5+5+5=19. 

Stanza 4: "each one asking to be seen" is 6 words. Maybe change to "each begging to be seen" for more impact. So: 5 words. Then stanza 4 word count: 3+4+6+1=14. 

Total word count: 17+18+19+14=68. Still fine. 

Now, make sure each stanza's subtitle is highlighted. The user said "highlighted subtitle," so in the response, we'll use italics. 

Another revision for flow: Maybe change the order of stanzas to make the growth more natural. Wait, the order is: beneath soil, sprout, bend, bloom. That's logical. 

Another check: 15 lines total. Let's count: 

Stanza 1: 1 (subtitle) +3=4 lines. 

Stanza 2: 5+3=4? Wait, no: each stanza is subtitle (1 line) plus number of lines. Wait, no: the user said "4 stanzas, each beginning with a highlighted subtitle." So the total number of lines includes the subtitles. So if each stanza has 4 lines, then 4 stanzas would be 16 lines. But the user wants 15. So one stanza has 3 lines. 

As per our split: 4,4,4,3. Let's confirm: 

Stanza 1: 4 lines (1 subtitle + 3 lines). 

Stanza 2: 4 (1+3). 

Stanza 3: 4 (1+3). 

Stanza 4: 3 lines (1 subtitle +2 lines). 

Yes, that's 4+4+4+3=15 lines. 

Wait, earlier I had stanza 4: 3 lines (1 subtitle +2). So let's adjust stanza 4 to have 3 lines. For example: 

*Bloom Into Sun*  
petals open like hearts,  
each begging to be seen.  

That's 3 lines (1 subtitle +2). Then the previous stanza (stanza 3) is 4 lines: subtitle +3. 

But wait, no: let's do this again. The total lines are 15. So if stanza 1 has 4 lines (1+3), stanza 2 has 4 (1+3), stanza 3 has 4 (1+3), stanza 4 has 3 (1+2). That's 4+4+4+3=15. 

Perfect. So stanza 4 is 3 lines: subtitle +2. 

So let's revise stanza 4: *Bloom Into Sun*  
petals open like hearts,  
each begging to be seen.  

That's 3 lines. Then total lines: 15. 

Now, check again: 

Stanza 1: 4 lines. 

1. *Beneath the Soil*  
2. a heartbeat of green,  
3. wrapped tight in dark,  
4. waiting for the world to breathe.  

Stanza 2: 4 lines. 

5. *Sprout Through Cracks*  
6. tip tilts toward faint light,  
7. pushing through stone and time,  
8. a fragile stem holding sky.  

Stanza 3: 4 lines. 

9. *Bend, But Not Break*  
10. when winds roar like rage,  
11. I arch, then rise again,  
12. stronger where I was torn.  

Stanza 4: 3 lines. 

13. *Bloom Into Sun*  
14. petals open like hearts,  
15. each begging to be seen.  

Yes, exactly 15 lines. 

Now, check the flow: Each stanza builds on the growth theme. 

Another revision for better imagery: Maybe change "faint light" to "faint light through cracks" to tie back to the subtitle. But that adds words. Since we have to keep word count low, maybe keep it as is. 

Stanza 2: "tip tilts toward faint light,"—good, shows struggle. 

Stanza 3: "when winds roar like rage,"—good, shows adversity. 

Stanza 4: "petals open like hearts,"—adds emotion, "each begging to be seen,"—relatable. 

I think this works. Now, let's write the final version. 

Wait, let's check for any redundancy or weak lines. "Wrapped tight in dark" is good. "Waiting for the world to breathe"—beautiful, personifies the seed. 

Stanza 2: "pushing through stone and time"—good, combines physical and temporal struggle. 

Stanza 3: "stronger where I was torn"—great, conveys resilience. 

Stanza 4: "each begging to be seen"—adds a personal touch, makes the bloom more meaningful. 

Is there anything to change? Maybe make the transition between stanzas smoother? Not necessary; the theme is clear. 

Another check: Each subtitle is highlighted. Yes, we'll use italics. 

Now, check word count again: 

Stanza 1: 3 (subtitle) + 4 +4 +6 = 17. 

Stanza 2: "tip tilts toward faint light" is 5 words. "pushing through stone and time" is 5 words. "a fragile stem holding sky" is 5 words. So 3+5+5+5=18. 

Stanza 3: *Bend, But Not Break* (4) + 5 (when wind screams like fear) + 5 (I arch, then rise again) + 5 (stronger where I was torn) = 4+5+5+5=19. 

Stanza 4: *Bloom Into Sun* (3) + 4 (petals open like questions) + 5 (each one asking to stay) = 3+4+5=12. 

Total words: 17+18+19+12=66 words. Way under 300. Good. 

Word count: 66 words. Way under 300. 

Lines: 15. Perfect. 

Now, let's write the final poem.
</div>

**content:**
<div style="max-height:300px; overflow-y:auto; border:1px solid #ccc; padding:10px;">
**Poem: "Growth in Shattered Light"**  

*Beneath the Soil*  
a heartbeat of green,  
wrapped tight in dark,  
waiting for the world to breathe.  

*Sprout Through Cracks*  
tip tilts toward faint light,  
pushing through stone and time,  
a fragile stem holding sky.  

*Bend, But Not Break*  
when winds roar like rage,  
I arch, then rise again,  
stronger where I was torn.  

*Bloom Into Sun*  
petals open like hearts,  
each begging to be seen.
</div>
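
The constraints in the prompt can be checked mechanically against the output above. The sketch below is a hypothetical checker written for this example (the function name and the choice to exclude the poem's title line from the counts are assumptions); it treats blank lines as stanza breaks and `*...*` as a highlighted subtitle.

```python
import re

poem = """*Beneath the Soil*
a heartbeat of green,
wrapped tight in dark,
waiting for the world to breathe.

*Sprout Through Cracks*
tip tilts toward faint light,
pushing through stone and time,
a fragile stem holding sky.

*Bend, But Not Break*
when winds roar like rage,
I arch, then rise again,
stronger where I was torn.

*Bloom Into Sun*
petals open like hearts,
each begging to be seen.
"""


def check_poem(text: str) -> dict:
    """Count lines, stanzas, and words, and check that each stanza opens with *subtitle*."""
    # Stanzas are separated by one or more blank lines.
    stanzas = [s.strip().splitlines() for s in re.split(r"\n\s*\n", text.strip())]
    lines = [ln.strip() for st in stanzas for ln in st]
    return {
        "lines": len(lines),
        "stanzas": len(stanzas),
        "words": sum(len(ln.split()) for ln in lines),
        "subtitles_highlighted": all(
            re.fullmatch(r"\*[^*]+\*", st[0].strip()) is not None for st in stanzas
        ),
    }


report = check_poem(poem)
print(report)
```

On this poem the checker reports 15 lines, 4 stanzas, and 66 words, which matches the counts the model arrived at in its thinking content.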

For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.5` to create an OpenAI-compatible API endpoint.
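
Once such an endpoint is running, it can be queried with a standard chat-completions request. The sketch below uses only the Python standard library. The helper names `build_chat_request` and `chat` are hypothetical, and the base URL `http://localhost:8000/v1` is a placeholder assumption that should be replaced with the address of your own server.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    base_url: str = "http://localhost:8000/v1",  # assumption: adjust to your server
    model: str = "qihoo360/Light-IF-32B",
    max_tokens: int = 32768,  # mirrors the Quickstart; lower it if your server's context is smaller
) -> urllib.request.Request:
    """Build an OpenAI-style chat completions request without sending it."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, **kwargs) -> str:
    """Send the request and return the assistant message text."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# print(chat("Write a haiku with exactly three lines."))
```

Depending on how the server is configured, the thinking content may be returned inline in the message text or in a separate field, so check the serving framework's documentation before parsing the reply.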

## Citation
```bibtex
@misc{lightifproj,
      title={Light-IF: Endowing LLMs with Generalizable Reasoning via Preview and Self-Checking for Complex Instruction Following}, 
      author={Chenyang Wang and Liang Wen and Shousheng Jia and Xiangzheng Zhang and Liang Xu},
      year={2025},
      eprint={2508.03178},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.03178}, 
}
```