Update README.md
README.md CHANGED
@@ -6,11 +6,11 @@ pipeline_tag: text-generation
 tags:
 - reasoning
 - looped transformer
-arxiv:
+arxiv: 2511.08577
 library_name: transformers
 ---
 
-This is the general
+This is the general version of TaH-plus-1.7B, trained on a mixture of math, code, and science data, presented in the paper [Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models](https://huggingface.co/papers/2511.08577).
 
 Think-at-Hard (TaH) uses a neural decider to dynamically initiate latent iterations only where needed. Compared with baselines that iterate twice for all output tokens, TaH delivers 8.1-11.3% accuracy gains while exempting 94% of tokens from the second iteration. Against strong single-iteration Qwen3 models finetuned with the same data, it also delivers 4.0-5.0% accuracy gains. When allowing less than 3% additional parameters from LoRA and the iteration decider, the gains increase to 8.5-12.6% and 5.3-5.4%, respectively.
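For intuition, here is a minimal PyTorch sketch of the selective-iteration idea the paragraph above describes: a lightweight per-token decider flags "hard" tokens after the first forward pass, and only those tokens get a second latent iteration. This is an illustration only, not the released implementation; names such as `IterationDecider` and `selective_second_iteration`, the threshold, and the stand-in `layers` module are all hypothetical.

```python
import torch
import torch.nn as nn

class IterationDecider(nn.Module):
    """Tiny per-token head that flags tokens needing a second latent iteration.

    Hypothetical stand-in for the neural decider described in the paper.
    """
    def __init__(self, hidden_size: int):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len, hidden) -> boolean "hard token" mask (batch, seq_len)
        return self.score(hidden_states).squeeze(-1) > 0.0


def selective_second_iteration(layers: nn.Module,
                               hidden_states: torch.Tensor,
                               decider: IterationDecider) -> torch.Tensor:
    """Run a second pass through `layers` only for tokens the decider marks hard.

    For clarity this sketch computes the second iteration for every token and
    then selects; an efficient implementation would gather only the small
    fraction of hard tokens before the extra pass. In TaH+ the second
    iteration would also apply LoRA deltas to the shared weights (not shown).
    """
    hard = decider(hidden_states)            # (batch, seq_len) bool mask
    refined = layers(hidden_states)          # second latent iteration
    # Keep iteration-1 states for easy tokens, iteration-2 states for hard ones.
    return torch.where(hard.unsqueeze(-1), refined, hidden_states)


if __name__ == "__main__":
    # Example usage with made-up shapes.
    batch, seq_len, hidden = 2, 8, 64
    h1 = torch.randn(batch, seq_len, hidden)     # iteration-1 hidden states
    decider = IterationDecider(hidden)
    layers = nn.Sequential(nn.Linear(hidden, hidden), nn.GELU(),
                           nn.Linear(hidden, hidden))  # stand-in for shared blocks
    h2 = selective_second_iteration(layers, h1, decider)
    print(h2.shape)  # torch.Size([2, 8, 64])
```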