---
license: apache-2.0
datasets:
- DreamW1ngs/AdaR-orca-math-9k
language:
- en
metrics:
- accuracy
base_model:
- Qwen/Qwen2.5-Math-7B
pipeline_tag: question-answering
---

# Making Mathematical Reasoning Adaptive

📃 Paper | ⚙️ Code | 🤖 Project | 📭 Contact

---

## 🌱 Overview

Large Language Models (LLMs) have shown impressive reasoning capabilities, yet they often rely on **spurious reasoning** — producing answers from superficial features, which leads to failures in robustness and generalization. We propose the **AdaR** framework to enable adaptive reasoning, wherein models rely on problem-solving logic to produce answers. **AdaR** synthesizes logically equivalent queries by varying variable values, and trains models with RLVR on these data to penalize spurious logic while encouraging adaptive logic. The framework integrates *data synthesis* and *RLVR training* to enhance both **robustness (in-domain)** and **generalization (out-of-domain)**.

![AdaR Process Framework](./figs/process.png)

> **Figure 1.**
> *Subfigure I:* Three reasoning modes — direct inference (black), spurious reasoning (red), adaptive reasoning (green).
> *Subfigure II:* Logic-preserving variable perturbation and gold-answer generation via executable logic.
> *Subfigure III:* RLVR optimization encouraging adaptive reasoning through comparative feedback.

## 📈 Highlights

- 🚀 **+8.5 average improvement** across in-domain robustness tasks and out-of-domain tasks.
- 🧮 **Only 9K synthetic examples** needed for significant gains.
- ⚖️ **Enables algebraic thinking** and improves stability under scaling.
- 🔁 **Generalizable framework** applicable to instruct models.

---
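The data-synthesis step (Subfigure II) can be sketched in a few lines of Python. This is an illustrative toy, not the released AdaR pipeline: the template, `solve`, and `synthesize` names are our own. The key idea it shows is that because the problem-solving logic is executable, perturbing variable values yields logically equivalent queries whose gold answers come for free.

```python
import random

# Hypothetical query template with variable slots (illustrative example).
TEMPLATE = "A shop sells {n} pencils at {p} dollars each. What is the total cost?"

def solve(n, p):
    # Executable problem-solving logic: total cost = count * unit price.
    return n * p

def synthesize(num_variants, seed=0):
    """Generate logically equivalent queries by resampling variable values,
    deriving each gold answer by executing the shared solution logic."""
    rng = random.Random(seed)
    variants = []
    for _ in range(num_variants):
        n = rng.randint(2, 50)
        p = rng.randint(1, 20)
        variants.append({
            "query": TEMPLATE.format(n=n, p=p),
            "gold_answer": solve(n, p),
        })
    return variants

for example in synthesize(3):
    print(example["query"], "->", example["gold_answer"])
```

Fixing the random seed makes the synthesized set reproducible, which matters when the same perturbed variants are reused across RLVR training runs.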
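The RLVR step (Subfigure III) rewards answers that can be verified against the executable gold answer. As a hedged sketch (function names are ours, not from the AdaR codebase), the intuition is that averaging a binary verifiable reward over logically equivalent variants penalizes spurious logic: a model that memorized one surface answer fails on the perturbed values, while adaptive logic scores well on all of them.

```python
def verifiable_reward(predicted, gold):
    # Binary verifiable reward: correct final answer -> 1.0, else 0.0.
    return 1.0 if predicted == gold else 0.0

def average_reward(model_fn, variants):
    """Mean verifiable reward across logically equivalent variants
    of the same underlying problem."""
    return sum(
        verifiable_reward(model_fn(v["query"]), v["gold"]) for v in variants
    ) / len(variants)

# Two perturbed variants of one problem whose logic is total = n * p.
variants = [
    {"query": "3 pencils at 4 dollars each", "gold": 12},
    {"query": "7 pencils at 5 dollars each", "gold": 35},
]

# A "spurious" model that memorized the answer 12 only gets one variant right.
spurious_model = lambda query: 12
print(average_reward(spurious_model, variants))  # 0.5

# An "adaptive" model that would apply the underlying logic scores 1.0.
adaptive_model = lambda query: 12 if "3 pencils" in query else 35
print(average_reward(adaptive_model, variants))  # 1.0
```

In an actual RLVR setup this scalar would feed a policy-gradient update; the comparison between the two toy models is only meant to show why the perturbed variants make the reward signal discriminative.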