- DiffAdapt: Difficulty-Adaptive Reasoning for Token-Efficient LLM Inference (arXiv:2510.19669, published Oct 22, 2025)
- Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression (arXiv:2505.19433, published May 26, 2025)
- The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve? (arXiv:2502.17535, published Feb 24, 2025)
- Perovskite-LLM: Knowledge-Enhanced Large Language Models for Perovskite Solar Cell Research (arXiv:2502.12669, published Feb 18, 2025)
- Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing (arXiv:2502.04411, published Feb 6, 2025)
- Can LLMs Maintain Fundamental Abilities under KV Cache Compression? (arXiv:2502.01941, published Feb 4, 2025)
- ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference (arXiv:2502.00299, published Feb 1, 2025)
- LPZero: Language Model Zero-cost Proxy Search from Zero (arXiv:2410.04808, published Oct 7, 2024)
- Should We Really Edit Language Models? On the Evaluation of Edited Language Models (arXiv:2410.18785, published Oct 24, 2024)
- Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models (arXiv:2311.03687, published Nov 7, 2023)
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning (arXiv:2403.17919, published Mar 26, 2024)
- Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models (arXiv:2406.02924, published Jun 5, 2024)