view article Article Intel XPU Kernel Skill: LLM-driven Triton kernel optimization for the Hugging Face Kernel Hub danf • 4 days ago • 9
view article Article Getting More from Your Test-Time Compute Budget with Portfolio Beam Search danelbaz • Feb 24 • 8
Prune Once for All: Sparse Pre-Trained Language Models Paper • 2111.05754 • Published Nov 10, 2021 • 2
view article Article DeepMath: A lightweight math reasoning Agent with smolagents +1 danf, mber, moshew • Dec 4, 2025 • 40
view article Article Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models +3 imargulis, ofirzaf, sguskin, guybd, pcuenq • Sep 29, 2025 • 25
view article Article Breaking Language Barriers in Mathematical AI: Introducing Hebrew Math Tutor danf • Sep 7, 2025 • 3
view article Article Introducing HELMET: Holistically Evaluating Long-context Language Models +5 hyen, gaotianyu1350, houminmin, kding1, danf, moshew, cdq10131 • Apr 16, 2025 • 42
view article Article Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques jmamou • Mar 24, 2025 • 20
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models Paper • 2502.09390 • Published Feb 13, 2025 • 16