Superpositional Gradient Descent: Harnessing Quantum Principles for Model Training • Paper • 2511.01918 • Published Nov 1 • 11
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats • Paper • 2510.25602 • Published Oct 29 • 75
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam • Paper • 2502.17055 • Published Feb 24 • 20
Frac-Connections: Fractional Extension of Hyper-Connections • Paper • 2503.14125 • Published Mar 18 • 22