ontocord/1.7b-MixtureVitae-300BT-v1-instruct-16k Feature Extraction • 2B • Updated 9 days ago • 480
MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources Paper • 2509.25531 • Published 28 days ago • 7
Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks Paper • 2508.18672 • Published Aug 26 • 10
mSCoRe: a Multilingual and Scalable Benchmark for Skill-based Commonsense Reasoning Paper • 2508.10137 • Published Aug 13 • 2
Lizard: An Efficient Linearization Framework for Large Language Models Paper • 2507.09025 • Published Jul 11 • 18
SLR: An Automated Synthesis Framework for Scalable Logical Reasoning Paper • 2506.15787 • Published Jun 18 • 2