EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24 • 39
CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning Paper • 2401.14011 • Published Jan 25, 2024
HalluDial: A Large-Scale Benchmark for Automatic Dialogue-Level Hallucination Evaluation Paper • 2406.07070 • Published Jun 11, 2024
MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding Paper • 2406.04264 • Published Jun 6, 2024 • 2
CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition Paper • 2502.18913 • Published Feb 26
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors Paper • 2503.16578 • Published Mar 20
Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs Paper • 2505.11842 • Published May 17
EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations Paper • 2505.23018 • Published May 29
Beyond Solving Math Quiz: Evaluating the Ability of Large Reasoning Models to Ask for Information Paper • 2508.11252 • Published Aug 15 • 3
RealTalk-CN: A Realistic Chinese Speech-Text Dialogue Benchmark With Cross-Modal Interaction Analysis Paper • 2508.10015 • Published Aug 6
Reconsidering Overthinking: Penalizing Internal and External Redundancy in CoT Reasoning Paper • 2508.02178 • Published Aug 4
FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions Paper • 2509.17177 • Published Sep 21 • 13
Post • Quick 30s demo of the new Hub > Azure AI integration to deploy HF models in your own Azure account. Now with Py and CLI! GG @alvarobartt @kramp @pagezyhf
Post • Run DeepSeek-V3.1 locally on 170GB RAM with Dynamic 1-bit GGUFs! 🐋 GGUFs: unsloth/DeepSeek-V3.1-GGUF. The 715GB model gets reduced to 170GB (-80% size) by smartly quantizing layers. The 1-bit GGUF passes all our code tests & we fixed the chat template for llama.cpp-supported backends. Guide: https://docs.unsloth.ai/basics/deepseek-v3.1
Query Attribute Modeling: Improving search relevance with Semantic Search and Meta Data Filtering Paper • 2508.04683 • Published Aug 6
DSBC : Data Science task Benchmarking with Context engineering Paper • 2507.23336 • Published Jul 31 • 2
Post • Run OpenAI's new gpt-oss models locally with Unsloth GGUFs! 🔥🦥 20b GGUF: unsloth/gpt-oss-20b-GGUF. 120b GGUF: unsloth/gpt-oss-120b-GGUF. Model will run on 14GB RAM for 20b and 66GB for 120b.