hey hey @mradermacher - VB from Hugging Face here, we'd love to onboard you over to our optimised xet backend!
as you know, we're in the process of upgrading our storage backend to xet, which helps us scale and offer blazingly fast upload/download speeds too: https://huggingface.co/blog/xet-on-the-hub
now that we are certain the backend can scale even with big models like Llama 4/Qwen 3, we're moving to the next phase: inviting impactful orgs and users on the hub over. as you are a big part of the open source ML community, we would love to onboard you next and create some excitement about it in the community too!
in terms of actual steps - it should be as simple as one of the org admins joining hf.co/join/xet - we'll take care of the rest.
A few days back, I posted about my ongoing research on building reasoning Mamba models, and I got great insights from the community.
Today, I am announcing an update to the model weights. With newer checkpoints, the Falcon3 Mamba R1 model now outperforms very large transformer-based LLMs (including Gemini) on the Formal Logic questions of MMLU. It scores 60% on Formal Logic, which is considered a tough subset of MMLU.
I would highly appreciate your insights and suggestions on this new checkpoint.
I want to share my work on creating a reasoning Mamba model.
I used GRPO over Falcon3 Mamba Instruct to make this model. It generates blazing-fast responses while building good logic to answer challenging questions.
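For readers new to GRPO, here is a minimal sketch of its core idea: several answers are sampled for the same prompt, and each answer's reward is normalized against the rest of its group instead of against a learned value model. The reward values, the function name, and the use of the population standard deviation are illustrative assumptions; the post does not describe the actual reward function or training setup.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the other answers sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations vary
    # eps keeps the division finite when every answer in the group got the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt (1.0 = correct).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Answers that beat their group average get a positive advantage and are reinforced; the rest are pushed down.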
Llava o1 - VLM capable of spontaneous, systematic reasoning, similar to GPT-o1; the 11B model outperforms gemini-1.5-pro, gpt-4o-mini, and llama-3.2-90B-vision Xkev/Llama-3.2V-11B-cot
Jina CLIP v2 by Jina AI - general purpose multilingual and multimodal (text & image) embedding model, 900M params, 512 x 512 resolution, Matryoshka representations (1024 to 64) jinaai/jina-clip-v2
Athene v2 Chat & Agent by NexusFlow - SoTA general LLM fine-tuned from Qwen 2.5 72B, excels at Chat + Function Calling/JSON/Agents Nexusflow/athene-v2-6735b85e505981a794fb02cc
Orca Agent Instruct by Microsoft - 1 million instruct pairs covering text editing, creative writing, coding, reading comprehension, etc - permissively licensed microsoft/orca-agentinstruct-1M-v1
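
On the Matryoshka representations listed for Jina CLIP v2 above: the general idea is that a prefix of the full embedding is itself a usable embedding, so a vector can be cut down (for example from 1024 to 64 dimensions) and re-normalized. The sketch below shows that step on a toy vector; it does not call the model, and the function name and values are made up for illustration.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    # Leave an all-zero prefix untouched to avoid dividing by zero
    return [x / norm for x in head] if norm > 0 else head


# Toy 8-dimensional stand-in for a full-size embedding.
full = [0.5, -0.5, 0.5, -0.5, 0.1, 0.2, 0.3, 0.4]
small = truncate_embedding(full, 4)
```

Smaller vectors cut storage and search cost, usually at some loss in retrieval quality.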