AI & ML interests

None defined yet.

romainhuetย 
updated a Space 3 months ago
romainhuetย 
published a Space 3 months ago
reach-vbย 
posted an update 5 months ago
view post
Post
5761
Excited to onboard FeatherlessAI on Hugging Face as an Inference Provider - they bring a fleet of 6,700+ LLMs on-demand on the Hugging Face Hub ๐Ÿคฏ

Starting today, you'd be able to access all those LLMs (OpenAI compatible) on HF model pages and via OpenAI client libraries too! ๐Ÿ’ฅ

Go, play with it today: https://huggingface.co/blog/inference-providers-featherless

P.S. They're also bringing on more GPUs to support all your concurrent requests!
  • 1 reply
ยท
reach-vbย 
posted an update 6 months ago
view post
Post
4612
hey hey @mradermacher - VB from Hugging Face here, we'd love to onboard you over to our optimised xet backend! ๐Ÿ’ฅ

as you know we're in the process of upgrading our storage backend to xet (which helps us scale and offer blazingly fast upload/ download speeds too): https://huggingface.co/blog/xet-on-the-hub and now that we are certain that the backend can scale with even big models like Llama 4/ Qwen 3 - we;re moving to the next phase of inviting impactful orgs and users on the hub over as you are a big part of the open source ML community - we would love to onboard you next and create some excitement about it in the community too!

in terms of actual steps - it should be as simple as one of the org admins to join hf.co/join/xet - we'll take care of the rest.

p.s. you'd need to have a the latest hf_xet version of huggingface_hub lib but everything else should be the same: https://huggingface.co/docs/hub/storage-backends#using-xet-storage

p.p.s. this is fully backwards compatible so everything will work as it should! ๐Ÿค—
ยท
hanzlaย 
posted an update 8 months ago
view post
Post
2134
Hi community,

Few days back, I posted about my ongoing research on making reasoning mamba models and I found great insights from the community.

Today, I am announcing an update to the model weights. With newer checkpoints, the Falcon3 Mamba R1 model now outperforms very large transformer based LLMs (including Gemini) for Formal Logic questions of MMLU. It scores 60% on formal logic which is considered a tough subset of questions in MMLU.

I would highly appreciate your insights and suggestions on this new checkpoint.

Model Repo: hanzla/Falcon3-Mamba-R1-v0

Chat space: hanzla/Falcon3MambaReasoner
hanzlaย 
posted an update 8 months ago
view post
Post
4066
Hello community,

I want to share my work of creating a reasoning mamba model

I used GRPO over Falcon3 Mamba Instruct to make this model. It generates blazing fast response while building good logic to answer challenging questions.

Give it a try:

Model repo: hanzla/Falcon3-Mamba-R1-v0

Space: hanzla/Falcon3MambaReasoner

Looking forward to community feedback.
  • 2 replies
ยท
hanzlaย 
posted an update 9 months ago
view post
Post
1270
Gemma 3 is a game changer for on device multimodal applications.

Try for yourself how a 4 billion parameter model can be so good.

hanzla/PlaygroundGemma3
  • 1 reply
ยท
reach-vbย 
posted an update 12 months ago
view post
Post
7258
VLMs are going through quite an open revolution AND on-device friendly sizes:

1. Google DeepMind w/ PaliGemma2 - 3B, 10B & 28B: google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

2. OpenGVLabs w/ InternVL 2.5 - 1B, 2B, 4B, 8B, 26B, 38B & 78B: https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c

3. Qwen w/ Qwen 2 VL - 2B, 7B & 72B: Qwen/qwen2-vl-66cee7455501d7126940800d

4. Microsoft w/ FlorenceVL - 3B & 8B: @jiuhai

5. Moondream2 w/ 0.5B: https://huggingface.co/vikhyatk/

What a time to be alive! ๐Ÿ”ฅ
reach-vbย 
posted an update 12 months ago
view post
Post
5093
Massive week for Open AI/ ML:

Mistral Pixtral & Instruct Large - ~123B, 128K context, multilingual, json + function calling & open weights
mistralai/Pixtral-Large-Instruct-2411
mistralai/Mistral-Large-Instruct-2411

Allen AI Tรผlu 70B & 8B - competive with claude 3.5 haiku, beats all major open models like llama 3.1 70B, qwen 2.5 and nemotron
allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
allenai/tulu-3-datasets-673b8df14442393f7213f372

Llava o1 - vlm capable of spontaneous, systematic reasoning, similar to GPT-o1, 11B model outperforms gemini-1.5-pro, gpt-4o-mini, and llama-3.2-90B-vision
Xkev/Llama-3.2V-11B-cot

Black Forest Labs Flux.1 tools - four new state of the art model checkpoints & 2 adapters for fill, depth, canny & redux, open weights
reach-vb/black-forest-labs-flux1-6743847bde9997dd26609817

Jina AI Jina CLIP v2 - general purpose multilingual and multimodal (text & image) embedding model, 900M params, 512 x 512 resolution, matroyoshka representations (1024 to 64)
jinaai/jina-clip-v2

Apple AIM v2 & CoreML MobileCLIP - large scale vision encoders outperform CLIP and SigLIP. CoreML optimised MobileCLIP models
apple/aimv2-6720fe1558d94c7805f7688c
apple/coreml-mobileclip

A lot more got released like, OpenScholar (https://huggingface.co/collections/OpenScholar/openscholar-v1-67376a89f6a80f448da411a6), smoltalk ( HuggingFaceTB/smoltalk), Hymba ( nvidia/hymba-673c35516c12c4b98b5e845f), Open ASR Leaderboard ( hf-audio/open_asr_leaderboard) and much more..

Can't wait for the next week! ๐Ÿค—
reach-vbย 
posted an update about 1 year ago
view post
Post
4817
What a brilliant week for Open Source AI!

Qwen 2.5 Coder by Alibaba - 0.5B / 1.5B / 3B / 7B / 14B/ 32B (Base + Instruct) Code generation LLMs, with 32B tackling giants like Gemnini 1.5 Pro, Claude Sonnet
Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f

LLM2CLIP from Microsoft - Leverage LLMs to train ultra-powerful CLIP models! Boosts performance over the previous SOTA by ~17%
microsoft/llm2clip-672323a266173cfa40b32d4c

Athene v2 Chat & Agent by NexusFlow - SoTA general LLM fine-tuned from Qwen 2.5 72B excels at Chat + Function Calling/ JSON/ Agents
Nexusflow/athene-v2-6735b85e505981a794fb02cc

Orca Agent Instruct by Microsoft - 1 million instruct pairs covering text editing, creative writing, coding, reading comprehension, etc - permissively licensed
microsoft/orca-agentinstruct-1M-v1

Ultravox by FixieAI - 70B/ 8B model approaching GPT4o level, pick any LLM, train an adapter with Whisper as Audio Encoder
reach-vb/ultravox-audio-language-model-release-67373b602af0a52b2a88ae71

JanusFlow 1.3 by DeepSeek - Next iteration of their Unified MultiModal LLM Janus with RectifiedFlow
deepseek-ai/JanusFlow-1.3B

Common Corpus by Pleais - 2,003,039,184,047 multilingual, commercially permissive and high quality tokens!
PleIAs/common_corpus

I'm sure I missed a lot, can't wait for the next week!

Put down in comments what I missed! ๐Ÿค—