In a Training Loop 🔄

aquiffoo

https://aquiffoo.is-a.dev/

AI & ML interests

thanks for everything.

Recent Activity

liked a model 1 day ago

LiquidAI/LFM2.5-1.2B-Thinking

Text Generation • 1B • Updated 1 day ago • 3.01k • 105

liked a model 3 days ago

zai-org/GLM-4.7-Flash

Text Generation • 31B • Updated 2 days ago • 124k • 932

replied to AdinaY's post 4 days ago

i think i'll keep an eye on stepfun this year, they're cooking

reacted to AdinaY's post with 👍 4 days ago

Post

1012

After a VLM, StepFun dropped a new audio model: Step-Audio-R1.1, enabling thinking while speaking 🔥

stepfun-ai/Step-Audio-R1.1

✨ Apache 2.0
✨ Combines dual-brain architecture and acoustic-grounded reasoning to enable real-time dialogue with SOTA-level reasoning

2 replies

upvoted a paper 4 days ago

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published 8 days ago • 180

liked a model 5 days ago

stepfun-ai/Step3-VL-10B

Image-Text-to-Text • 10B • Updated about 3 hours ago • 28.6k • 229

New activity in huggingface/InferenceSupport 7 days ago

meituan-longcat/LongCat-Flash-Thinking-2601

👍 5

#7453 opened 7 days ago by aquiffoo

liked a model 7 days ago

meituan-longcat/LongCat-Flash-Thinking-2601

Text Generation • 562B • Updated about 20 hours ago • 503 • 76

liked a model 8 days ago

zai-org/GLM-Image

Text-to-Image • Updated 7 days ago • 10.8k • 938

liked 2 datasets 9 days ago

MiniMaxAI/OctoCodingBench

Viewer • Updated 9 days ago • 72 • 11.5k • 239

ylecun/mnist

Viewer • Updated Aug 8, 2024 • 70k • 66.2k • 223

liked a model 9 days ago

anwgpt/anwgpt4.1-chat

30.7M • Updated 10 days ago • 17 • 2

liked a model 11 days ago

anwgpt/anwgpt4-chat

Text Generation • 27.2M • Updated 10 days ago • 45 • 1

New activity in aquiffoo/neo-3-1B-A90M-Base 11 days ago

Production deployment considerations

#1 opened 18 days ago by Cagnicolas

liked a model 13 days ago

NousResearch/NousCoder-14B

Text Generation • 15B • Updated 16 days ago • 1.94k • 171

reacted to sergiopaniego's post with 🔥 13 days ago

Post

2214

New GRPO + TRL free Colab notebook out! 🔥

Fine-tune 7B+ models on T4 GPUs thanks to a ton of memory optimizations for GRPO

7B model uses only 9.2 GB VRAM (~7× reduction) 🤯

Try the notebook here 👉 https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_trl_lora_qlora.ipynb