Matricardi Fabio
FM-1976
18 followers · 99 following
https://medium.com/@fabio.matricardi
ThePoorGpuGuy
fabiomatricardi
AI & ML interests
Control system engineering, AI, and LLMs with Python. ThePoorGPUguy on Substack.
Recent Activity
Liked a model 4 days ago: Tiiny/SmallThinker-3B-Preview
Liked a model 8 days ago: LiquidAI/LFM2-2.6B-Exp
Reacted to codelion's post with 🚀 8 days ago:
Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!
Key findings from our research on optimal architectures for small language models:
→ Depth beats width: 32 layers outperforms 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning
We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.
Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: https://huggingface.co/codelion/dhara-70m
Organizations
None yet
FM-1976's activity
Liked a model 4 days ago
Tiiny/SmallThinker-3B-Preview • Text Generation • 3B • Updated Jan 16, 2025 • 22.2k • 416
Liked 2 models 8 days ago
LiquidAI/LFM2-2.6B-Exp • Text Generation • 3B • Updated 7 days ago • 6.16k • 300
codelion/dhara-70m • Text Generation • 71.3M • Updated 4 days ago • 3.3k • 29
Liked 2 models 10 days ago
google/t5gemma-2-1b-1b • Image-Text-to-Text • 2B • Updated 15 days ago • 7.17k • 61
facebook/sam-audio-small • Updated 3 days ago • 7.85k • 61
Liked a model 20 days ago
hitonet/hito-1.7b • Text Generation • 2B • Updated 22 days ago • 978 • 7
Liked 4 models 21 days ago
ByteDance-Seed/Seed-X-PPO-7B • Translation • Updated Jul 28, 2025 • 13.9k • 285
NeuML/bert-hash-pico • Updated Oct 9, 2025 • 24 • 3
liu-nlp/hyperllama-180m-multilingual-1x • Text Generation • 0.2B • Updated 21 days ago • 51 • 1
TitleOS/Lightning-1.7B • Text Generation • 2B • Updated 23 days ago • 56 • 3
Liked 2 models 24 days ago
jhu-clsp/ettin-decoder-68m • Fill-Mask • Updated Jul 16, 2025 • 125 • 1
jhu-clsp/ettin-encoder-17m • Fill-Mask • Updated Jul 16, 2025 • 1.24k • 11
Liked 5 models 26 days ago
nvidia/parakeet-tdt-0.6b-v3 • Automatic Speech Recognition • Updated Nov 27, 2025 • 56.2k • 501
UsefulSensors/moonshine • Automatic Speech Recognition • Updated Nov 30, 2025 • 1 • 86
shoumenchougou/RWKV7-G1a-0.1B-GGUF • 0.2B • Updated Oct 16, 2025 • 231 • 3
shoumenchougou/RWKV7-G1b-1.5B-GGUF • 2B • Updated 30 days ago • 178 • 1
onnx-community/ettin-encoder-32m-ONNX • Fill-Mask • Updated 26 days ago • 21 • 1
Liked 3 models 27 days ago
LucidityAI/Astral-0.6B-Flash-Coder • 0.6B • Updated Oct 5, 2025 • 13 • 1
keras/moonshine_tiny_en • Updated Jun 17, 2025 • 11 • 1
mradermacher/aquif-3.5-Nano-1B-GGUF • 2B • Updated Dec 2, 2025 • 369 • 1