iliass ayaou
datalyes
6 followers · 2 following
iliass-ayaou-7bb481151
AI & ML interests
information retrieval, patent retrieval, knowledge management, data engineering and architecture, NLP
Recent Activity
Reacted to nouamanetazi's post, about 1 month ago
After training SmolLM3 on 384 H100s for nearly a month, I've come to realize something most people overlook: infrastructure is the make-or-break factor in LLM training. 🔥
Everyone talks about model architecture and data quality. And yes, those matter immensely. But here's what nobody tells you: when your training run fails at 2 AM because of mysterious NCCL errors, or when your expensive GPU cluster is running at 60% efficiency, the problem isn't your model. It's most probably a misuse of the hardware. 🛠️
Questions that seemed simple but had no clear answers: Why is MoE training slower than dense models? Which NCCL flags should we actually set? How often should we checkpoint without killing throughput?
That's why we built The Smol Training Playbook: a complete guide covering everything from model architecture and data curation to the SmolLM3 training marathon, post-training techniques, and crucially, the infrastructure layer that most teams get wrong.
We validated real vs theoretical bandwidth across the entire stack: HBM3 hitting 3 TB/s, NVLink 4.0 reaching 786 GB/s, PCIe Gen4 at 14.2 GB/s. Then we ran collective operations across 128 GPUs (16 nodes, 8xH100s each) and measured how performance degrades at scale: all-reduce drops from 480 GB/s on a single node to 320-350 GB/s across 16 nodes.
If you've ever wondered why your training runs are slower than they should be, or you're planning to scale up and want to avoid expensive mistakes, this guide might save you weeks of debugging.
The Smol Training Playbook: https://lnkd.in/e5MKXUHS
Shared with ❤️ by the HuggingFace team
Reacted to piercus's post, about 1 month ago
Starts erasing! This is made with a one-step SD1.5 LBM [1] eraser! Data is open. Data pipeline is open. Training code is open. On our LBM fork: https://github.com/finegrain-ai/LBM
[1] https://huggingface.co/papers/2503.07535
Authored a paper, about 1 month ago
PatenTEB: A Comprehensive Benchmark and Model Family for Patent Text Embedding
Organizations
datalyes's datasets: 16
datalyes/title2full • Viewer • Updated Oct 28 • 18.7k • 10
datalyes/retrieval_OUT • Viewer • Updated Oct 28 • 15.5k • 7
datalyes/retrieval_MIXED • Viewer • Updated Oct 28 • 15.6k • 8
datalyes/retrieval_IN • Viewer • Updated Oct 28 • 15.8k • 13
datalyes/problem2solution • Viewer • Updated Oct 28 • 18.7k • 9
datalyes/problem2full • Viewer • Updated Oct 28 • 18.7k • 12
datalyes/para_solution • Viewer • Updated Oct 28 • 18.7k • 13
datalyes/para_problem • Viewer • Updated Oct 28 • 18.7k • 11
datalyes/effect2substance • Viewer • Updated Oct 28 • 2.04k • 10
datalyes/effect2full • Viewer • Updated Oct 28 • 2.04k • 17
datalyes/clusters_inventor • Viewer • Updated Oct 28 • 86.8k • 13
datalyes/clusters_ext_full_ipc • Viewer • Updated Oct 28 • 47.2k • 19
datalyes/class_text2ipc3 • Viewer • Updated Oct 28 • 48.7k • 14
datalyes/class_nli_oldnew • Viewer • Updated Oct 28 • 37.9k • 8
datalyes/class_bloom • Viewer • Updated Oct 28 • 19k • 11
datalyes/DAPFAM_patent • Viewer • Updated Sep 10 • 96.5k • 269 • 6