Pietro Lesci
pietrolesci
18 followers · 33 following
https://pietrolesci.github.io/
pietro_lesci
pietrolesci
pietrolesci
pietrolesci.bsky.social
AI & ML interests
I like developing and applying causal methods to study the effect of training choices on models’ behaviour, including memorisation, shortcut learning, and tokenisation.
Recent Activity
upvoted an article about 2 months ago: There is no such thing as a tokenizer-free lunch
updated a model 2 months ago: pietrolesci/tokenisers
updated a model 2 months ago: pietrolesci/tokenizers
pietrolesci's datasets (56)
pietrolesci/unimixlm • Viewer • Updated Jul 25 • 81.9M • 463
pietrolesci/me-minipile-evals • Viewer • Updated Jun 3 • 1.22M • 15
pietrolesci/pile-deduped • Viewer • Updated May 5 • 748M • 35
pietrolesci/pythia-deduped-memorisation-profiles • Viewer • Updated Apr 9 • 2.13M • 44
pietrolesci/pile-validation • Viewer • Updated Apr 9 • 429k • 21
pietrolesci/pile-deduped-subset • Viewer • Updated Apr 9 • 16.3k • 45
pietrolesci/pythia-deduped-stats • Viewer • Updated Apr 9 • 16.3M • 153
pietrolesci/pythia-deduped-stats-raw • Viewer • Updated Apr 9 • 14.9M • 29.6k
pietrolesci/agnews • Viewer • Updated Apr 9 • 510k • 23
pietrolesci/amazoncat-13k • Viewer • Updated Apr 9 • 5.99M • 2.48k • 1
pietrolesci/wikitoxic • Viewer • Updated Apr 9 • 894k • 31 • 1
pietrolesci/multiwoz_all_versions • Viewer • Updated Apr 9 • 82k • 41 • 1
pietrolesci/anchoral-paper-artefacts • Viewer • Updated Apr 9 • 2.78M • 362
pietrolesci/pile-deduped-pythia-preshuffled • Viewer • Updated Mar 25 • 244M • 803
pietrolesci/pile-deduped-pythia-tokfreq • Viewer • Updated Mar 17 • 50.1k • 8
pietrolesci/finewebedu-20B • Viewer • Updated Mar 16 • 40.4M • 1.67k
pietrolesci/minipile • Viewer • Updated Feb 27 • 6.06M • 155
pietrolesci/opus-5langs-1M • Viewer • Updated Dec 10, 2024 • 5M • 22
pietrolesci/opus-raw • Viewer • Updated Nov 27, 2024 • 4.06B • 1.87k
pietrolesci/pythia-pile-stats • Viewer • Updated Sep 23, 2024 • 113M • 11
pietrolesci/slim-pajama-eval • Viewer • Updated Sep 16, 2024 • 1.84M • 14 • 1
pietrolesci/pile-subset • Updated Sep 13, 2024 • 37
pietrolesci/cmnist • Viewer • Updated Jul 29, 2024 • 308k • 15
pietrolesci/celeba-wilds • Viewer • Updated Jul 2, 2024 • 203k • 25 • 1
pietrolesci/civilcomments-wilds • Viewer • Updated Jul 2, 2024 • 893k • 41 • 2
pietrolesci/mnli-stats • Viewer • Updated May 13, 2024 • 785k • 13
pietrolesci/mnli-embeddings • Viewer • Updated Mar 22, 2024 • 785k • 8
pietrolesci/_mnli-stats • Viewer • Updated Mar 20, 2024 • 15.7M • 7
pietrolesci/wikitext-103-raw-v1_gpt2-20k • Viewer • Updated Nov 16, 2023 • 8.01M • 25
pietrolesci/yahoo_answers_topics • Viewer • Updated Sep 25, 2023 • 2.92M • 20