stillerman (Jason Stillerman)

upvoted an article 10 months ago

CodeAgents + Structure: A Better Way to Execute Actions

Article • May 28, 2025 • 82

upvoted a paper 11 months ago

LongAttn: Selecting Long-context Training Data via Token-level Attention

Paper • 2502.16860 • Published Feb 24, 2025 • 1

upvoted an article 11 months ago

wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR??

Article • Sep 27, 2024 • 54

upvoted a collection 11 months ago

CLIPPER

Collection

Models and datasets for CLIPPER: Compression enables long-context synthetic data generation • 7 items • Updated Oct 3, 2025 • 5

upvoted 3 articles 11 months ago

Deploying Your FastAPI Applications on Huggingface Via Docker

Article • Dec 11, 2023 • 41

Comparing sub 50GB Llama 4 Scout quants (KLD/Top P)

Article • Apr 9, 2025 • 45

Open R1: Update #2

Article • Feb 10, 2025 • 218

upvoted a paper 12 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7, 2025 • 207

upvoted a paper about 1 year ago

Towards the Law of Capacity Gap in Distilling Language Models

Paper • 2311.07052 • Published Nov 13, 2023 • 2

upvoted an article about 1 year ago

Open R1: Update #3

Article • Mar 11, 2025 • 297

upvoted a paper about 1 year ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 257

upvoted 2 articles about 1 year ago

From PyTorch DDP to Accelerate to Trainer, mastery of distributed training with ease

Article • Oct 21, 2022 • 43

Embodied AI == Unlimited Training Data

Article • Jan 13, 2025 • 4

upvoted a paper over 1 year ago

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Paper • 2410.22366 • Published Oct 28, 2024 • 84

upvoted a paper almost 2 years ago

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

Paper • 2404.00399 • Published Mar 30, 2024 • 42
