Roxanna

borntobeignored

AI & ML interests

None yet

Recent Activity

upvoted an article 12 days ago

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

upvoted an article 12 days ago

From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels

upvoted an article 12 days ago

Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training

Organizations

upvoted 3 articles 12 days ago

Article

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

Sep 11

• 161

Article

From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels

Aug 18

• 83

Article

Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training

Aug 8

• 76

liked a Space about 2 months ago

AgentSeer

🔍

upvoted a paper 2 months ago

Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?

Paper • 2508.19827 • Published Aug 27 • 33

liked a dataset 2 months ago

ai-safety-institute/AgentHarm

Viewer • Updated Dec 19, 2024 • 468 • 1.23k • 38

commented on a paper 3 months ago

Memp: Exploring Agent Procedural Memory

Paper • 2508.06433 • Published Aug 8 • 34

upvoted a paper 3 months ago

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5 • 86

upvoted an article 3 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8

• 705

commented on a paper 3 months ago

The Differences Between Direct Alignment Algorithms are a Blur

Paper • 2502.01237 • Published Feb 3 • 113

liked a dataset 3 months ago

samuelyeh/HalluEntity

Viewer • Updated Apr 24 • 157 • 20 • 2

liked a model 3 months ago

Wan-AI/Wan2.2-T2V-A14B

Text-to-Video • Updated Aug 7 • 8.57k • 335

commented on a paper 3 months ago

Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders

Paper • 2506.14002 • Published Jun 16 • 5

liked a Space 3 months ago

Monet 1.4B Experts Viewer

🔍

Show expert routing examples

commented on a paper 3 months ago

CSR: Achieving 1 Bit Key-Value Cache via Sparse Representation

Paper • 2412.11741 • Published Dec 16, 2024

upvoted 4 papers 3 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 306

∇NABLA: Neighborhood Adaptive Block-Level Attention

Paper • 2507.13546 • Published Jul 17 • 123

LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization

Paper • 2507.15758 • Published Jul 21 • 35

Hierarchical Budget Policy Optimization for Adaptive Reasoning

Paper • 2507.15844 • Published Jul 21 • 16

upvoted a collection 3 months ago

🔍 Interpretability & Analysis of LMs

Collection

Outstanding research in LM interpretability and evaluation, summarized • 134 items • Updated 12 days ago • 116
