-
I-Con: A Unifying Framework for Representation Learning
Paper • 2504.16929 • Published • 31 -
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens
Paper • 2508.05305 • Published • 48 -
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms
Paper • 2511.04217 • Published • 17 -
Large Language Models as Markov Chains
Paper • 2410.02724 • Published • 33
Collections
Collections including paper arxiv:2603.14482
-
HuggingFaceFW/finetranslations
Viewer • Updated • 3.33B • 117k • 288 -
LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators
Paper • 2411.00136 • Published -
The Illusion of Readiness in Health AI
Paper • 2509.18234 • Published • 1 -
The Roots of Performance Disparity in Multilingual Language Models: Intrinsic Modeling Difficulty or Design Choices?
Paper • 2601.07220 • Published
-
What matters when building vision-language models?
Paper • 2405.02246 • Published • 104 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 90 -
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
Paper • 2405.19707 • Published • 9 -
Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations
Paper • 2410.08049 • Published • 8
-
mHC: Manifold-Constrained Hyper-Connections
Paper • 2512.24880 • Published • 324 -
Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process
Paper • 2512.23988 • Published • 19 -
SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
Paper • 2512.25075 • Published • 16 -
Guiding a Diffusion Transformer with the Internal Dynamics of Itself
Paper • 2512.24176 • Published • 8
-
End-to-End Vision Tokenizer Tuning
Paper • 2505.10562 • Published • 22 -
Global and Local Entailment Learning for Natural World Imagery
Paper • 2506.21476 • Published • 1 -
DINOv3
Paper • 2508.10104 • Published • 307 -
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic
Paper • 2509.01363 • Published • 62