Collections

Collections including paper arxiv:2512.13687

Towards Scalable Pre-training of Visual Tokenizers for Generation

MiniMaxAI/VTP-Small-f16d64

Image Feature Extraction • 0.2B • Updated 13 days ago • 16.3k • 10
MiniMaxAI/VTP-Base-f16d64

Image Feature Extraction • 0.3B • Updated 13 days ago • 15.5k • 17
MiniMaxAI/VTP-Large-f16d64

Image Feature Extraction • 0.7B • Updated 13 days ago • 16k • 12
Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published 14 days ago • 96

Training and tuning

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Paper • 2512.12967 • Published 14 days ago • 101
Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published 14 days ago • 96

Tokenization methods and language modelling

Continuous Autoregressive Language Models

Paper • 2510.27688 • Published Oct 31 • 70
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space

Paper • 2505.13181 • Published May 19 • 9
Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25 • 73
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation

Paper • 2503.16430 • Published Mar 20 • 34

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 447 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

yandex/stable-diffusion-3.5-medium-alchemist

Text-to-Image • Updated May 16 • 4 • 6
Ovis-U1 Technical Report

Paper • 2506.23044 • Published Jun 29 • 61
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model

Paper • 2507.01953 • Published Jul 2 • 18
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory

Paper • 2507.01945 • Published Jul 2 • 76

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published 14 days ago • 96
MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published 13 days ago • 114

Test-Time Scaling with Reflective Generative Model

Paper • 2507.01951 • Published Jul 2 • 107
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 151
Autoregressive Diffusion Models

Paper • 2110.02037 • Published Oct 5, 2021
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Paper • 2502.09509 • Published Feb 13 • 8

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Paper • 2404.15653 • Published Apr 24, 2024 • 29
MoDE: CLIP Data Experts via Clustering

Paper • 2404.16030 • Published Apr 24, 2024 • 15
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20, 2024 • 50
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21, 2024 • 33
