Abstract
We propose an extension of the decoder Transformer that conditions its generative process on random latent variables learned without supervision via a variational procedure. Experimental evaluations show that such conditioning translates into substantial improvements on downstream tasks.
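The abstract does not spell out the architecture, but the described combination of a decoder Transformer with variationally learned latents suggests a VAE-style setup: an inference network produces a posterior q(z|x), a latent z is sampled by reparameterization, and the decoder is conditioned on z while being trained with an ELBO objective (reconstruction plus KL). The sketch below illustrates one such setup; all module names, the prefix-token conditioning mechanism, and every hyperparameter are illustrative assumptions, not details from the paper.

```python
# A minimal sketch of a latent-conditioned decoder Transformer, assuming a
# VAE-style training objective. Nothing here is taken from the paper itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentDecoderLM(nn.Module):
    def __init__(self, vocab_size=1000, d_model=128, n_heads=4, n_layers=2, z_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Inference network q(z|x): a small encoder whose output is mean-pooled.
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=1)
        self.to_mu = nn.Linear(d_model, z_dim)
        self.to_logvar = nn.Linear(d_model, z_dim)
        # One assumed conditioning mechanism: map z to a learned prefix embedding
        # that is prepended to the decoder's input sequence.
        self.z_to_prefix = nn.Linear(z_dim, d_model)
        dec_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)                       # (B, T, D)
        h = self.encoder(x).mean(dim=1)              # pooled summary of the sequence
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        prefix = self.z_to_prefix(z).unsqueeze(1)    # (B, 1, D) conditioning token
        seq = torch.cat([prefix, x[:, :-1]], dim=1)  # teacher forcing, shifted right
        causal = nn.Transformer.generate_square_subsequent_mask(seq.size(1)).to(seq.device)
        out = self.decoder(seq, mask=causal)         # causal decoding over prefix + tokens
        logits = self.lm_head(out)                   # position t predicts tokens[:, t]
        # ELBO with unit KL weight: reconstruction term plus KL(q(z|x) || N(0, I)).
        rec = F.cross_entropy(logits.reshape(-1, logits.size(-1)), tokens.reshape(-1))
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kl

# Usage: loss = LatentDecoderLM()(torch.randint(0, 1000, (8, 16))); loss.backward()
```

Prefix-token conditioning is only one of several plausible choices; adding z to every token embedding or injecting it through cross-attention would fit the abstract's description equally well.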