---
license: mit
library_name: transformers
pipeline_tag: text-generation
---

# LLaDA-8B-Base

We introduce LLaDA, a diffusion model with an unprecedented 8B scale, trained entirely from scratch, rivaling LLaMA3 8B in performance.

[Project Page](https://ml-gsai.github.io/LLaDA-demo/)

[Code](https://github.com/ML-GSAI/LLaDA)

## Updates

[2025-10-21] We have modified `modeling_llada.py` to accept an `attention_mask` input.
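
To illustrate what the `attention_mask` input is for, here is a minimal sketch of padding a batch of token-id sequences and building the matching mask. The helper names `pad_batch` and `forward_with_mask` are hypothetical and not part of the repository. The sketch assumes right padding, and that the model's forward pass takes `attention_mask` as a keyword argument and returns an object with a `.logits` field. Check `modeling_llada.py` for the actual signature and the padding side it expects.

```python
from typing import List, Tuple


def pad_batch(
    sequences: List[List[int]], pad_id: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token-id sequences to equal length and build the matching mask.

    Mask entries are 1 for real tokens and 0 for padding positions.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


def forward_with_mask(model, input_ids, attention_mask):
    """Hypothetical call: assumes forward() accepts `attention_mask`
    and that the output exposes `.logits`."""
    import torch  # imported lazily so pad_batch runs without PyTorch installed

    ids = torch.tensor(input_ids, device=model.device)
    mask = torch.tensor(attention_mask, device=model.device)
    with torch.no_grad():
        return model(ids, attention_mask=mask).logits
```

In practice the tokenizer can produce both tensors directly when called with `padding=True` and `return_tensors="pt"`. The manual helper only makes explicit which positions the mask marks as padding.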