Article • Efficient LLM Pretraining: Packed Sequences and Masked Attention • Oct 7, 2024 • 60
Space • The Ultra-Scale Playbook • The ultimate guide to training LLMs on large GPU clusters • 3.49k
Space • ControlNet for Interior Design • Upload an image and edit it using segmentation, inpainting, or regeneration • 253
Paper • LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders • 2404.05961 • Published Apr 9, 2024 • 66
Paper • Linear Transformers with Learnable Kernel Functions are Better In-Context Models • 2402.10644 • Published Feb 16, 2024 • 81