MoM
Collection, 9 items
Model from the paper "MoM: Linear Sequence Modeling with Mixture-of-Memories".
The model was trained on a 100B-token sample of SlimPajama. We use Gated DeltaNet as the memory update mechanism.
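
To illustrate what "memory update mechanism" means here, the following is a minimal sketch, not the released implementation. It assumes the standard gated delta rule, S <- alpha * (S - beta * (S k) k^T) + beta * v k^T, and a simplified mixture-of-memories step in which a router picks the top-k memories, updates only those, and mixes their reads with softmax weights. The names `gated_delta_update`, `read`, and `mom_step` are hypothetical. Scalar `alpha` and `beta`, shared `k` and `v` across memories, and the absence of a shared memory are simplifying assumptions.

```python
import math


def gated_delta_update(S, k, v, alpha, beta):
    """One gated delta rule step on a (d_v x d_k) memory matrix S.

    alpha in [0, 1] decays the old state; beta in [0, 1] is the write strength.
    """
    # What the memory currently returns for key k.
    Sk = [sum(s * kj for s, kj in zip(row, k)) for row in S]
    return [
        [alpha * (S[i][j] - beta * Sk[i] * k[j]) + beta * v[i] * k[j]
         for j in range(len(k))]
        for i in range(len(v))
    ]


def read(S, q):
    """Query the memory: returns S q."""
    return [sum(s * qj for s, qj in zip(row, q)) for row in S]


def mom_step(memories, router_logits, q, k, v, alpha, beta, top_k=2):
    """Simplified mixture-of-memories step (assumed form, see lead-in).

    Updates only the top_k routed memories in place and returns the
    softmax-weighted mix of their reads plus the chosen indices.
    """
    chosen = sorted(range(len(memories)),
                    key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Softmax restricted to the chosen memories.
    m = max(router_logits[i] for i in chosen)
    exps = {i: math.exp(router_logits[i] - m) for i in chosen}
    z = sum(exps.values())
    out = [0.0] * len(v)
    for i in chosen:
        memories[i] = gated_delta_update(memories[i], k, v, alpha, beta)
        r = read(memories[i], q)
        out = [o + (exps[i] / z) * rj for o, rj in zip(out, r)]
    return out, chosen


if __name__ == "__main__":
    mems = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(4)]
    out, chosen = mom_step(mems, [0.1, 2.0, -1.0, 2.0],
                           q=[1.0, 0.0], k=[1.0, 0.0], v=[2.0, 3.0],
                           alpha=1.0, beta=1.0)
    print(chosen, out)
```

With `alpha = 1` and `beta = 1`, writing a new value under an existing key replaces the old value instead of adding to it, which is the property that distinguishes the delta rule from plain additive linear attention. Memories that the router does not select are left untouched.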