nilq's Collections
			
		Dynamics of Transformer Language Model Features
		
 - Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
   Paper • 2203.05482
 - Diverse Weight Averaging for Out-of-Distribution Generalization
   Paper • 2205.09739
 - Fusing finetuned models for better pretraining
   Paper • 2204.03044
 - Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs
   Paper • 2309.07311
 - Steering Llama 2 via Contrastive Activation Addition
   Paper • 2312.06681
 - Knowledge Fusion of Large Language Models
   Paper • 2401.10491
 - ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models
   Paper • 2402.00794
 - Resolving Interference When Merging Models
   Paper • 2306.01708
 - Tracking Universal Features Through Fine-Tuning and Model Merging
   Paper • 2410.12391