jzhihui's Collections
			 
		
			
		useful-paper
		
	updated
			
 
				
				
	
	
	
			
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
Paper • 2312.13964 • Published • 20
			
 
	
	 
	
	
	
			
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper • 2312.11514 • Published • 260
			
 
	
	 
	
	
	
			
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
Paper • 2312.12491 • Published • 74
			
 
	
	 
	
	
	
			
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model
Paper • 2401.02330 • Published • 18
			
 
	
	 
	
	
	
			
TinyLlama: An Open-Source Small Language Model
Paper • 2401.02385 • Published • 94
			
 
	
	 
	
	
	
			
Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper • 2401.02038 • Published • 65
			
 
	
	 
	
	
	
			
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Paper • 2401.15024 • Published • 74
			
 
	
	 
	
	
	
			
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Paper • 2401.16658 • Published • 14
			
 
	
	 
	
	
	
			
H2O-Danube-1.8B Technical Report
Paper • 2401.16818 • Published • 18
			
 
	
	 
	
	
	
			
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Paper • 2402.14905 • Published • 134
			
 
	
	 
	
	
	
			
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Paper • 2402.16153 • Published • 60
			
 
	
	 
	
	
	
			
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 625
			
 
	
	 
	
	
	
			
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper • 2402.17177 • Published • 88
			
 
	
	 
	
	
	
			
BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences
Paper • 2403.09347 • Published • 22
			
 
	
	 
	
	
	
			
Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs
Paper • 2403.20041 • Published • 34
			
 
	
	 
	
	
	
			
TextCraftor: Your Text Encoder Can be Image Quality Controller
Paper • 2403.18978 • Published • 15
			
 
	
	 
	
	
	
			
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens
Paper • 2404.03413 • Published • 28
			
 
	
	 
	
	
	
			
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
Paper • 2404.02258 • Published • 107
			
 
	
	 
	
	
	
			
Long-context LLMs Struggle with Long In-context Learning
Paper • 2404.02060 • Published • 37
			
 
	
	 
	
	
	
			
Social Skill Training with Large Language Models
Paper • 2404.04204 • Published • 16
			
 
	
	 
	
	
	
			
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
Paper • 2404.04167 • Published • 14
			
 
	
	 
	
	
	
			
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Paper • 2404.05674 • Published • 15
			
 
	
	 
	
	
	
			
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Paper • 2404.05726 • Published • 23
			
 
	
	 
	
	
	
			
LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model
Paper • 2404.01331 • Published • 27
			
 
	
	 
	
	
	
			
EdgeFusion: On-Device Text-to-Image Generation
Paper • 2404.11925 • Published • 23
			
 
	
	 
	
	
	
			
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study
Paper • 2404.14047 • Published • 45
			
 
	
	 
	
	
	
			
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper • 2404.14700 • Published • 32
			
 
	
	 
	
	
	
			
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper • 2404.14219 • Published • 258
			
 
	
	 
	
	
	
			
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
Paper • 2404.14619 • Published • 126
			
 
	
	 
	
	
	
			
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 118
			
 
	
	 
	
	
	
			
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Paper • 2406.01014 • Published • 34