- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 129
- Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
  Paper • 2403.09029 • Published • 55
- GiT: Towards Generalist Vision Transformer through Universal Language Interface
  Paper • 2403.09394 • Published • 27
Xijia Tao
Cie1
AI & ML interests
Multimodal tool-calling agents, Diffusion large language models
Recent Activity
liked a model 12 days ago: Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
liked a model 25 days ago: Dream-org/DreamOn-v0-7B
Organizations
None yet