Eden Yavin
EdenYav
AI & ML interests
Reinforcement learning, online learning, cybersecurity, large language models
Organizations
None yet
LLM Evaluation
- Are We on the Right Way for Assessing Document Retrieval-Augmented Generation?
  Paper • 2508.03644 • Published • 25
- WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
  Paper • 2508.05748 • Published • 137
- MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers
  Paper • 2508.20453 • Published • 63
LLM in Cybersecurity
VisionLM
Reasoning