StoryTeller: Improving Long Video Description through Global Audio-Visual Character Identification Paper • 2411.07076 • Published Nov 11, 2024
AGILE: A Novel Reinforcement Learning Framework of LLM Agents Paper • 2405.14751 • Published May 23, 2024
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory Paper • 2508.09736 • Published Aug 13, 2025
PaSa: An LLM Agent for Comprehensive Academic Paper Search Paper • 2501.10120 • Published Jan 17, 2025