From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding Paper • 2409.18938 • Published Sep 27, 2024
microsoft/xclip-base-patch16-zero-shot Video Classification • 0.2B • Updated Sep 12, 2023 • 3.66k • 26