video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models
Paper: arXiv:2506.15220
Official model release of video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models
Base model: Qwen/Qwen2-7B