Changli Tang

Changli · TCL606

AI & ML interests

Speech signal processing; video understanding; multi-modal LLM

Recent Activity

new activity about 2 months ago

tsinghua-ee/video-SALMONN-2_plus_7B:Question about the base model for LoRA adapter

updated a dataset 2 months ago

tsinghua-ee/AVUTBenchmark

updated a model 2 months ago

tsinghua-ee/video-SALMONN-2_plus_72B

New activity in tsinghua-ee/video-SALMONN-2_plus_7B about 2 months ago

Question about the base model for LoRA adapter

#3 opened 2 months ago by

zhangchunjie1999

updated a dataset 2 months ago

tsinghua-ee/AVUTBenchmark

Viewer • Updated Sep 28 • 3.28k • 4.51k

updated 4 models 2 months ago

tsinghua-ee/video-SALMONN-2_plus_72B

Updated Sep 28 • 5 • 1

tsinghua-ee/video-SALMONN-2_plus_3B

Updated Sep 28 • 10 • 2

tsinghua-ee/video-SALMONN-2_plus_7B

Updated Sep 28 • 96 • 4

tsinghua-ee/video-SALMONN-2

Video-Text-to-Text • 9B • Updated Sep 28 • 274

authored 3 papers 5 months ago

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models

Paper • 2310.05863 • Published Oct 9, 2023 • 2

Enhancing Multimodal LLM for Detailed and Accurate Video Captioning using Multi-Round Preference Optimization

Paper • 2410.06682 • Published Oct 9, 2024

video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models

Paper • 2506.15220 • Published Jun 18 • 1

upvoted a paper 5 months ago

video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models

Paper • 2506.15220 • Published Jun 18 • 1

updated 3 datasets 8 months ago

Changli/Ytb_Video

Viewer • Updated Apr 28 • 5.57k • 187
