Joya Chen PRO

chenjoya

https://chenjoya.github.io/

AI & ML interests

Video LLM

Recent Activity

upvoted a paper about 21 hours ago

WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation

Paper • 2511.11434 • Published 7 days ago • 43

liked 2 models 1 day ago

nvidia/diar_streaming_sortformer_4spk-v2

Audio Classification • Updated 28 days ago • 13k • 73

pyannote/speaker-diarization-community-1

Automatic Speech Recognition • Updated Sep 29 • 197k • 90

upvoted a paper 4 days ago

Virtual Width Networks

Paper • 2511.11238 • Published 7 days ago • 32

upvoted a paper 6 days ago

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published 8 days ago • 77

upvoted a paper 9 days ago

Grounding Computer Use Agents on Human Demonstrations

Paper • 2511.07332 • Published 11 days ago • 99

upvoted a paper 14 days ago

Cambrian-S: Towards Spatial Supersensing in Video

Paper • 2511.04670 • Published 15 days ago • 34

upvoted 2 papers 16 days ago

Revisiting Multimodal Positional Encoding in Vision-Language Models

Paper • 2510.23095 • Published 25 days ago • 20

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

Paper • 2511.02778 • Published 17 days ago • 100

upvoted a paper 22 days ago

ChronoPlay: A Framework for Modeling Dual Dynamics and Authenticity in Game RAG Benchmarks

Paper • 2510.18455 • Published about 1 month ago • 17

upvoted a paper 24 days ago

FARMER: Flow AutoRegressive Transformer over Pixels

Paper • 2510.23588 • Published 25 days ago • 57

liked a dataset 30 days ago

MikhailT/lj-speech

Viewer • Updated Jun 23, 2023 • 13.1k • 310 • 6

liked 2 datasets about 1 month ago

zeyun-zhong/LLaVA-Video-216KQA

Viewer • Updated Oct 18 • 1.53k • 216 • 1

mit-han-lab/Inf-Stream-Train

Preview • Updated about 1 month ago • 2.84k • 1

upvoted a paper about 1 month ago

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Paper • 2510.09608 • Published Oct 10 • 50

liked 2 datasets about 1 month ago

ZaynZhu/Paper2Video

Viewer • Updated Oct 7 • 101 • 112 • 10

Enxin/VideoNSA-data

Viewer • Updated Oct 8 • 162k • 30 • 1

upvoted 2 papers about 2 months ago

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published Oct 6 • 112

Code2Video: A Code-centric Paradigm for Educational Video Generation

Paper • 2510.01174 • Published Oct 1 • 33

liked a model about 2 months ago

Qwen/Qwen3-VL-235B-A22B-Thinking

Image-Text-to-Text • 236B • Updated Oct 4 • 11.3k • 326
