Tom Lu

eigentom

https://eigentom.github.io

EigenTom

AI & ML interests

MLLM, Generative AI, Agentic RL

Recent Activity

upvoted a paper 7 days ago

Visual Spatial Tuning

Paper • 2511.05491 • Published 11 days ago • 46

upvoted a paper 17 days ago

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published 19 days ago • 103

upvoted a paper 19 days ago

VisCoder2: Building Multi-Language Visualization Coding Agents

Paper • 2510.23642 • Published 25 days ago • 21

upvoted 6 papers about 1 month ago

WithAnyone: Towards Controllable and ID Consistent Image Generation

Paper • 2510.14975 • Published Oct 16 • 80

BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

Paper • 2510.10666 • Published Oct 12 • 27

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Paper • 2509.25541 • Published Sep 29 • 139

UniVideo: Unified Understanding, Generation, and Editing for Videos

Paper • 2510.08377 • Published Oct 9 • 70

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7 • 101

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published Oct 6 • 111

upvoted 7 papers about 2 months ago

SparseD: Sparse Attention for Diffusion Language Models

Paper • 2509.24014 • Published Sep 28 • 30

Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia

Paper • 2503.01714 • Published Mar 3 • 5

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Paper • 2510.02190 • Published Oct 2 • 18

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1 • 88

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26 • 67

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28 • 171

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Paper • 2509.26346 • Published Sep 30 • 18

upvoted 4 collections about 2 months ago

Critique-Coder

Critique-Coder • 5 items • Updated Sep 30 • 3

Mantis

Mantis model family optimized for multi-image reasoning with interleaved text/image format • 11 items • Updated Jul 2, 2024 • 11

VL-Rethinker

SoTA VLM for Reasoning • 7 items • Updated May 5 • 6

General-Reasoner

Advancing LLMs' general reasoning capabilities • 9 items • Updated Oct 12 • 6