SteveMcpoet's Collections: loveit
Energy-Based Transformers are Scalable Learners and Thinkers
Paper
• 2507.02092
• Published
• 69
MOSPA: Human Motion Generation Driven by Spatial Audio
Paper
• 2507.11949
• Published
• 25
Sound and Complete Neuro-symbolic Reasoning with LLM-Grounded Interpretations
Paper
• 2507.09751
• Published
• 2
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
Paper
• 2507.07982
• Published
• 34
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling
Paper
• 2507.07955
• Published
• 27
Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation
Paper
• 2507.05963
• Published
• 13
SAMed-2: Selective Memory Enhanced Medical Segment Anything Model
Paper
• 2507.03698
• Published
• 12
FAROS: Fair Graph Generation via Attribute Switching Mechanisms
Paper
• 2507.03728
• Published
• 2
PresentAgent: Multimodal Agent for Presentation Video Generation
Paper
• 2507.04036
• Published
• 11
Kwai Keye-VL Technical Report
Paper
• 2507.01949
• Published
• 131
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities
Paper
• 2502.11123
• Published
Paper
• 2507.06204
• Published
• 19
STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models
Paper
• 2507.15375
• Published
• 30
Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling
Paper
• 2507.11061
• Published
• 37
Deep Researcher with Test-Time Diffusion
Paper
• 2507.16075
• Published
• 68
Persona Vectors: Monitoring and Controlling Character Traits in Language Models
Paper
• 2507.21509
• Published
• 33
LaTCoder: Converting Webpage Design to Code with Layout-as-Thought
Paper
• 2508.03560
• Published
• 24
Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation
Paper
• 2508.00428
• Published
• 3
REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation
Paper
• 2508.04946
• Published
• 1