loveit - a SteveMcpoet Collection

updated Aug 8, 2025

Energy-Based Transformers are Scalable Learners and Thinkers

Paper • 2507.02092 • Published Jul 2, 2025 • 69
MOSPA: Human Motion Generation Driven by Spatial Audio

Paper • 2507.11949 • Published Jul 16, 2025 • 25
Sound and Complete Neuro-symbolic Reasoning with LLM-Grounded Interpretations

Paper • 2507.09751 • Published Jul 13, 2025 • 2
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling

Paper • 2507.07982 • Published Jul 10, 2025 • 34
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling

Paper • 2507.07955 • Published Jul 10, 2025 • 27
Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation

Paper • 2507.05963 • Published Jul 8, 2025 • 13
SAMed-2: Selective Memory Enhanced Medical Segment Anything Model

Paper • 2507.03698 • Published Jul 4, 2025 • 12
FAROS: Fair Graph Generation via Attribute Switching Mechanisms

Paper • 2507.03728 • Published Jul 4, 2025 • 2
PresentAgent: Multimodal Agent for Presentation Video Generation

Paper • 2507.04036 • Published Jul 5, 2025 • 11
Kwai Keye-VL Technical Report

Paper • 2507.01949 • Published Jul 2, 2025 • 131
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities

Paper • 2502.11123 • Published Feb 16, 2025
Differential Mamba

Paper • 2507.06204 • Published Jul 8, 2025 • 19
STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models

Paper • 2507.15375 • Published Jul 21, 2025 • 30
Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling

Paper • 2507.11061 • Published Jul 15, 2025 • 37
Deep Researcher with Test-Time Diffusion

Paper • 2507.16075 • Published Jul 21, 2025 • 68
Persona Vectors: Monitoring and Controlling Character Traits in Language Models

Paper • 2507.21509 • Published Jul 29, 2025 • 33
LaTCoder: Converting Webpage Design to Code with Layout-as-Thought

Paper • 2508.03560 • Published Aug 5, 2025 • 24
Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation

Paper • 2508.00428 • Published Aug 1, 2025 • 3
REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation

Paper • 2508.04946 • Published Aug 7, 2025 • 1