MihailSlutsky

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 3 days ago

Beyond Multiple Choice: Verifiable OpenQA for Robust Vision-Language RFT

Paper • 2511.17405 • Published 13 days ago • 10

upvoted 19 papers 8 days ago

SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization

Paper • 2511.06411 • Published 25 days ago • 16

Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions

Paper • 2511.06876 • Published 24 days ago • 26

Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs

Paper • 2511.07419 • Published 24 days ago • 25

Robot Learning from a Physical World Model

Paper • 2511.07416 • Published 24 days ago • 28

IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction

Paper • 2511.07327 • Published 24 days ago • 73

HaluMem: Evaluating Hallucinations in Memory Systems of Agents

Paper • 2511.03506 • Published 29 days ago • 92

Grounding Computer Use Agents on Human Demonstrations

Paper • 2511.07332 • Published 24 days ago • 103

Adaptive Multi-Agent Response Refinement in Conversational Systems

Paper • 2511.08319 • Published 23 days ago • 40

VideoSSR: Video Self-Supervised Reinforcement Learning

Paper • 2511.06281 • Published 25 days ago • 24

The Path Not Taken: RLVR Provably Learns Off the Principals

Paper • 2511.08567 • Published 23 days ago • 31

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published 25 days ago • 127

Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds

Paper • 2511.08892 • Published 22 days ago • 192

TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published 22 days ago • 108

SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control

Paper • 2511.09715 • Published 22 days ago • 8

Black-Box On-Policy Distillation of Large Language Models

Paper • 2511.10643 • Published 21 days ago • 46

PAN: A World Model for General, Interactable, and Long-Horizon World Simulation

Paper • 2511.09057 • Published 22 days ago • 74

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published 21 days ago • 91

SpatialThinker: Reinforcing 3D Reasoning in Multimodal LLMs via Spatial Rewards

Paper • 2511.07403 • Published 24 days ago • 13

LiteAttention: A Temporal Sparse Attention for Diffusion Transformers

Paper • 2511.11062 • Published 20 days ago • 30