Xintao Wang

Neph0s

https://neph0s.github.io/

AI & ML interests

None yet

Organizations

None yet

upvoted a collection about 1 month ago

LightReasoner Models

https://arxiv.org/abs/2510.07962 • 3 items • Updated about 1 month ago • 5

upvoted 2 papers 2 months ago

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18 • 109

Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents

Paper • 2509.09265 • Published Sep 11 • 46

upvoted 3 papers 3 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Paper • 2508.20096 • Published Aug 27 • 36

WideSearch: Benchmarking Agentic Broad Info-Seeking

Paper • 2508.07999 • Published Aug 11 • 109

upvoted a paper 5 months ago

Is Extending Modality The Right Path Towards Omni-Modality?

Paper • 2506.01872 • Published Jun 2 • 23

upvoted 3 papers 6 months ago

A Controllable Examination for Long-Context Language Models

Paper • 2506.02921 • Published Jun 3 • 33

ARIA: Training Language Agents with Intention-Driven Reward Aggregation

Paper • 2506.00539 • Published May 31 • 30

ARM: Adaptive Reasoning Model

Paper • 2505.20258 • Published May 26 • 45

upvoted 3 papers 8 months ago

JiraiBench: A Bilingual Benchmark for Evaluating Large Language Models' Detection of Human Self-Destructive Behavior Content in Jirai Community

Paper • 2503.21679 • Published Mar 27 • 1

CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era

Paper • 2503.12329 • Published Mar 16 • 27

Implicit Reasoning in Transformers is Reasoning through Shortcuts

Paper • 2503.07604 • Published Mar 10 • 23

upvoted 2 papers 9 months ago

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published Feb 11 • 54

CoSER: Coordinating LLM-Based Persona Simulation of Established Roles

Paper • 2502.09082 • Published Feb 13 • 30

upvoted a paper 10 months ago

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published Jan 20 • 109

upvoted 3 papers about 1 year ago

VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI

Paper • 2410.11623 • Published Oct 15, 2024 • 49

Revealing the Barriers of Language Agents in Planning

Paper • 2410.12409 • Published Oct 16, 2024 • 27

Foundation Models for Music: A Survey

Paper • 2408.14340 • Published Aug 26, 2024 • 44