Ge Zhang

zhangysk

AI & ML interests

None yet

Recent Activity

upvoted a paper 28 days ago

InCoder-32B: Code Foundation Model for Industrial Scenarios

upvoted a paper 29 days ago

Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining

authored a paper about 1 month ago

VeRA: Verified Reasoning Data Augmentation at Scale

Organizations

authored 4 papers about 1 month ago

VeRA: Verified Reasoning Data Augmentation at Scale

Paper • 2602.13217 • Published Jan 23

Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

Paper • 2602.22675 • Published Feb 26 • 23

$OneMillion-Bench: How Far are Language Agents from Human Experts?

Paper • 2603.07980 • Published Mar 9 • 27

Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining

Paper • 2603.11103 • Published Mar 11 • 9

authored 4 papers 2 months ago

BABE: Biology Arena BEnchmark

Paper • 2602.05857 • Published Feb 5 • 10

Context Forcing: Consistent Autoregressive Video Generation with Long Context

Paper • 2602.06028 • Published Feb 5 • 36

Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities

Paper • 2601.21937 • Published Jan 29 • 19

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

Paper • 2601.21420 • Published Jan 29 • 42

authored 6 papers 3 months ago

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Paper • 2512.12730 • Published Dec 14, 2025 • 52

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Paper • 2601.06002 • Published Jan 9 • 59

submitted a paper to Daily Papers 4 months ago

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Paper • 2512.12730 • Published Dec 14, 2025 • 52

authored a paper 4 months ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 304

authored 4 papers 5 months ago

MME-CC: A Challenging Multi-Modal Evaluation Benchmark of Cognitive Capacity

Paper • 2511.03146 • Published Nov 5, 2025 • 8

RLoop: An Self-Improving Framework for Reinforcement Learning with Iterative Policy Initialization

Paper • 2511.04285 • Published Nov 6, 2025 • 8

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

Paper • 2511.07250 • Published Nov 10, 2025 • 18

DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains

Paper • 2511.10984 • Published Nov 14, 2025 • 6
