Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities Paper • 2503.04721 • Published Mar 6, 2025 • 4
PersonaPlex: Voice and Role Control for Full Duplex Conversational Speech Models Paper • 2602.06053 • Published Jan 14 • 8
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation Paper • 2601.08430 • Published Jan 13 • 62
Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation Paper • 2602.03619 • Published Feb 3 • 27
RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning Paper • 2603.09160 • Published 10 days ago • 15
Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context Paper • 2603.15653 • Published 13 days ago • 4
Reranking with Compressed Document Representation Paper • 2505.15394 • Published May 21, 2025 • 1
Article PISCO-OSCAR: embeddings for efficient Retrieval-Augmented Generation Jun 18, 2025 • 3
PISCO: Pretty Simple Compression for Retrieval-Augmented Generation Paper • 2501.16075 • Published Jan 27, 2025 • 1
Let LLMs Speak Embedding Languages: Generative Text Embeddings via Iterative Contrastive Refinement Paper • 2509.24291 • Published Sep 29, 2025 • 1
Precision Spatio-Temporal Feature Fusion for Robust Remote Sensing Change Detection Paper • 2507.11523 • Published Jul 15, 2025 • 2
TFPI Collection ICLR2026: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners https://arxiv.org/abs/2509.26226 • 14 items • Updated Feb 12 • 1
Composition-RL Collection Datasets and trained checkpoints of Composition-RL: https://github.com/XinXU-USTC/Composition-RL • 13 items • Updated about 5 hours ago • 1
HARE: HumAn pRiors, a key to small language model Efficiency Paper • 2406.11410 • Published Jun 17, 2024 • 40
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings Paper • 2603.13594 • Published 6 days ago • 137
LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval Paper • 2603.01425 • Published 18 days ago • 6
AgentStepper: Interactive Debugging of Software Development Agents Paper • 2602.06593 • Published Feb 6 • 1
CodeV: Code with Images for Faithful Visual Reasoning via Tool-Aware Policy Optimization Paper • 2511.19661 • Published Nov 24, 2025 • 3