Shrey Pandit

SP2001

https://sites.google.com/view/shrey-pandit/home

AI & ML interests

None yet

Recent Activity

updated a dataset 1 day ago

SP2001/FRAMES_judge_passed_unique_questions

Viewer • Updated 1 day ago • 478 • 3

published a dataset 1 day ago

SP2001/FRAMES_judge_passed_unique_questions

Viewer • Updated 1 day ago • 478 • 3

updated a dataset 3 days ago

SP2001/Browsecomp_judge_passed_unique_questions

Viewer • Updated 3 days ago • 212 • 8

published a dataset 3 days ago

SP2001/Browsecomp_judge_passed_unique_questions

Viewer • Updated 3 days ago • 212 • 8

upvoted a paper 22 days ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 23 days ago • 108

liked a dataset 22 days ago

Salesforce/LiveResearchBenchFull

Viewer • Updated 11 days ago • 772 • 178 • 4

updated a dataset 22 days ago

SP2001/Browsecomp-style

Viewer • Updated 22 days ago • 3 • 32

published a dataset 22 days ago

SP2001/Browsecomp-style

Viewer • Updated 22 days ago • 3 • 32

liked a dataset 26 days ago

Salesforce/Hard2Verify

Viewer • Updated Oct 17 • 200 • 301 • 6

authored a paper 29 days ago

Synthesizing Agentic Data for Web Agents with Progressive Difficulty Enhancement Mechanisms

Paper • 2510.13913 • Published Oct 15 • 3

upvoted 2 papers about 1 month ago

LiveResearchBench: A Live Benchmark for User-Centric Deep Research in the Wild

Paper • 2510.14240 • Published Oct 16 • 11

Synthesizing Agentic Data for Web Agents with Progressive Difficulty Enhancement Mechanisms

Paper • 2510.13913 • Published Oct 15 • 3

commented on a paper about 1 month ago

Synthesizing Agentic Data for Web Agents with Progressive Difficulty Enhancement Mechanisms

Paper • 2510.13913 • Published Oct 15 • 3

authored 2 papers about 1 month ago

EgoVLM: Policy Optimization for Egocentric Video Understanding

Paper • 2506.03097 • Published Jun 3

Hard2Verify: A Step-Level Verification Benchmark for Open-Ended Frontier Math

Paper • 2510.13744 • Published Oct 15 • 5

upvoted 2 papers about 1 month ago

Hard2Verify: A Step-Level Verification Benchmark for Open-Ended Frontier Math

Paper • 2510.13744 • Published Oct 15 • 5

Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels

Paper • 2510.06499 • Published Oct 7 • 31

authored a paper 2 months ago

SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Paper • 2509.06283 • Published Sep 8 • 17

upvoted 2 papers 2 months ago

SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Paper • 2509.06283 • Published Sep 8 • 17

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 192
