- Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories
  Paper • 2606.02060 • Published • 50
- MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?
  Paper • 2606.01993 • Published • 13
- NJU-LINK/DR3-Eval
  Viewer • Updated • 100 • 2.09k • 2
- TVIR: Building Deep Research Agents Towards Text--Visual Interleaved Report Generation
  Paper • 2606.02320 • Published • 13
Papers
MMG2Skill: Can Agents Distill In-the-Wild Guides into Self-Evolving Skills?
TVIR: Building Deep Research Agents Towards Text--Visual Interleaved Report Generation
Datasets 15
NJU-LINK/OmniCap-IF
Viewer • Updated • 480 • 43 • 1
NJU-LINK/OmniCap-IF-54K
Viewer • Updated • 53.9k • 71
NJU-LINK/AVSCapBench
Viewer • Updated • 1.23k • 838
NJU-LINK/TELBench
Updated • 205 • 1
NJU-LINK/TVIR-Bench
Viewer • Updated • 100 • 78
NJU-LINK/CoVEBench
Viewer • Updated • 626 • 470 • 1
NJU-LINK/WebCompass
Viewer • Updated • 933 • 10.6k • 6
NJU-LINK/ViDiC-1K
Updated • 253 • 5
NJU-LINK/DR3-Eval
Viewer • Updated • 100 • 2.09k • 2
NJU-LINK/CodeTraceBench
Viewer • Updated • 4.32k • 2.97k • 3