KL for a KL: On-Policy Distillation with Control Variate Baseline Paper • 2605.07865 • Published May 8 • 22
Your Language Model is Its Own Critic: Reinforcement Learning with Value Estimation from Actor's Internal States Paper • 2605.07579 • Published May 8 • 18
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine Paper • 2411.09255 • Published Nov 14, 2024
Learning to Retrieve User History and Generate User Profiles for Personalized Persuasiveness Prediction Paper • 2601.05654 • Published Apr 19
MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge Paper • 2604.18164 • Published Apr 20 • 4
Can Natural Image Autoencoders Compactly Tokenize fMRI Volumes for Long-Range Dynamics Modeling? Paper • 2604.03619 • Published Apr 4 • 9
Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models Paper • 2603.22042 • Published Mar 23 • 3
RFEval: Benchmarking Reasoning Faithfulness under Counterfactual Reasoning Intervention in Large Reasoning Models Paper • 2602.17053 • Published Feb 19 • 1
Online Hybrid Lightweight Representations Learning: Its Application to Visual Tracking Paper • 2205.11179 • Published May 23, 2022
Becoming Experienced Judges: Selective Test-Time Learning for Evaluators Paper • 2512.06751 • Published Dec 7, 2025 • 1
Factorizing Perception and Policy for Interactive Instruction Following Paper • 2012.03208 • Published Dec 6, 2020
Context-Aware Planning and Environment-Aware Memory for Instruction Following Embodied Agents Paper • 2308.07241 • Published Aug 14, 2023
Story Visualization by Online Text Augmentation with Context Memory Paper • 2308.07575 • Published Aug 15, 2023 • 1
Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback Paper • 2402.03746 • Published Feb 6, 2024
Multi-Level Compositional Reasoning for Interactive Instruction Following Paper • 2308.09387 • Published Aug 18, 2023
Online Continual Learning on Hierarchical Label Expansion Paper • 2308.14374 • Published Aug 28, 2023