RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards Paper • 2605.10899 • Published May 11 • 79
Budget-Aware Tool-Use Enables Effective Agent Scaling Paper • 2511.17006 • Published Nov 21, 2025 • 34
CodecLM: Aligning Language Models with Tailored Synthetic Data Paper • 2404.05875 • Published Apr 8, 2024 • 18