OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification Paper • 2606.01476 • Published 6 days ago • 8
Synthetic Sandbox for Training Machine Learning Engineering Agents Paper • 2604.04872 • Published Apr 6 • 14
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents Paper • 2510.24702 • Published Oct 28, 2025 • 31
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12, 2025 • 216
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs Paper • 2506.10128 • Published Jun 11, 2025 • 22