Parallel-R1: Towards Parallel Thinking via Reinforcement Learning • Paper • 2509.07980 • Published Sep 9, 2025 • 101
GRAM-R^2: Self-Training Generative Foundation Reward Models for Reward Reasoning • Paper • 2509.02492 • Published Sep 2, 2025 • 1
GRAM: A Generative Foundation Reward Model for Reward Generalization • Paper • 2506.14175 • Published Jun 17, 2025 • 1
GRAM • Collection • Generative Foundation Reward Models for Reward Generalization • 8 items • Updated Jun 19, 2025 • 1
RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data • Paper • 2408.12109 • Published Aug 22, 2024 • 1
A Controlled Study on Long Context Extension and Generalization in LLMs • Paper • 2409.12181 • Published Sep 18, 2024 • 45