Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis Paper • 2605.14392 • Published 7 days ago • 7
FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale Paper • 2605.14445 • Published 7 days ago • 19
Learning POMDP World Models from Observations with Language-Model Priors Paper • 2605.13740 • Published 8 days ago • 4
Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design Paper • 2605.15871 • Published 6 days ago • 13
Skills-Coach: A Self-Evolving Skill Optimizer via Training-Free GRPO Paper • 2604.27488 • Published 21 days ago • 6
StateSMix: Online Lossless Compression via Mamba State Space Models and Sparse N-gram Context Mixing Paper • 2605.02904 • Published Apr 5 • 8
Kronos: A Foundation Model for the Language of Financial Markets Paper • 2508.02739 • Published Aug 2, 2025 • 35
SWE-chat: Coding Agent Interactions From Real Users in the Wild Paper • 2604.20779 • Published 29 days ago • 15
Neural Additive Experts: Context-Gated Experts for Controllable Model Additivity Paper • 2602.10585 • Published Feb 11 • 2
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models Paper • 2602.12036 • Published Feb 12 • 93
Routing the Lottery: Adaptive Subnetworks for Heterogeneous Data Paper • 2601.22141 • Published Jan 29 • 4
ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas Paper • 2601.21558 • Published Jan 29 • 61
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published Dec 8, 2025 • 80
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14, 2025 • 60
Technical Report: Full-Stack Fine-Tuning for the Q Programming Language Paper • 2508.06813 • Published Aug 9, 2025 • 6
Mercury: Ultra-Fast Language Models Based on Diffusion Paper • 2506.17298 • Published Jun 17, 2025 • 10