PhyCritic: Multimodal Critic Models for Physical AI • Paper • 2602.11124 • Published 1 day ago • 45
Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models • Paper • 2602.10224 • Published 2 days ago • 15
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning • Paper • 2602.08234 • Published 4 days ago • 64
Chain of Mindset: Reasoning with Adaptive Cognitive Modes • Paper • 2602.10063 • Published 2 days ago • 70
Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems • Paper • 2602.08847 • Published 3 days ago • 20
Rolling Sink: Bridging Limited-Horizon Training and Open-Ended Testing in Autoregressive Video Diffusion • Paper • 2602.07775 • Published 5 days ago • 7
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration • Paper • 2602.05400 • Published 8 days ago • 297
Code2World: A GUI World Model via Renderable Code Generation • Paper • 2602.09856 • Published 2 days ago • 183
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models • Paper • 2602.07026 • Published 10 days ago • 129
AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research • Paper • 2602.06540 • Published 7 days ago • 20
NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models • Paper • 2602.06694 • Published 7 days ago • 12
F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare • Paper • 2602.06717 • Published 6 days ago • 68
DFlash: Block Diffusion for Flash Speculative Decoding • Paper • 2602.06036 • Published 7 days ago • 41
RISE-Video: Can Video Generators Decode Implicit World Rules? • Paper • 2602.05986 • Published 7 days ago • 26
EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models • Paper • 2602.04515 • Published 9 days ago • 38
Self-Hinting Language Models Enhance Reinforcement Learning • Paper • 2602.03143 • Published 10 days ago • 27
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing • Paper • 2602.03560 • Published 9 days ago • 42
OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models • Paper • 2602.04804 • Published 8 days ago • 46