RLinf/RLinf-DreamZero-WAN2.2-5B-LIBERO-SFT-Step18000
13B • Updated • 44
WoVR: World Models as Reliable Simulators for Post-Training VLA Policies with RL
RLinf-Co: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models