ReLIFT
ReLIFT is a training method that interleaves reinforcement learning (RL) with online fine-tuning (FT), achieving better performance and efficiency than RL or supervised fine-tuning (SFT) alone.
This repository contains the ReLIFT model presented in the paper "Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions".
Code: https://github.com/TheRoadQaQ/ReLIFT
Hugging Face Collection: https://huggingface.co/collections/RoadQAQ/relift-684535e199a909cad16d8b05