baohao's Collections
SAGE

updated 25 days ago

Self-Hinting Language Models Enhance Reinforcement Learning

  • baohao/aime24

    Viewer • Updated 27 days ago • 30 • 176

  • baohao/aime25

    Viewer • Updated 27 days ago • 30 • 161

  • baohao/amc23

    Viewer • Updated 27 days ago • 40 • 160

  • baohao/olympiadbench

    Viewer • Updated 27 days ago • 675 • 200

  • baohao/minerva_math

    Viewer • Updated 27 days ago • 272 • 153

  • baohao/math500

    Viewer • Updated 27 days ago • 500 • 161

  • baohao/gpqa

    Viewer • Updated 27 days ago • 198 • 149

  • baohao/mmlu_pro

    Viewer • Updated 27 days ago • 12k • 57

  • baohao/sage_train

    Viewer • Updated 8 days ago • 15k • 34

  • baohao/luffy_train

    Viewer • Updated 8 days ago • 15k • 18

  • baohao/scaf-grpo_train

    Viewer • Updated 27 days ago • 15k • 17

  • Self-Hinting Language Models Enhance Reinforcement Learning

    Paper • 2602.03143 • Published Feb 3 • 30

  • baohao/sage_validation

    Viewer • Updated 8 days ago • 1.67k • 19

  • baohao/SAGE_Llama-3.2-3B-Instruct

    4B • Updated 8 days ago • 25

  • baohao/SAGE_Qwen2.5-7B-Instruct

    8B • Updated 8 days ago • 20

  • baohao/SAGE_Qwen3-4B-Instruct-2507

    4B • Updated 8 days ago • 13

  • baohao/SAGE-light_Qwen2.5-7B-Instruct

    8B • Updated 8 days ago • 28 • 2

  • baohao/SAGE-light_Llama-3.2-3B-Instruct

    4B • Updated 8 days ago • 22

  • baohao/SAGE-light_Qwen3-4B-Instruct-2507

    4B • Updated 8 days ago • 19