5

Xiaoyang Cao

Sean13

https://xiaoyangcao1113.github.io/

AI & ML interests

RLHF, Deep Reinforcement Learning

Recent Activity

updated a model about 1 month ago

Sean13/llama-8b-instruct-v0.2-cpo-full-label_smoothing-0.1

published a model about 1 month ago

Sean13/llama-8b-instruct-v0.2-cpo-full-label_smoothing-0.1

updated a model about 1 month ago

Sean13/mistral-7b-instruct-v0.2-cpo-full-label_smoothing-0.1

Organizations

None yet

Sean13's models (61)

Sean13/mistral-7b-instruct-v0.2-emdpo-full

7B • Updated Jul 24, 2025 • 3