sxcasf (ads)

AI & ML interests

None yet

Recent Activity

commented on a paper 5 days ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published 14 days ago • 12

upvoted a paper 12 days ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published 14 days ago • 12

commented on a paper 12 days ago

Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation

Paper • 2604.13010 • Published 14 days ago • 12

upvoted a paper 26 days ago

A Survey of On-Policy Distillation for Large Language Models

Paper • 2604.00626 • Published 27 days ago • 12

upvoted a paper about 1 month ago

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published Mar 22 • 77

liked a dataset about 2 months ago

AudioVisual-Caption/ASID-1M

Viewer • Updated Mar 11 • 241k • 1.59k • 82

liked 2 models 3 months ago

Alibaba-Apsara/DASD-30B-A3B-Thinking-Preview

Text Generation • Updated Jan 15 • 172 • 52

Alibaba-Apsara/DASD-4B-Thinking

Text Generation • Updated Jan 15 • 352 • 236

upvoted a collection 3 months ago

DASD-Thinking

Collection

6 items • Updated Feb 3 • 25

liked a dataset 3 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b

Viewer • Updated Jan 31 • 306k • 2.86k • 348

upvoted a paper 3 months ago

Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published Jan 14 • 63

liked a dataset 3 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b-Logprob

Viewer • Updated Jan 15 • 435k • 1.23k • 58

updated 2 datasets 3 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b

Viewer • Updated Jan 31 • 306k • 2.86k • 348

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b-Logprob

Viewer • Updated Jan 15 • 435k • 1.23k • 58

updated 2 models 3 months ago

Alibaba-Apsara/DASD-30B-A3B-Thinking-Preview

Text Generation • Updated Jan 15 • 172 • 52

Alibaba-Apsara/DASD-4B-Thinking

Text Generation • Updated Jan 15 • 352 • 236

upvoted a paper 4 months ago

Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation

Paper • 2512.20908 • Published Dec 24, 2025 • 29

New activity in HuggingFaceTB/Countdown-Task-GOLD 5 months ago

Inconsistent numbers

#1 opened 5 months ago by MysticJay

upvoted a paper 5 months ago

Unified Video Editing with Temporal Reasoner

Paper • 2512.07469 • Published Dec 8, 2025 • 46

New activity in Qwen/Qwen3-1.7B 5 months ago

When enable_thinking=True, why doesn't the chat_template output end with "<think>"?

#16 opened 5 months ago by sxcasf
