arxiv:2510.05592
ZhuofengLi
AI & ML interests
Agents, Reasoning LLMs/VLLMs, RL
Organizations
models
17
ZhuofengLi/Qwen3-4B-Instruct-2507-DeepReview-lora-sft-ms-swift-v2 • 4B • 4
ZhuofengLi/Qwen3-4B-Instruct-2507-DeepReview-lora-sft-ms-swift-new • 4B • 3
ZhuofengLi/Qwen3-4B-Instruct-2507-DeepReview-lora-sft-ms-swift • 4B • 3
ZhuofengLi/Qwen3-4B-Instruct-2507-DeepReview-lora-sft • 4B • 37
ZhuofengLi/torl-qwen2.5-7b-instruct • 8B • 4
ZhuofengLi/octo-science-qwen2.5-7b-grpo-step-40-v2 • 2B • 5
ZhuofengLi/octo-search-qwen2.5-7b-grpo-155-step-v1 • 8B • 5
ZhuofengLi/octo-search-qwen2.5-7b-grpo-step-60-v1.5 • 2B • 6
ZhuofengLi/tool-n1-multi-turn-reason-lora-sft-1180-step • Text Generation • 8B • 4
ZhuofengLi/xlam-reason-lora-sft-1340-step • Text Generation • 3B • 7
datasets
18
ZhuofengLi/fineweb_indexes • 56
ZhuofengLi/fineweb_corpus • 14.9M • 168
ZhuofengLi/MiroVerse-v0.1 • 142k • 90 • 1
ZhuofengLi/lambda-sft-code-data-gen-st-debug • 5 • 12
ZhuofengLi/lambda-sft-math-data-gen-st-debug • 5 • 10
ZhuofengLi/deepreview-fast-sft-v2 • 13.3k • 3
ZhuofengLi/ICLR_26 • 19.6k • 6
ZhuofengLi/deepreview-fast-sft • 13.4k • 8
ZhuofengLi/deepreview-sft • 41.4k • 18
ZhuofengLi/deepreview-synthesis-sft • 13.4k • 6