Xiyao Wang

russwang

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago

CocoaBench: Evaluating Unified Digital Agents in the Wild

Paper • 2604.11201 • Published 7 days ago • 33

upvoted a paper 12 days ago

Synthetic Sandbox for Training Machine Learning Engineering Agents

Paper • 2604.04872 • Published 14 days ago • 14

updated 2 datasets 15 days ago

russwang/ThinkLite-VL-70k

Viewer • Updated 15 days ago • 70k • 57 • 5

russwang/ThinkLite-VL-hard-11k

Viewer • Updated 15 days ago • 11k • 72 • 3

updated a model 15 days ago

russwang/ThinkLite-VL-7B

8B • Updated 15 days ago • 1.25k • 16

authored a paper about 1 month ago

Agentic Critical Training

Paper • 2603.08706 • Published Mar 9 • 14

upvoted a paper about 1 month ago

Agentic Critical Training

Paper • 2603.08706 • Published Mar 9 • 14

updated a dataset 2 months ago

russwang/flux-50k

Viewer • Updated Feb 12 • 50k • 70

published a dataset 2 months ago

russwang/flux-50k

Viewer • Updated Feb 12 • 50k • 70

updated a dataset 2 months ago

russwang/blip3o-long-caption-50k

Viewer • Updated Feb 11 • 50k • 15

published a dataset 2 months ago

russwang/blip3o-long-caption-50k

Viewer • Updated Feb 11 • 50k • 15

liked a dataset 2 months ago

Kwai-Keye/Thyme-RL

Viewer • Updated Aug 18, 2025 • 55.2k • 877 • 11

upvoted 2 papers 3 months ago

Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents

Paper • 2601.18217 • Published Jan 26 • 13

Token-Level LLM Collaboration via FusionRoute

Paper • 2601.05106 • Published Jan 8 • 40

authored a paper 5 months ago

Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following

Paper • 2511.21662 • Published Nov 26, 2025 • 11

upvoted a paper 5 months ago

Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following

Paper • 2511.21662 • Published Nov 26, 2025 • 11

liked a model 5 months ago

lmms-lab/LLaVA-Critic-R1-7B-Plus-Qwen

8B • Updated Jul 26, 2025 • 60 • 5

upvoted a paper 5 months ago

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published Nov 5, 2025 • 83

upvoted a paper 6 months ago

When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought

Paper • 2511.02779 • Published Nov 4, 2025 • 60

authored a paper 6 months ago

ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation

Paper • 2511.01163 • Published Nov 3, 2025 • 32
