Li Dong
unilm
AI & ML interests
Language Model Pre-Training
Recent Activity
authored a paper 37 minutes ago: MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems
authored a paper 37 minutes ago: Towards Stable and Effective Reinforcement Learning for Mixture-of-Experts
authored a paper 37 minutes ago: Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge