arxiv:2602.04705
Junyuan Shang
sjy1203
AI & ML interests
NLP
Recent Activity
authored a paper about 1 month ago
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
authored a paper about 1 month ago
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
authored a paper about 1 month ago
ERNIE 5.0 Technical Report
Organizations
None yet