Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published Dec 4, 2025 • 167
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation Paper • 2511.23127 • Published Nov 28, 2025 • 43
AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning Paper • 2311.00257 • Published Nov 1, 2023 • 10
Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans? Paper • 2311.00047 • Published Oct 31, 2023 • 10
ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation Paper • 2311.00272 • Published Nov 1, 2023 • 11
De-Diffusion Makes Text a Strong Cross-Modal Interface Paper • 2311.00618 • Published Nov 1, 2023 • 23
The Generative AI Paradox: "What It Can Create, It May Not Understand" Paper • 2311.00059 • Published Oct 31, 2023 • 20
Controllable Music Production with Diffusion Models and Guidance Gradients Paper • 2311.00613 • Published Nov 1, 2023 • 26
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing Paper • 2311.00571 • Published Nov 1, 2023 • 43
FlashDecoding++: Faster Large Language Model Inference on GPUs Paper • 2311.01282 • Published Nov 2, 2023 • 37
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation Paper • 2311.01455 • Published Nov 2, 2023 • 30
In-Context Prompt Editing For Conditional Audio Generation Paper • 2311.00895 • Published Nov 1, 2023 • 11