Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios Paper • 2605.28618 • Published 12 days ago • 31
Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer Paper • 2605.30940 • Published 10 days ago • 37
SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue Paper • 2605.30993 • Published 10 days ago • 56
RULER-Bench: Probing Rule-based Reasoning Abilities of Next-level Video Generation Models for Vision Foundation Intelligence Paper • 2512.02622 • Published Dec 2, 2025 • 10