Natural Language Processing, Image Generation
VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction
Context Forcing: Consistent Autoregressive Video Generation with Long Context