One Model, Many Latencies: Universal Speech Enhancement for Diverse Real-Time Applications Paper • 2606.25621 • Published 4 days ago • 13
Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 12 days ago • 63
SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning Paper • 2606.13673 • Published 17 days ago • 108
FineBench: Benchmarking and Enhancing Vision-Language Models for Fine-grained Human Activity Understanding Paper • 2605.19846 • Published May 20 • 3
DVSM: Decoder-only View Synthesis Model Done Right Paper • 2605.29891 • Published about 1 month ago • 2
Physics in 2-Steps: Locking Motion Priors Before Visual Refinement Erases Them Paper • 2606.06361 • Published 24 days ago • 16
Why Far Looks Up: Probing Spatial Representation in Vision-Language Models Paper • 2605.30161 • Published about 1 month ago • 60
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published May 27 • 93
Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation Article • nvidia • May 18 • 21