V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning Paper • 2603.14482 • Published 17 days ago • 25
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 9 items • Updated about 1 month ago • 88