CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion • Paper • 2512.19535 • Published 2 days ago • 4
Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery • Paper • 2407.14499 • Published Jul 19, 2024
Good Teachers Explain: Explanation-Enhanced Knowledge Distillation • Paper • 2402.03119 • Published Feb 5, 2024
Better Understanding Differences in Attribution Methods via Systematic Evaluations • Paper • 2303.11884 • Published Mar 21, 2023
MoshiVis v0.1 • Collection • MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 9 items • Updated about 16 hours ago • 22