nvidia/parakeet-tdt-0.6b-v3 Automatic Speech Recognition • Updated about 1 month ago • 70.2k • 480
Article *Context Is Gold to Find the Gold Passage*: Evaluating and Training Contextual Document Embeddings • Jun 2 • 26
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 15 items • Updated Mar 25 • 65
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 550
OpenGVLab/InternViT-300M-448px-V2_5 Image Feature Extraction • 0.3B • Updated Dec 9, 2024 • 9.85k • 48