foundation-multimodal-models/DetailCaps-4870 • Viewer • Updated Feb 17, 2025 • 4.87k • 223 • 14
Running • Featured • 131 • Open VLM Video Leaderboard • VLMEvalKit evaluation results on video understanding benchmarks
Runtime error • Featured • 2.02k • Chat With Janus-Pro-7B • A unified multimodal understanding and generation model.
Running on Zero • Featured • 1.75k • Dia 1.6B • Generate realistic dialogue from a script, using Dia!
Running on Zero • Featured • 781 • UNO FLUX • Generate customized images using text and multiple images
Running on CPU Upgrade • Featured • 1.22k • Open ASR Leaderboard • Explore ASR model performance across languages and datasets