MMLU Pro benchmark for GGUFs (1 shot) Collection "Not all quantized model perform good", serving framework ollama uses NVIDIA gpu, llama.cpp uses CPU with AVX & AMX • 13 items • Updated Aug 15, 2025 • 9
Wan2.1 14B T2V LoRAs Collection A collection of Remade's Wan2.1 14B T2V LoRAs • 20 items • Updated Mar 27, 2025 • 35
Wan2.1 14B 480p I2V LoRAs Collection A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 49 items • Updated May 24, 2025 • 208
MedGemma Release Collection Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 7 items • Updated Jul 11, 2025 • 370
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space Paper • 2505.13308 • Published May 19, 2025 • 27
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 166
Step-Audio Collection Step-Audio model family, including Audio-Tokenizer, Audio-Chat and TTS • 4 items • Updated Jul 31, 2025 • 32
UI-TARS: Pioneering Automated GUI Interaction with Native Agents Paper • 2501.12326 • Published Jan 21, 2025 • 64
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8, 2025 • 287
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts Paper • 2411.10669 • Published Nov 16, 2024 • 10
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129