papers
FastVLM: Efficient Vision Encoding for Vision Language Models
Paper
• 2412.13303
• Published
• 75
rStar2-Agent: Agentic Reasoning Technical Report
Paper
• 2508.20722
• Published
• 117
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
Paper
• 2508.16279
• Published
• 53
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
Paper
• 2509.12201
• Published
• 106
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning
Paper
• 2509.08519
• Published
• 128
ST-Raptor: LLM-Powered Semi-Structured Table Question Answering
Paper
• 2508.18190
• Published
• 7
A Survey of Reinforcement Learning for Large Reasoning Models
Paper
• 2509.08827
• Published
• 190
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper
• 2508.16153
• Published
• 160
Scaling Agents via Continual Pre-training
Paper
• 2509.13310
• Published
• 117
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
Paper
• 2509.13313
• Published
• 80
Kimi Linear: An Expressive, Efficient Attention Architecture
Paper
• 2510.26692
• Published
• 125
Back to Basics: Let Denoising Generative Models Denoise
Paper
• 2511.13720
• Published
• 69
Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable
Paper
• 2503.00555
• Published
• 1
SAM 3D: 3Dfy Anything in Images
Paper
• 2511.16624
• Published
• 113
SAM 3: Segment Anything with Concepts
Paper
• 2511.16719
• Published
• 129
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Paper
• 2403.03206
• Published
• 71
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
Paper
• 2511.22699
• Published
• 238
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield
Paper
• 2511.22677
• Published
• 33
Latent Collaboration in Multi-Agent Systems
Paper
• 2511.20639
• Published
• 121
ByteDance-Seed/Adversarial-Flow-Models
OmniPSD: Layered PSD Generation with Diffusion Transformer
Paper
• 2512.09247
• Published
• 48