- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 39
- Token-Budget-Aware LLM Reasoning
  Paper • 2412.18547 • Published • 46
- Efficiently Serving LLM Reasoning Programs with Certaindex
  Paper • 2412.20993 • Published • 36
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 47
Kai Zuberbühler
kaizuberbuehler
AI & ML interests
language models, agents, image generation, music generation
Recent Activity
updated a collection about 1 month ago
Reasoning, Thinking, RL and Test-Time Scaling
upvoted a paper about 1 month ago
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
upvoted a paper about 1 month ago
π_{0.5}: a Vision-Language-Action Model with Open-World Generalization
Organizations
None yet