Andrew

mendeza

http://andrewmendez.me

AI & ML interests

Computer Vision, Deep Learning

Organizations

upvoted an article 5 months ago

Article

The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU

Weyaxi • Jan 2 • 22

upvoted an article 9 months ago

Article

Jupyter Agents: training LLMs to reason with notebooks

baptistecolle, hannayukhymenko, lvwerra • Sep 10, 2025 • 64

upvoted a paper 9 months ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31, 2025 • 85

upvoted 2 papers 10 months ago

Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs

Paper • 2507.07996 • Published Jul 10, 2025 • 35

Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

Paper • 2506.08343 • Published Jun 10, 2025 • 54

upvoted an article 12 months ago

Article

Tiny Agents: an MCP-powered agent in 50 lines of code

julien-c • Apr 25, 2025 • 308

upvoted a paper 12 months ago

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

Paper • 2506.05523 • Published Jun 5, 2025 • 34

upvoted an article about 1 year ago

Article

Open-Source Handwritten Signature Detection Model

samuellimabraz • Mar 14, 2025 • 121
