Han-Bit Kang

hbkang

AI & ML interests

ML

Recent Activity

upvoted a paper 4 days ago

See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis

updated a collection 4 days ago

artistic rendering

liked a model 4 days ago

PaddlePaddle/PaddleOCR-VL-1.5

upvoted 2 papers 4 days ago

See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis

Paper • 2602.20951 • Published 6 days ago • 13

SIMSPINE: A Biomechanics-Aware Simulation Framework for 3D Spine Motion Annotation and Benchmarking

Paper • 2602.20792 • Published 6 days ago • 2

upvoted 2 papers 6 days ago

Agents of Chaos

Paper • 2602.20021 • Published 7 days ago • 28

SoulX-FlashHead: Oracle-guided Generation of Infinite Real-time Streaming Talking Heads

Paper • 2602.07449 • Published 23 days ago • 3

upvoted a paper 10 days ago

Optimizing Few-Step Generation with Adaptive Matching Distillation

Paper • 2602.07345 • Published 23 days ago • 9

upvoted a paper 25 days ago

Token Sparse Attention: Efficient Long-Context Inference with Interleaved Token Selection

Paper • 2602.03216 • Published 27 days ago • 12

upvoted a paper 27 days ago

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published about 1 month ago • 107

upvoted a paper 28 days ago

PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing

Paper • 2601.21957 • Published Jan 29 • 19

upvoted a paper 29 days ago

Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 100

upvoted a paper 30 days ago

One-step Latent-free Image Generation with Pixel Mean Flows

Paper • 2601.22158 • Published Jan 29 • 18

upvoted 10 papers about 1 month ago

LongCat-Video Technical Report

Paper • 2510.22200 • Published Oct 25, 2025 • 33

LongCat-Flash-Omni Technical Report

Paper • 2511.00279 • Published Oct 31, 2025 • 26

LongCat-Image Technical Report

Paper • 2512.07584 • Published Dec 8, 2025 • 23

Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text

Paper • 2601.10355 • Published Jan 15 • 39

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published Jan 23 • 176

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 64

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published Jan 28 • 40

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

Paper • 2505.08617 • Published May 13, 2025 • 42

Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Paper • 2601.19895 • Published Jan 27 • 24

Towards Pixel-Level VLM Perception via Simple Points Prediction

Paper • 2601.19228 • Published Jan 27 • 18