M P
kustomkoder
AI & ML interests
None yet
Organizations
None yet
Vision
- Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use
  Paper • 2403.02626 • Published • 11
- How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective
  Paper • 2509.18905 • Published • 29
- Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets
  Paper • 2509.21245 • Published • 39
- Seedream 4.0: Toward Next-generation Multimodal Image Generation
  Paper • 2509.20427 • Published • 81
visual effects
- IDEA-Bench: How Far are Generative Models from Professional Designing?
  Paper • 2412.11767 • Published
- GeoRemover: Removing Objects and Their Causal Visual Artifacts
  Paper • 2509.18538 • Published
- MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios
  Paper • 2505.21333 • Published • 38
- Fast Training Data Acquisition for Object Detection and Segmentation using Black Screen Luminance Keying
  Paper • 2405.07653 • Published
chroma keying