XMHe
Northrend
AI & ML interests
None yet
Recent Activity
liked a model about 2 months ago: moonshotai/Kimi-K2-Thinking
liked a model about 2 months ago: BAAI/Emu3.5-Image
liked a model 2 months ago: tianweiy/DMD2
Organizations
None yet
image synthetic
- FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
  Paper • 2410.13925 • Published • 24
- BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities
  Paper • 2410.14672 • Published • 8
- Scalable Ranked Preference Optimization for Text-to-Image Generation
  Paper • 2410.18013 • Published • 14
- DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation
  Paper • 2410.18666 • Published • 19
video synthetic
- BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way
  Paper • 2410.06241 • Published • 10
- Animate-X: Universal Character Image Animation with Enhanced Motion Representation
  Paper • 2410.10306 • Published • 56
- Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
  Paper • 2410.10774 • Published • 25
- Allegro: Open the Black Box of Commercial-Level Video Generation Model
  Paper • 2410.15458 • Published • 40
controllable synthetic
- OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction
  Paper • 2410.04932 • Published • 9
- ControlAR: Controllable Image Generation with Autoregressive Models
  Paper • 2410.02705 • Published • 11
- MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models
  Paper • 2410.13370 • Published • 37
- GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation
  Paper • 2410.20474 • Published • 14
llm and mllm
Basic CV task
dataset and benchmark
ML Basic