Wan-NVFP4-4Steps Models
NVFP4 Quantization-Aware Step Distillation for Blackwell Architecture
Table of Contents
- Features
- Quick Start
- Generation Results
- Performance Comparison
- Installation
- Usage
- Project Structure
- Notes
- Community
Features
- 4-Step Inference: End-to-end generation in four sampling steps, approaching real-time performance (tested on a single RTX 5090 GPU)
- NVFP4 Quantization: Lower memory and bandwidth usage, optimized for the Blackwell architecture
- LightX2V Integration: Runs on the official LightX2V framework for best performance and stability
- High-Quality Generation: Maintains Wan2.1's video quality while generating in 4 steps
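To make the NVFP4 bullet more concrete, here is a minimal sketch of block-scaled 4-bit quantization in pure Python. It is not the LightX2V kernel: it assumes the commonly described NVFP4 layout (E2M1 values, one scale per 16 elements) and, as a simplification, keeps each block scale as an ordinary float instead of an FP8 value. All function names are hypothetical.

```python
# Illustrative NVFP4-style block quantization (a sketch, not the real kernel).
# Assumption: scales stay plain Python floats here; real NVFP4 stores them in
# FP8 (E4M3) and adds a per-tensor scale on top.

E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)  # non-negative FP4 values
BLOCK_SIZE = 16  # number of elements that share one scale


def quantize_block(block):
    """Return (scale, codes); each code is a signed value from the E2M1 grid."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    scale = amax / E2M1_MAGNITUDES[-1]  # map the largest magnitude onto 6.0
    codes = []
    for x in block:
        target = abs(x) / scale
        # Nearest grid point; ties go to the smaller magnitude in this sketch.
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(-nearest if x < 0 else nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Reconstruct approximate values from a scale and its grid codes."""
    return [scale * c for c in codes]


def quantize_tensor(values):
    """Split a flat list into blocks of BLOCK_SIZE and quantize each one."""
    blocks = [values[i:i + BLOCK_SIZE] for i in range(0, len(values), BLOCK_SIZE)]
    return [quantize_block(b) for b in blocks]


if __name__ == "__main__":
    weights = [((i * 37) % 101 - 50) / 25.0 for i in range(32)]
    for index, (scale, codes) in enumerate(quantize_tensor(weights)):
        restored = dequantize_block(scale, codes)
        start = index * BLOCK_SIZE
        worst = max(abs(a - b) for a, b in zip(weights[start:start + BLOCK_SIZE], restored))
        print(f"block {index}: scale={scale:.4f}, max abs error={worst:.4f}")
```

Because every block carries its own scale, one outlier only degrades the precision of the 16 values around it instead of the whole tensor.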
Quick Start
# 1. Install LightX2V (uses uv; run `pip install uv` first if it is not installed)
git clone https://github.com/ModelTC/LightX2V.git
cd LightX2V
uv pip install -v .
# 2. Install NVFP4 Kernel
pip install scikit_build_core uv
# Clone CUTLASS; the kernel build needs the path to this checkout
git clone https://github.com/NVIDIA/cutlass.git
cd lightx2v_kernel
# Build the wheel; replace /path/to/cutlass with the location of the CUTLASS clone
MAX_JOBS=$(nproc) CMAKE_BUILD_PARALLEL_LEVEL=$(nproc) \
uv build --wheel \
-Cbuild-dir=build . \
-Ccmake.define.CUTLASS_PATH=/path/to/cutlass \
--verbose --color=always --no-build-isolation
# Install the freshly built wheel
pip install dist/*whl --force-reinstall --no-deps
# 3. Run inference
cd examples/wan
python wan_i2v_nvfp4.py # Image-to-Video
python wan_t2v_nvfp4.py # Text-to-Video
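After the install steps above, a quick import check can catch a failed build before the inference scripts run. The following is a minimal sketch; the module names `lightx2v` and `lightx2v_kernel` are assumptions based on the repository and directory names, so adjust them if your install exposes different ones.

```python
import importlib.util

# Assumed import names (not confirmed by this README); change them if needed.
REQUIRED_MODULES = ("lightx2v", "lightx2v_kernel")


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

`importlib.util.find_spec` only locates a top-level module without importing it, so the check stays cheap and does not touch the GPU.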
Generation Results
"A cinematic, hyper-realistic 3D animation, in the somber and beautiful style of Sekiro: Shadows Die Twice. In a vast field of silvery-white pampas grass, under a luminous full moon, the shinobi Wolf stands ready for a final duel..."
| Input Image | Wan2.1-I2V-14B-480P | wan2.1_i2v_480p_nvfp4_lightx2v_4step |
|---|---|---|
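"High contrast, high saturation, short-side composition, sunset, medium focal length, soft light, backlight, warm tones, rim light, medium close-up, daylight, sunny lighting, a close-up of a foreign white woman wearing a black plaid dress and earrings. As the low-angle shot rises, the woman lifts her head, her eyes filled with tears, looking ahead and speaking..." (translated from the original Chinese prompt)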
| Wan2.1-T2V-1.3B | wan2.1_t2v_1_3b_nvfp4_lightx2v_4step |
|---|---|
Performance Comparison
Test environment: single RTX 5090 GPU, LightX2V framework
Image-to-Video (I2V-14B-480P)
Text-to-Video (T2V-1.3B-480P)
Notes
System Requirements
- Required Hardware: NVIDIA RTX 50-series GPUs (RTX 5090/5080/5070/5060) or other Blackwell architecture GPUs
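The hardware requirement above can be checked before installing anything. This sketch parses the output of `nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader`. The threshold is an assumption: it treats compute capability major version 10 or higher as Blackwell or newer, so verify the value for your GPU and driver. The helper names are hypothetical.

```python
import subprocess

# Assumption: Blackwell GPUs report a compute capability major version >= 10.
MIN_BLACKWELL_MAJOR = 10


def parse_gpus(csv_text):
    """Parse 'name, compute_cap' lines into (name, (major, minor)) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, cap = (field.strip() for field in line.rsplit(",", 1))
        major, minor = (int(part) for part in cap.split("."))
        gpus.append((name, (major, minor)))
    return gpus


def looks_like_blackwell_or_newer(capability):
    """Apply the assumed compute-capability threshold."""
    return capability[0] >= MIN_BLACKWELL_MAJOR


if __name__ == "__main__":
    try:
        output = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
        for name, cap in parse_gpus(output):
            verdict = "OK" if looks_like_blackwell_or_newer(cap) else "not supported"
            print(f"{name} (sm_{cap[0]}{cap[1]}): {verdict}")
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError) as err:
        print("Could not query GPUs via nvidia-smi:", err)
```

Older drivers may not support the `compute_cap` query field; in that case the script reports the failure instead of guessing.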
Dependencies
- Prepare T5 / CLIP / VAE components yourself (same as Wan2.x structure)
Performance Tips
- Use Blackwell + NVFP4 for best performance
- Enable CPU offload for GPUs with limited memory
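To judge whether CPU offload is needed, a rough weight-size estimate helps. The sketch below is back-of-the-envelope arithmetic, not a measurement. It assumes nominal parameter counts of 14B and 1.3B taken from the model names, an NVFP4 cost of 4 bits per value plus one 8-bit scale per 16 values, and that every weight is quantized (in practice some layers usually stay in higher precision). Activations, the text encoder, and the VAE are not counted.

```python
def weight_gigabytes(num_params, bits_per_weight):
    """Approximate weight storage in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


NVFP4_BITS = 4 + 8 / 16  # 4-bit value plus an 8-bit scale shared by 16 values
BF16_BITS = 16

# Nominal parameter counts inferred from the model names (assumption).
MODELS = (("I2V-14B", 14e9), ("T2V-1.3B", 1.3e9))

if __name__ == "__main__":
    for label, params in MODELS:
        fp4 = weight_gigabytes(params, NVFP4_BITS)
        bf16 = weight_gigabytes(params, BF16_BITS)
        print(f"{label}: NVFP4 ~{fp4:.2f} GB vs BF16 ~{bf16:.2f} GB")
```

Under these assumptions the 14B transformer weights shrink from about 28 GB in BF16 to roughly 7.9 GB, which is why offload matters mostly on cards with less memory than the RTX 5090.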
Community
- Issues: GitHub Issues
- Models: HuggingFace Hub
- Documentation: LightX2V Docs
If you find this project helpful, please give us a star on GitHub.
Model tree for lightx2v/Wan-NVFP4
Base model: Wan-AI/Wan2.1-I2V-14B-480P