# ERNIE-Image-INT8
ERNIE-Image-INT8 is a publishable INT8 derivative of Baidu/ERNIE-Image, prepared for local deployment, packaging, and reproducible benchmarking. The default release profile prioritizes INT8 quantization of the transformer, while text_encoder and pe may remain in bfloat16 when quality checks show that full INT8 introduces unacceptable degradation.
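For intuition, the sketch below shows the general symmetric per-tensor INT8 weight round-trip that quanto-style quantization is built on. This is not the quanto implementation (which uses per-channel scales and optimized kernels); it only illustrates why INT8 weights are lossy and why the error is bounded by the scale.

```python
# Minimal sketch of symmetric per-tensor INT8 quantization. Illustrative
# only; the actual quanto backend differs in scale granularity and kernels.

def quantize_int8(weights):
    """Map float weights to int8 values plus a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.27]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Per-weight round-trip error is bounded by scale / 2.
```

Components whose weights sit near zero relative to their outliers lose the most precision, which is why quality checks gate which components stay in bfloat16.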
## What Is Included
- Diffusers-compatible model folder layout.
- Component-wise precision manifest and quantization metadata.
## Precision Matrix
| Component | Backend | Precision | Enabled |
|---|---|---|---|
| transformer | quanto | int8 | True |
| text_encoder | none | bfloat16 | False |
| pe | none | bfloat16 | False |
## Recommended Runtime
- NVIDIA GPU with 24 GB+ VRAM for practical generation.
- CPU is supported only for loading validation, metadata inspection, and smoke tests.
- Recommended image sizes follow the original ERNIE-Image guidance: 1024x1024, 848x1264, 1264x848, 1200x896.
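When a caller requests an arbitrary size, one practical policy is to snap to the recommended resolution closest in aspect ratio. The helper below is a hypothetical convenience, not part of the release:

```python
# Hypothetical helper: snap a requested size to the nearest recommended
# ERNIE-Image resolution by aspect ratio. Not shipped with the release.
RECOMMENDED_SIZES = [(1024, 1024), (848, 1264), (1264, 848), (1200, 896)]

def nearest_recommended(width, height):
    """Pick the recommended (width, height) closest in aspect ratio."""
    target = width / height
    return min(RECOMMENDED_SIZES, key=lambda wh: abs(wh[0] / wh[1] - target))

# e.g. a 832x1216 request maps to the portrait 848x1264 preset.
```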
## Quick Start
```python
import torch
from diffusers import ErnieImagePipeline

# Load the INT8 release; non-quantized components run in bfloat16.
pipe = ErnieImagePipeline.from_pretrained(
    "ixim/ERNIE-Image-INT8",
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    prompt="A premium event poster with readable bilingual typography and strong layout hierarchy.",
    width=848,
    height=1264,
    num_inference_steps=50,
    guidance_scale=4.0,
    use_pe=True,
).images[0]
image.save("output.png")
```
## Benchmark Snapshot
Benchmark context:

- 7 prompts, seed=42: zh_portrait_studio_east_asian, zh_poster_dense_text, zh_infographic_wide, zh_browser_ui_article, en_storyboard_dialogue, zh_sticker_grid, en_backlit_street_photo.
- Primary comparison: transformer-int8 + pe-bf16 + use_pe=true, transformer-int8 + pe-int8 + use_pe=true, and transformer-int8 + use_pe=false; variant-specific steps, guidance_scale, and use_pe settings are listed in the tables below.
- Supplementary reference: ERNIE-Image-Turbo Reference.
- The pe-int8 row is a runtime-quantized benchmark variant used for comparison only; it does not change the packaged release precision matrix shown above.
- Peak VRAM reports the peak reserved CUDA memory of the current PyTorch process during each generation call.
| Group | Variant | Prompt Count | Avg Latency (ms) | Avg Peak VRAM (MiB) | Steps | CFG | Use PE |
|---|---|---|---|---|---|---|---|
| primary | transformer-int8 + pe-bf16 + use_pe=true | 7 | 78053 | 28516 | 50 | 4.0 | True |
| primary | transformer-int8 + pe-int8 + use_pe=true | 7 | 81412 | 28721 | 50 | 4.0 | True |
| primary | transformer-int8 + use_pe=false | 7 | 60287 | 28339 | 50 | 4.0 | False |
| supplementary | ERNIE-Image-Turbo Reference | 7 | 32535 | 35255 | 8 | 1.0 | True |
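Each table row averages per-prompt measurements over the 7-prompt suite. The sketch below shows that aggregation; the per-prompt numbers are made up for illustration, and in the real harness peak VRAM would come from a CUDA memory query rather than literals.

```python
# Aggregate per-prompt benchmark records into the per-variant averages
# reported above. Records are (latency_ms, peak_vram_mib) tuples; the
# values here are illustrative, not the actual measurements.

def summarize(records):
    """Average latency and peak VRAM across a variant's prompt runs."""
    n = len(records)
    return {
        "prompt_count": n,
        "avg_latency_ms": round(sum(r[0] for r in records) / n),
        "avg_peak_vram_mib": round(sum(r[1] for r in records) / n),
    }
```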
## Prompt-by-Prompt Comparison
- zh_portrait_studio_east_asian
- zh_poster_dense_text
- zh_infographic_wide
- zh_browser_ui_article
- en_storyboard_dialogue
- zh_sticker_grid
- en_backlit_street_photo
## Example Prompt Set
See `example_prompts.json` for the curated prompt suite used during packaging and regression checks. When `scripts/build_release.py` is given a benchmark folder via `--examples-dir`, the prompt-grouped benchmark tables above also render preview images from those outputs automatically.
## Intended Use
- Local image generation tools and controlled packaging workflows.
- Quantization research on large open-weight text-to-image models.
- Internal demo services where image history, prompt reproducibility, and artifact packaging matter.
## Limitations
- Full CPU generation is not a practical primary target for this release.
- Text rendering, dense layouts, and long structured prompts should always be rechecked after quantization.
- Experimental all-INT8 variants can degrade typography, object counting, and layout adherence.
## License
This release inherits the Apache-2.0 terms of the base model. Review the included LICENSE and make sure your downstream usage also complies with the original ERNIE-Image terms and any third-party dependencies you add around it.