Qwen-Image

Image generation model family. Qwen-Image (20B MMDiT) excels at text rendering in Chinese and English. Qwen-Image-2.0 (7B) unifies generation and editing.

GitHub HuggingFace

Outputs 2

Qwen-Image

model

20B-parameter image generation foundation model built on the Multimodal Diffusion Transformer (MMDiT) architecture. Exceptional text rendering accuracy for complex logographic scripts such as Chinese.

GitHub HuggingFace

Parameters 20B

Qwen-Image-2.0

model 2026-02-10

Next-gen 7B image model unifying text-to-image generation and image editing. Native 2K resolution. Scores 88.32 on DPG-Bench, outperforming FLUX.1 (12B).

Blog Post

Parameters 7B

generation open-weight