PRISMLAYERS: Open Data for High-Quality Multi-Layer Transparent Image Generative Models

Microsoft Research Asia, The University of Electro-Communications, Tsinghua University, Peking University

Abstract

Generating high-quality, multi-layer transparent images from text prompts can unlock a new level of creative control, allowing users to edit each layer as effortlessly as editing text outputs from LLMs. However, the development of multi-layer generative models lags behind that of conventional text-to-image models due to the absence of a large, high-quality corpus of multi-layer transparent data. In this paper, we address this fundamental challenge by:

(i) releasing the first open, ultra-high-fidelity PrismLayers (PrismLayersPro) dataset of 200K (20K) multi-layer transparent images with accurate alpha mattes,

(ii) introducing a training-free synthesis pipeline that generates such data on demand using off-the-shelf diffusion models,

(iii) delivering a strong, open-source multi-layer generation model, ART+, which matches the aesthetics of modern text-to-image generation models.

The key technical contributions include LayerFLUX, which excels at generating high-quality single transparent layers with accurate alpha mattes, and MultiLayerFLUX, which composes multiple LayerFLUX outputs into complete images, guided by a human-annotated semantic layout. To ensure higher quality, we apply a rigorous filtering stage to remove artifacts and semantic mismatches, followed by human selection. Fine-tuning the state-of-the-art ART model on our synthetic PrismLayersPro yields ART+, which outperforms the original ART in 60% of head-to-head user study comparisons and even matches the visual quality of images generated by the FLUX.1-[dev] model. We anticipate that our work will establish a solid dataset foundation for the multi-layer transparent image generation task, enabling research and applications that require precise, editable, and visually compelling layered imagery.

Transparent Image Quality Assessment

Our MultiLayerFLUX composes multiple LayerFLUX-generated transparent layers according to a semantic layout—either extracted from a reference or produced by an LLM—ensuring precise spatial control and preserving each layer’s visual fidelity. We fine-tune ART with our synthesized data to improve its layer-wise aesthetic quality and overall visual appeal.
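The composition step described above can be illustrated with standard back-to-front alpha ("over") compositing. The sketch below is only an illustration under stated assumptions, not the authors' MultiLayerFLUX implementation: the function name `composite_layers`, the layout format (a top-left pixel offset per layer), and the use of float RGBA arrays in [0, 1] are all hypothetical choices made for the example, and each layer is assumed to fit entirely inside the canvas.

```python
import numpy as np


def composite_layers(canvas_size, layers):
    """Alpha-composite RGBA layers, back to front, onto a transparent canvas.

    canvas_size: (height, width) of the output image.
    layers: list of (rgba, (top, left)) pairs, ordered back to front.
            rgba is a float array of shape (h, w, 4) with values in [0, 1]
            and straight (non-premultiplied) alpha.
    Assumption: every layer lies fully inside the canvas.
    """
    height, width = canvas_size
    out = np.zeros((height, width, 4), dtype=np.float64)

    for rgba, (top, left) in layers:
        h, w = rgba.shape[:2]
        region = out[top:top + h, left:left + w]  # view into the canvas

        src_rgb, src_a = rgba[..., :3], rgba[..., 3:4]
        dst_rgb, dst_a = region[..., :3], region[..., 3:4]

        # "Over" operator: the new layer covers what is already there.
        out_a = src_a + dst_a * (1.0 - src_a)
        blended = src_rgb * src_a + dst_rgb * dst_a * (1.0 - src_a)

        # Convert back to straight alpha, avoiding division by zero
        # where the result is fully transparent.
        safe_a = np.where(out_a > 0, out_a, 1.0)
        region[..., :3] = blended / safe_a
        region[..., 3:4] = out_a

    return out


# Usage: an opaque red background with a half-transparent blue square on top.
background = np.zeros((4, 4, 4))
background[..., 0] = 1.0  # red channel
background[..., 3] = 1.0  # fully opaque

square = np.zeros((2, 2, 4))
square[..., 2] = 1.0  # blue channel
square[..., 3] = 0.5  # 50% alpha

image = composite_layers((4, 4), [(background, (0, 0)), (square, (1, 1))])
```

Because each layer keeps its own alpha matte and position, a layer can be moved, restyled, or removed and the image recomposed without regenerating the others, which is the editing property the layered representation is meant to provide.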

BibTeX

@article{chen2025prismlayersopendatahighquality,
  title={PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models},
  author={Junwen Chen and Heyang Jiang and Yanbin Wang and Keming Wu and Ji Li and Chao Zhang and Keiji Yanai and Dong Chen and Yuhui Yuan},
  journal={arXiv preprint arXiv:2505.22523},
  year={2025},
}

Multi-layer transparent images with different styles generated by our data engine

Method Overview

Qualitative Results

Qualitative Results of LayerFLUX

Qualitative Results of ART+

Qualitative comparison results between FLUX.1-[dev] (1st row), MultiLayerFLUX (2nd row), ART (3rd row), and ART+ (4th row)

Quantitative Results

Transparent Image Quality Assessment

Acknowledgements