PRISMLAYERS: Open Data for High-Quality Multi-Layer Transparent Image Generative Models

Junwen Chen1    Heyang Jiang1    Keming Wu1    Yanbin Wang1    Ji Li    Chao Zhang    Keiji Yanai    Dong Chen    Yuhui Yuan2
1equal technical contribution    2project lead   
Microsoft Research Asia         The University of Electro-Communications         Tsinghua University         Peking University        

Multi-layer transparent images with different styles generated by our data engine


Abstract

Generating high-quality, multi-layer transparent images from text prompts can unlock a new level of creative control, allowing users to edit each layer as effortlessly as editing text outputs from LLMs. However, the development of multi-layer generative models lags behind that of conventional text-to-image models due to the absence of a large, high-quality corpus of multi-layer transparent data. In this paper, we address this fundamental challenge by:

(i) releasing PrismLayers, the first open, ultra-high-fidelity dataset of 200K multi-layer transparent images with accurate alpha mattes, along with its human-curated 20K subset, PrismLayersPro,

(ii) introducing a training-free synthesis pipeline that generates such data on demand using off-the-shelf diffusion models,

(iii) delivering a strong, open-source multi-layer generation model, ART+, which matches the aesthetics of modern text-to-image generation models.

The key technical contributions include: LayerFLUX, which excels at generating high-quality single transparent layers with accurate alpha mattes, and MultiLayerFLUX, which composes multiple LayerFLUX outputs into complete images, guided by human-annotated semantic layout. To ensure higher quality, we apply a rigorous filtering stage to remove artifacts and semantic mismatches, followed by human selection. Fine-tuning the state-of-the-art ART model on our synthetic PrismLayersPro yields ART+, which outperforms the original ART in 60% of head-to-head user study comparisons and even matches the visual quality of images generated by the FLUX.1-[dev] model. We anticipate that our work will establish a solid dataset foundation for the multi-layer transparent image generation task, enabling research and applications that require precise, editable, and visually compelling layered imagery.
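The filtering stage described above can be pictured as a simple per-layer gate. The function below is only a toy illustration: the threshold values are made up, and a precomputed image-text similarity score stands in for the paper's actual artifact and semantic-mismatch checks.

```python
def keep_layer(alpha, sim, min_cov=0.05, max_cov=0.95, min_sim=0.25):
    """Toy filter in the spirit of the paper's filtering stage.

    `alpha` is a flat list of matte values in [0, 1]; `sim` is a
    precomputed image-text similarity (e.g. from a CLIP-like model).
    All names and thresholds here are illustrative, not the paper's
    actual criteria.
    """
    # fraction of pixels the matte considers foreground
    cov = sum(1 for a in alpha if a > 0.5) / len(alpha)
    # drop near-empty or fully opaque mattes, and semantic mismatches
    return min_cov <= cov <= max_cov and sim >= min_sim
```

A layer that passes such a gate would then go on to the human-selection step mentioned above.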



Method Overview

Dataset Curation Pipeline of PrismLayers and PrismLayersPro


Key dataset statistics for PrismLayers and PrismLayersPro



LayerFLUX and MultiLayerFLUX Framework



Qualitative Results

Qualitative Results of LayerFLUX



Qualitative Results of ART+



Qualitative comparison results between FLUX.1-[dev] (1st row), MultiLayerFLUX (2nd row), ART (3rd row), and ART+ (4th row)


Quantitative Results

Transparent Image Quality Assessment



We develop a dedicated Transparent Image Preference Scoring (TIPS) model fine-tuned on a large win–lose dataset to reliably evaluate and rank the aesthetic quality of generated transparent layers and composites.
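Win-lose supervision of this kind is typically turned into a pairwise ranking objective. The sketch below assumes a Bradley-Terry style loss on scalar scores; the paper does not spell out the exact TIPS objective, so the formulation and names here are ours.

```python
import math

def pairwise_preference_loss(score_win: float, score_lose: float) -> float:
    """Bradley-Terry style loss: -log sigmoid(s_win - s_lose).

    The scores would come from a scoring head on top of an image
    encoder; only the objective is sketched here, not the TIPS
    architecture itself.
    """
    margin = score_win - score_lose
    # numerically stable form of -log(sigmoid(margin))
    return math.log1p(math.exp(-margin))
```

The loss is log 2 when the two scores tie and shrinks as the preferred image's score pulls ahead, which is what pushes the model to rank winners above losers.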



Our MultiLayerFLUX composes multiple LayerFLUX-generated transparent layers according to a semantic layout, either extracted from a reference image or produced by an LLM, ensuring precise spatial control and preserving each layer's visual fidelity. We fine-tune ART on our synthesized data to improve its layer-wise aesthetic quality and overall visual appeal.
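At its core, flattening a stack of transparent layers into a preview image is repeated Porter-Duff "over" blending. The dependency-free sketch below illustrates the operation on straight (non-premultiplied) RGBA values; the pixel-dict data structure and function names are ours, not the released pipeline's.

```python
def over(fg, bg):
    """Porter-Duff 'over' for straight-alpha RGBA pixels, channels in [0, 1]."""
    fa, ba = fg[3], bg[3]
    out_a = fa + ba * (1.0 - fa)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)
    rgb = tuple((fg[i] * fa + bg[i] * ba * (1.0 - fa)) / out_a for i in range(3))
    return (*rgb, out_a)

def compose(layers):
    """Flatten layers back-to-front; each layer maps (x, y) -> RGBA.

    Later layers occlude earlier ones wherever their alpha is non-zero.
    """
    canvas = {}
    for layer in layers:
        for xy, px in layer.items():
            canvas[xy] = over(px, canvas.get(xy, (0.0, 0.0, 0.0, 0.0)))
    return canvas
```

For example, a half-transparent white pixel over an opaque black one flattens to mid-gray, which is the behavior an editable layer stack relies on.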


Editable Demo



BibTeX


        
        @article{chen2025prismlayersopendatahighquality,
          title={PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models}, 
          author={Junwen Chen and Heyang Jiang and Yanbin Wang and Keming Wu and Ji Li and Chao Zhang and Keiji Yanai and Dong Chen and Yuhui Yuan},
          journal={arXiv preprint arXiv:2505.22523},
          year={2025},
        }
      

Acknowledgements

Website adapted from the following template