DUAW: Data-free Universal Adversarial Watermark against Stable Diffusion Customization

Authors: Xiaoyu Ye, Hao Huang, Jiaqi An, Yongtao Wang

Abstract: Stable Diffusion (SD) customization approaches enable users to personalize SD model outputs, greatly enhancing the flexibility and diversity of AI art. However, they also allow individuals to plagiarize specific styles or subjects from copyrighted images, which raises significant concerns about potential copyright infringement. To address this issue, we propose an invisible data-free universal adversarial watermark (DUAW), aiming to protect a myriad of copyrighted images from different customization approaches across various versions of SD models. First, DUAW is designed to disrupt the variational autoencoder during SD customization. Second, DUAW operates in a data-free context, where it is trained on synthetic images produced by a Large Language Model (LLM) and a pretrained SD model. This approach circumvents the necessity of directly handling copyrighted images, thereby preserving their confidentiality. Once crafted, DUAW can be imperceptibly integrated into massive copyrighted images, serving as a protective measure by inducing significant distortions in the images generated by customized SD models. Experimental results demonstrate that DUAW can effectively distort the outputs of fine-tuned SD models, rendering them discernible to both human observers and a simple classifier.

What, Why and How

Here is a summary of the key points from this paper:

What:

  • The paper proposes a data-free universal adversarial watermark (DUAW) to protect copyrighted images from being misused by customized Stable Diffusion (SD) models.

Why:

  • SD customization methods like DreamBooth and LoRA allow personalizing SD models to generate images matching specific subjects or styles. This raises concerns about potential copyright infringement.

  • Existing protection methods require access to the copyrighted images themselves, which may be infeasible when those images are confidential.

How:

  • The DUAW is designed to disrupt the variational autoencoder (VAE) of SD models, which remains unchanged during customization.

  • It is trained on synthetic images from an LLM and SD model, avoiding direct use of copyrighted data.

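  • During training, the watermark is optimized to minimize the MS-SSIM (multi-scale structural similarity) between each original image and the VAE reconstruction of its watermarked version, so that the reconstruction comes out visibly distorted.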

  • Once optimized, the imperceptible DUAW can be added to copyrighted images.

  • When watermarked images are used to customize an SD model, the images that model generates show visible distortions.

In summary, this paper proposes a data-free universal watermark that can protect copyrighted images by disrupting customized SD models, without needing access to those images.
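
The training objective described above can be written compactly. This is a sketch in notation chosen here, not taken from the paper: E and D denote the frozen VAE encoder and decoder, X_syn the synthetic training set, delta the watermark, and epsilon an assumed L-infinity bound that expresses the imperceptibility requirement.

```latex
\delta^{*} \;=\; \arg\min_{\|\delta\|_{\infty} \le \epsilon}\;
\mathbb{E}_{x \sim \mathcal{X}_{\mathrm{syn}}}
\Big[ \operatorname{MS\text{-}SSIM}\big(x,\; \mathcal{D}(\mathcal{E}(x + \delta))\big) \Big]
```

Because a single delta is shared by every x in the expectation, the minimizer is universal: it is not tailored to any one image.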

Main Contributions

Here are the key contributions of this paper:

  • Proposes the first data-free universal adversarial watermark (DUAW) to protect copyrighted images from customized SD models.

  • Introduces an effective approach to craft the DUAW by disrupting the VAE component of SD models.

  • Presents a data-free solution utilizing an LLM and SD model to generate training data, eliminating direct handling of copyrighted images.

  • Demonstrates DUAW’s ability to cause noticeable distortions in outputs of fine-tuned SD models, validated quantitatively and qualitatively.

  • Shows DUAW’s transferability across diverse images, SD models, and customization techniques like DreamBooth and LoRA.

  • Evaluates DUAW’s robustness against common image perturbations like blur and compression.

  • Provides comprehensive analyses like ablation studies, hyperparameter tests, and comparisons between loss functions.

In summary, the key contributions are: proposing the DUAW framework, the data-free training methodology, and extensively evaluating its effectiveness, transferability, and robustness for protecting copyrighted images.

Method Section

Here is a summary of the method section from the paper:

The overall pipeline has two main stages:

Watermark Generation Stage:

  • Use an LLM (ChatGPT) and SD model to generate diverse synthetic images as training data.

  • Initialize the DUAW watermark randomly.

  • Add DUAW to training images and pass them through the VAE encoder and decoder.

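  • Optimize DUAW by minimizing the MS-SSIM between each original image and the decoded version of its watermarked copy, so that decoding introduces distortions.

  • Update the single shared watermark with a gradient-based optimizer over mini-batches of the synthetic images, so that it protects images in general rather than one image in particular.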
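
The data-generation step can be illustrated with a small sketch of how text prompts for the SD model might be assembled. It follows the "prompt in style" pattern used in the detailed pseudo code later in this summary; the subject and style strings are hypothetical examples, and in the paper the subject prompts come from an LLM rather than a hard-coded list.

```python
from itertools import product


def build_prompts(subjects, styles):
    """Pair every subject prompt with every style, as 'subject in style'."""
    return [f"{subject} in {style}" for subject, style in product(subjects, styles)]


# Hypothetical examples; the real subject prompts would be produced by an LLM.
subjects = ["a lighthouse on a cliff", "a bowl of fruit"]
styles = ["watercolor style", "pixel art style"]

prompts = build_prompts(subjects, styles)
for prompt in prompts:
    print(prompt)
```

Each resulting string would then be passed to a pretrained SD model to render one synthetic training image.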

Protection Stage:

  • Apply optimized imperceptible DUAW to images requiring copyright protection.

  • Use DUAW-watermarked images to customize an SD model via DreamBooth or LoRA fine-tuning.

  • DUAW disrupts the VAE during customization, causing visible distortions in generated images.

  • A simple classifier can detect the distorted outputs, which indicates that the protection succeeded.

In summary, the key aspects are: 1) data-free watermark crafting by perturbing the VAE input and minimizing the MS-SSIM loss; 2) universal protection by optimizing over diverse synthetic images; 3) deploying DUAW to safeguard real copyrighted images.
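
To make the idea of one perturbation optimized across many images concrete, here is a toy NumPy sketch. It is not the paper's implementation: a fixed random linear encoder/decoder pair stands in for the frozen VAE, mean squared error stands in for MS-SSIM (so the watermark maximizes error instead of minimizing similarity), and signed-gradient steps with clipping stand in for the optimizer. All names and values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

D, K, N = 64, 16, 32   # pixels per image, latent size, number of images
EPSILON = 0.05         # assumed L-infinity budget for the watermark
STEP = 0.005           # size of each signed-gradient step
STEPS = 50

# Stand-in for the frozen VAE: an untrained linear encoder and decoder.
# It only needs to be a fixed, differentiable map for this illustration.
W_enc = rng.standard_normal((D, K)) / np.sqrt(D)
W_dec = rng.standard_normal((K, D)) / np.sqrt(K)
A = W_enc @ W_dec      # decode(encode(x)) == x @ A

images = rng.uniform(0.0, 1.0, size=(N, D))   # synthetic "training images"


def distortion(delta):
    """Mean squared error between clean images and decoded watermarked images."""
    decoded = (images + delta) @ A
    return np.mean((decoded - images) ** 2)


def distortion_grad(delta):
    """Analytic gradient of distortion() with respect to the shared delta."""
    residual = (images + delta) @ A - images   # shape (N, D)
    return 2.0 * residual.sum(axis=0) @ A.T / (N * D)


delta = np.zeros(D)    # one watermark shared by all images
baseline = distortion(delta)
for _ in range(STEPS):
    delta = delta + STEP * np.sign(distortion_grad(delta))  # ascend on distortion
    delta = np.clip(delta, -EPSILON, EPSILON)               # stay within the budget

print(f"distortion without watermark: {baseline:.5f}")
print(f"distortion with watermark:    {distortion(delta):.5f}")
```

The point of the sketch is structural: the model weights never change, only `delta` does, and the same `delta` is added to every image in the batch.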

High-Level Pseudo Code

Here is the high-level pseudo code for the key steps in the paper:

# Watermark Generation Stage
 
# Generate training images 
images = generate_images_from_LLM_and_SD()  
 
# Initialize watermark randomly
duaw = initialize_watermark()   
 
# Optimize watermark by minimizing MS-SSIM loss
for i in range(epochs):
  # Add the current watermark to the clean images
  watermarked_images = images + duaw
 
  # Pass watermarked images through the frozen VAE
  encoded = VAE_encoder(watermarked_images)
  decoded = VAE_decoder(encoded)
 
  # Lower MS-SSIM between clean and decoded images means stronger distortion
  loss = MS_SSIM(images, decoded)
  duaw = optimize(duaw, loss)
 
# Protection Stage
 
# Apply optimized DUAW to the real copyrighted images (not the synthetic training set)
protected_images = copyrighted_images + duaw
 
# Customize SD model on watermarked images
customized_SD = fine_tune_SD(protected_images)
 
# Generate images and check for distortions
generated = customized_SD(prompt)
 
if has_distortions(generated):
  print("Copyright Protected")

In summary, the key steps are:

  1. Generate synthetic training images
  2. Craft DUAW by optimizing to distort VAE outputs
  3. Apply DUAW to real images needing protection
  4. Fine-tune SD model on DUAW-watermarked images
  5. Check if generated images have distortions indicating protection
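
Step 3 is shown in the pseudo code as a plain addition. In practice the result usually also has to remain a valid image. The following is a minimal sketch under two assumptions that the summary does not state: pixel values lie in [0, 1], and the watermark is bounded by an L-infinity budget (0.05 here, matching the EPSILON value in the detailed pseudo code). The function name and the random stand-in arrays are hypothetical.

```python
import numpy as np


def apply_watermark(image, watermark, epsilon=0.05):
    """Add a bounded watermark to an image whose pixel values lie in [0, 1]."""
    bounded = np.clip(watermark, -epsilon, epsilon)  # enforce the perturbation budget
    return np.clip(image + bounded, 0.0, 1.0)        # keep a valid pixel range


rng = np.random.default_rng(0)
image = rng.uniform(0.0, 1.0, size=(8, 8, 3))        # stand-in for a copyrighted image
watermark = rng.uniform(-0.1, 0.1, size=(8, 8, 3))   # stand-in for an optimized DUAW

protected = apply_watermark(image, watermark)
print("max pixel change:", float(np.abs(protected - image).max()))
```

Since the same watermark array can be reused for every image of matching size, protecting a large collection needs no per-image optimization.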

Detailed Pseudo Code

Here is a more detailed pseudo code implementation of the key steps in the paper:

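# Imports (LLM, SD_model, VAE, and losses are placeholder modules, not real packages)
import torch
import LLM, SD_model, VAE
from torch.optim import Adam
from losses import MS_SSIM  # placeholder for an MS-SSIM implementation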
 
# Hyperparameters
EPOCHS = 1000 
LR = 0.01
BATCH_SIZE = 4
EPSILON = 0.05
 
# Watermark Generation
 
def generate_images(num_images):
 
  prompts = LLM.generate_prompts(num_images) 
  styles = get_styles_from_zoo()
 
  images = []
  for prompt in prompts:
    for style in styles:
      text = prompt + " in " + style  
      images.append(SD_model.generate(text))
  
  return images
 
def initialize_duaw(height, width):
  return torch.rand(height, width) * EPSILON
 
def add_duaw(images, duaw):
  return images + duaw
 
def VAE_forward(images):
  encoded = VAE.encoder(images)
  decoded = VAE.decoder(encoded)
  return decoded
 
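def optimize_duaw(images, duaw):
  
  # Optimize the watermark itself; the VAE weights stay frozen
  duaw = duaw.detach().clone().requires_grad_(True)
  opt = Adam([duaw], lr=LR)
 
  for epoch in range(EPOCHS):
    
    shuffled_images = shuffle(images)
    batches = split_into_batches(shuffled_images, BATCH_SIZE)
    
    for batch in batches:
     
      watermarked_batch = add_duaw(batch, duaw)
      decoded_batch = VAE_forward(watermarked_batch)
     
      # Lower MS-SSIM between clean and decoded images means stronger distortion
      loss = MS_SSIM(batch, decoded_batch)
      opt.zero_grad()
      loss.backward()
      opt.step()
 
      # Assumption: clamp to the EPSILON budget so the watermark stays imperceptible
      with torch.no_grad():
        duaw.clamp_(-EPSILON, EPSILON)
 
  return duaw.detach()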
 
images = generate_images(100)
duaw = initialize_duaw(512, 512) 
 
duaw = optimize_duaw(images, duaw)
 
# Protection 
 
# copyright_images: the real images to protect (never used during watermark training)
watermarked_images = add_duaw(copyright_images, duaw)
 
customized_SD = fine_tune(SD_model, watermarked_images)
 
generated = customized_SD(prompt) 
 
if is_distorted(generated):
  print("Protected!")

The key aspects are:

  • Generating synthetic training images via LLM and SD
  • Initializing and adding DUAW watermark
  • Forward pass through VAE
  • Optimizing DUAW by minimizing MS-SSIM loss
  • Applying DUAW to copyrighted images
  • Fine-tuning SD model
  • Checking if generated images are distorted
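
The MS_SSIM loss is left abstract in the pseudo code above. As a rough illustration of what a structural-similarity score measures, here is a simplified single-scale variant computed from whole-image statistics. It is not MS-SSIM itself, which combines windowed comparisons at several resolutions; the constants follow the usual SSIM convention, and the function name and test images are chosen here for illustration.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Single-scale SSIM from whole-image statistics (no sliding window)."""
    c1 = (0.01 * data_range) ** 2   # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure


rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(32, 32))
noisy = np.clip(clean + rng.normal(0.0, 0.2, size=clean.shape), 0.0, 1.0)

print("SSIM(clean, clean):", global_ssim(clean, clean))
print("SSIM(clean, noisy):", global_ssim(clean, noisy))
```

An image compared with itself scores 1, and the score drops as the second image diverges, which is why minimizing such a score between the original and the VAE output pushes the output toward visible distortion.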