Model Inversion Attack via Dynamic Memory Learning
Authors: Gege Qi, YueFeng Chen, Xiaofeng Mao, Binyuan Hui, Xiaodan Li, Rong Zhang, Hui Xue
Abstract: Model Inversion (MI) attacks aim to recover the private training data from the target model, which has raised security concerns about the deployment of DNNs in practice. Recent advances in generative adversarial models have rendered them particularly effective in MI attacks, primarily due to their ability to generate high-fidelity and perceptually realistic images that closely resemble the target data. In this work, we propose a novel Dynamic Memory Model Inversion Attack (DMMIA) to leverage historically learned knowledge, which interacts with samples (during the training) to induce diverse generations. DMMIA constructs two types of prototypes to inject the information about historically learned knowledge: Intra-class Multicentric Representation (IMR) representing target-related concepts by multiple learnable prototypes, and Inter-class Discriminative Representation (IDR) characterizing the memorized samples as learned prototypes to capture more privacy-related information. As a result, our DMMIA has a more informative representation, which brings more diverse and discriminative generated results. Experiments on multiple benchmarks show that DMMIA performs better than state-of-the-art MI attack methods.
What, Why and How
Here is a summary of the key points from the paper:
What:
- The paper proposes a new model inversion attack called Dynamic Memory Model Inversion Attack (DMMIA).
- It uses two types of prototypes to improve the attack:
  - Intra-class Multicentric Representation (IMR): Represents the target class with multiple learnable prototypes to capture diverse concepts.
  - Inter-class Discriminative Representation (IDR): Stores features of generated images in a memory bank as non-parametric prototypes to distinguish between classes.
Why:
- To address the issue of “catastrophic forgetting” in GAN-based model inversion attacks, where the attack tends to forget previously learned features.
- The prototypes help prevent forgetting and reuse historical knowledge to improve attack success and diversity.
How:
- IMR maximizes the likelihood that generated samples belong to the target class's concepts.
- IDR aligns features to be more similar to their corresponding class prototypes.
- The attack optimizes a combined loss: a cross-entropy term plus the two prototype losses (a minimal sketch follows this list).
- Experiments show DMMIA outperforms prior attacks, achieving higher success rates and diversity on facial image datasets.
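The combined objective can be sketched as follows; this is a minimal illustration assuming a PyTorch-style setup, where `loss_imr` and `loss_idr` are hypothetical stand-ins for the paper's prototype losses and the default weights are placeholders.

```python
import torch
import torch.nn.functional as F

def dmmia_loss(logits, target_class, loss_imr, loss_idr, lambda_1=1.0, lambda_2=1.0):
    """Combine the cross-entropy attack term with the two prototype losses.

    logits: target-model outputs for a batch of generated images.
    target_class: the identity being inverted.
    loss_imr, loss_idr: precomputed IMR and IDR terms (hypothetical helpers).
    lambda_1, lambda_2: loss weights (placeholder values).
    """
    labels = torch.full((logits.size(0),), target_class, dtype=torch.long)
    loss_ce = F.cross_entropy(logits, labels)
    return loss_ce + lambda_1 * loss_imr + lambda_2 * loss_idr
```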
In summary, the key contribution is using dynamic memory prototypes to improve model inversion attacks by reusing prior knowledge and preventing catastrophic forgetting. The intra-class and inter-class prototypes help capture target concepts and discriminate between classes, respectively.
Main Contributions
Here are the main contributions of this paper:
- Proposes a new model inversion attack called Dynamic Memory Model Inversion Attack (DMMIA) that uses memory prototypes to improve attack performance.
- Introduces two types of prototypes:
  - Intra-class Multicentric Representation (IMR) to represent target class concepts
  - Inter-class Discriminative Representation (IDR) to distinguish between classes
- Provides theoretical analysis of how the prototypes increase the diversity of generated samples.
- Achieves state-of-the-art attack success rates on facial image datasets, outperforming prior methods.
- Demonstrates improved sample diversity and quality compared to baseline attacks without prototypes.
- Validates the effectiveness of DMMIA through extensive experiments on facial datasets as well as natural images, attacking various model architectures.
- Analyzes the impact of different prototype configurations through ablation studies.
- Shows minimal extra computational cost for training with the prototypes and no added inference cost.
In summary, the main contributions are:
- The DMMIA attack method using memory prototypes
- Intra-class and inter-class prototype formulations
- Theoretical analysis of diversity
- Superior attack performance demonstrated through experiments
- Model-agnostic approach with minimal computational overhead
Method Section
Here is a summary of the method section from the paper:
The key components of the proposed Dynamic Memory Model Inversion Attack (DMMIA) are:
- Uses a pre-trained StyleGAN generator, trained on an auxiliary dataset, as the generative model.
- The mapping network converts the input latent code z to an intermediate latent code.
- The synthesis network then generates the image from the intermediate latent code.
- Optimization is done on the mapping network parameters to invert the target model (a sketch of this setup follows the list).
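A minimal sketch of this setup, with tiny nn.Module stand-ins in place of the pre-trained StyleGAN components and the target model (all module definitions here are illustrative assumptions, not the paper's code); only the mapping network's parameters are left trainable.

```python
import torch
import torch.nn as nn

# Stand-ins for the pre-trained StyleGAN mapping/synthesis networks and the target
# model; in the actual attack these would be loaded from pre-trained checkpoints.
mapping = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
synthesis = nn.Sequential(nn.Linear(512, 3 * 64 * 64), nn.Unflatten(1, (3, 64, 64)))
target_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 1000))

# Freeze everything except the mapping network: only its parameters are optimized.
for p in list(synthesis.parameters()) + list(target_model.parameters()):
    p.requires_grad_(False)

z = torch.randn(8, 512)        # input latent codes
w = mapping(z)                 # intermediate latent codes
x_hat = synthesis(w)           # generated images
logits = target_model(x_hat)   # target-model predictions used by the attack losses
```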
Intra-class Multicentric Representation (IMR):
- Learns multiple positive and negative prototypes to represent concepts of the target class.
- Maximizes the likelihood of generated samples belonging to the target concepts (a hedged sketch of such a loss follows).
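A hedged sketch of what such a multi-prototype likelihood could look like: a softmax over cosine similarities between an image feature and the learnable prototypes, maximizing the probability mass placed on the target class's positive prototypes. The exact formulation in the paper may differ; the temperature and prototype counts here are assumptions.

```python
import torch
import torch.nn.functional as F

def imr_loss(feat, pos_protos, neg_protos, tau=0.1):
    """Sketch of an IMR-style loss: maximize the probability mass that a generated
    image's feature places on the target class's learnable positive prototypes.

    feat:       (B, D) features of generated images.
    pos_protos: (P, D) learnable prototypes for the target class.
    neg_protos: (Q, D) learnable prototypes for non-target concepts.
    tau:        softmax temperature (value is an assumption).
    """
    feat = F.normalize(feat, dim=-1)
    protos = F.normalize(torch.cat([pos_protos, neg_protos]), dim=-1)
    probs = (feat @ protos.t() / tau).softmax(dim=-1)      # (B, P+Q)
    p_target = probs[:, :pos_protos.size(0)].sum(dim=-1)   # mass on positive prototypes
    return -torch.log(p_target + 1e-8).mean()              # negative log-likelihood
```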
Inter-class Discriminative Representation (IDR):
- Maintains a memory bank of embedding features of generated images.
- Aligns features to be more similar to their corresponding class prototypes.
- Uses a momentum update to refresh the memory bank smoothly (a hedged sketch follows).
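A hedged sketch of an IDR-style memory bank: per-class non-parametric prototypes refreshed by a momentum blend with new features, plus an alignment loss that pulls each feature toward its own class prototype and away from the others. The momentum value, temperature, and normalization choices are assumptions.

```python
import torch
import torch.nn.functional as F

class IDRMemory:
    """Sketch of an IDR-style memory bank: one non-parametric prototype per class,
    refreshed by a momentum blend with features of newly generated images."""

    def __init__(self, num_classes, dim, momentum=0.9):
        self.protos = F.normalize(torch.randn(num_classes, dim), dim=-1)
        self.momentum = momentum  # momentum coefficient r (value is an assumption)

    @torch.no_grad()
    def update(self, feats, labels):
        # Momentum update: blend each class prototype with that class's mean feature.
        for c in labels.unique():
            mean_feat = feats[labels == c].mean(dim=0)
            blended = self.momentum * self.protos[c] + (1 - self.momentum) * mean_feat
            self.protos[c] = F.normalize(blended, dim=0)

    def idr_loss(self, feats, labels, tau=0.1):
        # Align each feature with its own class prototype against the other classes.
        logits = F.normalize(feats, dim=-1) @ self.protos.t() / tau
        return F.cross_entropy(logits, labels)
```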
Training:
- Combines the cross-entropy loss, IMR loss, and IDR loss.
- Alternately optimizes the mapping network parameters and the IMR prototypes.
- Samples latent vectors and selects the top candidates based on the target model's prediction scores (a sketch of this selection step follows the list).
- Progressively updating the prototypes helps prevent catastrophic forgetting.
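A minimal sketch of the candidate-selection step, reusing the stand-in mapping/synthesis/target modules from the earlier sketch; the number of sampled candidates is a placeholder, while k = 200 follows the detailed pseudo code below.

```python
import torch

@torch.no_grad()
def select_top_latents(mapping, synthesis, target_model, target_class,
                       num_samples=1000, k=200):
    """Sample latent vectors and keep the k whose generated images the target
    model already scores highest for the target class."""
    z = torch.randn(num_samples, 512)              # candidate latent codes
    scores = target_model(synthesis(mapping(z)))   # target-model logits
    probs = scores.softmax(dim=-1)[:, target_class]
    return z[probs.topk(k).indices]                # top-k candidates for optimization
```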
In summary, the key method components are the two types of prototypes (IMR and IDR) that leverage memory to improve the model inversion attack by reusing prior knowledge in an efficient way. The prototypes are optimized along with the mapping network to invert the target model.
High-Level Pseudo Code
Here is the high-level pseudo code for the main method in the paper:
# Input:
#   G: generator (pre-trained StyleGAN synthesis network)
#   F_phi: mapping network with parameters phi (the part that is optimized)
#   f: target model under attack
#   T: number of training epochs
#   eta: learning rate, r: memory momentum coefficient
# Initialize:
#   IMR prototypes W (learnable)
#   IDR memory bank M (empty)

for i in range(T):
    # Sample latent vector
    z ~ N(0, I)

    # Generate image
    x_hat = G(F_phi(z))

    # Total attack loss
    L_DMMIA = L_ce + lambda_1*L_imr + lambda_2*L_idr

    # Update mapping network: gradient step on the total loss
    phi_(i+1) = phi_i - eta * grad_phi(L_DMMIA)

    # Update IMR prototypes: gradient step on the IMR loss
    W_(i+1) = W_i - eta * grad_W(L_imr)

    # Update IDR memory bank: momentum blend with the new features M'_i
    M_(i+1) = r*M_i + (1-r)*M'_i
Where:
- L_DMMIA is the total loss
- L_ce is the cross-entropy loss
- L_imr is the IMR loss
- L_idr is the IDR loss
The key steps are sampling the latent vectors, generating images, updating the mapping network and prototypes based on the losses, and updating the memory bank. The training loop continues for T epochs to optimize the attack.
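A hedged sketch of what one such update step could look like in PyTorch, with separate optimizers for the mapping-network parameters and the IMR prototypes, and the IDR memory refreshed outside the gradient step (see the IDR sketch earlier). The optimizer choice, learning rates, and tensor shapes are assumptions; since only L_imr involves the IMR prototypes, a single backward pass on the total loss also provides their gradient (scaled by lambda_1).

```python
import torch

# Hypothetical handles: parameters of the StyleGAN mapping network and the
# learnable IMR prototypes (shapes and learning rates are placeholders).
mapping_params = [torch.randn(512, 512, requires_grad=True)]
imr_protos = torch.randn(10, 512, requires_grad=True)

opt_phi = torch.optim.Adam(mapping_params, lr=1e-2)   # updates the mapping network
opt_w = torch.optim.Adam([imr_protos], lr=1e-2)       # updates the IMR prototypes

def train_step(loss_dmmia, feats, labels, idr_memory):
    """One DMMIA-style update: backpropagate the total loss once, step both
    parameter groups, then refresh the IDR memory bank without gradients."""
    opt_phi.zero_grad()
    opt_w.zero_grad()
    loss_dmmia.backward()
    opt_phi.step()
    opt_w.step()
    idr_memory.update(feats.detach(), labels)
```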
Detailed Pseudo Code
Here is a more detailed pseudo code to implement the key components of the Dynamic Memory Model Inversion Attack (DMMIA) method proposed in the paper:
# Hyperparameters
T = ...            # number of training epochs
N_z = ...          # number of latent vectors sampled per epoch
N_w = ...          # number of IMR prototypes
rho = ...          # number of positive prototypes
r = ...            # momentum coefficient for the memory bank
lambda_1, lambda_2 = ...  # loss weights
K = ...            # number of classes of the target model
N_d = ...          # feature dimension

# Initialize models
G = Pretrained_StyleGAN(...)   # generator (synthesis network)
F_phi = MappingNetwork(...)    # mapping network (parameters phi are optimized)
f = TargetModel(...)           # target model under attack

# Initialize prototypes
W = RandomEmbeddings(N_w, N_d)              # learnable IMR prototypes
M = {c: ZeroVector(N_d) for c in range(K)}  # IDR memory bank: one prototype per class

for i in range(T):
    # Sample candidate latent vectors
    Z = [SampleZ() for _ in range(N_z)]

    # Keep the candidates whose generated images score highest under the target model
    Z_top = TopK([f(G(F_phi(z))) for z in Z], k=200)

    new_feats = {c: [] for c in range(K)}  # features generated this epoch

    for z in Z_top:
        # Generate image
        x_hat = G(F_phi(z))

        # Class prediction
        y_hat = argmax(f(x_hat))

        # Extract the image feature
        f_x = FeatureExtractor(x_hat)

        # Loss functions
        L_imr = -log(p_imr)   # IMR likelihood of x_hat under the target-class prototypes
        L_idr = -log(p_idr)   # IDR likelihood of f_x under its class prototype in M
        L_DMMIA = L_ce + lambda_1*L_imr + lambda_2*L_idr

        # Gradient step on the mapping network parameters (total loss)
        phi = OptimizerStep(phi, grad_phi(L_DMMIA))

        # Gradient step on the IMR prototypes (IMR loss)
        W = OptimizerStep(W, grad_W(L_imr))

        # Buffer the feature for the IDR memory update
        new_feats[y_hat].append(f_x)

    # Momentum update of the IDR memory bank with the mean of the new features
    for c in range(K):
        if new_feats[c]:
            M[c] = r*M[c] + (1-r)*Mean(new_feats[c])
The key steps are sampling candidate latent vectors, generating images, obtaining the target model's predictions, extracting features, taking gradient steps on the mapping network and the IMR prototypes, and refreshing the IDR memory bank with a momentum update. The training loop runs for T epochs.