Diffusion Sampling with Momentum for Mitigating Divergence Artifacts

Authors: Suttisak Wizadwongsa, Worameth Chinchuthakun, Pramook Khungurn, Amit Raj, Supasorn Suwajanakorn

Abstract: Despite the remarkable success of diffusion models in image generation, slow sampling remains a persistent issue. To accelerate the sampling process, prior studies have reformulated diffusion sampling as an ODE/SDE and introduced higher-order numerical methods. However, these methods often produce divergence artifacts, especially with a low number of sampling steps, which limits the achievable acceleration. In this paper, we investigate the potential causes of these artifacts and suggest that the small stability regions of these methods could be the principal cause. To address this issue, we propose two novel techniques. The first technique involves the incorporation of Heavy Ball (HB) momentum, a wellknown technique for improving optimization, into existing diffusion numerical methods to expand their stability regions. We also prove that the resulting methods have first-order convergence. The second technique, called Generalized Heavy Ball (GHVB), constructs a new high-order method that offers a variable trade-off between accuracy and artifact suppression. Experimental results show that our techniques are highly effective in reducing artifacts and improving image quality, surpassing state-of-the-art diffusion solvers on both pixel-based and latent-based diffusion models for low-step sampling. Our research provides novel insights into the design of numerical methods for future diffusion work.

What, Why and How

Here is a summary of the key points from the paper:

What:

  • The paper investigates divergence artifacts that can occur when using high-order numerical methods to accelerate diffusion model sampling with a low number of steps. These artifacts manifest as unrealistic, oversaturated image regions.

Why:

  • The paper hypothesizes that the artifacts are caused by the small stability regions of the numerical methods, which can lead to divergence when eigenvalues fall outside the stability region. This results in large latent variable magnitudes and artifacts.

How:

  • The paper proposes two techniques to address this issue:

    1. Incorporating Heavy Ball (HB) momentum into existing methods like DPM-Solver++ and PLMS to expand their stability regions, reducing artifacts. However, HB reduces accuracy to 1st order convergence.
    2. A new Generalized Heavy Ball (GHVB) method that offers variable trade-off between accuracy and artifact suppression. GHVB generalizes HB momentum to achieve high-order convergence.
  • Experiments show both techniques effectively reduce artifacts and improve image quality compared to current diffusion solvers, enabling faster sampling. The techniques provide new ways to design numerical methods for diffusion models.

In summary, the paper identifies and explains an artifact problem with high-order diffusion sampling methods, then proposes momentum-based techniques to expand stability regions and mitigate artifacts while maintaining efficiency and accuracy. This enables faster, higher quality diffusion model sampling.

Main Contributions

Here are the main contributions of this paper:

  1. Identifies and analyzes the potential causes of divergence artifacts that can occur when using high-order numerical methods for diffusion model sampling with insufficient steps.

  2. Establishes a connection between the small stability regions of numerical methods and the emergence of divergence artifacts through mathematical analysis.

  3. Proposes incorporating Heavy Ball (HB) momentum into existing numerical diffusion solvers like DPM-Solver++ and PLMS to expand their stability regions and reduce artifacts.

  4. Develops a novel Generalized Heavy Ball (GHVB) numerical method that offers a variable trade-off between accuracy and artifact suppression, while achieving high-order convergence.

  5. Provides extensive experiments on pixel-based and latent-based diffusion models showing the proposed techniques significantly reduce artifacts and improve image quality compared to current state-of-the-art diffusion solvers.

  6. Offers new insights into designing more stable and efficient numerical methods for diffusion model sampling to enable faster high-quality image generation.

  7. Introduces techniques that are easy to implement, require negligible additional computation, and do not need retraining.

In summary, the key contributions include identifying the artifact problem, analyzing the causes, proposing two novel momentum-based techniques to mitigate artifacts, demonstrating their effectiveness through experiments, and providing insights to guide future diffusion solver design.

Method Section

Here is a summary of the methods section from the paper:

The paper proposes two techniques to mitigate divergence artifacts in diffusion model sampling:

  1. Incorporating Heavy Ball (HB) Momentum
  • Applies Polyak’s heavy ball momentum, commonly used in optimization, to existing numerical diffusion solvers like DPM-Solver++ and PLMS.

  • Modifies solvers to calculate an exponential moving average of the noise prediction before the solver update step.

  • Mathematically, this is done by introducing a velocity variable that accumulates the weighted noise over iterations. The position is then updated using the velocity.

  • Adding momentum expands stability regions of solvers, reducing likelihood of divergence.

  • However, analysis shows this reduces accuracy to 1st order convergence.

  1. Generalized Heavy Ball (GHVB)
  • Develops a novel technique to generalize HB momentum to achieve high-order convergence.

  • Derives GHVB by starting from the Adams-Bashforth formulation and incorporating momentum before the summation of coefficients.

  • Allows flexibility in balancing accuracy and artifact suppression through a momentum parameter β.

  • Achieves high-order convergence theoretically and empirically.

  • Enlarges stability region compared to original high-order solvers.

  • Interpolates between methods of different orders.

In summary, the proposed techniques expand stability regions to mitigate artifacts by incorporating forms of momentum into the update steps of existing numerical diffusion solvers. HB provides a simpler approach while GHVB offers high-order accuracy.

High-Level Pseudo Code

Here is the high-level pseudocode for the key methods proposed in the paper:

# Polyak's Heavy Ball Momentum
 
def hb_momentum(x, f, beta):
  v = 0 # initialize velocity
  for i in range(num_steps):
    v = (1 - beta)*v + beta*f(x) 
    x = x + step_size*v
  return x
 
# Apply to existing solver like PLMS
def plms_hb(x, f, beta):
  v = 0
  for i in range(num_steps):
    # Calculate PLMS update
    update = plms_update(x, f)  
    v = (1 - beta)*v + beta*update
    x = x + step_size*v
  return x
# Generalized Heavy Ball (GHVB)
 
def ghvb(x, f, beta, order):
  v = [0]*order # initialize velocity buffer
  for i in range(num_steps):
    # Calculate update with momentum 
    update = 0
    for j in range(order):
      update += ghvb_coeffs[order][j]*v[j] 
    v.pop(0)
    v.append((1 - beta)*v[-1] + beta*f(x))
    
    x = x + step_size*update
  return x 

In summary, the core ideas are to maintain a velocity variable that accumulates weighted noise predictions, then apply it to update position. The momentum is injected before the numerical update in PLMS but after coefficients summation in GHVB. These techniques expand stability and reduce artifacts.

Detailed Pseudo Code

Here is more detailed pseudocode to implement the key methods proposed in the paper:

# Utils
def noise_model(x, t):
  # Returns noise prediction given x and timestep t
  return model(x, t) 
 
def numerical_update(x, f):
  # Returns update step based on numerical method and f  
  if method == 'euler':
    return f(x)
  elif method == 'plms':
    # Implement PLMS update with coefficients
    ...
  else:
    raise ValueError('Invalid method')
 
# Polyak's Heavy Ball Momentum
 
def hb_momentum(x0, t_seq, beta):
 
  x = x0
  v = 0
 
  for t in t_seq:
    f = noise_model(x, t)
    v = (1 - beta)*v + beta*f
    x = x + step_size*v
  
  return x
 
# Apply to PLMS
def plms_hb(x0, t_seq, beta):
 
  x = x0
  v = 0
 
  for t in t_seq:
    f = noise_model(x, t)
    update = numerical_update(x, f) # PLMS update
    
    v = (1 - beta)*v + beta*update
    x = x + step_size*v
 
  return x
# GHVB
 
def ghvb(x0, t_seq, beta, order):
 
  x = x0
  v = [0] * order
 
  for t in t_seq:
 
    f = noise_model(x, t)
    
    update = 0
    for i in range(order):
      update += ghvb_coeffs[order][i] * v[i]
 
    v.pop(0)
    v.append((1 - beta)*v[-1] + beta*f)
 
    x = x + step_size*update
 
  return x

The key steps are computing the noise prediction, numerically updating with PLMS or other method, applying momentum to maintain velocity, and using it to update position. The pseudo code illustrates how the core ideas can be implemented.