Diffusion Models: From DDPM to Flow Matching

q(\mathbf{x}_t \mid \mathbf{x}_{t-1}) = \mathcal{N}(\mathbf{x}_t; \sqrt{1 - \beta_t} \mathbf{x}_{t-1}, \beta_t \mathbf{I})

\mathbf{x}_t = \sqrt{\bar{\alpha}_t}\mathbf{x}_0 + \sqrt{1 - \bar{\alpha}_t}\boldsymbol{\epsilon}
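The closed form above means x_t can be drawn directly from x_0 without simulating every intermediate step. The following is a minimal sketch of that idea, assuming a linear beta schedule from 1e-4 to 0.02 over T = 1000 steps and 0-based timestep indices; the helper names `make_linear_schedule` and `q_sample` are illustrative, not taken from the original.

```python
import torch

def make_linear_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Return (betas, alpha_bars) for a linear beta schedule (assumed values)."""
    betas = torch.linspace(beta_start, beta_end, T, dtype=torch.float64)
    alphas = 1.0 - betas
    # alpha_bar_t is the running product of alpha_s for s <= t
    alpha_bars = torch.cumprod(alphas, dim=0)
    return betas, alpha_bars

def q_sample(x_0, t, alpha_bars, eps=None):
    """Draw x_t ~ q(x_t | x_0); t holds 0-based integer timesteps, shape (B,)."""
    if eps is None:
        eps = torch.randn_like(x_0)
    # Reshape the per-sample coefficient to (B, 1, ..., 1) so it broadcasts over x_0
    a_bar = alpha_bars[t].to(x_0.dtype).view(-1, *([1] * (x_0.dim() - 1)))
    return a_bar.sqrt() * x_0 + (1.0 - a_bar).sqrt() * eps

# Usage: noise a batch of four small "images" at random timesteps
betas, alpha_bars = make_linear_schedule()
x_0 = torch.randn(4, 3, 8, 8)
t = torch.randint(0, 1000, (4,))
x_t = q_sample(x_0, t, alpha_bars)
```

Passing `eps` explicitly makes the function deterministic, which is convenient when the same noise is needed later as a regression target.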

[Interactive figure: forward diffusion from t = 0 to T = 1000, with an SNR readout (SNR = ∞ at t = 0)]

p_\theta(\mathbf{x}_{t-1} \mid \mathbf{x}_t) = \mathcal{N}(\mathbf{x}_{t-1}; \boldsymbol{\mu}_\theta(\mathbf{x}_t, t), \sigma_t^2 \mathbf{I})

\boldsymbol{\mu}_\theta(\mathbf{x}_t, t) = \frac{1}{\sqrt{\alpha_t}} \left( \mathbf{x}_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}} \boldsymbol{\epsilon}_\theta(\mathbf{x}_t, t) \right)
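The posterior mean above translates into a single reverse (ancestral) sampling step. The sketch below assumes the variance choice sigma_t^2 = beta_t, 0-based timestep indices, and a noise predictor called as `eps_model(x_t, t_batch)`; the names `p_sample`, `eps_model`, and `zero_model` are hypothetical placeholders.

```python
import torch

@torch.no_grad()
def p_sample(eps_model, x_t, t, betas, alpha_bars):
    """One step x_t -> x_{t-1}; t is a 0-based Python int shared by the batch."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    a_bar_t = alpha_bars[t]
    t_batch = torch.full((x_t.shape[0],), t, dtype=torch.long, device=x_t.device)
    eps_hat = eps_model(x_t, t_batch)
    # mu_theta = (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps_hat) / sqrt(alpha_t)
    mean = (x_t - beta_t / (1.0 - a_bar_t).sqrt() * eps_hat) / alpha_t.sqrt()
    if t == 0:
        return mean  # no noise is added on the final step
    # Assumption: sigma_t^2 = beta_t
    return mean + beta_t.sqrt() * torch.randn_like(x_t)

# Usage with a stand-in predictor that always returns zero noise
betas = torch.linspace(1e-4, 0.02, 1000)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)
zero_model = lambda x, t: torch.zeros_like(x)
x = torch.randn(2, 3, 4, 4)
x_prev = p_sample(zero_model, x, 999, betas, alpha_bars)
```

A full sampler would call this in a loop from t = T - 1 down to 0, starting from Gaussian noise.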

L_{\text{simple}}(\theta) = \mathbb{E}_{t, \mathbf{x}_0, \boldsymbol{\epsilon}} \left[ \| \boldsymbol{\epsilon} - \boldsymbol{\epsilon}_\theta(\sqrt{\bar{\alpha}_t}\mathbf{x}_0 + \sqrt{1 - \bar{\alpha}_t}\boldsymbol{\epsilon}, t) \|^2 \right]
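The simplified objective above can be sketched as a single training-loss function. This assumes 0-based timesteps drawn uniformly, a precomputed `alpha_bars` tensor, and a noise predictor called as `eps_model(x_t, t)`; the names `ddpm_loss` and `eps_model` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def ddpm_loss(eps_model, x_0, alpha_bars):
    """Monte Carlo estimate of L_simple for one batch x_0 of shape (B, ...)."""
    B = x_0.shape[0]
    T = alpha_bars.shape[0]
    # Sample a timestep and a noise tensor per example
    t = torch.randint(0, T, (B,), device=x_0.device)
    eps = torch.randn_like(x_0)
    # Closed-form forward process: x_t = sqrt(a_bar) x_0 + sqrt(1 - a_bar) eps
    a_bar = alpha_bars[t].view(B, *([1] * (x_0.dim() - 1)))
    x_t = a_bar.sqrt() * x_0 + (1.0 - a_bar).sqrt() * eps
    # mse_loss averages over all elements, i.e. the squared norm up to a constant factor
    return F.mse_loss(eps_model(x_t, t), eps)

# Usage with a stand-in predictor that always returns zeros
alpha_bars = torch.cumprod(1.0 - torch.linspace(1e-4, 0.02, 1000), dim=0)
loss = ddpm_loss(lambda x, t: torch.zeros_like(x), torch.randn(8, 3, 4, 4), alpha_bars)
```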

[Interactive figure: noise schedule comparison, linear vs. cosine]
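The linear and cosine schedules can be compared numerically through the cumulative signal coefficient and the signal-to-noise ratio, SNR = alpha_bar / (1 - alpha_bar). The sketch below assumes the commonly used linear range 1e-4 to 0.02 and the cosine form f(t) / f(0) with offset s = 0.008; these constants and the function names are assumptions, since the original does not state them.

```python
import math

def linear_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """alpha_bar_t for t = 1..T under a linear beta schedule."""
    out, prod = [], 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / (T - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out

def cosine_alpha_bar(T=1000, s=0.008):
    """alpha_bar_t for t = 1..T under the cosine schedule f(t) / f(0)."""
    f = lambda t: math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2
    return [f(t) / f(0) for t in range(1, T + 1)]

def snr(alpha_bar):
    """Signal-to-noise ratio; infinite when no noise has been added yet."""
    return math.inf if alpha_bar >= 1.0 else alpha_bar / (1.0 - alpha_bar)

lin_ab, cos_ab = linear_alpha_bar(), cosine_alpha_bar()
for t in (1, 250, 500, 750, 1000):
    print(f"t={t:4d}  linear SNR={snr(lin_ab[t - 1]):.3e}  cosine SNR={snr(cos_ab[t - 1]):.3e}")
```

Under these assumed constants the cosine schedule retains more signal at mid-range timesteps than the linear one, which is the usual motivation for it.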

Part II: Flow Matching

\frac{dx}{dt} = v_\theta(x, t), \quad t \in [0, 1]

x_t = (1 - t) x_0 + t x_1

\mathcal{L}(\theta) = \mathbb{E}_{t, x_0, x_1} \left[ \| v_\theta(x_t, t) - (x_1 - x_0) \|^2 \right]

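import torch
import torch.nn.functional as F

def train_step(model, x_1):
    """Return the flow matching loss for one batch of data x_1, shape (B, ...)."""
    B = x_1.shape[0]

    # t ~ U[0, 1], one time per sample
    t = torch.rand(B, device=x_1.device)
    # Reshape to (B, 1, ..., 1) so t broadcasts over every non-batch dimension
    t_b = t.view(B, *([1] * (x_1.dim() - 1)))

    # x_0 ~ N(0, I)
    x_0 = torch.randn_like(x_1)

    # Linear interpolation between noise (t=0) and data (t=1)
    x_t = (1 - t_b) * x_0 + t_b * x_1

    # Target velocity is d x_t / dt = x_1 - x_0
    v_target = x_1 - x_0

    # Predict velocity field; t has shape (B,), matching sample_euler
    v_pred = model(x_t, t)

    # MSE loss
    loss = F.mse_loss(v_pred, v_target)
    return loss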

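@torch.no_grad()  # sampling needs no gradients; avoids building a graph across steps
def sample_euler(model, steps=50, shape=(1, 3, 256, 256), device="cpu"):
    x = torch.randn(shape, device=device)  # Start at noise (t=0)
    dt = 1.0 / steps

    for i in range(steps):
        # Current time, one entry per sample: shape (B,)
        t = torch.full((shape[0],), i * dt, device=device)
        v = model(x, t)

        # Euler integration step
        x = x + v * dt

    return x  # Data at t=1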
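To see the training objective and the Euler sampler working together, the following toy run fits a velocity field to 2-D points and then integrates it. It is only a sketch: `TinyVelocityNet`, the toy data distribution, and all hyperparameters are hypothetical choices made for illustration, and the model is assumed to take `t` with shape (B,).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVelocityNet(nn.Module):
    """Hypothetical MLP v_theta(x, t) for D-dimensional vectors; t has shape (B,)."""
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, t):
        # Append the time as an extra input feature
        return self.net(torch.cat([x, t[:, None]], dim=1))

torch.manual_seed(0)
model = TinyVelocityNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(100):
    # Toy "data": a narrow Gaussian blob centred at (2, -1)
    x_1 = 0.1 * torch.randn(256, 2) + torch.tensor([2.0, -1.0])
    x_0 = torch.randn_like(x_1)
    t = torch.rand(x_1.shape[0])
    x_t = (1 - t[:, None]) * x_0 + t[:, None] * x_1
    loss = F.mse_loss(model(x_t, t), x_1 - x_0)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Euler integration from noise (t=0) to data (t=1)
with torch.no_grad():
    steps = 50
    x = torch.randn(16, 2)
    for i in range(steps):
        t = torch.full((16,), i / steps)
        x = x + model(x, t) / steps
samples = x
```

With so few optimisation steps the samples are only a rough approximation of the target; the point is the data flow between training and sampling, not sample quality.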

\mathbf{x}_{t-\Delta t} = \underbrace{(1-(t-\Delta t))}_{\text{signal coeff}} \hat{\mathbf{x}}_0 + (t-\Delta t)\underbrace{\cos\!\left(\frac{\eta\pi}{2}\right)}_{\text{predicted noise}} \hat{\mathbf{x}}_1 + (t-\Delta t)\underbrace{\sin\!\left(\frac{\eta\pi}{2}\right)}_{\text{fresh noise}} \boldsymbol{\epsilon}
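The update above can be sketched as a single function. Note the convention implied by its labels: here the x_0 estimate is the signal, the x_1 estimate is the predicted noise, and sampling runs from t = 1 down to t = 0, which is the reverse of the training code earlier. The recovery of the two estimates from the velocity (x_t = (1 - t) x_0 + t x_1 with v = x_1 - x_0) is an assumption of this sketch, and `cps_step` is an illustrative name rather than the reference implementation.

```python
import math
import torch

def cps_step(x_t, v, t, dt, eta, eps=None):
    """One step from time t to t - dt; eta in [0, 1] sets the share of fresh noise."""
    if eps is None:
        eps = torch.randn_like(x_t)
    # Assumed convention: x_t = (1 - t) * x_0 + t * x_1, v = x_1 - x_0
    x0_hat = x_t - t * v           # predicted signal
    x1_hat = x_t + (1.0 - t) * v   # predicted noise
    s = t - dt                     # target time
    angle = eta * math.pi / 2
    # cos^2 + sin^2 = 1, so the total noise scale stays at s for any eta
    return (1.0 - s) * x0_hat + s * math.cos(angle) * x1_hat + s * math.sin(angle) * eps

# Usage: eta = 0 reduces to a deterministic Euler step backward in time
x = torch.randn(2, 3)
v = torch.randn(2, 3)
x_next = cps_step(x, v, t=0.8, dt=0.1, eta=0.0)
```

Setting eta = 1 replaces the predicted noise entirely with fresh noise, while intermediate values blend the two.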

Method	Stochasticity	Noise Artifacts	Noise Level Matches Scheduler
ODE (Euler)	None	None	Yes
Flow-SDE	Uncontrolled	Severe	No (excess noise)
CPS	Controlled (η)	None	Yes (Exact)

Reference: Wang & Yu, "Coefficients-Preserving Sampling for Reinforcement Learning with Flow Matching," arXiv:2509.05952, 2025.

\text{Error}_{\text{Euler}} \propto \max_{t} \| \nabla_x v_\theta \| \cdot dt^2
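One way to read the error estimate above is through a Taylor expansion of the exact trajectory. The following is a sketch, assuming the learned field is smooth in both arguments; it suggests the proportionality above is a simplified per-step bound that absorbs the time derivative and the magnitude of the velocity into the constant.

```latex
x(t + dt) = x(t) + dt \, v_\theta(x, t)
          + \frac{dt^2}{2} \left( \partial_t v_\theta(x, t) + \nabla_x v_\theta(x, t) \, v_\theta(x, t) \right)
          + O(dt^3)
```

The Euler step keeps only the first two terms, so the error per step is of order dt^2 and, accumulated over 1/dt steps, the global error is of order dt. When trajectories are straight and traversed at constant velocity, the bracketed term vanishes along the path and the Euler step is exact.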

Feature	DDPM / DDIM	Flow Matching
Mathematical Framework	Stochastic Differential Equations (SDE)	Ordinary Differential Equations (ODE)
Training Target	Predict Noise $\epsilon$ or Data $x_0$	Predict Velocity $v = x_1 - x_0$
Path Shape	Curved (Noise Schedule dependent)	Straight (Linear Interpolation)
Typical Sampling Steps	20 - 1000	1 - 50 (with Reflow)