Generative Modeling
Classifier-free Guidance In Practice
- Train a network to handle both conditional and unconditional generation: $f(x_t, t, c)$ and $f(x_t, t, \varnothing)$.
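- At each denoising step, run the network twice, once with the condition and once without, and combine the two predictions: $\epsilon_{\text{guided}} = (1 - w)\epsilon_{\text{unconditioned}} + w\epsilon_{\text{conditioned}}$. Setting $w = 1$ recovers the conditional prediction; $w > 1$ extrapolates past it, away from the unconditional one.
- Train the network by replacing the condition with $\varnothing$ with some probability, a technique known as condition dropout.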
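The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: `NULL_CONDITION`, `maybe_drop_condition`, `guided_prediction`, and `toy_network` are hypothetical names, predictions are plain lists of floats, and the toy network only stands in for a trained $f(x_t, t, c)$.

```python
import random

# Hypothetical stand-in for the empty condition (the "∅" input).
NULL_CONDITION = None


def maybe_drop_condition(c, p_drop, rng):
    """Condition dropout: swap c for the null condition with probability p_drop."""
    return NULL_CONDITION if rng.random() < p_drop else c


def guided_prediction(f, x_t, t, c, w):
    """Return (1 - w) * eps_unconditioned + w * eps_conditioned, element-wise."""
    eps_cond = f(x_t, t, c)
    eps_uncond = f(x_t, t, NULL_CONDITION)
    return [(1.0 - w) * u + w * k for u, k in zip(eps_uncond, eps_cond)]


def toy_network(x_t, t, c):
    """Toy placeholder for a trained network (assumption, not a real denoiser)."""
    shift = 0.0 if c is NULL_CONDITION else float(c)
    return [x + shift for x in x_t]


if __name__ == "__main__":
    rng = random.Random(0)
    # Training side: the condition is sometimes replaced by the null condition.
    print(maybe_drop_condition(3, p_drop=0.1, rng=rng))
    # Sampling side: two forward passes per step, combined with weight w.
    print(guided_prediction(toy_network, [1.0, 2.0], t=0, c=1, w=2.0))
```

The same network serves both branches here, which is what condition dropout during training is meant to make possible; the dropout probability `p_drop` is a hyperparameter the notes do not specify.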
Guiding a Diffusion Model with a Bad Version of Itself [1]
- Classifier-free guidance: train both a conditional and an unconditional diffusion model, then extrapolate between the two denoiser outputs by some factor $w$.
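- Maximum-likelihood training attempts to cover all training samples, which can spread probability mass over unlikely regions of the data space.
- CFG pulls samples away from those regions, which results in more natural-looking images.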
- The paper proposes autoguidance: extrapolate away from a poorer version of the model itself, where the poor model $D_0$ is trained on the same task and data distribution but is, for example, smaller (lower capacity) or under-trained.
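A rough sketch of the autoguidance combination follows. It assumes the same linear rule as classifier-free guidance, with the weak model $D_0$ taking the place of the unconditional branch and both models receiving the same condition. The names `autoguided_prediction`, `d_main`, and `d_weak`, and the toy denoisers, are hypothetical and only illustrate the arithmetic.

```python
def autoguided_prediction(d_main, d_weak, x, sigma, c, w):
    """Return (1 - w) * D_0(x; sigma, c) + w * D_main(x; sigma, c), element-wise.

    d_weak plays the role of the poor model D_0. Unlike CFG, both models
    see the same condition c.
    """
    out_main = d_main(x, sigma, c)
    out_weak = d_weak(x, sigma, c)
    return [(1.0 - w) * b + w * a for a, b in zip(out_main, out_weak)]


def toy_main(x, sigma, c):
    """Toy stand-in for the full-capacity denoiser (assumption)."""
    return [0.5 * v for v in x]


def toy_weak(x, sigma, c):
    """Toy stand-in for the smaller or under-trained denoiser D_0 (assumption)."""
    return [0.25 * v for v in x]


if __name__ == "__main__":
    # With w > 1 the result moves past the main model's output,
    # away from the weak model's output.
    print(autoguided_prediction(toy_main, toy_weak, [4.0, 8.0], sigma=1.0, c=0, w=2.0))
```

With $w = 1$ the weak model drops out entirely, so the guidance weight controls how strongly the difference between the two models is amplified.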