## ILVR + ADM

This is the implementation of ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models (ICCV 2021 Oral).

This repository is heavily based on improved diffusion and guided diffusion. We use PyTorch-Resizer for the resizing function.

## Overview

ILVR is a learning-free method for controlling the generation of unconditional DDPMs. ILVR refines each generation step with the low-frequency component of a perturbed reference image. Our method enables various tasks (image translation, paint-to-image, editing with scribbles) with only a single model trained on a target dataset.
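The refinement step can be sketched in a few lines. The example below is a minimal NumPy illustration, not the repository's code: `low_pass` and `ilvr_refine` are hypothetical names, and the low-pass filter is a simple block-mean downsample followed by nearest-neighbour upsampling, standing in for the resizing that PyTorch-Resizer provides. `x_proposal` plays the role of the model's unconditional proposal for a step, and `y_noised` the reference image perturbed to the same noise level.

```python
import numpy as np

def low_pass(img, n):
    """Block-mean downsample by n, then nearest-neighbour upsample back.

    A simplified stand-in for the low-pass filter; assumes a 2D array whose
    height and width are divisible by n.
    """
    h, w = img.shape
    coarse = img.reshape(h // n, n, w // n, n).mean(axis=(1, 3))
    return np.repeat(np.repeat(coarse, n, axis=0), n, axis=1)

def ilvr_refine(x_proposal, y_noised, n):
    """Swap the low-frequency part of the proposal for that of the noised reference."""
    return x_proposal + low_pass(y_noised, n) - low_pass(x_proposal, n)

rng = np.random.default_rng(0)
x_proposal = rng.normal(size=(8, 8))  # stand-in for the unconditional proposal
y_noised = rng.normal(size=(8, 8))    # stand-in for the perturbed reference
refined = ilvr_refine(x_proposal, y_noised, n=4)
```

With this particular filter, the refined sample has the same low-frequency content as the perturbed reference, while its high-frequency detail still comes from the model's proposal.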

## Download pre-trained models

Create a folder `models/` and download model checkpoints into it. Here are the unconditional models trained on FFHQ and AFHQ-dog:

- 256x256 FFHQ: ffhq_10m.pt
- 256x256 AFHQ-dog: afhq_dog_4m.pt

These models have seen 10M and 4M images, respectively. You may also try models from guided diffusion.
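Before sampling, it can help to confirm that the checkpoints are where the commands expect them. The following is a small sketch using only the standard library; `missing_checkpoints` is a hypothetical helper, not part of this repository, and it assumes the files keep the names listed above.

```python
from pathlib import Path

# Checkpoint file names as listed above.
CHECKPOINTS = ("ffhq_10m.pt", "afhq_dog_4m.pt")

def missing_checkpoints(model_dir="models", names=CHECKPOINTS):
    """Return the checkpoint names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in names if not (root / name).is_file()]

if __name__ == "__main__":
    missing = missing_checkpoints()
    print("All checkpoints found." if not missing else f"Missing: {missing}")
```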

## ILVR Sampling

First, set the `PYTHONPATH` variable to point to the root of the repository.

```
export PYTHONPATH=$PYTHONPATH:$(pwd)
```

Then, place your input image into a folder `ref_imgs/`.

Run the `ilvr_sample.py` script. Specify the folder where you want to save the output in `--save_dir`.

Here, we provide flags for sampling from the above models. Feel free to change `--down_N` and `--range_t` to adapt the downsampling factor and conditioning range from the paper.
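As a rough illustration of how these two flags interact, the sketch below assumes that sampling counts step indices down from `num_steps - 1` to 0 and that the reference is applied only while the index is greater than `range_t`; check `gaussian_diffusion.py` for the exact rule. `conditioned_steps` and `low_res_size` are hypothetical helpers, not part of the repository, and `num_steps=100` corresponds to `--timestep_respacing 100`.

```python
def conditioned_steps(num_steps, range_t):
    """Descending step indices at which the reference is applied.

    Assumption: conditioning is active only while the index is > range_t.
    """
    return [i for i in range(num_steps - 1, -1, -1) if i > range_t]

def low_res_size(image_size, down_n):
    """Side length of the downsampled image for downsampling factor down_n."""
    return image_size // down_n

steps = conditioned_steps(num_steps=100, range_t=20)
print(len(steps), steps[0], steps[-1])  # 79 99 21
print(low_res_size(256, 32))            # 8
```

Under these assumptions, a larger `--down_N` keeps less of the reference's structure, and a larger `--range_t` stops conditioning earlier, so both loosen the tie to the reference image.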

Refer to improved diffusion for the `--timestep_respacing` flag.

```
python scripts/ilvr_sample.py --attention_resolutions 16 --class_cond False --diffusion_steps 1000 --dropout 0.0 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 128 --num_head_channels 64 --num_res_blocks 1 --resblock_updown True --use_fp16 False --use_scale_shift_norm True --timestep_respacing 100 --model_path models/ffhq_10m.pt --base_samples ref_imgs/face --down_N 32 --range_t 20 --save_dir output
```
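When trying several values of `--down_N` or `--range_t`, it may be convenient to assemble the command programmatically. The sketch below is one way to do that, assuming only the flags shown in the command above; `BASE_FLAGS` and `build_command` are hypothetical names, not part of the repository.

```python
import shlex

# Flags copied from the FFHQ command above; they stay fixed across runs.
BASE_FLAGS = {
    "attention_resolutions": 16, "class_cond": False, "diffusion_steps": 1000,
    "dropout": 0.0, "image_size": 256, "learn_sigma": True,
    "noise_schedule": "linear", "num_channels": 128, "num_head_channels": 64,
    "num_res_blocks": 1, "resblock_updown": True, "use_fp16": False,
    "use_scale_shift_norm": True, "timestep_respacing": 100,
}

def build_command(model_path, base_samples, down_n, range_t, save_dir):
    """Return the argument list for one ilvr_sample.py run."""
    flags = dict(BASE_FLAGS, model_path=model_path, base_samples=base_samples,
                 down_N=down_n, range_t=range_t, save_dir=save_dir)
    args = ["python", "scripts/ilvr_sample.py"]
    for name, value in flags.items():
        args += [f"--{name}", str(value)]
    return args

cmd = build_command("models/ffhq_10m.pt", "ref_imgs/face", 32, 20, "output")
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, for example inside a loop over several downsampling factors.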

ILVR sampling is implemented in `p_sample_loop_progressive` of `guided-diffusion/gaussian_diffusion.py`.

## Results

These are samples generated with N=8 and 16:

These are cat-to-dog samples generated with N=32:

## Note

This repo is a re-implementation of our method on guided diffusion. Our initial implementation of the paper was based on denoising-diffusion-pytorch.