# Development of a lightweight class-conditional GAN for image generation on CIFAR10

## Dataset

### CIFAR10

*The classes of CIFAR-10, along with 10 random images from each.*

The CIFAR-10 dataset consists of 60,000 colour images from 10 different class categories, with 6,000 images per class. Although the spatial size of each image is small (32x32), the dataset is still complex enough to require a large model to generate high-quality images. In particular, the current state-of-the-art model for class-conditional image generation on CIFAR-10 is StyleGAN2, which contains more than 20 million trainable parameters. More information about the dataset can be found on the official website.
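For reference, the real dataset can be loaded in a couple of lines with torchvision; a minimal sketch:

```python
import torchvision
import torchvision.transforms as T

# Download CIFAR-10 and convert each 32x32 image to a tensor in [0, 1]
trainset = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor()
)
image, label = trainset[0]
print(image.shape, label)  # torch.Size([3, 32, 32]), int in [0, 9]
```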

## Knowledge Distillation

For the CIFAR-10 dataset, we opted to distil the StyleGAN2-ADA model, which achieves state-of-the-art performance on conditional image generation for CIFAR-10. The main reason behind its success is its adaptive discriminator augmentation mechanism, which significantly stabilizes training when limited data are available. In our case, however, the model was used purely for black-box image generation: regardless of the training procedure and techniques followed in the original study, we only needed access to the input-output pairs of the model's generator. In particular, we used the official PyTorch implementation of the StyleGAN2-ADA model by NVIDIA Research Projects on GitHub, along with the provided weights of the model pre-trained on CIFAR-10 for conditional image generation. StyleGAN2-ADA was thus used to create a FakeCIFAR10 dataset, consisting of images generated by the model together with the corresponding input noise vectors and class labels. This dataset was then used to train the student network to mimic the functionality of the teacher network (i.e. StyleGAN2-ADA) under several objectives. Once the dataset had been created, the teacher was no longer needed during the student's training and could be discarded.
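A minimal sketch of this black-box querying step, assuming the generator-loading convention of the official StyleGAN2-ADA PyTorch repository (the checkpoint filename and batch size are placeholders, and the repository's code must be importable for the pickle to resolve):

```python
import pickle

import torch
import torch.nn.functional as F

# Load the pre-trained teacher generator from NVIDIA's checkpoint
with open("cifar10.pkl", "rb") as f:
    G = pickle.load(f)["G_ema"].eval()  # exponential-moving-average weights

batch = 64
z = torch.randn([batch, G.z_dim])              # input noise vectors
labels = torch.randint(0, 10, (batch,))        # random class labels
c = F.one_hot(labels, num_classes=10).float()  # one-hot conditions

# Black-box call: only the input-output pairs are needed for distillation
with torch.no_grad():
    images = G(z, c)

# The (z, labels, images) triples are what FakeCIFAR10 stores
```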

## FakeCIFAR10

The FakeCIFAR10 dataset consists of 50,000 synthetic images generated by the StyleGAN2-ADA model. There are 5,000 images for each of the 10 classes, stored along with the class labels and the noise vectors that were used as input to StyleGAN's generator. The dataset was generated using the create_dataset.py script (whose options are listed below), and it can be found here.
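A hedged sketch of how such (noise, label, image) triples can be wrapped for training the student; the in-memory tensor storage assumed here is illustrative only, not the actual on-disk format of the extracted dataset:

```python
from torch.utils.data import Dataset

class FakeCIFAR10(Dataset):
    """Sketch of a dataset of (noise, label, image) triples."""

    def __init__(self, noise, labels, images):
        self.noise = noise    # (N, z_dim) noise vectors fed to the teacher
        self.labels = labels  # (N,) class labels in [0, 9]
        self.images = images  # (N, 3, 32, 32) teacher-generated images

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.noise[idx], self.labels[idx], self.images[idx]
```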

```
$ python create_dataset.py -h
usage: create_dataset.py [-h] -p PATH -c CHECKPOINT [-n NSAMPLES] [-b BATCH_SIZE] [-d {cpu,cuda}]

options:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  Path to save the generated images to.
  -c CHECKPOINT, --checkpoint CHECKPOINT
                        Path to StyleGAN2's checkpoint for CIFAR-10.
  -n NSAMPLES, --nsamples NSAMPLES
                        Number of samples per class.
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Number of samples per minibatch.
  -d {cpu,cuda}, --device {cpu,cuda}
                        Device to use for the image generation.
```

In order to run the script, please adhere to the requirements of the StyleGAN2-ADA model from its official repository. Note that the script can also be executed without a GPU.
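For example, to generate the full 50,000-image dataset (5,000 samples for each of the 10 classes; the paths and checkpoint filename below are placeholders for the downloaded CIFAR-10 weights):

```
$ python create_dataset.py --path ./fakecifar --checkpoint ./cifar10.pkl --nsamples 5000 --device cuda
```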

## DiStyleGAN

### Getting Started

1. Clone the GitHub repository.
2. Install the Python packages in the requirements file (Python 3.10), e.g. as sketched below.
3. Download the FakeCIFAR10 dataset from here and extract the zip file.
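For example (the repository URL and folder name are placeholders):

```
$ git clone <repository-url>
$ cd DiStyleGAN
$ pip install -r requirements.txt
```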

### Training DiStyleGAN

- Train DiStyleGAN from scratch in Python using the code below:

```python
from distylegan import DiStyleGAN

model = DiStyleGAN()
model.train(
    dataset="./fakecifar/dataset",
    save="results"
)
```
- Train DiStyleGAN from scratch using the command line options of the corresponding script:

```
$ python distylegan.py train -h
usage: distylegan.py train [-h] --dataset DATASET --save SAVE [--c_dim C_DIM] [--lambda_ganD LAMBDA_GAND] [--lambda_ganG LAMBDA_GANG]
[--lambda_pixel LAMBDA_PIXEL] [--nc NC] [--ndf NDF] [--ngf NGF] [--project_dim PROJECT_DIM] [--transform TRANSFORM] [--z_dim Z_DIM]
[--adam_momentum ADAM_MOMENTUM] [--batch_size BATCH_SIZE] [--checkpoint_interval CHECKPOINT_INTERVAL] [--checkpoint_path CHECKPOINT_PATH]
[--device DEVICE] [--epochs EPOCHS] [--gstep GSTEP] [--lr_D LR_D] [--lr_G LR_G] [--lr_decay LR_DECAY] [--num_test NUM_TEST]
[--num_workers NUM_WORKERS] [--real_dataset REAL_DATASET]

options:
  -h, --help            show this help message and exit

Required arguments for the training procedure:
  --dataset DATASET     Path to the dataset directory of the fake CIFAR10 data generated by the teacher network
  --save SAVE           Path to save checkpoints and results

Optional arguments about the network configuration:
  --c_dim C_DIM         Condition dimension (Default: 10)
  --lambda_ganD LAMBDA_GAND
                        Weight for the adversarial GAN loss of the
                        Discriminator (Default: 0.2)
  --lambda_ganG LAMBDA_GANG
                        Weight for the adversarial distillation loss
                        of the Generator (Default: 0.01)
  --lambda_pixel LAMBDA_PIXEL
                        Weight for the pixel loss of the Generator
                        (Default: 0.2)
  --nc NC               Number of channels for the images
                        (Default: 3)
  --ndf NDF             Number of discriminator filters in the first
                        convolutional layer (Default: 128)
  --ngf NGF             Number of generator filters in the first
                        convolutional layer (Default: 256)
  --project_dim PROJECT_DIM
                        Dimension to project the input condition
                        (Default: 128)
  --transform TRANSFORM
                        Optional transform to be applied on a sample
                        image (Default: None)
  --z_dim Z_DIM         Noise dimension (Default: 512)

Optional arguments about the training procedure:
  --adam_momentum ADAM_MOMENTUM
                        Momentum value for the Adam optimizers'
                        betas (Default: 0.5)
  --batch_size BATCH_SIZE
                        Number of samples per batch (Default: 128)
  --checkpoint_interval CHECKPOINT_INTERVAL
                        Checkpoints will be saved every
                        `checkpoint_interval` epochs (Default: 20)
  --checkpoint_path CHECKPOINT_PATH
                        Path to previous checkpoint
  --device DEVICE       Device to use for training ('cpu' or 'cuda')
                        (Default: if there is a CUDA device
                        available, it will be used for training)
  --epochs EPOCHS       Number of training epochs (Default: 150)
  --gstep GSTEP         The number of discriminator updates after
                        which the generator is updated using the
                        full loss (Default: 10)
  --lr_D LR_D           Learning rate for the discriminator's Adam
                        optimizer (Default: 0.0002)
  --lr_G LR_G           Learning rate for the generator's Adam
                        optimizer (Default: 0.0002)
  --lr_decay LR_DECAY   Iteration to start decaying the learning
                        rates for the Generator and the
                        Discriminator (Default: 350000)
  --num_test NUM_TEST   Number of generated images for evaluation
                        (Default: 30)
  --num_workers NUM_WORKERS
                        Number of subprocesses to use for data
                        loading (Default: 0, which means that the
                        data will be loaded in the main process)
  --real_dataset REAL_DATASET
                        Path to the dataset directory of the real
                        CIFAR10 data (Default: None, it will be
                        downloaded and saved in the parent directory
                        of the input `dataset` path)
```
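For example, a minimal run with only the required arguments (paths as in the Python snippet above):

```
$ python distylegan.py train --dataset ./fakecifar/dataset --save results
```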

### Image Generation

Download the weights of our pre-trained model from here and extract the zip file in the root directory of our repository. Then, you can generate images using the following options:

- Python

```python
from distylegan import DiStyleGAN

model = DiStyleGAN()
images = model.generate(
    checkpoint_path="../checkpoint",
    nsamples=100,
    label=[0, 3, 7],  # or label=x (int in range [0, 9]), or label=None
    save="synthetic-samples",
    batch_size=32
)
```
- Command line

```
$ python distylegan.py generate -h
usage: distylegan.py generate [-h] --checkpoint_path CHECKPOINT_PATH --nsamples NSAMPLES --save SAVE
[--label [{0,1,2,3,4,5,6,7,8,9} ...]] [--batch_size BATCH_SIZE]

options:
  -h, --help            show this help message and exit

Required arguments for the generation procedure:
  --checkpoint_path CHECKPOINT_PATH
                        Path to previous checkpoint (the directory
                        must contain the generator.pt and
                        config.json files)
  --nsamples NSAMPLES   Number of samples to generate per label
  --save SAVE           Path to save the generated images to

Optional arguments about the generation procedure:
  --label [{0,1,2,3,4,5,6,7,8,9} ...]
                        Class label(s) for the samples (Default:
                        None, random labels) --> e.g. --label 0 3 7
  --batch_size BATCH_SIZE
                        Number of samples per batch (Default: 32)
```
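For example, mirroring the Python snippet above (the checkpoint path is a placeholder):

```
$ python distylegan.py generate --checkpoint_path ./checkpoint --nsamples 100 --save synthetic-samples --label 0 3 7
```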
- Webapp: run the command `flask run` inside the webapp/ directory of our repository, as shown below. Then, by following the link displayed in the command line (e.g. http://127.0.0.1:5000/), you will be presented with the following interface, where you can generate images.
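For example, assuming Flask is installed via the requirements:

```
$ cd webapp/
$ flask run
```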

*The interface of our webapp for image generation.*

## Qualitative Evaluation

- Evaluate the progress of training using the gifmaker.py script:

```
$ python gifmaker.py -h
usage: gifmaker.py [-h] -p PATH -s SAVE [-d DURATION]

options:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  Path to the "images/" directory from training.
  -s SAVE, --save SAVE  Filename for the .gif file.
  -d DURATION, --duration DURATION
                        GIF duration in seconds.
```
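For example, assuming training results were saved under results/ so that intermediate images live in results/images/ (placeholder paths):

```
$ python gifmaker.py --path results/images --save progress.gif --duration 10
```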
- Evaluate the diversity of DiStyleGAN's samples using the t-SNE algorithm:

```
$ python tsne.py -h
usage: tsne.py [-h] -p PATH -f FILENAME -n NSAMPLES [-t TITLE] [-b BATCH_SIZE]

t-SNE visualization of generated samples

options:
  -h, --help            show this help message and exit
  -p PATH, --path PATH  Path to the directory of the generated images. The directory should have the following format:
                        dir/{class-0, class-1, ...}/image_X.png
  -f FILENAME, --filename FILENAME
                        Filename for the .png file.
  -n NSAMPLES, --nsamples NSAMPLES
                        Number of samples to use from each class.
  -t TITLE, --title TITLE
                        Title for the image.
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Number of samples per batch.
```
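For example, assuming the generated images follow the per-class directory layout described above (all values are placeholders):

```
$ python tsne.py --path synthetic-samples --filename tsne.png --nsamples 100 --title "DiStyleGAN samples"
```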

For more information about the evaluation, see the corresponding wiki page.