
Leveraging Latent Diffusion Models for Training-Free In-Distribution Data Augmentation for Surface Defect Detection

Official implementation of the paper Leveraging Latent Diffusion Models for Training-Free In-Distribution Data Augmentation for Surface Defect Detection accepted at the 21st International Conference on Content-Based Multimedia Indexing (CBMI 2024).

[Teaser figure for DIAG]

Installation

1. Clone the repository

git clone https://github.com/intelligolabs/DIAG.git
cd DIAG

2. Create an environment with dependencies

conda create -n DIAG python=3.10
conda activate DIAG
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

Note: the pip install command above installs PyTorch 1.12.1 with CUDA 11.3. Adjust it to match your CUDA version.

Then, install the requirements:

pip install -r requirements.txt
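
To verify that the installed PyTorch build matches your CUDA setup, a quick sanity check is:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

On a correctly configured machine this should report 1.12.1+cu113 and True.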

3. Data preparation

Download KSDD2 from the official KSDD2 website.

Run the ksdd2_preprocess.py script. It creates a pre-processed copy of KSDD2 in the directory given by --dst_dir; this copy is used for both the augmentation and the evaluation steps.

python ksdd2_preprocess.py --src_dir="<path_to>/ksdd2" --dst_dir="<path_to>/ksdd2_preprocessed"
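
As a quick check of the raw download before preprocessing, the sketch below counts defective vs. defect-free samples. It assumes KSDD2's usual layout of <id>.png images paired with <id>_GT.png masks inside train/ and test/ folders; a sample is positive when its mask contains any non-zero pixel.

import os
import numpy as np
from PIL import Image

src_dir = "<path_to>/ksdd2"  # raw KSDD2 download (assumed train/ and test/ subfolders)

for split in ("train", "test"):
    split_dir = os.path.join(src_dir, split)
    n_pos = n_neg = 0
    for name in sorted(os.listdir(split_dir)):
        if not name.endswith("_GT.png"):
            continue  # only iterate over the ground-truth masks
        mask = np.array(Image.open(os.path.join(split_dir, name)))
        # a sample is defective (positive) if its GT mask has any non-zero pixel
        if mask.any():
            n_pos += 1
        else:
            n_neg += 1
    print(f"{split}: {n_pos} positive / {n_neg} negative samples")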

4. [Optional] Set up wandb for logging

Optionally, you can log training and evaluation to wandb.

wandb init
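
The --logging flag of the training script (see Part 2) is expected to take care of the actual logging. If you want to wire wandb up manually instead, a minimal sketch with placeholder project name, config, and metric keys (not the repository's actual setup) looks like:

import wandb

# Hypothetical example: project name, config, and metric keys are placeholders.
run = wandb.init(project="DIAG", config={"epochs": 30, "batch_size": 32})
for epoch in range(run.config.epochs):
    wandb.log({"epoch": epoch})  # replace with real training/evaluation metrics
wandb.finish()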

Part 1: Data augmentation

This step generates the augmented positive images, using the same prompts as in the original paper. The --src_dir argument should point to the pre-processed data root (see step 3, Data preparation).

By default, this script writes the augmented images to an augmented_<img_per_prompt * prompts> folder inside --src_dir; the training dataloader expects this layout.

python generate_augmented_images.py --src_dir="<path_to>/ksdd2_preprocessed" --imgs_per_prompt=50  --seed=0
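
The generation builds on the SDXL inpainting pipeline from Diffusers (see Credits). The sketch below illustrates the general recipe of inpainting a defect into a nominal image from a textual prompt and a localization mask; the model id is the public SDXL inpainting checkpoint, while the prompt, file paths, and sampler settings are placeholders rather than the values used by generate_augmented_images.py.

import torch
from PIL import Image
from diffusers import AutoPipelineForInpainting

# Placeholder prompt and paths for illustration; the repository's own prompts
# and mask handling live in generate_augmented_images.py.
pipe = AutoPipelineForInpainting.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("<path_to>/nominal_sample.png").convert("RGB")
mask = Image.open("<path_to>/defect_mask.png").convert("L")  # white = region to inpaint

augmented = pipe(
    prompt="a scratch defect on a metal surface",  # placeholder defect prompt
    image=image,
    mask_image=mask,
    guidance_scale=8.0,
    num_inference_steps=30,
).images[0]
augmented.save("augmented_sample.png")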

Part 2: Training and evaluation

This step fine-tunes a pre-trained ResNet-50 on the (augmented) KSDD2. The following arguments control the training policy (a sketch of how they interact is shown after the example command below):

  • --zero_shot: train the model without GT positive images
  • --add_augmented: add the augmented images to the training set (compatible with both zero-shot and full-shot training)
  • --num_augmented: how many augmented images to add to the training set; required when using --add_augmented. Note that this MUST be the TOTAL number of images generated in the previous step (imgs_per_prompt * prompts).

Example of DIAG training (zero-shot with augmentations):

python train_ResNet50.py --seed=0 --epochs=30 --batch_size=32 --num_workers=8 --dataset_path="<path_to>/ksdd2_preprocessed" --zero_shot --add_augmented --num_augmented=100 --logging
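
For reference, here is a minimal sketch of how the three flags above could translate into the composition of the training set; the function and variable names are illustrative and do not come from train_ResNet50.py.

def build_training_set(gt_negatives, gt_positives, augmented_positives,
                       zero_shot, add_augmented, num_augmented):
    """Illustrative only: assemble training samples according to the CLI flags."""
    samples = list(gt_negatives)          # defect-free images are always used
    if not zero_shot:
        samples += list(gt_positives)     # full-shot: keep the GT positive images
    if add_augmented:
        # --num_augmented must equal the total number of generated images
        assert num_augmented == len(augmented_positives)
        samples += list(augmented_positives)
    return samples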

Credits

  • Kolektor Surface-Defect Dataset 2 (KSDD2). More info here.
  • Diffusers and StabilityAI for their SDXL implementation and weights.

Authors

Federico Girella, Ziyue Liu, Franco Fummi, Francesco Setti, Marco Cristani, Luigi Capogrosso

Department of Engineering for Innovation Medicine, University of Verona, Italy

[email protected]

Citation

If you use DIAG, please cite the following paper:

@article{girella2024leveraging,
  title={Leveraging Latent Diffusion Models for Training-Free In-Distribution Data Augmentation for Surface Defect Detection},
  author={Girella, Federico and Liu, Ziyue and Fummi, Franco and Setti, Francesco and Cristani, Marco and Capogrosso, Luigi},
  journal={arXiv preprint arXiv:2407.03961},
  year={2024}
}
