
Knowledge Distillation Framework


Training a GAN from scratch is an intricate procedure [1], especially on complex datasets. In addition, current state-of-the-art models [2][3] show a clear trend of scaling up to achieve better performance. The question, then, is whether it is possible to generate high-quality images with a smaller model. Romero et al. (2014) [4] propose a knowledge distillation framework for network compression, in which the knowledge of a pre-trained teacher network (big model) is used to train a student network (small model); the student achieves results comparable to the teacher's while requiring significantly fewer parameters. Chang and Lu (2020) [5] adopted this framework and proposed a black-box knowledge distillation method designed for GANs. Their model, TinyGAN, successfully distills BigGAN, achieving competitive performance while reducing the number of parameters by a factor of 16.
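To give a rough feel for the black-box setting, the sketch below trains a small student generator to mimic a frozen teacher's outputs for shared latent codes and class labels using a pixel-level L1 loss. This is a minimal PyTorch illustration, not TinyGAN itself: the `ToyGenerator` modules are hypothetical stand-ins for BigGAN and the student architecture, and TinyGAN additionally uses adversarial and feature-level distillation losses on top of the pixel term.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the pre-trained teacher (e.g. BigGAN) and the
# much smaller student; in practice these would be full conditional generators.
z_dim, num_classes, img_pixels = 128, 10, 32 * 32 * 3

class ToyGenerator(nn.Module):
    def __init__(self, hidden):
        super().__init__()
        self.embed = nn.Embedding(num_classes, z_dim)
        self.net = nn.Sequential(
            nn.Linear(2 * z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, img_pixels), nn.Tanh(),
        )

    def forward(self, z, y):
        return self.net(torch.cat([z, self.embed(y)], dim=1))

teacher = ToyGenerator(hidden=1024).eval()   # frozen "big" model
student = ToyGenerator(hidden=64)            # compact student
for p in teacher.parameters():
    p.requires_grad_(False)                  # teacher is never updated

pixel_loss = nn.L1Loss()
opt = torch.optim.Adam(student.parameters(), lr=2e-4)

for step in range(100):
    z = torch.randn(16, z_dim)                        # shared latent codes
    y = torch.randint(0, num_classes, (16,))          # shared class labels
    with torch.no_grad():
        teacher_imgs = teacher(z, y)                  # black-box queries only
    loss = pixel_loss(student(z, y), teacher_imgs)    # mimic teacher outputs
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the teacher is queried only through its inputs and outputs, the student never needs access to the teacher's weights or intermediate activations, which is what makes the approach "black-box".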

References

[1] Salimans, Tim, et al. "Improved techniques for training GANs." Advances in Neural Information Processing Systems 29 (2016).

[2] Brock, Andrew, Jeff Donahue, and Karen Simonyan. "Large scale GAN training for high fidelity natural image synthesis." arXiv preprint arXiv:1809.11096 (2018).

[3] Kang, Minguk, et al. "Rebooting ACGAN: Auxiliary classifier GANs with stable training." Advances in Neural Information Processing Systems 34 (2021): 23505-23518.

[4] Romero, Adriana, et al. "FitNets: Hints for thin deep nets." arXiv preprint arXiv:1412.6550 (2014).

[5] Chang, Ting-Yun, and Chi-Jen Lu. "TinyGAN: Distilling BigGAN for conditional image generation." Proceedings of the Asian Conference on Computer Vision. 2020.