👾 Personal notes:
- Google Slides - Intro to diffusion models
- 🌟Google Slides - Stable Diffusion, explained (this version includes the key content from the previous slides)
- Review of ICLR 2023 diffusion-related papers
- Google Slides - ControlNet, explained
🦄 Some good materials to get started with:
- Lilian Weng's Lil'Log - What are Diffusion Models?
- The Annotated Diffusion Model@HuggingFace
- YouTube@AICoffeeBreak - "diffusion" related videos
- Bilibili@deep_thoughts - DDPM algorithm and its code, explained in video (Chinese)
- 生成扩散模型漫谈 (Talks on Generative Diffusion Models) @苏剑林 (Su Jianlin) (a series of articles in Chinese; only the first one is listed here)
This list collects notable diffusion model papers from different areas. Click 💡 to open a paper's GitHub repository.
Cornerstone papers of diffusion models:
- Deep unsupervised learning using nonequilibrium thermodynamics, 💡, Nov 2015, the first appearance of "diffusion" in machine learning
- NCSN, Generative modeling by estimating gradients of the data distribution, 💡, Jul 2019, Noise Conditional Score Network
- DDPM, Denoising diffusion probabilistic models, 💡, Jun 2020, can be seen as the cornerstone of subsequent diffusion model work
- Score-Based Generative Modeling through Stochastic Differential Equations, Nov 2020, from the SDE perspective, lots of math...
- DDIM: Denoising Diffusion Implicit Models
- PLMS: Pseudo Numerical Methods for Diffusion Models on Manifolds, ICLR 2022
- Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models, ICLR 2022 (Outstanding Paper Award)
- EDM, Elucidating the Design Space of Diffusion-Based Generative Models, 💡 Jun 2022, from NVIDIA, NeurIPS 2022 (Outstanding Paper Award)
- Diffusion-GAN: Training GANs with Diffusion (https://arxiv.org/abs/2206.02262), 💡, Jun 2022, from Microsoft Azure AI.
- DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps, NeurIPS 2022
- DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models
- Improved diffusion, Improved denoising diffusion probabilistic models, 💡 Feb 2021, from OpenAI, ICML 2021 (PMLR).
- Guided diffusion, Diffusion models beat GANs on image synthesis, 💡, Apr 2021, from OpenAI, NeurIPS 2021.
- FastDPM, On Fast Sampling of Diffusion Probabilistic Models, 💡, May 2021, from NVIDIA, ICLR Workshop 2021.
- LSGM, Score-based Generative Modeling in Latent Space, 💡 Jun 2021, from NVIDIA, NeurIPS 2021.
- Distilled-DM, Progressive Distillation for Fast Sampling of Diffusion Models, 💡, Feb 2022, from Google Brain, ICLR 2022.
- GGDM, Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality, Feb 2022, from Google Brain, ICLR 2022.
- DiT, Scalable Diffusion Models with Transformers, Dec 2022, from UCB. Replaces the U-Net with a transformer that uses adaptive layer norm layers.
- 🌟Stable Diffusion/LDM, High-resolution image synthesis with latent diffusion models, 💡 Dec 2021, from Stability.AI & LMU Munich & Runway. Very awesome for releasing code & weights for free!
- Glide: Towards photorealistic image generation and editing with text-guided diffusion models, 💡, Dec 2021, from OpenAI
- DALL-E 2, Hierarchical text-conditional image generation with CLIP latents, 💡 Apr 2022, from OpenAI. Btw, DALL-E (the first version) uses a different architecture: VQ-VAE.
- KNN Diffusion: Image Generation via Large-Scale Retrieval, Apr 2022, from Meta AI.
- Imagen: Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding, 💡 May 2022, from Google Brain. NeurIPS 2022 (Outstanding Paper Award). Uses a pretrained language model, T5.
- LAION-RDM, Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models, 💡, Jul 2022, from Ludwig-Maximilian University of Munich.
- DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation, 💡, Aug 2022, from Google Research & Boston University.
- DreamFusion: Text-to-3D using 2D Diffusion, 💡, 29 Sep 2022, from Google Research & UCB.
- SDEdit, SDEdit: Image Synthesis and Editing with Stochastic Differential Equations, 💡 Aug 2021, from Stanford University & Carnegie Mellon University, ICLR 2022
- RePaint, RePaint: Inpainting using Denoising Diffusion Probabilistic Models, 💡, Jan 2022, from ETH Zurich, CVPR 2022.
- Prompt-to-Prompt Image Editing with Cross Attention Control, Aug 2022, from Google.
- DiffEdit, DiffEdit: Diffusion-based semantic image editing with mask guidance, Oct 2022, from Meta AI.
- InstructPix2Pix, InstructPix2Pix: Learning to Follow Image Editing Instructions, 💡, Nov 2022, from UC Berkeley (using GPT-3)
- Diffusion Guided Domain Adaptation of Image Generators, 💡 Dec 2022 from Rutgers University
- Bit Diffusion, Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning, from Google Brain, ICLR 2023, analog bits, self-conditioning and asymmetric time intervals. (Could also be used in discrete/categorical image generation tasks)
- Video diffusion models, Apr 2022, ICLR 2022 Workshop. GIF-like.
- MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation, 💡, May 2022, from University of Montreal
- Make-A-Video: Text-to-Video Generation without Text-Video Data, 29 Sep 2022, from Meta AI (previous work is Make-A-Scene, VQ-VAE + classifier-free guidance)
- Imagen Video: High Definition Video Generation with Diffusion Models, 5 Oct 2022, from Google Brain (also announced Phenaki for long video generation, but it is not a diffusion model)
- DiffusionDet: Diffusion Model for Object Detection 💡 17 Nov 2022, from HKU and Tencent AI
- Pix2Seq-D: A Generalist Framework for Panoptic Segmentation of Images and Videos
- Diffusion-LM Diffusion-LM Improves Controllable Text Generation, 💡, May 2022, from Stanford University.
- DiffWave, DiffWave: A Versatile Diffusion Model for Audio Synthesis, 💡, Sep 2020, from UCSD & Nvidia & Baidu, ICLR 2021
- WaveGrad, WaveGrad: Estimating Gradients for Waveform Generation, 💡, Sep 2020, from Google Brain, ICLR 2021.
- Symbolic Music Generation with Diffusion Models, 💡, Mar 2021, from Google Brain, ISMIR 2021
- DiffSinger, DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism, 💡, May 2021, from Zhejiang University, AAAI 2022
- VDM, Variational Diffusion Models, 💡, Jul 2021, from Google Brain, NeurIPS 2021.
- FastDiff, FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis, 💡, Apr 2022, from Zhejiang University & Tencent AI Lab, IJCAI 2022
- BDDMs, BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis, 💡, May 2022, from Tencent AI Lab, ICLR 2022
- SawSing, DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation, 💡, Aug 2022, ISMIR 2022
- ProDiff, ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech, 💡, Jul 2022, from Zhejiang University, ACM Multimedia 2022
- DiffVC, Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme, 💡, Sep 2021, from Huawei Noah, ICLR 2022.
- NU-Wave, NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling, 💡, Apr 2021, from MINDSLAB & Seoul National University, Interspeech 2021
- CDiffSE, Conditional Diffusion Probabilistic Model for Speech Enhancement, 💡 Feb 2022, from CMU & Reality Labs Research, Pittsburgh & Academia Sinica, IEEE 2022
- Grad-TTS, Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech, May 2021, from Huawei Noah
- EdiTTS, EdiTTS: Score-based Editing for Controllable Text-to-Speech, 💡 Oct 2021, from Yale University & Supertone Inc & Neosapience Inc
- DiffGAN-TTS, DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs, 💡, Jan 2022, from Tencent AI Lab
- Diffsound, Diffsound: Discrete Diffusion Model for Text-to-sound Generation, 💡, Jul 2022, from Beijing University & Tencent AI Lab