Github page for the preprint paper "InfoCatVAE: Representation Learning with Categorical Variational Autoencoders"


This repository gives an implementation of InfoCatVAE:

InfoCatVAE is a variational autoencoder framework that learns interpretable categorical and continuous representations. It has three main specificities:

  • A multimodal fixed prior distribution
  • A soft-clustering shaped objective function
  • An information maximization layer that:
    • requires no additional network
    • improves conditional generation
    • provides a natural way to sidestep the problem of backpropagating through discrete sampling


  • Reducing the trade-off between expressiveness and robustness in mixture models by using an information maximization trick
  • Leveraging the information maximization architecture so that the network naturally optimizes the categorical sampling layer

Modification of VAEs for representation learning


The following categorical VAE (CatVAE) enforces readable categorical information in the latent code representation:

Mixture model

Figure 1: CatVAE: square blocks represent neural networks, oval-shaped blocks represent sampling

This architecture yields a natural new ELBO with the following properties:

  • Each datapoint is mapped to a cluster within a soft-clustering framework
  • The distances between the code and every cluster are explicitly computed and used during backpropagation
  • An entropy term prevents the trivial solution in which all datapoints are mapped to one cluster
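
The properties above can be illustrated with a minimal NumPy sketch. It is not the repository's code: the function name `soft_clustering_terms`, the diagonal Gaussian posterior, the unit-variance clusters, and the use of a KL divergence to a uniform categorical prior as the entropy term are all assumptions made for illustration, and the paper's exact ELBO may differ.

```python
import numpy as np

def soft_clustering_terms(q_c, z_mean, z_logvar, prior_means):
    """Per-sample regularisation terms of a categorical VAE (illustrative sketch).

    q_c:         (B, K) encoder probabilities q(c|x), rows sum to 1
    z_mean:      (B, d) mean of the Gaussian posterior q(z|x)
    z_logvar:    (B, d) log-variance of q(z|x)
    prior_means: (K, d) cluster means mu_c (unit variance assumed)
    """
    var = np.exp(z_logvar)[:, None, :]                      # (B, 1, d)
    diff2 = (z_mean[:, None, :] - prior_means[None]) ** 2   # (B, K, d)
    # KL( N(m, diag(var)) || N(mu_c, I) ), computed for every cluster c
    kl_z = 0.5 * np.sum(var + diff2 - 1.0 - z_logvar[:, None, :], axis=-1)
    # Soft clustering: every cluster contributes, weighted by q(c|x)
    soft_kl = np.sum(q_c * kl_z, axis=-1)
    # KL( q(c|x) || U{1..K} ) = log K - H(q(c|x)), the assumed entropy term
    K = q_c.shape[1]
    kl_c = np.sum(q_c * np.log(q_c + 1e-12), axis=-1) + np.log(K)
    return soft_kl, kl_c

# Toy usage: two clusters in a 2-d latent space, two datapoints.
prior_means = np.array([[2.0, 0.0], [0.0, 2.0]])
q_c = np.array([[1.0, 0.0], [0.5, 0.5]])
z_mean = np.array([[2.0, 0.0], [0.0, 0.0]])
z_logvar = np.zeros((2, 2))
soft_kl, kl_c = soft_clustering_terms(q_c, z_mean, z_logvar, prior_means)
```

In this toy example the first datapoint sits exactly on its cluster mean, so its soft-clustering term vanishes, while the second is equally far from both clusters and is penalised by both distances.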

InfoCatVAE: categorical VAE with information maximization

Objective: improve generation and regularize representation learning in InfoCatVAE using the learned classifier, based on the following idea:

The higher the mutual information between the sample and its category, the better the generation should be

Figure 2: square blocks represent neural networks, oval-shaped blocks represent sampling. Encoding and decoding blocks are shared with CatVAE presented in figure 1.

Mutual information has a tractable lower bound (see InfoGAN, Chen et al.), whose exact algorithmic transcription is shown in figure 2. The main idea is that each conditionally generated sample should be classified into the cluster it was generated from. This lower bound on the mutual information is added to the CatVAE ELBO.
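
For reference, the InfoGAN-style variational bound can be written as follows. The notation is an assumption chosen to match this README: q_φ(c|x) is the classifier part of the encoder, and p_θ(x|c) denotes conditional generation, i.e. sampling z from the prior of cluster c and decoding it.

```latex
I(c; x) \;\ge\; \mathbb{E}_{c \sim p(c)}\, \mathbb{E}_{x \sim p_\theta(x \mid c)}
\big[ \log q_\phi(c \mid x) \big] + H(c)
```

With a uniform p(c) over K categories, H(c) = log K is a constant, so maximizing the bound amounts to minimizing the cross-entropy between the sampled category and the classifier's prediction on the generated sample.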

Choice of the prior

Let d be the dimension of the latent space, with d = K·δ for some δ ∈ N. We choose a prior such that:

  • All clusters are equidistant and orthogonal in the latent space
  • Data with K categories should be encoded with a K-modal distribution

Hence we model class c in the latent space with N(z; μc, 1), where μc ∈ R^d and μc·μc′ = 0 for c ≠ c′. Drawing on subspace clustering assumptions, for all c ∈ {1...K} we propose a μc such that:

  • Each category lives mainly in a δ-dimensional subspace of Z
  • The categorical variable is modeled by p(c) = U({1...K})
  • This prior shape encourages the network to find discriminative representation of the data according to its most salient attribute
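
One way to build means with these properties is a block structure, sketched below. The helper `block_prior_means` and the constant `lam` (the value placed on the δ coordinates owned by each class) are hypothetical and chosen for illustration; the README does not state the exact values used.

```python
import numpy as np

def block_prior_means(K, delta, lam=1.0):
    """Return a (K, K*delta) array whose row c is a candidate mean mu_c.

    Row c equals `lam` on the delta coordinates of block c and 0 elsewhere,
    so distinct rows have disjoint supports and are therefore orthogonal.
    """
    d = K * delta
    means = np.zeros((K, d))
    for c in range(K):
        means[c, c * delta:(c + 1) * delta] = lam
    return means

means = block_prior_means(K=10, delta=2, lam=3.0)
gram = means @ means.T                         # pairwise dot products
off_diag = ~np.eye(10, dtype=bool)             # mask of distinct class pairs
dists = np.linalg.norm(means[:, None] - means[None, :], axis=-1)
```

With this construction every pair of distinct means has a zero dot product and the same Euclidean distance, lam·sqrt(2δ), which matches the equidistant and orthogonal requirements listed above.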

Optimization with categorical sampling layer

The Gumbel-softmax trick is a standard continuous relaxation for optimizing through categorical sampling. Several other discrete optimization tricks for backpropagating the gradient have been developed.
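
For readers unfamiliar with it, here is a minimal NumPy sketch of Gumbel-softmax sampling. It is shown only as background on the standard relaxation, not as part of InfoCatVAE; the function name and the temperature values are illustrative.

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed one-hot sample per row of `logits`."""
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))               # Gumbel(0, 1) noise
    y = (logits + gumbel) / temperature
    y = y - y.max(axis=-1, keepdims=True)      # stabilise the softmax
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = np.log(np.array([[0.7, 0.2, 0.1]]))
soft = gumbel_softmax_sample(logits, temperature=1.0, rng=rng)
sharp = gumbel_softmax_sample(logits, temperature=0.05, rng=rng)
```

Each output row is a point on the probability simplex. Lower temperatures push samples towards one-hot vectors, at the cost of higher-variance gradients in a differentiable framework.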

In InfoCatVAE:

  • The categorical representation is kept deterministic instead of random: for each x, all qφ(c|x) are computed and compared pairwise
  • The categorical sampling is performed in parallel during optimization, in the information maximization part

This means that the categorical representation is deterministic (CatVAE), under the condition that the representation remains coherent when sampled randomly (InfoCatVAE). It is a form of gradient-free (Monte-Carlo) optimization.
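
The sampling side of this scheme can be sketched as below, under our reading of the description above: the category is drawn from the fixed prior rather than from the encoder, so no gradient needs to pass through the categorical draw. The functions `info_max_loss` and `toy_classify`, the identity decoder, and the nearest-mean classifier are hypothetical stand-ins for the real networks.

```python
import numpy as np

def info_max_loss(prior_means, decode, classify, batch_size, rng):
    """Monte-Carlo estimate of -E[log q(c|x)] on conditionally generated x."""
    K, d = prior_means.shape
    c = rng.integers(0, K, size=batch_size)                     # c ~ U{1..K}
    z = prior_means[c] + rng.standard_normal((batch_size, d))   # z ~ N(mu_c, I)
    x = decode(z)                                               # conditional generation
    probs = classify(x)                                         # q(c|x), shape (B, K)
    picked = np.clip(probs[np.arange(batch_size), c], 1e-12, 1.0)
    return -np.mean(np.log(picked))

def toy_classify(x, means):
    """Stand-in classifier: softmax over negative squared distances to the means."""
    logits = -0.5 * ((x[:, None, :] - means[None]) ** 2).sum(axis=-1)
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
means = 5.0 * np.eye(3)                       # three well-separated clusters
loss = info_max_loss(means, lambda z: z, lambda x: toy_classify(x, means), 64, rng)
```

In a real model `decode` and `classify` would be the shared decoder and encoder networks, and this loss would be added to the CatVAE objective.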

Illustrative results


Fashion MNIST

Moving away from the latent space origin along the subspace of each discovered class


         title={InfoCatVAE: representation learning with categorical variational autoencoders},
         author={Pineau, Edouard and Lelarge, Marc},
         journal={arXiv preprint arXiv:1806.08240},


