awesome-implicit-neural-models

A collection of resources on implicit learning models, ranging from Neural ODEs to Equilibrium Networks, Differentiable Optimization Layers, and more.

"The crux of an implicit layer, is that instead of specifying how to compute the layer’s output from the input, we specify the conditions that we want the layer’s output to satisfy." cit. (NeurIPS 2020 Implicit Layers Tutorial)

NOTE: Feel free to suggest additions via Issues or Pull Requests.

For a comprehensive list of resources on the connections between differential equations and deep learning, please refer to awesome-neural-ode.

Table of Contents

  • Implicit Deep Learning
      • Neural Differential Equations
      • Deep Equilibrium Networks
      • Optimization Layers
  • Additional Material
      • Software and Libraries
      • Tutorials and Talks

Implicit Deep Learning

Neural Differential Equations

In Neural Differential Equations, the input-output mapping is realized by solving an initial value problem: the input determines the initial condition, and the output is read out from the solution at a terminal time. The learnable component is the vector field of the differential equation.

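As a rough illustration, the sketch below implements this idea in plain PyTorch with a toy fixed-step Euler discretization (the network sizes, integration interval and step count are arbitrary choices for the example; the libraries listed under Software and Libraries provide proper adaptive solvers and memory-efficient adjoints):

```python
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """Learnable vector field f_theta(t, x) defining dx/dt = f_theta(t, x)."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, x):
        return self.net(x)  # autonomous field for simplicity (t is unused)

class NeuralODE(nn.Module):
    """Maps the input (initial condition at t=0) to the ODE state at t=1."""
    def __init__(self, f, steps=20):
        super().__init__()
        self.f, self.steps = f, steps

    def forward(self, x):
        dt = 1.0 / self.steps
        t = 0.0
        for _ in range(self.steps):      # explicit Euler: x <- x + dt * f(t, x)
            x = x + dt * self.f(t, x)
            t += dt
        return x

model = NeuralODE(VectorField(dim=2))
y = model(torch.randn(16, 2))            # (batch, dim) -> (batch, dim)
```
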
  • Neural Ordinary Differential Equations (best paper award): NeurIPS18

We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions.

  • Dissecting Neural ODEs: NeurIPS20

Continuous deep learning architectures have recently re-emerged as Neural Ordinary Differential Equations (Neural ODEs). This infinite-depth approach theoretically bridges the gap between deep learning and dynamical systems, offering a novel perspective. However, deciphering the inner workings of these models is still an open challenge, as most applications apply them as generic black-box modules. In this work we "open the box", further developing the continuous-depth formulation with the aim of clarifying the influence of several design choices on the underlying dynamics.

  • Neural Controlled Differential Equations for Irregular Time Series (spotlight): NeurIPS20

Neural ordinary differential equations are an attractive option for modelling temporal dynamics. However, a fundamental issue is that the solution to an ordinary differential equation is determined by its initial condition, and there is no mechanism for adjusting the trajectory based on subsequent observations. Here, we demonstrate how this may be resolved through the well-understood mathematics of controlled differential equations. The resulting neural controlled differential equation model is directly applicable to the general setting of partially-observed irregularly-sampled multivariate time series, and (unlike previous work on this problem) it may utilise memory-efficient adjoint-based backpropagation even across observations. We demonstrate that our model achieves state-of-the-art performance against similar (ODE or RNN based) models in empirical studies on a range of datasets. Finally we provide theoretical results demonstrating universal approximation, and that our model subsumes alternative ODE models.

  • Scalable Gradients for Stochastic Differential Equations: AISTATS20

The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.

  • Discretize-Optimize vs. Optimize-Discretize for Time-Series Regression and Continuous Normalizing Flows: arXiv

We compare the discretize-optimize (Disc-Opt) and optimize-discretize (Opt-Disc) approaches for time-series regression and continuous normalizing flows (CNFs) using neural ODEs. Neural ODEs are ordinary differential equations (ODEs) with neural network components. Training a neural ODE is an optimal control problem where the weights are the controls and the hidden features are the states. Every training iteration involves solving an ODE forward and another backward in time, which can require large amounts of computation, time, and memory. Comparing the Opt-Disc and Disc-Opt approaches in image classification tasks, Gholami et al. (2019) suggest that Disc-Opt is preferable due to the guaranteed accuracy of gradients. In this paper, we extend the comparison to neural ODEs for time-series regression and CNFs. Unlike in classification, meaningful models in these tasks must also satisfy additional requirements beyond accurate final-time output, e.g., the invertibility of the CNF. Through our numerical experiments, we demonstrate that with careful numerical treatment, Disc-Opt methods can achieve similar performance as Opt-Disc at inference with drastically reduced training costs. Disc-Opt reduced costs in six out of seven separate problems with training time reduction ranging from 39% to 97%, and in one case, Disc-Opt reduced training from nine days to less than one day.

  • Bayesian Neural Ordinary Differential Equations: arXiv

Recently, Neural Ordinary Differential Equations has emerged as a powerful framework for modeling physical simulations without explicitly defining the ODEs governing the system, but learning them via machine learning. However, the question: “Can Bayesian learning frameworks be integrated with Neural ODE’s to robustly quantify the uncertainty in the weights of a Neural ODE?” remains unanswered. In an effort to address this question, we demonstrate the successful integration of Neural ODEs with two methods of Bayesian Inference: (a) The No-U-Turn MCMC sampler (NUTS) and (b) Stochastic Langevin Gradient Descent (SGLD). We test the performance of our Bayesian Neural ODE approach on classical physical systems, as well as on standard machine learning datasets like MNIST, using GPU acceleration. Finally, considering a simple example, we demonstrate the probabilistic identification of model specification in partially-described dynamical systems using universal ordinary differential equations. Together, this gives a scientific machine learning tool for probabilistic estimation of epistemic uncertainties.

Deep Equilibrium Networks

In Equilibrium Models, the output of the model is defined as a fixed point of some learnable transformation (e.g. a discrete-time map), which typically depends explicitly on the input.

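The sketch below illustrates the idea in PyTorch, assuming a naive fixed-point iteration for the forward pass and a crude "one extra step" gradient in place of exact implicit differentiation (the papers below instead solve for the equilibrium with root-finding and differentiate through it via the implicit function theorem):

```python
import torch
import torch.nn as nn

class DEQLayer(nn.Module):
    """Output z* satisfies z* = f_theta(z*, x): a fixed point of a learnable map."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(2 * dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def step(self, z, x):
        return self.f(torch.cat([z, x], dim=-1))

    def forward(self, x, iters=50):
        z = torch.zeros_like(x)
        with torch.no_grad():                 # run the fixed-point solver off the autograd tape
            for _ in range(iters):
                z = self.step(z, x)           # naive fixed-point iteration
        z = self.step(z, x)                   # one differentiable step at the equilibrium
        return z                              # crude surrogate for exact implicit differentiation

layer = DEQLayer(dim=2)
out = layer(torch.randn(8, 2))
```
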
  • Deep Equilibrium Models: NeurIPS19

We present a new approach to modeling sequential data: the deep equilibrium model (DEQ). Motivated by an observation that the hidden layers of many existing deep sequence models converge towards some fixed point, we propose the DEQ approach that directly finds these equilibrium points via root-finding.

  • Multiscale Deep Equilibrium Models: NeurIPS20

  • Monotone Operator Equilibrium Networks: NeurIPS20

  • Lipschitz Bounded Equilibrium Networks: arXiv

  • Implicit Deep Learning: arXiv

  • Algorithmic Differentiation of a Complex C++ Code with Underlying Libraries (an AD system for C++ with DEQ-like adjoints by default on PETSc): Paper

Optimization Layers

In Optimization Layers, the forward pass amounts to solving an optimization problem: the layer's output is the minimizer (or maximizer) of a cost function that depends on the layer's input and learnable parameters, and gradients are obtained by differentiating through the optimality conditions.

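For instance, a minimal sketch using the cvxpylayers library listed under Software and Libraries (the small non-negative least-absolute-deviations problem and the tensor shapes are purely illustrative, following the library's documented usage):

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

n, m = 2, 3
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
# the layer's output is the argmin of this parametrized convex problem
problem = cp.Problem(cp.Minimize(cp.pnorm(A @ x - b, p=1)), [x >= 0])

layer = CvxpyLayer(problem, parameters=[A, b], variables=[x])

A_t = torch.randn(m, n, requires_grad=True)
b_t = torch.randn(m, requires_grad=True)
solution, = layer(A_t, b_t)       # forward pass: solve the optimization problem
solution.sum().backward()         # backward pass: differentiate through the argmin
```
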
  • OptNet: Differentiable Optimization as a Layer in Neural Networks: ICML17

  • Input Convex Neural Networks: ICML17

  • Differentiable MPC for End-to-end Planning and Control: NeurIPS18

  • Differentiable Convex Optimization Layers: NeurIPS19

  • Differentiable Implicit Layers: NeurIPS20

In this paper, we introduce an efficient backpropagation scheme for non-constrained implicit functions. These functions are parametrized by a set of learnable weights and may optionally depend on some input, making them perfectly suitable as a learnable layer in a neural network. We demonstrate our scheme on different applications: (i) neural ODEs with the implicit Euler method, and (ii) system identification in model predictive control.

Additional Material

Software and Libraries

Neural ODEs

  • torchdiffeq Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation: repo (see the usage sketch below)
  • torchdyn PyTorch library for all things neural differential equations: repo, docs
  • torchsde Stochastic differential equation (SDE) solvers with GPU support and efficient sensitivity analysis: repo
  • torchcde GPU-capable solvers for controlled differential equations (CDEs): repo
  • DifferentialEquations.jl is a set of ODE/SDE/DAE/DDE/jump/etc. solvers with GPU and distributed computing support, event handling, along with O(1) memory adjoints and stabilized versions for stiff and partial differential equations: repo, docs
  • DiffEqFlux.jl is a companion library to DifferentialEquations.jl which includes common implicit layer models and tooling such as collocation schemes for building complex loss functions: repo, docs

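For example, a minimal usage sketch of torchdiffeq's odeint interface (the network architecture, shapes and loss here are illustrative assumptions, not part of the library):

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint   # O(1)-memory adjoint backpropagation

class ODEFunc(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(), nn.Linear(64, dim))

    def forward(self, t, y):              # dy/dt = f_theta(t, y)
        return self.net(y)

func = ODEFunc(dim=2)
y0 = torch.randn(16, 2)                   # initial condition (batch, dim)
t = torch.linspace(0.0, 1.0, 10)          # times at which to return the solution
yt = odeint(func, y0, t)                  # shape (len(t), batch, dim); differentiable
loss = yt[-1].pow(2).mean()
loss.backward()                           # gradients w.r.t. the vector field's weights
```
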
Deep Equilibrium Models

  • deq This repository contains the code for the deep equilibrium (DEQ) model, an implicit-depth architecture: repo
  • deq-jax JAX implementation of the deep equilibrium (DEQ) model: repo
  • DifferentialEquations.jl SteadyStateProblem is a differentiable solver for steady states of differential equations: repo

Optimization

  • mpc.pytorch A fast and differentiable model predictive control solver for PyTorch: repo, docs
  • cvxpylayers Differentiable convex optimization layers in PyTorch and TensorFlow using CVXPY: repo

Tutorials and Talks

  • NeurIPS20 Tutorial: Deep Implicit Layers - Neural ODEs, Deep Equilibrium Models, and Beyond: website
  • JuliaCon 2019: Neural Ordinary Differential Equations with DiffEqFlux | Jesse Bettencourt: youtube
