A Julia/C++ implementation and numerical examples of adjoint subgradient evaluation for convex ODE relaxations, as described by Zhang and Khan (2024).


continuous-convex-adjoints

Sensitivity analysis of convex relaxations is useful in typical convex minimization methods for global optimization. Consider a nonlinear dynamic optimization problem with an embedded system of parametric ordinary differential equations (ODEs), as formulated below. Here $\mathbf{p} \in P$ denotes the system parameters, and $\mathbf{x}$ denotes the state variables, which solve an ODE system over the time horizon $I := [t_0,t_f]$. $J: \mathbb{R}^{n_p}\rightarrow\mathbb{R}$ is an objective function based on a cost function $g: I \times P \times \mathbb{R}^{n_x} \rightarrow \mathbb{R}$.

$$ \begin{align*} \min_{\mathbf{p} \in P} \quad & J(\mathbf{p}) := g(t_f,\mathbf{p},\mathbf{x}(t_f,\mathbf{p}))\\ \mathrm{s.t.}\quad &\dot{\mathbf{x}}(t,\mathbf{p})=\mathbf{f}(t,\mathbf{p},\mathbf{x}),\quad \forall t\in(t_0,t_f],\\ &\mathbf{x}(t_0,\mathbf{p})=\mathbf{x}_0(\mathbf{p}). \end{align*} $$

To construct convex relaxations of the solutions of this parametric ODE system, we consider the generalized McCormick relaxations developed by Scott and Barton (2013). These relaxations of the ODE solutions, known as state relaxations, solve the following auxiliary parametric ODE system:

$$ \begin{align*} &\dot{\mathbf{x}}^{\mathrm{cv}}(t,\mathbf{p}) = \mathbf{f}^{\mathrm{cv}}(t,\mathbf{p},\mathbf{x}^{\mathrm{cv}}(t,\mathbf{p}),\mathbf{x}^{\mathrm{cc}}(t,\mathbf{p})), & \mathbf{x}^{\mathrm{cv}}(t_0,\mathbf{p})&=\mathbf{x}_0^{\mathrm{cv}} (\mathbf{p}), \\ &\dot{\mathbf{x}}^{\mathrm{cc}}(t,\mathbf{p}) = \mathbf{f}^{\mathrm{cc}}(t,\mathbf{p},\mathbf{x}^{\mathrm{cv}}(t,\mathbf{p}),\mathbf{x}^{\mathrm{cc}}(t,\mathbf{p})), & \mathbf{x}^{\mathrm{cc}}(t_0,\mathbf{p})&=\mathbf{x}_0^{\mathrm{cc}} (\mathbf{p}). \end{align*} $$

Given Scott and Barton's state relaxations, we consider the following convex relaxation of the nonlinear dynamic optimization problem:

$$ \begin{align*} \min_{\mathbf{p}\in P} \quad & J^{\mathrm{cv}}(\mathbf{p}) := g^{\mathrm{cv}}(t_f,\mathbf{p},\mathbf{x}^{\mathrm{cv}}(t_f,\mathbf{p}),\mathbf{x}^{\mathrm{cc}}(t_f,\mathbf{p})), \end{align*} $$

where $J^{\mathrm{cv}}$ is a convex relaxation of $J$, constructed by composing a generalized McCormick relaxation (Scott and Barton, 2013) $g^{\mathrm{cv}}$ of $g$ with state relaxations $(\mathbf{x}^{\mathrm{cv}},\mathbf{x}^{\mathrm{cc}})$.

This repository contains a proof-of-concept implementation of a new adjoint subgradient evaluation approach (Song and Khan, 2023), in which subgradients of $J^{\mathrm{cv}}$ can be evaluated without constructing full generalized derivatives of Scott and Barton's relaxations. It also contains our C++ and Julia code for all numerical examples in our accompanying article:

Y. Zhang and K.A. Khan, Evaluating subgradients for convex relaxations of dynamic process models by adapting current tools, Computers & Chemical Engineering, 180:108462 (2024). doi:10.1016/j.compchemeng.2023.108462

This implementation was developed by Yulan Zhang in C++ and Julia. This repository is tied to the accompanying article, and will not be updated except for bug fixes. If you make use of this code, please cite our article as above.

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) under Grant RGPIN-2017-05944.

Dependencies

Method outline

Song and Khan (2023) describe a new subgradient evaluation framework for the two established state relaxations (Scott and Barton, 2013; Song and Khan, 2022). This framework computes state relaxation subgradients by solving an auxiliary ODE system that resembles classical "forward" sensitivity analysis results for smooth ODEs. For Scott and Barton's relaxations, this subgradient ODE system may be reformulated as an adjoint sensitivity system of the form:

$$ \begin{align*} &\dot{\boldsymbol{\lambda}}(t) = -[\Theta_\mathrm{A}(t,\tilde{\mathbf{p}})]^{\mathrm{T}}\,\boldsymbol{\lambda}(t), &\boldsymbol{\lambda}(t_f) = \boldsymbol{\rho}. \end{align*} $$

Then, a subgradient $\tilde{\mathbf{s}}$ of $J^{\mathrm{cv}}$ at $\tilde{\mathbf{p}}$ is obtained as:

$$ \begin{align*} \tilde{\mathbf{s}} = [\mathbf{S}^{\mathrm{C}}_0]^\mathrm{T}\boldsymbol{\lambda}(t_0) + \boldsymbol{\rho} + \int_{t_0}^{t_f} [\Theta_{\mathrm{B}}(t,\tilde{\mathbf{p}})]^{\mathrm{T}} \boldsymbol{\lambda}(t)\,\mathrm{d}t \in \partial J^{\mathrm{cv}}(\tilde{\mathbf{p}}). \end{align*} $$

The final quadrature term can be integrated simultaneously with the adjoint ODE system. To construct this adjoint subgradient evaluation system, the subgradient propagation coefficients $(\Theta_\mathrm{A}, \Theta_\mathrm{B})$ must be computed; these may be obtained using either the forward-mode automatic differentiation (AD) procedure for subgradient computation (Mitsos et al., 2009) or the corresponding reverse-mode AD (Beckers et al., 2012).
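As a rough illustration of the structure of this computation, the following sketch (not the repository's implementation) integrates the adjoint ODE backward in time with a fixed-step explicit Euler scheme while accumulating the quadrature term, and then assembles a subgradient. The names `ThetaA`, `ThetaB`, `S0C`, `rho`, and `lambda_tf` are hypothetical placeholders for quantities produced by subgradient AD of $\mathbf{f}^{\mathrm{cv}}/\mathbf{f}^{\mathrm{cc}}$, $\mathbf{x}_0^{\mathrm{cv}}/\mathbf{x}_0^{\mathrm{cc}}$, and $g^{\mathrm{cv}}$; by analogy with classical adjoint sensitivity analysis, the terminal condition `lambda_tf` and the additive term `rho` are taken here to be the state-relaxation and parameter components, respectively, of a subgradient of $g^{\mathrm{cv}}$. A production implementation would use an adaptive stiff integrator instead of the fixed-step loop below.

```cpp
// Minimal sketch: integrate lambda'(t) = -Theta_A(t)^T lambda(t) backward from
// t_f with lambda(t_f) = lambda_tf, accumulate q = \int_{t0}^{tf} Theta_B(t)^T lambda(t) dt
// along the way, and assemble s = S0C^T lambda(t0) + rho + q.
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense, row-major: A[i][j]

// Returns y = A^T x, where A is m-by-n and x has length m.
static Vec transposeTimes(const Mat& A, const Vec& x) {
    Vec y(A.empty() ? 0 : A.front().size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < A[i].size(); ++j)
            y[j] += A[i][j] * x[i];
    return y;
}

Vec adjointSubgradient(
    const std::function<Mat(double)>& ThetaA,  // (2*nx) x (2*nx), evaluated at time t
    const std::function<Mat(double)>& ThetaB,  // (2*nx) x np,     evaluated at time t
    const Mat& S0C,                            // (2*nx) x np, from relaxing x0(p)
    const Vec& lambda_tf,                      // terminal adjoint condition, length 2*nx
    const Vec& rho,                            // parameter component of the g^cv subgradient, length np
    double t0, double tf, int nSteps)
{
    const double h = (tf - t0) / nSteps;
    Vec lambda = lambda_tf;         // adjoint state, updated as we march backward
    Vec q(rho.size(), 0.0);         // running quadrature term

    for (int k = 0; k < nSteps; ++k) {
        const double t = tf - k * h;
        const Vec ATl = transposeTimes(ThetaA(t), lambda);  // Theta_A^T lambda
        const Vec BTl = transposeTimes(ThetaB(t), lambda);  // Theta_B^T lambda
        for (std::size_t j = 0; j < q.size(); ++j)
            q[j] += h * BTl[j];                              // quadrature, integrated simultaneously
        for (std::size_t i = 0; i < lambda.size(); ++i)
            lambda[i] += h * ATl[i];   // explicit Euler step backward in time:
                                       // lambda(t-h) ~= lambda(t) + h * Theta_A^T lambda(t)
    }

    // Assemble the subgradient s = S0C^T lambda(t0) + rho + q.
    Vec s = transposeTimes(S0C, lambda);
    for (std::size_t j = 0; j < s.size(); ++j)
        s[j] += rho[j] + q[j];
    return s;
}
```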

Implementation contents

The forward-mode subgradient AD (Mitsos et al., 2009) can be employed directly using MC++, while the reverse-mode AD (Beckers et al., 2012) is implemented with our own differentiation and code-generation tools.
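For reference, forward-mode subgradient propagation with MC++ typically looks like the following minimal sketch. The type and member names used here (`mc::McCormick<mc::Interval>`, `sub`, `cv`, `cc`, `cvsub`, `ccsub`) reflect our understanding of MC++'s McCormick interface; consult the MC++ headers for the exact API.

```cpp
// Sketch of forward-mode subgradient AD with MC++ (Mitsos et al., 2009):
// propagate bounds, convex/concave relaxation values, and subgradients
// through a simple factorable function of two parameters on a box.
#include <iostream>
#include "interval.hpp"
#include "mccormick.hpp"

typedef mc::Interval I;
typedef mc::McCormick<I> MC;

int main()
{
    // Parameters p1 in [-2,1] and p2 in [0,3], relaxed at the point (0.5, 1.0).
    MC p1( I(-2., 1.), 0.5 );
    MC p2( I( 0., 3.), 1.0 );

    // Seed subgradients: 2 subgradient components; p1 is component 0, p2 is component 1.
    p1.sub( 2, 0 );
    p2.sub( 2, 1 );

    // MC++ overloads the arithmetic operations, so relaxations and their
    // subgradients are propagated through each elemental operation.
    MC f = p1 * exp( -p2 ) + sqr( p2 );

    std::cout << "bounds: [" << f.l() << ", " << f.u() << "]\n";
    std::cout << "cv, cc: " << f.cv() << ", " << f.cc() << "\n";
    std::cout << "cv subgradient: (" << f.cvsub(0) << ", " << f.cvsub(1) << ")\n";
    std::cout << "cc subgradient: (" << f.ccsub(0) << ", " << f.ccsub(1) << ")\n";
    return 0;
}
```

Applying this kind of propagation to $\mathbf{f}^{\mathrm{cv}}/\mathbf{f}^{\mathrm{cc}}$ along the state relaxation trajectories is what supplies the coefficients $(\Theta_\mathrm{A}, \Theta_\mathrm{B})$ used by the adjoint system above.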

For the reverse-mode subgradient AD (Beckers et al., 2012), the /src/ folder provides the ReverseADforVW module, the computational graph generation tool CompGraphs.jl, and RevMcCormick, a variant of MC++'s McCormick class. For any user-defined factorable function that is compatible with MC++, the ReverseADforVW module in ReverseADforVW.jl automatically constructs the function's computational graph by adapting our computational graph generation module CompGraphs in CompGraphs.jl, and then generates the corresponding C++ code. The RevMcCormick class in revmccormick.hpp stores the propagated subgradient values during the reverse AD sweep.

The /examples/ folder contains our code for four numerical examples in the accompanying manuscript. The /MATLAB/ subfolder in each example folder contains the data generated in C++ and the code for plotting, as well as our code for approximating subgradients using finite differences. Note that this implementation itself can construct Scott and Barton's relaxations and adjoint subgradient evaluation systems automatically for any nonlinear dynamic optimization problem with an embedded system of parametric ODEs.

Example 1

This C++ implementation calculates subgradients for an objective function in a lower-bounding problem, with the adjoint system constructed using forward-mode subgradient AD. For comparison, the subgradients are approximated at the same parameter values by finite differences in MATLAB. This example is adapted from Scott and Barton (2013); the comparison plot is generated by the MATLAB scripts.
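For intuition, the finite-difference check amounts to perturbing one parameter component at a time and differencing the relaxed objective. Below is a minimal sketch of that check (written in C++ here for consistency with the other snippets, whereas the repository performs this step in MATLAB); `jcv` is a hypothetical callable that evaluates $J^{\mathrm{cv}}$ at a parameter vector, e.g. by integrating the state relaxation ODEs.

```cpp
// Componentwise central-difference approximation of a subgradient of J^cv at p.
// "jcv" is a hypothetical stand-in for the relaxed objective; central differences
// only recover a subgradient where J^cv happens to be differentiable.
#include <cstddef>
#include <functional>
#include <vector>

std::vector<double> finiteDifferenceSubgradient(
    const std::function<double(const std::vector<double>&)>& jcv,
    std::vector<double> p,          // passed by value so it can be perturbed in place
    double h = 1e-6)
{
    std::vector<double> s(p.size());
    for (std::size_t i = 0; i < p.size(); ++i) {
        const double pi = p[i];
        p[i] = pi + h;  const double plus  = jcv(p);
        p[i] = pi - h;  const double minus = jcv(p);
        p[i] = pi;                                   // restore the original value
        s[i] = (plus - minus) / (2.0 * h);           // central difference in component i
    }
    return s;
}
```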


Example 2

This C++ implementation calculates subgradients for an objective function in a lower-bounding problem, with the adjoint system constructed using reverse-mode subgradient AD. The subgradients are again compared against finite-difference approximations computed in MATLAB, which also generates the comparison plot.


Example 3

This implementation constructs both the forward subgradient evaluation system (Song and Khan, 2023) and the adjoint subgradient system for the same lower-bounding problem, allowing a comparison of CPU times (the results for ten runs can be found in CPUtime.xlsx). In constructing the adjoint subgradient system, we separately apply forward-mode subgradient AD (Mitsos et al., 2009) and reverse-mode subgradient AD (Beckers et al., 2012). We also estimate the subgradients via finite-difference approximation in MATLAB, which generates the comparison plot.


Example 4

To compare how the CPU time for evaluating a subgradient of the objective function scales with the number of parameters, this implementation calculates subgradients using both the adjoint subgradient evaluation system and the forward subgradient evaluation system (Song and Khan, 2023). The results for ten runs can be found in CPUtime.xlsx.

References

  • Beckers, M., Mosenkis, V., Naumann, U., 2012. Adjoint mode computation of subgradients for McCormick relaxations, in: Recent Advances in Algorithmic Differentiation. Springer, pp. 103–113.

  • Chachuat, B., 2001. MC++: a toolkit for bounding factorable functions. Available at: https://github.com/coin-or/MCpp.

  • Hindmarsh, A.C., Brown, P.N., Grant, K.E., Lee, S.L., Serban, R., Shumaker, D.E., Woodward, C.S., 2005. SUNDIALS: suite of nonlinear and differential/algebraic equation solvers. ACM Transactions on Mathematical Software (TOMS) 31, 363–396.

  • Mitsos, A., Chachuat, B., Barton, P.I., 2009. McCormick-based relaxations of algorithms. SIAM Journal on Optimization 20, 573–601.

  • Scott, J.K., Barton, P.I., 2013. Improved relaxations for the parametric solutions of ODEs using differential inequalities. Journal of Global Optimization 57, 143–176.

  • Song, Y., Khan, K.A., 2022. Optimization-based convex relaxations for nonconvex parametric systems of ordinary differential equations. Mathematical Programming 196, 521–565.

  • Song, Y., Khan, K.A., 2023. Computing subgradients of convex relaxations for the solutions of parametric systems of ordinary differential equations. Under review.
