[BUG] Ensembling multiple trajectories with PDELibrary and WeakPDELibrary #148

znicolaou · 2022-01-10T23:13:54Z

The PDELibrary and WeakPDELibrary libraries need to reshape the data appropriately to take derivatives and integrals along specific axes when calculating the library matrix. When using multiple_trajectories=True in the SINDy.fit function, we need to inform the library that multiple trajectories are present so the reshaping can be done correctly. I implemented a fix by adding a self.num_trajectories variable to the libraries, which is set in the SINDy.fit function when multiple_trajectories=True. This works when ensemble=False in the SINDy.fit function, but I have not yet implemented the ensembling case. The subset selection should probably be done independently on each trajectory in the multiple_trajectories=True case.

akaptano · 2022-01-11T00:29:43Z

Dang, I thought I had covered this. Could you post an example that produces an error? Thanks for looking into this!

znicolaou · 2022-01-11T00:41:23Z

No problem! I was playing with some experimental data, but I'll put together a simpler example that we can work off of and post it here in a bit. I think it won't take much to get the ensembling working--there just needs to be some additional things done in the case of multiple_trajectories in there somewhere.

znicolaou · 2022-01-11T03:46:53Z

Here's a short example for testing.

integrator_keywords = {}
integrator_keywords['rtol'] = 1e-12
integrator_keywords['method'] = 'LSODA'
integrator_keywords['atol'] = 1e-12

def diffuse (t, u, dx, nx):
    u = np.reshape(u, nx)
    du = ps.differentiation.SpectralDerivative(d=2, axis=0)._differentiate(u, dx)
    return np.reshape(u * du, nx)

N = 200
h0 = 1.0
L = 5
T = 1

t = np.linspace(0, T, N)
x = np.arange(0, N) * L / N

ep = 0.5 * h0
y0 = np.reshape(h0 + ep * np.cos(4 * np.pi / L * x) 
                + ep * np.cos(2 * np.pi / L * x), N)
y1 = np.reshape(h0 + ep * np.cos(4 * np.pi / L * x) 
                - ep * np.cos(2 * np.pi / L * x), N)
dx = x[1] - x[0]

sol1 = solve_ivp(diffuse, (t[0], t[-1]), 
                y0=y0, t_eval=t, args=(dx, N),
                **integrator_keywords)
sol2 = solve_ivp(diffuse, (t[0], t[-1]), 
                y0=y1, t_eval=t, args=(dx, N),
                **integrator_keywords)

u = [np.reshape(sol1.y, (N, N, 1)),np.reshape(sol2.y, (N, N, 1))]


library_functions = [lambda x: x]
library_function_names = [lambda x: x]

pde_lib = ps.PDELibrary(
    library_functions=library_functions,
    function_names=library_function_names,
    derivative_order=2,
    spatial_grid=x,
    is_uniform=True,
)

X, T = np.meshgrid(x, t, indexing='ij')
XT = np.transpose([X, T], [1, 2, 0])

weak_lib = ps.WeakPDELibrary(
    library_functions=library_functions,
    function_names=library_function_names,
    derivative_order=2,
    spatiotemporal_grid=XT,
    K=100,
    is_uniform=False,
    num_pts_per_domain=30,
)

np.random.seed(100)

#works fine
optimizer = ps.STLSQ(threshold=0.1, alpha=1e-5, normalize_columns=False)
model = ps.SINDy(feature_library=pde_lib, optimizer=optimizer, feature_names=['u'])
model.fit(u, multiple_trajectories=True, t=t, ensemble=False)
model.print()

#works fine
optimizer = ps.STLSQ(threshold=0.1, alpha=1e-5, normalize_columns=False)
model = ps.SINDy(feature_library=weak_lib, optimizer=optimizer, feature_names=['u'])
model.fit(u, multiple_trajectories=True, t=t, ensemble=False)
model.print()

#Runs poorly
optimizer = ps.STLSQ(threshold=0.1, alpha=1e-5, normalize_columns=False)
model = ps.SINDy(feature_library=pde_lib, optimizer=optimizer, feature_names=['u'])
model.fit(u, multiple_trajectories=True, t=t, ensemble=True, n_subset=len(t))
model.print()

#ValueError: cannot reshape array of size 80000 into shape (200,200,1)
optimizer = ps.STLSQ(threshold=0.1, alpha=1e-5, normalize_columns=False)
model = ps.SINDy(feature_library=weak_lib, optimizer=optimizer, feature_names=['u'])
model.fit(u, multiple_trajectories=True, t=t, ensemble=True, n_subset=len(t))
model.print()

#AttributeError: 'list' object has no attribute 'shape'
optimizer = ps.STLSQ(threshold=0.1, alpha=1e-5, normalize_columns=False)
model = ps.SINDy(feature_library=pde_lib, optimizer=optimizer, feature_names=['u'])
model.fit(u, multiple_trajectories=True, t=t, ensemble=True)
model.print()

#AttributeError: 'list' object has no attribute 'shape'
optimizer = ps.STLSQ(threshold=0.1, alpha=1e-5, normalize_columns=False)
model = ps.SINDy(feature_library=weak_lib, optimizer=optimizer, feature_names=['u'])
model.fit(u, multiple_trajectories=True, t=t, ensemble=True)
model.print()

The fit works for multiple_trajectories=True, ensemble=False but not for multiple_trajectories=True, ensemble=True. I think the fix should not be too complicated--around line 440 of pysindy.py, there should be some if multiple_trajectories statements that loop over trajectories and do the ensembling on each trajectory.

akaptano · 2022-01-11T03:59:40Z

Cool, I'm happy to take a look at this tomorrow!

akaptano · 2022-01-11T17:24:07Z

Did you already make changes to the code? Even the first model throws an error for me

znicolaou · 2022-01-11T17:45:42Z

Oh yeah, sorry that was totally unclear. I did push a couple changes to master yesterday to implement the multiple_trajectories=True, ensemble=False cases. The old code throws the errors because it previously didn't distinguish the spatiotemporal data for multiple trajectories and ran into reshaping errors when trying to take derivatives/integrals. So I added variables self.num_trajectories to the WeakPDELibrary and PDELibrary classes to store that information when pysindy.fit is invoked, and then looped over each trajectory in the transform functions.

akaptano · 2022-01-11T20:17:58Z

Got this working... I'll push the changes in a moment but my guess is there will be some weird edge cases left to find at some point. Added your example script to the tests in test_pysindy.py.

akaptano · 2022-01-11T20:29:33Z

Some notes on the changes:

The weak formulation is really tricky. Basically, in order to get ensembling + multiple trajectories using the weak form, you would need to duplicate the temporal_grid several times and make this the new, longer temporal_grid. This doesn't work because the grid is required to be strictly ascending for the interpolation (also it makes the full spatiotemporal grid much larger). Instead, I settled on sub-sampling each of the trajectories on the normal temporal_grid, and currently each of the trajectories gets sub-sampled in the same way. In other words, if time indices [1, 10, 3] are chosen, trajectories 1, 2, 3, ... all only use these three indices and are stitched together.
It seems like there is a better way to do the num_trajectories thing in the PDE libraries, but I'm not sure what it is.

znicolaou · 2022-01-11T20:39:51Z

Nice--thanks for looking so quickly! I agree that we may need to test a bit to discover other potential issues, and there are probably cleaner solutions for both PDELibrary and WeakPDELibrary.

In the PDELibrary case, part of the challenge is that it's possible to deduce num_trajectories*num_time = n_sample // np.product(self.grid_dims), but not num_trajectories and num_time independently as far as I can tell. I'll think about whether there's another option too.

I'll keep testing on this issue as I use the package, so we may as well keep it open for now and post updates as we see them.

akaptano · 2022-01-11T22:22:24Z

Well it compiled fine on my computer but this is a bizarre (and new?) error while running the linting tests online, although looks like it has to do with some git backend. I'll take a look at this in a bit

akaptano · 2022-01-11T22:27:32Z

apparently today is the day where GitHub finally updates its security protocols https://github.blog/2021-09-01-improving-git-protocol-security-github/#when-are-these-changes-effective. I'll see if I can fix this for the repo.

znicolaou · 2022-01-11T22:35:58Z

Ah, confusing. We'll figure it out...

akaptano · 2022-01-11T23:34:33Z

Okay fixed that and think I fixed this bug so closing this for now. Feel free to open again when we invariably find some more edge cases. Hopefully this is good enough for your coarse grained models for now!

* deleted the prediction_interval_plots since it was redundant * minor typos

znicolaou added the bug Something isn't working label Jan 10, 2022

znicolaou self-assigned this Jan 10, 2022

akaptano closed this as completed Jan 11, 2022

jpcurbelo pushed a commit to jpcurbelo/pysindy_fork that referenced this issue May 9, 2024

Tackling issue147 (dynamicslab#148)

faac5aa

* deleted the prediction_interval_plots since it was redundant * minor typos

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Ensembling multiple trajectories with PDELibrary and WeakPDELibrary #148

[BUG] Ensembling multiple trajectories with PDELibrary and WeakPDELibrary #148

znicolaou commented Jan 10, 2022

akaptano commented Jan 11, 2022

znicolaou commented Jan 11, 2022

znicolaou commented Jan 11, 2022

akaptano commented Jan 11, 2022

akaptano commented Jan 11, 2022

znicolaou commented Jan 11, 2022

akaptano commented Jan 11, 2022

akaptano commented Jan 11, 2022

znicolaou commented Jan 11, 2022

akaptano commented Jan 11, 2022 •

edited

Loading

akaptano commented Jan 11, 2022

znicolaou commented Jan 11, 2022

akaptano commented Jan 11, 2022

[BUG] Ensembling multiple trajectories with PDELibrary and WeakPDELibrary #148

[BUG] Ensembling multiple trajectories with PDELibrary and WeakPDELibrary #148

Comments

znicolaou commented Jan 10, 2022

akaptano commented Jan 11, 2022

znicolaou commented Jan 11, 2022

znicolaou commented Jan 11, 2022

akaptano commented Jan 11, 2022

akaptano commented Jan 11, 2022

znicolaou commented Jan 11, 2022

akaptano commented Jan 11, 2022

akaptano commented Jan 11, 2022

znicolaou commented Jan 11, 2022

akaptano commented Jan 11, 2022 • edited Loading

akaptano commented Jan 11, 2022

znicolaou commented Jan 11, 2022

akaptano commented Jan 11, 2022

akaptano commented Jan 11, 2022 •

edited

Loading