New benchmarking examples #266

Merged
merged 91 commits into from
Jan 16, 2023
Conversation

Collaborator

@akaptano akaptano commented Nov 30, 2022

Hi all,

Lanyue Zhang and I have been working on getting a big benchmark paper + example done, and I think the branch is pretty close to being ready for a merge.

Almost all of the changes are in the examples/16_noise_robustness/ folder. There are three Jupyter notebooks there, and a utils.py file for some extra functions. One of the notebooks reproduces all the figures in the paper, another does some visualization and generates plots, and the last performs an example hyperparameter scan using the ensembling and other fancy functionality.

I also changed gitlab -> github in the .pre-commit file, since it looks like the issue was that flake8 moved from gitlab to github very recently (https://jira.mongodb.org/browse/PYTHON-3531). This appears to have resolved the issue. However, there is now a pytest issue related to sphinx and/or ipython (jupyter/nbconvert#528) and I'm not sure how to fix this.

OliviaZ0826 and others added 30 commits February 18, 2022 23:31
Implemented 1D and multiple trajectories cases and wrote corresponding tests.
Raise an error if the inputs for PDE cases contain a tuple, and write corresponding tests. Modified conftest.py: return additional t in data_3d_random_pde and data_5d_random_pde
Log-normal contour plot of RMSE between chaotic level and noise level for X and X dot.
…and avoids integrating the models for speed. It is now quite fast, and the plan is to use this to look at syntactical complexity.
… with syntax so far, which is surprising, but need to make sure it is plotting properly.
…ngs. Made a new script that loops over all the polynomial systems in the dysts database, and fits them (without noise). Also wrote a script to load in all the ODE functions as strings, and parse them to extract all the model coefficients. Needs some tuning, but so far looks like it is performing well compared to the true coefficients.
@znicolaou
Collaborator

This example looks really great to me!

I dug into details a little bit, and pushed a small change just now. These are just suggestions--feel free to revert anything back to the last version if you like. I think it is ready for merge with master when you are.

The main change I made was to speed up the Pareto sweeps by calculating x_dot_test and the library .transform() matrices, which I called mats, just once before the sweep rather than for each value of the hyperparameters. The model predictions are then calculated as x_dot_test_pred = [coef_new[j].dot(mat.T).T for mat in mats], where coef_new = np.array(optimizer.coef_list). This hack reproduces the model.predict functionality but allows us to recycle the mats, which is especially useful in the weak case, since it is relatively expensive to calculate the same weak integrals many times. It may be helpful in practice, since the Pareto front scans can take a while, and the hyperparameter_scan functions in utils.py can be used in other cases.
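The precompute-and-reuse idea described above can be sketched with plain NumPy. This is only an illustration under stated assumptions: `toy_library` is a hypothetical stand-in for a feature library's `.transform()`, and the random `coef_new` array stands in for `np.array(optimizer.coef_list)`; neither is taken from utils.py.

```python
import numpy as np

rng = np.random.default_rng(0)


def toy_library(x):
    """Hypothetical stand-in for a library transform: features [1, x, y, x*y]."""
    return np.column_stack(
        [np.ones(len(x)), x[:, 0], x[:, 1], x[:, 0] * x[:, 1]]
    )


# Two test trajectories with two state variables each (made-up data).
x_test = [rng.normal(size=(50, 2)), rng.normal(size=(30, 2))]

# Build the feature matrices once, before the sweep.
mats = [toy_library(x) for x in x_test]

# Assumed shape of the stacked coefficients:
# (n_candidates, n_targets, n_features).
coef_new = rng.normal(size=(5, 2, 4))

# Inside the sweep, each candidate model reuses the same matrices.
preds = []
for j in range(coef_new.shape[0]):
    x_dot_test_pred = [coef_new[j].dot(mat.T).T for mat in mats]
    preds.append(x_dot_test_pred)

# Each prediction has shape (n_samples, n_targets).
print(preds[0][0].shape, preds[0][1].shape)
```

The expression `coef_new[j].dot(mat.T).T` is the same product as `mat @ coef_new[j].T`, so the only cost repeated per hyperparameter value is one matrix multiplication per trajectory.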

I also included an option strong_rmse in Pareto_scan_ensembling, which allows the Pareto-optimal AIC calculation to use the "strong" rmse value in conjunction with weak_form=True (for any optimizer algorithm), and verified that it is consistent for the STLSQ case in 16_benchmark_paper.ipynb.

For the MIOSR algorithm, I did encounter "GurobiError: Model too large for size-limited license; visit https://www.gurobi.com/free-trial for a full license." I was able to acquire an academic license following the instructions here: https://www.gurobi.com/features/academic-named-user-license/, and everything works okay after copying the license to the right gurobi directory in my anaconda site-packages. The MIOSR runs do take a long time, and I think they will hog resources if running in parallel. I didn't run the sweeps again or save any results. I'd suggest (if you have the patience) running all the sweeps one at a time on a cluster machine so the runtime can be estimated without competing for resources.

But everything looks really nice to me, and I think it is a really helpful example to include. Nice work!

@znicolaou
Collaborator

PS: I pushed once more fixing the order of the predicted_coefficient in the weak case. For STLSQ, the average coefficient errors are totally consistent with the previous version, as are the average RMSE errors in the weak case with strong_rmse=True. The average RMSE errors are even a couple orders of magnitude smaller for the weak case with strong_rmse=False (which is the setting that I recommend using). Anyway, I should move on to other things now--let me know if you find any new issues!

@znicolaou znicolaou mentioned this pull request Dec 14, 2022
Alan Kaptanoglu and others added 15 commits December 29, 2022 22:06
…ning all the results at once so don't need multiple notebooks.
…for all the optimizers except MIOSR. Need to update the plotting notebook to use the new data. Made some small changes to the run_all script so that all of the optimizers can run weak form.
…ents were being reshuffled even when weak form was false.
…umbers. The issue is that the true_coefficients matrix was reordered for the weak library, but the error matrices were also reordered, so then they didn't match again. Now the true_coefficients are left alone. Reran the STLSQ results, which are looking much more reasonable now.
… linting and pytest errors, this is ready for a merge.
@akaptano akaptano merged commit 5c6e9fd into master Jan 16, 2023
@akaptano akaptano deleted the lanyue_work branch January 16, 2023 14:31
Jacob-Stevens-Haas added a commit that referenced this pull request Oct 30, 2023
Recreated bug described in #266, which only arises in gurobipy 10.0.0.
Verified tests pass in 10.0.1 to 10.0.3
Jacob-Stevens-Haas added a commit that referenced this pull request Oct 30, 2023
Fixes #303 
Demonstrated that bug described in #266 only arises in gurobipy 10.0.0.
Verified tests pass in 10.0.1 to 10.0.3
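A minimal sketch of how a local environment could be checked against the version mentioned in the commit message above. The helper names `is_affected` and `installed_gurobipy_version` are hypothetical and not part of pysindy or gurobipy; only the standard library's `importlib.metadata` is used.

```python
from importlib import metadata

# Version reported as affected in the commit message above.
AFFECTED_VERSIONS = {"10.0.0"}


def is_affected(version: str) -> bool:
    """Return True if the given gurobipy version string is the affected one."""
    return version.strip() in AFFECTED_VERSIONS


def installed_gurobipy_version():
    """Return the installed gurobipy version, or None if it is not installed."""
    try:
        return metadata.version("gurobipy")
    except metadata.PackageNotFoundError:
        return None


version = installed_gurobipy_version()
if version is None:
    print("gurobipy is not installed")
elif is_affected(version):
    print(f"gurobipy {version} is the version reported as affected")
else:
    print(f"gurobipy {version} is not the version reported as affected")
```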
jpcurbelo pushed a commit to jpcurbelo/pysindy_fork that referenced this pull request Apr 30, 2024
jpcurbelo pushed a commit to jpcurbelo/pysindy_fork that referenced this pull request Apr 30, 2024
Fixes dynamicslab#303 
Demonstrated that bug described in dynamicslab#266 only arises in gurobipy 10.0.0.
Verified tests pass in 10.0.1 to 10.0.3
Labels
enhancement New feature or request

4 participants