Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maint: Hypervolume (pyhv.py) module rewrite #484

Merged
merged 37 commits into from
Feb 12, 2020

Conversation

Corwinpro
Copy link
Contributor

@Corwinpro Corwinpro commented Jan 29, 2020

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Motivation and Context / Related issue

This PR is related to the #366.

This PR rewrites the code that generates the hypervolume indicator, used for multiobjective optimization.

Any suggestions / comments / requests are very welcome.

The purpose is two-fold:

  • Rewrite / refactor the existing code to improve the general readability and extendability, spot possible bugs, and add documentation
  • Get rid of the LGPL licence in favour of the standard MIT licence, used by nevergrad.

How Has This Been Tested (if it applies)

We added unit tests for the data structure that are used by the HypervolumeIndicator class.

We test the main functionality of the new hypervolume algorithm against the old one. More integration tests should be included before this PR can be accepted.

Checklist

  • The documentation is up-to-date with the changes I made.
  • I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).
  • All tests passed, and additional code has been covered with new tests.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Jan 29, 2020
@teytaud
Copy link
Contributor

teytaud commented Jan 29, 2020

Waow thanks!
I try to read this promptly - what is sure is that this was much needed!

@Corwinpro
Copy link
Contributor Author

Thank you @teytaud

I am fixing the mypy errors now, although it can take some time as I have never used this type checking system before.

@teytaud
Copy link
Contributor

teytaud commented Jan 30, 2020

Congratulations, all the real tests work! The only issue is the Type information: mypy detects quite a bit of type info missing. Can you do that ?

@Corwinpro
Copy link
Contributor Author

Thant you @teytaud . Yes, I will be adding the Type info shortly: mypy is a new system for me, so I apologize in advance for the possible delays.

Copy link
Contributor

@jrapin jrapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not dived into the actual implementation but here are a few initial feedbacks.
Replacing pyhv by this in the rest of the code will help understand it better and check for bugs.
In the meantime, do you want me to make a PR on your PR to help you add typing?

nevergrad/functions/multiobjective/hypervolume.py Outdated Show resolved Hide resolved
@Corwinpro
Copy link
Contributor Author

Thank you @jrapin ! I am fine with using the same branch if you would like to - I can focus on the testing task if you prefer.

@jrapin
Copy link
Contributor

jrapin commented Jan 31, 2020

Thank you @jrapin ! I am fine with using the same branch if you would like to - I can focus on the testing task if you prefer.

I tend to prefer PRs so as to highlight changes, and avoid collisions. However I don't seem to be able to PR on your fork, I don't understand why :s so you can find typing that works on branch enthought-maint/pyhv-rewrite-typing
I went to the simplest solutions to make it work, sometimes ignoring the typing, and other times using Any when I did not understand it.
Note that there are a few unbound variables (see TODOs)

Edit: doing this it seems I have broken the code though. The "simplest solution" was to assume that next and prev never contained Nones, but it seems it needs to. The issue then is that most of the time it is assumed that they are not None. I'm a bit concerned by this pattern, whenever possible we should work with non "Optional" data, that simplifies the typing, and also avoids weird bugs

@Corwinpro
Copy link
Contributor Author

Corwinpro commented Jan 31, 2020

Thank you @jrapin for this PR, it is great. Sorry for the delay, I might be able to finish the testing by Monday night.

@Corwinpro
Copy link
Contributor Author

Corwinpro commented Jan 31, 2020

I guess node.next[i] can be None - this is the case when we have a default node, with no connections.

Regarding the TODOs, I am updating this comment with what I think can be useful:

  • The HypervolumeIndicator.compute is the main method of the algorigthm:
    def compute(self, points: tp.Any) -> float:  # TODO not too sure what that is (replace Any)

It is used by MultiobjectiveFunction.compute_aggregate_loss (line 52 in functions/multiobjective/core.py. I would assume that a list of "vectors" would be passed as the argument.

@Corwinpro
Copy link
Contributor Author

Corwinpro commented Feb 3, 2020

I believe the mypy issues should be resolved now.

HypervolumeIndicator tests should be on the way, working on this now.

@Corwinpro
Copy link
Contributor Author

@jrapin Could you please have a look at this now? Most of the tests are ready, I guess it could be a good time for review and corrections, if you don't mind.

Thank you!

Copy link
Contributor

@jrapin jrapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.
Here are a few comments.
I checked mostly the code, but as long as the tests are working (and they are) and with a few minor updates (mostly parameters that I feel are not needed) I am OK.
I'll probably iterate on this in (much) later PRs, I feel there is some underlying structure that may be made more explicit, but dunno what exactly for now anyway.

hypervolume += h * last_node.coordinate[dimension]
return hypervolume

def recursive_hypervolume(self, dimension: int, bounds: np.ndarray) -> float:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason for this method to take dimension and bounds as inputs? Those seem already provided by the instance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I totally agree that the bounds are passed in for now reason. Should be fixed in the next commit.

The dimension though is needed, as we recursively iterate from the default dimension of the problem to the lower dimensions; it doesn't always equal the dimension of the space of the problem.

hypervolume = self.recursive_hypervolume(self.dimension - 1, self.reference_bounds)
return hypervolume

def plane_hypervolume(self, dimension: int) -> float:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dimension seems provided by the instance and necessarily 1, so you could remove this and if need be,
check self.dimension == 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am not mistaken, the self.dimension should not necessarily be 2 here: as we iterate from the higher to lower dimensional subspaces, at some point we will reach the space of dim == 2. Then, this method is called.

I removed the argument dimension from the method - it was ambiguous.

Thank you.

hypervolume -= last_node.area[dimension] * last_node.coordinate[dimension]
return hypervolume

def skip_dominated_points(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, I would say:

  • either the methods does not need the argument,
  • or it should be a function, not a method

But maybe I'm not understanding it properly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same reasoning as before - the dimension will change during the recursive iteration from the higher to lower dimensional space. I hope I understand this algorithm correctly; I will double check that.

import numpy as np


class VectorNode:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some docstring here, just to give a brief overview of what this does, and how it is useful ? + describe the parameters.
Btw, shouldnt it be coordinates and not coordinate? I'm not a native English speaker though so I may be wrong

Copy link
Contributor Author

@Corwinpro Corwinpro Feb 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Fixed that.

@jrapin
Copy link
Contributor

jrapin commented Feb 5, 2020

@teytaud anything to add?
@Corwinpro You can probably remove the pyhv file and the related license

@Corwinpro
Copy link
Contributor Author

Corwinpro commented Feb 7, 2020

I removed the old code and the license file. Would you like me to add anything else?

Thank you!

Copy link
Contributor

@jrapin jrapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot all the licensing stuff, see comments below + can you remove the sentence " LGPL code is however also included in the multiobjective subpackage." in the README?
@teytaud anything more to say?
I didn't have time to dive into the algorithm but since tests are passing it's perfect for me. Thanks for the help!

@@ -0,0 +1,303 @@
import typing as tp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the same header as on all other files? This is required to mention that it is released under the MIT license

@@ -0,0 +1,304 @@
import numpy as np
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the header here too

@Corwinpro
Copy link
Contributor Author

Thank you very much for the review and the comments. I am ready to merge this if you are happy with all the changes.

Thank you!

@jrapin
Copy link
Contributor

jrapin commented Feb 12, 2020

Sorry for the delay, let's go then ;)

@jrapin jrapin merged commit caf2059 into facebookresearch:master Feb 12, 2020
@Corwinpro Corwinpro deleted the maint/pyhv-rewrite branch February 12, 2020 09:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants