Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Featurizers missing prechecking #385

Open
3 tasks
ardunn opened this issue Mar 26, 2019 · 4 comments
Open
3 tasks

Featurizers missing prechecking #385

ardunn opened this issue Mar 26, 2019 · 4 comments

Comments

@ardunn
Copy link
Contributor

ardunn commented Mar 26, 2019

While #384 introduced 3 prechecks for featurizers, there are many more featurizers that could benefit from a precheck. This issue can be used for tracking these featurizers

  • CationProperty, OxidationStates, ElectronAffinity, ElectronegativityDiff, IonProperty: all-metal entries' oxidation states are ill-defined, so they could be considered invalid
  • NN-based featurizers (e.g., SiteStatsFingerprint): Throw warnings (not neccessarily fail precheck) when large structures are prechecked, cuz featurizing them is gonna take forever (prechecks should be very fast regardless)
  • Some others? Please add more featurizers to this issue list if you think of them
@computron
Copy link
Contributor

Note that returning a NaN for precheck could be considered as neither failing or passing a precheck, e.g., if the precheck test was skipped for a large structure

@ardunn
Copy link
Contributor Author

ardunn commented Mar 26, 2019

@computron Should prechecks have the ability to be skipped though? AFAIK they can be all be super fast operations that (1) pass (2) fail or (3) throw an error. If (3) happens the user should look into it. For example:

def precheck(self, structure):     # for some NN featurizer
    if len(structure.sites) > 250:
        warnings.warn(some_warning)
    ...
    # returns True or False, regardless of whether the warning shows or not

Or is there some other precheck mechanism you had in mind?

@computron
Copy link
Contributor

At the very least precheck skipping should be discouraged. Let's not introduce this unless we feel it is really necessary. Based on your comment I got the sense that some large structures could not be prechecked quickly. If that's the case it might be better to throw NaN versus have the precheck take forever.

@ardunn
Copy link
Contributor Author

ardunn commented Mar 26, 2019

Yes - I think all featurizers can have a precheck that never skips. I don't see any operations where just checking if the featurizer will take a long time will itself take a long time.

Edit: I see in my comment where confusion could come from, changed it

@ardunn ardunn mentioned this issue Jun 9, 2020
15 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants