Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

347 refactor validation category #368

Merged
merged 13 commits into from
Apr 25, 2024
Merged

Conversation

nmaarnio
Copy link
Collaborator

@nmaarnio nmaarnio commented Apr 15, 2024

New tools added and old ones modified. Here is a brief list of the changes:

  • Removed AUC tool and related tests and docs (the reason is that AUC is calculated directly from Sklearn when a curve is calculated)
  • Moved get_pa_intersection to plot_prediction_area_curves.py module, since it is needed only there
  • Added module classifier_probability_evaluation.py that has the following tools:
    • summarize_probability_metrics to generate a report dictionary from probability array
    • plot_roc_curve
    • plot_det_curve
    • plot_precision_recall_curve
    • plot_calibration_curve
    • plot_predicted_probability_distribution
  • Added module classifier_label_evaluation.py and the following tools
    • summarize_label_metrics_binary
    • confusion_matrix (just direct import from Sklearn, no wrapper)
  • Added notebook called testing_classifier_evaluation.ipynb to demonstrate the two new modules above
  • Separated scoring predictions into own module called scoring.py
  • Small modification to plot_confusion_matrix
  • Separated predicting with ML model functions into their own module called machine_learning_predicti.py
  • Separated predict tool for classifier and regressor models, so two new functions:
    • predict_classifier, this tool now can return the probability array based on parameter. NOTE: Testing for Keras models still not adequate enough
    • predict_regressor

TODO:

  • Add and update affected CLI functions
  • Rename the validation folder to evaluation as the last thing

@nmaarnio nmaarnio linked an issue Apr 15, 2024 that may be closed by this pull request
@nmaarnio
Copy link
Collaborator Author

@msmiyels , do you have time to review this PR in the next couple of days? Ended up doing quite a lot of modeling related stuff, so thought it could be good to involve you. If not, I can ask someone else

Copy link
Collaborator

@msmiyels msmiyels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Nico,

nice work overall! 🐱‍👤 Left a view comments here and there, but nothing criticial.

I'm a bit torned whether it is better to have one classification_evaluation rather than separate modules ...label_evaluation and ...probability_evaluation., but it will not make any difference at least for the GUI.

If you have any questions, let me know 📯

eis_toolkit/validation/plot_prediction_area_curves.py Outdated Show resolved Hide resolved
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import sys
sys.path.append('..')

is missing to import the toolkit from the notebooks folder. Please also check for the other provided notebooks, I did not investigate all of them 😉

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these lines are only needed if files need to be opened in the notebook, which is currently not the case with testing_classifier_evaluation.ipynb. Also, the other notebooks should be pretty much as they were before, just modified some cells slightly according to the general ML tool modifications.

@nmaarnio
Copy link
Collaborator Author

Hey @msmiyels and thanks for the review!

I made some changes according to your review and carried out the renaming

@nmaarnio
Copy link
Collaborator Author

Do you think this could me merged @msmiyels ?

@msmiyels
Copy link
Collaborator

Do you think this could me merged @msmiyels ?

Hi Niko,

yep, think this PR can be merged 👍

@nmaarnio nmaarnio merged commit d3c6808 into master Apr 25, 2024
6 checks passed
@nmaarnio nmaarnio deleted the 347-refactor-validation-category branch April 25, 2024 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor validation category
2 participants