From docs it is not super clear that sample selection works analogously to feature selection #164

Open
agoscinski opened this issue Mar 8, 2023 · 2 comments

Comments

@agoscinski
Collaborator

The examples even include a section titled Feature and Sample Selection, but there is no example notebook.
https://scikit-matter.readthedocs.io/en/latest/tutorials.html

@victorprincipe
Collaborator

I'm not sure exactly what you mean by this. The API reference for Feature and Sample Selection states:

"scikit-matter contains multiple data sub-selection modules, primarily corresponding to methods derived from CUR matrix decomposition and Farthest Point Sampling. In their classical form, CUR and FPS determine a data subset that maximizes the variance (CUR) or distribution (FPS) of the features or samples. These methods can be modified to combine supervised and unsupervised learning, in a formulation denoted PCov-CUR and PCov-FPS. For further reading, refer to [Imbalzano2018] and [Cersonsky2021].

These selectors can be used for both feature and sample selection, with similar instantiations. Currently, all sub-selection methods extend GreedySelector, where at each iteration the model scores each feature or sample (without an estimator) and chooses that with the maximum score."

https://scikit-matter.readthedocs.io/en/latest/selection.html

@agoscinski
Collaborator Author

This is the current tutorials page:
[Screenshot: scikit-matter sample selection tutorial page]

I agree that it is written in the API reference, but we had a user who wasn't sure from the examples how to use sample selection. So we can improve this by changing an existing example or adding a new one.
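
To illustrate the point such an example could make, here is a minimal pure-Python sketch of greedy farthest point sampling in which the only difference between sample selection and feature selection is the axis being scored. It is not scikit-matter's implementation or API: the function `farthest_point_selection` and its parameters `axis` and `initial` are hypothetical names chosen for illustration, and the toy matrix `X` is made up. In scikit-matter itself the selectors are provided by the `skmatter.sample_selection` and `skmatter.feature_selection` modules.

```python
import math


def farthest_point_selection(X, n_to_select, axis=0, initial=0):
    """Greedy farthest point sampling (illustrative sketch, hypothetical helper).

    axis=0 selects samples (rows), axis=1 selects features (columns).
    Assumes n_to_select does not exceed the number of distinct rows/columns.
    """
    # Feature selection is the same procedure applied to the transposed matrix.
    points = [list(r) for r in X] if axis == 0 else [list(c) for c in zip(*X)]

    selected = [initial]
    # Score of each point: distance to the nearest already-selected point.
    min_dist = [math.dist(p, points[initial]) for p in points]

    while len(selected) < n_to_select:
        # Choose the point with the maximum score.
        best = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(best)
        # Update scores with respect to the newly selected point.
        min_dist = [
            min(d, math.dist(p, points[best])) for d, p in zip(min_dist, points)
        ]
    return selected


# Toy data: 4 samples (rows) x 3 features (columns).
X = [
    [0.0, 0.0, 1.0],
    [0.1, 0.0, 1.0],
    [5.0, 0.2, 1.0],
    [5.0, 4.0, 1.0],
]

sample_idx = farthest_point_selection(X, n_to_select=2, axis=0)   # row indices
feature_idx = farthest_point_selection(X, n_to_select=2, axis=1)  # column indices
print("selected samples:", sample_idx)
print("selected features:", feature_idx)
```

Both calls use the same function and the same arguments apart from `axis`, which is the analogy a sample selection notebook would need to make explicit.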
