Add kmeans clustering function #286

oyvindeide · 2021-01-11T10:13:50Z

Closes #241

eivindjahren · 2021-01-15T15:03:32Z

tests/jobs/misfit_preprocessor/test_misfit_preprocessor.py

 @pytest.mark.parametrize("method", ["spearman_correlation", "auto_scale"])
 @pytest.mark.parametrize(
    "num_polynomials",
    tuple(range(1, 5)) + (20, 100),
 )
-def test_misfit_preprocessor_n_polynomials(num_polynomials, method):
+def test_misfit_preprocessor_n_polynomials(


I think in this case, you get a cleaner implementation with hypothesis than with pytest parametrization.
Just a suggestion:

from hypothesis import given, assume
import hypothesis.strategies as st

clustering_functions = st.sampled_from(["hierarchical", "kmeans"])
methods = st.sampled_from(["spearman_correlation", "auto_scale"])


@given(st.integers(min_value=1, max_value=100), methods, clustering_functions)
def test_misfit_preprocessor_n_polynomials(
    num_polynomials, method, clustering_function
):
    if (
        clustering_function == "kmeans"
        and method == "spearman_correlation"
    ):
        assume(num_polynomials in [4, 5])  # or more
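For comparison, here is a rough sketch of the case list that plain parametrization would have to build to mirror the `assume` filter in the suggestion above. It uses only the standard library. The helper name `valid_cases` and the constant names are hypothetical, and the allowed set `{4, 5}` simply copies the values in the suggestion; what "or more" should cover is left open.

```python
from itertools import product

# Values taken from the PR's parametrization and from the suggestion above.
NUM_POLYNOMIALS = tuple(range(1, 5)) + (20, 100)
METHODS = ("spearman_correlation", "auto_scale")
CLUSTERING_FUNCTIONS = ("hierarchical", "kmeans")

# Assumption: only the polynomial counts named in the suggestion are kept for
# kmeans + spearman_correlation; extend this set if "or more" should apply.
KMEANS_SPEARMAN_ALLOWED = {4, 5}


def valid_cases():
    """Hypothetical helper: list the combinations the assume() call would keep."""
    cases = []
    for n, method, clustering in product(
        NUM_POLYNOMIALS, METHODS, CLUSTERING_FUNCTIONS
    ):
        restricted = clustering == "kmeans" and method == "spearman_correlation"
        if restricted and n not in KMEANS_SPEARMAN_ALLOWED:
            continue  # mirrors assume(...) discarding the generated example
        cases.append((n, method, clustering))
    return cases
```

A list like this could be passed to `pytest.mark.parametrize("num_polynomials, method, clustering_function", valid_cases())`. The hypothesis version states the same restriction inline and also samples polynomial counts outside the fixed list.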

semeio/workflows/misfit_preprocessor/job.py

semeio/workflows/misfit_preprocessor/kmeans_config.py

tests/jobs/misfit_preprocessor/unit/test_config.py

semeio/workflows/spearman_correlation_job/job.py

lars-petter-hauge · 2021-02-01T13:29:08Z

Had some minor comments, otherwise I think the implementation looks good!

lars-petter-hauge

LGTM! Nice job!

oyvindeide self-assigned this Jan 11, 2021

oyvindeide force-pushed the add_kmeans branch from 07e47aa to a199c60 Compare January 11, 2021 11:01

eivindjahren reviewed Jan 15, 2021


oyvindeide force-pushed the add_kmeans branch from 0ad4146 to 2ce2e4f Compare January 25, 2021 15:32

oyvindeide commented Jan 25, 2021


semeio/workflows/misfit_preprocessor/job.py