load_dataframe_from_json fails on MultiIndex #670

Open
janosh opened this issue Jul 22, 2021 · 0 comments
janosh commented Jul 22, 2021

I just noticed there's a problem with load_dataframe_from_json when trying to load multi-index dataframes.

from matminer.utils.io import load_dataframe_from_json, store_dataframe_as_json
import numpy as np
import pandas as pd


arr = np.arange(20).reshape(5, 4)

df = pd.DataFrame(arr, columns=list("abcd"))


store_dataframe_as_json(df, "df.json")
df = load_dataframe_from_json("df.json")
# all good here


df = pd.DataFrame(arr, columns=list("abcd")).set_index(["a", "b"])


store_dataframe_as_json(df, "df.json")
df = load_dataframe_from_json("df.json")
# raises ValueError: Shape of passed values is (5, 2), indices imply (2, 2)

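That's because pandas doesn't accept a list of per-row lists as a multi-index. It seems to read each inner list as one index level rather than as one row, which is why the error reports an implied index length of 2 instead of 5. Instead you have to create a MultiIndex object first and pass that in: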

idx = [[i, i + 1] for i in range(5)]
pd.DataFrame(arr, columns=list("abcd"), index=idx)
# raises ValueError: Shape of passed values is (5, 4), indices imply (2, 4)


idx = pd.MultiIndex.from_tuples(((i, i + 1) for i in range(5)))
pd.DataFrame(arr, columns=list("abcd"), index=idx)
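The shapes in the two errors can be reproduced on the index alone. This is a minimal sketch that assumes the DataFrame constructor handles a nested list the same way `MultiIndex.from_arrays` does, which is consistent with the reported "indices imply (2, 4)" but is not confirmed here against the pandas source. The names `rows`, `as_levels` and `as_rows` are only for illustration.

```python
import pandas as pd

# Five per-row pairs, the same shape a JSON round trip hands back.
rows = [[i, i + 1] for i in range(5)]

# Reading each inner list as one LEVEL gives 5 levels with 2 entries each,
# so the resulting index has length 2.
as_levels = pd.MultiIndex.from_arrays(rows)

# Reading each inner list as one ROW gives 2 levels and 5 entries,
# which is what the 5-row frame needs.
as_rows = pd.MultiIndex.from_tuples([tuple(row) for row in rows])

print(len(as_levels), as_levels.nlevels)
print(len(as_rows), as_rows.nlevels)
```

Only the second form lines up with five rows of data, so that is the one the loader would need to build.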

So one possible fix in load_dataframe_from_json would be:

    if isinstance(dataframe_data, dict):
        if set(dataframe_data.keys()) == {"data", "columns", "index"}:
+           index = dataframe_data["index"]
+           # nested lists mean one list per row of a multi-index; skip empty indexes
+           if index and isinstance(index[0], list):
+               dataframe_data["index"] = pandas.MultiIndex.from_tuples(index)
            return pandas.DataFrame(**dataframe_data)
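To try the idea without matminer installed, here is a standalone sketch of the same conversion. `frame_from_split_dict` is a hypothetical helper, not part of matminer, and the hand-written `payload` is an assumption about what the decoded JSON looks like for the failing frame (keys "data", "columns", "index", with the index as a list of lists).

```python
import pandas as pd


def frame_from_split_dict(payload):
    """Hypothetical helper: rebuild a DataFrame from a data/columns/index dict."""
    payload = dict(payload)  # shallow copy so the caller's dict is left alone
    index = payload.get("index")
    # Nested entries mean one list per row of a multi-index.
    if index and isinstance(index[0], (list, tuple)):
        payload["index"] = pd.MultiIndex.from_tuples([tuple(row) for row in index])
    return pd.DataFrame(**payload)


# Assumed shape of the decoded JSON for the frame indexed by ["a", "b"].
payload = {
    "data": [[2, 3], [6, 7], [10, 11], [14, 15], [18, 19]],
    "columns": ["c", "d"],
    "index": [[0, 1], [4, 5], [8, 9], [12, 13], [16, 17]],
}

restored = frame_from_split_dict(payload)
print(restored)
```

One limitation of this sketch: the payload carries no level names, so "a" and "b" do not come back and the restored index levels are unnamed.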