Why so many generated structures with hydrogen and/or helium? #80

Open
sgbaird opened this issue Jun 11, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@sgbaird
Member

sgbaird commented Jun 11, 2022

'Ca2He2H1,volume=23,uid=3533.cif'
'Cr1Ag1H2He1,volume=31,uid=3d40.cif'
'Cs1Li1Ca1H2,volume=62,uid=8e9d.cif'
'Cs1Rb1La1Sn1Sb1Pd1H3I1Ne1He1,volume=177,uid=5495.cif'
'Cu1H3I1He1,volume=37,uid=3870.cif'
'K1Ca1Ac1Mg1Ti1Mn1Al1Cr1In2Ga1Co1Tc1Cu1Ag1Hg1Ge1Te2As1Pd1H3Rh1Se1C1Xe1I1Kr1He1,volume=1358,uid=eccb.cif'
'Li1Ti1H2Xe1Cl1,volume=37,uid=070b.cif'
'Mo1H1C1Kr1He1,volume=24,uid=879d.cif'
'Na1Li1H1Kr1,volume=18,uid=5905.cif'
'Na1Sr1Sc1H1C1Br1He3,volume=102,uid=7fcc.cif'
'Rb1Na1Li1Sc1Ti1Ni1Ru2H2Cl1,volume=123,uid=5fa0.cif'
'Rb1Sc1He1,volume=18,uid=90e7.cif'
'Rb1Ti1V1Ga1Ni1Ge1H2,volume=83,uid=cf25.cif'
'Sc1Cr1F1Ne1H1He1,volume=47,uid=5b5d.cif'
'Sr1Gd1Dy1Y1Ni1Pd1He1H2,volume=108,uid=cf05.cif'
'Zr1Tc1Cu1H4Se1Br1F1,volume=69,uid=4e8d.cif'

See also #79
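To put a number on the pattern in the file names above, one could tally how many formulas contain hydrogen or helium. This is a minimal sketch that assumes each file name has the form `<formula>,volume=...,uid=....cif`; `elements_in` is a hypothetical helper written for illustration, not a function from any library.

```python
import re
from collections import Counter

# File names copied from the list above; the formula is the part before the first comma.
filenames = [
    "Ca2He2H1,volume=23,uid=3533.cif",
    "Cr1Ag1H2He1,volume=31,uid=3d40.cif",
    "Cs1Li1Ca1H2,volume=62,uid=8e9d.cif",
    "Cs1Rb1La1Sn1Sb1Pd1H3I1Ne1He1,volume=177,uid=5495.cif",
    "Cu1H3I1He1,volume=37,uid=3870.cif",
    "K1Ca1Ac1Mg1Ti1Mn1Al1Cr1In2Ga1Co1Tc1Cu1Ag1Hg1Ge1Te2As1Pd1H3Rh1Se1C1Xe1I1Kr1He1,volume=1358,uid=eccb.cif",
    "Li1Ti1H2Xe1Cl1,volume=37,uid=070b.cif",
    "Mo1H1C1Kr1He1,volume=24,uid=879d.cif",
    "Na1Li1H1Kr1,volume=18,uid=5905.cif",
    "Na1Sr1Sc1H1C1Br1He3,volume=102,uid=7fcc.cif",
    "Rb1Na1Li1Sc1Ti1Ni1Ru2H2Cl1,volume=123,uid=5fa0.cif",
    "Rb1Sc1He1,volume=18,uid=90e7.cif",
    "Rb1Ti1V1Ga1Ni1Ge1H2,volume=83,uid=cf25.cif",
    "Sc1Cr1F1Ne1H1He1,volume=47,uid=5b5d.cif",
    "Sr1Gd1Dy1Y1Ni1Pd1He1H2,volume=108,uid=cf05.cif",
    "Zr1Tc1Cu1H4Se1Br1F1,volume=69,uid=4e8d.cif",
]

# An element symbol is one uppercase letter plus an optional lowercase letter,
# followed by an optional integer count (e.g. "He2", "H1", "Hg1").
ELEMENT_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def elements_in(filename):
    """Return the set of element symbols in the formula part of a file name."""
    formula = filename.split(",", 1)[0]
    return {symbol for symbol, _count in ELEMENT_TOKEN.findall(formula)}


element_sets = [elements_in(name) for name in filenames]

# Number of structures that contain each element (each element counted once per structure).
counts = Counter(symbol for elements in element_sets for symbol in elements)

n_total = len(element_sets)
n_h = sum("H" in elements for elements in element_sets)
n_he = sum("He" in elements for elements in element_sets)
n_h_or_he = sum(bool(elements & {"H", "He"}) for elements in element_sets)

print(f"{n_h}/{n_total} contain H, {n_he}/{n_total} contain He, "
      f"{n_h_or_he}/{n_total} contain H or He")
print(counts.most_common(5))
```

The same `counts` mapping can be queried for any other element symbol, since a `Counter` returns 0 for elements that never appear.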

@sgbaird sgbaird added the bug Something isn't working label Jun 11, 2022
@sgbaird
Member Author

sgbaird commented Jun 11, 2022

xref: txie-93/cdvae#11

@sgbaird
Member Author

sgbaird commented Jun 11, 2022

This doesn't seem to be caused by noble gases being overrepresented in the training data, which wouldn't be expected anyway since the data consists of experimentally verified compounds. Maybe $He$ is "supposed to be" $H$, and the problem has more to do with the RGB unscaling.

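import re

from pymatgen.core import Composition
from pymatviz.elements import ptable_heatmap_plotly

# train_inputs is assumed to be a pandas Series of pymatgen Structure objects.
# Strip the digits from each formula so every element counts once per structure.
equimolar_compositions = train_inputs.apply(
    lambda s: Composition(re.sub(r"\d", "", s.formula))
)
fig = ptable_heatmap_plotly(equimolar_compositions)
fig.show()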

[Figure: mp-time-split-train-fold-0-forced-equimolar]
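The $He$ versus $H$ hypothesis above can be illustrated with a toy round trip. This is only a sketch under stated assumptions: it supposes atomic numbers are mapped linearly onto 8-bit pixel values and recovered by rounding. The constants `MAX_Z` and `PIXEL_MAX` and the functions `scale` and `unscale` are hypothetical and are not taken from the project's code.

```python
# Assumption: atomic numbers in [0, MAX_Z] are mapped linearly onto [0, PIXEL_MAX].
MAX_Z = 117      # hypothetical upper bound on the atomic number
PIXEL_MAX = 255  # 8-bit pixel range


def scale(atomic_number):
    """Map an atomic number to a (float) pixel value."""
    return atomic_number / MAX_Z * PIXEL_MAX


def unscale(pixel):
    """Map a pixel value back to the nearest integer atomic number."""
    return round(pixel / PIXEL_MAX * MAX_Z)


# Each element occupies only PIXEL_MAX / MAX_Z pixel levels (a bit over 2),
# so an error of roughly one pixel level can shift the decoded element.
levels_per_element = PIXEL_MAX / MAX_Z

h_pixel = scale(1)  # pixel value that encodes hydrogen (Z = 1)

clean = unscale(h_pixel)        # exact round trip
small = unscale(h_pixel + 1.0)  # error smaller than half an element's width
large = unscale(h_pixel + 1.5)  # error larger than half an element's width

print(f"levels per element: {levels_per_element:.2f}")
print(f"decoded Z: clean={clean}, +1.0 px={small}, +1.5 px={large}")
```

Under these assumptions, a generated pixel that is off by a little more than one intensity level is enough to decode hydrogen as helium, which would be consistent with the unscaling explanation but does not confirm it.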

@sgbaird
Member Author

sgbaird commented Jun 11, 2022

It's also strange that there's not a single instance of oxygen in the generated structures, despite its prevalence. I don't see the "adjacent" elements nitrogen or fluorine either. This could be an issue with the scaling/unscaling, not enough data, a bad model, a bad representation, or a biased sample. The last one seems unlikely.

Probably not an issue with data shuffling #84

@sgbaird
Member Author

sgbaird commented Jun 20, 2022

from #79

The formulas:
[image: generated formulas]

which seem somewhat better.

Interested to see how it goes with imagen-pytorch instead of denoising_diffusion_pytorch.
