Coming from the perspective of MOFs I see one main point as a potential opportunity for the library:
If the image encoding were abstracted a bit further from atoms as the fundamental building blocks, it could also be applied to MOFs (or to some coarse-grained representation).
That is, the most general implementation would have an interface such as
By default, the functions would do the encoding of the elements. However, if users provide other sites or use symbols to indicate certain building blocks, they might want to choose their own encoding/decoding functions. This should also make it easier to use Wyckoff sites instead of all sites.
The embedding_encoding_func would be a function that by default creates the pairwise distance matrix, but it might instead create the adjacency matrix (which can be useful if one aims to generate new crystallographic nets).
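To make the idea concrete, here is a rough sketch of what such pluggable functions could look like. It is not the xtal2png API: `encode`, `encode_elements`, `pairwise_distances`, and `make_adjacency` are hypothetical names, plain Cartesian coordinates stand in for a pymatgen Structure, and periodic images are ignored for brevity.

```python
import math
from typing import Callable, List, Sequence, Tuple

Coords = Sequence[Sequence[float]]
Matrix = List[List[float]]

# Tiny illustrative lookup; a real default would cover the whole periodic table.
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8, "Zn": 30}


def encode_elements(symbols: Sequence[str]) -> List[int]:
    """Default site encoding (assumed): element symbol -> atomic number."""
    return [ATOMIC_NUMBERS[s] for s in symbols]


def pairwise_distances(coords: Coords) -> Matrix:
    """Default embedding (assumed): Cartesian distance matrix, no periodic images."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def make_adjacency(cutoff: float) -> Callable[[Coords], Matrix]:
    """Alternative embedding: 1.0 where two distinct sites lie within `cutoff`."""

    def adjacency(coords: Coords) -> Matrix:
        return [
            [1.0 if i != j and math.dist(a, b) <= cutoff else 0.0
             for j, b in enumerate(coords)]
            for i, a in enumerate(coords)
        ]

    return adjacency


def encode(
    symbols: Sequence[str],
    coords: Coords,
    site_encoding_func: Callable[[Sequence[str]], List[int]] = encode_elements,
    embedding_encoding_func: Callable[[Coords], Matrix] = pairwise_distances,
) -> Tuple[List[int], Matrix]:
    """Apply the (swappable) site and embedding encoders."""
    return site_encoding_func(symbols), embedding_encoding_func(coords)


symbols = ["Zn", "O", "O"]
coords = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

codes, dist = encode(symbols, coords)  # defaults
_, adj = encode(symbols, coords, embedding_encoding_func=make_adjacency(2.5))
```

Swapping `embedding_encoding_func` is the only change needed to move from distances to connectivity; the site encoding stays untouched.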
Another interesting question might be how materials cluster in "xtal2png" space compared to other representations, e.g. SOAP. However, this would require an implementation that is invariant to site permutation and supercell expansion.
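A small illustration of the permutation issue, under the assumption that the embedding is a plain pairwise distance matrix: reordering the sites reorders the rows and columns, so two images of the same structure differ, whereas a sorted list of distances does not. `sorted_distance_fingerprint` is a hypothetical helper used only for this comparison; it is not invertible and does nothing about supercell expansion.

```python
import math


def pairwise_distances(coords):
    """Cartesian distance matrix (no periodic images)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def sorted_distance_fingerprint(coords, ndigits=6):
    """Sorted upper-triangle distances: unchanged when sites are reordered."""
    d = pairwise_distances(coords)
    n = len(coords)
    return sorted(round(d[i][j], ndigits) for i in range(n) for j in range(i + 1, n))


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
permuted = [coords[2], coords[0], coords[1]]  # same sites, different order

# The matrix depends on site order ...
matrices_differ = pairwise_distances(coords) != pairwise_distances(permuted)
# ... while the sorted fingerprint does not.
fingerprints_match = (
    sorted_distance_fingerprint(coords) == sorted_distance_fingerprint(permuted)
)
```

Any clustering comparison against SOAP would need either a canonical site ordering before encoding or a downstream model that is itself permutation-invariant.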
I really like this suggestion. This actually points to a common issue with many materials informatics repositories. For a while, I've wanted to make CrabNet agnostic to chemical formulas (sparks-baird/CrabNet#6). @Pepe-Marquez is also interested in featurization for more general building blocks based on some internal discussions I've had with him.
To implement a really general "building blocks" framework seems non-trivial to me, at least at first. I think the common threads here would be that site_encoding_func-s and embedding_encoding_func-s would operate on pymatgen Structure-s, and the site_decoding_func-s and embedding_decoding_func-s would operate on images, where each row/column represents a unique building block. In the latter case, the current xtal2png representation starts to break down since it contains site coordinates. For arbitrary building blocks (e.g. of structural motifs), additional (invertible) information related to the composition and structure of the motifs would need to be present.
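As a minimal sketch of the easy half of this, a matched encoder/decoder pair for building-block labels could be generated from the labels themselves. `make_site_codec` is a hypothetical helper, and the labels "Zn4O" and "BDC" are only illustrative. It covers the label round trip only, not the invertible composition and structure information for motifs mentioned above.

```python
def make_site_codec(building_blocks):
    """Build a matched (site_encoding_func, site_decoding_func) pair.

    Code 0 is reserved for padding, so labels are numbered from 1.
    """
    vocabulary = sorted(set(building_blocks))
    to_code = {block: i + 1 for i, block in enumerate(vocabulary)}
    to_block = {code: block for block, code in to_code.items()}

    def site_encoding_func(blocks):
        return [to_code[b] for b in blocks]

    def site_decoding_func(codes):
        # Skip padding entries when mapping codes back to labels.
        return [to_block[c] for c in codes if c != 0]

    return site_encoding_func, site_decoding_func


# One node and three linkers, labeled by building block rather than by atom.
sites = ["Zn4O", "BDC", "BDC", "BDC"]
site_encoding_func, site_decoding_func = make_site_codec(sites)

codes = site_encoding_func(sites)
padded = codes + [0, 0]  # e.g. padding up to a fixed image width
recovered = site_decoding_func(padded)
```

Building both functions from one vocabulary keeps them in sync, which is what makes the round trip from labels to codes and back lossless.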
Following up from a chat with @kjappelbaum, there could be a hierarchy of building block types manifested in the layers. For example, the first layer encodes information about the atoms, the second layer encodes information about structural motifs (larger building blocks), etc.
From internal communication. By Berend Smit: