Is there matbench benchmark result for Wrenformer? #72

hongshuh · 2023-05-20T16:46:40Z

I saw in the commit history that you have run some experiments on the Matbench benchmark. It's a very good idea and model, but I may not have enough computational resources to run it myself. Do you have final results?

janosh · 2023-05-20T17:21:46Z

I believe @hrushikesh-s is currently working on submitting Wrenformer to Matbench. Maybe he can tell you more.

In case you haven't seen them, there are some preliminary results for various Wrenformer hyperparameter settings plotted in #44.

janosh · 2023-05-20T17:23:39Z

@hongshuh Also, what are you planning on using Wrenformer for? If discovery, these results might interest you: https://matbench-discovery.materialsproject.org/preprint#results.

hongshuh · 2023-05-20T17:33:38Z

Yeah, I am also following the discovery benchmark. It seems to handle the task as a regression problem by predicting the energy above hull, rather than treating it as a classification task of identifying whether a material is stable or not. I am a bit puzzled by this approach, since the aim seems to be the identification of stable materials, which would intuitively seem to be a classification task.

janosh · 2023-05-20T17:56:51Z

> It seems to handle the task as a regression problem by predicting the energy above hull

That's right.

> rather than treating it as a classification task of identifying whether a material is stable or not. I am a bit puzzled by this approach since the aim seems to be the identification of stable materials, which would intuitively seem to be a classification task.

I have some preliminary results which suggest doing direct classification does not improve over regression. But I think that's definitely something that could be investigated further. If you want to check how well a Wrenformer stability classifier performs compared to the Wrenformer regressor, that would be a very welcome contribution to MBD!
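For anyone wanting to try that comparison, here is a minimal sketch of how a regressor's energy-above-hull predictions can be scored as a stability classifier. It assumes a material counts as stable when its energy above hull is at or below 0 eV/atom. The function names `classify_stable` and `classification_metrics`, the threshold default and the toy numbers are invented for illustration and are not taken from the MBD codebase.

```python
def classify_stable(e_above_hull, threshold=0.0):
    """Label each material stable (True) if its energy above hull is <= threshold."""
    return [e <= threshold for e in e_above_hull]


def classification_metrics(true_labels, pred_labels):
    """Accuracy, F1 and false positive rate, with 'stable' as the positive class."""
    pairs = list(zip(true_labels, pred_labels))
    tp = sum(t and p for t, p in pairs)              # stable, predicted stable
    fp = sum((not t) and p for t, p in pairs)        # unstable, predicted stable
    fn = sum(t and (not p) for t, p in pairs)        # stable, predicted unstable
    tn = sum((not t) and (not p) for t, p in pairs)  # unstable, predicted unstable
    accuracy = (tp + tn) / len(pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"accuracy": accuracy, "f1": f1, "fpr": fpr}


# Toy values in eV/atom, invented for illustration only.
e_hull_true = [-0.02, 0.00, 0.05, 0.30, -0.01, 0.12]
e_hull_pred = [-0.01, 0.03, -0.02, 0.25, -0.04, 0.10]

metrics = classification_metrics(
    classify_stable(e_hull_true), classify_stable(e_hull_pred)
)
print(metrics)
```

A directly trained classifier would be scored with the same `classification_metrics` call, which keeps the two approaches comparable on identical labels.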

janosh · 2023-05-20T18:01:34Z

This section from Bartel et al. 2021 is also relevant here:

> As an additional demonstration, all representations (except Roost—see “Methods” for details) were also trained as classifiers (instead of regressors), tasked with predicting whether a given compound is stable (ΔHd ≤ 0) or unstable (ΔHd > 0). The accuracies, F1 scores, and false positive rates are tabulated in Supplementary Table 2 and found to be only slightly better (accuracies < 80%, F1 scores < 0.75, false positive rates > 0.15) than those obtained by training on ΔHf (Fig. 4) or ΔHd (Supplementary Fig. 4).

Here's Table S2:

hongshuh · 2023-05-20T18:09:09Z

> I have some preliminary results which suggest doing direct classification does not improve over regression.

Thanks! Maybe the regression values provide more information to the model than just "stable" or "unstable" labels.
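One way to picture this: turning the target into a binary label maps very different energies to the same class, so a classifier's training signal cannot distinguish a material barely above the hull from one far above it. The snippet below is only an illustration with invented energies and hypothetical material names, again assuming stable means an energy above hull at or below 0 eV/atom.

```python
# Toy energies above hull in eV/atom (invented for illustration).
e_above_hull = {"material_a": -0.050, "material_b": 0.001, "material_c": 0.500}

# Binary stability labels, assuming stable means e_above_hull <= 0.
labels = {name: e <= 0 for name, e in e_above_hull.items()}

# material_b and material_c share the label "unstable" even though one sits
# 1 meV/atom above the hull and the other 500 meV/atom above it.
same_label = labels["material_b"] == labels["material_c"]
energy_gap = e_above_hull["material_c"] - e_above_hull["material_b"]

print(labels, same_label, energy_gap)
```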

janosh added the question Further information is requested label May 20, 2023

hongshuh mentioned this issue Jun 12, 2023

Obtain E_above_hull predictions janosh/matbench-discovery#40

Closed
