Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear Aromatic Flags for init_from_rdkit_mol #407

Closed

Conversation

stevenkbennett
Copy link
Contributor

@stevenkbennett stevenkbennett commented Nov 5, 2021

Related Issues: #396
Requested Reviewers: @lukasturcani, @andrewtarzia, @annabelbasford

This attempts to fix the issue when loading molecules with aromatic flags from RDKit. By removing aromatic flags, the aromatic bonds in the molecule (with bond order 1.5) are returned to alternating single and double bonds (orders 1 and 2 respectively).

Risks are that if the Kekulization method fails for a molecule, an error will be thrown by RDKit - though I think this is highly unlikely to occur in general use.

  • src/stk/molecular/building_block.py:.BuildingBlock._init_from_rdkit_mol(): The Kekulization method was added to the first line of this function.

Visually inspected molecule for changes. For an example, see the following code before and after the PR:

benzene = rdkit.MolFromSmiles('c1ccccc1')
rdkit.EmbedMolecule(benzene)
stk.BuildingBlock.init_from_rdkit_mol(benzene).to_rdkit_mol()

@andrewtarzia
Copy link
Collaborator

I am happy with a solution to this. Can I suggest adding some test molecules for this init function that test the issue?

I.e. defined with and without aromaticity?

@stevenkbennett stevenkbennett reopened this Nov 5, 2021
@lukasturcani
Copy link
Owner

lukasturcani commented Nov 5, 2021

I think this will incur a performance penalty, and is why this was omitted. Is a warning in the docstring not enough? I'm not sure it should be the responsibility of this function to sanitize user input, especially if it comes with performance penalty.

@stevenkbennett
Copy link
Contributor Author

I'm not sure a warning in the docstring is sufficient alone for this issue. At the least, I think a warning should be issued to the user that their molecular structure has changed after calling this function. It's additionally confusing because initialising the building block directly from SMILES using stk.BuildingBlock('c1ccccc1)` yields a molecule with the correct bonding.

@lukasturcani
Copy link
Owner

lukasturcani commented Nov 5, 2021

At the least, I think a warning should be issued to the user that their molecular structure has changed after calling this function

Hmm, molecular structure should not be changed by stk. The user should never pass a non-kekulized molecule to stk, and this should be made clear in the docstring. Therefore stk does not change molecular structure, because the user should pass in a structure that is valid and does not need to change. The user should Kekulize their rdkit molecule if they use stk, and the docs should be edited to make that super clear, imo.

@stevenkbennett
Copy link
Contributor Author

stevenkbennett commented Nov 5, 2021

I would argue that molecular structure is changed by stk using this function. If an aromatic molecule is loaded, you would expect that bonding to be maintained. Also, I believe the rdkit.KekulizeFunction (with default arguments) sets the bonds orders to 1.5, so a non-kekulized molecule should be passed to stk, if my understanding is correct.

@lukasturcani
Copy link
Owner

But if the user loaded in an aromatic molecule, they made a mistake. If the user made a mistake, I would expect things to go wrong.

I think Kekulize sets bond orders to integers -> think the Kekule structure of benzene, it's switching between two sets of structures, each of which has integer bonds. Pretty confident, but am willing to eat my words if you send me a code snippet that proves me wrong on that count.

@stevenkbennett
Copy link
Contributor Author

You're completely right about Kekulization, sorry I got it the wrong way round!
I got confused because RDKit sanitizes molecules when loading from SMILES as a default argument so all SMILES will be kekulized by default.
I think this is also the case in stk when loading a BB directly from SMILES (line 252 in building_blocks.py molecule = rdkit.AddHs(rdkit.MolFromSmiles(smiles)))

@stevenkbennett
Copy link
Contributor Author

Closing this PR as made a new one in #414 with docstring edits.

@stevenkbennett stevenkbennett deleted the aromatic_flagger branch February 18, 2022 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants