Fandom

Chemistry Toolkit Rosetta Wiki

Break rotatable bonds and report the fragments

21pages on
this wiki
Add New Page
Talk0 Share

This task is meant to give examples of how to add and delete atoms and bonds from the molecular graph, using the graph API.

One of the researchers I worked with asked me for a program which took an input molecule and broke it apart at the rotatable bonds. When a bond breaks, each fragment gets attached to an atom with atomic number 0, that is "*".

For a simple example, "c1ccccc1C=O" would become "c1ccccc1*" and "*C=O".

"Rotatable bond" was defined as the bond in the following SMARTS

[!$([NH]!@C(=O))&!D1&!$(*#*)]-&!@[!$([NH]!@C(=O))&!D1&!$(*#*)]

but a toolkit may have a built-in definition of rotatable bond.


ImplementationEdit

Write a function or method which takes a molecule as input and finds all of the rotatable bonds using either the toolkit's built-in definition or the above SMARTS definition. Delete each bond from the structure and add an atom with atomic number 0 to each of the atoms which was at the end of the bond. The function may modify the molecule in-place or return a new molecule.

Use the function to write a program which takes the structure "c1ccc2c(c1)C(=NC(C(=O)N2CC(=O)O)Cc3ccc(cc3)O)c4cccc(c4)O" (which is the perceived aromatic form of PubChem CID 391758), breaks the rotatable bonds, and prints the fragments out in SMILES form, one per line.

The output should be something like

*c1(ccc(cc1)O)*
*c1(cccc(c1)O)*
*[CH]1(C(=O)[N](c2ccccc2C(=N1)(*)*)(*)*)*
*[CH2](*)(*)*
*[CH2](*)(*)*
*C(=O)(*)O

although the specific strings will depend on the toolkit's canonicalization algorithm.

If the toolkit can do this with SMIRKS then feel free to show how that works, in addition to a solution which manipulates the graph directly. (Frankly, I haven't figured out how to do it with SMIRKS in OEChem.)

OpenBabel/RubabelEdit

require 'rubabel'
smarts = "[!$([NH]!@C(=O))&!D1&!$(*#*)]-&!@[!$([NH]!@C(=O))&!D1&!$(*#*)]"
mol = Rubabel["c1ccc2c(c1)C(=NC(C(=O)N2CC(=O)O)Cc3ccc(cc3)O)c4cccc(c4)O"]
mol.matches(smarts).each do |atom1, atom2|
  mol.delete(atom1.get_bond(atom2))
  [atom1, atom2].each do |old_a|
    mol.add_bond!(old_a, mol.add_atom!(0))
  end
end

OpenEye/PythonEdit

from openeye.oechem import *
 
rotatable_pat = OESubSearch("[!$([NH]!@C(=O))&!D1&!$(*#*)]-&!@[!$([NH]!@C(=O))&!D1&!$(*#*)]")
 
def break_rotatable_bonds(mol):
    # Get the rotatable bonds
    rotatable_bonds = []
    for match in rotatable_pat.Match(mol):
        rotatable_bonds.extend(match.GetTargetBonds())
 
    # Delete each bond
    for bond in rotatable_bonds:
        # Get the atoms it's bonded too then delete the atom
        bond_atoms = bond.GetBgn(), bond.GetEnd()
        mol.DeleteBond(bond)
        # Add the dummy atom to each of the ex-bond's atoms
        for atom in bond_atoms:
            new_atom = mol.NewAtom(0)
            mol.NewBond(atom, new_atom, 1)
 
 
mol = OEGraphMol()
OEParseSmiles(mol,
              "c1ccc2c(c1)C(=NC(C(=O)N2CC(=O)O)Cc3ccc(cc3)O)c4cccc(c4)O")
OEAssignAromaticFlags(mol)
 
break_rotatable_bonds(mol)
print OECreateCanSmiString(mol).replace(".", "\n")

SMIRKS versionEdit

from openeye.oechem import *
 
def main():
    '''
    '''
    SMIRKS = "[!$([NH]!@C(=O))&!D1&!$(*#*):1]-&!@[!$([NH]!@C(=O))&!D1&!$(*#*):2]>>[*:1]*.*[*:2]"
 
    molecule = OEGraphMol()
    OEParseSmiles(molecule, 'c1ccc2c(c1)C(=NC(C(=O)N2CC(=O)O)Cc3ccc(cc3)O)c4cccc(c4)O')
 
    libgen = OELibraryGen(SMIRKS)
    libgen.SetValenceCorrection(True)
    libgen.SetExplicitHydrogens(False)
    libgen.SetStartingMaterial(molecule, 0)
 
    for product in libgen.GetProducts():
        partcount,partlist = OEDetermineComponents(product)
        pred = OEPartPredAtom(partlist)
 
        for i in range(partcount):
            fragment = OEGraphMol()
            pred.SelectPart(i+1)
            OESubsetMol(fragment, product, pred)
 
            print OECreateCanSmiString(fragment),
 
        print
main()
 
"*C1=NC(C(=O)N(c2c1cccc2)CC(=O)O)Cc3ccc(cc3)O" "*c1cccc(c1)O"
"*C1C(=O)N(c2ccccc2C(=N1)c3cccc(c3)O)CC(=O)O" "*Cc1ccc(cc1)O"
"*N1c2ccccc2C(=NC(C1=O)Cc3ccc(cc3)O)c4cccc(c4)O" "*CC(=O)O"
"*CN1c2ccccc2C(=NC(C1=O)Cc3ccc(cc3)O)c4cccc(c4)O" "*C(=O)O"
"*CC1C(=O)N(c2ccccc2C(=N1)c3cccc(c3)O)CC(=O)O" "*c1ccc(cc1)O"

RDKit/PythonEdit

from rdkit import Chem
 
patt = Chem.MolFromSmarts('[!$([NH]!@C(=O))&!D1&!$(*#*)]-&!@[!$([NH]!@C(=O))&!D1&!$(*#*)]')
mol = Chem.MolFromSmiles("c1ccc2c(c1)C(=NC(C(=O)N2CC(=O)O)Cc3ccc(cc3)O)c4cccc(c4)O")
 
# find the rotatable bonds
bonds = mol.GetSubstructMatches(patt)
 
# create an editable molecule, break the bonds, and add dummies:
em = Chem.EditableMol(mol)
nAts = mol.GetNumAtoms()
for a,b in bonds:
    em.RemoveBond(a,b)
    em.AddAtom(Chem.Atom(0))
    em.AddBond(a,nAts,Chem.BondType.SINGLE)
    em.AddAtom(Chem.Atom(0))
    em.AddBond(b,nAts+1,Chem.BondType.SINGLE)
    nAts+=2
p = em.GetMol()
Chem.SanitizeMol(p)
 
smis = [Chem.MolToSmiles(x,True) for x in Chem.GetMolFrags(p,asMols=True)]
for smi in smis: print smi
 
# ----------------------------------
# There's actually an easier way to do this since the BRICS decomposition code has
# exactly the function we need:
from rdkit.Chem import BRICS
bonds = [((x,y),(0,0)) for x,y in bonds]
p = BRICS.BreakBRICSBonds(mol,bonds=bonds)
 
smis = [Chem.MolToSmiles(x,True) for x in Chem.GetMolFrags(p,asMols=True)]
for smi in smis: print smi


Cactvs/TclEdit

set eh [ens create "c1ccc2c(c1)C(=NC(C(=O)N2CC(=O)O)Cc3ccc(cc3)O)c4cccc(c4)O"]
prop setparam E_SMILES useexplicith 1
filter create rotbonds property B_ROTATABILITY operator = value1 yes
foreach b [ens bonds $eh rotbonds] {
        lassign [bond atoms $eh $b] a1 a2
        bond delete $eh $b
        bond create $eh [list $a1 [atom add $eh *]]
        bond create $eh [list $a2 [atom add $eh *]]
}
foreach fragsmiles [ens get $eh M_SMILES] {
        puts $fragsmiles
}

Output:

C1(=C(C2=C(C(=C1[H])[H])N(C(C(N=C2*)([H])*)=O)*)[H])[H]
C([H])([H])(*)*
C(=O)(O[H])*
C([H])([H])(*)*
C3(=C(C(=C(C(=C3[H])[H])O[H])[H])[H])*
C4(=C(C(=C(C(=C4[H])[H])[H])O[H])[H])*

And here is the requested version using SMIRKS with the provided rotatable bond definition pattern:

set eh [ens create "c1ccc2c(c1)C(=NC(C(=O)N2CC(=O)O)Cc3ccc(cc3)O)c4cccc(c4)O"]
prop setparam E_SMILES useexplicith 1
set transform {[!$([NH]!@C(=O))&!D1&!$(*#*):1]-&!@[!$([NH]!@C(=O))&!D1&!$(*#*):2]>>[*:1]*.[*:2]*}
set eh [ens transform $eh [list $transform] forward exhaustive]
foreach fragsmiles [ens get $eh M_SMILES] {
        puts $fragsmiles
}

The output is the same. Using SMIRKS is slower than the first version, though, and it requires toolkit version 3.384 or later. Previous versions ignored non-standard atoms for degree computations, without the option to include query atoms, and since the D attribute is used in the SMIRKS transform, the outcome was not what is expected.

Cactvs/PythonEdit

The first solution scripted in Python looks like this:

e=Ens("c1ccc2c(c1)C(=NC(C(=O)N2CC(=O)O)Cc3ccc(cc3)O)c4cccc(c4)O")
Prop.Setparam('E_SMILES','useexplicith',1)
f=Filter('rotbonds','property','B_ROTATABILITY','operator','=','value1','yes')
for b in e.bonds(f):
   (a1,a2)=b.atoms()
   b.delete()
   Bond(e,(a1,Atom(e,'*')))
   Bond(e,(a2,Atom(e,'*')))
for fragsmiles in e.M_SMILES:
   print(fragsmiles)

The output is identical.

Ad blocker interference detected!


Wikia is a free-to-use site that makes money from advertising. We have a modified experience for viewers using ad blockers

Wikia is not accessible if you’ve made further modifications. Remove the custom ad blocker rule(s) and the page will load as expected.