constructure's People

Contributors

simonboothroyd

Forkers

lilyminium

constructure's Issues

Non-sequential R-groups

Must R-groups be sequential? For example, I have a molecule that I number in my head by carbon:

[image]

It is easier for me to keep numbering by carbon than to renumber the R-groups as 1, 2, 3, 4. However, Constructor.enumerate_combinations validates substituents starting from R-group 1. Could the validation function just check the given scaffold for the R-groups it actually defines instead? I can implement this if you think this is a valuable addition to constructure.

e.g.

# Create the scaffold.
scaffold = Scaffold(
    smiles="C1(=C([C]([R4])=C2C(=[C]([R7])1)[N]([C]([R2])=[C]([R3])2)[H])O)O",
    r_groups={
        2: sub_groups,
        3: sub_groups,
        4: sub_groups,
        7: sub_groups,
    },
)

smiles = Constructor.enumerate_combinations(scaffold,
                                            substituents={
                                                2: substituents,
                                                3: substituents,
                                                4: substituents,
                                                7: substituents,
                                            })
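One way the suggested check could work, as a minimal sketch: read the R-group numbers directly from the scaffold SMILES and compare them with the keys of the substituents dictionary, instead of assuming they run from 1 to N. The helper names `find_r_groups` and `validate_substituents` are hypothetical and not part of constructure, and the `[R<n>]` pattern is assumed from the examples in these issues.

```python
import re
from typing import Dict, List, Set


def find_r_groups(scaffold_smiles: str) -> Set[int]:
    """Return the set of R-group numbers that appear in a scaffold SMILES."""
    return {int(number) for number in re.findall(r"\[R([0-9]+)\]", scaffold_smiles)}


def validate_substituents(
    scaffold_smiles: str, substituents: Dict[int, List[str]]
) -> None:
    """Raise if the substituent keys do not match the scaffold's R-groups."""
    defined = find_r_groups(scaffold_smiles)
    provided = set(substituents)

    missing = defined - provided
    unexpected = provided - defined

    if missing or unexpected:
        raise ValueError(
            f"R-group mismatch: missing {sorted(missing)}, "
            f"unexpected {sorted(unexpected)}"
        )


# The non-sequential scaffold from the issue (hypothetical usage).
scaffold_smiles = "C1(=C([C]([R4])=C2C(=[C]([R7])1)[N]([C]([R2])=[C]([R3])2)[H])O)O"
groups = find_r_groups(scaffold_smiles)

# Passes silently because the keys match the scaffold's R-groups exactly.
validate_substituents(scaffold_smiles, {n: ["[R]C"] for n in (2, 3, 4, 7)})
```

With this approach the numbering is only a label, so 2, 3, 4, 7 is as valid as 1, 2, 3, 4.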

Cannot parse R-groups > 9

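At present the constructor cannot parse SMILES strings with R-groups > 9, because the R-group regex

  • captures only a single digit (the + sits outside the capture group, so only the last digit of a multi-digit number is kept), and
  • excludes 0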

Code:

>>> from constructure.scaffolds import Scaffold
>>> from constructure.constructors import RDKitConstructor as Constructor
>>> # this smiles has 12 R-groups numbered 1 to 12
>>> smiles = "C1(C2C(C(C(C1([R1]))([R2]))([R3]))C(C3C(C2([R4]))C(C4C(C3([R5]))C(C(C(C4([R6]))([R7]))([R8]))([R9]))([R10]))([R11]))([R12])"
>>> r_groups = ["hydrogen"]
>>> scaffold = Scaffold(smiles=smiles,
...                     r_groups={i+1: r_groups for i in range(12)})
>>> Constructor.n_replaceable_groups(scaffold)
RDKit ERROR: [21:17:35] SMILES Parse Error: syntax error while parsing: C1(C2C(C(C(C1([1*]))([2*]))([3*]))C(C3C(C2([4*]))C(C4C(C3([5*]))C(C(C(C4([6*]))([7*]))([8*]))([9*]))([R10]))([1*]))([2*])

Changing the regex so that the quantifier is inside the capture group and 0 is allowed may help:

- re.sub(r"\(\[R([1-9])+]\)", r"([\1*])", scaffold.smiles)
+ re.sub(r"\(\[R([0-9]+)]\)", r"([\1*])", scaffold.smiles)
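The difference between the two patterns can be checked on a small fragment. This is only an illustration of the regex behaviour using the standard `re` module; the fragment string is made up for the demonstration and is not a scaffold from constructure.

```python
import re

# Current pattern: the "+" repeats the group, so only the last digit is captured,
# and "0" never matches.
OLD_PATTERN = r"\(\[R([1-9])+]\)"
# Proposed pattern: the whole number is captured, including any zeros.
NEW_PATTERN = r"\(\[R([0-9]+)]\)"
REPLACEMENT = r"([\1*])"

fragment = "C([R9])C([R10])C([R11])C([R12])"

old_result = re.sub(OLD_PATTERN, REPLACEMENT, fragment)
new_result = re.sub(NEW_PATTERN, REPLACEMENT, fragment)

print(old_result)  # C([9*])C([R10])C([1*])C([2*])
print(new_result)  # C([9*])C([10*])C([11*])C([12*])
```

The old output matches the error above: [R10] is left untouched, while [R11] and [R12] collapse to [1*] and [2*].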

Cannot include stereochemistry

Stereochemistry around the R-groups is not accounted for in the current regex. The substitution I'm using now is scaffold_smiles = re.sub(r"\(([\\/]*)\[R([0-9]+)]\)", r"(\1[\2*])", scaffold.smiles). Because RDKit reaction SMARTS did not preserve stereochemistry in the one test case I have looked at (rdkit/rdkit#4059), it may also require enumerating the potential stereoisomers in attach_substituents (as you suggested on Slack). I haven't seen any edge cases with OpenEye so far.
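As a quick check of what that substitution does, the sketch below applies it to a few hand-written fragments. The function name `to_dummy_atoms` and the example strings are assumptions made for illustration; only the pattern and replacement come from the issue.

```python
import re

# Pattern from the issue: an optional run of "/" or "\" bond markers before [R<n>].
STEREO_PATTERN = r"\(([\\/]*)\[R([0-9]+)]\)"
# Keep the bond markers (group 1) and turn R<n> into the dummy atom [<n>*].
STEREO_REPLACEMENT = r"(\1[\2*])"


def to_dummy_atoms(scaffold_smiles: str) -> str:
    """Replace ([R<n>]) with ([<n>*]), preserving any leading / or \\ markers."""
    return re.sub(STEREO_PATTERN, STEREO_REPLACEMENT, scaffold_smiles)


plain = to_dummy_atoms("C([R1])C")
same_side = to_dummy_atoms("C(/[R1])=C(/[R2])C")
opposite = to_dummy_atoms("C(/[R1])=C(\\[R12])C")

print(plain)      # C([1*])C
print(same_side)  # C(/[1*])=C(/[2*])C
print(opposite)   # C(/[1*])=C(\[12*])C
```

This only covers the SMILES rewriting step; whether the stereochemistry survives the later attachment step is a separate question, as noted above.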

Couple bugs in smiles_to_image_grid with RDKit backend

overview

  1. Draw.MolsToGridImage takes these options as keyword arguments, not positional ones
  2. Passing in legends=None (from the current default keyword argument labels=None) breaks the RDKit Draw.MolsToGridImage call, which tries to subscript the value
  3. The returned Image object has no attribute save, so smiles_to_image_grid fails when it calls image.save(output_path)

code

# Import the scaffold object which stores the scaffold definition.
from constructure.scaffolds import Scaffold
from constructure.constructors import RDKitConstructor as Constructor

from rdkit import Chem
sub_groups = ["hydrogen", "alkyl", "aryl", "acyl",
              "hetero", "halogen"]

# Create the scaffold.
scaffold = Scaffold(
    smiles="C1(=C([C]([R1])=C2C(=[C]([R2])1)[N]([C]([R3])=[C]([R4])2)[H])O)O",
    r_groups={
        1: sub_groups,
        2: sub_groups,
        3: sub_groups,
        4: sub_groups,
    },
)
substituents = ["[R][H]",  # -H
                "[R]C",  # -CH3
                "[R]C=O",
                "[R]N(=O)(=O)",
                "[R]O",
                "[R]-F",
                "[R]-Cl",
                "[R]-Br",
                "[R]C(=O)(O)",
               ]
smiles = Constructor.enumerate_combinations(scaffold,
                                            substituents={
                                                1: substituents,
                                                2: substituents,
                                                3: substituents,
                                                4: substituents,
                                            })
from constructure.utilities.rdkit import smiles_to_image_grid
smiles_to_image_grid(smiles, "generated.png", cols=4)

error traces

  1. Keyword arguments
>>> smiles_to_image_grid(smiles, "generated.png", cols=4)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-7-107a6a4db137> in <module>
      1 from constructure.utilities.rdkit import smiles_to_image_grid
----> 2 smiles_to_image_grid(smiles, "generated.png", cols=4)

~/pydev/constructure/constructure/utilities/utilities.py in wrapper(*args, **kwargs)
     60                 raise MissingOptionalDependency(import_path, False)
     61 
---> 62             return function(*args, **kwargs)
     63 
     64         return wrapper

~/pydev/constructure/constructure/utilities/rdkit.py in smiles_to_image_grid(smiles, output_path, labels, cols, cell_width, cell_height)
     30     molecules = [Chem.MolFromSmiles(pattern) for pattern in smiles]
     31 
---> 32     image = Draw.MolsToGridImage(molecules, cols, (cell_width, cell_height), labels)
     33     image.save(output_path)
     34 

TypeError: ShowMols() takes from 1 to 2 positional arguments but 4 were given
  2. legends=None
>>> smiles_to_image_grid(smiles, "generated.png", cols=4)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-107a6a4db137> in <module>
      1 from constructure.utilities.rdkit import smiles_to_image_grid
----> 2 smiles_to_image_grid(smiles, "generated.png", cols=4)

~/pydev/constructure/constructure/utilities/utilities.py in wrapper(*args, **kwargs)
     60                 raise MissingOptionalDependency(import_path, False)
     61 
---> 62             return function(*args, **kwargs)
     63 
     64         return wrapper

~/pydev/constructure/constructure/utilities/rdkit.py in smiles_to_image_grid(smiles, output_path, labels, cols, cell_width, cell_height)
     30     molecules = [Chem.MolFromSmiles(pattern) for pattern in smiles]
     31 
---> 32     image = Draw.MolsToGridImage(molecules, molsPerRow=cols, subImgSize=(cell_width, cell_height), legends=labels)
     33     image.save(output_path)
     34 

~/anaconda3/envs/constructure/lib/python3.8/site-packages/rdkit/Chem/Draw/IPythonConsole.py in ShowMols(mols, maxMols, **kwargs)
    194     for prop in ('legends', 'highlightAtoms', 'highlightBonds'):
    195       if prop in kwargs:
--> 196         kwargs[prop] = kwargs[prop][:maxMols]
    197   res = fn(mols, drawOptions=drawOptions, **kwargs)
    198   if kwargs['useSVG']:

TypeError: 'NoneType' object is not subscriptable
  3. No attribute save
smiles_to_image_grid(smiles, "generated.png", cols=4, labels=[])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-131e1400cb77> in <module>
      1 from constructure.utilities.rdkit import smiles_to_image_grid
----> 2 smiles_to_image_grid(smiles, "generated.png", cols=4, labels=[])

~/pydev/constructure/constructure/utilities/utilities.py in wrapper(*args, **kwargs)
     60                 raise MissingOptionalDependency(import_path, False)
     61 
---> 62             return function(*args, **kwargs)
     63 
     64         return wrapper

~/pydev/constructure/constructure/utilities/rdkit.py in smiles_to_image_grid(smiles, output_path, labels, cols, cell_width, cell_height)
     31 
     32     image = Draw.MolsToGridImage(molecules, molsPerRow=cols, subImgSize=(cell_width, cell_height), legends=labels)
---> 33     image.save(output_path)
     34 
     35 

AttributeError: 'Image' object has no attribute 'save'
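A possible way to work around all three problems, sketched under assumptions: pass the options by keyword, only pass legends when labels were given, and fall back to writing the raw image data when the returned object has no save method (the traceback shows the call being routed through IPythonConsole.ShowMols, so the returned object is assumed here to be an IPython display object exposing a `data` attribute). The helpers `grid_image_kwargs` and `save_grid_image` are hypothetical, the default values for cols, cell_width and cell_height are placeholders, and this is not the fix adopted in constructure.

```python
from typing import Any, Dict, List, Optional


def grid_image_kwargs(
    cols: int, cell_width: int, cell_height: int, labels: Optional[List[str]] = None
) -> Dict[str, Any]:
    """Build keyword arguments for Draw.MolsToGridImage, omitting legends if None."""
    kwargs: Dict[str, Any] = {
        "molsPerRow": cols,
        "subImgSize": (cell_width, cell_height),
    }
    if labels is not None:
        kwargs["legends"] = list(labels)
    return kwargs


def save_grid_image(image: Any, output_path: str) -> None:
    """Save either a PIL-style image (has .save) or a display object (has .data)."""
    if hasattr(image, "save"):
        image.save(output_path)
        return

    # Assumption: inside IPython the returned object carries its content in .data.
    data = getattr(image, "data", None)
    if data is None:
        raise TypeError(f"Cannot save object of type {type(image).__name__}")

    mode = "wb" if isinstance(data, bytes) else "w"
    with open(output_path, mode) as file:
        file.write(data)


def smiles_to_image_grid(
    smiles, output_path, labels=None, cols=4, cell_width=200, cell_height=200
):
    """Draw the molecules in a grid and write the image to output_path."""
    # Imported lazily so the helpers above can be used without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import Draw

    molecules = [Chem.MolFromSmiles(pattern) for pattern in smiles]
    image = Draw.MolsToGridImage(
        molecules, **grid_image_kwargs(cols, cell_width, cell_height, labels)
    )
    save_grid_image(image, output_path)
```

Keeping the keyword building and the saving in separate helpers makes each of the three failure modes easy to test without an RDKit or IPython installation.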

versions

  • Python: 3.8, 3.9
  • RDKit: 2021.03.1
  • constructure: current main
