sandboxnu / cheminformatics
Cheminformatics analysis process that filters and clusters chemical compounds to determine the most medically viable compounds
What kind of fingerprint is rdkit fingerprintMols?
Look into the rdkit cookbook and also research what kinds of fingerprint options there are
The user has to enter SMILES strings into the website. This issue covers the UI component of the text box and saving the input in some way in the backend.
does it exist? can we scrape it from the catalogue?
Problem: Currently, we have the PAINS filtering implemented, but it just calls the script from https://github.com/sandboxnu/Lilly-Medchem-Rules. It would be nice to wrap this somehow, for example in a Python package.
similarity coefficient
https://www.rdkit.org/docs/api-docs.html
particularly looking into
rdkit.Chem.rdShapeHelpers.EncodeShape
rdkit.Chem.rdShapeHelpers.ShapeTanimotoDist
rdkit.Chem.Descriptors
rdkit.Chem.Descriptors3D
Research how each of the measures noted in the academic paper (ROC, PMI) factor into calculation
check if file exists before removing it
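A minimal sketch of that check, assuming the cleanup happens in Python. The helper name `remove_if_exists` is hypothetical. It attempts the delete and handles the missing-file case, which avoids a gap between checking and removing:

```python
from pathlib import Path


def remove_if_exists(path):
    """Delete the file at `path` if present; return True when something was removed."""
    target = Path(path)
    try:
        target.unlink()
        return True
    except FileNotFoundError:
        # Nothing to clean up, so this is not treated as an error.
        return False
```

On Python 3.8 and later, `Path(path).unlink(missing_ok=True)` does the same thing when the return value is not needed.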
how is the page going to look? what is the organization pattern?
Set up Cytoscape.js
Store the parameters in the backend so they can be reused in future (or for analytics?)
see slack for possible links
(render all mpo nodes as black)
Make sure structure is easy to read and visible
The centroid is the most connected element of a cluster produced by the Butina clustering algorithm. Every chemical in the cluster is connected to the centroid.
if it has 60% similarity to the centroid
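The centroid idea can be sketched in plain Python. This is a simplified Butina-style procedure, not RDKit's implementation: fingerprints are assumed to be sets of on-bit indices, the 0.6 threshold comes from the 60% figure above, and the function names are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def butina_clusters(fps, threshold=0.6):
    """Group fingerprints into clusters of indices, centroid listed first."""
    n = len(fps)
    # Neighbours of i: every other compound at or above the similarity threshold.
    neighbours = [
        {j for j in range(n) if j != i and tanimoto(fps[i], fps[j]) >= threshold}
        for i in range(n)
    ]
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # The centroid is the unassigned compound with the most unassigned
        # neighbours; ties go to the lowest index.
        centroid = max(sorted(unassigned), key=lambda i: len(neighbours[i] & unassigned))
        members = sorted(neighbours[centroid] & unassigned)
        clusters.append([centroid] + members)
        unassigned -= {centroid, *members}
    return clusters
```

Every member of a cluster is within the threshold of its centroid, but two members are not guaranteed to be within the threshold of each other. Compounds with no neighbours end up as singleton clusters.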
Using rdkit or by other means.
Maybe http://bkchem.zirael.org ->opensource alternative to chemdraw
Clustering takes much longer when you pick a low Tanimoto coefficient on a lot of data. Warn the user before they make this mistake and assume the program is broken.
Options:
-better error messaging - maybe an error screen with what went wrong
-prevent key error by ignoring duplicate SMILES.
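A minimal sketch of the second option, assuming the input arrives as a list of strings. The function name is hypothetical. It only catches exact string duplicates; two different SMILES strings for the same molecule would still both pass unless they are canonicalized first.

```python
def drop_duplicate_smiles(smiles_list):
    """Return the input SMILES with blanks and repeated strings removed, order kept."""
    seen = set()
    unique = []
    for raw in smiles_list:
        smi = raw.strip()
        if not smi or smi in seen:
            # Skip empty lines and anything already collected.
            continue
        seen.add(smi)
        unique.append(smi)
    return unique
```

Keeping the first occurrence and preserving order means row indices in later steps still line up with what the user typed, minus the skipped entries.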
We need to create clusters (lists of compounds) to pass to cytoscape to visualize. We could potentially use: https://scikit-learn.org/stable/modules/classes.html#module-sklearn.cluster
Framework: Django, Flask, or other (TBD)
How are we storing it: dictionary, DataFrame? Also, what values are we storing?
Allow users to choose which fingerprint they want, to make the app more versatile
If over 60% are going to be singletons
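One way to detect this case, sketched under assumptions: clusters are lists of compound indices, and "60%" is read here as the share of compounds that sit alone in their cluster (the issue does not say whether it means compounds or clusters). Both function names are hypothetical.

```python
def singleton_fraction(clusters):
    """Fraction of all compounds that ended up alone in a cluster."""
    total = sum(len(cluster) for cluster in clusters)
    if total == 0:
        return 0.0
    singletons = sum(1 for cluster in clusters if len(cluster) == 1)
    return singletons / total


def too_many_singletons(clusters, limit=0.6):
    """True when more than `limit` of the compounds are singletons."""
    return singleton_fraction(clusters) > limit
```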
ECFP4 as default
choose a color scheme for the table (think about how the circles were colored in the graph). Maybe have a couple popular/recommended preset options.
(implementation/backend)
without having to go back to the original select options step
hover, labels, etc.
(figure out what we use to cluster)
Similarity coefficient is a number between 0 and 1 describing how closely related inputs have to be in order to be grouped together (degree of similarity). We were thinking of making this an input box with a small scroll wheel or up/down arrows
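Whatever widget is chosen, the backend will probably want to validate the value it receives. A minimal sketch, assuming the coefficient arrives as text from the form. The function name is hypothetical.

```python
def parse_similarity_coefficient(text):
    """Convert form input to a float in [0, 1], raising ValueError otherwise."""
    value = float(text)  # raises ValueError for non-numeric input
    if not 0.0 <= value <= 1.0:
        # This branch also rejects NaN, since every comparison with NaN is False.
        raise ValueError(f"similarity coefficient must be between 0 and 1, got {text!r}")
    return value
```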