Comments (3)
Hi! This is the intended behavior. The feature function we use tokenizes rare atoms as unknown values, similar to the NLP convention of representing rare words with [UNK]. This helps the model generalize better across rare atoms. The warning is printed to tell the user which kinds of rare atoms were encountered. We will revise the warning message to reduce ambiguity.
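To illustrate the idea, here is a minimal sketch of a one-hot atom feature with a shared unknown slot. It is not TorchDrug's actual feature function: the vocabulary `ATOM_VOCAB` and the helper `onehot_with_unknown` are hypothetical names chosen for this example.

```python
import warnings

# Hypothetical vocabulary of common atoms; a real feature function may use a different one.
ATOM_VOCAB = ["H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"]


def onehot_with_unknown(symbol, vocab=ATOM_VOCAB):
    """One-hot encode an atom symbol, reserving the last slot for unknown atoms."""
    feature = [0] * (len(vocab) + 1)
    if symbol in vocab:
        feature[vocab.index(symbol)] = 1
    else:
        # Rare atoms all share the last slot, like [UNK] for rare words in NLP.
        warnings.warn(f"Unknown value `{symbol}`, mapped to the unknown slot")
        feature[-1] = 1
    return feature
```

With this sketch, `onehot_with_unknown("C")` sets the slot for carbon, while a rare atom such as `"Se"` triggers a warning and sets only the last slot, so the model sees every rare atom as the same token.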
from torchdrug.
Many thanks.
Is there already a way to get the list of SMILES that lead to an RDKit error?
This is actually a warning rather than an error in RDKit. RDKit prints it as an error, but it can still return the nearest valid molecule for the SMILES. If RDKit fails to parse a SMILES, it will return an empty molecule, and in that case TorchDrug will log the SMILES on the screen. We may further wrap this kind of RDKit warning for a better user experience. You can safely ignore these cases if they occur less than 1% of the time.
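In the meantime, one way to collect the SMILES that fail to parse is to check them with RDKit directly. The sketch below relies on RDKit's `Chem.MolFromSmiles` returning `None` when parsing fails; the helper name `find_unparsable_smiles` and its `parse` parameter are hypothetical, and the helper is not part of TorchDrug. It only catches hard parse failures, not the warnings discussed above.

```python
def find_unparsable_smiles(smiles_list, parse=None):
    """Return the SMILES strings that the parser could not turn into a molecule."""
    if parse is None:
        # Imported lazily so that a custom parser can be used without RDKit installed.
        from rdkit import Chem
        parse = Chem.MolFromSmiles
    failed = []
    for smiles in smiles_list:
        # RDKit's MolFromSmiles returns None when it cannot parse the string.
        if parse(smiles) is None:
            failed.append(smiles)
    return failed
```

Calling `find_unparsable_smiles(my_smiles)` on your own list of strings (here `my_smiles` is a placeholder) returns the entries RDKit rejected, which you can then inspect or drop before building the dataset.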