Comments (4)
Hi @wushanyun64,
It depends on what you mean by calculating SOAP on a subset of elements. You can definitely just select a subset of chemical species and provide it in the `species` argument, then remove all structures in the dataset that contain species outside this subset. But I guess this is not what you mean?
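For reference, the filtering step described above could be sketched roughly like this. It is only an illustration: structures are represented here as plain lists of chemical symbols, and `filter_by_species` and `dataset` are hypothetical names, not part of dscribe. With ASE, the symbols would typically come from `Atoms.get_chemical_symbols()`.

```python
def filter_by_species(structures, allowed):
    """Keep only structures whose elements all belong to `allowed`.

    `structures` is assumed to be a list of symbol lists (hypothetical
    stand-in for a real dataset of atomic structures).
    """
    allowed = set(allowed)
    return [symbols for symbols in structures if set(symbols) <= allowed]


# Toy dataset: each entry lists the chemical symbols of one structure.
dataset = [
    ["H", "H", "O"],
    ["C", "H", "H", "H", "H"],
    ["Na", "Cl"],
    ["C", "O", "O"],
]

# The same subset would then be passed as the descriptor's species list.
species_subset = ["H", "C", "O"]
kept = filter_by_species(dataset, species_subset)
```

Here `kept` contains every structure except the one with Na and Cl, since that one uses elements outside the chosen subset.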
If you instead wish to approximate the chemical and structural diversity of the original dataset, there are different things you can try:
- Use a dimensionality reduction technique of your choice (e.g. PCA, t-SNE) before regression.
- Try disabling the crossover terms in the SOAP spectrum with `crossover=False` (currently supported only for the gto radial basis). The output size then scales linearly with the number of chemical species, instead of the default quadratic scaling. However, the output becomes less detailed, as cross-terms between different elements are dropped entirely. It is up to you to decide whether this is acceptable in your application.
- You can group multiple chemical elements under a "pseudo-species", e.g. use one species for each column of the periodic table. There are some clever ways of doing such grouping that you can find in the literature.
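To make the last two options a bit more concrete, here is a minimal sketch in plain Python. It does not call dscribe. `GROUP_OF`, `to_pseudo_species`, and `n_species_blocks` are hypothetical names, the group lookup is a small partial table chosen for illustration, and the function counts species blocks (one per species, or one per unordered species pair), not the exact number of output features.

```python
# Hypothetical, partial lookup: element symbol -> periodic-table group (column).
GROUP_OF = {
    "Li": 1, "Na": 1, "K": 1,
    "O": 16, "S": 16,
    "F": 17, "Cl": 17, "Br": 17,
}


def to_pseudo_species(symbols):
    """Replace each element symbol with a label for its periodic-table group."""
    return [f"G{GROUP_OF[s]}" for s in symbols]


def n_species_blocks(n_species, crossover=True):
    """Count species blocks in a power-spectrum-like output.

    With cross terms there is one block per unordered species pair,
    S * (S + 1) / 2. Without them there is one block per species, S.
    """
    if crossover:
        return n_species * (n_species + 1) // 2
    return n_species


structure = ["Na", "Cl", "K", "Br", "Li", "F"]
pseudo = to_pseudo_species(structure)

n_real = len(set(structure))   # 6 distinct elements
n_pseudo = len(set(pseudo))    # 2 distinct groups (G1 and G17)

blocks_real = n_species_blocks(n_real)                       # quadratic growth
blocks_real_no_cross = n_species_blocks(n_real, crossover=False)  # linear growth
blocks_pseudo = n_species_blocks(n_pseudo)
```

For this toy structure the six elements collapse to two pseudo-species, so the number of species-pair blocks drops from 21 to 3. Feeding the grouped labels into an actual descriptor would still require relabeling the atoms with whatever species identifiers that descriptor accepts.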
Hope this helps
from dscribe.
Hi Lauri:
Thank you for the advice, it's really helpful. I definitely want to preserve the diversity of the structures in the dataset, because there's no dominant combination of atomic species there. Let me try PCA first and see what happens.
Thanks again.
Hi Lauri:
Thanks again for your help. Here's a quick follow-up question about your third suggestion: the "pseudo-species" method sounds very interesting to me, but it seems hard to find relevant information online. Could you elaborate a little more on that, or simply point me to a paper? Thank you!
Here are two papers that I have come across: