ACSF descriptor structure about dscribe HOT 5 CLOSED

singroup commented on May 29, 2024

ACSF descriptor structure

from dscribe.

Comments (5)

lauri-codes commented on May 29, 2024

Ok, good.

I would indeed agree that the fact that SOAP works to some degree with KRR is due to the central atom being encoded in the output. The central atom will get a very big output for the l=0 terms (it looks completely angularly uniform from the center). So if the number of species, nmax and lmax are kept moderate, these l=0 terms will be enough for a kernel to distinguish the central atom to some degree.

lauri-codes commented on May 29, 2024

Hi @sirmarcel,

Thanks for the feedback! Good to hear that you're interested in interfacing with DScribe: this is exactly the kind of usage we were aiming for! (Some shameless self-promotion: do also consider using MBTR from DScribe, of course after benchmarking it against qmmlpack for speed and flexibility :) We got some nice speedups for it in DScribe 0.2.8)

The very first extra element you are getting corresponds to the symmetry function G1 as defined in [1] (it is simply the cutoff function). Since this symmetry function has no parameters besides rcut, there is no constructor argument for it, which understandably makes it a bit obscure.
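For reference, a minimal sketch of G1 as defined in [1]: the sum of the cosine cutoff function over the neighbours of the central atom. The function names and the example distances are hypothetical, and this is not DScribe's implementation; it only illustrates why rcut is the sole parameter.

```python
import math


def cutoff(r, r_cut):
    """Cosine cutoff from [1]: 0.5 * (cos(pi * r / r_cut) + 1) inside r_cut, 0 outside."""
    if r > r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def g1(distances, r_cut):
    """G1 for one central atom: the cutoff function summed over neighbour distances."""
    return sum(cutoff(r, r_cut) for r in distances)


# Hypothetical neighbour distances around one central atom (same length unit as r_cut).
neighbour_distances = [3.0, 6.0, 7.0]
g1_value = g1(neighbour_distances, r_cut=6.0)
print(g1_value)
```

Only the neighbour at half the cutoff radius contributes noticeably here; the one exactly at rcut and the one beyond it contribute nothing.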

We would greatly appreciate any ideas on how to make our ACSF interface better. Maybe we should add a flag to turn G1 on or off, or else simply document the ACSF output better. Any suggestions?

[1] Jörg Behler, Atom-centered symmetry functions for constructing high-dimensional neural network potentials, J. Chem. Phys. (2011) 134, 074106

sirmarcel commented on May 29, 2024

Hi Lauri,

I'm planning to add interfaces for at least SOAP, ACSF and MBTR, and a side-effect of our benchmark paper will (hopefully) be a reasonably thorough benchmark of your implementations against the "canonical" ones. :) (So far it looks like there is no particularly large difference for SOAP, but let's see.)

Thanks for the answer -- that explains it! I would definitely suggest adding a flag to turn it on/off, and more clearly documenting how the output vector is structured. For SOAP, the pseudocode is very clear!

Otherwise the interface is okay, I think. Is there any particular reason you chose a fixed cutoff for all the SFs? (cmlkit chooses a very different direction in terms of interface; if you're interested, the SF interface is here. Both approaches are valid! It's just a matter of taste/use case.)

One more question about the ACSF output: Is the output structured such that there are blocks corresponding to the element type of the central atom? I.e. can I rely on the Euclidean distance between two descriptors being large when their central atoms are of different types? It looks like this is the case, I just want to make sure.

Thanks!

lauri-codes commented on May 29, 2024

Hi Marcel,

That benchmark sounds super interesting! Please keep us updated if you find anything that might be of interest to us (both good and bad news are appreciated).

We have only applied very simple versions of ACSF in our studies, and thus our interface with a single cutoff has worked and there has not been any real incentive to update it. But I have been thinking about improving it for the more serious stuff with different cutoffs for different species etc. Your interface that allows universal and species-specific SFs seems like a reasonably good choice. I will consider updating to something similar :)

The output from our ACSF does not in any way encode the species of the central atom. This is a very important point that I will have to add to the tutorial. We wanted to keep it this way so that the output size does not grow too much and because it is then quite easy to introduce your own stratification if needed. This stratification also depends on the ML model you use: If working with kernel methods, maybe you want to separate different central species to completely non-overlapping locations in the feature vector? If using a decision tree or NN, maybe introducing a single additional feature that one-hot-encodes the species would work? Another way would be to train a separate model for each central species. There are quite a few ways of handling this, so we want to stick with the output that can most easily be used for any of them. The detailed output structure of ACSF is explained more thoroughly in our preprint. I will update the ACSF tutorial to include similar pseudocode as for SOAP to make it clearer.
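As a rough illustration of the one-hot option mentioned above, here is a minimal sketch that appends a species indicator to a descriptor vector. The function name, the species list and the descriptor values are hypothetical; nothing here is part of the DScribe API, and the descriptor is assumed to be a plain sequence of floats.

```python
def append_one_hot(descriptor, central_species, species):
    """Return the descriptor followed by a one-hot encoding of the central species.

    `species` is the fixed, ordered list of all species in the dataset
    (an assumption of this sketch).
    """
    if central_species not in species:
        raise ValueError(f"unknown species: {central_species}")
    one_hot = [1.0 if s == central_species else 0.0 for s in species]
    return list(descriptor) + one_hot


# Hypothetical descriptor for an atom whose species is "O".
species = ["H", "C", "O"]
features = append_one_hot([0.5, 1.25], "O", species)
print(features)  # [0.5, 1.25, 0.0, 0.0, 1.0]
```

The feature vector grows only by the number of species, which suits models that can learn to use the extra features, such as decision trees or neural networks.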

Hope this helps

sirmarcel commented on May 29, 2024

Hi Lauri,

Ok, that makes sense. I'll add the same post-processing we apply to the RuNNer ACSFs, which work the same way (i.e. not structurally distinguishing the central atom), and simply make zero-padded blocks.
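A minimal sketch of what such zero-padded blocks could look like, assuming each descriptor is a plain sequence of floats of equal length; the function name and the example values are hypothetical and this is not the actual cmlkit or RuNNer post-processing code.

```python
import math


def to_species_blocks(descriptor, central_species, species):
    """Place the descriptor in the block reserved for its central species.

    The output has len(species) blocks of len(descriptor) entries each;
    every block except the one for `central_species` stays zero.
    """
    n = len(descriptor)
    out = [0.0] * (n * len(species))
    start = species.index(central_species) * n
    out[start:start + n] = descriptor
    return out


# Two atoms with identical raw descriptors but different central species.
species = ["H", "O"]
a = to_species_blocks([1.0, 2.0], "H", species)
b = to_species_blocks([1.0, 2.0], "O", species)
print(a)  # [1.0, 2.0, 0.0, 0.0]
print(b)  # [0.0, 0.0, 1.0, 2.0]
print(math.dist(a, b))  # no longer zero, unlike for the raw descriptors
```

Because the blocks never overlap, the squared Euclidean distance between descriptors of different central species is the sum of their squared norms, so a distance-based kernel treats them as dissimilar.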

Interesting side-note: As far as I can tell, neither SOAP implementation does anything like this, and yet it still works without post-processing for KRR. Probably because including the central atom in the density adds enough information...?

Thank you for your help!
