ACSF descriptor structure (dscribe issue, closed)

singroup commented on May 29, 2024
ACSF descriptor structure

from dscribe.

Comments (5)

lauri-codes commented on May 29, 2024

Ok, good.

I would indeed agree that the fact that SOAP works to some degree with KRR is due to the central atom being encoded in the output. The central atom will get a very big output for the l=0 terms (it looks completely angularly uniform from the center). So if the number of species, nmax and lmax are kept moderate, these l=0 terms will be enough for a kernel to distinguish the central atom to some degree.

lauri-codes commented on May 29, 2024

Hi @sirmarcel,

Thanks for the feedback! Good to hear that you're interested in interfacing with DScribe: this is exactly the kind of usage we were aiming for! (Some shameless self-promotion: do also consider using MBTR from DScribe, of course after benchmarking it against qmmlpack for speed and flexibility :) We got some nice speedups for it in DScribe 0.2.8)

The very first extra element you are getting corresponds to the symmetry function G1 as defined in [1] (it is simply the cutoff function). Since this symmetry function has no parameters besides rcut, there is no constructor argument for it, which understandably makes it a bit obscure.
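To make that first element concrete, here is a minimal sketch of G1 as defined in [1], assuming the cosine cutoff function from that paper. The names `cutoff` and `g1` are illustrative only and are not part of the DScribe API, and the sketch ignores how DScribe groups neighbors by species.

```python
import math

def cutoff(r, rcut):
    """Cosine cutoff from [1]: 0.5 * (cos(pi * r / rcut) + 1) inside rcut, 0 outside."""
    if r > rcut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rcut) + 1.0)

def g1(center, neighbors, rcut):
    """G1 for one central atom: the sum of cutoff values over the neighbor positions."""
    return sum(cutoff(math.dist(center, p), rcut) for p in neighbors)

# Hypothetical geometry: one neighbor at rcut / 2 and one beyond rcut.
center = (0.0, 0.0, 0.0)
neighbors = [(1.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
value = g1(center, neighbors, rcut=3.0)
print(value)  # the first neighbor contributes about 0.5, the second contributes 0
```

Because rcut is the only parameter, there is nothing to configure for this function, which is why it shows up in the output without a matching constructor argument.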

We would greatly appreciate any ideas on how to make our ACSF interface better. Maybe we should add a flag to turn G1 on or off, or else simply document the ACSF output better. Any suggestions?

[1] Jörg Behler, Atom-centered symmetry functions for constructing high-dimensional neural network potentials, J. Chem. Phys. 134, 074106 (2011)

sirmarcel commented on May 29, 2024

Hi Lauri,

I'm planning to add interfaces for at least SOAP, ACSF and MBTR, and a side-effect of our benchmark paper will (hopefully) be a reasonably thorough benchmark of your implementations against the "canonical" ones. :) (So far it looks like there is no particularly large difference for SOAP, but let's see.)

Thanks for the answer -- that explains it! I would definitely suggest adding a flag to turn it on/off, and more clearly documenting how the output vector is structured. For SOAP, the pseudocode is very clear!

Otherwise the interface is okay, I think. Is there any particular reason you chose a fixed cutoff for all the SFs? (cmlkit chooses a very different direction in terms of interface; if you're interested, the SF interface is here. Both approaches are valid! It's just a matter of taste/use case.)

One more question about the ACSF output: Is the output structured such that there are blocks corresponding to the element type of the central atom? I.e. can I rely on the Euclidean distance between two descriptors being large when the central atoms are of different types? It looks like this is the case, I just want to make sure.

Thanks!

lauri-codes commented on May 29, 2024

Hi Marcel,

That benchmark sounds super interesting! Please keep us updated if you find anything that might be of interest to us (both good and bad news are appreciated).

We have only applied very simple versions of ACSF in our studies, so our interface with a single cutoff has worked and there has not been any real incentive to update it. But I have been thinking about improving it for the more serious stuff with different cutoffs for different species etc. Your interface that allows universal and species-specific SFs seems like a reasonably good choice. I will consider updating to something similar :)

The output from our ACSF does not in any way encode the species of the central atom. This is a very important point that I will have to add to the tutorial. We wanted to keep it this way so that the output size does not grow too much, and because it is then quite easy to introduce your own stratification if needed. This stratification also depends on the ML model you use: If working with kernel methods, maybe you want to separate different central species into completely non-overlapping locations in the feature vector? If using a decision tree or NN, maybe introducing a single additional feature that one-hot-encodes the species would work? Another way would be to train a separate model for each central species. There are quite a few ways of handling this, so we want to stick with the output that can most easily be used for any of them. The detailed output structure of ACSF is explained more thoroughly in our preprint. I will update the ACSF tutorial to include pseudocode similar to that for SOAP to make it clearer.
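As a rough illustration of the one-hot option, here is a minimal sketch that appends a species indicator to an existing descriptor vector. The function name `append_one_hot` and the example values are hypothetical, nothing here is part of DScribe, and in this form the encoding adds one entry per species rather than a single number.

```python
def append_one_hot(descriptor, central_species, species):
    """Return the descriptor followed by a one-hot encoding of the central species.

    `species` is the fixed, ordered list of all species in the dataset.
    """
    if central_species not in species:
        raise ValueError(f"unknown species: {central_species}")
    one_hot = [1.0 if s == central_species else 0.0 for s in species]
    return list(descriptor) + one_hot

# Hypothetical two-element descriptor for an atom whose central species is O.
features = append_one_hot([0.2, 0.7], "O", ["H", "O"])
print(features)  # [0.2, 0.7, 0.0, 1.0]
```

The descriptor values stay in the same positions for every central species, so a tree or NN can still share what it learns about the environment and use the extra entries to tell the centers apart.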

Hope this helps

sirmarcel commented on May 29, 2024

Hi Lauri,

Ok, that makes sense. I'll add the same post-processing we apply to the RuNNer ACSFs, which work the same way (i.e. not structurally distinguishing the central atom), and simply make zero-padded blocks.
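A minimal sketch of what such zero-padded blocks could look like, assuming every atom has a descriptor of the same length. The name `to_species_blocks` is made up for this example and is not taken from cmlkit, RuNNer or DScribe.

```python
import math

def to_species_blocks(descriptor, central_species, species):
    """Place the descriptor in the block reserved for its central species.

    The result has len(descriptor) * len(species) entries; all other blocks are zero.
    """
    n = len(descriptor)
    out = [0.0] * (n * len(species))
    start = species.index(central_species) * n
    out[start:start + n] = list(descriptor)
    return out

# Same local environment values, different central species (hypothetical numbers).
species = ["H", "O"]
a = to_species_blocks([1.0, 2.0], "H", species)  # [1.0, 2.0, 0.0, 0.0]
b = to_species_blocks([1.0, 2.0], "O", species)  # [0.0, 0.0, 1.0, 2.0]

dot = sum(x * y for x, y in zip(a, b))
distance = math.dist(a, b)
print(dot, distance)
```

Vectors with different central species never share a non-zero position, so their dot product is zero and their squared Euclidean distance is the sum of their squared norms, even when the unpadded descriptors are identical.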

Interesting side-note: As far as I can tell, neither SOAP implementation does anything like this, and yet it still works without post-processing for KRR. Probably because including the central atom in the density adds enough information...?

Thank you for your help!
