
amirhajibabaei / autoforce

Sparse Gaussian Process Potentials

License: MIT License

Python 100.00%
physics chemistry machinelearning gaussian-processes sparse-gaussian-processes density-functional-theory molecular-dynamics-simulation ab-initio-simulations metadynamics metadynamics-simulations

autoforce's People

Contributors

amirhajibabaei, changwmyung, swillow

autoforce's Issues

failures, solutions, tips

ActiveCalculator
In on-the-fly MD, if the initial atomic forces are zero and the model is empty or immature, the model may remain blind in the following steps. The solution is to set covdiff to a finite value (~0.1).

north pole singularity in spherical harmonics

In calculation of the gradients of spherical harmonics, there is a division by sin(theta) in lines:
....
Y_theta = cos_theta * self.l_float * Y / sin_theta
Y_theta[1:, 1:] -= _r * Y[:-1, :-1] * self.coef / sin_theta
....
When sin_theta is 0, Y_theta becomes nan.
At the moment, nan_to_num is applied when the gradient is returned, which simply replaces nan with zeros.
But in calculations of the gradients of SOAP, inconsistencies with autograd exist when xyz=[0,0,1].
The correct workaround may be to add a small positive number where sin_theta is zero.
But a better solution may be possible.
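
The difference between zeroing the nan and offsetting sin_theta can be seen in a minimal sketch. It assumes the simplest case l = m = 1, an unnormalized Y = sin(theta) with the phi factor dropped, and a hypothetical helper y_theta that mirrors only the first of the two lines quoted above; it is not the autoforce implementation.

```python
import numpy as np

def y_theta(sin_theta, cos_theta, l=1):
    """d/dtheta of the unnormalized Y_l^l = sin(theta)**l (hypothetical helper)."""
    y = sin_theta ** l
    # same form as the quoted line: cos_theta * l * Y / sin_theta
    with np.errstate(divide="ignore", invalid="ignore"):
        return cos_theta * l * y / sin_theta

# north pole: theta = 0, i.e. xyz = [0, 0, 1]
sin_t = np.array([0.0])
cos_t = np.array([1.0])

raw = y_theta(sin_t, cos_t)        # 0/0 gives nan
zeroed = np.nan_to_num(raw)        # nan replaced by 0

# offset sin_theta *before* Y is evaluated, so the ratio Y / sin_theta stays finite
eps = 1e-12                        # assumed value, not taken from the source
shifted = y_theta(np.where(sin_t == 0.0, eps, sin_t), cos_t)

print(raw, zeroed, shifted)
```

For this case the analytic derivative at the pole is cos(0) = 1, so replacing nan by zero gives a wrong value while the offset version recovers the limit. Note that the offset has to enter the evaluation of Y as well; clamping only the denominator would still return 0/eps = 0.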

Addition kernel

Define an addition kernel class, where two (or more) kernels are given at init.
func, leftgrad, rightgrad, gradgrad, etc. for the addition kernel can be deduced from the argument kernels.
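
A minimal sketch of the idea, assuming scalar 1D inputs and a hypothetical interface in which every kernel exposes func, leftgrad, rightgrad and gradgrad as plain methods. The class names RBF, Linear and AdditionKernel are illustrative and are not taken from autoforce. Since differentiation is linear, every quantity of the sum is the sum of the corresponding quantities of the components.

```python
import math

class RBF:
    """Toy 1D kernel k(x, y) = exp(-(x - y)^2 / 2)."""
    def func(self, x, y):
        return math.exp(-0.5 * (x - y) ** 2)
    def leftgrad(self, x, y):    # dk/dx
        return -(x - y) * self.func(x, y)
    def rightgrad(self, x, y):   # dk/dy
        return (x - y) * self.func(x, y)
    def gradgrad(self, x, y):    # d^2k/dxdy
        return (1.0 - (x - y) ** 2) * self.func(x, y)

class Linear:
    """Toy 1D kernel k(x, y) = x * y."""
    def func(self, x, y):
        return x * y
    def leftgrad(self, x, y):
        return y
    def rightgrad(self, x, y):
        return x
    def gradgrad(self, x, y):
        return 1.0

class AdditionKernel:
    """Sum of two or more kernels given at init (hypothetical interface)."""
    def __init__(self, *kernels):
        self.kernels = kernels
    def func(self, x, y):
        return sum(k.func(x, y) for k in self.kernels)
    def leftgrad(self, x, y):
        return sum(k.leftgrad(x, y) for k in self.kernels)
    def rightgrad(self, x, y):
        return sum(k.rightgrad(x, y) for k in self.kernels)
    def gradgrad(self, x, y):
        return sum(k.gradgrad(x, y) for k in self.kernels)

kern = AdditionKernel(RBF(), Linear())
x, y = 0.3, -0.7
print(kern.func(x, y), kern.leftgrad(x, y), kern.gradgrad(x, y))
```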

Sign issue in stress calculations

Theoretically
Stress=stress1 + stress2,
where
stress1 = (sum over all atoms F.T@r) / volume
where r is the coordinates of an atom and F is the force on it.
stress2 is related to derivatives wrt cell.
This works fine in ParametricCalculator (test using calc.calculate_numerical_stress(atoms)).
But in AutoForceCalculator (machine learned) we have to multiply stress1 by -1 in order to match the numerical stresses.
Everything seemingly works just fine, but I still can't explain the -1 multiplier.
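
As a point of comparison, the sign of stress1 can be checked on a toy system by finite differences. The sketch below makes several assumptions that are not from the source: a non-periodic cluster with a harmonic pair energy, absolute positions for r, a strain that maps r to r(I + eps), and the convention stress = (1/volume) dE/d(eps). It does not explain the difference between the two calculators; it only shows which sign follows under these assumptions.

```python
import numpy as np

def energy(pos, spring=1.0, r0=1.0):
    """Harmonic pair energy of a finite cluster (toy model)."""
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(pos), k=1)   # count each pair once
    return 0.5 * spring * np.sum((dist[upper] - r0) ** 2)

def forces(pos, spring=1.0, r0=1.0):
    """Analytic forces F_i = -dE/dr_i for the same toy model."""
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, 1.0)              # avoid 0/0 for i == j
    coef = spring * (dist - r0) / dist
    np.fill_diagonal(coef, 0.0)              # no self-interaction
    return -np.sum(coef[:, :, None] * diff, axis=1)

def strain_derivative(pos, h=1e-5):
    """Central-difference dE/d(eps_ab) under r -> r (I + eps)."""
    out = np.zeros((3, 3))
    for a in range(3):
        for b in range(3):
            eps = np.zeros((3, 3))
            eps[a, b] = h
            e_plus = energy(pos @ (np.eye(3) + eps))
            e_minus = energy(pos @ (np.eye(3) - eps))
            out[a, b] = (e_plus - e_minus) / (2 * h)
    return out

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 3))
volume = 10.0                                # arbitrary; the cluster has no cell

stress1 = forces(pos).T @ pos / volume       # (sum over atoms F.T@r) / volume
numerical = strain_derivative(pos) / volume
print(np.allclose(numerical, -stress1, atol=1e-6))
```

Under these assumptions the finite-difference result equals -stress1, because dE/d(eps_ab) = sum_i r_ia dE/dr_ib = -sum_i r_ia F_ib.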

Efficient calculation of the diagonal elements of the Gram matrix (in forces block).

Using sparse methods, calculation of the full Gram matrix is bypassed.
Similarities are only calculated between potentially many large (data) systems
and a few small inducing systems, which reduces the computational complexity considerably.
Only the diagonal elements of the full Gram matrix are needed,
either for calculation of the variance or for the "trace" term in the variational ELBO.
At the moment, the covariance of all forces with each other (in one system) is calculated and
then the diagonal elements are extracted.
Instead, the diagonal elements should be calculated directly, inside the similarity kernel.
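
The saving can be illustrated on a scalar toy kernel rather than on the forces block. Everything in the sketch below is an assumption made for illustration: the kernel (RBF plus linear), the helper kernel_diag, and the random data. It compares the trace term computed from diagonals only with the one computed from the full n x n matrix.

```python
import numpy as np

def kernel(a, b):
    """Toy kernel k(x, y) = exp(-|x - y|^2 / 2) + x.y, returned as a full matrix."""
    sq = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq) + a @ b.T

def kernel_diag(a):
    """Only k(x_i, x_i) = 1 + |x_i|^2, in O(n) instead of O(n^2)."""
    return 1.0 + np.einsum("ij,ij->i", a, a)

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))    # data points
z = rng.normal(size=(5, 3))      # inducing points

k_nm = kernel(x, z)
k_mm = kernel(z, z)
solved = np.linalg.solve(k_mm, k_nm.T)             # K_mm^{-1} K_mn, shape (m, n)

# diagonal of K_nm K_mm^{-1} K_mn without forming the n x n product
q_diag = np.einsum("ij,ji->i", k_nm, solved)
trace_direct = np.sum(kernel_diag(x) - q_diag)

# reference: build the full n x n matrices and take the trace
trace_full = np.trace(kernel(x, x) - k_nm @ solved)
print(np.isclose(trace_direct, trace_full))
```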

too small noise causes a large shift in predicted energy

When I chose noise=1e-6 (too small) in the kernel, the energies predicted by the posterior potential contained a large shift from the actual energies.
I call it a shift because the predictions were perfectly correlated with the data (R2~1).
This could be related to the jitters added to the diagonal of the Gram matrix.
But other possibilities also exist.
First, we need to find out why this happens.
Second, the program should issue an error or a warning when this happens.
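
For the second point, one possible form of such a check is sketched below. The function name check_energy_shift and the thresholds tol and atol are hypothetical, and the test only detects a constant offset between predictions and data; it does not diagnose the cause.

```python
import warnings
import numpy as np

def check_energy_shift(predicted, target, tol=3.0, atol=1e-3):
    """Warn if predictions differ from targets mainly by a constant offset.

    tol and atol are assumed thresholds (atol in energy units).
    Returns the mean offset.
    """
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    residual = predicted - target
    shift = residual.mean()
    spread = residual.std()      # scatter left after removing the offset
    if abs(shift) > tol * spread + atol:
        warnings.warn(
            f"predicted energies are shifted by {shift:.3g} "
            f"(residual spread {spread:.3g}); the noise may be too small"
        )
    return shift

# perfectly correlated predictions with a constant offset trigger the warning
shift = check_energy_shift([6.0, 7.0, 8.0], [1.0, 2.0, 3.0])
print(shift)
```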

Saving and reloading Local causes inconsistencies

At the moment, when Local objects are sampled from data as inducing geometries, they include i, j indices which belong to the structure that they are embedded in.
Converting these to atoms, saving them in a traj, and reloading them causes these indices to be changed.
i, j indices are relevant via the "bothways" keyword in Local.select: if bothways=False, generally neighbors with j>i are returned.
Therefore in kernels such as PairKernel where bothways=False is frequently used, it is possible that an empty array is returned even if there are relevant atoms in the local environment.
Plus the behavior might change simply by saving and reloading locs.
This issue needs to be fixed.
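
The failure mode can be reproduced with a toy filter. The function select_neighbors below is a hypothetical stand-in for the j>i rule described above, not the actual Local.select, and the renumbering after reload is an assumed example.

```python
def select_neighbors(i, neighbors, bothways=True):
    """Return neighbor indices; with bothways=False keep only j > i (toy rule)."""
    if bothways:
        return list(neighbors)
    return [j for j in neighbors if j > i]

# indices as they appear inside the parent structure
before = select_neighbors(3, [5, 7], bothways=False)

# the same environment after a save/reload that renumbers the atoms
# (assumed outcome: the central atom now has the largest index)
after = select_neighbors(2, [0, 1], bothways=False)

print(before, after)   # [5, 7] []
```

The geometry is identical in both calls; only the labels differ, yet the second call returns nothing.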

Product kernels class

Define a product kernel class, where two (or more) kernels are given at init.
func, leftgrad, rightgrad, gradgrad, etc for the product kernel can be deduced from argument kernels.
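
A minimal sketch for two factors, assuming scalar 1D inputs and a hypothetical interface in which every kernel exposes func, leftgrad, rightgrad and gradgrad. The class names are illustrative and are not taken from autoforce. The derivatives follow from the product rule; more than two factors can be handled by nesting, since the product itself exposes the same interface.

```python
import math

class RBF:
    """Toy 1D kernel k(x, y) = exp(-(x - y)^2 / 2)."""
    def func(self, x, y):
        return math.exp(-0.5 * (x - y) ** 2)
    def leftgrad(self, x, y):    # dk/dx
        return -(x - y) * self.func(x, y)
    def rightgrad(self, x, y):   # dk/dy
        return (x - y) * self.func(x, y)
    def gradgrad(self, x, y):    # d^2k/dxdy
        return (1.0 - (x - y) ** 2) * self.func(x, y)

class Linear:
    """Toy 1D kernel k(x, y) = x * y."""
    def func(self, x, y):
        return x * y
    def leftgrad(self, x, y):
        return y
    def rightgrad(self, x, y):
        return x
    def gradgrad(self, x, y):
        return 1.0

class ProductKernel:
    """Product of two kernels a and b (hypothetical interface)."""
    def __init__(self, a, b):
        self.a, self.b = a, b
    def func(self, x, y):
        return self.a.func(x, y) * self.b.func(x, y)
    def leftgrad(self, x, y):
        # d(ab)/dx = a_x b + a b_x
        return (self.a.leftgrad(x, y) * self.b.func(x, y)
                + self.a.func(x, y) * self.b.leftgrad(x, y))
    def rightgrad(self, x, y):
        # d(ab)/dy = a_y b + a b_y
        return (self.a.rightgrad(x, y) * self.b.func(x, y)
                + self.a.func(x, y) * self.b.rightgrad(x, y))
    def gradgrad(self, x, y):
        # d^2(ab)/dxdy = a_xy b + a_x b_y + a_y b_x + a b_xy
        return (self.a.gradgrad(x, y) * self.b.func(x, y)
                + self.a.leftgrad(x, y) * self.b.rightgrad(x, y)
                + self.a.rightgrad(x, y) * self.b.leftgrad(x, y)
                + self.a.func(x, y) * self.b.gradgrad(x, y))

kern = ProductKernel(RBF(), Linear())
triple = ProductKernel(kern, RBF())   # three factors by nesting
x, y = 0.3, -0.7
print(kern.func(x, y), kern.gradgrad(x, y), triple.func(x, y))
```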

data first or reference first?

In conditional adding of data or references to a model, the order in which data and references are added sometimes becomes important.
For instance, if an atoms object is added first, addition of its locals to the references becomes less likely.
As a result, in training by MD, situations arise where multiple data are added consecutively but no references are added.
This is usually accompanied by sharp discontinuities in the energy time series every time a datum is inserted.
One might refer to this as a stressed model.
This stress is usually released once a few references are successfully added to the model.
What is the best order for adding data and references to a model, to avoid these stressed phases?
