Performance audit about dftk.jl HOT 10 CLOSED

juliamolsim avatar juliamolsim commented on May 28, 2024
Performance audit

from dftk.jl.

Comments (10)

antoine-levitt avatar antoine-levitt commented on May 28, 2024

The bottlenecks seem to be the same for large-ish unit cells (20x1x1 silicon supercell). I guess that doesn't change until the orthogonalization/diagonalization costs start to dominate, which should not come up until we get to systems that require distributed-memory parallelism.

antoine-levitt avatar antoine-levitt commented on May 28, 2024

We should compare performance with eg abinit, using the scripts in test/testcases_ABINIT. At this point it comes down to fine-tuning (preconditioner, eigensolver termination criteria), optimizations (real FFTs) and FFT threading/batching.

mfherbst avatar mfherbst commented on May 28, 2024

I'm happy to do that, but I'd say the result will be disappointing at first ...

antoine-levitt avatar antoine-levitt commented on May 28, 2024

Sure? I don't see anything apart from the above that could explain a difference, so it's clear where to go to catch up with other codes, and not that hard. Abinit has options to print timings, so we could eg get the total number of FFTs they do and compare.

mfherbst avatar mfherbst commented on May 28, 2024

Ha, that's a good angle and a good thing to try.

antoine-levitt avatar antoine-levitt commented on May 28, 2024

OK so I benchmarked DFTK against abinit on a very simple test case: Ecut 50, LDA, temperature 0.01, silicon with no kpoint, box with lattice constant a=20 (reduced positions of the atoms rescaled to be physical). We took 44s, abinit took 8s. I did not do any fine-tuning either in abinit (just took the source, ran ./configure && make, and took the source of the example; I did add in FFTW, but that didn't change the timings much) or in DFTK (I did specify a maximum of 4 iterations for LOBPCG, to match abinit's choice). That's not too bad! Pretty sure with a reasonable amount of work we can get to at least within a factor of 2, and possibly beat them.

I couldn't get abinit to run multithreaded, so I disabled all threading in DFTK. Turning it on (FFTW.set_num_threads(4); BLAS.set_num_threads(4)) yielded 34s, which is not great (ie we can probably do better by explicitly threading).
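For reference, a minimal sketch of the threading setup mentioned above, assuming the FFTW.jl package is installed and Julia 1.6 or later (for `BLAS.get_num_threads`). The array size is an arbitrary illustration and not the grid used in the benchmark.

```julia
using FFTW
using LinearAlgebra  # provides the BLAS submodule

# Set the thread counts before creating any FFT plans:
# an FFTW plan keeps the thread count that was active when it was built.
FFTW.set_num_threads(4)
BLAS.set_num_threads(4)

# Check what BLAS actually ended up using.
println("BLAS threads: ", BLAS.get_num_threads())

# Illustrative 3D transform (size chosen arbitrarily for this sketch).
x = rand(ComplexF64, 64, 64, 64)
p = plan_fft(x)   # plan created after set_num_threads, so it is threaded
y = p * x
```

Setting both counts to 1 instead gives the single-threaded configuration used for the 44s figure.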

mfherbst avatar mfherbst commented on May 28, 2024

Hmm, 44s to 34s indeed sounds pretty bad to me as well, but the ballpark of 44 (DFTK) versus 8 (ABINIT) sounds familiar from my experiments.
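To put numbers on the timings quoted above, here is a small worked calculation. The three timings are the ones from the thread; the variable names and the definition of parallel efficiency as speedup divided by thread count are just this sketch's conventions.

```julia
# Timings quoted in the thread, in seconds.
t_dftk_serial   = 44.0  # DFTK, threading disabled
t_dftk_threaded = 34.0  # DFTK, 4 FFTW threads and 4 BLAS threads
t_abinit        = 8.0   # abinit, single-threaded
nthreads        = 4

slowdown   = t_dftk_serial / t_abinit         # DFTK vs abinit, both serial
speedup    = t_dftk_serial / t_dftk_threaded  # gain from turning threading on
efficiency = speedup / nthreads               # fraction of ideal 4x scaling

println("DFTK / abinit (serial):  ", round(slowdown; digits=2))
println("Speedup with 4 threads:  ", round(speedup; digits=2))
println("Parallel efficiency:     ", round(efficiency; digits=2))
```

So the serial gap is a factor of 5.5, and four threads buy roughly a 1.3x speedup, about a third of ideal scaling.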

mfherbst avatar mfherbst commented on May 28, 2024

Just to keep track of things todo: In #44 the zeros function was discussed as an alternative to our current array = similar(data), array .= 0 pattern. This requires JuliaLang/julia#130.
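For illustration, the pattern mentioned above next to a few Base alternatives. This is only a sketch: `data` is a made-up example array, and which form was actually settled on in #44 is not stated here.

```julia
# Hypothetical example array standing in for `data`.
data = rand(ComplexF64, 4, 4, 4)

# Current pattern: allocate uninitialised storage, then zero it in place.
a = similar(data)
a .= 0

# Alternatives available in Base:
b = zeros(eltype(data), size(data))  # explicit element type and size
c = zero(data)                       # zero array with the eltype and shape of `data`
d = fill!(similar(data), 0)          # same as the current pattern, in one expression

println(a == b == c == d)
```

Note that `zeros(eltype(data), size(data))` always returns a plain `Array`, whereas `similar` and `zero` follow the array type of `data`, which matters if `data` is not a standard `Array`.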

mfherbst avatar mfherbst commented on May 28, 2024

We have worked quite a bit on this stuff recently. Any objections against closing it for now? I think the low-hanging fruits discussed here are kind of done now.

antoine-levitt avatar antoine-levitt commented on May 28, 2024

Sure. We can rebenchmark later on, but the results above might be obsolete now.
