mir-group / phoebe
A high-performance framework for solving phonon and electron Boltzmann equations
Home Page: https://mir-group.github.io/phoebe/
License: MIT License
Timeline: Early next week
Plan:
ph_bte
s/\v\[([^,]+)\,\s?([^\]]+)\]\s?\=\s?([^;]+)\;/tup = \3;\r auto \1 = std::get<0>(tup);\r auto \2 = std::get<1>(tup);
s/\v\[([^,]+)\,\s?([^,]+)\,\s?([^\]]+)\]\s?\=\s?([^;]+)\;/tup = \4;\r auto \1 = std::get<0>(tup);\r auto \2 = std::get<1>(tup);\r auto \3 = std::get<2>(tup);
s/\v\[([^,]+)\,\s?([^\]]+)\]\s?\:\s?(.*)\)\s?\{/tup : \3) {\rauto \1 = std::get<0>(tup);\rauto \2 = std::get<1>(tup);
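The substitutions above appear to rewrite C++17 structured bindings into explicit std::get calls, presumably so the code also builds under C++14. A minimal sketch of the before/after for the two-element case, where makePair is a hypothetical helper that is not part of phoebe:

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical helper, for illustration only (not part of phoebe).
std::tuple<int, std::string> makePair() {
  return std::make_tuple(3, std::string("ph_bte"));
}

// C++17 form that the first substitution would match:
//   auto [count, label] = makePair();
// C++14-compatible form that the substitution would produce:
int unpackedCount() {
  auto tup = makePair();
  auto count = std::get<0>(tup);
  auto label = std::get<1>(tup);
  assert(label == "ph_bte");
  return count;
}
```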
To be done in connection with, or after, the development of the interface for el-ph Wannier interpolation. Create a Python script that reads the QE XML info and converts it into the data format necessary to launch EPA calculations.
It might be interesting to add an application to calculate polaron-related properties.
See the two TODO statements at the top of mpiController.h, which describe one way to reduce code duplication in the class, and one way to safeguard against a potential problem we may encounter down the road. Neither is an urgent fix.
The class State can be deprecated, after the implementation of the phonon-GPU acceleration. Remove it and replace its functionality with *BandStructure class methods.
Since we already have a working scattering operator, it could be interesting and relatively easy to add the time evolution of BTE, on either electrons or phonons (or both), neglecting the space-dependent terms of the BTE. I.e. solve an equation of the form:
dn/dt = - scattMatrix * n
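As a rough illustration of the equation above, the sketch below advances n with explicit Euler steps, n <- n - dt * (scattMatrix * n). The dense std::vector matrix type and the names eulerStep and evolve are assumptions made for this example and are not phoebe classes. Explicit Euler is only the simplest choice and needs a time step that is small compared with the inverse of the largest scattering rate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed dense storage for the scattering matrix (illustration only).
using Matrix = std::vector<std::vector<double>>;

// One explicit Euler step of dn/dt = -scattMatrix * n.
std::vector<double> eulerStep(const Matrix& scattMatrix,
                              const std::vector<double>& n, double dt) {
  assert(scattMatrix.size() == n.size());
  std::vector<double> next(n);
  for (std::size_t i = 0; i < n.size(); ++i) {
    assert(scattMatrix[i].size() == n.size());
    double rate = 0.0;
    for (std::size_t j = 0; j < n.size(); ++j) {
      rate += scattMatrix[i][j] * n[j];
    }
    next[i] = n[i] - dt * rate;
  }
  return next;
}

// Repeats the Euler step numSteps times and returns the final population.
std::vector<double> evolve(const Matrix& scattMatrix, std::vector<double> n,
                           double dt, int numSteps) {
  assert(numSteps >= 0);
  for (int step = 0; step < numSteps; ++step) {
    n = eulerStep(scattMatrix, n, dt);
  }
  return n;
}
```

For a diagonal scattering matrix each component should decay approximately as exp(-rate * t), which gives a quick sanity check.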
When I build the dev branch with cmake .. -DOMP_AVAIL=ON, the silicon example segfaults.
An example output is
2020-08-11, 14:54:57 | 95% | 60 / 63 | remaining: 4.16e-02 s.
2020-08-11, 14:54:57 | 100% | 63 / 63 | remaining: 1.67e-02 s.
Elapsed time: 0.531 s.
--------------------------------------------------------------------------------
Solving BTE within the relaxation time approximation.
[dell:71747] *** Process received signal ***
[dell:71747] Signal: Segmentation fault (11)
[dell:71747] Signal code: Address not mapped (1)
[dell:71747] Failing at address: 0x7f0500000000
[dell:71747] [ 0] /usr/lib/libpthread.so.0(+0x14960)[0x7f0636ce0960]
[dell:71747] [ 1] /usr/lib/libc.so.6(__libc_malloc+0x118)[0x7f0636852518]
[dell:71747] [ 2] /usr/lib/libstdc++.so.6(_Znwm+0x1a)[0x7f0636b9252a]
[dell:71747] [ 3] ../../ompbuild/phoebe(+0xd249a)[0x563c531b149a]
[dell:71747] [ 4] ../../ompbuild/phoebe(+0xd2162)[0x563c531b1162]
[dell:71747] [ 5] ../../ompbuild/phoebe(+0x22c25)[0x563c53101c25]
[dell:71747] [ 6] ../../ompbuild/phoebe(+0xcbc55)[0x563c531aac55]
[dell:71747] [ 7] /usr/lib/libgomp.so.1(+0x1a3ee)[0x7f0636d083ee]
[dell:71747] [ 8] /usr/lib/libpthread.so.0(+0x9422)[0x7f0636cd5422]
[dell:71747] [ 9] /usr/lib/libc.so.6(clone+0x43)[0x7f06368c6bf3]
[dell:71747] *** End of error message ***
Segmentation fault (core dumped)
When running with GDB, I find
#0 0x00007ffff697e518 in malloc () from /usr/lib/libc.so.6
#1 0x00007ffff6cbe52a in operator new (sz=sz@entry=8) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/new_op.cc:50
#2 0x000055555562649a in __gnu_cxx::new_allocator<int>::allocate (__n=<optimized out>, this=<optimized out>)
at /usr/include/c++/10.1.0/ext/new_allocator.h:103
#3 std::allocator_traits<std::allocator<int> >::allocate (__a=..., __n=<optimized out>)
at /usr/include/c++/10.1.0/bits/alloc_traits.h:460
#4 std::_Vector_base<int, std::allocator<int> >::_M_allocate (__n=<optimized out>, this=<optimized out>)
at /usr/include/c++/10.1.0/bits/stl_vector.h:346
#5 std::vector<int, std::allocator<int> >::_M_realloc_insert<int const&> (this=this@entry=0x7fffe5ffad90,
__position=32767) at /usr/include/c++/10.1.0/bits/vector.tcc:440
#6 0x0000555555626162 in std::vector<int, std::allocator<int> >::push_back (__x=@0x7fffe5ffac64: 1,
this=0x7fffe5ffad90) at /usr/include/c++/10.1.0/bits/stl_iterator.h:953
#7 MPIcontroller::divideWorkIter (this=<optimized out>, numTasks=<optimized out>)
at /home/anders/phoebe/src/mpi/mpiController.cpp:157
#8 0x0000555555576c25 in BaseBandStructure::parallelStateIterator (this=<optimized out>)
at /home/anders/phoebe/src/bands/bandstructure.cpp:73
#9 0x000055555561fc55 in PhononThermalConductivity::_ZN25PhononThermalConductivity18calcFromPopulationER9VectorBTE._omp_fn.0(void) () at /home/anders/phoebe/src/observable/phonon_thermal_cond.cpp:66
#10 0x00007ffff6e343ee in gomp_thread_start (xdata=<optimized out>) at /build/gcc/src/gcc/libgomp/team.c:123
#11 0x00007ffff6e01422 in start_thread () from /usr/lib/libpthread.so.0
#12 0x00007ffff69f2bf3 in clone () from /usr/lib/libc.so.6
To me, it seems that the problem is that MPIcontroller::divideWorkIter is not thread-safe: it uses some vector member variables instead of local variables. @jcoulter12 ?
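One possible shape for a fix, sketched under assumptions: the function below is a hypothetical free-function variant that builds its result in a local vector and returns it by value, so concurrent OpenMP threads never push_back into shared state. The real signature in mpiController.cpp takes only numTasks and may divide the work differently; rank and size are passed explicitly here just to keep the example self-contained.

```cpp
#include <cassert>
#include <vector>

// Hypothetical thread-safe variant: no member variables are touched.
// Returns the task indices assigned to `rank` out of `size` processes.
std::vector<int> divideWorkIter(int numTasks, int rank, int size) {
  assert(numTasks >= 0 && size > 0 && rank >= 0 && rank < size);
  // Use 64-bit intermediates so numTasks * rank cannot overflow.
  long long start = (static_cast<long long>(numTasks) * rank) / size;
  long long stop = (static_cast<long long>(numTasks) * (rank + 1)) / size;
  std::vector<int> tasks;  // local to this call, hence safe across threads
  tasks.reserve(static_cast<std::size_t>(stop - start));
  for (long long i = start; i < stop; ++i) {
    tasks.push_back(static_cast<int>(i));
  }
  return tasks;
}
```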
The electron-phonon matrix can be quite big. We should read/write files in parallel. There are at least two places where this would be beneficial:
To ensure that BLACS is always initialized before a distributed matrix is created, we propose to move the initBlacs() function call from the mpiController class to the ParallelMatrix class. A variable will be added to the mpiController object to mark whether or not initBlacs has already been called, for cases where more than one distributed matrix is constructed.
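A minimal sketch of the proposed guard, using stand-in types: MpiControllerSketch and ParallelMatrixSketch are hypothetical names, the initCalls counter exists only to make the behaviour observable, and the actual BLACS setup is left as a comment.

```cpp
#include <cassert>

// Hypothetical stand-in for the mpiController class.
struct MpiControllerSketch {
  bool blacsInitialized = false;  // the proposed marker variable
  int initCalls = 0;              // counts real initializations (demo only)

  void initBlacs() {
    if (blacsInitialized) return;  // already done by an earlier matrix
    // ... the actual BLACS grid setup would go here ...
    ++initCalls;
    blacsInitialized = true;
  }
};

// Hypothetical stand-in for ParallelMatrix: its constructor triggers the
// initialization, so a matrix can never exist before BLACS is set up.
struct ParallelMatrixSketch {
  int numRows;
  int numCols;

  ParallelMatrixSketch(MpiControllerSketch& mpi, int rows, int cols)
      : numRows(rows), numCols(cols) {
    assert(rows >= 0 && cols >= 0);
    mpi.initBlacs();
  }
};
```

Constructing several matrices against the same controller then runs the setup only once.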
Based on what we already have, it should be simple to also include the phonon self energy correction due to the electron-phonon interaction. It will also need to be incorporated into the scattering matrix.
Once the phonon-defect scattering has been implemented, we could think about how to add electron-defect scattering.
For charge-neutral defects, this is probably very similar to the implementation of phonon-defect scattering.
We need to look into the theory of what changes when the defect carries a charge.
For the purpose of publication of an article on phoebe, we need an example to showcase the code's capabilities.
We need to discuss and decide which material we are going to study, possibly something that is relatively expensive for other codes.
Need an implementation of EPA transport for electrons.
Note, we still need a better plan for managing the interface between quantum espresso and phoebe, to be decided together with the interface needed for the EPW part of the code.
We can just do a simple thing with find_library and look for either libscalapack.a or libmkl_scalapack_lp64.a, with a warning to the user that we'll download SCALAPACK when nothing is found in CMAKE_LIBRARY_PATH.
Implement iterative schemes for electron BTE solutions
We could have a simple wordpress website for storing news, updates, download links, hosting online documentation, contacts, etc...
Instead of our own LoopPrint implementation, we could use some preexisting library that produces more compact output.
I looked around the internet and found a couple of promising alternatives:
In some cases, the division over columns or rows among processes leaves some processes without any elements (see the return values of numLocalRows/Cols from numroc_). Ideally, we'd like to spread a distributed matrix evenly across processes.
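To see how a process can end up with no elements, the sketch below re-implements in C++ the arithmetic that ScaLAPACK's NUMROC performs for a block-cyclic distribution. It is written for illustration only; the name numrocSketch is made up here, and the code itself calls numroc_ from the library.

```cpp
#include <cassert>

// Number of rows (or columns) of a block-cyclically distributed dimension
// that land on process `iproc`.
//   n        : global number of rows/columns
//   nb       : block size
//   iproc    : coordinate of this process in the process row/column
//   isrcproc : coordinate of the process holding the first block
//   nprocs   : number of processes along this dimension
int numrocSketch(int n, int nb, int iproc, int isrcproc, int nprocs) {
  assert(n >= 0 && nb > 0 && nprocs > 0);
  assert(iproc >= 0 && iproc < nprocs && isrcproc >= 0 && isrcproc < nprocs);
  int myDist = (nprocs + iproc - isrcproc) % nprocs;
  int numBlocks = n / nb;                   // number of complete blocks
  int local = (numBlocks / nprocs) * nb;    // blocks every process receives
  int extraBlocks = numBlocks % nprocs;     // leftover complete blocks
  if (myDist < extraBlocks) {
    local += nb;       // this process gets one extra complete block
  } else if (myDist == extraBlocks) {
    local += n % nb;   // this process gets the trailing partial block
  }
  return local;
}
```

With n = 10, a block size of 4 and 4 processes, the local sizes come out as 4, 4, 2 and 0, so the last process holds nothing. A smaller block size would spread the same dimension more evenly.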
See discussion in #56
Write an interface to use in phoebe the electron-phonon coupling computed by VASP.
Implementation of a patch + XML reader for using the electron-phonon coupling from the Quantum ESPRESSO phonon code directly. The goal of this issue is to bypass any need for EPW.
After the pull request for the branch kokkos14 is closed, we can implement symmetries in the code. PhysRevLett.110.265506 is a good reference for the usage of symmetries in the BTE.
Modify the FullBandStructure object in order to be able to store, in particular, the electronic band structure in a distributed way, so to reduce memory usage.
Development of a class for computing the contributions to scattering from defect, to various orders in perturbation theory.
We need to come up with a logo for this project before release.
We need to document the structure of the code, and provide an overall view on how the various classes are used.
Implementation of the electron-phonon scattering and electronic transport, using Wannier interpolation of electronic energies and electron-phonon coupling. First, starting from EPW real-space el-ph coupling. Use symmetries.
The tetrahedron method requires the knowledge of energies at more k/q points than those that are available from the ActiveBandStructure class. This is because it requires the knowledge of the band structure at all vertices of the tetrahedron to integrate the Dirac delta. As a result, the tetrahedron method currently only works if all k-points and bands are known (i.e. a FullBandStructure is used, and all energies are stored by each MPI process). We should decide on and implement a strategy to use the tetrahedron method in conjunction with an ActiveBandStructure.
Implementation of an interface to use this code for computing electron-phonon coupling and related quantities.
Inspect the major bottleneck of the code (say, what takes more than 1% of the total execution time), and see if/how we can improve performance.
When we rotate the wavefunction, at the moment, we set to zero the plane wave coefficients that don't stay in the intersections of the G-vectors spheres of the two k points. This slightly breaks the normalization of the wavefunction for poor values of ecutwfc. Can we rotate the wavefunctions without losing some plane wave coefficients? This problem is also documented in the theory description of the electron-phonon coupling.
Currently, in ParallelMatrix::diagonalize, a copy of the current PMatrix object is made (called eigenvectors) to be used as an output to a pdsyev call. This is done because both the input and output PMatrices used by pdsyev must have the same blacs context object. However, this copy creation for the output matrix is not as efficient as it should be, because we only want to copy the properties (not the data) of the matrix being diagonalized.
This is not a high priority, but it should likely be improved at some point.
Understand how frequently we may end up with a calculation where the coupling (electron-phonon, or phonon-phonon matrices in real-space) cannot be stored in the memory of a single-node or a single GPU.
If this is considered critical, we need to explore how to make such coupling-matrices distributed, and how to modify the code accordingly.
Probably, only the caching mechanism needs to be MPI distributed, while the rest may be done MPI-locally.
Write an interface to use VASP formatted dynamical matrix
After the source code is completed and we are already working on an application, we should start preparing a manuscript for presenting the code. Need to decide target journal as well.
Possibly we want to add phono3py as an alternative to shengbte for the 3rd order force constants, to provide more user flexibility.
The idea is to add a patch to Quantum ESPRESSO to be called whenever c_diag is called. The patch should be used to fix the gauge of the wavefunction (e.g. set max plane-wave coefficient to be real) and should also consider the case of degenerate states.
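The gauge choice mentioned above (make the largest plane-wave coefficient real) can be sketched as a single global phase rotation. The function name fixGauge and the flat coefficient vector are assumptions for this example, and the degenerate-state case raised in the issue is not handled here.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Multiplies all coefficients by one global phase so that the coefficient
// of largest magnitude becomes real and positive. The norm is unchanged.
void fixGauge(std::vector<std::complex<double>>& coeffs) {
  assert(!coeffs.empty());
  std::size_t iMax = 0;
  for (std::size_t i = 1; i < coeffs.size(); ++i) {
    if (std::norm(coeffs[i]) > std::norm(coeffs[iMax])) iMax = i;
  }
  double magnitude = std::abs(coeffs[iMax]);
  if (magnitude == 0.0) return;  // all-zero vector: nothing to fix
  // Unit-modulus phase that cancels the phase of the largest coefficient.
  std::complex<double> phase = std::conj(coeffs[iMax]) / magnitude;
  for (auto& c : coeffs) c *= phase;
}
```

Because every coefficient is multiplied by the same unit-modulus factor, only the arbitrary overall phase of the state changes.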
It may be a good idea to write quantities of interest to some sort of standardized format, to avoid users having to write parsers for output files.
If our quantities are small, we can use a lightweight, simple format like JSON.
If we need to output bigger quantities, we can use a binary format like HDF5. This library looks nice.
There is an error related to building eigen with openMP using the clang compiler with c++14. For now, building without openMP works, but this should probably be fixed in the long term.
The error message is not informative, and initial fix attempts have not been successful; it largely looks like this problem.
To make large calculations more user-friendly, let's output memory estimates for the largest objects as early in the run as possible.
Decide which plots are relevant.
Write source code for creating plot data files.
Add (python?) scripts to generate figures.
We need to implement a method that manipulates the parallel distributed FullBandStructure of Issue#14 into an ActiveBandStructure.
Note: ActiveBandStructure contains a filtered selection of the Bloch States of the band structure, and is not distributed in parallel.
Hence, to map distributed FullBandStructure into ActiveBandStructure we need a sort of MPI_gather operation.
Note, in order to decide how to select Bloch States for electrons, we need to compute the Fermi level, in parallel.
Replicate what we did for the phonon-phonon scattering in the case of electron-phonon scattering. Luckily, the tensor structure of the coupling and how it is used is almost a mirror to the phonon-phonon case.
Once both phonon and electron transport work separately, we could implement the solution of the BTE that couples the two equations.
We are currently using "variable length arrays", such as "double positionSPG[numAtoms][3];", which are supported by the GNU compilers but are not part of the C++ standard. We should rewrite this in a C++ standard-compliant form.
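One standard-compliant replacement, as a sketch: a std::vector of fixed-size std::array rows gives the same [atom][coordinate] indexing with a size chosen at run time. The function name makePositions is hypothetical, and how the data is then handed to an external C interface is not addressed here.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Run-time sized replacement for `double positionSPG[numAtoms][3];`.
// Every entry is zero-initialized.
std::vector<std::array<double, 3>> makePositions(int numAtoms) {
  assert(numAtoms >= 0);
  return std::vector<std::array<double, 3>>(
      static_cast<std::size_t>(numAtoms),
      std::array<double, 3>{{0.0, 0.0, 0.0}});
}
```

Element access keeps the same syntax as before, e.g. positions[iAtom][2].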
Create a Python script that reads the XML information of the el-ph coupling, assuming that a gauge has been fixed, and unfolds the symmetries. We might consider doing the Bloch to Wannier interpolation at this level.
Need to convert the coupling of EPW to a C++-readable format, and run tests to replicate the results of EPW.