materialsproject / atomate2
atomate2 is a library of computational materials science workflows
Home Page: https://materialsproject.github.io/atomate2/
License: Other
In the DoubleRelaxMaker flow, the same relax_maker is currently used for step 1 and step 2. It would be desirable to split this into two makers (identical by default) in case the user wants to do something like a half k-point relaxation for step 1.
I can take care of this modification. This is mainly a reminder for myself.
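A minimal sketch of the proposed split, using plain dataclasses as stand-ins rather than the real atomate2 classes. The names StubRelaxMaker, DoubleRelaxSketch, relax_maker1, relax_maker2 and kpoint_scale are hypothetical; they only illustrate the idea that both steps share the same defaults but can be overridden independently.

```python
from dataclasses import dataclass, field


@dataclass
class StubRelaxMaker:
    """Hypothetical stand-in for a relaxation maker (not the atomate2 class)."""

    name: str = "relax"
    kpoint_scale: float = 1.0  # illustrative knob, e.g. 0.5 for a half k-point mesh

    def make(self, structure):
        # A real maker would return a job; here we only describe the step.
        return {"name": self.name, "kpoint_scale": self.kpoint_scale, "structure": structure}


@dataclass
class DoubleRelaxSketch:
    """Two relaxation steps, each with its own maker (same defaults)."""

    relax_maker1: StubRelaxMaker = field(default_factory=StubRelaxMaker)
    relax_maker2: StubRelaxMaker = field(default_factory=StubRelaxMaker)

    def make(self, structure):
        step1 = self.relax_maker1.make(structure)
        step2 = self.relax_maker2.make(step1["structure"])
        return [step1, step2]


# Default: both steps are identical. Override only the first for a coarse pre-relaxation.
coarse_first = DoubleRelaxSketch(relax_maker1=StubRelaxMaker(name="relax 1", kpoint_scale=0.5))
steps = coarse_first.make("POSCAR")
```

Keeping two separate fields means existing users who never set either one would see no change in behaviour.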
Describe the bug
This is potentially related to a problem that occurs in the phonon workflow for one structure: BORN charges that do not reflect the correct symmetry. This then results in a wrong non-analytical term correction around Gamma.
I start with this POSCAR:
Mg3 Sb2
1.0
2.3003714889113018 -3.9843602950772405 0.0000000000000000
2.3003714889113018 3.9843602950772405 0.0000000000000000
0.0000000000000000 0.0000000000000000 7.2813299999999996
Mg Sb
3 2
direct
0.0000000000000000 0.0000000000000000 0.0000000000000000 Mg2+
0.3333333333333333 0.6666666666666666 0.3683250000000000 Mg2+
0.6666666666666667 0.3333333333333334 0.6316750000000000 Mg2+
0.3333333333333333 0.6666666666666666 0.7747490000000000 Sb3-
0.6666666666666667 0.3333333333333334 0.2252510000000000 Sb3-
Then, I run the following workflow:
from atomate2.vasp.jobs.core import DielectricMaker, TightRelaxMaker
from pymatgen.core.structure import Structure
from jobflow import run_locally
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
from atomate2.vasp.powerups import update_user_incar_settings
from jobflow import Flow
struct = Structure.from_file("POSCAR")
job = TightRelaxMaker().make(structure=struct)
job2 = DielectricMaker().make(structure=job.output.structure)
flow = Flow([job, job2], job2.output)
flow = update_user_incar_settings(flow, {"NPAR": 8}, class_filter=TightRelaxMaker)
run_locally(flow, create_folders=True)
The structure after the tight relaxation looks like this:
Mg3 Sb2
1.0000000000000000
2.2774694095435986 -3.9446927300134034 0.0000000000000000
2.2774694095435986 3.9446927300134034 -0.0000000000000000
-0.0000000000000000 -0.0000000000000000 7.1854093835379320
Mg Sb
3 2
Direct
-0.0000000000000000 -0.0000000000000000 -0.0000000000000000
0.3333333333333357 0.6666666666666643 0.3675316159328170
0.6666666666666643 0.3333333333333357 0.6324683840671832
0.3333333333333357 0.6666666666666643 0.7761920492952012
0.6666666666666643 0.3333333333333357 0.2238079507047988
0.00000000E+00 0.00000000E+00 0.00000000E+00
0.00000000E+00 0.00000000E+00 0.00000000E+00
0.00000000E+00 0.00000000E+00 0.00000000E+00
0.00000000E+00 0.00000000E+00 0.00000000E+00
0.00000000E+00 0.00000000E+00 0.00000000E+00
The dielectric run then starts with:
Mg3 Sb2
1.0
2.2774694100000001 -3.9446927299999999 0.0000000000000000
2.2774694100000001 3.9446927299999999 -0.0000000000000000
-0.0000000000000000 -0.0000000000000000 7.1854093800000003
Mg Sb
3 2
direct
-0.0000000000000000 -0.0000000000000000 -0.0000000000000000 Mg
0.3333333300000000 0.6666666700000000 0.3675316200000000 Mg
0.6666666700000000 0.3333333300000000 0.6324683800000001 Mg
0.3333333300000000 0.6666666700000000 0.7761920500000000 Sb
0.6666666700000000 0.3333333300000000 0.2238079500000000 Sb
Clearly, this is less accurate: the lattice vectors and coordinates are truncated to eight decimal places.
If anyone already suspects why this happens, I would be grateful for a hint. Otherwise, I will go through the code and check how to fix it. I am not yet sure whether this causes the symmetry problems in the BORN charges, but it looks likely to me.
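The size of the rounding error can be estimated without running VASP. The sketch below compares 1/3 with the 8-decimal value written to the dielectric POSCAR above. The tolerance is a hypothetical value chosen only to illustrate the comparison; it is not taken from atomate2, pymatgen, or VASP.

```python
# Fractional coordinate of the sites on the threefold axis.
exact = 1.0 / 3.0
# Value as written to the POSCAR of the dielectric run (8 decimals).
rounded = round(exact, 8)

deviation = abs(rounded - exact)

# Hypothetical symmetry tolerance, used here only for illustration.
tight_tolerance = 1e-10
site_is_on_axis = deviation < tight_tolerance
```

The deviation is on the order of 3e-9 in fractional units, so any symmetry check with a tolerance tighter than that would no longer place the site on the axis.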
In order to be able to use "code-agnostic-like" workflows, we should try to converge on some parts of the TaskDocuments. One obvious example is the finite-difference elastic constants workflow, where the deformations could be performed using "any" code, provided the response follows some convention, i.e. it should have the energy, the forces, the stress, ...
This needs some discussions with @utf on how to deal with that. We might consider having some general pydantic models that are subclassed for the different codes (somewhat similar to StructureMetadata).
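As a rough illustration of the subclassing idea, here is a sketch using standard-library dataclasses instead of pydantic so that it runs without extra dependencies. The class names CommonOutput and VaspOutputSketch, and the helper max_force, are hypothetical and not part of atomate2 or emmet.

```python
from dataclasses import dataclass, field
from typing import List, Optional

Vector3 = List[float]


@dataclass
class CommonOutput:
    """Hypothetical code-agnostic fields every calculator would fill in."""

    energy: float  # total energy
    forces: List[Vector3]  # one force vector per site
    stress: Optional[List[Vector3]] = None  # 3x3 stress tensor, if available


@dataclass
class VaspOutputSketch(CommonOutput):
    """Code-specific subclass adding extra fields (names are illustrative)."""

    incar: dict = field(default_factory=dict)


def max_force(output: CommonOutput) -> float:
    """Works on any subclass because it only touches the shared fields."""
    return max(sum(c * c for c in f) ** 0.5 for f in output.forces)


doc = VaspOutputSketch(energy=-10.5, forces=[[0.0, 0.0, 0.3], [0.0, 0.4, 0.0]])
largest = max_force(doc)
```

A workflow such as the elastic constants one would then depend only on the shared base fields, whichever code produced them.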
I set a MAGMOM in my INCAR and used pymatgen to read it as the input set. This makes a 'magmoms' list, but I think atomate2 wants it to be a dictionary:
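from atomate2.vasp.jobs.core import RelaxMaker
from atomate2.vasp.sets.core import RelaxSetGenerator
from pymatgen.core import Structure
from pymatgen.io.vasp import Incar

my_structure = Structure.from_file('POSCAR')
my_incar = Incar.from_file('INCAR')
my_input_set = RelaxSetGenerator(user_incar_settings=my_incar.as_dict())
relax_job = RelaxMaker(input_set_generator=my_input_set).make(structure=my_structure)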
Error:
File ".../atomate2/lib/python3.9/site-packages/atomate2/vasp/sets/base.py", line 857, in _get_magmoms
mag.append(magmoms.get(site.specie.symbol, 0.6))
AttributeError: 'list' object has no attribute 'get'
atomate2/src/atomate2/vasp/sets/base.py
Line 851 in 4a9dc11
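One possible way to tolerate both forms is sketched below. The function site_magmoms is hypothetical and is not the atomate2 implementation; it only shows how a per-site list (as read from an INCAR) and an element-to-moment dictionary could both be accepted.

```python
from typing import Dict, List, Sequence, Union

DEFAULT_MAGMOM = 0.6  # fallback value seen in the traceback above


def site_magmoms(
    species: Sequence[str],
    magmoms: Union[Dict[str, float], Sequence[float], None],
) -> List[float]:
    """Return one magnetic moment per site from a dict or a per-site list."""
    if magmoms is None:
        return [DEFAULT_MAGMOM] * len(species)
    if isinstance(magmoms, dict):
        # Element symbol -> moment, as the failing line expects.
        return [magmoms.get(symbol, DEFAULT_MAGMOM) for symbol in species]
    # Otherwise treat it as a per-site list, as read from an INCAR MAGMOM tag.
    if len(magmoms) != len(species):
        raise ValueError("MAGMOM list length must match the number of sites")
    return [float(m) for m in magmoms]


from_dict = site_magmoms(["Fe", "O"], {"Fe": 5.0})
from_list = site_magmoms(["Fe", "O"], [5, 0])
```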
Describe the bug
I ran the test code here after changing si_structure to mgo_structure (looks like a typo) with the current version of Atomate2 and FireWorks 1.9.7. I got back the following error. Any ideas where I might have gone wrong or why the error has come up? I don't know enough about FireWorks yet to be sure.
Traceback (most recent call last):
File "test.py", line 21, in <module>
lpad.add_wf(wf)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/launchpad.py", line 427, in add_wf
old_new = self._upsert_fws(list(wf.id_fw.values()),
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/launchpad.py", line 1726, in _upsert_fws
self.fireworks.insert_many((fw.to_db_dict() for fw in fws))
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/pymongo/collection.py", line 769, in insert_many
blk.ops = [doc for doc in gen()]
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/pymongo/collection.py", line 769, in <listcomp>
blk.ops = [doc for doc in gen()]
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/pymongo/collection.py", line 759, in gen
for document in documents:
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/launchpad.py", line 1726, in <genexpr>
self.fireworks.insert_many((fw.to_db_dict() for fw in fws))
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/firework.py", line 319, in to_db_dict
m_dict = self.to_dict()
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 146, in _decorator
m_dict = func(self, *args, **kwargs)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/firework.py", line 275, in to_dict
spec['_tasks'] = [t.to_dict() for t in self.tasks]
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/firework.py", line 275, in <listcomp>
spec['_tasks'] = [t.to_dict() for t in self.tasks]
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 175, in _decorator
m_dict = func(self, *args, **kwargs)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 147, in _decorator
m_dict = recursive_dict(m_dict)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 86, in recursive_dict
return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 86, in <dictcomp>
return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 80, in recursive_dict
return recursive_dict(obj.as_dict(), preserve_unicode)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 86, in recursive_dict
return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 86, in <dictcomp>
return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 90, in recursive_dict
return [recursive_dict(v, preserve_unicode) for v in obj]
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 90, in <listcomp>
return [recursive_dict(v, preserve_unicode) for v in obj]
File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 80, in recursive_dict
return recursive_dict(obj.as_dict(), preserve_unicode)
TypeError: as_dict() missing 1 required positional argument: 'self'
Not sure if this is a good idea, just a suggestion.
atomate2 (or rather jobflow) is built to have composable jobs. Would it make a good example for the docs to show how this can be used to get a more precise band gap by using a GGA job to find the VBM and CBM k-points, followed by an HSE job to get their precise energy levels, i.e. what is referred to in the literature as HSE@GGA?
Describe the bug
Due to the logic shown here in the VaspInputSetGenerator, the INCAR flags that get carried over to a child calculation are taken from the vasprun.xml file. However, this can lead to very confusing behavior. For instance, in the attached files, I never used LMAXTAU or KPOINT_BSE in my config_dict or user_incar_settings, yet they appear in any child jobs. This is because, while these parameters weren't set in the INCAR, they were included in vasprun.incar. VASP for some reason decided to push these parameters up to the INCAR part of the vasprun.xml even though they were never explicitly set.
If the INCAR is available, my suggestion would be to prioritize reading that over the vasprun.xml and only read the latter if the INCAR has vanished.
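The suggested priority order can be sketched as a simple file lookup. The function find_incar_source and the candidate list are hypothetical; actual parsing (for example with pymatgen) is left out so the example stays self-contained.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_incar_source(calc_dir: Path) -> Optional[Path]:
    """Pick the file to read INCAR settings from, preferring the INCAR itself."""
    candidates = ["INCAR", "INCAR.gz", "vasprun.xml", "vasprun.xml.gz"]
    for name in candidates:
        path = calc_dir / name
        if path.is_file():
            return path
    return None


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    (tmp_dir / "vasprun.xml").write_text("<modeling/>")
    only_vasprun = find_incar_source(tmp_dir).name
    (tmp_dir / "INCAR").write_text("ENCUT = 520\n")
    with_incar = find_incar_source(tmp_dir).name
```

With this ordering, flags that VASP adds to vasprun.xml on its own would only be picked up when no INCAR is left in the directory.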
Sometimes the CONTCAR file gets created by VASP but is empty. This is not a big problem since we copy the file first, but a helpful error message would be useful if something went wrong. We could validate the CONTCAR before copying and raise an error if it is empty.
atomate2/src/atomate2/vasp/files.py
Line 101 in d21bdac
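A minimal sketch of such a check, assuming it would run just before the copy step. The function name validate_contcar and the error messages are hypothetical.

```python
import tempfile
from pathlib import Path


def validate_contcar(path: Path) -> None:
    """Raise a descriptive error if the CONTCAR is missing or empty."""
    if not path.is_file():
        raise FileNotFoundError(f"{path} does not exist")
    if path.stat().st_size == 0 or not path.read_text().strip():
        raise ValueError(f"{path} is empty; VASP may have stopped before writing it")


# Demonstration with an empty file in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    contcar = Path(tmp) / "CONTCAR"
    contcar.write_text("")
    try:
        validate_contcar(contcar)
        message = "ok"
    except ValueError as exc:
        message = str(exc)
```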
This line is causing some problems for me:
atomate2/src/atomate2/vasp/jobs/base.py
Line 149 in a31b86b
This seems to indicate that if the user wants to change the name of a job (e.g. adding formulas to the names), it will break subsequent querying of the resulting tasks database.
I like seeing the formula names in the FW web GUI as I'm working on new workflows. Not sure how everyone else feels.
I did not find any documentation on how to run jobs from atomate2 with multiple workers in fireworks. I assume this would be too hard to find out for a new user of atomate2 on their own.
Currently, the MoleculeMetadata class is in Atomate2 here. Longer term, it probably makes sense to either port it to emmet, where it can co-exist with StructureMetadata, or, potentially better yet, adopt/merge with whatever @espottesmith is planning to do (or has already done) in terms of making a MoleculeMetadata for MP. For now, what we have works though.
I think that the "*_file" already contains the full path, so adding the dir_name is not necessary. It works because joining onto an absolute path discards everything before it, something like this:
Path.cwd() / Path.cwd()
where the output is just one single path (no duplicate).
But using partial paths makes this fail, because Path keeps both parts here:
In [140]: Path("./outputs/") / Path("./outputs/")
Out[140]: PosixPath('outputs/outputs')
Therefore, removing the dir_name should also make this work when a partial path is provided.
Just for context, I was doing some tests creating a TaskDocument manually:
TaskDocument.from_directory(Path("./ferroelectric--23779/polarization_nonpolar/outputs/"))
FileNotFoundError: [Errno 2] No such file or directory: 'ferroelectric--23779/polarization_nonpolar/outputs/ferroelectric--23779/polarization_nonpolar/outputs/vasprun.xml.gz'
where you can see the duplicate path.
Please, double check this. It might be happening somewhere else too.
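The two cases described above can be reproduced with pure paths, which do not touch the file system. The directory names are illustrative.

```python
from pathlib import PurePosixPath

absolute = PurePosixPath("/calc/outputs")
relative = PurePosixPath("outputs")

# Joining onto an absolute right-hand path discards the left-hand side,
# which is why the duplicate goes unnoticed with full paths.
joined_absolute = absolute / absolute

# With relative paths both parts are kept, producing the doubled directory.
joined_relative = relative / relative
```

So pathlib is not deduplicating anything; the absolute right-hand operand simply replaces what came before it.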
This is more a bug associated with the vasprun.xml format and, by extension, a limitation of Pymatgen's Vasprun('vasprun.xml').incar object, but the IVDW flag is not included in the input.incar set of parameters below even if it's included in the INCAR.
This is because IVDW is not included in vasprun.xml, so the INCAR flags recorded in vasprun.xml do not match the actual INCAR.
Do you think it might be better to read in the INCAR flags from the INCAR itself and, if the INCAR isn't present, pull them from the vasprun.xml? This would also be nice because a few extraneous flags are included when the INCAR flags are pulled from the vasprun.xml (e.g. KPOINT_BSE and LMAXTAU, even if they aren't set in the INCAR).
@utf , there is still one other issue with the Lobster workflow. I have implemented this pre-convergence step of the WAVECAR. I have, however, the feeling that the speed-up is extremely small. I am not sure why this is the case. VASP still needs many electronic steps after starting from such a pre-converged WAVECAR. I might need to do some more tests to make this really efficient or test some larger structures.
In Vasp, NonScfSetGenerator is defined by a reciprocal_density (for uniform band structures) or by a line_density (for line band structures). It would be nice to be able to use the same convention in Abinit. We should keep the possibility to use the "usual" abinit way also (uniform band structures defined by kppa, and line band structures defined by ndivsm) as abinit users may be more inclined to use that convention.
I am unsuccessful in running the test suite on Windows, Python 3.8.
In a fresh Python 3.8 environment, I've done:
pip install -r requirements.txt
pip install .[tests]
pytest
but all the jobs/flows fail due to
Traceback (most recent call last):
File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\jobflow\managers\local.py", line 98, in _run_job
response = job.run(store=store)
File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\jobflow\core\job.py", line 524, in run
response = function(*self.function_args, **self.function_kwargs)
File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\atomate2\vasp\jobs\base.py", line 147, in make
run_vasp(**self.run_vasp_kwargs)
File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\atomate2\vasp\run.py", line 167, in run_vasp
c.run()
File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\custodian\custodian.py", line 367, in run
self._run_job(job_n, job)
File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\custodian\custodian.py", line 440, in _run_job
p = job.run()
File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\custodian\vasp\jobs.py", line 255, in run
return subprocess.Popen(cmd, stdout=f_std, stderr=f_err) # pylint: disable=R1732
File "C:\Users\asros\miniconda3\envs\atomate2\lib\subprocess.py", line 858, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Users\asros\miniconda3\envs\atomate2\lib\subprocess.py", line 1311, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified
My VASP TaskDocuments are often filled with lots of null and empty objects that I will likely never populate. This is a slight inconvenience when exploring my datasets in MongoDB. It'd be nice to add a kwarg to the TaskDocument to recursively drop all key-value pairs that have None as the value (and maybe {} or []?).
For a dictionary, it'd be something like:
from typing import Any, Dict


def _remove_empties(d: Dict[str, Any]) -> Dict[str, Any]:
    """
    For a given dictionary, recursively remove all items that are None
    or are empty lists/dicts.

    Parameters
    ----------
    d
        Dictionary to clean

    Returns
    -------
    Dict
        Cleaned dictionary
    """
    if isinstance(d, dict):
        return {
            k: _remove_empties(v)
            for k, v in d.items()
            if v is not None and v != [] and v != {}
        }
    if isinstance(d, list):
        return [_remove_empties(v) for v in d]
    return d
It's a bit trickier with the TaskDocument object though.
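One caveat with filtering before recursing is that a nested dictionary which only becomes empty after cleaning is kept. A bottom-up variant, sketched here for plain dicts and lists only (not for the TaskDocument object), cleans the children first and then filters. The function name remove_empties is hypothetical.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    """True for None, an empty list, or an empty dict (0 and False are kept)."""
    return value is None or value == [] or value == {}


def remove_empties(obj: Any) -> Any:
    """Recursively drop empty values, cleaning children before filtering."""
    if isinstance(obj, dict):
        cleaned = {k: remove_empties(v) for k, v in obj.items()}
        return {k: v for k, v in cleaned.items() if not _is_empty(v)}
    if isinstance(obj, list):
        cleaned_items = [remove_empties(v) for v in obj]
        return [v for v in cleaned_items if not _is_empty(v)]
    return obj


doc = {"energy": -1.0, "bader": None, "tags": [], "output": {"forces": None, "stress": {}}}
cleaned_doc = remove_empties(doc)
```

Note that this variant also drops None entries inside lists, which changes list lengths; whether that is desirable would depend on the field.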
Describe the bug
ISMEAR=2 is set for non-metals.
I try to optimize the structure of NaCl with the following script (atomate development version):
from atomate2.vasp.jobs.core import RelaxMaker
from jobflow import run_locally
from pymatgen.core import Structure
from atomate2.vasp.powerups import update_user_incar_settings
structure = Structure.from_file("POSCAR.NaCl.vasp")
relax_job = RelaxMaker().make(structure)
relax_job = update_user_incar_settings(relax_job, {"NPAR": 4}, class_filter=RelaxMaker)
run_locally(relax_job, create_folders=True)
POSCAR.NaCl.vasp
Na4 Cl4
1.0
5.691694 0.000000 0.000000
0.000000 5.691694 0.000000
0.000000 0.000000 5.691694
Na Cl
4 4
direct
0.000000 0.000000 0.000000 Na+
0.000000 0.500000 0.500000 Na+
0.500000 0.000000 0.500000 Na+
0.500000 0.500000 0.000000 Na+
0.500000 0.000000 0.000000 Cl-
0.500000 0.500000 0.500000 Cl-
0.000000 0.000000 0.500000 Cl-
0.000000 0.500000 0.000000 Cl-
I end up with this INCAR:
ALGO = Fast
EDIFF = 1e-05
EDIFFG = -0.02
ENAUG = 1360
ENCUT = 680
GGA = Ps
IBRION = 2
ISIF = 3
ISMEAR = 2
ISPIN = 2
KSPACING = 0.22
LAECHG = True
LASPH = True
LCHARG = False
LELF = False
LMIXTAU = True
LORBIT = 11
LREAL = Auto
LVTOT = True
LWAVE = False
MAGMOM = 8*0.6
NELM = 200
NPAR = 4
NSW = 99
PREC = Accurate
SIGMA = 0.2
I think it is due to the combination of this line here
atomate2/src/atomate2/vasp/sets/core.py
Line 43 in 73b9b3e
And this one here:
atomate2/src/atomate2/vasp/sets/base.py
Line 976 in 73b9b3e
I would suggest switching to a default value of None to avoid such errors. Happy to hear what other people think.
(I can of course try to fix it but I suspect it might concern many parts of the code ...)
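To make the suggestion concrete, here is a sketch of how a None default could be handled, assuming (as the issue implies) that an unspecified band gap is currently treated as 0 and therefore as a metal. The function smearing_settings and the specific SIGMA values are illustrative assumptions, not the atomate2 implementation.

```python
from typing import Dict, Optional, Union


def smearing_settings(bandgap: Optional[float]) -> Dict[str, Union[int, float]]:
    """Choose ISMEAR/SIGMA from a band gap in eV; None means 'unknown'."""
    if bandgap is None:
        # Unknown gap: Gaussian smearing with a small width as a cautious choice.
        return {"ISMEAR": 0, "SIGMA": 0.05}
    if bandgap == 0:
        # Metal: Methfessel-Paxton smearing of order 2.
        return {"ISMEAR": 2, "SIGMA": 0.2}
    # Finite gap: tetrahedron method with Blöchl corrections.
    return {"ISMEAR": -5, "SIGMA": 0.05}


unknown = smearing_settings(None)
metal = smearing_settings(0.0)
insulator = smearing_settings(5.0)
```

The point of the None branch is that a structure such as NaCl with no band gap information would no longer fall through to the metallic settings.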
Describe the bug
So this is a bit hard to reproduce because it only shows up after a VASP calculation fails and custodian restarts it.
I end up with the following error in my vasp.out:
-----------------------------------------------------------------------------
| |
| W W AA RRRRR N N II N N GGGG !!! |
| W W A A R R NN N II NN N G G !!! |
| W W A A R R N N N II N N N G !!! |
| W WW W AAAAAA RRRRR N N N II N N N G GGG ! |
| WW WW A A R R N NN II N NN G G |
| W W A A R R N N II N N GGGG !!! |
| |
| Command line argument 'srun' was not understood. |
| |
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
| |
| W W AA RRRRR N N II N N GGGG !!! |
| W W A A R R NN N II NN N G G !!! |
| W W A A R R N N N II N N N G !!! |
| W WW W AAAAAA RRRRR N N N II N N N G GGG ! |
| WW WW A A R R N NN II N NN G G |
| W W A A R R N N II N N GGGG !!! |
| |
| Command line argument '-N8' was not understood. |
| |
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
| |
| W W AA RRRRR N N II N N GGGG !!! |
| W W A A R R NN N II NN N G G !!! |
| W W A A R R N N N II N N N G !!! |
| W WW W AAAAAA RRRRR N N N II N N N G GGG ! |
| WW WW A A R R N NN II N NN G G |
| W W A A R R N N II N N GGGG !!! |
| |
| Command line argument '-c2' was not understood. |
| |
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
| |
| W W AA RRRRR N N II N N GGGG !!! |
| W W A A R R NN N II NN N G G !!! |
| W W A A R R N N N II N N N G !!! |
| W WW W AAAAAA RRRRR N N N II N N N G GGG ! |
| WW WW A A R R N NN II N NN G G |
| W W A A R R N N II N N GGGG !!! |
| |
| Command line argument '/g/g20/shen9/compiled/vasp_gam_63_quartz' |
| was not understood. |
| |
-----------------------------------------------------------------------------
This is my .atomate2.yaml:
VASP_CMD: "srun -N8 -c2 /g/g20/shen9/compiled/vasp_std_63_quartz"
VASP_GAMMA_CMD: "srun -N8 -c2 /g/g20/shen9/compiled/vasp_gam_63_quartz"
So it looks like VASP_GAMMA_CMD is being used here (possibly appended after the vasp_std call) for some reason, and I'm not familiar enough with that part of the code to see why this can happen.
This is not breaking anything for me at the moment, but I wanted to report it for future reference.
Like the cclib-based task documents we have that make it easy to generate task documents for most molecular DFT codes, it'd be ideal, I think, to also add parallel support for task documents generated using NOMAD parsers (e.g. see here). The idea is that, right out of the box, Atomate2 would have structured input and output data for virtually all the codes users would be interested in. Of course, for codes of particular value to the community, custom task documents can be made (like the VASP one), but this could reduce the barrier for getting started with new codes in my opinion.
If it's of interest, this will go on my to-do list. Like with the cclib-based task documents, this would involve an optional dependency (pip install nomad-lab[parsing]).
In a fresh Python 3.8 environment, I've run pip install -r requirements.txt and pip install .[tests] but get numpy issues in return. By default, I'm running numpy 1.21.5 with the above procedure. Upgrading to numpy 1.22.1 resolves it.
It might be good to pin a numpy version in requirements.txt to resolve this (or it might involve shuffling the order of some of the packages in requirements.txt so pymatgen is compatible with the numpy version).
______________________ ERROR collecting test session _______________________
..\..\miniconda3\envs\atomate2\lib\importlib\__init__.py:127: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1014: in _gcd_import
???
<frozen importlib._bootstrap>:991: in _find_and_load
???
<frozen importlib._bootstrap>:975: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:671: in _load_unlocked
???
..\..\miniconda3\envs\atomate2\lib\site-packages\_pytest\assertion\rewrite.py:170: in exec_module
exec(co, module.__dict__)
tests\vasp\schemas\conftest.py:99: in <module>
class SiNonSCFUniform(SchemaTestData):
tests\vasp\schemas\conftest.py:100: in SiNonSCFUniform
from atomate2.vasp.schemas.calculation import VaspObject
..\..\miniconda3\envs\atomate2\lib\site-packages\atomate2\vasp\schemas\calculation.py:12: in <module>
from pymatgen.command_line.bader_caller import bader_analysis_from_path
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\command_line\bader_caller.py:30: in <module>
from pymatgen.io.cube import Cube
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\io\cube.py:46: in <module>
from pymatgen.core.sites import Site
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\core\__init__.py:20: in <module>
from .lattice import Lattice # noqa
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\core\lattice.py:22: in <module>
from pymatgen.util.coord import pbc_shortest_vectors
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\util\coord.py:17: in <module>
from . import coord_cython as cuc
pymatgen/util/coord_cython.pyx:1: in init pymatgen.util.coord_cython
???
E ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
I have a pitch I wanted to share here.
Looking towards the future when we want to add molecular DFT codes to Atomate2, I wanted to share the code cclib here. It can parse the outputs of virtually all the popular molecular DFT codes and returns dozens of attributes with consistent naming across codes. Here are the tabulated outputs.
I think that this could be a nice, consistent way of constructing task docs for molecular DFT. Of course, for commonly used codes like Q-Chem (which have detailed pymatgen parsers already), we don't have to use cclib. And even if we do use cclib, we can always append to the returned dictionary with additional properties. But this might lower the burden for incorporating new codes into Atomate2 and provide some continuity between packages.
The con, of course, is that it would add a dependency to atomate2, and it could be argued that it'd just be better to rely on making new "in-house" pymatgen.io parsers for new codes rather than relying on an external dependency. However, cclib has been around for a long time and is continually updated. I would be happy to pitch a specific PR if there is interest. I have been using cclib with Jobflow for pretty much this purpose.
Something needs to change here about the case sensitivity:
Currently, the name attribute returns "LOCPOT", but the comments say we should use lower case, e.g. {"store_volumetric_data": ["locpot"]}
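One way to sidestep the mismatch is to compare names case-insensitively. The helper names below are hypothetical and only sketch the comparison, not the actual atomate2 code path.

```python
from typing import Iterable, Set


def normalise_names(names: Iterable[str]) -> Set[str]:
    """Lower-case user-supplied volumetric file names for comparison."""
    return {name.lower() for name in names}


def should_store(object_name: str, requested: Iterable[str]) -> bool:
    """Compare case-insensitively so 'LOCPOT' matches 'locpot'."""
    return object_name.lower() in normalise_names(requested)


matches = should_store("LOCPOT", ["locpot"])
```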
Describe the bug
Without pinning a specific Python version using default_language_version in the pre-commit config, pre-commit will install hooks with the first available Python executable. This means that Python 3.10-specific code can get through the linting phase only to fail the 3.8/3.9 tests. This also seems to be the case in the CI, which installs pre-commit hooks from a Python 3.10 environment.
To Reproduce
(1) pip install .[dev]
(2) Introduce a feature only available in newer Python versions (e.g. replace typing.Tuple with tuple[str])
(3) pre-commit install; pre-commit run --all-files --- will pass without error.
Expected behavior
Linters should use the minimum supported Python version (currently 3.8).
Screenshots
diff --git a/src/atomate2/settings.py b/src/atomate2/settings.py
index e87a1c0..4d2e0c5 100644
--- a/src/atomate2/settings.py
+++ b/src/atomate2/settings.py
@@ -7,6 +7,9 @@ from pydantic import BaseSettings, Field, root_validator
_DEFAULT_CONFIG_FILE_PATH = "~/.atomate2.yaml"
+test: tuple[str, str] = ("a", "b")
+
+
__all__ = ["Atomate2Settings"]
$ git commit -a -m "Attempt to introduce py10-only feature"
check yaml...........................................(no files to check)Skipped
fix python encoding pragma...............................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
autoflake................................................................Passed
black....................................................................Passed
blacken-docs.............................................................Passed
isort....................................................................Passed
flake8...................................................................Passed
type annotations not comments............................................Passed
rst ``code`` is two backticks........................(no files to check)Skipped
rst directives end with two colons...................(no files to check)Skipped
rst ``inline code`` next to normal text..............(no files to check)Skipped
mypy.....................................................................Passed
codespell................................................................Passed
pyupgrade................................................................Passed
[ml-evs/update_flake8_precommit b5e62c4] Attempt to introduce py10-only feature
1 file changed, 3 insertions(+)
The simple fix is to use:
default_language_version:
  python: python3.8
at the top of the pre-commit config.
I have worked on a phonon workflow including phonopy and the finite displacement method. I would like to make it available in atomate2.
Would there be interest?
Currently, when a CONTCAR is copied as POSCAR from the previous directory, it gets overwritten by the new POSCAR generated from write_vasp_input_set. Even if the new POSCAR is generated from the last structure of the last run, it loses the predictor-corrector coordinates stored in the CONTCAR when continuing an MD job. When the overwrite option is off in https://github.com/materialsproject/atomate2/blob/9dcf35586eab52c4be842b1010aa268b98c8493d/src/atomate2/vasp/sets/base.py#L71, it will raise a FileExistsError. Is there a way to skip writing new input files for continuation jobs?
I did a temporary fix in my own branch by just changing the error raised by the existing POSCAR to a warning to allow the job to proceed. I'm not sure if that will break other things. An alternative might be moving the file existence check before copying files in https://github.com/materialsproject/atomate2/blob/9dcf35586eab52c4be842b1010aa268b98c8493d/src/atomate2/vasp/jobs/base.py#L130 to allow the user to turn off the overwrite option when writing inputs?
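The "warn and keep" behaviour of the temporary fix can be sketched independently of atomate2. The function write_inputs and its keep_existing flag are hypothetical; they do not correspond to an existing option.

```python
import tempfile
import warnings
from pathlib import Path
from typing import Dict, List


def write_inputs(files: Dict[str, str], directory: Path, keep_existing: bool = False) -> List[str]:
    """Write input files; optionally leave files that already exist untouched."""
    written = []
    for name, content in files.items():
        target = directory / name
        if target.exists() and keep_existing:
            warnings.warn(f"{name} already exists, keeping the copied file")
            continue
        target.write_text(content)
        written.append(name)
    return written


# Demonstration: a POSCAR copied from a previous CONTCAR survives the write step.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "POSCAR").write_text("copied from CONTCAR")
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        written = write_inputs(
            {"POSCAR": "regenerated", "INCAR": "NSW = 100"}, run_dir, keep_existing=True
        )
    poscar_text = (run_dir / "POSCAR").read_text()
```

Whether silently keeping a stale file is safe for non-MD jobs is an open question, so an explicit opt-in flag seems preferable to a global change.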
Just a suggestion, but one that has worked out very well for pymatgen over in materialsproject/pymatgen#2847: we could migrate linting of atomate2 to ruff.
Ruff combines the functionality of most other Python linters into one tool written in Rust, which makes it ~100x faster than linters written in Python. It brought the pymatgen linting CI script's run time from 9 min down to 3 min (almost entirely spent installing deps and running mypy now; ruff itself only takes 1 sec).
Happy to take this on if interested.
Wouldn't normally raise an issue about this but @utf said
It's still early days without much adoption, so plenty of scope for changing [...]
May I recommend using markdown instead of rST for docs? While I admit I'm biased, I believe there are good reasons to be. They include all the ones listed here plus others:
Sphinx supports markdown with little extra setup so I think the cost of switching wouldn't be that high. MyST which Sphinx relies on for md parsing supports directives and extensions so I don't think there'd be any downsides.
Previously, in the original atomate, the calcs_reversed key was explicitly named for putting the final step as the first entry (https://github.com/hackingmaterials/atomate/blob/4577c94c1850fcebd8624d49ba1ab27f8e3e6d43/atomate/vasp/drones.py#L300)
In atomate2, the calculations are not in reversed order but still use the calcs_reversed key name, which may cause some confusion.
atomate2/src/atomate2/vasp/schemas/task.py
Line 382 in fe657b4
To resolve this issue, we could:
(1) store the calculation list in reversed order, following the principle of making the final step the first entry
OR
(2) keep the calcs in normal sequence (initial step as the first entry), but rename the key.
Please let me know which way you prefer and I can submit a PR later.
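The two options differ only in where the final step ends up. A small sketch with illustrative task names (the key name "calcs" in option 2 is hypothetical):

```python
# Calculations in the order they were run (illustrative task names).
calcs = ["relax 1", "relax 2", "static"]

# Option (1): store them reversed so the final step is the first entry.
calcs_reversed = list(reversed(calcs))
final_step = calcs_reversed[0]

# Option (2): keep chronological order under a different (hypothetical) key.
task_doc = {"calcs": calcs}
final_step_alt = task_doc["calcs"][-1]
```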
See https://matsci.org/t/atomate2-parsing-of-f-orbitals-in-dos/42350 by @jmmshn. I'm out of town right now but can work on a fix next week if nobody else does.
Hi @mkhorton @davidwaroquiers @mjwen. Agreed that we need a proper discussion about restarting. This is something @jmmshn has opinions about too.
I have some ideas for how we could do it but each with their own tradeoffs. I'd be interested to hear if there were any good ideas from your workshop @mkhorton.
I will create a separate issue to discuss this further.
Originally posted by @utf in #134 (comment)
Phonon workflow is currently not documented. @QuantumChemist will start working on this.
An abinit job that has been restarted the maximum number of times set by the user/settings should be defined as "final", even if it has failed. How should we deal with that, and how (or can) we restart from it? Would a restart be a continuation of the same Flow/Job or a new Flow/Job?
Describe the bug
Given the attached OUTCAR, I get
~/software/miniconda/envs/cms2/lib/python3.8/site-packages/atomate2/vasp/schemas/calculation.py in from_outcar(cls, outcar)
208 "cores": "cores",
209 }
--> 210 return cls(**{v: outcar.run_stats.get(k) or 0 for k, v in mapping.items()})
211
212
~/software/miniconda/envs/cms2/lib/python3.8/site-packages/pydantic/main.cpython-38-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()
ValidationError: 1 validation error for RunStatistics
cores
value is not a valid integer (type=type_error.integer)
The Atomate2 job then fails as a result. This is on the current version of Atomate2.
To Reproduce
from pymatgen.io.vasp import Outcar
from atomate2.vasp.schemas.calculation import RunStatistics
out = Outcar('OUTCAR.txt')
RunStatistics.from_outcar(out)  # from_outcar is a classmethod, so no instance is needed
This is related to VASP 6.2.0+. First, as seen in the attached OUTCAR, sometimes VASP reports N/A for average memory used, and so Pymatgen converts this to None. The workaround in Atomate2 is to accept either float or None (if that's not being done already). The second issue, regarding the number of cores, can be fixed upstream in Pymatgen. I opened a PR here: materialsproject/pymatgen#2308
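A minimal sketch of that kind of workaround, written with the standard library only rather than the actual pydantic model: missing memory values stay as None, and an unparsable core count falls back to 0. The helper name `clean_run_stats` and the key names are assumptions for illustration, not atomate2's code.

```python
from typing import Any, Mapping, Optional


def _to_float(value: Any) -> Optional[float]:
    """Return value as a float, or None when it is missing or not numeric (e.g. 'N/A')."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def clean_run_stats(run_stats: Mapping[str, Any]) -> dict:
    """Normalise a run-statistics mapping (hypothetical key names)."""
    cores = _to_float(run_stats.get("cores"))
    return {
        "average_memory": _to_float(run_stats.get("Average memory used (kb)")),
        "elapsed_time": _to_float(run_stats.get("Elapsed time (sec)")),
        # An unparsable core count falls back to 0 instead of raising.
        "cores": int(cores) if cores is not None else 0,
    }


stats = clean_run_stats(
    {"Average memory used (kb)": None, "Elapsed time (sec)": 12.5, "cores": "N/A"}
)
print(stats)
```

Keeping None for the memory fields preserves the fact that VASP did not report a value, while the integer field still receives something a strict integer validator would accept.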
In the code block below, it seems that HighSymmKpath is called without any supplemental kwargs. It would be ideal if the user could modify the HighSymmKpath args on-the-fly, e.g. to get path_type = 'latimer_munro' instead of path_type = 'setyawan_curtarolo'. I assume this is not currently possible, right? If not, I've left this issue open as a possible feature to add in the future.
atomate2/src/atomate2/vasp/sets/base.py
Lines 664 to 677 in 2eec3b6
As discussed briefly in #173, we would like to create add-on packages for atomate2 to support additional codes that maybe don't fit within existing workflows. LAMMPS seems to be one such case, and as such we have moved development out to a namespace package, atomate2-lammps.
Is this a contribution route that should be officially supported? I am happy to extract the generic stuff from our repository to make an add-on template similar to pymatgen-addon-template, and make a PR here with some explanation. I guess the alternative would be developing the separate add-on and then eventually folding it back into atomate2 when it is more mature.
Let me know what you think!
When parsing the following directory into a TaskDocument, the result should have a +1 charge (NELECT = 349 vs 350 for the neutral).
The current version of the Vasprun object in pymatgen gives Vasprun.structure.charge == -1 due to a bug fix in materialsproject/pymatgen#2577
Either way, with or without the fix to the sign, the current parsing of the directory using TaskDocument gives zero charge:
tdoc = TaskDocument.from_directory("./")
Gives:
tdoc.structure.charge # --> 0
Files to reproduce linked below:
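For reference, the expected charge follows from the electron count: the number of valence electrons in the neutral cell minus NELECT. Below is a small sketch of that arithmetic. The function name and the per-species numbers are made up for illustration; only the totals 350 and 349 come from the report above.

```python
def charge_from_nelect(zvals, counts, nelect):
    """Net charge = valence electrons of the neutral cell minus NELECT."""
    neutral_electrons = sum(zvals[el] * n for el, n in counts.items())
    return neutral_electrons - nelect


# Hypothetical composition whose neutral cell has 350 valence electrons.
zvals = {"A": 3.0, "B": 5.0}  # valence electrons per atom (as in POTCAR ZVAL)
counts = {"A": 50, "B": 40}   # atoms per species: 150 + 200 = 350 electrons

print(charge_from_nelect(zvals, counts, nelect=349))  # 1.0
```

With one electron removed relative to the neutral cell, the result is +1, which is the sign the parsed structure should carry.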
I wanted to share a few comments below regarding the documentation. It's excellent! It strikes a great balance between being concise/clear and having enough information to get started easily. I did have a few suggestions after reading through it, which I've included below. Feel free to address as little or as much of this as you wish. They're merely my opinions.
StaticMaker ran a static calculation, it wasn't immediately clear to me what other settings were changed. Should I be updating ISMEAR, or has Atomate2 already done that for me? Does it output AECCAR files by default, or do I need to take care of that? Of course, going to the code for the StaticSetGenerator clarified that for me, but I'm wondering if it might be helpful to include a hyperlink (in the documentation for the various Makers) to the various updates Atomate2 automatically applies.
VaspInputSetGenerator() don't need to be repeated in the documentation, a brief mention that other settings can be changed may be helpful for the reader.
VaspInputSetGenerator documentation, I initially was very unsure what a config_dict was. After about 15 minutes of realizing "there must be a better way" to handle entirely new input sets than defining massive user_incar_settings, user_kpoints_settings, etc. arguments, I then learned what it was after digging through the code itself. Perhaps a link to the BaseVaspSet.yaml file (as an example) might be helpful. Or something of that nature.
user_incar_settings option in the "modifying input parameters" section of the tutorial, perhaps it might be beneficial to mention that setting None will remove the user INCAR setting. I was fortunate to stumble upon it in the documentation for VaspInputSetGenerator, but this might be something to add to the main documentation as a note.
config_dict arguments. Just define your own .yaml file and away you go, for the most part.
The Atomate2 version property doesn't seem to work at the moment. This is also apparent when viewing the top of the documentation page where no version number is listed.
import atomate2
atomate2.__version__
Returns ''
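One common way to populate `__version__` is to read it from the installed package metadata and fall back to a placeholder when the distribution cannot be found. This is a generic sketch, not necessarily how atomate2 sets its version; the helper name `get_version` and the fallback string are assumptions.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name: str, fallback: str = "0.0.0+unknown") -> str:
    """Return the installed version of dist_name, or fallback if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return fallback


# In a package's __init__.py one might then write:
# __version__ = get_version("atomate2")
print(get_version("surely-not-an-installed-package"))
```

An explicit fallback makes a missing version visible, whereas an empty string is easy to overlook.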
Getting a strange bug at the very end of a DFPT calc which sadly means none of the results make it into the DB. Not sure where to start debugging, if someone else has any ideas, I'm all ears.
Stack trace:
INFO:jobflow.managers.local:dielectric failed with exception:
Traceback (most recent call last):
File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/managers/local.py", line 97, in _run_job
response = job.run(store=store)
File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/core/job.py", line 524, in run
response = function(*self.function_args, **self.function_kwargs)
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/jobs/base.py", line 150, in make
task_doc = TaskDocument.from_directory(Path.cwd(), **self.task_document_kwargs)
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/task.py", line 323, in from_directory
calc_doc, vasp_objects = Calculation.from_vasp_files(
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 646, in from_vasp_files
output_doc = CalculationOutput.from_vasp_outputs(
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 458, in from_vasp_outputs
_get_band_props(vasprun.complete_dos, structure)
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 814, in _get_band_props
"filling": complete_dos.get_band_filling(band=orb_type, elements=[el]),
AttributeError: 'CompleteDos' object has no attribute 'get_band_filling'
The error originates on this line:
2022-04-22 16:03:35,889 INFO Started executing jobs locally
2022-04-22 16:03:36,026 INFO Starting job - dielectric (336db245-27b5-4316-9a64-203065a61b9a)
WARNING in EDDRMM: call to ZHEGV failed
ERROR:custodian.custodian:VaspErrorHandler
No matching processes belonging to you were found
2022-04-23 17:26:50,175 INFO dielectric failed with exception:
INFO:jobflow.managers.local:dielectric failed with exception:
Traceback (most recent call last):
File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/managers/local.py", line 97, in _run_job
response = job.run(store=store)
File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/core/job.py", line 524, in run
response = function(*self.function_args, **self.function_kwargs)
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/jobs/base.py", line 150, in make
task_doc = TaskDocument.from_directory(Path.cwd(), **self.task_document_kwargs)
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/task.py", line 323, in from_directory
calc_doc, vasp_objects = Calculation.from_vasp_files(
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 646, in from_vasp_files
output_doc = CalculationOutput.from_vasp_outputs(
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 458, in from_vasp_outputs
_get_band_props(vasprun.complete_dos, structure)
File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 814, in _get_band_props
"filling": complete_dos.get_band_filling(band=orb_type, elements=[el]),
AttributeError: 'CompleteDos' object has no attribute 'get_band_filling'
2022-04-23 17:26:50,176 INFO Finished executing jobs locally
INFO:jobflow.managers.local:Finished executing jobs locally
Traceback (most recent call last):
File "/Users/janosh/dev/vasp/atomate2/perovskite-diel.py", line 41, in <module>
run_locally(diel_job, create_folders=True, ensure_success=True)
File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/managers/local.py", line 165, in run_locally
raise RuntimeError("Flow did not finish running successfully")
RuntimeError: Flow did not finish running successfully
Job script
import os
import warnings
from time import perf_counter
from atomate2.vasp.jobs.core import DielectricMaker
from atomate2.vasp.powerups import (
update_user_incar_settings,
update_user_kpoints_settings,
)
from jobflow import run_locally
from pymatgen.ext.matproj import MPRester
from pymatgen.io.vasp import Kpoints
os.environ["OMP_NUM_THREADS"] = "1"
warnings.filterwarnings("ignore") # ignore pymatgen warnings clogging up the logs
SrHfO3 = MPRester().get_structure_by_material_id("mp-13108")
# make a dielectric (DFPT) job for the structure
diel_job = DielectricMaker().make(SrHfO3)
diel_job = update_user_incar_settings(
diel_job, {"ENCUT": 700, "EDIFF": 1e-7, "NELM": 40}
)
auto_kpts = Kpoints.automatic_density(SrHfO3, 3000)
diel_job = update_user_kpoints_settings(diel_job, auto_kpts)
start = perf_counter()
run_locally(diel_job, create_folders=True)
elapsed = perf_counter() - start
print(f"SrHfO3 DFPT calc took {elapsed:.1f} sec")
Are there currently any plans on integrating LAMMPS?
I think there would also be many users. As there are atomate workflows, I assume there might be plans.
In the abinit jobs, the abinit process is killed before the end of the wall time by the python subprocess module. We should try to deal with those cases. In some cases, an automatic restart may be possible. In other cases, the user should do something (e.g. increase the time limit, change the number of processors, or something else).
Hi @JaGeo.
I just tried running the phonon workflow and noticed that the k-point density used is extremely high. It's set as grid_density=7000, which equals a reciprocal density of around 600 for silicon. Just for reference, the default reciprocal density in Atomate1 is 64 for insulators and 200 for metals. So this works out to almost 10x the default k-point density.
For silicon, this means a supercell with lengths of 22 Å is being run with a 2x2x2 k-point mesh!
I'd be very surprised if we need to go this high to get converged results. My feeling is that we could use reciprocal_density=100 and still get converged results. From my previous experience, I usually run displaced supercells with the same k-point density as the relaxation and this normally works well. My experience is that it is usually the force and energy convergence that are essential to get good phonons.
What do you think about:
grid_density to reciprocal_density, this means the number of k-points is independent of the number of atoms and only depends on the cell size.
reciprocal_density=100
Obviously, we should do some tests to make sure this gives reasonable results still.
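To make the comparison concrete, here is a rough sketch of how a per-atom grid density relates to a reciprocal density. It assumes the relation used by pymatgen's Kpoints.automatic_density_by_vol (grid density = reciprocal density × reciprocal cell volume × number of atoms, with the 2π factor included in the reciprocal lattice). The 512-atom cubic silicon supercell with a 21.72 Å edge is an illustrative assumption, not taken from the workflow.

```python
import math


def reciprocal_density_from_grid_density(grid_density, cell_volume, n_atoms):
    """Convert a per-atom grid density into k-points per reciprocal volume.

    Assumes grid_density = reciprocal_density * V_reciprocal * n_atoms,
    with V_reciprocal = (2*pi)**3 / cell_volume (cell_volume in cubic angstroms).
    """
    reciprocal_volume = (2 * math.pi) ** 3 / cell_volume
    return grid_density / (reciprocal_volume * n_atoms)


# Illustrative cubic Si supercell: 4x4x4 conventional cells, 512 atoms.
edge = 21.72  # angstrom (assumed)
density = reciprocal_density_from_grid_density(7000, edge**3, 512)
print(f"reciprocal density ~ {density:.0f}")
```

Under these assumptions the result lands in the same range as the "around 600" quoted above. Since reciprocal volume times atom count is roughly constant for a given material, the equivalent reciprocal density does not change with supercell size.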
atomate2/src/atomate2/vasp/flows/elastic.py
Line 113 in de92307
When running the elastic constants workflow for metallic systems (HEAs primarily), I find that the calculations for many of the deformations are 1/3 to 2/3 as expensive as the initial tight relaxation. This is especially true for the shear deformations. And with up to 24 of these calculations, that really adds up.
Perhaps it would be efficient to have each of the deformation calculations be run in two stages:
Could someone with a deep(er) understanding of elastic constant calculations weigh in on this? For my systems, these deformation calculations make up >80% of the run time of my flows, so there is a ton of potential upside.
Repeatedly writing Dos objects to the data store without blob_uuids
During the parsing of the TaskDocument, something is causing the code to repeatedly write Dos objects to the additional store with no way of retrieving them.
The example below is a standard static calculation with parse_dos.
The Chgcar and CompleteDos objects are both listed at output.vasp_objects in the TaskDocument, but there are 6 other Dos objects stored in the data S3Store.
blob_res = JOB_STORE.additional_stores["data"].query({"job_uuid": "fbd9fe16-c86a-4b75-8e8c-727815fb6ae1"})
doc = JOB_STORE.docs_store.query_one({"uuid": "fbd9fe16-c86a-4b75-8e8c-727815fb6ae1"})

def _check_blob(d, prefix=""):
    # recursively print the path of every dict that contains a `blob_uuid` key
    if isinstance(d, list):
        for i, v in enumerate(d):
            _check_blob(v, f"{prefix}.{str(i)}")
    elif isinstance(d, dict):
        for i, v in d.items():
            if i == "blob_uuid":
                print(prefix)
            _check_blob(v, f"{prefix}.{str(i)}")

# list every blob stored for this job
for d in blob_res:
    job_uuid = d["job_uuid"]
    print(f"job_uuid: {d['job_uuid']}")
    print(f"blob_uuid: {d['blob_uuid']}")
    print(f"class: {d['@class']}")

print("Follow paths have `blob_uuid` keys")
_check_blob(doc)
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: 4c2d511c-54fe-40fc-9c0c-fc1021ccf38f
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: 865ed9da-2ff6-448e-9c48-c2bb2b3a4fbd
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: f1d3c57e-54eb-4174-aa65-ba92caa43535
class: Chgcar
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: 7b26630a-b93b-41f6-81eb-0666d62042d4
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: d3d5d34a-635e-42a0-9e9a-daaa2d9e8dc2
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: db388489-7d2a-4008-b9e8-ebe853195112
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: 66467040-f58f-473d-b34a-c7a57983ef8a
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: bb52fd52-f2f5-4a1d-bb57-78999e22f0e0
class: CompleteDos
Follow paths have `blob_uuid` keys
.output.vasp_objects.chgcar
.output.vasp_objects.dos
@utf , I saw this updated wiki page of VASP recently: https://www.vasp.at/wiki/index.php/LORBIT
It might be interesting to keep this in mind for future changes. LORBIT=14 seems to be recommended now and LORBIT=12 needs to be run without symmetry for VASP versions older than VASP 6.
Boy am I glad to see you're using pytest.
I noticed you wrote your own tmp_dir fixture.
Lines 46 to 58 in 614ace0
pytest ships with a bunch of builtin fixtures including tmp_path, which gives a pathlib.Path (and tmpdir, which gives a py.path.local, though that's scheduled for deprecation).
Happy to submit a PR to use the builtin (unless I'm missing some special requirement).
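For illustration, a test that relies on the builtin tmp_path fixture could look like the sketch below. The function under test, `write_incar`, is a made-up stand-in. Whether the custom tmp_dir fixture does extra work (such as changing the working directory) is not covered here.

```python
from pathlib import Path


def write_incar(directory: Path, settings: dict) -> Path:
    """Hypothetical helper: write KEY = VALUE lines to directory/INCAR."""
    path = directory / "INCAR"
    path.write_text("\n".join(f"{k} = {v}" for k, v in settings.items()))
    return path


def test_write_incar(tmp_path: Path) -> None:
    # pytest injects tmp_path as a fresh pathlib.Path unique to this test
    incar = write_incar(tmp_path, {"ENCUT": 520})
    assert incar.read_text() == "ENCUT = 520"
```

Because tmp_path is a plain pathlib.Path, the test body needs no setup or cleanup code of its own.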
Since atomate2 in combination with jobflow does not require much configuration, we would like to include Lobster support in atomate2 as well (similar to what we have done with atomate, https://chemistry-europe.onlinelibrary.wiley.com/doi/full/10.1002/cplu.202200123 )
If this is welcome, we (@naik-aakash and I) would start working on it.
In StaticSetGenerator and TightRelaxSetGenerator, the ALGO flag is updated to "Normal". Unless there is a strong reason to keep it at "Normal", my personal suggestion would be to not specify ALGO as an update here. There are two reasons behind this suggestion:
config_dict) to land on something that works. Having the static calculation revert it back to "Normal" after this seems like it might be counterproductive.
config_dict or passed as a user_incar_setting earlier in the flow. Overwriting this to Normal may be problematic. Of course, using user_incar_settings = {"ALGO": "All"} in the StaticSetGenerator() would resolve this, but it's only obvious this is necessary after running the flow and seeing that ALGO was somewhat unexpectedly changed.
While I'm on this topic, I might suggest adding "LASPH: True" to all HSE generators in vasp.sets.core. This is already included in the default config_dict, but there is no guarantee that a user will remember to do this in a new config_dict, even though VASP suggests using LASPH = True for hybrids. This is related to a warning I introduced in Pymatgen here.
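The override order described in this issue can be pictured with a simplified dictionary merge. This is only a model of the reported behaviour (config_dict defaults, then generator updates, then user_incar_settings), not atomate2's actual implementation, and the function name is hypothetical.

```python
def resolve_incar(config_incar, generator_updates, user_incar_settings):
    """Merge INCAR settings so that later sources override earlier ones."""
    return {**config_incar, **generator_updates, **user_incar_settings}


config = {"ALGO": "All", "LASPH": True}  # e.g. a custom config_dict
updates = {"ALGO": "Normal"}             # what the generator applies

print(resolve_incar(config, updates, {}))               # ALGO becomes "Normal"
print(resolve_incar(config, updates, {"ALGO": "All"}))  # user setting restores "All"
```

In this model, dropping ALGO from the generator updates would leave the config_dict value untouched, which is the behaviour the suggestion asks for.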