materialsproject / atomate2

atomate2 is a library of computational materials science workflows

Home Page: https://materialsproject.github.io/atomate2/

License: Other

Python 99.56% Jupyter Notebook 0.44%

materials-science high-throughput automation dft vasp

atomate2's People

josephmontoya-tri davidwaroquiers malik-ust shiqiaol itsduowang zhuoying rkingsbury jmmshn gpetretto mjwen fermiq nwinner sailfish009 jageo simon-nak orionarcher rdguha1995 jfajardorojas pmiam quantumchemist qianchenqc fraricci matgenix tllu leslie-zheng nityasagarjena mattmcdermott rsilvabuarque naik-aakash shdchen hrushikesh-s matthewkuner lemythe kamronald thomasrockhu-codecov mcgalcode tinaatucsd chiang-yuan jonathanschmidt1 esoteric-ephemera sophiaruan tpurcell90 ab5424 comprhys antobi jonasgrandel wch3n danielzuegner bryantli-bli wuz75 jiqi535 mkhorton andrew-s-rosen iloncaric wangzyphysics ml-evs rul048 guymoore13 yanghan234 tatha0003 gmatteo danielyang59 bleerian rohithsrinivaas sjtuzhanglei arsalan-akhtar emarazzi hongyi-zhao lory-w isabella232 katnykiel cote3804 cakirtufan zhubonan

atomate2's Issues

Feature: Support for different makers in the DoubleRelaxMaker

In the DoubleRelaxMaker flow, currently the same relax_maker is used for step 1 and step 2. It would be desirable to split this up into two makers (which by default are the same) in case the user wants to do something like a half k-point relaxation for step 1.

I can take care of this modification. This is mainly a reminder for myself.
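
A minimal sketch of the proposed split, using plain dataclasses rather than the real atomate2 classes. `StubRelaxMaker`, `StubDoubleRelaxMaker`, `relax_maker1`, `relax_maker2`, and the `kpoint_scale` field are all hypothetical names chosen for illustration; the second maker falls back to the first when it is not given.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StubRelaxMaker:
    """Stand-in for a relax maker (hypothetical, not the atomate2 class)."""

    name: str = "relax"
    kpoint_scale: float = 1.0  # hypothetical knob, e.g. 0.5 for a coarse first pass

    def make(self, structure: str) -> str:
        # A real maker would return a job; here we only describe the step.
        return f"{self.name} on {structure} (k-point scale {self.kpoint_scale})"


@dataclass
class StubDoubleRelaxMaker:
    """Two relaxation steps, each with its own maker."""

    relax_maker1: StubRelaxMaker
    relax_maker2: Optional[StubRelaxMaker] = None  # None means "same as step 1"

    def make(self, structure: str) -> List[str]:
        second = self.relax_maker2 or self.relax_maker1
        return [self.relax_maker1.make(structure), second.make(structure)]


# Default behaviour: both steps use the same maker.
same = StubDoubleRelaxMaker(StubRelaxMaker()).make("Mg3Sb2")

# Coarser first step, full settings for the second.
split = StubDoubleRelaxMaker(
    relax_maker1=StubRelaxMaker(name="relax 1", kpoint_scale=0.5),
    relax_maker2=StubRelaxMaker(name="relax 2"),
).make("Mg3Sb2")

print(same)
print(split)
```

Keeping the second maker optional preserves the current behaviour for users who pass only one maker.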

BUG: accuracy of output structure

Describe the bug
This is potentially related to a problem that occurs in the phonon workflow for one structure: BORN charges that do not reflect the correct symmetry. This then results in a wrong non-analytical term correction around Gamma.

I start with this POSCAR:

Mg3 Sb2
1.0
   2.3003714889113018   -3.9843602950772405    0.0000000000000000
   2.3003714889113018    3.9843602950772405    0.0000000000000000
   0.0000000000000000    0.0000000000000000    7.2813299999999996
Mg Sb
3 2
direct
   0.0000000000000000    0.0000000000000000    0.0000000000000000 Mg2+
   0.3333333333333333    0.6666666666666666    0.3683250000000000 Mg2+
   0.6666666666666667    0.3333333333333334    0.6316750000000000 Mg2+
   0.3333333333333333    0.6666666666666666    0.7747490000000000 Sb3-
   0.6666666666666667    0.3333333333333334    0.2252510000000000 Sb3-

Then, I run the following workflow:

from jobflow import Flow, run_locally
from pymatgen.core.structure import Structure

from atomate2.vasp.jobs.core import DielectricMaker, TightRelaxMaker
from atomate2.vasp.powerups import update_user_incar_settings

struct = Structure.from_file("POSCAR")
job = TightRelaxMaker().make(structure=struct)
job2 = DielectricMaker().make(structure=job.output.structure)
flow = Flow([job, job2], job2.output)
flow = update_user_incar_settings(flow, {"NPAR": 8}, class_filter=TightRelaxMaker)
run_locally(flow, create_folders=True)

The structure after the tight relaxation looks like this:

Mg3 Sb2                                 
   1.0000000000000000     
     2.2774694095435986   -3.9446927300134034    0.0000000000000000
     2.2774694095435986    3.9446927300134034   -0.0000000000000000
    -0.0000000000000000   -0.0000000000000000    7.1854093835379320
   Mg   Sb
     3     2
Direct
 -0.0000000000000000 -0.0000000000000000 -0.0000000000000000
  0.3333333333333357  0.6666666666666643  0.3675316159328170
  0.6666666666666643  0.3333333333333357  0.6324683840671832
  0.3333333333333357  0.6666666666666643  0.7761920492952012
  0.6666666666666643  0.3333333333333357  0.2238079507047988

  0.00000000E+00  0.00000000E+00  0.00000000E+00
  0.00000000E+00  0.00000000E+00  0.00000000E+00
  0.00000000E+00  0.00000000E+00  0.00000000E+00
  0.00000000E+00  0.00000000E+00  0.00000000E+00
  0.00000000E+00  0.00000000E+00  0.00000000E+00

The dielectric run then starts with:

Mg3 Sb2
1.0
   2.2774694100000001   -3.9446927299999999    0.0000000000000000
   2.2774694100000001    3.9446927299999999   -0.0000000000000000
  -0.0000000000000000   -0.0000000000000000    7.1854093800000003
Mg Sb
3 2
direct
  -0.0000000000000000   -0.0000000000000000   -0.0000000000000000 Mg
   0.3333333300000000    0.6666666700000000    0.3675316200000000 Mg
   0.6666666700000000    0.3333333300000000    0.6324683800000001 Mg
   0.3333333300000000    0.6666666700000000    0.7761920500000000 Sb
   0.6666666700000000    0.3333333300000000    0.2238079500000000 Sb

Clearly, this is less accurate: the lattice vectors and fractional coordinates are truncated to eight decimal places.

If someone already has a suspicion why this happens, I would be happy about any hint. Otherwise, I will go through the code and check how to fix this. I am currently not completely sure if this is leading to the symmetry problems in the BORN charges but it looks likely to me.
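
To make the loss of precision concrete, here is a small sketch comparing values copied from the two files above. That the numbers were rounded to eight decimal places is inferred from those files, not from the atomate2 code.

```python
# Fractional x-coordinate of the second Mg site in both files.
contcar_x = 0.3333333333333357  # CONTCAR after the tight relaxation
poscar_x = 0.3333333300000000   # POSCAR of the dielectric run

# Rounding the CONTCAR value to 8 decimals reproduces the POSCAR value.
rounded = round(contcar_x, 8)
deviation = abs(contcar_x - poscar_x)

print(f"rounded to 8 decimals: {rounded:.16f}")
print(f"deviation from CONTCAR: {deviation:.2e}")

# The same check for the first lattice-vector component.
contcar_a = 2.2774694095435986
print(f"lattice component rounded: {round(contcar_a, 8):.16f}")
```

A deviation of a few 1e-9 in fractional coordinates is small, but it may exceed a tight symmetry tolerance, which would be consistent with the BORN charges losing their symmetry.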

FEATURE: Converge on (part) of the TaskDocument

To enable "code-agnostic" workflows, we should try to converge on some parts of the TaskDocuments. One obvious example is the finite-difference elastic constants workflow, where the deformations could be performed using "any" code, provided the response follows some convention, i.e. it should have the energy, the forces, the stress, ...
This needs some discussion with @utf on how to handle it. We might consider having some general pydantic models that are subclassed for the different codes (somewhat similar to StructureMetadata).

MAGMOM from INCAR does not work

I set a MAGMOM in my INCAR and used pymatgen to read it as the input set. This makes a 'magmoms' list, but I think atomate2 wants it to be a dictionary:

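from pymatgen.core import Structure
from pymatgen.io.vasp import Incar
from atomate2.vasp.jobs.core import RelaxMaker
from atomate2.vasp.sets.core import RelaxSetGenerator
my_structure = Structure.from_file('POSCAR')
my_incar = Incar.from_file('INCAR')
my_input_set = RelaxSetGenerator(user_incar_settings=my_incar.as_dict())
relax_job = RelaxMaker(input_set_generator=my_input_set).make(structure=my_structure)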

Error:

File ".../atomate2/lib/python3.9/site-packages/atomate2/vasp/sets/base.py", line 857, in _get_magmoms
    mag.append(magmoms.get(site.specie.symbol, 0.6))
AttributeError: 'list' object has no attribute 'get'

atomate2/src/atomate2/vasp/sets/base.py

Line 851 in 4a9dc11

def _get_magmoms(magmoms, structure):
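
One way to accept both forms is sketched below. It is not the atomate2 implementation: `resolve_magmoms` is a hypothetical helper that takes plain element symbols instead of a pymatgen structure. A dict is treated as a per-element mapping (with the 0.6 fallback seen in the traceback), and a list is treated as one value per site, which is the form reported in this issue.

```python
from typing import Dict, List, Sequence, Union


def resolve_magmoms(
    magmoms: Union[Dict[str, float], Sequence[float]],
    symbols: Sequence[str],
    default: float = 0.6,
) -> List[float]:
    """Return one magnetic moment per site (hypothetical helper)."""
    if isinstance(magmoms, dict):
        # Per-element mapping, e.g. {"Fe": 5.0}; unknown elements get the default.
        return [magmoms.get(symbol, default) for symbol in symbols]

    # Otherwise assume a per-site sequence, as read from an INCAR.
    values = list(magmoms)
    if len(values) != len(symbols):
        raise ValueError(
            f"Got {len(values)} magnetic moments for {len(symbols)} sites"
        )
    return values


sites = ["Fe", "Fe", "O"]
print(resolve_magmoms({"Fe": 5.0}, sites))       # per-element dict
print(resolve_magmoms([5.0, -5.0, 0.0], sites))  # per-site list
```

Raising on a length mismatch avoids silently assigning moments to the wrong sites.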

BUG: Can't run FireWorks example

Describe the bug
I ran the test code here after changing si_structure to mgo_structure (looks like a typo) with the current version of Atomate2 and FireWorks 1.9.7. I got back the following error. Any ideas where I might have gone wrong or why the error has come up? I don't know enough about FireWorks yet to be sure.

Traceback (most recent call last):
  File "test.py", line 21, in <module>
    lpad.add_wf(wf)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/launchpad.py", line 427, in add_wf
    old_new = self._upsert_fws(list(wf.id_fw.values()),
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/launchpad.py", line 1726, in _upsert_fws
    self.fireworks.insert_many((fw.to_db_dict() for fw in fws))
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/pymongo/collection.py", line 769, in insert_many
    blk.ops = [doc for doc in gen()]
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/pymongo/collection.py", line 769, in <listcomp>
    blk.ops = [doc for doc in gen()]
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/pymongo/collection.py", line 759, in gen
    for document in documents:
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/launchpad.py", line 1726, in <genexpr>
    self.fireworks.insert_many((fw.to_db_dict() for fw in fws))
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/firework.py", line 319, in to_db_dict
    m_dict = self.to_dict()
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 146, in _decorator
    m_dict = func(self, *args, **kwargs)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/firework.py", line 275, in to_dict
    spec['_tasks'] = [t.to_dict() for t in self.tasks]
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/core/firework.py", line 275, in <listcomp>
    spec['_tasks'] = [t.to_dict() for t in self.tasks]
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 175, in _decorator
    m_dict = func(self, *args, **kwargs)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 147, in _decorator
    m_dict = recursive_dict(m_dict)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 86, in recursive_dict
    return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 86, in <dictcomp>
    return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 80, in recursive_dict
    return recursive_dict(obj.as_dict(), preserve_unicode)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 86, in recursive_dict
    return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 86, in <dictcomp>
    return {recursive_dict(k, preserve_unicode): recursive_dict(v, preserve_unicode)
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 90, in recursive_dict
    return [recursive_dict(v, preserve_unicode) for v in obj]
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 90, in <listcomp>
    return [recursive_dict(v, preserve_unicode) for v in obj]
  File "/global/homes/r/rosen/software/miniconda/envs/cms2/lib/python3.8/site-packages/fireworks/utilities/fw_serializers.py", line 80, in recursive_dict
    return recursive_dict(obj.as_dict(), preserve_unicode)
TypeError: as_dict() missing 1 required positional argument: 'self'

Idea for job composition showcase: HSE@GGA

Not sure if this is a good idea, just a suggestion.

atomate2 (or rather jobflow) is built to have composable jobs. Would it make a good example for the docs to show off how this can be used to get a more precise band gap by using a GGA job to find the VBM and CBM k-points followed by an HSE job to get their precise energy levels? I.e. what is referred to in the literature as HSE@GGA?

BUG: Additional input args are added to the INCAR from Vasprun

Describe the bug
Due to the logic shown here in the VaspInputSetGenerator, the INCAR flags that get carried over to a child calculation are taken from the vasprun.xml file. However, this can lead to very confusing behavior. For instance, in the attached files, I never set LMAXTAU or KPOINT_BSE in my config_dict or user_incar_settings, yet they appear in any child jobs. This is because, although these parameters weren't set in the INCAR, they were included in vasprun.incar. For some reason, VASP writes these parameters to the INCAR section of vasprun.xml even though they were never explicitly set.

If the INCAR is available, my suggestion would be to prioritize reading that over the vasprun.xml and only read the latter if the INCAR has vanished.
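
The proposed priority could look roughly like the sketch below. `find_incar_source` is a hypothetical helper, and the candidate file names (including the `.gz` variants) are assumptions; parsing the chosen file is left out.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Checked in order: the plain-text INCAR wins over vasprun.xml (assumed names).
CANDIDATES = ("INCAR", "INCAR.gz", "vasprun.xml", "vasprun.xml.gz")


def find_incar_source(dir_name: Path) -> Optional[Path]:
    """Return the first existing, non-empty file to read INCAR tags from."""
    for name in CANDIDATES:
        path = dir_name / name
        if path.is_file() and path.stat().st_size > 0:
            return path
    return None


with tempfile.TemporaryDirectory() as tmp:
    calc_dir = Path(tmp)
    (calc_dir / "vasprun.xml").write_text("<modeling/>")
    only_vasprun = find_incar_source(calc_dir)

    (calc_dir / "INCAR").write_text("ENCUT = 680\n")
    with_incar = find_incar_source(calc_dir)

print(only_vasprun.name, with_incar.name)
```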

INCAR.orig.txt
INCAR.txt
vasprun.xml.txt

CONTCAR -> POSCAR check

Sometimes the CONTCAR file gets created by VASP but is empty. This is not a big problem since we copy the file first, but a helpful error message would be useful when something goes wrong. We could validate the CONTCAR before copying and throw an error if it is empty.

atomate2/src/atomate2/vasp/files.py

Line 101 in d21bdac

if contcar_to_poscar:
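
A minimal sketch of such a check, assuming the CONTCAR is a plain (uncompressed) text file; `check_contcar` is a hypothetical helper, not part of atomate2.

```python
import tempfile
from pathlib import Path


def check_contcar(path: Path) -> None:
    """Raise a descriptive error if the CONTCAR is missing or empty."""
    if not path.is_file():
        raise FileNotFoundError(f"{path} does not exist")
    if not path.read_text().strip():
        raise ValueError(f"{path} is empty; VASP may have stopped before writing it")


with tempfile.TemporaryDirectory() as tmp:
    contcar = Path(tmp) / "CONTCAR"
    contcar.touch()  # simulate the empty file VASP sometimes leaves behind
    try:
        check_contcar(contcar)
        message = ""
    except ValueError as exc:
        message = str(exc)

print(message)
```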

Should `task_label` always match job name?

This line is causing some problems for me:

atomate2/src/atomate2/vasp/jobs/base.py

Line 149 in a31b86b

task_doc.task_label = self.name

This seems to indicate that if the user changes the name of a job (e.g., by adding formulas to the names), it will break subsequent querying of the resulting tasks database.

I like seeing the formula names in the FW web GUI as I'm working on new workflows. Not sure how everyone else feels.

Documentation (fireworks)

I did not find any documentation on how to run jobs from atomate2 with multiple workers in fireworks. I assume this would be too hard to find out for a new user of atomate2 on their own.

Molecule schema should probably be merged within emmet

Currently, the MoleculeMetadata class is in Atomate2 here. Longer term, it probably makes sense to either port it to emmet where it can co-exist with StructureMetadata or, potentially better yet, adopt/merge with whatever @espottesmith is planning to do (or has already done) in terms of making a MoleculeMetadata for MP. For now, what we have works though.

dir_name name not necessary

atomate2/src/atomate2/vasp/schemas/calculation.py

Line 669 in 3865460

vasprun_file = dir_name / vasprun_file

I think that the "*_file" variable already contains the full path, so adding dir_name is not necessary. It works with absolute paths because joining a path with an absolute path discards the left-hand side, as in:
Path.cwd() / Path.cwd()
where the output is just one single path (no duplicate).

However, this fails with relative paths, because Path then concatenates the two parts:

In [140]: Path("./outputs/") / Path("./outputs/")
Out[140]: PosixPath('outputs/outputs')

Therefore, removing dir_name should also make this work when a relative path is provided.

Just for context, I was doing some tests creating a TaskDocument manually:

TaskDocument.from_directory(Path("./ferroelectric--23779/polarization_nonpolar/outputs/"))

FileNotFoundError: [Errno 2] No such file or directory: 'ferroelectric--23779/polarization_nonpolar/outputs/ferroelectric--23779/polarization_nonpolar/outputs/vasprun.xml.gz'

where you can see the duplicate path.

Please, double check this. It might be happening somewhere else too.
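
The behaviour is easy to reproduce with `pathlib` alone. The sketch below uses `PurePosixPath` so it behaves the same on every platform; `resolve_in_dir` is a hypothetical helper that prepends the directory only when the file path is a bare file name, which is an assumed convention rather than what atomate2 does.

```python
from pathlib import PurePosixPath

# Joining with an absolute path discards the left operand ...
absolute = PurePosixPath("/calc/outputs") / PurePosixPath("/calc/outputs/vasprun.xml")
# ... but joining two relative paths concatenates them.
relative = PurePosixPath("outputs") / PurePosixPath("outputs/vasprun.xml")


def resolve_in_dir(dir_name: PurePosixPath, file_path: PurePosixPath) -> PurePosixPath:
    """Prepend dir_name only if file_path is a bare file name (assumed convention)."""
    if file_path.is_absolute() or file_path.parent != PurePosixPath("."):
        return file_path
    return dir_name / file_path


print(absolute)  # /calc/outputs/vasprun.xml
print(relative)  # outputs/outputs/vasprun.xml
print(resolve_in_dir(PurePosixPath("outputs"), PurePosixPath("outputs/vasprun.xml")))
print(resolve_in_dir(PurePosixPath("outputs"), PurePosixPath("vasprun.xml")))
```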

BUG: ivdw not included in INCAR parameters

This is more a bug associated with the vasprun.xml format and, by extension, a limitation of pymatgen's Vasprun('vasprun.xml').incar object, but the IVDW flag is not included in the input.incar set of parameters below even if it is set in the INCAR.

atomate2/src/atomate2/vasp/schemas/calculation.py

Line 167 in 7065ce3

incar=dict(vasprun.incar),

This is because IVDW is not written to vasprun.xml, so the INCAR flags recorded in vasprun.xml do not match those in the INCAR file.

Do you think it might be better to read in the INCAR flags from the INCAR itself, and if the INCAR isn't present, then pull them from the vasprun.xml? This would also be nice because there are a few extraneous flags included when the INCAR flags are pulled from the vasprun.xml (e.g. KPOINT_BSE and LMAXTAU, even if they aren't set in the INCAR).

BUG: Storing more than CHGCAR when only CHGCAR is requested

Description
If I want to store just CHGCAR

maker = StaticMaker(task_document_kwargs={"store_volumetric_data": ["CHGCAR"]})

I'm also getting a bunch of other volumetric files written to the db.

Improvement of preconvergence step lobster workflow

@utf, there is still one other issue with the Lobster workflow. I have implemented the pre-convergence step for the WAVECAR, but I have the feeling that the speed-up is extremely small, and I am not sure why. VASP still needs many electronic steps after starting from such a pre-converged WAVECAR. I might need to do some more tests to make this really efficient, or test some larger structures.

FEATURE: Make the abinit BandStructure job follow the Vasp BandStructure job

In Vasp, NonScfSetGenerator is defined by a reciprocal_density (for uniform band structures) or by a line_density (for line band structures). It would be nice to be able to use the same convention in Abinit. We should keep the possibility to use the "usual" abinit way also (uniform band structures defined by kppa, and line band structures defined by ndivsm) as abinit users may be more inclined to use that convention.

User error: Pytest on Python 3.8, Windows

I am unsuccessful in running the test suite on Windows, Python 3.8.

In a fresh Python 3.8 environment, I've done:

pip install -r requirements.txt
pip install .[tests]
pytest

but all the jobs/flows fail due to

Traceback (most recent call last):
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\jobflow\managers\local.py", line 98, in _run_job
    response = job.run(store=store)
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\jobflow\core\job.py", line 524, in run
    response = function(*self.function_args, **self.function_kwargs)
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\atomate2\vasp\jobs\base.py", line 147, in make
    run_vasp(**self.run_vasp_kwargs)
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\atomate2\vasp\run.py", line 167, in run_vasp
    c.run()
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\custodian\custodian.py", line 367, in run
    self._run_job(job_n, job)
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\custodian\custodian.py", line 440, in _run_job
    p = job.run()
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\site-packages\custodian\vasp\jobs.py", line 255, in run
    return subprocess.Popen(cmd, stdout=f_std, stderr=f_err)  # pylint: disable=R1732
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\asros\miniconda3\envs\atomate2\lib\subprocess.py", line 1311, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

Feature Suggestion: Add a kwarg to the VASP TaskDocument that can drop null entries

My VASP TaskDocuments are often filled with lots of null and empty objects that I will likely never populate. This is a slight inconvenience when exploring my datasets in MongoDB. It'd be nice to add a kwarg to the TaskDocument to recursively drop all key-value pairs that have None as the value (and maybe {} or []?).

For a dictionary, it'd be something like:

from typing import Any, Dict


def _remove_empties(d: Dict[str, Any]) -> Dict[str, Any]:
    """
    For a given dictionary, recursively remove all items that are None
    or are empty lists/dicts.

    Parameters
    ----------
    d
        Dictionary to clean

    Returns
    -------
    Dict
        Cleaned dictionary
    """

    if isinstance(d, dict):
        return {
            k: _remove_empties(v)
            for k, v in d.items()
            if v is not None and v != [] and v != {}
        }
    if isinstance(d, list):
        return [_remove_empties(v) for v in d]
    return d

It's a bit trickier with the TaskDocument object though.
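
Note that the filter above runs before recursion, so a value that only becomes empty after cleaning (such as `{"a": {"b": None}}`) is left behind as an empty dict. A variant that prunes after recursing is sketched below; `remove_empties` and `_is_empty` are hypothetical names, and dropping empty lists and dicts as well as `None` follows the suggestion above.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    # Only None, [] and {} count as empty; 0, False and "" are kept.
    return value is None or (isinstance(value, (list, dict)) and len(value) == 0)


def remove_empties(obj: Any) -> Any:
    """Recursively drop None values and empty lists/dicts."""
    if isinstance(obj, dict):
        cleaned = {k: remove_empties(v) for k, v in obj.items()}
        return {k: v for k, v in cleaned.items() if not _is_empty(v)}
    if isinstance(obj, list):
        cleaned_items = [remove_empties(v) for v in obj]
        return [v for v in cleaned_items if not _is_empty(v)]
    return obj


doc = {
    "energy": -1.5,
    "bader": None,
    "tags": [],
    "input": {"incar": {"IVDW": None}},
    "nsites": 0,
}
print(remove_empties(doc))
```

One caveat: removing entries from lists shifts the indices of the remaining items, so this may be undesirable for lists where position carries meaning.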

BUG: Optimization of NaCl

Describe the bug
ISMEAR=2 is set for non-metals.

I try to optimize the structure of NaCl with the following script (atomate development version):

from atomate2.vasp.jobs.core import RelaxMaker
from jobflow import run_locally
from pymatgen.core import Structure

from atomate2.vasp.powerups import update_user_incar_settings

structure = Structure.from_file("POSCAR.NaCl.vasp")
relax_job = RelaxMaker().make(structure)
relax_job = update_user_incar_settings(relax_job, {"NPAR": 4}, class_filter=RelaxMaker)

run_locally(relax_job, create_folders=True)

POSCAR.NaCl.vasp

Na4 Cl4
1.0
5.691694 0.000000 0.000000
0.000000 5.691694 0.000000
0.000000 0.000000 5.691694
Na Cl
4 4
direct
0.000000 0.000000 0.000000 Na+
0.000000 0.500000 0.500000 Na+
0.500000 0.000000 0.500000 Na+
0.500000 0.500000 0.000000 Na+
0.500000 0.000000 0.000000 Cl-
0.500000 0.500000 0.500000 Cl-
0.000000 0.000000 0.500000 Cl-
0.000000 0.500000 0.000000 Cl-

I end up with this INCAR:

ALGO = Fast
EDIFF = 1e-05
EDIFFG = -0.02
ENAUG = 1360
ENCUT = 680
GGA = Ps
IBRION = 2
ISIF = 3
ISMEAR = 2
ISPIN = 2
KSPACING = 0.22
LAECHG = True
LASPH = True
LCHARG = False
LELF = False
LMIXTAU = True
LORBIT = 11
LREAL = Auto
LVTOT = True
LWAVE = False
MAGMOM = 8*0.6
NELM = 200
NPAR = 4
NSW = 99
PREC = Accurate
SIGMA = 0.2

I think it is due to the combination of this line here

atomate2/src/atomate2/vasp/sets/core.py

Line 43 in 73b9b3e

bandgap: float = 0,

And this one here:

atomate2/src/atomate2/vasp/sets/base.py

Line 976 in 73b9b3e

incar["ISMEAR"] = user_incar_settings.get("ISMEAR", 2)

I would suggest switching to a default value of "None" to avoid such errors. Happy to hear what other people think.

(I can of course try to fix it but I suspect it might concern many parts of the code ...)
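
A sketch of what a `None` default could look like. `smearing_settings` is a hypothetical helper, not the atomate2 code. The metallic values (`ISMEAR = 2`, `SIGMA = 0.2`) are those from the INCAR above; the values used for a gapped system (`ISMEAR = -5`, `SIGMA = 0.05`) are an assumption made for illustration.

```python
from typing import Dict, Optional, Union

IncarUpdates = Dict[str, Union[int, float]]


def smearing_settings(bandgap: Optional[float] = None) -> IncarUpdates:
    """Pick smearing tags from a band gap in eV; None means 'unknown'."""
    if bandgap is None:
        # No information: do not override whatever the input set specifies.
        return {}
    if bandgap == 0:
        # Known metal: Methfessel-Paxton smearing, as in the INCAR above.
        return {"ISMEAR": 2, "SIGMA": 0.2}
    # Known gap: tetrahedron method (assumed choice for this sketch).
    return {"ISMEAR": -5, "SIGMA": 0.05}


print(smearing_settings())     # {}
print(smearing_settings(0))    # {'ISMEAR': 2, 'SIGMA': 0.2}
print(smearing_settings(5.0))  # {'ISMEAR': -5, 'SIGMA': 0.05}
```

Distinguishing "unknown" from "zero" is the point of the change: with a default of 0, a material whose gap was never computed is treated as a metal.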

BUG: The command for gamma-only vasp is being called in unexpected places

Describe the bug

This is a bit hard to reproduce because it only shows up after a VASP calculation fails and custodian restarts it.
I end up with the following error in my vasp.out

 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     Command line argument 'srun' was not understood.                        |
|                                                                             |
 -----------------------------------------------------------------------------

 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     Command line argument '-N8' was not understood.                         |
|                                                                             |
 -----------------------------------------------------------------------------

 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     Command line argument '-c2' was not understood.                         |
|                                                                             |
 -----------------------------------------------------------------------------

 -----------------------------------------------------------------------------
|                                                                             |
|           W    W    AA    RRRRR   N    N  II  N    N   GGGG   !!!           |
|           W    W   A  A   R    R  NN   N  II  NN   N  G    G  !!!           |
|           W    W  A    A  R    R  N N  N  II  N N  N  G       !!!           |
|           W WW W  AAAAAA  RRRRR   N  N N  II  N  N N  G  GGG   !            |
|           WW  WW  A    A  R   R   N   NN  II  N   NN  G    G                |
|           W    W  A    A  R    R  N    N  II  N    N   GGGG   !!!           |
|                                                                             |
|     Command line argument '/g/g20/shen9/compiled/vasp_gam_63_quartz'        |
|     was not understood.                                                     |
|                                                                             |
 -----------------------------------------------------------------------------

This is my .atomate2.yaml:

VASP_CMD: "srun -N8 -c2 /g/g20/shen9/compiled/vasp_std_63_quartz"
VASP_GAMMA_CMD: "srun -N8 -c2 /g/g20/shen9/compiled/vasp_gam_63_quartz"

So it looks like VASP_GAMMA_CMD is being used here (possibly appended after the vasp_std call) for some reason and I'm not familiar enough with that part of the code to see why this can happen.

This is not breaking anything for me at the moment, but I wanted to report it for future reference.

Suggestion: Task documents based on NOMAD parsers

Like the cclib-based task documents we have that make it easy to generate task documents for most molecular DFT codes, it'd be ideal I think to also add parallel support for task documents generated using NOMAD parsers (e.g. see here). The idea is that, right out of the box, Atomate2 would have structured input and output data for virtually all the codes users would be interested in. Of course, for codes of particular value to the community, custom task documents can be made (like the VASP one), but this could reduce the barrier for getting started with new codes in my opinion.

If it's of interest, this will go on my to-do list. Like with the cclib-based task documents, this would involve an optional dependency (pip install nomad-lab[parsing]).

BUG: Pinning a version of numpy in requirements.txt

In a fresh Python 3.8 environment, I've run pip install -r requirements.txt and pip install .[tests] but get numpy issues in return. By default, I'm running numpy 1.21.5 with the above procedure. Upgrading to numpy 1.22.1 resolves it.

It might be good to pin a version to numpy in requirements.txt to resolve this (or might involve shuffling the order of some of the packages in requirements.txt so pymatgen is compatible with the numpy version).

______________________ ERROR collecting test session _______________________
..\..\miniconda3\envs\atomate2\lib\importlib\__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1014: in _gcd_import
    ???
<frozen importlib._bootstrap>:991: in _find_and_load
    ???
<frozen importlib._bootstrap>:975: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:671: in _load_unlocked
    ???
..\..\miniconda3\envs\atomate2\lib\site-packages\_pytest\assertion\rewrite.py:170: in exec_module
    exec(co, module.__dict__)
tests\vasp\schemas\conftest.py:99: in <module>
    class SiNonSCFUniform(SchemaTestData):
tests\vasp\schemas\conftest.py:100: in SiNonSCFUniform
    from atomate2.vasp.schemas.calculation import VaspObject
..\..\miniconda3\envs\atomate2\lib\site-packages\atomate2\vasp\schemas\calculation.py:12: in <module>
    from pymatgen.command_line.bader_caller import bader_analysis_from_path
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\command_line\bader_caller.py:30: in <module>
    from pymatgen.io.cube import Cube
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\io\cube.py:46: in <module>
    from pymatgen.core.sites import Site
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\core\__init__.py:20: in <module>
    from .lattice import Lattice  # noqa
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\core\lattice.py:22: in <module>
    from pymatgen.util.coord import pbc_shortest_vectors
..\..\miniconda3\envs\atomate2\lib\site-packages\pymatgen\util\coord.py:17: in <module>
    from . import coord_cython as cuc
pymatgen/util/coord_cython.pyx:1: in init pymatgen.util.coord_cython
    ???
E   ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Feature Suggestion: cclib-based taskdocs

I have a pitch I wanted to share here.

Looking towards the future when we want to add molecular DFT codes to Atomate2, I wanted to share the code cclib here. It can parse the outputs of virtually all the popular molecular DFT codes and returns dozens of attributes with consistent naming across codes. Here are the tabulated outputs.

I think that this could be a nice, consistent way of constructing task docs for molecular DFT. Of course, for commonly used codes like Q-Chem (which have detailed pymatgen parsers already), we don't have to use cclib. And even if we do use cclib, we can always append to the returned dictionary with additional properties. But this might lower the burden for incorporating new codes into Atomate2 and provide some continuity between packages.

The con, of course, is that it would add a dependency to atomate2 and it could be argued that it'd just be better to rely on making new "in-house" pymatgen.io parsers for new codes rather than relying on an external dependency. However, cclib has been around for a long time and is continually updated. I would be happy to pitch a specific PR if there is interest. I have been using cclib with Jobflow for pretty much this purpose.

BUG: VaspObject name case sensitive

atomate2/src/atomate2/vasp/schemas/calculation.py

Line 811 in 02c7705

if file_type.name not in store_volumetric_data:

Something needs to change here about the case sensitivity:
Currently, the name attribute returns "LOCPOT", but the comments say we should use lowercase, e.g. {"store_volumetric_data": ["locpot"]}
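
One option is to normalise both sides before comparing. The enum below is a stand-in with only two members and is not the real `VaspObject`; `should_store` is a hypothetical helper.

```python
from enum import Enum
from typing import Iterable


class StubVaspObject(Enum):
    """Stand-in for VaspObject (only two members, for illustration)."""

    LOCPOT = "locpot"
    CHGCAR = "chgcar"


def should_store(file_type: StubVaspObject, requested: Iterable[str]) -> bool:
    """Case-insensitive membership test on the enum member name."""
    wanted = {name.lower() for name in requested}
    return file_type.name.lower() in wanted


print(should_store(StubVaspObject.LOCPOT, ["locpot"]))  # True
print(should_store(StubVaspObject.LOCPOT, ["LOCPOT"]))  # True
print(should_store(StubVaspObject.CHGCAR, ["locpot"]))  # False
```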

BUG: Pre-commit results depend on dev environment's Python version

Describe the bug
Without pinning a specific Python version using default_language_version in the pre-commit config, pre-commit will install hooks with the first available Python executable. This means that Python 3.10-specific code can get through the linting phase only to fail the 3.8/3.9 tests. This also seems to be the case in the CI, which installs pre-commit hooks from a Python 3.10 environment.

To Reproduce

1. Create a Python 3.10 environment with your method of choice.
2. Clone and install atomate2: pip install .[dev]
3. Introduce Python 3.10-specific code (e.g., replace a typing.Tuple with tuple[str]).
4. Run pre-commit install; pre-commit run --all-files. It will pass without error.

Expected behavior
Linters should use the minimum supported Python version (currently 3.8).

Screenshots

diff --git a/src/atomate2/settings.py b/src/atomate2/settings.py
index e87a1c0..4d2e0c5 100644
--- a/src/atomate2/settings.py
+++ b/src/atomate2/settings.py
@@ -7,6 +7,9 @@ from pydantic import BaseSettings, Field, root_validator

 _DEFAULT_CONFIG_FILE_PATH = "~/.atomate2.yaml"

+test: tuple[str, str] = ("a", "b")
+
+
 __all__ = ["Atomate2Settings"]

$ git commit -a -m "Attempt to introduce py10-only feature"
check yaml...........................................(no files to check)Skipped
fix python encoding pragma...............................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
autoflake................................................................Passed
black....................................................................Passed
blacken-docs.............................................................Passed
isort....................................................................Passed
flake8...................................................................Passed
type annotations not comments............................................Passed
rst ``code`` is two backticks........................(no files to check)Skipped
rst directives end with two colons...................(no files to check)Skipped
rst ``inline code`` next to normal text..............(no files to check)Skipped
mypy.....................................................................Passed
codespell................................................................Passed
pyupgrade................................................................Passed
[ml-evs/update_flake8_precommit b5e62c4] Attempt to introduce py10-only feature
 1 file changed, 3 insertions(+)

The simple fix is to use:

default_language_version:
   python: python3.8

at the top of the pre-commit config.

Feature suggestion: harmonic phonon workflow?

I have worked on a phonon workflow including phonopy and the finite displacement method. I would like to make it available in atomate2.

Would there be interest?

CONTCAR copied from previous directory gets overwritten

Currently, when a CONTCAR is copied as POSCAR from the previous directory, it gets overwritten by the new POSCAR generated from write_vasp_input_set. Even if the new POSCAR is generated from the last structure of the last run, it loses the predictor-corrector coordinates stored in the CONTCAR, which matters when continuing an MD job. When the overwrite option is off in https://github.com/materialsproject/atomate2/blob/9dcf35586eab52c4be842b1010aa268b98c8493d/src/atomate2/vasp/sets/base.py#L71, it will raise a FileExistsError. Is there a way to skip writing new input files for continuation jobs?

I did a temporary fix in my own branch by just changing the error raised by the existing POSCAR to a warning to allow the job to proceed. I'm not sure if that will break other things. An alternative might be moving the file existence check before copying files in https://github.com/materialsproject/atomate2/blob/9dcf35586eab52c4be842b1010aa268b98c8493d/src/atomate2/vasp/jobs/base.py#L130 to allow the user to turn off the overwrite option when writing inputs?
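
To make the request concrete, here is a minimal sketch of a "skip existing files" option. It is not atomate2 code: write_inputs and its skip_existing flag are hypothetical names used only for illustration.

```python
from pathlib import Path


def write_inputs(files, directory, overwrite=False, skip_existing=False):
    """Write files (mapping of name -> text) and return the names written.

    Hypothetical helper: with skip_existing=True, files already present
    (e.g. a POSCAR copied from the previous CONTCAR) are left untouched.
    """
    directory = Path(directory)
    written = []
    for name, text in files.items():
        target = directory / name
        if target.exists():
            if skip_existing:
                continue  # keep the copied file as-is
            if not overwrite:
                raise FileExistsError(f"{target} already exists")
        target.write_text(text)
        written.append(name)
    return written
```

Whether skipping or warning is the safer default for continuation jobs is a design question this sketch does not settle.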

Migrate linting to Ruff

Just a suggestion but one that has worked out very well for pymatgen over in materialsproject/pymatgen#2847: we could migrate linting of atomate2 to ruff.

Ruff combines the functionality of most other Python linters into 1 tool written in Rust which makes it ~100x faster than linters written in Python. It brought the pymatgen linting CI script's run time from 9 min down to 3 min (almost entirely spent installing deps and running mypy now, ruff itself only takes 1 sec).

Happy to take this on if interested.

Markdown for docs

Wouldn't normally raise an issue about this but @utf said

It’s still early days without much adoption, so plenty of scope for changing [...]

May I recommend using markdown instead of rST for docs? While I admit I'm biased, I believe there are good reasons to be. They include all the ones listed here plus others:

syntax is too confusing (by far the most important point)
setext instead of ATX headers (very annoying to type)
no nesting of inline markup
syntax is more verbose than markdown
development of rST seems all but frozen (maybe some would argue that's a good thing in a markup language but I disagree esp. with fundamental features like inline markup nesting still missing)

Sphinx supports markdown with little extra setup so I think the cost of switching wouldn't be that high. MyST which Sphinx relies on for md parsing supports directives and extensions so I don't think there'd be any downsides.

Clarify calcs_reversed in atomate2

Previously in the original atomate, calcs_reversed key is explicitly named for putting the final step as the first entry (https://github.com/hackingmaterials/atomate/blob/4577c94c1850fcebd8624d49ba1ab27f8e3e6d43/atomate/vasp/drones.py#L300)

In atomate2, the calculations are not stored in reversed order but still use the calcs_reversed key name, which may cause confusion.

atomate2/src/atomate2/vasp/schemas/task.py

Line 382 in fe657b4

completed_at=calcs_reversed[-1].completed_at,

To resolve this issue, we could:
(1) store the calculation list in reversed order, following the convention that the final step is the first entry
OR
(2) keep the calcs in normal sequence (put initial step as the first entry), but rename it with a different key.
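
The difference between the two options can be illustrated with a plain list. The step names and the calcs_chronological key are made up for this example.

```python
# Calculations in the order they were run (hypothetical step names).
calcs = ["relax 1", "relax 2", "static"]

# Option 1: store newest-first, so the name "calcs_reversed" is accurate
# and the final step is always at index 0.
calcs_reversed = list(reversed(calcs))
final_step = calcs_reversed[0]

# Option 2: keep chronological order under a different (hypothetical) key,
# so the final step is at index -1.
calcs_chronological = list(calcs)
final_step_alt = calcs_chronological[-1]
```

Either way, code that reads the final step has to agree with the convention the key name implies.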

Please let me know which option you prefer and I can submit a PR later.

BUG: Parsing of f orbitals fails for DOS properties

See https://matsci.org/t/atomate2-parsing-of-f-orbitals-in-dos/42350 by @jmmshn. I'm out of town right now but can work on a fix next week if nobody else does.

Standardized approach for job checkpoint/restart/continuation

Hi @mkhorton @davidwaroquiers @mjwen. Agreed that we need a proper discussion about restarting. This is something @jmmshn has opinions about too.

I have some ideas for how we could do it but each with their own tradeoffs. I'd be interested to hear if there were any good ideas from your workshop @mkhorton.

I will create a separate issue to discuss this further.

Originally posted by @utf in #134 (comment)

Documentation of Phonon workflow

Phonon workflow is currently not documented. @QuantumChemist will start working on this.

FEATURE: define a convention about abinit jobs that have restarted and failed the max number of restarts

An abinit job that has restarted the maximum number of times set by the user/settings should be defined as "final", even if it has failed. How should we deal with that? Can we restart from it, and if so, how? Would a restart be a continuation of the same Flow/Job or a new Flow/Job?

BUG: validation error for RunStatistics

Describe the bug
Given the attached OUTCAR, I get

~/software/miniconda/envs/cms2/lib/python3.8/site-packages/atomate2/vasp/schemas/calculation.py in from_outcar(cls, outcar)
    208             "cores": "cores",
    209         }
--> 210         return cls(**{v: outcar.run_stats.get(k) or 0 for k, v in mapping.items()})
    211
    212

~/software/miniconda/envs/cms2/lib/python3.8/site-packages/pydantic/main.cpython-38-x86_64-linux-gnu.so in pydantic.main.BaseModel.__init__()

ValidationError: 1 validation error for RunStatistics
cores
  value is not a valid integer (type=type_error.integer)

The Atomate2 job then fails as a result. This is on the current version of Atomate2.

To Reproduce

from pymatgen.io.vasp import Outcar
from atomate2.vasp.schemas.calculation import RunStatistics
out = Outcar('OUTCAR.txt')
RunStatistics().from_outcar(out)

OUTCAR.txt

Solution

This is related to VASP 6.2.0+. First, as seen in the attached OUTCAR, sometimes VASP reports N/A for average memory used, and so Pymatgen converts this to None. The workaround solution in Atomate2 is to accept either float or None (if that's not being done so already). The second issue regarding the number of cores can be fixed upstream in Pymatgen. I opened a PR here: materialsproject/pymatgen#2308
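
A rough sketch of the workaround described above, written without pydantic so it stands alone. RunStats is a hypothetical stand-in for the real schema, and the run_stats key names are assumptions about what the parser returns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunStats:
    """Hypothetical stand-in for the RunStatistics schema."""

    average_memory: Optional[float] = None
    cores: int = 0


def _to_int(value, default=0):
    """Coerce to int, falling back to default for None or text like 'N/A'."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def _to_optional_float(value):
    """Coerce to float, returning None when the value is missing or 'N/A'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def stats_from_run_stats(run_stats):
    """Build RunStats from a run_stats-like dict (key names are assumed)."""
    return RunStats(
        average_memory=_to_optional_float(run_stats.get("Average memory used (kb)")),
        cores=_to_int(run_stats.get("cores")),
    )
```

The point is only that missing or non-numeric entries are mapped to None or a default instead of failing validation.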

Feature: Support for alternate KPath types

In the code block below, it seems that HighSymmKpath is called without any supplemental kwargs. It would be ideal if the user could modify the HighSymmKpath args on-the-fly, e.g. to get path_type = 'latimer_munro' instead of path_type = 'setyawan_curtarolo'. I assume this is not currently possible, right? If not, I've left this issue open as a possible feature to add in the future.

atomate2/src/atomate2/vasp/sets/base.py

Lines 664 to 677 in 2eec3b6

    
if kconfig.get("line_density"):
    # handle line density generation
    kpath = HighSymmKpath(structure)
    frac_k_points, k_points_labels = kpath.get_kpoints(
        line_density=kconfig["line_density"], coords_are_cartesian=False
    )
    base_kpoints = Kpoints(
        comment="Non SCF run along symmetry lines",
        style=Kpoints.supported_modes.Reciprocal,
        num_kpts=len(frac_k_points),
        kpts=frac_k_points,
        labels=k_points_labels,
        kpts_weights=[1] * len(frac_k_points),
    )

Atomate2 add-on packages

As discussed briefly in #173, we would like to create add-on packages for atomate2 to support additional codes that maybe don't fit in within existing workflows. LAMMPS seems to be one such case and as such we have moved development out to a namespace package atomate2-lammps.

Is this a contribution route that should be officially supported? I am happy to extract the generic stuff from our repository to make an add-on template similar to pymatgen-addon-template, and make a PR here with some explanation? I guess the alternative would be developing the separate add-on and then eventually folding it back into atomate2 when it is more mature.

Let me know what you think!

BUG: Charge structure not parsed

Parsing the following directory into a TaskDocument should give a structure with +1 charge (NELECT = 349 vs 350 for the neutral cell).
The current version of the Vasprun object in pymatgen gives Vasprun.structure.charge == -1 due to a bug fix in
materialsproject/pymatgen#2577

Either way, with or without the fix to the sign, the current parsing of the directory using TaskDocument gives zero charge:

tdoc = TaskDocument.from_directory("./")

Gives:

tdoc.structure.charge # --> 0

Files to reproduce linked below:

charged_vasprun_ex.zip
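
For reference, the expected sign follows directly from the electron counts quoted above. The helper below is a hypothetical illustration, not a pymatgen or atomate2 function.

```python
def charge_from_nelect(neutral_electrons, nelect):
    """Net charge = electrons in the neutral cell minus electrons present.

    Removing one electron (fewer electrons than neutral) gives +1.
    """
    return neutral_electrons - nelect


# Numbers from the report: 350 electrons when neutral, NELECT = 349.
charge = charge_from_nelect(neutral_electrons=350, nelect=349)
```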

Documentation Suggestions

A Few Documentation Suggestions

I wanted to share a few comments below regarding the documentation. It's excellent! It strikes a great balance between being concise/clear and having enough information to get started easily. I did have a few suggestions after reading through it, which I've included below. Feel free to address as little or as much of this as you wish. They're merely my opinions.

I know that one of the main philosophies of Atomate is that it abstracts away a lot of the DFT settings from the user, which is great. At the same time, I think it's important for the more "expert" users to easily see what's going on underneath-the-hood. For instance, while I gathered the StaticMaker ran a static calculation, it wasn't immediately clear to me what other settings were changed. Should I be updating ISMEAR, or has Atomate2 already done that for me? Does it output AECCAR files by default, or do I need to take care of that? Of course, going to the code for the StaticSetGenerator clarified that for me, but I'm wondering if it might be helpful to include a hyperlink (in the documentation for the various Makers) to the various updates Atomate2 automatically applies.
The "modifying input parameters" section of the documentation is fantastic. I think a few things could be added though. Namely, it would probably be worth mentioning that it's not just the INCAR flags that can be changed. The k-points and POTCARs can be changed too. While all the input arguments to VaspInputSetGenerator() don't need to be repeated in the documentation, a brief mention that other settings can be changed may be helpful for the reader.
- EDIT: I've drafted something for this.
In the VaspInputSetGenerator documentation, I initially was very unsure what a config_dict was. After about 15 minutes of realizing "there must be a better way" to handle entirely new input sets than defining massive user_incar_settings, user_kpoints_settings, etc. arguments, I then learned what it was after digging through the code itself. Perhaps a link to the BaseVaspSet.yaml file (as an example) might be helpful. Or something of that nature.
- EDIT: I've drafted something for this.
When describing the user_incar_settings option in the "modifying input parameters" section of the tutorial, perhaps it might be beneficial to mention that setting None will remove the user INCAR setting. I was fortunate to stumble upon it in the documentation for VaspInputSetGenerator, but this might be something to add to the main documentation as a note.
- Edit: This is perhaps too niche in hindsight.
"The different input sets used in atomate2 mean total energies cannot be compared against energies taken from the Materials Project" is found in the documentation, which I think is important. Perhaps it might be good to add "by default" to the end or something. Of course, Atomate2 can be made compatible with MP. It's mostly a matter of using different input sets, which you have made really easy to do with the config_dict arguments. Just define your own .yaml file and away you go, for the most part.
- EDIT: I've drafted something for this.

BUG: Atomate2 version number

The Atomate2 version property doesn't seem to work at the moment. This is also apparent when viewing the top of the documentation page where no version number is listed.

import atomate2
atomate2.__version__

Returns ''
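
One common way to look up an installed package's version is importlib.metadata from the standard library. The sketch below is an assumption about a possible approach, not a description of how atomate2 currently does it; get_version is an invented helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(distribution, fallback=""):
    """Return the installed version of a distribution, or fallback if the
    distribution metadata cannot be found (e.g. not installed)."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return fallback
```

An empty string would then only appear when the package metadata is genuinely missing, which can help narrow down where the problem lies.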

BUG: AttributeError: 'CompleteDos' object has no attribute 'get_band_filling' when running DielectricMaker()

Getting a strange bug at the very end of a DFPT calc which sadly means none of the results make it into the DB. Not sure where to start debugging, if someone else has any ideas, I'm all ears.

Stack trace:

INFO:jobflow.managers.local:dielectric failed with exception:
Traceback (most recent call last):
  File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/managers/local.py", line 97, in _run_job
    response = job.run(store=store)
  File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/core/job.py", line 524, in run
    response = function(*self.function_args, **self.function_kwargs)
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/jobs/base.py", line 150, in make
    task_doc = TaskDocument.from_directory(Path.cwd(), **self.task_document_kwargs)
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/task.py", line 323, in from_directory
    calc_doc, vasp_objects = Calculation.from_vasp_files(
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 646, in from_vasp_files
    output_doc = CalculationOutput.from_vasp_outputs(
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 458, in from_vasp_outputs
    _get_band_props(vasprun.complete_dos, structure)
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 814, in _get_band_props
    "filling": complete_dos.get_band_filling(band=orb_type, elements=[el]),
AttributeError: 'CompleteDos' object has no attribute 'get_band_filling'

The error originates on this line:

atomate2/src/atomate2/vasp/schemas/calculation.py

Line 814 in b4ccbb1

"filling": complete_dos.get_band_filling(band=orb_type, elements=[el]),

full job logs

 2022-04-22 16:03:35,889 INFO Started executing jobs locally
2022-04-22 16:03:36,026 INFO Starting job - dielectric (336db245-27b5-4316-9a64-203065a61b9a)
WARNING in EDDRMM: call to ZHEGV failed
ERROR:custodian.custodian:VaspErrorHandler
No matching processes belonging to you were found
2022-04-23 17:26:50,175 INFO dielectric failed with exception:

INFO:jobflow.managers.local:dielectric failed with exception:
Traceback (most recent call last):
  File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/managers/local.py", line 97, in _run_job
    response = job.run(store=store)
  File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/core/job.py", line 524, in run
    response = function(*self.function_args, **self.function_kwargs)
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/jobs/base.py", line 150, in make
    task_doc = TaskDocument.from_directory(Path.cwd(), **self.task_document_kwargs)
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/task.py", line 323, in from_directory
    calc_doc, vasp_objects = Calculation.from_vasp_files(
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 646, in from_vasp_files
    output_doc = CalculationOutput.from_vasp_outputs(
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 458, in from_vasp_outputs
    _get_band_props(vasprun.complete_dos, structure)
  File "/Users/janosh/dev/atomate2/src/atomate2/vasp/schemas/calculation.py", line 814, in _get_band_props
    "filling": complete_dos.get_band_filling(band=orb_type, elements=[el]),
AttributeError: 'CompleteDos' object has no attribute 'get_band_filling'

2022-04-23 17:26:50,176 INFO Finished executing jobs locally
INFO:jobflow.managers.local:Finished executing jobs locally
Traceback (most recent call last):
  File "/Users/janosh/dev/vasp/atomate2/perovskite-diel.py", line 41, in <module>
    run_locally(diel_job, create_folders=True, ensure_success=True)
  File "/Users/janosh/.venv/py310/lib/python3.10/site-packages/jobflow/managers/local.py", line 165, in run_locally
    raise RuntimeError("Flow did not finish running successfully")
RuntimeError: Flow did not finish running successfully

Job script

import os
import warnings
from time import perf_counter

from atomate2.vasp.jobs.core import DielectricMaker
from atomate2.vasp.powerups import (
    update_user_incar_settings,
    update_user_kpoints_settings,
)
from jobflow import run_locally
from pymatgen.ext.matproj import MPRester
from pymatgen.io.vasp import Kpoints


os.environ["OMP_NUM_THREADS"] = "1"

warnings.filterwarnings("ignore")  # ignore pymatgen warnings clogging up the logs


SrHfO3 = MPRester().get_structure_by_material_id("mp-13108")


# make a dielectric job for the structure
diel_job = DielectricMaker().make(SrHfO3)


diel_job = update_user_incar_settings(
    diel_job, {"ENCUT": 700, "EDIFF": 1e-7, "NELM": 40}
)
auto_kpts = Kpoints.automatic_density(SrHfO3, 3000)

diel_job = update_user_kpoints_settings(diel_job, auto_kpts)

start = perf_counter()

run_locally(diel_job, create_folders=True)

elapsed = perf_counter() - start
print(f"SrHfO3 DFPT calc took {elapsed:.1f} sec")

LAMMPS Integration

Are there currently any plans on integrating LAMMPS?
I think there would also be many users. As there are atomate workflows, I assume there might be plans.

FEATURE: Deal with restarts when the abinit process is killed by python

In the abinit jobs, the abinit process is killed before the end of the wall time by the python subprocess module. We should try to deal with those cases. In some cases, an automatic restart may be possible. In some other cases, the user should do something (e.g. increase timelimit, or change number of processors or else ...).

Extremely tight phonon k-point mesh

Hi @JaGeo.

I just tried running the phonon workflow and noticed that the k-point density used is extremely high. It's set as grid_density=7000 which equals a reciprocal density of around 600 for silicon. Just for reference, the default reciprocal density in Atomate1 is 64 for insulators and 200 for metals. So this works out almost 10x the default k-point density.

For silicon, this means a supercell with lengths 22 Å is being run with a 2x2x2 k-point mesh!

I'd be very surprised if we need to go this high to get converged results. My feeling is that we could use reciprocal_density=100 and still get converged results. From my previous experience, I usually run displaced supercells with the same k-point density as the relaxation and this normally works well. My experience is that it is usually the force and energy convergence that are essential to get good phonons.

What do you think about:

Switching from grid_density to reciprocal_density, this means the number of k-points is independent of the number of atoms and only depends on the cell size.
Dropping the density to reciprocal_density=100

Obviously, we should do some tests to make sure this gives reasonable results still.
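
As a rough cross-check of the numbers above, the conversion can be sketched as follows. This assumes that grid_density means k-points per reciprocal atom and that reciprocal_density means k-points per unit reciprocal-cell volume (including the 2π factor), so that grid_density = reciprocal_density × reciprocal volume × number of atoms. The silicon lattice parameter of 5.43 Å is also an assumption.

```python
import math


def reciprocal_density_from_grid_density(grid_density, cell_volume, n_atoms):
    """Convert a per-atom grid density to a reciprocal density.

    Assumes reciprocal volume = (2*pi)**3 / cell_volume (in Å^-3) and
    grid_density = reciprocal_density * reciprocal_volume * n_atoms.
    """
    recip_volume = (2 * math.pi) ** 3 / cell_volume
    return grid_density / (recip_volume * n_atoms)


# Primitive silicon cell: 2 atoms, volume a**3 / 4 with a = 5.43 Å (assumed).
si_volume = 5.43**3 / 4
value = reciprocal_density_from_grid_density(7000, si_volume, 2)
```

Under these assumptions the result is consistent with the "around 600" quoted above. Because the atom count drops out of the k-point total, the mesh then depends only on the cell size.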

Potentially increase efficiency of deformation calculations in elastic constant workflow

atomate2/src/atomate2/vasp/flows/elastic.py

Line 113 in de92307

vasp_deformation_calcs = run_elastic_deformations(

When running the elastic constants workflow for metallic systems (HEAs primarily), I find that the calculation for many of the deformations is 1/3 to 2/3 as expensive as the initial tight relaxation. This is especially true for the shear deformations. And with up to 24 of these calculations, that really adds up.

Perhaps it would be efficient to have each of the deformation calculations be run in two stages:

an initial relaxation with a lower kpoints density and/or looser EDIFFG
a second relaxation with the current deformation calculation's kpoints/EDIFFG

Could someone with a deep(er) understanding of elastic constant calculations weigh in on this? For my systems, these deformation calculations make up >80% of the run time of my flows, so there is a ton of potential upside.

BUG: Repeatedly writing `Dos` objects to the `data` store without blob_uuids

During the parsing of the TaskDocument something is causing the code to repeatedly write Dos objects to the additional store with no way of retrieving them.

The example below is a standard static calculation with parse_dos.
The Chgcar and CompleteDos objects are both listed at output.vasp_objects in the TaskDocument but there are 6 other Dos objects stored in the data S3Store.

blob_res = JOB_STORE.additional_stores["data"].query({"job_uuid": "fbd9fe16-c86a-4b75-8e8c-727815fb6ae1"})
doc = JOB_STORE.docs_store.query_one({"uuid": "fbd9fe16-c86a-4b75-8e8c-727815fb6ae1"})
def _check_blob(d, prefix = ""): 
    if isinstance(d, list):
        for i, v in enumerate(d):
            _check_blob(v, f"{prefix}.{str(i)}")
    elif isinstance(d, dict):
        for i, v in d.items():
            if i == "blob_uuid":
                print(prefix)
            _check_blob(v, f"{prefix}.{str(i)}")

for d in blob_res:
    job_uuid = d["job_uuid"]
    print(f"job_uuid: {d['job_uuid']}")
    print(f"blob_uuid: {d['blob_uuid']}")
    print(f"class: {d['@class']}")
print("Follow paths have `blob_uuid` keys")
_check_blob(doc)

job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: 4c2d511c-54fe-40fc-9c0c-fc1021ccf38f
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: 865ed9da-2ff6-448e-9c48-c2bb2b3a4fbd
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: f1d3c57e-54eb-4174-aa65-ba92caa43535
class: Chgcar
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: 7b26630a-b93b-41f6-81eb-0666d62042d4
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: d3d5d34a-635e-42a0-9e9a-daaa2d9e8dc2
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: db388489-7d2a-4008-b9e8-ebe853195112
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: 66467040-f58f-473d-b34a-c7a57983ef8a
class: Dos
job_uuid: fbd9fe16-c86a-4b75-8e8c-727815fb6ae1
blob_uuid: bb52fd52-f2f5-4a1d-bb57-78999e22f0e0
class: CompleteDos
Follow paths have `blob_uuid` keys
.output.vasp_objects.chgcar
.output.vasp_objects.dos

Projections in VASP

@utf , I saw this updated wiki page of VASP recently: https://www.vasp.at/wiki/index.php/LORBIT

It might be interesting to keep this in mind for future changes. LORBIT=14 seems to be recommended now and LORBIT=12 needs to be run without symmetry for VASP versions older than VASP 6.

Use `pytest` `tmp_path` fixture

Boy am I glad to see you're using pytest. 😄

I noticed you wrote your own tmp_dir fixture.

atomate2/tests/conftest.py

Lines 46 to 58 in 614ace0

    
@pytest.fixture
def tmp_dir():
    """Same as clean_dir but is fresh for every test"""
    import os
    import shutil
    import tempfile

    old_cwd = os.getcwd()
    newpath = tempfile.mkdtemp()
    os.chdir(newpath)
    yield
    os.chdir(old_cwd)
    shutil.rmtree(newpath)

pytest ships with a bunch of builtin fixtures including tmp_path which gives a pathlib.Path (and tmpdir which gives a py.path.local though that's scheduled for deprecation).

Happy to submit a PR to use the builtin (unless I'm missing some special requirement).

Support for Lobster

As atomate2 in combination with jobflow requires little configuration, we would like to include Lobster support in atomate2 as well (similar to what we have done with atomate, https://chemistry-europe.onlinelibrary.wiley.com/doi/full/10.1002/cplu.202200123 )

If this is welcome, we (@naik-aakash and I) would start working on it.

Suggestion: ALGO and LASPH in input sets

In StaticSetGenerator and TightRelaxSetGenerator, the ALGO flag is updated to "Normal". Unless there is a strong reason to keep it at "Normal", my personal suggestion would be to not specify ALGO as an update here. There are two reasons behind this suggestion:

Typically, static calculations are carried out in a flow following a relaxation job. If SCF convergence was difficult in the relaxation, Custodian may have updated ALGO one or more times (compared to what is in the config_dict) to land on something that works. Having the static calculation revert it back to "Normal" after this seems like it might be counterproductive.
If someone is using a meta-GGA, they almost always want to use ALGO = All. This may be in the config_dict or passed as a user_incar_setting earlier in the flow. Overwriting this to Normal may be problematic. Of course, using user_incar_settings = {"ALGO": "All"} in the StaticSetGenerator() would resolve this, but it's only obvious this is necessary after running the flow and seeing that ALGO was somewhat unexpectedly changed.
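
To illustrate the behaviour suggested here, the sketch below applies generator updates while leaving an already-set ALGO alone. apply_incar_updates and its preserve argument are hypothetical names, and plain dicts stand in for the real INCAR handling.

```python
def apply_incar_updates(incar, updates, preserve=("ALGO",)):
    """Return a copy of incar with updates applied.

    Keys listed in preserve keep their existing value if already set,
    so an ALGO chosen earlier in the flow is not reverted.
    """
    merged = dict(incar)
    for key, value in updates.items():
        if key in preserve and key in merged:
            continue  # keep the earlier choice
        merged[key] = value
    return merged
```

For example, starting from {"ALGO": "All", "NSW": 99} and applying {"ALGO": "Normal", "NSW": 0} keeps ALGO as "All" while still setting NSW to 0.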

While I'm on this topic, I might suggest adding "LASPH: True" to all HSE generators in vasp.sets.core. This is already included in the default config_dict, but there is no guarantee that a user will remember to do this in a new config_dict even though VASP suggests using LASPH = True for hybrids. This is related to a warning I introduced in Pymatgen here.
