phasesresearchlab / espei
Fitting thermodynamic models with pycalphad - https://doi.org/10.1557/mrc.2019.59
Home Page: http://espei.org
License: MIT License
ESPEI does not warn you if the datasets directory specified in the input YAML file does not exist. This may lead to unexpected results.
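A minimal sketch of the kind of check that could be run before any fitting starts. The function name `check_datasets_dir` is hypothetical and is not part of ESPEI; it only illustrates failing early with a clear message.

```python
from pathlib import Path


def check_datasets_dir(path):
    """Return the datasets directory as a Path, or raise if it is missing.

    Hypothetical helper: the name and behavior are assumptions for illustration.
    """
    datasets_dir = Path(path)
    if not datasets_dir.is_dir():
        raise FileNotFoundError(
            f"Datasets directory does not exist: {datasets_dir}"
        )
    return datasets_dir
```

Whether a missing directory should raise or only log a warning is a design choice left to the implementer.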
This should allow for little to no logic required in run_espei.main and provides a better interface for users to interact with ESPEI's fitting procedures.
In espei.plot.dataplot, let users pass eq=None and their datasets in order to plot just the data.
The new Database will be added to the current Database and the initial contributions would be subtracted out. The ideal use case for this is
Eventually, when we implement unary fitting, unaries can be passed in this way to be fixed.
Right now, ESPEI can perform parameter selection for HM_FORM/HM_MIX, SM_FORM/SM_MIX, or CPM_FORM/CPM_MIX data, but it would be useful to be able to fit HM-HM(SER), SM and CPM directly as well.
In the past, I would have suggested the following workaround procedure to fit absolute energies (no changes to ESPEI code required) that are compatible with SGTE:
GHSER__
symbol)HM-HM(SER)
, SM
, CPM
data as the _FORM
in ESPEI datasets and fit them. This will give energies for each phase referenced to GHSER__
functions which are zero, so the absolute energies and derivatives are fit.
I'm convinced now that fitting absolute-valued versions of these data is useful enough that ESPEI should allow this data to be fit if given by a user. It shouldn't be too much work for a student who wanted to pick up this project.
I'm not completely sure, but it might "just work" to change espei.parameter_selection.utils.shift_reference_state to not raise on HM, SM, or CPM data. One would have to decide whether it is a user's responsibility to shift to HM-HM(SER) or whether to allow ESPEI to do it automatically (which should be possible now that Database().refstates data is supported by pycalphad and used by ESPEI).
If known tielines are used as datasets from a database, calculating ZPF error for that database and equilibria should give 0 error.
I don't have a test case or any reason to suspect this is broken, but it would be a good sanity check.
Convert PARROT POP files to JSON format
ESPEI can be deterministic and reproducible, but restarting resets the random state.
That means running for 1000 steps in one run and two runs of 500 steps each (1000 total) will give different results, despite each being deterministic.
A solution is to be able to dump and load the random state on restart.
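A small sketch of the idea, using a NumPy `RandomState` as a stand-in for the sampler's random number generator. The `run_steps` function is a hypothetical placeholder for MCMC sampling; how the state would be read from and written to the actual sampler is not shown here.

```python
import pickle

import numpy as np


def run_steps(rng, n_steps):
    """Stand-in for a sampler: draw one random number per step."""
    return [rng.random_sample() for _ in range(n_steps)]


# One uninterrupted run of 1000 steps
rng = np.random.RandomState(1234)
single_run = run_steps(rng, 1000)

# Two runs of 500 steps each, dumping the random state in between
rng = np.random.RandomState(1234)
first_half = run_steps(rng, 500)
saved_state = pickle.dumps(rng.get_state())  # would be written to disk

# On restart, create a fresh generator and load the saved state into it
restarted_rng = np.random.RandomState()
restarted_rng.set_state(pickle.loads(saved_state))
second_half = run_steps(restarted_rng, 500)

# The restarted run reproduces the uninterrupted run exactly
restart_matches = (first_half + second_half) == single_run
```

The state could be stored next to the chain and log-probability arrays so that a restart picks up all three together.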
pycalphad's eqplot filters the active phases and sorts them alphabetically to get the phase names from pycalphad.plot.utils.phase_legend. If the phases are not sorted and not filtered to the active ones, multiplot will not produce the same phase_legend and colors as eqplot.
We would like to be able to optimize databases that may use custom models and therefore we should support building phases with the Model class (which has the other benefit of being more performant than CompiledModel) and subclasses.
See the diff for when CompiledModel was added and Model removed at commit 2bb8c49
It looks like we previously passed around callables for the objective as well as gradient and hessian.
I see that ESPEI uses AICc to prevent over-parameterization.
But an F-test was mentioned in the doctoral thesis "SOFTWARE ARCHITECTURE FOR CALPHAD MODELING OF PHASE STABILITY AND TRANSFORMATIONS IN ALLOY ADDITIVE MANUFACTURING PROCESSES".
In the thesis, AICc is used to fit single-phase parameters and an F-test is used to fit multi-phase parameters. It looks like ESPEI only uses AICc and does not use the F-test.
Questions:
(1) Is AICc suitable for multi-phase fitting?
(2) Why not take the F-test into consideration?
Note that this is a sketch of a procedure and contains code that has not yet been tested.
Key tools:
pympler - find which objects are using the most memory
objgraph - generate flow graphs of backreferences to any object
pyrasite - attach a Python console to a running process
This will require some modification to the base code. Use the SummaryTracker from pympler. Add this code somewhere before MCMC sampling starts:
from pympler import tracker
tr = tracker.SummaryTracker()
# Calibrate it by calling this a few times until it returns no changed objects
tr.print_diff()
tr.print_diff()
tr.print_diff()
# Create all the objects, put all the function's setup code
# All the sampling code is here
Start the sampling job with just a single core. Then use pyrasite-shell to connect to the running Python process by PID: http://pyrasite.readthedocs.io/en/latest/Shell.html
In the remote shell:
import objgraph ; tr = objgraph.by_type('SummaryTracker')[0] ; tr.print_diff()
This will give output like:
types | # objects | total size
========================================== | =========== | ============
<class 'list | 18730 | 1.71 MB
<class 'str | 18961 | 1.35 MB
<class 'sip.methoddescriptor | 8287 | 453.20 KB
<class 'dict | 513 | 304.12 KB
<class 'int | 5375 | 152.21 KB
<class 'sympy.core.assumptions.StdFactKB | 144 | 90.94 KB
<class 'set | 43 | 72.91 KB
<class '_lrucache.hashseq | 509 | 57.82 KB
<class 'tuple | 473 | 29.11 KB
<class 'sympy.core.numbers.Float | 338 | 26.41 KB
<class '_lrucache.clist | 509 | 23.86 KB
<class 'tinydb.database.Element | 32 | 15.50 KB
<class 'sip.variabledescriptor | 184 | 12.94 KB
<class 'PyQt4.QtCore.QLocale.Country | 248 | 12.59 KB
<class 'sympy.core.mul.Mul | 174 | 12.23 KB
To start spot checking object backreferences, use
objgraph.show_backrefs(objgraph.by_type('list')[0], max_depth=10)
A graph will be rendered as a PNG and written to a temporary directory. The path to the graph will be output to the console.
The following functions should be fully documented with a description, arguments/keyword arguments, returns, and examples (if applicable):
espei.paramselect._fit_parameters could be improved to make it clear that this selects the model from data with the AIC
The next set of functions are short and just need the minimal description, inputs and outputs.
espei.paramselect._build_feature_matrix
espei.paramselect._generate_symmetric_group
espei.core_utils.get_data
espei.core_utils.get_samples
espei.core_utils.symmetry_filter
espei.paramselect.estimate_hyperplane, espei.paramselect.tieline_error, espei.paramselect.multi_phase_fit
Web documentation
Currently our walkers (concurrent chains) are initialized by sampling a Gaussian distribution that has a standard deviation of 10% of the parameter.
My understanding of the ensemble sampler implemented in emcee is that the distribution that new parameters are selected from depends on the other active walkers. This means that the initial rate of convergence depends strongly on the distribution used to generate these walkers.
Initializing chains from larger Gaussian distributions means that we are less certain about our parameters initially and we will be searching a larger space in the initial iterations. Having too large a distribution initially might mean slow parameter convergence because the chains have to scale down to the relevant sampling space. Having too small of an initial distribution can cause the reverse, in that we waste a lot of time scaling up our sampling space.
We should benchmark different starting points for a given number of MCMC steps and compare the rate of convergence of parameter mixing with a single run that has 'fully' converged.
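For benchmarking, the initialization described above can be sketched as follows. The function name `initialize_walkers` and its arguments are hypothetical and are not ESPEI's API; `std_fraction=0.1` corresponds to the current 10% standard deviation.

```python
import numpy as np


def initialize_walkers(initial_parameters, n_walkers, std_fraction=0.1, seed=None):
    """Draw walker starting points from Gaussians centered on each parameter.

    The standard deviation of each Gaussian is std_fraction times the
    magnitude of the corresponding parameter. Hypothetical helper.
    """
    rng = np.random.RandomState(seed)
    params = np.asarray(initial_parameters, dtype=float)
    scale = std_fraction * np.abs(params)
    # Result has shape (n_walkers, n_parameters)
    return rng.normal(loc=params, scale=scale, size=(n_walkers, params.size))


# Example: 8 walkers around two parameters, with 10% and 50% spreads
narrow = initialize_walkers([-10000.0, 5.0], n_walkers=8, std_fraction=0.1, seed=0)
wide = initialize_walkers([-10000.0, 5.0], n_walkers=8, std_fraction=0.5, seed=0)
```

Running the same number of MCMC steps from several `std_fraction` values would give the comparison proposed above.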
ZKL suggested some kind of Mendeley integration. I think it would also be reasonable and fit into the spirit of ESPEI to use bibtex files (possibly managed and imported/exported from Mendeley). There are several benefits to using bibtex:
The following should be tested in order to have unit tests covering the core functionality of ESPEI:
espei.paramselect.lnprob
espei.core_utils.get_data retrieves the right data (do in migration from TinyDB 2 to TinyDB 3)
espei.paramselect.fit_formation_energy should work for endmembers and interactions (mixing). Test against two one-data-point cases of formation energy for each. Then one with temperature for the endmember.
espei.paramselect._fit_parameters: when writing this test, make sure to verify that the chosen test case really is the lowest AIC among all the models and that all the possible models (parameter combinations) were chosen.
espei.core_utils.endmembers_from_interaction are properly computed for several cases of mixing sublattices
espei.core_utils.get_samples are properly computed for several cases of mixing sublattices
espei.core_utils.build_sitefractions properly constructs site fractions from sublattice configurations and occupancies
espei.paramselect._generate_symmetric_group handles cases with and without symmetry correctly
This is a feature request which is probably out of scope for #28.
Can every place where the run settings file accepts a filename or path accept a general URI (e.g., https, ssh, git)? I think urlparse/urllib in the stdlib makes this a reasonable request.
See: https://stackoverflow.com/questions/22238090/validating-urls-in-python
One complicating factor is that all the calls to open() and np.load() would need to get filtered through urllib, but I think this would be a very nice feature long term: download datasets pinned to a Git repo, upload output TDBs to an S3 bucket, etc.
Related to this, being able to specify the output key multiple times would be useful once it is possible to write results out to multiple remote locations.
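As a rough sketch of the first step, the stdlib `urllib.parse.urlparse` can separate plain local paths from URIs with a scheme. The function `split_location` is hypothetical; actually fetching or uploading remote files is not shown.

```python
from urllib.parse import urlparse


def split_location(location):
    """Classify a settings value as a local path or a remote URI.

    Returns a (kind, value) tuple, where kind is "local" or the URI scheme.
    Hypothetical helper for illustration only.
    """
    parsed = urlparse(location)
    # No scheme means a plain path. A one-letter "scheme" is assumed to be
    # a Windows drive letter such as C:\data, so it is also treated as local.
    if parsed.scheme == "" or len(parsed.scheme) == 1:
        return "local", location
    if parsed.scheme == "file":
        return "local", parsed.path
    return parsed.scheme, location
```

Calls to open() and np.load() could then be dispatched on the returned kind, with "local" keeping the current behavior.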
Enable fitting to thermochemical data such as activities.
Should MCMC also consider this and single phase data (e.g. heat capacities)?
I had several issues running the Cu-Mg example from the ESPEI website. I installed ESPEI using the conda command, and took the Cu-Mg data directory from the ESPEI-datasets repository.
I first tried reproducing the diagram from the section titled, First-principles phase diagram
The code successfully ran, but the returned phase diagram didn't match the example well:
I then tried reproducing the results in the MCMC optimization section. I wasn't able to successfully perform the MCMC optimization. The code returned numerous errors over the course of several minutes and eventually hung with no further output.
This file contains the full python output when I ran the optimization:
espei_mcmc_error.txt
Here is my python version and installed packages/versions:
python_info.txt
Check...
This is less of an issue when things are automatic
This doesn't really negatively affect user experience, it just adds some noise.
This also raises the question of whether tracefile and probfile belong in the output section (because they are output) or the mcmc section (because they are only used for mcmc).
We want to catch singular matrix errors and return a log probability of negative infinity instead of stopping runs. Treat them like a convergence failure.
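A minimal sketch of this behavior, assuming the singular matrix surfaces as NumPy's `LinAlgError`. The wrapper `safe_lnprob` and the toy `singular_lnprob` function are hypothetical and only stand in for the real log-probability function.

```python
import numpy as np


def safe_lnprob(lnprob_fn, parameters):
    """Evaluate lnprob_fn, mapping singular matrix errors to -inf.

    A log probability of -inf means the proposed step is always rejected,
    so the sampler keeps running instead of stopping.
    """
    try:
        return lnprob_fn(parameters)
    except np.linalg.LinAlgError:
        return -np.inf


def singular_lnprob(parameters):
    """Toy function that always hits a singular matrix."""
    np.linalg.inv(np.zeros((2, 2)))  # raises LinAlgError: Singular matrix
    return 0.0


result = safe_lnprob(singular_lnprob, np.array([1.0, 2.0]))
```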
AICc aims to prevent over-parameterization for a small number of samples:
$$\mathrm{AICc} = 2k - 2\ln L + \frac{2k^2 + 2k}{n - k - 1}$$
where k is the number of parameters, L is the likelihood, and n is the number of samples.
AICc collapses to AIC for high n.
All that needs to be done is change the formulation in the paramselect.py module.
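The standard AIC and AICc definitions can be written as a short sketch. The function names are hypothetical, and how the likelihood is computed in paramselect.py is not shown; `log_likelihood` is assumed to be the natural log of L.

```python
def aic(k, log_likelihood):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


def aicc(k, log_likelihood, n):
    """AIC with the small-sample correction (2k^2 + 2k) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(k, log_likelihood) + (2 * k**2 + 2 * k) / (n - k - 1)


# With few samples the correction is significant; with many it vanishes
small_sample = aicc(k=2, log_likelihood=-10.0, n=10)
large_sample = aicc(k=2, log_likelihood=-10.0, n=10**9)
```

The guard on `n - k - 1` matters for parameter selection, because candidate models with nearly as many parameters as data points make the correction undefined.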
Example: espei -n 4 will select n_workers=4 on dask. Currently the dask scheduler is hardcoded to use half of the available processors in multiprocessing.
This will require adding the argparse argument n with a default. The default should be half of the available cores for dask and all of the MPI ranks.
The implementer should make a judgement on whether or not the -n option should support MPI. Would it make sense to use fewer than the available MPI ranks?
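A sketch of the argparse side for the dask case. The long option name `--n-workers` and the function `build_parser` are assumptions for illustration; only `-n` is named in this issue, and MPI handling is left out.

```python
import argparse
import multiprocessing


def build_parser():
    """Build a parser with a -n option defaulting to half the available cores."""
    default_workers = max(1, multiprocessing.cpu_count() // 2)
    parser = argparse.ArgumentParser(prog="espei")
    parser.add_argument(
        "-n", "--n-workers",  # long name is hypothetical
        type=int,
        default=default_workers,
        help="number of workers for the dask scheduler",
    )
    return parser


# Equivalent to running: espei -n 4
args = build_parser().parse_args(["-n", "4"])
```

The parsed `args.n_workers` would then replace the hardcoded half-of-cores value passed to the dask scheduler.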
Since MPIPool has shown we aren't required to use dask, we could support multiprocessing as well, especially in light of #22.
This would need changes to
'emcee' (simple and understandable) or 'multiprocessing' (more accurate)
run_espei.py. Pass emcee's InterruptiblePool (docs link) as an object like with MPIPool and dask's client.
We shouldn't need any changes to paramselect.py, but this should be tested on multiple platforms, if possible.
pycalphad is constraining our dependencies to dask<0.20 and sympy<1.2. Once pycalphad 0.7.1 is released, these should be fixed and we can release the constraints in travis.
Phases that do not have phase equilibria data should have their parameters fixed before the MCMC run.
A particular phase in an ESPEI run can have single phase DFT data and no phase equilibria. This means that the parameters that were calculated in the single phase fitting have no effect on the error function that is used in the MCMC run.
When parameters have no effect on the error function, they diverge when used in emcee because the ensemble sampler scales them up to infinity in an attempt to force that parameter to affect the error function.
I haven't checked yet, but my guess is some of the new dask config stuff affects ESPEI: http://matthewrocklin.com/blog/work/2018/06/14/dask-0.18.0
A first draft and feedback was written in this gist
The current iteration is:
Header area.
Include any metadata above the `---`.
---
# core run settings
run_type: full # choose full | dft | mcmc
phase_models: input.json
datasets: input-datasets # path to datasets. Defaults to current directory.
scheduler: dask # can be dask | MPIPool
# control output
verbosity: 0 # integer verbosity level 0 | 1 | 2, where 2 is most verbose.
output_tdb: out.tdb
tracefile: chain.npy # name of the file containing the mcmc chain array
probfile: lnprob.npy # name of the file containing the mcmc ln probability array
# the following only take effect for full or mcmc runs
mcmc:
  mcmc_steps: 2000
  mcmc_save_interval: 100
  # the following take effect for only mcmc runs
  input_tdb: null # TDB file used to start the mcmc run
  restart_chain: null # restart the mcmc fitting from a previous calculation
This issue will focus on the development of a first generation input file structure and spec, and also as a place to brainstorm options that should be user-facing.
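One way to brainstorm the spec is to write down the allowed values from the draft above and check a parsed settings dictionary against them. This is only a sketch: `ALLOWED_VALUES` and `validate_settings` are hypothetical names, and loading the YAML into a dictionary is assumed to happen elsewhere.

```python
# Allowed values taken from the comments in the draft input file
ALLOWED_VALUES = {
    "run_type": {"full", "dft", "mcmc"},
    "scheduler": {"dask", "MPIPool"},
    "verbosity": {0, 1, 2},
}


def validate_settings(settings):
    """Return a list of error messages for keys with disallowed values."""
    errors = []
    for key, allowed in ALLOWED_VALUES.items():
        if key in settings and settings[key] not in allowed:
            choices = sorted(allowed, key=str)
            errors.append(f"{key}={settings[key]!r} is not one of {choices}")
    return errors


draft = {"run_type": "full", "scheduler": "dask", "verbosity": 0}
draft_errors = validate_settings(draft)
```

Returning all errors at once, rather than raising on the first, lets a user fix the whole input file in one pass.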
Was distributed (1.18.0) when this error occurred. Changed to distributed (1.16.3).
File "/Applications/anaconda/envs/my_pycalphad/bin/espei", line 11, in <module>
sys.exit(main())
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/run_espei.py", line 135, in main
mcmc_steps=args.mcmc_steps, save_interval=args.save_interval)
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/paramselect.py", line 754, in fit
for i, result in enumerate(sampler.sample(walkers, iterations=mcmc_steps)):
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 259, in sample
lnprob[S0])
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 332, in _propose_stretch
newlnprob, blob = self._get_lnprob(q)
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/emcee/ensemble.py", line 382, in _get_lnprob
results = list(M(self.lnprobfn, [p[i] for i in range(len(p))]))
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/utils.py", line 39, in map
result = [x.result() for x in result]
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/espei/utils.py", line 39, in <listcomp>
result = [x.result() for x in result]
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/distributed/client.py", line 155, in result
six.reraise(*result)
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/Applications/anaconda/envs/my_pycalphad/lib/python3.6/site-packages/distributed/protocol/pickle.py", line 59, in loads
return pickle.loads(x)
RuntimeError: cannot release un-acquired lock
I haven't been able to reproduce it consistently, but dask workers sometimes die with the dask scheduler.
To debug this, I turned on debugging output by scheduler = LocalCluster(n_workers=cores, threads_per_worker=1, processes=True, silence_logs=verbosity[output_settings['verbosity']]).
I am still waiting for that job to have workers die to see the output, but for now, as iterations in emcee complete, the results are processed in Python (it is known that this is happening because of the progress bar output). During this time, the LocalCluster debugging gives the output
distributed.core - WARNING - Event loop was unresponsive for 1.69s. This is often caused by long-running GIL-holding functions or moving large chunks of data. This can cause timeouts and instability.
Usually I get two similar messages in a row.
As another possibility, the most recent time I was able to reproduce this was when I had two instances of ESPEI running at the same time. I wouldn't think that the different client instances would interact, but maybe it should be investigated.
Convert to JSON and validate internally.
Could be useful for anything digitized, particularly equilibria. Formatting problems are much easier to handle.
Not too much improvement if the data is already stored as arrays.
espei.paramselect._shift_referece_state should handle non _FORM or _MIX outputs, but there needs to be a way to specify what the reference state is if, for example, CPM data is passed.