pypest / pyemu

python modules for model-independent uncertainty analyses, data-worth analyses, and interfacing with PEST(++)

License: BSD 3-Clause "New" or "Revised" License


python uncertainty-analysis

pyemu's Introduction

pyEMU

python modules for model-independent FOSM (first-order, second-moment) (a.k.a linear-based, a.k.a. Bayes linear) uncertainty analyses and data-worth analyses, non-linear uncertainty analyses and interfacing with PEST and PEST++.
pyEMU also has a pure python (pandas and numpy) implementation of ordinary kriging for geostatistical interpolation and support for generating high-dimensional PEST(++) model interfaces, including support for (very) high-dimensional ensemble generation and handling.

Documentation

Complete user's guide:

https://pyemu.readthedocs.io/en/latest/

The pyEMU documentation is being treated as a first-class citizen! Also see the example notebooks in the repo.

What is pyEMU?

pyEMU is a set of python modules for model-independent, user-friendly, computer model uncertainty analysis. pyEMU is tightly coupled to the open-source suite PEST (Doherty 2010a and 2010b, and Doherty and others, 2010) and PEST++ (Welter and others, 2015, Welter and others, 2012), which are tools for model-independent parameter estimation. However, pyEMU can be used with generic array objects, such as numpy ndarrays.

Several equations are implemented, including Schur's complement for conditional uncertainty propagation (a.k.a. Bayes Linear estimation) (the foundation of the PREDUNC suite from PEST) and error variance analysis (the foundation of the PREDVAR suite of PEST). pyEMU has easy-to-use routines for parameter and data worth analyses, which estimate how increased parameter knowledge and/or additional data affect forecast uncertainty in a linear, Bayesian framework. Support is also provided for high-dimensional Monte Carlo analyses via the ObservationEnsemble and ParameterEnsemble classes, including the null-space Monte Carlo approach of Tonkin and Doherty (2009); these ensemble classes also play nicely with PESTPP-IES.
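As a rough illustration of the Schur's complement update mentioned above, here is a minimal pure-Python sketch for the special case of a single observation, where the matrix inverse collapses to a scalar division. The function name, the two-parameter prior, the sensitivity row, and the noise variance are all invented for illustration; this is not pyEMU's implementation, which operates on labeled matrix objects.

```python
def schur_posterior_cov(prior_cov, jac_row, obs_var):
    """Posterior parameter covariance after assimilating ONE observation.

    Implements  post = prior - prior j^T (j prior j^T + obs_var)^-1 j prior
    where j is a single Jacobian row. prior_cov is assumed symmetric.
    """
    n = len(prior_cov)
    # c = prior_cov @ j^T  (an n-vector)
    c = [sum(prior_cov[i][k] * jac_row[k] for k in range(n)) for i in range(n)]
    # s = j @ prior_cov @ j^T + obs_var  (a scalar for one observation)
    s = sum(jac_row[i] * c[i] for i in range(n)) + obs_var
    return [[prior_cov[i][j] - c[i] * c[j] / s for j in range(n)] for i in range(n)]


# Hypothetical inputs: two uncorrelated parameters and one noisy observation.
prior = [[1.0, 0.0], [0.0, 4.0]]
jac = [1.0, 0.5]
post = schur_posterior_cov(prior, jac, obs_var=0.25)

# Conditioning on data never increases a parameter's variance.
for i in range(2):
    print(f"parameter {i}: prior var {prior[i][i]:.3f}, posterior var {post[i][i]:.3f}")
```

With more than one observation the scalar `s` becomes a matrix that has to be inverted (or solved against), which is where a linear algebra library becomes necessary.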

pyEMU also includes lots of functionality for dealing with PEST(++) datasets, such as:

manipulation of PEST control files, including the use of pandas for sophisticated editing of the parameter data and observation data sections
creation of PEST control files from instruction and template files
going between site sample files and pandas dataframes - really cool for observation processing
easy-to-use observation (re)weighting via residuals or user-defined functions
handling Jacobian and covariance matrices, including functionality to go between binary and ASCII matrices, reading and writing PEST uncertainty files. Covariance matrices can be instantiated from relevant control file sections, such as parameter bounds or observation weights. The base Matrix class overloads most common linear algebra operators so that operations are automatically aligned by row and column name. Builtin SVD is also included in all Matrix instances.
geostatistics including geostatistical structure support, reading and writing PEST structure files and creating covariance matrices implied by nested geostatistical structures, and ordinary kriging (in the utils.geostats.OrdinaryKrige object), which replicates the functionality of pest utility ppk2fac.
composite scaled sensitivity calculations
calculation of correlation coefficient matrix from a given covariance matrix
Karhunen-Loeve-based parameterization as an alternative to pilot points for spatially-distributed parameter fields
a helper function to start a group of tcp/ip workers on a local machine for parallel PEST++/BeoPEST runs
full support for prior information equations in control files
preferred differencing prior information equations where the weights are based on the Pearson correlation coefficient
verification-based tests based on results from several PEST utilities
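To make one of the items above concrete, the correlation coefficient matrix follows from a covariance matrix by dividing each entry by the product of the two corresponding standard deviations. The sketch below uses plain Python lists and a made-up 2x2 covariance matrix; the function name is hypothetical and this is not the pyEMU API.

```python
import math


def cov_to_corr(cov):
    """Convert a square covariance matrix (list of lists) to a correlation matrix."""
    n = len(cov)
    # Standard deviations are the square roots of the diagonal entries.
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (std[i] * std[j]) for j in range(n)] for i in range(n)]


# Hypothetical covariance: variances 4 and 9, covariance 1.
cov = [[4.0, 1.0], [1.0, 9.0]]
corr = cov_to_corr(cov)
print(corr)
```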

Versions >= 1.1 include the PstFrom setup class to support generating PEST(++) interfaces in the 100,000 to 1,000,000 parameter range with all the bells and whistles. A publication documenting the PstFrom class can be found here:

https://doi.org/10.1016/j.envsoft.2021.105022

A publication documenting pyEMU and an example application can be found here:

http://dx.doi.org/10.1016/j.envsoft.2016.08.017

Funding

pyEMU was originally developed with support from the U.S. Geological Survey. The New Zealand Strategic Science Investment Fund as part of GNS Science’s (https://www.gns.cri.nz/) Groundwater Research Programme has also funded contributions 2018-present. Intera, Inc. has also provided funding for pyEMU development and support.

Examples

Several example ipython notebooks are provided to demonstrate typical workflows for FOSM parameter and forecast uncertainty analysis as well as techniques to investigate parameter contributions to forecast uncertainty and observation data worth. Example models include the Henry saltwater intrusion problem (Henry 1964) and the model of Freyberg (1988).

There is a whole world of detailed learning material for script-based approaches to parameter estimation and uncertainty quantification using PEST(++) at https://github.com/gmdsi/GMDSI_notebooks. These are an excellent resource for people picking up pyEMU for the first time and for those needing to revisit elements.

How to get started with pyEMU

pyEMU is available through PyPI and conda. To install pyEMU type:

>>>conda install -c conda-forge pyemu

>>>pip install pyemu

pyEMU needs numpy and pandas. For plotting, matplotlib is needed, and pyshp and flopy are needed to take advantage of the auto interface construction.

After pyEMU is installed, the PEST++ software suite can be installed for your operating system using the command:

get-pestpp :pyemu

See documentation for more information.

Found a bug? Got a smart idea? Contributions welcome.

Feel free to raise an issue or submit a pull request.

pyEMU CI testing, using GitHub actions, has recently been switched over to run with pytest. We make use of pytest-xdist for parallel execution. Some notes that might be helpful for building your PR and testing:

Test files are in ./autotest
Pytest settings are in ./autotest/conftest.py and ./autotest/pytest.ini
Currently, files ending _tests.py or _tests_2.py are collected
Functions starting test_ or ending _test are collected
ipython notebooks in ./examples are also run
As tests are run in parallel, where tests require read/write access to files it is safest to sandbox runs. Pytest has a built-in fixture tmp_path that can help with this. Setting optional argument --basetemp can be helpful for accessing the locally run files.
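The sandboxing advice above can be sketched as follows. The file format and the `write_values` / `read_values` helpers are made up for illustration; only the `tmp_path` fixture is a pytest built-in. The test function name starts with `test_`, so it would match the collection rule listed above.

```python
from pathlib import Path


def write_values(path: Path, values: dict) -> None:
    """Write name/value pairs, one per line (a made-up stand-in for a model file)."""
    lines = [f"{name} {value}" for name, value in values.items()]
    path.write_text("\n".join(lines) + "\n")


def read_values(path: Path) -> dict:
    """Read back the pairs written by write_values."""
    pairs = (line.split() for line in path.read_text().splitlines() if line.strip())
    return {name: float(value) for name, value in pairs}


def test_values_roundtrip(tmp_path):
    # tmp_path is a per-test temporary directory supplied by pytest, so tests
    # running in parallel workers never write to the same location.
    target = tmp_path / "values.dat"
    write_values(target, {"hk1": 1.5, "hk2": 0.25})
    assert read_values(target) == {"hk1": 1.5, "hk2": 0.25}
```

Saved in a file whose name ends in _tests.py under ./autotest (for example a hypothetical sandbox_tests.py), this would be picked up by the settings described above, and --basetemp controls where the temporary directory is created.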

Running tests locally

To be able to make clean use of pytest's fixture decorators etc., it is recommended to run local tests through pytest (rather than running the test files as scripts and commenting tests in and out of the main block). For example:

Run all tests:

pytest --basetemp=runner autotest

with pytest-xdist, local runs can be parallelized:

pytest --basetemp=runner -n auto autotest

Run all tests in a file:

pytest --basetemp=runner -n auto autotest/testfile_tests.py

Run a specific test [`this_test()`]:

pytest --basetemp=runner autotest/testfile_tests.py::this_test

Using an IDE:

Most modern, feature-rich editors and IDEs support launching pytest within debug or run consoles. Some might need "encouraging" to recognise the non-standard test tags used in this library. For example, in PyCharm, to support click-and-run testing, the pytest-imp plugin is required to pick up test functions that end with _test (a nosetest hangover in pyEMU).

pyemu's Issues

various utils rely on inschek

Several examples and utils rely on inschek.exe. Is it possible to add inschek to the bin? If so, I will do the menial task of changing the run('inschek...) commands to the relative path so Windows users don't have to deal with another download and changing their path in environment variables.

parameter names too long, hk template file too wide

Just in case Matt doesn't get back to me...

My pest setup is auto-generating parameter names that are longer than 12 characters (for simple stuff like hk and porosity). Also, due to the size of the model (168 cols), the hk template file is too wide (> 2000 characters). PEST++ is not impressed and won't run.

Apparently there is a switch for the setup pest file command to shorten the names. Could someone please show me where this is?

Many thanks,
Rebecca

dictionary key mismatch in fac2real?

probably abusing this - but i don't yet see how the dictionary key for pp_dict would match the key for fac_data?

looking around line 1986 in geostats.py
fac_sum = sum([pp_dict[pp] * fac_data[pp] for pp in fac_data])

the pp_dict key is defined as the pilot point name
pp_dict = {name:val for name,val in zip(pp_data.index,pp_data.parval1)}

but the fac_data key is defined by some value preceding the factors? so i keep getting a 'key error'.
perhaps it's my inputs (i'm providing fac2real a dataframe rather than a pilot point file) - but it seems to be reading it ok, and the required columns are present.

Issue with regularization and parameters containing "-" in name

When parameter names contain the "-" character (kinda bad form, but legal in PEST according to the manual) there is a bug in Pst.write() when prior information has been implemented using zero_order_tikhonov. In the Pst.write() method, there is a call to the private function _parse_pi_par_names, which splits each prior information equation on + or -.

As a result, the function concludes that the name pulled from the equation (which splits up names with a - in them) is invalid and removes all the prior information equations prior to writing the files.

The fix I suggest would be to check whether each parameter name is in an equation (likely very expensive). Not sure best other choice except maybe warn users if they use a - in a parameter name?
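The failure mode described in this issue, and one possible alternative to checking every parameter name against every equation, can be sketched as follows. This assumes the operators between terms are surrounded by whitespace while hyphens inside names are not; both functions and the example equation are hypothetical and are not pyEMU's `_parse_pi_par_names`.

```python
import re


def split_naive(lhs):
    """Split on every + or -, which also cuts hyphenated names apart."""
    return [t.strip() for t in re.split(r"[+-]", lhs) if t.strip()]


def par_names_whitespace_aware(lhs):
    """Split only on + or - surrounded by whitespace, then pull out the names."""
    terms = re.split(r"\s+[+-]\s+", lhs.strip())
    names = []
    for term in terms:
        # Each term is assumed to look like "<coef> * <name>" or "<coef> * log(<name>)".
        factor = term.split("*")[-1].strip()
        match = re.fullmatch(r"log\((.+)\)", factor)
        names.append(match.group(1) if match else factor)
    return names


lhs = "1.0 * log(hk-zone1) - 1.0 * log(hk-zone2)"
broken = split_naive(lhs)
names = par_names_whitespace_aware(lhs)
print(broken)
print(names)
```

The naive split yields fragments such as `zone1)` that match no parameter, which is consistent with the equations being judged invalid and dropped.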

nested list format for the helper function to generate the PEST control file

In the function doc, it says the nested format is like this ["lpf.hk",[0,1,2,]]. However, to make it work, it should be in the format like the one below
[["lpf.hk",0],["lpf.hk",1],["lpf.hk",2]]
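Until the documentation and the behavior agree, the documented nested form can be expanded into the flat form that reportedly works with a few lines of Python. The helper name is hypothetical and not part of pyEMU.

```python
def expand_props(nested):
    """Expand [name, [k0, k1, ...]] entries into one [name, k] pair per layer."""
    flat = []
    for name, layers in nested:
        # Accept a single integer as well as a list of integers.
        if isinstance(layers, int):
            layers = [layers]
        flat.extend([name, k] for k in layers)
    return flat


flat_props = expand_props([["lpf.hk", [0, 1, 2]]])
print(flat_props)
```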

pestpp-ies build error resolved.

Hi Jeremy.
Thanks for your help resolving the template build issue I was having over on usgs\pestpp . The issue that you found was that the prior.jcb included negative numbers. I was building prior.jcb with an ensemble I prepared using pyEmu without enforcing parameter bounds.

I was attempting to follow the setup_pest_interface.pdf workflow presented on the mnfienen-usgs/GW1876 GitHub page; however, as I was using FEFLOW I was not able to use the pyemu.helpers.PstFromFlopyModel to build the ensemble. It turns out I selected the wrong utility as an alternative.

Instead, what I should have used is the pyemu.ParameterEnsemble.from_gaussian_draw with enforce_bounds set to 'True'. Pestpp-ies is now running.

Thank you so much for your assistance. I was not sure where to begin to debug my workflow (after several wrong turns before asking for help).
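The root cause in this issue was an ensemble containing values outside the parameter bounds. As a minimal sketch of what enforcing bounds on a Gaussian draw can look like, the following clips each realization to its bounds using only the standard library. The function name and numbers are invented, and pyEMU's own enforce_bounds option may work differently (for example with respect to log-transformed parameters).

```python
import random


def draw_with_bounds(mean, std, lower, upper, num_reals, seed=0):
    """Draw Gaussian realizations of one parameter and clip them to [lower, upper]."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    draws = [rng.gauss(mean, std) for _ in range(num_reals)]
    return [min(max(value, lower), upper) for value in draws]


# Hypothetical parameter: a wide prior that would otherwise produce negatives.
reals = draw_with_bounds(mean=1.0, std=2.0, lower=0.01, upper=10.0, num_reals=100)
print(min(reals), max(reals))
```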

exe name in forward_run when automagically generated

Yo -

For Freyberg example, trying to make the automagically generated PEST files use NWT instead of MF2005 for the exe_name in the forward_run.py script. Can't seem to force that, though. I tried implementing passing in an org_model_exe_name optional variable and setting it, but somehow it's getting overwritten. Check out the PST_to_pyemu_to_Schur_Overview.ipynb notebook.

Having trouble running the Freyberg example

I am running the Freyberg example so I can learn to use flopy and pyemu to develop PEST calibration simulations for a project I am working on. However, I am running into an issue at line 70 in the pest_freyberg.py script (pyemu.pst_utils.smp_to_ins(os.path.join("misc","freyberg_heads.smp"))). This is writing a freyberg_heads.smp.ins file with the following format:
pif ~
l1 w w w !OR00C00_02012015!
...
...

This same file in my cloned version of the example has the following format:
pif ~
l1 (OR00C00_0)39:46
...
...

Eventually, my freyberg_heads.smp.ins is causing an error when the script is setting up the par value, bounds, and groups. Do you have suggestions for getting past this error?

Thanks in advance,
porter

Baffled by "ControlData"

Not really an issue with pyemu, but can I change NOPTMAX for pst class using pyemu?

Docs and refactor coming soon

Just a heads up, I'm in the process of reworking (or initially working up) more complete documentation and also cleaning up code as I go. This is in preparation for the new control file format as well as a dedicated prior information equation class.

Lots of stuff potentially on the chopping block, including:

deprecation-warning wrapped reroutes for functions that have moved
sparse covariance matrix support (not needed now that we draw by groups)
some of the myriad of reweighting functions in Pst (how did we get so many!?)
The MonteCarlo class. This has been largely replaced by the classmethod constructors in ParameterEnsemble and ObservationEnsemble. Probably gonna move the null-space projection stuff to ErrVar since that is where all the subspace stuff is.

Speak now or be prepared to patch old scripts!

pst.write() iterates through n characters (rather than n paths) if pst.model_command value is changed.
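The title above only states the symptom. One common way this kind of behavior arises in Python is that iterating over a bare string visits its characters, so code written for a list of commands misbehaves when handed a single string. The sketch below assumes model_command is meant to be a list of command strings; the helper is hypothetical and is not the fix applied in pyEMU.

```python
def as_command_list(model_command):
    """Return model_command as a list of command strings."""
    if isinstance(model_command, str):
        # Wrap a single command so later loops see one item, not its characters.
        return [model_command]
    return list(model_command)


# Iterating over a bare string visits characters, not commands.
chars = [c for c in "run.bat"]
commands = as_command_list("run.bat")
print(len(chars), commands)
```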

obslist_dict

Thinking of a tiny refactor for the data/parameter importance methods. One of the optional inputs is the obslist_dict. But, since this is typically a dictionary with a list as the keys and the values, I suggest allowing users to supply only the list and then make the dict internally. If so, it would make sense to change the input arg from obslist_dict to obslist and similarly parlist_dict to parlist

I already implemented it in my fork. If you think it's ok, I will submit a pull request.
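A minimal sketch of the proposed convenience, assuming the dictionary simply maps each name to itself as described in this issue. The function name is hypothetical and the real methods may expect a different value type.

```python
def to_obslist_dict(obslist):
    """Accept None, a dict, a single name, or a list of names; return a dict or None."""
    if obslist is None:
        return None
    if isinstance(obslist, dict):
        return dict(obslist)  # already in the dictionary form, pass through a copy
    if isinstance(obslist, str):
        obslist = [obslist]
    # Build the name -> name dictionary internally so users only supply the list.
    return {name: name for name in obslist}


print(to_obslist_dict(["h01", "h02"]))
```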

update kriging factors after update the pp pars ranges in the pst class

How can I update the factor files with PyEMU after I change the parameter ranges of some pilot points? The initial factor files were generated automatically when PstFromFlopyModel was called.

PstFromFlopyModel fails when use_pp_zones==False but ibound includes values>1

I was having trouble getting PstFromFlopyModel to run with my model when using only pilot point properties (no grid_props or grid_props). This stopped being an issue when I replaced all non-zero values in my ibound array with ones. I think setup_pilotpoints_grid was only creating pps where ibound==1, and as a result the helper crashed out during calculation of interpolation factors.
The documentation says that if use_pp_zones is False, ibound values greater than zero are treated as a single zone for pilot points, but this doesn't seem to be the case at present.

Code and error:

pp_props = [["upw.{}".format(p),l] for p in ["hk","vka","ss","sy"] for l in range(nlay)]
helper = pyemu.helpers.PstFromFlopyModel(m, new_model_ws, model_exe_name='mfnwt', pp_props=pp_props,remove_existing=True,pp_space=4,use_pp_zones=False,build_prior=True)

~/miniconda3/envs/flopy/lib/python3.7/site-packages/pyemu/utils/helpers.py in __init__(...)
1717 self.setup_list_pars()
-> 1718 self.setup_array_pars()

~/miniconda3/envs/flopy/lib/python3.7/site-packages/pyemu/utils/helpers.py in setup_array_pars(self)
2483 if self.pp_suffix in mlt_df.suffix.values:
2484 self.log("setting up pilot point process")
-> 2485 self.pp_prep(mlt_df)

~/miniconda3/envs/flopy/lib/python3.7/site-packages/pyemu/utils/helpers.py in pp_prep(self, mlt_df)
-> 2270 ok_pp.calc_factors_grid(self.m.sr, var_filename=var_file, zone_array=ib_k)

~/miniconda3/envs/flopy/lib/python3.7/site-packages/pyemu/utils/geostats.py in calc_factors_grid(self, spatial_reference, zone_array, minpts_interp, maxpts_interp, search_radius, verbose, var_filename, forgive)
776 if self.interp_data is None or self.interp_data.dropna().shape[0] == 0:
--> 777 raise Exception("no interpolation took place...something is wrong")

Exception: no interpolation took place...something is wrong
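The documented behavior quoted in this issue (ibound values greater than zero treated as a single zone) can be sketched on a plain nested list as below. The function name and the small array are made up; with a numpy ibound array the same idea is an elementwise comparison.

```python
def single_zone(ibound):
    """Map every active cell (ibound > 0) to zone 1 and everything else to 0."""
    return [[1 if value > 0 else 0 for value in row] for row in ibound]


# Hypothetical 2x3 ibound array with several positive zone numbers.
ibound = [[0, 1, 2], [3, 1, 0]]
zones = single_zone(ibound)
print(zones)
```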

pst_helper - processing a different number of wells in each stress period

Hello,

I'm trying to run the pst_helper on an existing model that has a different (increasing) number of wells in each stress period. I get the following error:

Exception: spatial_list_pars() error: must have same number of entries for every stress period for wel

I've done the obvious fix of re-writing my model .wel file to run all wells in all stress periods, some with zero pumping rates. But that's a bit painful and I'm a bit lazy.

I wonder how common this is in other models, and whether it is worth developing a fix to re-write the .wel file to the appropriate format. I suspect it would take about 3 lines of code. I also suspect that I could probably do this if it annoys me enough and if I can find a few spare minutes. I'll schedule it in for 2021. :)

Cheerio,
Rebecca
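The manual workaround described in this issue (every well present in every stress period, with zero rates where a well is inactive) can be sketched as below. The data layout, a dict of stress period number to a dict of (layer, row, column) to flux, is an assumption made for illustration and is not the flopy or pyEMU representation.

```python
def pad_well_periods(stress_period_data):
    """Give every stress period an entry for every well cell, using 0.0 where absent."""
    # Collect the union of well cells over all stress periods.
    all_cells = sorted({cell for wells in stress_period_data.values() for cell in wells})
    return {
        kper: {cell: wells.get(cell, 0.0) for cell in all_cells}
        for kper, wells in stress_period_data.items()
    }


# Hypothetical model: a second well comes online in stress period 1.
spd = {
    0: {(0, 5, 5): -100.0},
    1: {(0, 5, 5): -100.0, (0, 8, 2): -50.0},
}
padded = pad_well_periods(spd)
print(padded)
```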

PstFromFlopyModel expects pilot points

Ran into a problem with PstFromFlopyModel in which methods bombed if pp_props was not set. I made traps to try and allow subsets of properties to be set. I committed them in develop.

Log transform in build_jac_test

build_jac_test_csv, in helpers - develop branch, takes the log of the parameters, adds the increment, and then does 10^(result) to get the perturbed value for the dataframe. I thought the parameter specifications in PEST were on the untransformed values for log parameters, so that the value in the table should just be (base_value + increment). Right now, if the base value for a log transformed parameter is 1.0, and the increment is defined as 'relative' with a value of 0.1, the jac_test output table is getting 1.258925 (10^0.1) and I was expecting 1.1 for the first increment.
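The two interpretations in this issue differ only in the space in which the increment is added. With a base value of 1.0 and an increment of 0.1, adding in log10 space gives 10^(0 + 0.1), about 1.258925, while adding in native space gives 1.1. The function names below are hypothetical and this is only the arithmetic, not the pyEMU code.

```python
import math


def perturb_log_space(base, increment):
    """Add the increment to log10(base), then back-transform."""
    return 10.0 ** (math.log10(base) + increment)


def perturb_native_space(base, increment):
    """Add the increment directly to the untransformed value."""
    return base + increment


log_value = perturb_log_space(1.0, 0.1)
native_value = perturb_native_space(1.0, 0.1)
print(f"log-space: {log_value:.6f}, native-space: {native_value:.6f}")
```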

Add observations not adding obsval or obgnme

This is most likely my crappy python skills, but...

I'm setting up a pest control file. I've created a timeseries of all obs well locations for all stress periods using setup_hds_timeseries(). The dataframe looks fine, and pulls head values out of the .hds file.

I run the pest helper to set up the control file. All looks good.

Then I pst.add_observations(), and write the forward run with frun.

But in my resulting pest control file I have all of the head obs names that I put in with the timeseries appended nicely, but the obs val has gone to 10^10 and obgnme is just "obgnme". Where did my obs values go??!!

I'm at the point where I'm just trying to overwrite the obs values in the pst file (if only my df filtering skills were up to speed) but it would be good to know whether this is a bug or my ineptitude. It seems like double handling.

Minor edit in the function modflow_hob_to_instruction_file()

Hello,

I'm using the notebook titled 'modflow_to_pest_like_a_boss' to build PEST++ files for a groundwater model (I'm working with a team from USGS-NMWSC) . On running the function pyemu.gw_utils.modflow_hob_to_instruction_file() in cell 4 of the notebook, I get an instruction file in which a typical line is:

l1 w w !i03j10.2!

Since we're interested in reading the values in the first column in the freyberg.hob.out file titled 'SIMULATED EQUIVALENT', I changed a line in the pyEMU source code (in the script gw_utils.py, in the function modflow_hob_to_instruction_file(), line 69 - removed the two w's), so a typical line in the instruction file now reads:

l1 !i03j10.2!

I think this change is necessary for the head observation file that I'm working with (which is exactly like the file freyberg.hob.out) - I'm not sure if this change is required for other head observation files as well, but I thought I'd bring it to your attention.

Thanks,

Rishi Jumani.
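The difference between the two instruction lines in this issue is the number of `w` (whitespace) instructions placed before the observation marker, which controls how many leading columns are skipped on each output line. The sketch below builds such lines for a configurable number of skipped columns. The function is hypothetical, it assumes one observation per output line, and it does not handle header lines in the model output file.

```python
def instruction_lines(obs_names, skip_columns=0, marker="~"):
    """Build PEST instruction-file lines that read one observation per output line."""
    skips = ["w"] * skip_columns  # one 'w' per leading column to skip
    lines = [f"pif {marker}"]
    for name in obs_names:
        lines.append(" ".join(["l1"] + skips + [f"!{name}!"]))
    return lines


with_skips = instruction_lines(["i03j10.2"], skip_columns=2)
no_skips = instruction_lines(["i03j10.2"], skip_columns=0)
print("\n".join(with_skips))
print("\n".join(no_skips))
```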

setup_hob in helpers.py fails to get hob_out_fname

I've just been trying out the 'notest_MODFLOW_to_PEST_even_more_boss' example notebook and I think I've run into a bug:

'setup_hob' calls flopy's 'get_output_attribute' function to get the head observation output filename, but doesn't pass 'fname' to the 'attr' field.
as a result, 'get_output_attribute' doesn't return anything and the 'os.path.join()' call fails.
(I guess it's also possibly an issue that flopy doesn't give an error when no 'attr' is passed to 'get_output_attribute).

Loving your work.
F

Getting actual head observations from actual groundwater wells into pest control file

Hi clever people,

Sorry to use your bug fix zone as a help centre, but....

Which util do I need to use to get all of my head observation time series (and river flux time series) into the pest control file? I think I'm looking for something like PESTGEN. If it's in pyemu I'll use it, otherwise I'll just use boring basic dos PESTGEN.

And on that, did I miss something in the training workshop, or do the Freyberg example notebooks not have any field observations included? I can't find them anywhere in the model files or in the pest file. If not, how can you calculate a phi?

uniform distributions for draw

OK - got a strange bug. I worked on it but couldn't quite sort it out....

With the new functionality to allow making a draw from either a uniform or Gaussian distribution, I hit trouble when sampling more than once.

Check out notebook NSMCexample_freyberg.ipynb in examples. If you never specify how='uniform' then sampling is always made as Gaussian (the default). Cool. If you specify how='gaussian', same (as expected) behavior. But, if you make a draw specifying how='uniform' and then make a subsequent draw using Gaussian (either explicitly or by not specifying how and thus going default style) a strange set of exceptions get thrown that I haven't been able to get to the bottom of.

The notebook in my branch master_mnf illustrates this behavior.

pst_handler / #comments

A minor, possibly me-specific issue in pst_handler: if suitable for broad use, can we please add comment='#' to pd.read_csv() in _read_df()? Pyemu bombs on reading a big pest control file I have which has comments throughout, marked modflow-style with #s.

Thanks a lot,
Chris.

changing to method of Sobol

@mnfienen-usgs has got me hooked on this package, but I'm having a problem right now.

pestpp-sen is going away from gsa files for specifying methods.

I'm trying to use the following code to specify the method of Sobol.

morris = False
if morris:
    pst.pestpp_options["gsa_method"] = 'morris'   
else:
    pst.pestpp_options["gsa_method"] = 'sobol'

However, only the method of Morris is running.

How do I specify the method of Sobol?

Issue to run "python forward_run.py"

I am using miniconda to manage my Python packages. Although I have added the Python path to the system PATH environment variable, I still cannot run "python forward_run.py" directly: it raises the following error when run from a cmd window. However, if I run activate base first, it works well. I have tried adding activate base to the PEST control file, but PESTPP cannot start the activate command. Any suggestions?

Traceback (most recent call last):
  File "forward_run.py", line 3, in <module>
    import numpy as np
  File "C:\Users\cui00e\AppData\Local\Continuum\miniconda3\lib\site-packages\numpy\__init__.py", line 140, in <module>
    from . import _distributor_init
  File "C:\Users\cui00e\AppData\Local\Continuum\miniconda3\lib\site-packages\numpy\_distributor_init.py", line 34, in <module>
    from . import _mklinit
ImportError: DLL load failed: The specified module could not be found.

what is the int variable in spatial_list_props ([[`str`,[`int`]]])?

What is the int variable in spatial_list_props ([[`str`,[`int`]]])?

Zero variance with pyemu.Cov.from_parameter_data despite valid parbnds?

I am trying to build a prior parameter covariance matrix from parameter bounds.

pst=pyemu.Pst('pcf.pst')
cov = pyemu.Cov.from_parameter_data(pst)

Gives the following:
Exception: Cov.from_parameter_data() error: variance for parameter 10309025a1 is 0.0

The relevant line in the pst file is:

10309025a1 log factor 3.221851906E+00 1.000000000E-06 4.000001000E+00 inflow 1.000000000E+00 -2.000001000E+00 1.000000000E+00

from pst.parameter_data.loc['10309025a1',:]

parnme 10309025a1
partrans log
parchglim factor
parval1 3.22185
parlbnd 1e-06
parubnd 4
pargp inflow
scale 1
offset -2
dercom 1
extra NaN
Name: 10309025a1, dtype: object

Changing parlbnd from 4.0 to 5.0 gets beyond parameter 10309025a1. Strange, or just me?

(won't let me attach control file)

restart option in start_slaves utility?

Hi there,

I hacked the start_slaves utility to include an option to restart an aborted run. Seems to have worked and wonder if you think it's something worth including in your code.

I just added an option restart=True (default value is False), then under the master section added an if/else to include "/r" in the arg list if restart is True:

if restart:
    args = [exe_rel_path, pst_rel_path, "/r", "/h", ":{0}".format(port)]
else:
    args = [exe_rel_path, pst_rel_path, "/h", ":{0}".format(port)]

inschek chokes with >100K observations

Another small issue with automatically setting up PST from flopy model. inschek chokes with >100K observations, so can't use it to populate starting values for observations. In my example, I just removed some potential obs from HYDMOD, but might want to figure a way around that...

Is STR (stream flow package) supported currently?

I have a STR package in the model. I have tried to include a multiplier for str.cond with the following code.

bc_props = []

bc_props.append(["str.cond",0])

mfp_boss = pyemu.helpers.PstFromFlopyModel(model=m,new_model_ws=new_model_ws,org_model_ws=temp_model_ws,mflist_waterbudget=False, pp_props=pp_props, obssim_smp_pairs = obssim_smp_pairs, zone_props=zone_props, spatial_list_props=bc_props, remove_existing=True,pp_space=5,par_bounds_dict = {"hk":[0.01,100.0]})

I got the following error when I run it. It seems like PyEMU doesn't rewrite the STR file as it does for other packages, such as WEL and RCH.
FileNotFoundError: [Errno 2] No such file or directory: 'template\.\STR_0000.dat'

Ensemble Smoother Example Notebook.

Hi Jeremy.
Thanks so much for your work on the pyEmu library.
Do you have any notebooks in which you demonstrate setting up of an iterative ensemble smoother that you are able to share?
Thanks again.
Dan Puddephatt

zone based parameterisation for GHB conductance

Can I apply zone-based parameterisation to the GHB conductance? Basically, I want to include a few multipliers for GHB conductance that have been grouped into a few classes based on their locations.

plotting coming soon

plotting needs:

Pst.plot() - 1to1 (by groups) (if res is available), parameter (and forecast) histograms (if pest++ unc summaries are available)

Ensemble.plot() - histograms (several per page, prob multipage pdf), support for several Ensembles on same plots to compare histograms (and changes), add initial/observed values from Pst (if avail)

Schur.plot() - bar charts of prior/post/reduce uncertainty for pars and forecasts. Optionally histograms if initial and final par/forecast values are avail.

ErrVar.plot() - stacked bar charts for each forecast. option to add Schur posterior as a horizontal line

Any thoughts why figures are changing sizes?

Recently 1:1 plots are coming out with odd sizes. It was working fine about a month ago.

With nr,nc=4,2 in plot_utils.py I get:

With nr,nc=2,2 the plots for the first two plots are the expected size, the next two are tiny, the next two are a little bigger, and all the rest are the expected size

(Sorry I can't get jupyter notebook to stop compressing the size of the window and adding the slider bar)

Since I am prone to messing things up and I was messing around with the subplot rows and cols, I attempted to make a clean copy. This gives me a clean copy, right?

git fetch upstream
git checkout develop
git merge upstream/develop

Mixing JCO/PST objects/filenames in instantiation

Bit of an obscure error here. When making a LinearAnalysis object (I was making a Schur object, which calls the LinearAnalysis constructor), the user can pass a string for jco or a pyemu.Jco object.

So, if user passes a string for jco, pyemu replaces .jco or .jcb extension on the string with .pst and reads what it needs from that associated PEST control file.

But, if the user passes a pyemu.Jco object to the constructor, then there must also be a string or object for the pst argument to load from there. But.....if a user passes an object for jco and nothing for pst, pyemu is unable to find the PEST control file (tries to operate on jco like a string).

I added an exception to handle this case and warn the user simply that if they provide an object for jco they must also pass an explicit pst arg. However, this is bombing a bunch of tests (tests that are intended to fail, but they fail without my new exception being caught and thus kill Travis). So....I propose we trap for these explicitly in the tests. However, I request a quick review to make sure that the intent of the tests is still met.
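The argument handling described in this issue can be sketched as a small resolver: a string jco implies a control file with the same base name, while an object jco requires an explicit pst argument. The function name and error messages are hypothetical and this is not the LinearAnalysis constructor.

```python
import os


def resolve_pst_path(jco, pst=None):
    """Return the pst argument to use, deriving it from jco only when jco is a filename."""
    if pst is not None:
        return pst
    if isinstance(jco, str):
        base, ext = os.path.splitext(jco)
        if ext.lower() in (".jco", ".jcb"):
            # Swap the Jacobian extension for the control file extension.
            return base + ".pst"
        raise ValueError(f"cannot derive a control file name from '{jco}'")
    # jco is an object, so there is no filename to derive the control file from.
    raise ValueError("when jco is passed as an object, pst must be passed explicitly")


print(resolve_pst_path("freyberg.jcb"))
```

Tests that are expected to fail in this situation could then trap the specific exception type (for example with pytest.raises) rather than a generic failure.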

par bounds when multipliers are used

Are the bounds in par_bounds_dict for the multipliers or for the raw parameters when PstFromFlopyModel is called?

Scipy 0.15 not working with SVD

FYI - I couldn't get parameter identifiability to work when running with Scipy 0.15. Rolling back to 0.14 worked. In the error_variance_example notebook it would fail on the first code line under parameter identifiability. It appeared to be an issue with la.svd, "LAPACK function dgesdd_lwork could not be found"

PstFromFlopyModel creates template files with rows longer than 2000 characters

As I understand it this is a problem for standard PEST. PEST++ seems to be managing ok, but I can't find positive confirmation in the manuals that the row length limit doesn't apply for PEST++.

Is there a way to reduce the parameter space width, to reduce the resulting width of the template files?

MonteCarlo Draw Assertion Error with Sparse Covariance

Seems that the MC draw method bombs on an assertion error when passing it a Cov object made with pyemu.utils.helpers.sparse_geostatistical_prior_builder()

Not sure if the best bet is to change the assertion at line 161 in mc.py so that, rather than only asserting isinstance(cov,Cov), it also allows isinstance(cov,SparseMatrix)?

Alternative would be to issue a specific error if SparseMatrix isn't supported.
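Both options raised in this issue can be sketched with `isinstance` and a tuple of accepted types. The `Cov` and `SparseMatrix` classes below are empty stand-ins defined only so the example runs on its own; they are not the pyEMU classes, and the checking function is hypothetical.

```python
class Cov:
    """Stand-in for a dense covariance matrix class."""


class SparseMatrix:
    """Stand-in for a sparse matrix class."""


def check_cov_type(cov):
    """Accept either matrix type; otherwise raise a specific, readable error."""
    if not isinstance(cov, (Cov, SparseMatrix)):
        raise TypeError(
            f"cov must be a Cov or SparseMatrix instance, not {type(cov).__name__}"
        )
    return cov


check_cov_type(Cov())
check_cov_type(SparseMatrix())
```

Raising an explicit exception instead of using a bare assert also keeps the check active when Python runs with assertions disabled.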

pestpp-ies not running with test case

Hi there. I'm trying to do a test run with pestpp-ies before I do a full monte carlo run. Think notebook "prior_montecarlo" which I am following like a fool so I can learn this at mere mortal speed. At this point:

"replace the par vals with the first row in the par ensemble"
pst.parameter_data.loc[pe.columns,"parval1"] = pe.iloc[0,:] #pe.iloc is [num runs,num params]
pst.control_data.noptmax = 0
pst.write(os.path.join(t_d,"test.pst"))
pyemu.os_utils.run("pestpp-ies test.pst",cwd=t_d, verbose = True) #HERE
res = pyemu.pst_utils.read_resfile("test.base.rei") #os.path.join(t_d,"test.base.rei"))
res.loc[pst.nnz_obs_names,:]

I get an error saying that it can't find test.base.rei because the pestpp-ies has not executed properly.

--- 'noptmax'=0, running control file parameter values and quitting ---
...running control file parameter values
...failed realizations: 1
...the following par:obs realization runs failed: BASE:BASE,

--- control file parmeter value run failed...bummer ---

Yah, bummer indeed.

My original w14c.pst file runs fine with -ies. The only difference I can see between test.pst and w14c.pst is that the 1.0 multipliers have been replaced by the ensemble from the jacobian, as they should have. And the obs groups have been reversed, which shouldn't matter much.

Is pestpp-ies really sensitive to white space or something odd like that? I can't work out why I'm not getting my test.base.rei file generated.

I haven't gone straight to the full monte carlo run yet, might test this next.

Thanks a million.

numbering of pst and par files

I was going to change this myself and do a pull request, but wasn't sure how deep the implications are:

when writing out multiple pest control files with write_psts method of a MonteCarlo object, numbering starts at 0.

But when either writing out individual par files with parensemble.to_parfiles or a big csv file with parensemble.to_csv, numbering starts at 1. In the code for to_parfiles it is using the index rather than just iterating over a range (as it does in write_psts). To integrate most cleanly with HTCondor, I'd love to see zero-based numbering throughout, but in any case, it would be good if the PST and PAR files were consistent one way or the other.
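The consistency being asked for amounts to deriving both sets of file names from one zero-based range. A minimal sketch follows. The function name and the `case_<i>` naming pattern are hypothetical and are not the names that write_psts or to_parfiles actually produce.

```python
def realization_file_names(num_reals, base="case"):
    """Build matching control-file and parameter-file names from one zero-based range."""
    return [(f"{base}_{i}.pst", f"{base}_{i}.par") for i in range(num_reals)]


names = realization_file_names(3)
```

Because both names in each pair come from the same counter, realization `i` always maps to the same PST and PAR file, which is what a zero-based HTCondor process number would expect.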

Tied/fixed parameter and data worth analysis issues

I'm having difficulty running through the Schurexample_Freyberg.ipynb code with my own data. The problems seem to be related to having tied or fixed parameters when calling 1) get_par_group_contribution() or 2) get_removed_obs_importance() on the Schur class.

Also, I have a general question. I'm performing a data worth analysis on output data from a hydrologic model run in a data-poor system. I have observation weights for all model predictions (time series of gw level) set to zero as in the example, but I find I get errors in the above functions if I don't have at least a few weights greater than zero...is this a quirk in the code or am I overlooking something fundamental?

Thank you for your time, and for a great, easy-to-implement tool!

pyEMU_test_KBreen.zip

adjust_weights_by_list

So this method has a bit of strange behavior. You can pass a list and a weight, and the weights in the observation data section whose names are on the passed list get assigned the new weight. All good, BUT there is a line in the code that restricts this behavior to only observations that start with a weight of 0.0. A couple of thoughts on this:
0) At a minimum, the documentation should make this clear. I've been working with a user who had been trying to run this and could not figure out why no weights were being updated (the reason is that the weights start with nonzero values).
1) We could make an option regarding nonzero existing weights.
2) We could remove this method completely, because it can easily be implemented directly in pandas on the observation_data dataframe like:
pst.observation_data.loc[obslist, 'weight'] = new_weight

Open to ideas, but as it is, I think the documentation can lead to confusion.
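The difference between the two behaviors is easy to see on a small stand-in table. This is a sketch in plain pandas. The DataFrame below only mimics the `weight` column of an observation data table, and the observation names and values are made up for illustration.

```python
import pandas as pd

# stand-in for the observation data table: h2 already has a nonzero weight
obs = pd.DataFrame({"weight": [0.0, 2.0, 0.0]}, index=["h1", "h2", "h3"])
obslist = ["h1", "h2"]
new_weight = 5.0

# behavior described above: only listed observations that start at zero are updated
zero_only = obs.copy()
mask = zero_only.index.isin(obslist) & (zero_only["weight"] == 0.0)
zero_only.loc[mask, "weight"] = new_weight

# direct assignment: every listed observation is updated
direct = obs.copy()
direct.loc[obslist, "weight"] = new_weight
```

In `zero_only`, h2 keeps its original weight of 2.0, which is the surprise the user ran into. In `direct`, both h1 and h2 end up at 5.0.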

Issue with calculating correlation coefficients from Covariance matrix

Hi,

I've been trying to use the to_pearson method to convert a covariance matrix (from a jco file) to a correlation coefficient matrix, but I get an error: "assert not self.isdiagonal".

The covariance matrix I'm passing is diagonal (167 x 167). Are there some methods within pyemu to correctly convert the covariance matrix so that it can be used by the to_pearson method? Apologies if there is something obvious I'm missing that is explained elsewhere.

Thanks.
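For reference, the conversion itself divides each covariance entry by the product of the two standard deviations. The sketch below uses plain numpy and is not pyemu's implementation; `cov_to_corr` is a hypothetical helper name. It also shows that a diagonal covariance matrix always maps to the identity matrix, which is presumably why the method refuses diagonal input.

```python
import numpy as np


def cov_to_corr(cov):
    """Convert a covariance matrix to a Pearson correlation matrix."""
    cov = np.asarray(cov, dtype=float)
    std = np.sqrt(np.diag(cov))
    # corr[i, j] = cov[i, j] / (std[i] * std[j])
    return cov / np.outer(std, std)


full = np.array([[4.0, 1.0], [1.0, 9.0]])
diagonal = np.diag([4.0, 9.0])

corr_full = cov_to_corr(full)
corr_diagonal = cov_to_corr(diagonal)
```

A covariance matrix with no off-diagonal terms carries no correlation information, so there is nothing for the conversion to report beyond ones on the diagonal.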

pestpp-ies can't find .hds_timeseries.processed or .hds_timeseries.processed.ins files

Think the title explains it. The .hds_timeseries.processed files are in the same directory as pestpp-ies and all the rest of the model files - I can see them, pestpp-ies can't.

I have the most up to date version of pestpp-ies.exe

I've stuck them on github here: https://github.com/RebeccaDoble/gw-model

I’m using setup_pest_interface_GISERA6.ipynb, and the working directory starts as w14c_test and ends up as w14c_template.

Control Data typo

On line 17 of pst_controldata.py, the default value for obsreref is mis-typed as noobsref (should be noobsreref).

transferring repo ownership

Just a heads up for those with active development forks: I'm going to transfer the ownership of the pyemu repo to usgs/pestpp. This is to make clear that pyemu is not owned by any one person and instead is something for all of us to work on (it needs lots of work!). Logistically, according to some random bits of information I found, I don't think it will interrupt anyone's fork - you should still be able to push and pull just as before.

pyemu draws outside par*bnd

I am trying to build a scaled parcov in the attached notebook and check the parvals from the pyemu draws, but it looks like the parvals from the draws are going outside the parlbnd and parubnd (line [27:] ). Parbnds should be enforced in the draws, yes?

pyemu_parcov_wes2.zip
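A Gaussian draw is unbounded, so some realizations will land outside the bounds unless the bounds are applied after drawing. The following is a minimal sketch in plain numpy, not pyemu's draw method. The means, standard deviations and bounds are made-up values, and clipping is only one possible way to enforce bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

# made-up values for two parameters
lower = np.array([0.1, 1.0])
upper = np.array([10.0, 5.0])
mean = np.array([1.0, 3.0])
std = np.array([5.0, 2.0])

# unbounded Gaussian draws: one row per realization, one column per parameter
draws = rng.normal(mean, std, size=(1000, 2))

# count realizations with at least one parameter outside its bounds
num_outside = int(((draws < lower) | (draws > upper)).any(axis=1).sum())

# enforce the bounds by moving out-of-range values back to the nearest bound
enforced = np.clip(draws, lower, upper)
```

Comparing `draws` against `lower` and `upper` in this way is also a quick check on any parameter ensemble exported to a numpy array.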

pst_handler and tied parameters

the pst_handler class doesn't handle tied parameters

Initial version of documentation - need help!

Thanks to @smwesten-usgs, we now have some initial documentation for pyemu! It needs lots of work, especially examples. All PRs accepted (no credit, bad credit, doesn't matter!) - this is an easy way to jump up your GH stats!