pystan's People

Contributors

ahartikainen, amas0, asottile, er-eis, jburroni, jgabry, michaelclerx, mikediessner, mjcarter95, riddell-stan, riddella, smoh

pystan's Issues

AssertionError with num_flat_params and constrained_param_names

Hi,

I have a simple model which won't sample.

import stan
import numpy as np

schools_code = """
data {
             int<lower=0> J;
             real y[J];
             real<lower=0> sigma[J];
}
parameters {
             real mu;
             real<lower=0> tau;
             real theta_tilde[J];
}
transformed parameters {
             real theta[J];
             for (j in 1:J)
                 theta[j] = mu + tau * theta_tilde[j];
}
model {
             mu ~ normal(0, 5);
             tau ~ cauchy(0, 5);
             theta_tilde ~ normal(0, 1);
             y ~ normal(theta, sigma);
}
generated quantities {
             vector[J] log_lik;
             vector[J] y_hat;
             for (j in 1:J) {
                 log_lik[j] = normal_lpdf(y[j] | theta[j], sigma[j]);
                 y_hat[j] = normal_rng(theta[j], sigma[j]);
             }
}
"""

data = {
    "J": 8,
    "y": np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0]),
    "sigma": np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0]),
}

posterior = stan.build(schools_code, data=data)
fit = posterior.sample(num_chains=4, num_samples=100, num_warmup=0)

This raises the following exception:

Traceback (most recent call last):
  File "run_sc.py", line 43, in <module>
    fit = stan_model.sample(num_chains=4, num_samples=100, num_warmup=0)
  File ".../miniconda3/envs/stan3/lib/python3.7/site-packages/stan/model.py", line 176, in sample
    save_warmup,
  File ".../miniconda3/envs/stan3/lib/python3.7/site-packages/stan/fit.py", line 58, in __init__
    assert num_flat_params == len(constrained_param_names)
AssertionError

This compiles and samples fine with PyStan 2.18.

black/flake should wrap on 100, not 99

The line-length argument to black governs the column at which wrapping occurs. So a limit of 100, I believe, limits us to 99 characters, which is what PEP 8 mentions.

(I'm happy with 120 but if arviz is going with 100, let's do that.)

All non-sampling, non-maximize related tests from PyStan 2 should pass

Tests from the old pystan repository need to be translated into pystan 3.

Compiling

  • test_extra_compile_args.py

Fixed param

  • test_fixed_param.py

Other

  • test_rstan_getting_started.py
  • test_rstan_stan_args.py
  • test_rstan_stanfit.py
  • test_stan_file_io.py
  • test_user_inits.py
  • test_utf8.py
  • test_vb.py
  • test_chains.py (rhat, ess)
  • test_ess.py
  • test_misc_args.py
  • test_misc.py
  • test_pickle.py

Extra

  • test_lookup.py

PyStan changes data inplace

I noticed that the following loop changes data inplace.

# in `data`: convert numpy arrays to normal lists
for key, value in data.items():
    if isinstance(value, np.ndarray):
        data[key] = value.tolist()

Is this what we want? I highly doubt that a deep copy would fill anyone's RAM.
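A minimal sketch of a non-mutating alternative to the loop above: build a new dict instead of assigning back into `data`. The helper name `to_plain_data` is hypothetical, and the check is duck-typed on `.tolist()` (which numpy arrays provide) so that the example runs with the standard library alone; the original loop uses `isinstance(value, np.ndarray)`.

```python
from array import array


def to_plain_data(data):
    """Return a new dict in which array-like values are converted to lists.

    The caller's dict is not modified. Values without a `tolist` method are
    carried over by reference, so this is a shallow copy, not a deep copy.
    """
    return {
        key: value.tolist() if hasattr(value, "tolist") else value
        for key, value in data.items()
    }


# `array` stands in for a numpy array here; both expose `.tolist()`.
original = {"J": 2, "y": array("d", [28.0, 8.0])}
converted = to_plain_data(original)
```

After the call, `original["y"]` is still an array, while `converted["y"]` is a plain list.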

`stan` is an (empty) package on PyPI

People may be confused and try to pip install stan. stan was claimed on Nov 18, 2018. But there seems to be nothing there. We might try sending a nice email to the owner asking for the name. The owner's name is in the setup.py of the (empty) package.

import stan

Hi,

I'm not sure if this is a viable option, but for PyStan 3 should we start to prefer the "import as" idiom in examples and common usage?

import pystan as stan

Or could we even use

import stan

The last option would probably fail for the many users who have a folder named stan in their working directory.

CI: Add a basic Windows test

In the event that changes in mingw or Stan math solve the Windows crashing bug, it would be nice to have an automatic way of learning about it. Testing Windows compilation and model running with mingw, even if it fails right now, would let us do that.

Segfault sampling from model, 8schools samples work fine

The model is straight from the manual (below). 8schools works fine.

program_code = """
data {
  int<lower=0> N;              // num individuals
  int<lower=1> K;              // num ind predictors
  int<lower=1> J;              // num groups
  int<lower=1> L;              // num group predictors
  int<lower=1,upper=J> jj[N];  // group for individual
  matrix[N, K] x;              // individual predictors
  row_vector[L] u[J];          // group predictors
  vector[N] y;                 // outcomes
}
parameters {
  corr_matrix[K] Omega;        // prior correlation
  vector<lower=0>[K] tau;      // prior scale
  matrix[L, K] gamma;          // group coeffs
  vector[K] beta[J];           // indiv coeffs by group
  real<lower=0> sigma;         // prediction error scale
}
model {
  tau ~ cauchy(0, 2.5);
  Omega ~ lkj_corr(2);
  to_vector(gamma) ~ normal(0, 5);
  {
    row_vector[K] u_gamma[J];
    for (j in 1:J)
      u_gamma[j] = u[j] * gamma;
    beta ~ multi_normal(u_gamma, quad_form_diag(Omega, tau));
  }
  for (n in 1:N)
    y[n] ~ normal(x[n] * beta[jj[n]], sigma);
}
"""

Type annotations are not at 100%

All code should be type annotated. (We need to at least start measuring this.)

  • 100% typing coverage on httpstan
  • 100% typing coverage on pystan

Document how to run tests

Dear @ariddell,

Newcomers would welcome a "Setup" section in the README. I have followed these steps to install the version I'm also hacking on:

$ conda create -n pystan-next python=3.6 numpy scipy cython -c conda-forge
$ source activate pystan-next
$ pip install -r test-requirements.txt
$ pip install -e .

But I couldn't run the test suite successfully. Running

$ python -m pytest

gave OS errors related to servers and processes ([Errno 98] Address already in use).
I haven't dived into the internals; I would like to work on #338 (comment) directly!

Thank you,
Marianne

Implement `save_warmup`

Relevant tests are in PyStan 2's test_extract.py. Should be relatively easy to do.

I myself have never used this. Is it widely used?

Use xarray-dataset to save all data

I think we should store all the data from the sampling straight to xarray.Dataset. This way the results are in a compact place and we have a good way to access it.

I'm not sure whether we should also store functions there or only the data.

See, e.g., arviz-devs/arviz#97

Run from JupyterLab / Jupyter Notebook

Running httpstan from JupyterLab / Jupyter Notebook fails because Jupyter is already running an asyncio event loop:

~\miniconda3\envs\stan3\lib\asyncio\base_events.py in run_forever(self)
    427             raise RuntimeError(
--> 428                 'Cannot run the event loop while another loop is running')
    429         self._set_coroutine_wrapper(self._debug)

RuntimeError: Cannot run the event loop while another loop is running

Using IPython works. (I'm running this on Windows, and Python crashes when I exit the interpreter, so I save the results with ArviZ to netCDF and use them later.)
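One possible workaround, sketched under assumptions: run the coroutine on a fresh event loop in a worker thread, so it never touches the loop Jupyter already owns. The helper `run_in_fresh_loop` and the coroutine `fake_sampling` are hypothetical stand-ins, not part of pystan or httpstan.

```python
import asyncio
import concurrent.futures


def run_in_fresh_loop(make_coro):
    """Run a coroutine to completion on a new event loop in a worker thread.

    `make_coro` is a zero-argument callable returning the coroutine, so the
    coroutine object is created inside the worker thread.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(make_coro())).result()


async def fake_sampling():
    # Hypothetical stand-in for the asynchronous calls made to httpstan.
    await asyncio.sleep(0)
    return "done"


async def notebook_cell():
    # Mimics a Jupyter cell: an event loop is already running at this point,
    # so calling asyncio.run() directly here would raise RuntimeError.
    return run_in_fresh_loop(fake_sampling)


result = asyncio.run(notebook_cell())
```

The call blocks the outer loop until the worker finishes, which matches the blocking behaviour of a synchronous `sample` call.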

Diagnostic messages not printed

In PyStan 2 I get this message (no hard error):

DIAGNOSTIC(S) FROM PARSER:
Unknown variable: to_vector

INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_b8f2cb83530f4085fcbf89c7040888ab NOW.

The diagnostic message should be printed for the user in PyStan 3.

Exception: variable does not exist;

I ran the eight schools example with PyStan 3.0 and got an error:
File "/opt/python3/lib/python3.6/site-packages/stan/model.py", line 189, in build
raise RuntimeError(response_payload["error"]["message"])
RuntimeError: Error calling param_names: `Exception: variable does not exist; processing stage=data initialization; variable name=J; base type=int (in 'unknown file name' at line 3)

Could you please check it? Thank you.

The source code is:
import stan
program_code = """
data {
int<lower=0> J; // number of schools
real y[J]; // estimated treatment effects
real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
real mu;
real<lower=0> tau;
real eta[J];
}
transformed parameters {
real theta[J];
for (j in 1:J)
theta[j] = mu + tau * eta[j];
}
model {
target += normal_lpdf(eta | 0, 1);
target += normal_lpdf(y | theta, sigma);
}
"""

data = {'J': 8,
'y': [28, 8, -3, 7, -1, 1, 18, 12],
'sigma': [15, 10, 16, 11, 9, 11, 10, 18]}

posterior = stan.build(program_code, data=data)
fit = posterior.sample(num_chains=4, num_samples=1000)

Non-ascii char in README.rst

pip install from source fails due to a non-ASCII character in README.rst (Python 3.6 on Ubuntu (miniconda) in Docker).

Could (r) work instead of ®?
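An alternative to replacing the character, assuming the failure comes from setup.py reading README.rst with the locale's default encoding (often ASCII in minimal Docker images; the issue does not confirm this), is to decode the file explicitly. The helper name `read_long_description` is hypothetical.

```python
import os
import tempfile


def read_long_description(path):
    """Read a README as UTF-8 regardless of the locale's default encoding."""
    with open(path, encoding="utf-8") as fh:
        return fh.read()


# Demonstration with a temporary README containing the registered sign.
with tempfile.TemporaryDirectory() as tmp:
    readme_path = os.path.join(tmp, "README.rst")
    with open(readme_path, "w", encoding="utf-8") as fh:
        fh.write("Stan\u00ae is a registered trademark.\n")
    text = read_long_description(readme_path)
```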

Warnings not shown to user

Warnings printed after sampling has finished are not shown to the user. For example (PyStan 2):

Elapsed Time: 3.53326 seconds (Warm-up)
               5.25664 seconds (Sampling)
               8.7899 seconds (Total)

WARNING:pystan:174 of 500 iterations ended with a divergence (34.8 %).
WARNING:pystan:Try running with adapt_delta larger than 0.8 to remove the divergences.
WARNING:pystan:294 of 500 iterations saturated the maximum tree depth of 10 (58.8 %)
WARNING:pystan:Run again with max_treedepth larger than 10 to avoid saturation

use keyword-only API

There's a new trick in Python 3 which is tailor-made for calls to the sampling functions and any functions which have lots of arguments. It is possible to require that arguments be passed by keyword.

Details: https://www.python.org/dev/peps/pep-3102/

I believe one uses it like this:

def hmc_nuts(*, iter, warmup, adapt, kwarg1, kwarg2, kwarg3, kwarg4):
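A small runnable sketch of the idea: a bare `*` in the signature makes every following parameter keyword-only, so positional calls fail with a `TypeError`. The parameter names and defaults below are illustrative assumptions, not the actual sampler signature.

```python
def hmc_nuts(*, num_samples=1000, num_warmup=1000, delta=0.8):
    """Hypothetical sampler entry point; every argument must be named."""
    return {"num_samples": num_samples, "num_warmup": num_warmup, "delta": delta}


# Keyword call: accepted.
settings = hmc_nuts(num_samples=100, num_warmup=0)

# Positional call: rejected before the function body runs.
try:
    hmc_nuts(100, 0)
except TypeError as exc:
    error_message = str(exc)
```

With many numeric arguments of the same type, this prevents silently swapping, say, the sample count and the warmup count.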

tqdm progress bar missing

.sample doesn't show a tqdm progress bar. What is the current situation with it?

Can we follow sampling in httpstan and just send iteration-count info over to pystan, which could then update the graph?

Setup travis

Travis needs a .travis.yml file.

Could not find .travis.yml, using standard configuration

All sampling-related tests from pystan 2 should pass

Tests from the old pystan repository need to be translated into pystan 3.

Compile

  • test_stanc.py

Sampling

  • test_basic_array.py
  • test_basic_matrix.py
  • test_basic_pars.py
  • test_basic.py (split into test_basic_normal.py and test_basic_bernoulli.py)
  • test_generated_quantities_seed.py (#47)
  • test_linear_regression.py

Exceptions need to be transmitted to pystan from httpstan

If an integer variable for data is out of bounds, httpstan throws the right exception:

ValueError: Exception: _7b946e826d3147f9a1b54029a1a8d47e28f04a956afab3f87ecfa5fba97b478e_namespace::_7b946e826d3147f9a1b54029a1a8d47e28f04a956afab3f87ecfa5fba97b478e: player0[k0__] is 0, but must be greater than or equal to 1  (in 'unknown file name' at line 6)

but this doesn't arrive at pystan in an aesthetically pleasing form.
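One way the message could be tidied before it reaches the user, as a rough sketch: strip the generated namespace prefix with a regular expression. The helper `tidy_stan_message` is hypothetical, and the pattern assumes the prefix always has the shape seen in the message above.

```python
import re

# Matches the generated "_<hex>_namespace::_<hex>: " prefix seen above.
_GENERATED_PREFIX = re.compile(r"_[0-9a-f]+_namespace::_[0-9a-f]+:\s*")


def tidy_stan_message(message):
    """Drop the generated model-namespace prefix from a Stan error message."""
    return _GENERATED_PREFIX.sub("", message)


model_hash = "7b946e826d3147f9a1b54029a1a8d47e28f04a956afab3f87ecfa5fba97b478e"
raw = (
    f"Exception: _{model_hash}_namespace::_{model_hash}: "
    "player0[k0__] is 0, but must be greater than or equal to 1  "
    "(in 'unknown file name' at line 6)"
)
tidy = tidy_stan_message(raw)
```

Messages that do not contain the prefix pass through unchanged.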

Add information from model to fit

Hi,

Could we add model.program_code, and maybe other information too, so that one can inspect the model and recreate it without a model instance?

They could go into a dict under fit.model_info.

Close connection if Exception is raised

Currently, the shutdown procedure is not called if model_string is invalid. This keeps the port (8080) locked.

import pystan
model_string = "parameters {vector y;} model {y ~ cauchy(0,1);}"
program = pystan.compile(model_string, {})
# AssertionError is raised

model_string2 = "parameters {vector[10] y;} model {y ~ cauchy(0,1);}"
program2 = pystan.compile(model_string2, {})
# OSError is raised 
# OSError: [Errno 48] error while attempting to bind on address ('127.0.0.1', 8080): address already in use

One solution is to add try-finally blocks for post and post_aiter functions.
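The try-finally idea can be sketched as follows. `FakeServer` and this `post` are hypothetical stand-ins for the real server handling and request functions; the point is only that `finally` runs the shutdown whether or not the request raises.

```python
class FakeServer:
    """Hypothetical stand-in for the local httpstan server."""

    def __init__(self):
        self.running = False

    def start(self):
        self.running = True

    def shutdown(self):
        self.running = False


def post(server, payload):
    """Send a request and always shut the server down afterwards."""
    server.start()
    try:
        if payload.get("invalid"):
            # Simulates the failure triggered by an invalid model string.
            raise AssertionError("invalid program code")
        return {"status": 200}
    finally:
        # Runs on success and on failure, so the port is released either way.
        server.shutdown()


server = FakeServer()
try:
    post(server, {"invalid": True})
    failed = False
except AssertionError:
    failed = True
```

After the failed call, `server.running` is `False`, so a second compile attempt would not find the port still bound.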

Failing to use caching

On Windows, trying to recompile a previously run model (IPython was closed and reopened between the runs) fails with:

In [3]: posterior = stan.build(program_code, data=data)
ERROR:aiohttp.server:Error handling request
Traceback (most recent call last):
  File "C:\Users\user\miniconda3\envs\stan3\lib\site-packages\aiohttp\web_protocol.py", line 418, in start
    resp = await task
  File "C:\Users\user\miniconda3\envs\stan3\lib\site-packages\aiohttp\web_app.py", line 458, in _handle
    resp = await handler(request)
  File "c:\users\user\github\httpstan\httpstan\views.py", line 154, in handle_show_params
    model_module = httpstan.models.import_model_extension_module(model_name, module_bytes)
  File "c:\users\user\github\httpstan\httpstan\models.py", line 172, in import_model_extension_module
    return _import_module(module_name, module_path)
  File "c:\users\user\github\httpstan\httpstan\models.py", line 136, in _import_module
    module = importlib.import_module(module_name)
  File "C:\Users\user\miniconda3\envs\stan3\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'model_d968dc8b91'
---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
<ipython-input-3-249fe4204f1a> in <module>
----> 1 posterior = stan.build(program_code, data=data)

c:\users\user\github\pystan-next\stan\model.py in build(program_code, data, random_seed)
    222         path, payload = f"/v1/{model_name}/params", {"data": data}
    223         response = requests.post(f"http://{host}:{port}{path}", json=payload)
--> 224         response_payload = response.json()
    225         if response.status_code != 200:
    226             raise RuntimeError(response_payload["message"])

~\miniconda3\envs\stan3\lib\site-packages\requests\models.py in json(self, **kwargs)
    895                     # used.
    896                     pass
--> 897         return complexjson.loads(self.text, **kwargs)
    898
    899     @property

~\miniconda3\envs\stan3\lib\json\__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    352             parse_int is None and parse_float is None and
    353             parse_constant is None and object_pairs_hook is None and not kw):
--> 354         return _default_decoder.decode(s)
    355     if cls is None:
    356         cls = JSONDecoder

~\miniconda3\envs\stan3\lib\json\decoder.py in decode(self, s, _w)
    340         end = _w(s, end).end()
    341         if end != len(s):
--> 342             raise JSONDecodeError("Extra data", s, end)
    343         return obj
    344

JSONDecodeError: Extra data: line 1 column 5 (char 4)

After removing httpstan's cached models folder and database, recompilation works. The cache is located at:

C:\Users\user\AppData\Local\httpstan\httpstan\Cache\0.8.1
