pastas / metran

Multivariate timeseries analysis using dynamic factor modelling.

Home Page: https://metran.readthedocs.io

License: MIT License

Python 100.00%
timeseries analysis hydrology groundwater pastas multivariate python

metran's People

Contributors

bdestombe, dbrakenhoff, martinvonk, wlberendrecht

metran's Issues

No solution when using engine="numba" for Kalman Filter

For some datasets, the optimization does not succeed due to a ZeroDivisionError in SciPy.

Can be reproduced with the following dataset: test.csv

import pandas
import metran

df = pandas.read_csv("test.csv", index_col=0, parse_dates=True)
mt = metran.Metran(df)
mt.solve()

The issue can be worked around by using SPKalmanFilter(engine="numpy"). I discussed this with @dbrakenhoff, and we suspect that engine="numpy" is more robust because it automatically fills in inf or nan for problematic logarithms and fractions, while engine="numba" does not.

A fix would be to let the user specify the SPKalmanFilter engine, which is currently only possible by changing the source code.
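The suspected difference can be illustrated without metran. The sketch below contrasts NumPy floating-point semantics, where division by zero and the logarithm of zero produce inf values, with plain Python float semantics, where the same operations raise exceptions. That the numba engine follows the Python-style behaviour is the suspicion voiced in this issue, not something verified here; the function names are illustrative and not part of metran.

```python
import math

import numpy as np


def numpy_style(x, y):
    """Divide and take the log with NumPy semantics: no exception is raised."""
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.float64(x) / np.float64(y), np.log(np.float64(y))


def python_style(x, y):
    """Same operations with plain Python floats: an exception is raised."""
    return x / y, math.log(y)


ratio, logval = numpy_style(1.0, 0.0)
print(ratio, logval)  # inf -inf

try:
    python_style(1.0, 0.0)
    raised = False
except ZeroDivisionError as err:
    raised = True
    print("ZeroDivisionError:", err)
```

With NumPy semantics the optimizer receives inf and can move on to another parameter set; with an exception the whole solve is aborted.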

Save/load metran models

It would be nice to build a pas-like file (see Pastas) to save models and load them again.

We could try to follow pastas' style and have the different relevant objects in a model contain to_dict() methods. Similar to pastas this would include a series keyword argument to store the dictionary with or without time series.

We can reuse the pastas encoder to write certain datatypes to json for storing it as a file. I guess we could use a .metran extension or something similar.
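A minimal sketch of what such a round trip could look like, using only the standard library. ToyModel, dump and load are hypothetical stand-ins, not the metran or pastas API, and the pastas encoder is replaced here by a plain ISO-format conversion of the timestamps.

```python
import json
import os
import tempfile
from datetime import datetime


class ToyModel:
    """Hypothetical stand-in for a model object; not the metran API."""

    def __init__(self, name, series, parameters):
        self.name = name
        self.series = series          # mapping of timestamp -> value
        self.parameters = parameters  # mapping of parameter name -> value

    def to_dict(self, series=True):
        """Return a JSON-friendly dict, with or without the time series."""
        data = {"name": self.name, "parameters": dict(self.parameters)}
        if series:
            data["series"] = {k.isoformat(): v for k, v in self.series.items()}
        return data

    @classmethod
    def from_dict(cls, data):
        raw = data.get("series", {})
        series = {datetime.fromisoformat(k): v for k, v in raw.items()}
        return cls(data["name"], series, data["parameters"])


def dump(model, path, series=True):
    with open(path, "w") as f:
        json.dump(model.to_dict(series=series), f, indent=2)


def load(path):
    with open(path) as f:
        return ToyModel.from_dict(json.load(f))


model = ToyModel("example", {datetime(2020, 1, 1): 1.5}, {"alpha": 10.0})
path = os.path.join(tempfile.mkdtemp(), "example.metran")
dump(model, path)
restored = load(path)
print(restored.name, restored.parameters)
```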

Metran seems to work!

Hi @wlberendrecht and @dbrakenhoff! I just tested it, and I managed to install metran and run the notebook. I had to adjust a few methods at the end of the notebook, but otherwise everything works! 👍

I still need to take a careful look at it, but it is nice that it already seems to work well technically.

Regards,
Raoul

Velicer's MAP Test results in 0 factors for Dynamic Factor Model notebook

David and I get different results depending on our machines: they get 0 factors with Velicer's MAP test, while I get 1 (as originally intended). The code for Velicer's MAP test:

import numpy as np


def _maptest(cov, eigvec, eigval):
    """Internal method to run Velicer's MAP test.

    Determines the number of factors to be used. This method includes
    two variations of the MAP test: the original and the revised MAP test.

    Parameters
    ----------
    cov : numpy.ndarray
        Covariance matrix.
    eigvec : numpy.ndarray
        Matrix with columns eigenvectors associated with eigenvalues.
    eigval : numpy.ndarray
        Vector with eigenvalues in descending order.

    Returns
    -------
    nfacts : integer
        Number factors according to MAP test.
    nfacts4 : integer
        Number factors according to revised MAP test.

    References
    ----------
    The original MAP test:
    Velicer, W. F. (1976). Determining the number of components
    from the matrix of partial correlations. Psychometrika, 41, 321-327.
    The revised (2000) MAP test i.e., with the partial correlations
    raised to the 4th power (rather than squared):
    Velicer, W. F., Eaton, C. A., and Fava, J. L. (2000). Construct
    explication through factor or component analysis: A review and
    evaluation of alternative procedures for determining the number
    of factors or components. Pp. 41-71 in R. D. Goffin and
    E. Helmes, eds., Problems and solutions in human assessment.
    Boston: Kluwer.
    """
    nvars = len(eigval)
    fm = np.array([np.arange(nvars, dtype=float), np.arange(nvars, dtype=float)]).T
    np.put(
        fm,
        [0, 1],
        ((np.sum(np.sum(np.square(cov))) - nvars) / (nvars * (nvars - 1))),
    )
    fm4 = np.copy(fm)
    np.put(
        fm4,
        [0, 1],
        (
            (np.sum(np.sum(np.square(np.square(cov)))) - nvars)
            / (nvars * (nvars - 1))
        ),
    )
    for m in range(nvars - 1):
        biga = np.atleast_2d(eigvec[:, : m + 1])
        partcov = cov - np.dot(biga, biga.T)
        # exit function with nfacts=1 if diag partcov contains negatives
        if np.amin(np.diag(partcov)) < 0:
            return 1, 1
        d = np.diag((1 / np.sqrt(np.diag(partcov))))
        pr = np.dot(d, np.dot(partcov, d))
        np.put(
            fm,
            [m + 1, 1],
            ((np.sum(np.sum(np.square(pr))) - nvars) / (nvars * (nvars - 1))),
        )
        np.put(
            fm4,
            [m + 1, 1],
            (
                (np.sum(np.sum(np.square(np.square(pr)))) - nvars)
                / (nvars * (nvars - 1))
            ),
        )
    minfm = fm[0, 1]
    nfacts = 0
    minfm4 = fm4[0, 1]
    nfacts4 = 0
    for s in range(nvars):
        fm[s, 0] = s
        fm4[s, 0] = s
        if fm[s, 1] < minfm:
            minfm = fm[s, 1]
            nfacts = s
        if fm4[s, 1] < minfm4:
            minfm4 = fm4[s, 1]
            nfacts4 = s
    return nfacts, nfacts4
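A note that may help when reading the function above: np.put addresses the flattened array, so an index list such as [m + 1, 1] names two flat positions, not a row and column pair. Whether that is the intent of the function is not stated in this issue; the sketch below only shows what NumPy does, on a small array made up for illustration.

```python
import numpy as np

fm = np.zeros((3, 2))
# Flat positions 2 and 1 of a (3, 2) array are fm[1, 0] and fm[0, 1].
np.put(fm, [2, 1], 9.0)
print(fm)

fm2 = np.zeros((3, 2))
# Plain indexing writes to a single element: row 2, column 1.
fm2[2, 1] = 9.0
print(fm2)
```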

On my device:

eigvec = array([[ 0.96750358, -0.25285732],  [ 0.96750358,  0.25285732]])
eigvec[0,0] = 0.9675035797467857
eigvec[0,1] = -0.25285731782401605
eigvec[1,0] = 0.9675035797467855
eigvec[1,1] = 0.2528573178240161

Later on this results in:

minfm = 1.000000000000007
minfm4 = 1.0000000000000142

which makes the following comparison evaluate to True for s = 1:

if fm[s, 1] < minfm:

with fm[s, 1] = 1.0
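The two values differ only in the fifteenth decimal, so the strict comparison is decided by rounding error, which can vary between machines. The sketch below uses the values reported above and contrasts the strict test with a tolerance-based one; the tolerance of 1e-12 is an arbitrary assumption, and which outcome is the desired one is a separate question that this sketch does not answer.

```python
import math

minfm = 1.000000000000007  # value reported above
candidate = 1.0            # fm[s, 1] for s = 1

# Strict comparison: decided by a difference of about 7e-15.
strict = candidate < minfm

# Tolerance-based comparison: values this close are treated as equal.
tolerant = candidate < minfm and not math.isclose(candidate, minfm, rel_tol=1e-12)

print(strict, tolerant)  # True False
```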

Update testing routine

  • Add tests for newer versions of Python
  • Add black / isort formatting
  • Use tox for testing routine, similar to Pastas

Version Specifier Deprecation

DEPRECATION: metran 0.2.0 has a non-standard dependency specifier numpy>=1.16.5matplotlib>=3.0. pip 23.3 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of metran or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at pypa/pip#12063
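The fused specifier numpy>=1.16.5matplotlib>=3.0 looks like two requirements joined without a separator. One common way this happens in Python packaging metadata is a missing comma between adjacent string literals, which Python silently concatenates. That this is the cause in metran 0.2.0 is an assumption; the lists below are illustrative and not copied from the metran source.

```python
# A missing comma makes Python join adjacent string literals into one.
broken = [
    "numpy>=1.16.5"
    "matplotlib>=3.0",
]

# With the comma, each requirement is its own entry.
fixed = [
    "numpy>=1.16.5",
    "matplotlib>=3.0",
]

print(broken)      # ['numpy>=1.16.5matplotlib>=3.0']
print(len(fixed))  # 2
```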

lag structure and autocorrelation of residuals

Hi,
Good work.

I am wondering whether metran assumes only an AR(1) process.

statsmodels has more flexible DFM configurations (see here for an example), but it lacks the ability to determine the optimal lags and number of structures.

Bug in kalmanfilter - decompose

cdf_means = [[]] * ncdf

An error occurs here for ncdf>1, as the lists produced by the loop are ncdf times too long. Consider the following example:

a = [[]] * 2
a[0].append([0])
print(a)

I think the code in metran expects [[[0]], []] to be printed, but instead [[[0]], [[0]]] is printed, because both entries of a refer to the same inner list.

In that case decompose_simulation retrieves cdf_means that are too long. Or it could well be that I don't understand the code.
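The behaviour described above is standard Python: multiplying a list copies references, not the inner lists. A short sketch of the aliasing and of the usual alternative, a list comprehension that builds a fresh list per entry (the variable names shared and independent are illustrative):

```python
ncdf = 2

# [[]] * n repeats a reference to one and the same inner list.
shared = [[]] * ncdf
shared[0].append([0])
print(shared)  # [[[0]], [[0]]]

# A comprehension creates a separate inner list for every entry.
independent = [[] for _ in range(ncdf)]
independent[0].append([0])
print(independent)  # [[[0]], []]
```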

Allow user to specify number of common dynamic components

It would be a nice feature to be able to override the automatic method to determine the number of common dynamic components.

For example:

import metran as mt

ml = mt.Metran(oseries, nfactors=2)
ml.solve()

Currently the FactorAnalysis class also contains a maxfactors argument that can presumably be used to limit the number of factors. However, this is not exposed through the Metran model class. So perhaps we should also expose this argument in the Metran class?

Additionally it would be nice to test the current implementation for estimating the number of factors on a dataset that results in 2 (or more) common components.

So in short:

  • Allow manual setting for number of factors
  • Expose maxfactors keyword argument in FactorAnalysis (if this makes sense)
  • Test metran with dataset that results in 2+ common dynamic components
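The first two points could interact roughly as sketched below. choose_nfactors is a hypothetical helper, not part of metran, and the rule that maxfactors also caps a manually supplied value is a design assumption made for this example.

```python
def choose_nfactors(estimated, nfactors=None, maxfactors=None):
    """Pick the number of factors; a manual value overrides the estimate.

    Hypothetical helper: `estimated` stands for the result of the automatic
    method, `nfactors` for a user override, `maxfactors` for an upper limit.
    """
    n = estimated if nfactors is None else nfactors
    if n < 0:
        raise ValueError("number of factors must be non-negative")
    if maxfactors is not None:
        n = min(n, maxfactors)
    return n


print(choose_nfactors(1))                # automatic estimate is used
print(choose_nfactors(1, nfactors=2))    # manual override wins
print(choose_nfactors(3, maxfactors=2))  # estimate is capped
```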

Todo list when Metran goes public

  • build documentation on readthedocs
  • code-style checks codacy
  • upload coverage codacy
  • add dev branch for development
  • add automatic PyPI release workflow
  • make first release

problem importing metran

Hi

I am getting this error when trying to load the package

AttributeError: module 'pastas' has no attribute 'stats'
