
JumpDiff: Non-parametric estimator for Jump-diffusion processes for Python

License: MIT License



jumpdiff

jumpdiff is a Python library with non-parametric Nadaraya–Watson estimators to extract the parameters of jump-diffusion processes. With jumpdiff one can extract the parameters of a jump-diffusion process from a one-dimensional timeseries, employing a kernel-density estimation method combined with a set of second-order corrections for a precise retrieval of the parameters from short timeseries.

Installation

To install jumpdiff, run

   pip install jumpdiff

Then in your favourite editor just use

   import jumpdiff as jd

Dependencies

The parameter estimation in this library depends solely on numpy and scipy. The mathematical formulae depend on sympy. The library stems from the kramersmoyal project, but functions independently of it [3].

Documentation

You can find the documentation here.

Jump-diffusion processes

The theory

Jump-diffusion processes [1], as the name suggests, are a mixed type of stochastic process with a diffusive and a jump term. One form of these processes which is mathematically tractable is given by the stochastic differential equation

   dX(t) = a(X(t)) dt + b(X(t)) dW(t) + ξ dJ(t),

which has four main elements: a drift term a(x), a diffusion term b(x), a jump amplitude term ξ, which is given by a Gaussian distribution, and finally a jump rate λ of the Poisson jump process J(t). You can find a good review on this topic in Ref. [2].
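For intuition, the SDE above can be integrated with a plain Euler scheme. The following is a minimal numpy-only sketch (a hypothetical stand-in for illustration, not the library's jd_process, which is introduced below); it assumes, as in the library's examples, that the xi passed in is the variance of the Gaussian jump amplitudes:

```python
import numpy as np

def euler_jump_diffusion(t_final, delta_t, a, b, xi, lamb, x0=0.0, seed=0):
    """Euler integration of dX = a(X)dt + b(X)dW + jump term, where the
    number of jumps per step is Poisson(lamb * delta_t) and each jump is
    drawn from N(0, xi)."""
    rng = np.random.default_rng(seed)
    n = int(t_final / delta_t)
    X = np.empty(n)
    X[0] = x0
    dW = rng.normal(0.0, np.sqrt(delta_t), n)   # Brownian increments
    dJ = rng.poisson(lamb * delta_t, n)         # jumps per time step
    for i in range(1, n):
        X[i] = X[i-1] + a(X[i-1]) * delta_t + b(X[i-1]) * dW[i]
        for _ in range(dJ[i]):                  # add each jump separately
            X[i] += rng.normal(0.0, np.sqrt(xi))
    return X

X = euler_jump_diffusion(10, 0.001, a=lambda x: -0.5*x, b=lambda x: 0.75,
                         xi=2.5, lamb=1.75)
```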

Integrating a jump-diffusion process

Let us use the functions in jumpdiff to generate a jump-diffusion process, and subsequently retrieve the parameters. This is a good way to understand the usage of the integrator and the non-parametric retrieval of the parameters.

First we need to load our library. We will call it jd

import jumpdiff as jd

Let us thus define a jump-diffusion process and use jd_process to integrate it. Note that the drift and diffusion must be given as functions.

# integration time and time sampling
t_final = 10000
delta_t = 0.001

# A drift function
def a(x):
    return -0.5*x

# and a (constant) diffusion term
def b(x):
    return 0.75

# Now define a jump amplitude and rate
xi = 2.5
lamb = 1.75

# and simply call the integration function
X = jd.jd_process(t_final, delta_t, a=a, b=b, xi=xi, lamb=lamb)

This will generate a jump-diffusion process X of length int(10000/0.001) with the given parameters.

Using jumpdiff to retrieve the parameters

Moments and Kramers–Moyal coefficients

Take the timeseries X and use the function moments to retrieve the conditional moments of the process. For now let us focus on the shortest time lag, so we can best approximate the Kramers–Moyal coefficients. For this case we can simply employ

edges, moments = jd.moments(timeseries = X)

The array edges contains the bin positions of our space, and the array moments records all six powers/orders of our conditional moments. Let us take a look at these before we proceed, to get acquainted with them.
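Under the hood, the conditional moments are kernel-weighted averages of the timeseries increments. A simplified, numpy-only binned sketch of the same idea (hypothetical helper, not jumpdiff's Nadaraya–Watson kernel implementation):

```python
import numpy as np

def conditional_moments(X, bins=40, powers=(1, 2)):
    """Binned estimate of the conditional moments
    M_m(x) = E[(X(t+dt) - X(t))^m | X(t) = x].
    A histogram stand-in for the kernel estimator used by jd.moments."""
    dX = np.diff(X)
    edges = np.linspace(X.min(), X.max(), bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # assign each starting point X(t) to a bin
    idx = np.clip(np.digitize(X[:-1], edges) - 1, 0, bins - 1)
    M = np.zeros((max(powers) + 1, bins))
    for k in range(bins):
        mask = idx == k
        if mask.any():
            for m in powers:
                M[m, k] = np.mean(dX[mask] ** m)
    return centers, M
```

For a pure drift a(x) = -0.5*x with no noise, the first moment recovered this way is close to -0.5*x*delta_t, as expected.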

We can plot the first moment with any conventional plotter, so let us use here pyplot from matplotlib

import matplotlib.pyplot as plt

# we want the first power, so we need 'moments[1,...]'
plt.plot(edges, moments[1,...])

The first moment here (i.e., the first Kramers–Moyal coefficient) is given solely by the drift term that we have selected, -0.5*x.

And the second moment (i.e., the second Kramers–Moyal coefficient) is a mixture of the contributions of the diffusive term b(x) and the jump terms ξ and λ.

You have this stored in moments[2,...].

Retrieving the jump-related terms

Naturally one of the most pertinent questions when addressing jump-diffusion processes is the possibility of recovering these same parameters from data. For the given jump-diffusion process we can use the jump_amplitude and jump_rate functions to non-parametrically estimate the jump amplitude and jump rate terms.

After having the moments in hand, all we need is

# first estimate the jump amplitude
xi_est = jd.jump_amplitude(moments = moments)

# and now estimated the jump rate
lamb_est = jd.jump_rate(moments = moments)

which in our case resulted in (xi_est) ξ = 2.43 ± 0.17 and (lamb_est) λ = 1.744 * delta_t (don't forget to divide lamb_est by delta_t)!
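These estimators rest on the leading-order relations for the increment moments of a jump-diffusion process, M4 ≈ 3 λ σ_ξ⁴ Δt and M6 ≈ 15 λ σ_ξ⁶ Δt (see Ref. [1]), so that σ_ξ² = M6/(5 M4) and λ = M4/(3 σ_ξ⁴ Δt). The following is a back-of-envelope numpy check of these relations on directly simulated increments; it is an illustrative sketch under the assumption that xi denotes the jump-amplitude variance, not jumpdiff's actual implementation (which works on the conditional moments with second-order corrections):

```python
import numpy as np

rng = np.random.default_rng(2)
n, delta_t, b, xi, lamb = 1_000_000, 0.001, 0.75, 2.5, 1.75

dW = rng.normal(0.0, np.sqrt(delta_t), n)           # diffusive increments
dJ = rng.poisson(lamb * delta_t, n)                 # jumps per step
jumps = rng.normal(0.0, 1.0, n) * np.sqrt(dJ * xi)  # sum of dJ N(0, xi) jumps
dX = b * dW + jumps                                 # drift is O(delta_t), ignored here

M4, M6 = np.mean(dX**4), np.mean(dX**6)
xi_est = M6 / (5 * M4)                      # estimate of sigma_xi^2 (true value: 2.5)
lamb_est = M4 / (3 * xi_est**2) / delta_t   # estimate of lambda (true value: 1.75)
```

The higher moments are noisy, so the estimates only land near the true values for long timeseries; this is exactly why the second-order corrections in jumpdiff matter for short ones.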

Other functions and options

Included in this package is also the Milstein scheme of integration, particularly important when the diffusion term has some spatial dependence on x. moments can also calculate the conditional moments for different time lags, via the parameter lag.
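The Milstein correction amounts to adding a 0.5*b*b'*(dW^2 - delta_t) term to the Euler update, which restores first-order strong convergence when b depends on x. A one-step sketch (a hypothetical helper for illustration, not the library's API):

```python
def milstein_step(x, a, b, b_prime, delta_t, dw):
    """One Milstein step for dX = a(X)dt + b(X)dW: the Euler step plus the
    0.5*b*b'*(dW^2 - dt) correction, which vanishes when b is constant."""
    euler = x + a(x) * delta_t + b(x) * dw
    return euler + 0.5 * b(x) * b_prime(x) * (dw**2 - delta_t)
```

With a constant diffusion term (b' = 0) this reduces exactly to the Euler step, which is why the examples above are insensitive to the choice of scheme.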

In formulae the set of formulas needed to calculate the second-order corrections is given (in sympy).

Contributions

We welcome reviews and ideas from everyone. If you want to share your ideas, upgrades, doubts, or simply report a bug, open an issue here on GitHub, or contact us directly. If you need help with the code, the theory, or the implementation, drop us an email. We abide by a Conduct of Fairness.

Changelog

  • Version 0.4 - Designing a set of self-consistency checks, the documentation, examples, and a trial code. Code at PyPI.
  • Version 0.3 - Designing a straightforward procedure to retrieve the jump amplitude and jump rate functions, alongside an easy sympy display of the corrections.
  • Version 0.2 - Introducing the second-order corrections to the moments.
  • Version 0.1 - Design and implementation of the moments functions, generalising kramersmoyal's km.

Literature and Support

History

This project was started in 2017 at the neurophysik by Leonardo Rydin Gorjão, Jan Heysel, Klaus Lehnertz, and M. Reza Rahimi Tabar, and separately by Pedro G. Lind, at the Department of Computer Science, Oslo Metropolitan University. From 2019 to 2021, Pedro G. Lind, Leonardo Rydin Gorjão, and Dirk Witthaut developed a set of corrections and an implementation for python, presented here.

Funding

Helmholtz Association Initiative Energy System 2050 - A Contribution of the Research Field Energy and grant No. VH-NG-1025, and the STORM - Stochastics for Time-Space Risk Models project of the Research Council of Norway (RCN), No. 274410.


Bibliography

[1] Tabar, M. R. R. Analysis and Data-Based Reconstruction of Complex Nonlinear Dynamical Systems. Springer, International Publishing (2019). Chapter: Stochastic Processes with Jumps and Non-vanishing Higher-Order Kramers–Moyal Coefficients.

[2] Friedrich, R., Peinke, J., Sahimi, M., Tabar, M. R. R. Approaching complexity by stochastic methods: From biological systems to turbulence. Physics Reports 506, 87–162 (2011).

[3] Rydin Gorjão, L., Meirinhos, F. kramersmoyal: Kramers–Moyal coefficients for stochastic processes. Journal of Open Source Software 4(44) (2019).

Extended Literature

You can find further reading on SDEs, non-parametric estimations, and the general principles of the Fokker–Planck equation, Kramers–Moyal expansion, and related topics in the classic (physics) books

  • Risken, H. The Fokker–Planck equation. Springer, Berlin, Heidelberg (1989).
  • Gardiner, C.W. Handbook of Stochastic Methods. Springer, Berlin (1985).

And an extensive review on the subject here

jumpdiff's People

Contributors

lrydin

jumpdiff's Issues

Reproducing the results in the examples fails

Thank you for this great work!

As a mechanical engineer, I currently do not have a strong mathematical background regarding the formulation of stochastic processes, but I would like to use this package to perform a Monte Carlo simulation for energy system analysis, e.g., by sampling projections of the development of energy prices. I tried to reproduce the results from the code snippets given by the examples in the documentation and was not able to achieve the same results. I have done the following:

Firstly, I have simulated a jump diffusion process using the same code given by your documentation:

import jumpdiff as jd

# integration time and time sampling
t_final = 10000
delta_t = 0.001

# A drift function
def a(x):
    return -0.5*x

# and a (constant) diffusion term
def b(x):
    return 0.75

# Now define a jump amplitude and rate
xi = 2.5
lamb = 1.75

# and simply call the integration function
X = jd.jd_process(t_final, delta_t, a=a, b=b, xi=xi, lamb=lamb)

The plot of X is given below, and so far, it looks similar to the results given in the examples.

Then I tried the ability of the package to estimate the given parameters, and the results started to become a bit confusing. I used the following code according to the documentation:

import matplotlib.pyplot as plt

edges, moments = jd.moments(timeseries = X)

# we want the first power, so we need 'moments[1,...]'
plt.plot(edges, moments[1,...], label='D1')
plt.plot(edges, moments[2,...], label='D2')

plt.xlabel('x') 
plt.ylabel('D_m(x)')  
plt.legend()

plt.xlim(-5, 5) 

plt.show()

The results show the first two Kramers-Moyal coefficients. In comparison with the results in the documentation for the same simulation, the values of the coefficients seem to be completely different:


I have also looked into your paper. Figure 2 shows the estimation results for the coefficients based on the same simulation. Here, the second Kramers-Moyal coefficient D2 shows again a different value (shown below). In this case, it seems to directly correspond to the constant value b that was chosen in the model parametrization, whereas the former results are stating that the resulting curve is also dependent on the parameters of the jump process.


Could you clarify my observations a bit and provide some more guidance on how I can retrieve the final parameters of the jump diffusion process based on the Kramers-Moyal coefficients? There might be a huge misunderstanding on my side since I am currently not very familiar with the deep theory behind the package.

Thank you so far!

negative lambda and others

This is a great package. And I have some questions.
Recently I come across the KM perspective to estimate levy triplet by virtue of the book of Mr Tabar published in 2019. Statisticians, especially French scholars, have done a lot on it. However, real time series are traces of dynamic processes ruled by some unknown laws. Methods that lack the support of some natural law governance yet only focus on data may not be the right road. And the KM perspective shows me a direction.
I tried your package last night with some real time series, but many of them are negative lambda. Besides, I want to use the estimated lambda and xi to generate surrogate process to compare with the original time series (I also want to get drift and diffusion from the moments but moments are not scalar,thus I calculated from the original), but the generated X has many nan. Could you give some guidance?
And the KM perspective is a nonparametric method, which may have some similarities with the statistical methods worked by, e.g. Fabienne Comte or Jean Jacod. Could you show some details on them?
Finally neural network is nonparametric method to approximate the unknown distribution by stacked neural layers, do you have some plan to try them?

Many thanks

Example notation

Hi @LRydin, thank you for this great package. I have a question regarding the notation in the definition of the jump term. The parameter xi is referred to as the jump amplitude, which is however normally distributed with variance sigma_xi. Is the xi parameter as set in the example actually the standard deviation of the jump amplitude?

Furthermore, is there a way to set the initial condition, e.g., X_0 = 0?

Thanks again!
Kyriakos

Mistake in `jd_process.py`

Thank you for your great effort on jumpdiff; it is the first package for both simulation and estimation in research on SDEs with jumps in Python. But I found some mistakes in your simulation parts.

In the code file jd_process.py, you directly treat the jump as dJ[i] * np.random.normal(loc = 0, scale = np.sqrt(xi)) for both the Euler and Milstein schemes, which is a wrong simulation method. The correct method is to treat the jump as the sum of dJ[i] i.i.d. samples drawn from the normal distribution. The data generated the wrong way will lead to an estimation bias in your program. The example you used in the introduction on GitHub and in the publication has a very small lambda, and dJ[i] is mostly equal to 1, so the data generated with the wrong simulation method are almost the same as with the right one. If you change lambda to 50, the estimation will crash with a big bias of about 30, as shown in the code file attached. After I changed the simulation data using the correct simulation method, the estimation worked for both small and big lambda.

I think your estimation method is good without any problem, since I have tested other samples with the right simulation code, but you should correct the mistakes in jd_process.py.

Wrong Code in jd_process.py:

# Integration, either Euler
if solver == 'Euler':
    for i in range(1, length):
        X[i] = X[i-1] + a(X[i-1]) * delta_t + b(X[i-1]) * dw[i]
        if dJ[i] > 0.:
            X[i] += dJ[i] * np.random.normal(loc = 0, scale = np.sqrt(xi))

if solver == 'Milstein':
    # Generate corrective terms of the Milstein integration method
    dw_2 = (dw**2 - delta_t) * 0.5

    for i in range(1, length):
        X[i] = X[i-1] + a(X[i-1]) * delta_t + b(X[i-1]) * dw[i] \
                + b(X[i-1]) * b_prime(X[i-1]) * dw_2[i]
        if dJ[i] > 0.:
            X[i] += dJ[i] * np.random.normal(loc = 0, scale = np.sqrt(xi))

Correct Code for jd_process.py:

# Integration, either Euler
if solver == 'Euler':
    for i in range(1, length):
        X[i] = X[i-1] + a(X[i-1]) * delta_t + b(X[i-1]) * dw[i]
        if dJ[i] > 0.:
            for j in range(int(dJ[i])):
                X[i] += np.random.normal(loc = 0, scale = np.sqrt(xi))

if solver == 'Milstein':
    # Generate corrective terms of the Milstein integration method
    dw_2 = (dw**2 - delta_t) * 0.5

    for i in range(1, length):
        X[i] = X[i-1] + a(X[i-1]) * delta_t + b(X[i-1]) * dw[i] \
                + b(X[i-1]) * b_prime(X[i-1]) * dw_2[i]
        if dJ[i] > 0.:
            for j in range(int(dJ[i])):
                X[i] += np.random.normal(loc = 0, scale = np.sqrt(xi))

Simulation and estimation with big lambda:

# -*- coding: utf-8 -*-
"""
Created on Fri Feb 10 15:19:29 2023

@author: JChonpca_Huang
"""
import jumpdiff as jd
import matplotlib.pyplot as plt

# integration time and time sampling
t_final = 1000
delta_t = 0.001

# A drift function
def a(x):
    return -0.5*x

# and a (constant) diffusion term
def b(x):
    return 0.75

# Now define a jump amplitude and rate

xi = 2.5

lamb = 50


# and simply call the integration function
X = jd.jd_process(t_final, delta_t, a=a, b=b, xi=xi, lamb=lamb)

plt.plot(X)
plt.show()

edges, moments = jd.moments(timeseries = X)


# we want the first power, so we need 'moments[1,...]'
plt.plot(edges, moments[1,...])
plt.show()

plt.plot(edges, moments[2,...])
plt.show()

# first estimate the jump amplitude
xi_est = jd.jump_amplitude(moments = moments)

print(xi_est)

# and now estimated the jump rate
lamb_est = jd.jump_rate(moments = moments)

print(lamb_est/delta_t)
