Comments (4)

auderson commented on June 1, 2024

The rolling algorithms in pandas generally use an online (incrementally updated) formulation for performance, so results can differ slightly because of floating-point artifacts.
If you really need consistent results, you can try rolling.apply(lambda x: x.kurt()), but this is much slower.
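
As a rough illustration of the two approaches mentioned above, the sketch below computes the rolling kurtosis both ways on the 27-element series discussed in this thread. The variable names are illustrative, and the exact last digits printed may vary with the pandas version, so no expected output is shown.

```python
import numpy as np
import pandas as pd

# Series taken from this thread: 19 ones, 7 minus ones, 1 one.
s = pd.Series([1] * 19 + [-1] * 7 + [1] * 1, dtype="float64")

# Fast path: pandas' built-in rolling kurtosis (online updating).
online = s.rolling(20).kurt()

# Slow path: recompute the kurtosis from scratch for every window.
recomputed = s.rolling(20).apply(lambda x: x.kurt(), raw=False)

max_online = online.max()
max_recomputed = recomputed.max()

# The two results should agree up to floating-point noise.
agree = np.allclose(online, recomputed, equal_nan=True)

print(repr(max_online), repr(max_recomputed), agree)
```

When comparing such results, a tolerance-based check (for example np.allclose or math.isclose) is usually a safer choice than testing for exact equality.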

HaloCollider commented on June 1, 2024

I understand that the online updating method may cause numerical instability along a time series. But whether the maximum is exactly 20.0 or slightly larger than 20.0 appears to be a characteristic of the series as a whole. In other words, you cannot get both a 20.0 and a value larger than 20.0 within a single series.

For example:
pd.Series([1] * 19 + [-1] * 7 + [1] * 1).rolling(20).kurt().max() gives 20.00000000000001,
while
pd.Series([1] * 19 + [-1] * 7 + [1] * 2).rolling(20).kurt().max() gives 20.0.
The only difference is one additional 1 at the tail, which should not affect the maximum kurtosis from the point of view of online updating.

(That's also why I'm calling it inconsistency rather than instability.)
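
To make the argument above concrete, here is a small sketch that locates the window with the maximum kurtosis in both series and checks that the two windows hold identical values. The names s1, s2, peak1 and peak2 are illustrative only.

```python
import pandas as pd

s1 = pd.Series([1] * 19 + [-1] * 7 + [1] * 1, dtype="float64")
s2 = pd.Series([1] * 19 + [-1] * 7 + [1] * 2, dtype="float64")

k1 = s1.rolling(20).kurt()
k2 = s2.rolling(20).kurt()

# Index label of the last element of the window with the largest kurtosis
# (idxmax skips the leading NaN values by default).
peak1 = int(k1.idxmax())
peak2 = int(k2.idxmax())

# Extract the 20 values of each peak window and compare them.
window1 = s1.iloc[peak1 - 19 : peak1 + 1].tolist()
window2 = s2.iloc[peak2 - 19 : peak2 + 1].tolist()
same_window = window1 == window2

print(peak1, peak2, same_window)
print(repr(k1.max()), repr(k2.max()))
```

If the peak windows are identical, any difference in the reported maximum has to come from something computed over the whole series rather than from the window itself.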

auderson commented on June 1, 2024

Looks like it's due to a demean operation prior to calculation:

    # First pass: mean of the non-NaN values over the whole input
    for i in range(0, V):
        val = values_copy[i]
        if val == val:  # False only for NaN
            nobs_mean += 1
            sum_val += val
    mean_val = sum_val / nobs_mean
    # Other cases would lead to imprecision for smallest values
    if min_val - mean_val > -1e4:
        # Shift every value by the mean rounded to an integer
        mean_val = round(mean_val)
        for i in range(0, V):
            values_copy[i] = values_copy[i] - mean_val

HaloCollider commented on June 1, 2024

Thanks a lot. This solves my issue. I had checked the source earlier but missed the demean operation, so my own version produced consistent results, which is what caused the confusion.

I found that the exact thresholds for the proportion of 1s in a series are 0.25 and 0.75, i.e., a mean of -0.5 and 0.5. Distributions outside the range 0.25 to 0.75 lead to 20.0.
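
The thresholds above can be sketched in plain Python. This is a hypothetical re-creation of the demean step quoted earlier, not the pandas implementation: c_round and demean_offset are made-up helper names, it assumes the round call in the Cython code behaves like C's round() (half away from zero, unlike Python's built-in round), and it assumes the offset is only subtracted when the guard passes.

```python
import math


def c_round(x: float) -> float:
    """Round half away from zero (assumed to match C's round())."""
    return math.copysign(math.floor(abs(x) + 0.5), x)


def demean_offset(values) -> float:
    """Offset the quoted demean step would subtract from every value.

    Hypothetical helper: returns 0.0 when the guard condition fails.
    """
    finite = [v for v in values if v == v]  # drop NaN
    mean_val = sum(finite) / len(finite)
    min_val = min(finite)
    if min_val - mean_val > -1e4:
        return c_round(mean_val)
    return 0.0


s1 = [1] * 19 + [-1] * 7 + [1] * 1  # proportion of 1s: 20/27, mean 13/27
s2 = [1] * 19 + [-1] * 7 + [1] * 2  # proportion of 1s: 21/28, mean 0.5

offset1 = demean_offset(s1)
offset2 = demean_offset(s2)
print(offset1, offset2)
```

Under these assumptions the first series is left unshifted while the second is shifted by 1, so the online algorithm sees different values in the two cases even though the peak window is the same.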
