Comments (4)
The rolling algorithms in pandas generally use the online updating version for performance. It is expected that the result can be a bit different due to floating point artifacts.
If you really want a consistent result you can try rolling.apply(lambda x: x.kurt()), but this is much slower.
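A minimal sketch of the two approaches side by side, assuming a hypothetical random series and a window of 20 (neither comes from the original report). It only shows that the fast path and the per-window recomputation agree up to small floating point differences; the exact size of the difference will depend on the data and the pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; any float series would do here.
rng = np.random.default_rng(0)
s = pd.Series(rng.normal(size=200))

window = 20

# Fast path: pandas' built-in rolling kurtosis (online updating).
fast = s.rolling(window).kurt()

# Slow path: recompute the kurtosis from scratch for every window.
slow = s.rolling(window).apply(lambda x: x.kurt())

# Largest absolute disagreement between the two paths (NaNs are skipped).
max_abs_diff = (fast - slow).abs().max()
print(max_abs_diff)
```

The slow path calls a Python function once per window, which is why it is much slower on long series.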
from pandas.
I understand that the online updating method may cause numerical instability in a time-series manner. But whether the result is 20.0 or larger than 20.0 is an overall characteristic of a series. In other words, you cannot get both a 20.0 and a value larger than 20.0 within a single series.
For example:
pd.Series([1] * 19 + [-1] * 7 + [1] * 1).rolling(20).kurt().max()
gives 20.00000000000001, while
pd.Series([1] * 19 + [-1] * 7 + [1] * 2).rolling(20).kurt().max()
gives 20.0.
Their difference is just an additional 1 at the tail, which doesn't affect the max kurtosis from the view of online updating.
(That's also why I'm calling it inconsistency rather than instability.)
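The two cases can be put into one script for comparison. This is only a sketch: the names short, long and ref are made up here, and the trailing digits printed will depend on the pandas version, so no expected output is shown. The reference value is the kurtosis of the first full window (19 ones and a single -1) computed directly with Series.kurt(), which is the window where the maximum occurs in both series.

```python
import pandas as pd

window = 20

# The two series from the report; they differ only by one extra 1 at the tail.
short = pd.Series([1] * 19 + [-1] * 7 + [1] * 1)
long = pd.Series([1] * 19 + [-1] * 7 + [1] * 2)

max_short = short.rolling(window).kurt().max()
max_long = long.rolling(window).kurt().max()

# Direct (non-rolling) kurtosis of the first full window, for reference.
ref = short.iloc[:window].kurt()

# repr() shows every digit, so a trailing ...01 is visible if present.
print(repr(max_short), repr(max_long), repr(ref))
```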
from pandas.
Looks like it's due to a demean operation prior to calculation:
pandas/pandas/_libs/window/aggregations.pyx
Lines 828 to 838 in 283a2dc
from pandas.
Thanks a lot. This solves my issue. I had checked the source before but missed the demean operation, so my own version produced consistent results, which is what confused me.
I found that the exact thresholds for the proportion of 1 in a series are 0.25 and 0.75, i.e., a series mean of -0.5 and 0.5. Distributions outside the range 0.25 to 0.75 lead to 20.0.
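To illustrate why a shift applied before the calculation can change the last digit, here is a rough pure-Python sketch. It is not the pandas code: kurt_from_power_sums and its shift parameter are hypothetical, and it assumes the demean step amounts to subtracting one constant from every value before the power sums are accumulated. The means -0.5 and 0.5 are where a mean rounded to the nearest integer would switch between 0 and ±1, which would fit the thresholds above, but that link is an assumption as well.

```python
def kurt_from_power_sums(values, shift=0.0):
    """Bias-corrected excess kurtosis from power sums of (value - shift)."""
    n = float(len(values))
    x = xx = xxx = xxxx = 0.0
    for v in values:
        d = v - shift              # hypothetical demean step
        x += d
        xx += d * d
        xxx += d * d * d
        xxxx += d * d * d * d
    a = x / n                                   # mean
    b = xx / n - a * a                          # second central moment
    c = xxx / n - a ** 3 - 3 * a * b            # third central moment
    m4 = xxxx / n - a ** 4 - 6 * b * a * a - 4 * c * a  # fourth central moment
    k = (n * n - 1.0) * m4 / (b * b) - 3.0 * (n - 1.0) ** 2
    return k / ((n - 2.0) * (n - 3.0))


# The window holding the maximum in both series: 19 ones and one -1.
win = [1.0] * 19 + [-1.0]

# Same window, same formula; only the shift differs.
unshifted = kurt_from_power_sums(win, shift=0.0)
shifted = kurt_from_power_sums(win, shift=1.0)
print(repr(unshifted), repr(shifted))

# Means of the two full series from the earlier example.
short = [1] * 19 + [-1] * 7 + [1] * 1
long = [1] * 19 + [-1] * 7 + [1] * 2
mean_short = sum(short) / len(short)
mean_long = sum(long) / len(long)
print(mean_short, mean_long)
```

Both calls are mathematically equal to 20; any difference in the printed digits comes only from floating point rounding in the intermediate terms, which change with the shift. The first series has a mean just below 0.5 and the second sits exactly at 0.5, matching the threshold described above.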
from pandas.