aura-healthcare / hrv-analysis

Package for Heart Rate Variability analysis in Python

License: GNU General Public License v3.0

Python 100.00%

feature-engineering heart-rate-variability python rr-interval

hrv-analysis's Issues

is_outlier - returns true when it's not an outlier

Hey,
Was just perusing your code, looking to learn more about HRV data cleanup, classifying and cleaning ectopic beats etc...
I stared at this code for like 15 minutes scratching my head until I realized...
Did you misname your is_outlier method, or do I not understand your code?
It looks like, even from the description, is_outlier returns true on a valid RR-interval... i.e. not an outlier?
In remove_ectopic_beats you are testing is_outlier, then only incrementing the outlier count if it's false.

Anyways, if I was correct here, you might want to clean it up for readability.
Thanks, and thanks for sharing your code!!

hrv-analysis/hrvanalysis/preprocessing.py

Line 149 in 285019a

    
           def is_outlier(rr_interval: int, next_rr_interval: float, method: str = "malik",
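To illustrate the naming concern, here is a minimal sketch, not the library's code. It assumes the "malik" rule treats an interval as ectopic when it differs from the previous one by more than 20%. The function names and the 0.2 default are assumptions made for this example only.

```python
def is_valid_rr_interval(rr_interval: float, next_rr_interval: float,
                         max_relative_change: float = 0.2) -> bool:
    """Return True when the next interval stays within the allowed relative change."""
    return abs(rr_interval - next_rr_interval) <= max_relative_change * rr_interval


def is_outlier(rr_interval: float, next_rr_interval: float,
               max_relative_change: float = 0.2) -> bool:
    """Return True only when the next interval really is an outlier."""
    return not is_valid_rr_interval(rr_interval, next_rr_interval, max_relative_change)
```

With this naming, a caller can increment its outlier count when is_outlier is true, which reads the way the name suggests.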

Implement Wavelet QRS Complex detection on ECG signal

Recently, different methods for QRS complex detection on raw ECG signals have been developed.

The most reliable ones use wavelet decomposition to extract the QRS complex information (and also the other segments of the cardiac cycle).
Wavedet algorithm

The algorithm is already implemented in Matlab
Matlab ECG kit

Add features Logarithmic index & differential index in get_geometrical_features

Add the Logarithmic index & differential index in the "get_geometrical_features" method.

reference for the features:
https://www.escardio.org/static_file/Escardio/Guidelines/Scientific-Statements/guidelines-Heart-Rate-Variability-FT-1996.pdf

Question of citation

Hello.

I'm writing a research paper using this library. Is there a citation I should use?

Regards,
Seungwoo

Add tests for frequency domain analysis

Add multiple tests for frequency domain analysis methods.
Use Kubios software as a reference.
Download tool here: https://www.kubios.com/hrv-standard/

add a warning if Nyquist frequency isn't respected (Frequency domain analysis methods)

see: https://en.wikipedia.org/wiki/Nyquist_frequency

https://www.escardio.org/static_file/Escardio/Guidelines/Scientific-Statements/guidelines-Heart-Rate-Variability-FT-1996.pdf

https://www.ahajournals.org/doi/pdf/10.1161/01.CIR.81.6.1803
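A minimal sketch of what such a warning could look like. It assumes the check is applied to the sampling frequency (in Hz) used to resample the RR series, and that the highest band of interest ends at 0.4 Hz, the usual upper bound of the HF band. The function name and parameters are hypothetical and not part of the package.

```python
import warnings


def check_nyquist(sampling_frequency: float,
                  highest_band_frequency: float = 0.4) -> bool:
    """Warn and return False when the band exceeds the Nyquist frequency."""
    nyquist_frequency = sampling_frequency / 2.0
    if highest_band_frequency > nyquist_frequency:
        warnings.warn(
            f"Highest analysed frequency ({highest_band_frequency} Hz) is above "
            f"the Nyquist frequency ({nyquist_frequency} Hz).",
            UserWarning,
        )
        return False
    return True
```

For example, check_nyquist(4.0) passes silently, while check_nyquist(0.5) warns because 0.4 Hz is above 0.25 Hz.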

Create Tutorial in Sphinx documentation

Create Tutorial in Sphinx documentation for the following use-cases:

preprocessing
time domain analysis
frequency domain analysis
non linear domain analysis

Missing documentation

The link to the documentation seems to be broken...

Error with Frequency domain

Hi,

I have an issue with frequency domain analysis when I split my RR intervals into segments shorter than 50 minutes.

For segments longer than 50 minutes it's OK, but if I use shorter ones, for example 5-minute intervals, I get this error:

IndexError: index out of bounds

Do you know why and how to solve it?

The issue about heart beat interval preprocessing for interpolation

I am studying heart rate variability with your great package, and I sincerely appreciate your efforts on it.

I have found a problem with pre-processing in heart rate variability, shown in the toy code below:

from hrvanalysis import preprocessing as pre

hrv_sample_1 = [700, 800, 10000, 10000, 650, 700, 750, 540]
hrv_sample_2 = [10000, 10000, 800, 700, 800, 900, 10000, 10000, 650, 700]
hrv_sample_3 = [10000, 10000, 800, 700, 800, 900, 10000, 10000, 650, 700, 10000, 10000]

hrv_sample_1 = pre.interpolate_nan_values(pre.remove_outliers(hrv_sample_1))
hrv_sample_2 = pre.interpolate_nan_values(pre.remove_outliers(hrv_sample_2))
hrv_sample_3 = pre.interpolate_nan_values(pre.remove_outliers(hrv_sample_3))

print("hrv sample 1: {}".format(hrv_sample_1))
print("hrv sample 2: {}".format(hrv_sample_2))
print("hrv sample 3: {}".format(hrv_sample_3))

>>> hrv sample 1: [700.0, 800.0, 750.0, 700.0, 650.0, 700.0, 750.0, 540.0]
>>> hrv sample 2: [nan, nan, 800.0, 700.0, 800.0, 900.0, 816.6666666666666, 733.3333333333334, 650.0, 700.0]
>>> hrv sample 3: [nan, nan, 800.0, 700.0, 800.0, 900.0, 816.6666666666666, 733.3333333333334, 650.0, 700.0, 700.0, 700.0]

As you can see in this code, the value 10000 is an abnormal heartbeat interval.
The variable "hrv_sample_1" was preprocessed normally; however, hrv_sample_2 and hrv_sample_3 still include NaN values because their first values are abnormal.

To solve this, I propose a simple method.

As the output shows, hrv_sample_3 also has abnormal values at its end, and they were replaced by the previous valid value.
Following the same idea, I added code to the interpolate_nan_values function, shown below:

from typing import Tuple
from typing import List
import pandas as pd
import numpy as np

# Static name for methods params
MALIK_RULE = "malik"
KARLSSON_RULE = "karlsson"
KAMATH_RULE = "kamath"
ACAR_RULE = "acar"
CUSTOM_RULE = "custom"


def interpolate_nan_values(rr_intervals: list,
                           interpolation_method: str = "linear",
                           limit_area: str = None,
                           limit_direction: str = "forward",
                           limit=None, ) -> list:
    """
    Function that interpolate Nan values with linear interpolation
    Parameters
    ---------
    rr_intervals : list
        RrIntervals list.
    interpolation_method : str
        Method used to interpolate Nan values of series.
    limit_area: str
        If limit is specified, consecutive NaNs will be filled with this restriction.
    limit_direction: str
        If limit is specified, consecutive NaNs will be filled in this direction.
    limit: int
        TODO
    Returns
    ---------
    interpolated_rr_intervals : list
        new list with outliers replaced by interpolated values.
    """
    # search first nan data and fill it post value until it is not nan
    if np.isnan(rr_intervals[0]):
        start_idx = 0

        while np.isnan(rr_intervals[start_idx]):
            start_idx += 1

        rr_intervals[0:start_idx] = [rr_intervals[start_idx]] * start_idx
    else:
        pass
    # change rr_intervals to pd series
    series_rr_intervals_cleaned = pd.Series(rr_intervals)
    # Interpolate nan values and convert pandas object to list of values
    interpolated_rr_intervals = series_rr_intervals_cleaned.interpolate(method=interpolation_method,
                                                                        limit=limit,
                                                                        limit_area=limit_area,
                                                                        limit_direction=limit_direction)
    return interpolated_rr_intervals.values.tolist()



from hrvanalysis import preprocessing as pre

hrv_sample_1 = [700, 800, 10000, 10000, 650, 700, 750, 540]
hrv_sample_2 = [10000, 10000, 800, 700, 800, 900, 10000, 10000, 650, 700]
hrv_sample_3 = [10000, 10000, 800, 700, 800, 900, 10000, 10000, 650, 700, 10000, 10000]

hrv_sample_1 = interpolate_nan_values(pre.remove_outliers(hrv_sample_1))
hrv_sample_2 = interpolate_nan_values(pre.remove_outliers(hrv_sample_2))
hrv_sample_3 = interpolate_nan_values(pre.remove_outliers(hrv_sample_3))

print("hrv sample 1: {}".format(hrv_sample_1))
print("hrv sample 2: {}".format(hrv_sample_2))
print("hrv sample 3: {}".format(hrv_sample_3))


out:
>>> hrv sample 1: [700.0, 800.0, 750.0, 700.0, 650.0, 700.0, 750.0, 540.0]
>>> hrv sample 2: [800.0, 800.0, 800.0, 700.0, 800.0, 900.0, 816.6666666666666, 733.3333333333334, 650.0, 700.0]
>>> hrv sample 3: [800.0, 800.0, 800.0, 700.0, 800.0, 900.0, 816.6666666666666, 733.3333333333334, 650.0, 700.0, 700.0, 700.0]

This correctly replaces the 2 abnormal values at the start with the next valid value.
I have made a pull request.

thank you.
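As a possible alternative to patching the start of the list by hand, the sketch below relies on pandas itself. This is not the approach in the pull request: it assumes that passing limit_direction="both" to Series.interpolate is acceptable, which makes linear interpolation fill leading NaNs with the first valid value and trailing NaNs with the last one. The helper name is hypothetical.

```python
import numpy as np
import pandas as pd


def interpolate_all_nan_values(rr_intervals):
    """Linearly interpolate NaNs, also filling NaNs at the start and the end."""
    series = pd.Series(rr_intervals, dtype=float)
    # limit_direction="both" lets pandas fill leading and trailing NaNs too.
    return series.interpolate(method="linear", limit_direction="both").tolist()


# Same shape as hrv_sample_3 after remove_outliers (10000 replaced by NaN).
sample = [np.nan, np.nan, 800, 700, 800, 900, np.nan, np.nan, 650, 700, np.nan, np.nan]
filled = interpolate_all_nan_values(sample)
```

This version does not modify the input list, and a list made only of NaNs is returned unchanged instead of raising an error.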

Calculation of pNN50 wrong?

I'm calculating the pNN50, but it seems like there is a mistake:

import numpy as np
import hrvanalysis

# create fake NNs, all with >60ms diff
NNs = np.arange(1100, 2000, 60)

pNN50 = hrvanalysis.get_time_domain_features(NNs)['pnni_50']
# should be 100%, but is 93%

I also found the reason:

https://github.com/Aura-healthcare/hrvanalysis/blame/2aca66ee65e2bf4867a6badc17322197c196d70d/hrvanalysis/extract_features.py#L109

You take a np.diff(RR) and then divide the sum of >50 by len(RR). However, np.diff(RR) will have one element less than RR.
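The off-by-one can be checked with NumPy alone, without the package. The sketch below computes both divisors for the fake NNs above; the variable names are chosen for this example.

```python
import numpy as np

# Fake NN intervals: every successive difference is 60 ms.
nn_intervals = np.arange(1100, 2000, 60)
diff_nni = np.diff(nn_intervals)

# Number of successive differences greater than 50 ms.
nni_50 = int(np.sum(np.abs(diff_nni) > 50))

# Dividing by the number of intervals vs. the number of differences.
pnni_50_over_intervals = 100 * nni_50 / len(nn_intervals)
pnni_50_over_differences = 100 * nni_50 / len(diff_nni)
```

There are 15 intervals but only 14 differences, so the first divisor gives about 93.3% and the second gives 100%.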

Edit: I wrote some things in a non-friendly way, please accept my apologies for doing so, I edited them out.

Add Detrended fluctuation analysis (DFA1 & DFA2)

see reference article for the features:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5624990/pdf/fpubh-05-00258.pdf

test notif

test

Add plot for geometrical features

Create a plotting method for the 2 geometrical features.
Add an option in the "plot_distrib" method.

Implement Multitaper spectral analysis method

Multitaper spectral analysis provides more accurate information than the Welch periodogram:
high frequency resolution and low variance
https://raphaelvallat.com/bandpower.html

MNE library already provides a python implementation for this algorithm
https://martinos.org/mne/dev/generated/mne.time_frequency.psd_multitaper.html

Unit of nn_intervals

Hi,

After encountering a UserWarning from scipy.signal.spectral (UserWarning: nperseg = 256 is greater than input length = 5, using nperseg = 5), I checked the code, and it seems that the function get_frequency_domain_features only takes nn_intervals in ms.

Please run the following script to replicate the results:

import numpy as np
from hrvanalysis import get_frequency_domain_features
import pyhrv.frequency_domain as fd  # used as a reference

nn_intervals_in_s = np.random.uniform(0.4, 1.8, size=(2000,))
print('frequency domain of nn_intervals in seconds')
print(get_frequency_domain_features(nn_intervals_in_s))

nn_intervals_in_ms = nn_intervals_in_s * 1000
print('frequency domain of nn_intervals in milliseconds')
print(get_frequency_domain_features(nn_intervals_in_ms))

print('frequency domain of nn_intervals in milliseconds by pyhrv')
fd.welch_psd(nn_intervals_in_ms)

print('frequency domain of nn_intervals in seconds by pyhrv')
fd.welch_psd(nn_intervals_in_s)

If this is the case, please specify the unit of nn_intervals in the documentation.
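Until the unit is documented, a caller could guard against passing seconds by mistake. The helper below is purely hypothetical and not part of the package: it assumes that a median interval below 10 can only mean the values are in seconds, since physiological intervals are roughly 0.3 to 2 s, or 300 to 2000 ms.

```python
import numpy as np


def ensure_milliseconds(nn_intervals, seconds_threshold: float = 10.0):
    """Return the intervals in ms, converting when they look like seconds."""
    values = np.asarray(nn_intervals, dtype=float)
    # Heuristic (an assumption): a median below the threshold means seconds.
    if np.nanmedian(values) < seconds_threshold:
        return values * 1000.0
    return values
```

For example, ensure_milliseconds([0.8, 0.9]) is scaled to milliseconds, while ensure_milliseconds([800, 900]) is returned unchanged.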

RuntimeError: implement_array_function method already has a docstring

I get this error when running the tests example. Please tell me how to solve this problem, thanks~

Correct SD2/SD1 ratio to SD1/SD2 ratio

We have to replace the SD2/SD1 ratio with the standard SD1/SD2 ratio measurement.

The ratio of SD1/SD2, which measures the unpredictability of the RR time series, is used to measure autonomic balance when the monitoring period is sufficiently long and there is sympathetic activation. SD1/SD2 is correlated with the LF/HF ratio
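For reference, a minimal sketch of the ratio using the formulas commonly given for the Poincaré descriptors, SD1² = ½·Var(ΔNN) and SD2² = 2·Var(NN) − ½·Var(ΔNN). Whether the package uses exactly these formulas is an assumption; the function name and the sample values are made up for this example.

```python
import numpy as np


def poincare_sd1_sd2(nn_intervals):
    """Return (SD1, SD2) computed from a list of NN intervals in ms."""
    nn = np.asarray(nn_intervals, dtype=float)
    diff_nn = np.diff(nn)
    var_diff = np.std(diff_nn, ddof=1) ** 2
    var_nn = np.std(nn, ddof=1) ** 2
    sd1 = np.sqrt(0.5 * var_diff)               # short-term variability
    sd2 = np.sqrt(2.0 * var_nn - 0.5 * var_diff)  # long-term variability
    return sd1, sd2


sd1, sd2 = poincare_sd1_sd2([800, 810, 825, 835, 820, 805, 790, 800])
ratio_sd1_sd2 = sd1 / sd2  # the ratio requested in this issue
```

The SD2/SD1 value is simply the reciprocal, so only the reported feature changes, not the underlying SD1 and SD2 computation.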

UserWarning from spectral.py when using get_time_domain_features

I am getting the following warning when passing a list of heart rate PPG values to get_time_domain_features.

/Users/mqm0624/miniconda3/envs/mist/lib/python3.8/site-packages/scipy/signal/spectral.py:1961: UserWarning: nperseg = 256 is greater than input length  = 10, using nperseg = 10
  warnings.warn('nperseg = {0:d} is greater than input length '

Suggestion - DFA α1 analysis ?

Since it is HRV-related, a DFA α1 algorithm would be nice. It has been used frequently in endurance sports since the discovery of its correlation with the LT1 and LT2 thresholds.

Add Python 2.7 to 3.7 backward-compatibility

We need to provide backward compatibility for users of Python 2.7 to 3.7.
We could use the "Six" module.
See https://docs.python.org/3/howto/pyporting.html

Wrong divisor in computing pnni_50/pnni_20

Hi :)

I think I have just spotted a minor error when computing pnni_xx. Currently you have:

diff_nni = np.diff(nn_intervals)
length_int = len(nn_intervals)

nni_50 = sum(np.abs(diff_nni) > 50)
pnni_50 = 100 * nni_50 / length_int
nni_20 = sum(np.abs(diff_nni) > 20)
pnni_20 = 100 * nni_20 / length_int

The problem is that the divisor is length_int. I believe the divisor should be len(diff_nni) instead?

This error is also in the description:

- **nni_50**: Number of interval differences of successive RR-intervals greater than 50 ms.

- **pnni_50**: The proportion derived by dividing nni_50 (The number of interval differences \
of successive RR-intervals greater than 50 ms) by the total number of RR-intervals.

- **nni_20**: Number of interval differences of successive RR-intervals greater than 20 ms.

- **pnni_20**: The proportion derived by dividing nni_20 (The number of interval differences \
of successive RR-intervals greater than 20 ms) by the total number of RR-intervals.
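If the divisor is changed as suggested, both features could share one helper. This is a sketch under that assumption, not the package's implementation; the name pnni_x and the NaN returned for inputs with fewer than two intervals are choices made for this example.

```python
import numpy as np


def pnni_x(nn_intervals, threshold_ms: float) -> float:
    """Percentage of successive differences larger than threshold_ms."""
    diff_nni = np.diff(np.asarray(nn_intervals, dtype=float))
    if len(diff_nni) == 0:
        # Fewer than two intervals: no successive difference exists.
        return float("nan")
    # Divide by the number of differences, not the number of intervals.
    return float(100.0 * np.sum(np.abs(diff_nni) > threshold_ms) / len(diff_nni))
```

pnni_50 and pnni_20 would then be pnni_x(nn_intervals, 50) and pnni_x(nn_intervals, 20), and the description would say "by the total number of successive differences".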

Cheers!

[Request] Is it possible to get the LF at a 5 seconds or 1 second interval?

Currently, we can get LF at one-minute granularity. Any attempt to compute it over 5-second or 1-second windows gives different values that do not add up to the one-minute LF.
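One likely reason is frequency resolution: a window of T seconds cannot resolve frequencies below 1/T Hz. The sketch below assumes the usual LF band of 0.04 to 0.15 Hz; the function names are hypothetical.

```python
LF_BAND_HZ = (0.04, 0.15)  # assumed LF band limits in Hz


def frequency_resolution_hz(window_seconds: float) -> float:
    """Smallest frequency step a window of this length can resolve."""
    return 1.0 / window_seconds


def window_can_resolve_band(window_seconds: float, band=LF_BAND_HZ) -> bool:
    """True when the window spans at least one cycle of the band's lowest frequency."""
    return window_seconds >= 1.0 / band[0]
```

A 5-second window has a resolution of 0.2 Hz, which is coarser than the whole LF band, so LF values computed on such windows are not comparable to those from a one-minute window.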

Add support to compute ULF frequency band

The ultra-low-frequency (ULF) band (≤0.003 Hz) requires a recording period of at least 24 h (12) and is highly correlated with the SDANN time-domain index (44).

Calculating rr-interval list

Hi Robin,

Do we need to compute the rr_intervals_list from the raw ECG signal, or does rr_intervals_list refer to the raw signal? I did not understand this correctly.

Adopt a clean git flow for future integrations

We need to adopt a clean git flow with a specific branch by library version and merge the master branch with the current version

plot_distrib fails

When I run plot.plot_distrib(rr), where rr is a float list or a numpy array, I get the following error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Users\berief\Python\Python37\lib\site-packages\hrvanalysis\plot.py", line 78, in plot_distrib
    plt.hist(nn_intervals, bins=range(min_nn_i - 10, max_nn_i + 10, bin_length))
TypeError: 'float' object cannot be interpreted as an integer

Possible fix:
Just cast min_nn_i and max_nn_i to int or use np.arange.
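A sketch of the int-cast variant of this fix. The helper name is hypothetical and the default bin_length of 8 is an assumption; the result can be passed as the bins argument of plt.hist.

```python
import numpy as np


def histogram_bins(nn_intervals, bin_length: int = 8):
    """Build integer bin edges that also work for float NN intervals."""
    # Round outwards so the smallest and largest values stay inside the bins.
    min_nn_i = int(np.floor(min(nn_intervals)))
    max_nn_i = int(np.ceil(max(nn_intervals)))
    return range(min_nn_i - 10, max_nn_i + 10, bin_length)


bins = histogram_bins([700.5, 812.2, 765.0])
```

Because range only accepts integers, casting before the call avoids the TypeError for float lists and NumPy arrays alike.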

Edit feature TINN for geometrical domain analysis

Add the triangular interpolation of NN interval histogram feature for the "get_geometrical_features" method.

reference for the feature:
https://www.escardio.org/static_file/Escardio/Guidelines/Scientific-Statements/guidelines-Heart-Rate-Variability-FT-1996.pdf

Cannot import LombScargle in extract_features.py

When trying to import the module in the script extract_features.py:
from astropy.stats import LombScargle

I get the message: module LombScargle not found.

So I found the following workaround:

try:
    from astropy.timeseries import LombScargle
except ImportError:
    from astropy.stats import LombScargle

Any possibility to implement this?

Thanks.

Francis

interpolate_nan_values(rr_intervals=a) where a[0] is nan returns list with first index nan

'>=' not supported between instances of 'int' and 'ellipsis'

running the line below with the demo code produces the above-stated error in preprocessing.py

# This remove outliers from signal
rr_intervals_without_outliers = remove_outliers(rr_intervals=rr_intervals_list,
                                                low_rri=300, high_rri=2000)

I use python 3.6. Is that a Python version conflict issue? There are other errors thrown in the subsequent methods function_base.py and fromnumeric.py all related to TypeError: unsupported operand type(s) for -: 'ellipsis' and 'int'.

Full error message:

TypeError Traceback (most recent call last)
in ()
1 # This remove outliers from signal
2 rr_intervals_without_outliers = remove_outliers(rr_intervals=rr_intervals_list,
----> 3 low_rri=300, high_rri=2000)
4 # This replace outliers nan values with linear interpolation
5 interpolated_rr_intervals = interpolate_nan_values(rr_intervals=rr_intervals_without_outliers,

~/anaconda3/lib/python3.6/site-packages/hrvanalysis/preprocessing.py in remove_outliers(rr_intervals, verbose, low_rri, high_rri)
58 # Conversion RrInterval to Heart rate ==> rri (ms) = 1000 / (bpm / 60)
59 # rri 2000 => bpm 30 / rri 300 => bpm 200
---> 60 rr_intervals_cleaned = [rri if high_rri >= rri >= low_rri else np.nan for rri in rr_intervals]
61
62 if verbose:

~/anaconda3/lib/python3.6/site-packages/hrvanalysis/preprocessing.py in (.0)
58 # Conversion RrInterval to Heart rate ==> rri (ms) = 1000 / (bpm / 60)
59 # rri 2000 => bpm 30 / rri 300 => bpm 200
---> 60 rr_intervals_cleaned = [rri if high_rri >= rri >= low_rri else np.nan for rri in rr_intervals]
61
62 if verbose:

TypeError: '>=' not supported between instances of 'int' and 'ellipsis'
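One plausible cause, assuming the demo list was copied with its literal `...` placeholder: in Python 3, `...` inside a list is the Ellipsis object, and comparing it with an integer raises exactly this TypeError regardless of the Python 3 version. The sketch below reproduces the error without the package; the list values are illustrative.

```python
# A list copied from a demo that still contains the "..." placeholder.
rr_intervals_list = [1000, 1050, 1020, 1080, ..., 1100, 1110, 1060]

try:
    # Same comparison as in remove_outliers, with low_rri=300 and high_rri=2000.
    cleaned = [rri if 2000 >= rri >= 300 else None for rri in rr_intervals_list]
    error_message = None
except TypeError as error:
    error_message = str(error)

# Dropping the placeholder leaves a list of numbers that can be compared.
numeric_rr_intervals = [rri for rri in rr_intervals_list if rri is not Ellipsis]
```

If this is what happened, replacing the placeholder with real RR-interval values should make the call succeed.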

Fix Error: Cannot import LombScargle from astropy.stats

Hello,

I encountered an ImportError while using the hrvanalysis package due to an outdated import statement for LombScargle. The LombScargle class has been moved from astropy.stats to astropy.timeseries in newer versions of astropy.

Error message: ImportError: cannot import name 'LombScargle' from 'astropy.stats' (/usr/local/lib/python3.10/dist-packages/astropy/stats/__init__.py)

Request to update the import statement in the hrvanalysis codebase from:

"from astropy.stats import LombScargle" to "from astropy.timeseries import LombScargle"

This change resolved the import issues in my environment, and I believe updating this in the package would help maintain compatibility with current and future versions of 'astropy'.

rr_intervals_without_outliers = remove_outliers(rr_intervals=rr_intervals_list,low_rri=300, high_rri=2000)

'>=' not supported between instances of 'int' and 'ellipsis'

The code is running with an error

Hello, I'm getting this error when running my code and would like to know what's causing it: FileNotFoundError: [Errno 2] No such file or directory: 'D:\apnea-ecg-database-1.0.0\apnea-prediction_1min_ahead.pkl'
I don't know how to go about creating this file, apnea-prediction_1min_ahead.pkl. I hope you can help me with this, I would appreciate it!