pia-group / biosppy

Biosignal Processing in Python

License: Other

Python 100.00%

python physiological-computing biosignals data-science signal-processing

biosppy's Introduction

This repository is archived. The BioSPPy toolbox is now maintained at scientisst/BioSPPy.

BioSPPy - Biosignal Processing in Python

A toolbox for biosignal processing written in Python.

The toolbox bundles together various signal processing and pattern recognition methods geared towards the analysis of biosignals.

Highlights:

Support for various biosignals: BVP, ECG, EDA, EEG, EMG, PCG, PPG, Respiration
Signal analysis primitives: filtering, frequency analysis
Clustering
Biometrics

Documentation can be found at: http://biosppy.readthedocs.org/

Installation

Installation can be easily done with pip:

$ pip install biosppy

Simple Example

The code below loads an ECG signal from the examples folder, filters it, performs R-peak detection, and computes the instantaneous heart rate.

from biosppy import storage
from biosppy.signals import ecg

# load raw ECG signal
signal, mdata = storage.load_txt('./examples/ecg.txt')

# process it and plot
out = ecg.ecg(signal=signal, sampling_rate=1000., show=True)

This should produce a plot similar to the one below.

Dependencies

bidict
h5py
matplotlib
numpy
scikit-learn
scipy
shortuuid
six
joblib

Citing

Please use the following if you need to cite BioSPPy:

Carreiras C, Alves AP, Lourenço A, Canento F, Silva H, Fred A, et al. BioSPPy - Biosignal Processing in Python, 2015-, https://github.com/PIA-Group/BioSPPy/ [Online; accessed <year>-<month>-<day>].

@Misc{,
  author = {Carlos Carreiras and Ana Priscila Alves and Andr\'{e} Louren\c{c}o and Filipe Canento and Hugo Silva and Ana Fred and others},
  title = {{BioSPPy}: Biosignal Processing in {Python}},
  year = {2015--},
  url = "https://github.com/PIA-Group/BioSPPy/",
  note = {[Online; accessed <today>]}
}

License

BioSPPy is released under the BSD 3-clause license. See LICENSE for more details.

Disclaimer

This program is distributed in the hope it will be useful and provided to you "as is", but WITHOUT ANY WARRANTY, without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. This program is NOT intended for medical diagnosis. We expressly disclaim any liability whatsoever for any direct, indirect, consequential, incidental or special damages, including, without limitation, lost revenues, lost profits, losses resulting from business interruption or loss of data, regardless of the form of action or legal theory under which the liability may be asserted, even if advised of the possibility of such damages.

biosppy's People


appscluster hugodlopes lingwang7547 shashank971 ball-hayden vidyaa123 yutaforlab jj1118 coventryresearch olive01 lucfisc ywenlu tostasmistas carlozofjuan vandanacr hiredd dominiquemakowski mgschwan willemzegers rachitjn python3pkg alexmikunis tuhuynhthanh he-lourenco vasiliynovosad lupupam mlatypo1 vkav venturit daiweigithub kelvinzch nikhilsha jadevarelle qiriro fabio256 sridharkumarkannam sp00044 masanobbb monster1985 parry-do zhaoerchao zqkhan honorforlee barongeng ojabajracharya davidm-public yongfu-li shanu199 joaopandolfi jessicarryly rfazeli herrkaefer ericyao2013 amitnpujari ewchavez rbtchc barrukurniawan very3b bonnynich dannyisme grayland119 capcarr afcarl ngatilio thiwankajayasiri feverdreams clin366 itoshikihiro sandeepnccs nkatebi paarisaa sihhua tomatenbiss idouania airob duguyue100 xezxey ronaldrosejoseph shintaro-mat vilmarzti renatosc datoslabs cordeirojoao marcocode23 andersoonluan zuchunshan nsunami cw5121 ecg-cloud magdasalatka nssreenivasalu yingding syuanbo osmolr gcathelain boris69bg nerechka liuwenhaha patriciabota junlovejun

biosppy's Issues

Missing docs for timing module

Forgot to add the new timing module to the docs.

Segmentation Fault in IDLE or CLI

The following scenarios are all in conda virtual environment

Simplest reproducible scenarios

Run the script from the command line with python hello.py

import biosppy

if __name__ == '__main__':
    print("hello")

or in IDLE

>>> import biosppy

In jupyter notebook it works fine. Any idea why it's the case?

biosppy.signals.ecg

Hello, I have a question and need your help.
I wanted to use Neuropython, but it still fails.
I ran the code and got an error.

error
module 'biosppy.signals.ecg' has no attribute 'correct_rpeaks'

I checked the file 'biosppy.signals.ecg': the function 'correct_rpeaks' is defined there, so the attribute should exist.
Is the file "bio_ecg_preprocessing" wrong, or is it "biosppy.signals.ecg"?

code

import neurokit as nk
import pandas as pd
import numpy as np
import seaborn as sns

df = pd.read_csv("C:/Users/User/Desktop/CH1.csv")
df.plot()

bio = nk.ecg_process(ecg=df["ECG1"], rsp=None, sampling_rate=250, filter_type='FIR', filter_band='bandpass', filter_frequency=[3, 45], segmenter='hamilton', quality_model='default', hrv_features=['time', 'frequency'], age=None, sex=None, position=None)

nk.z_score(bio["df"]).plot()

raise key error

Hello and thanks for any future help!
I am trying to use BioSPPy in combination with pyhrv to analyse an ECG signal, and when I try to generate the tuples containing all HRV parameters I get this error message: KeyError:
File "/home/omar/.local/lib/python2.7/site-packages/biosppy/utils.py", line 407, in __getitem__
raise KeyError("Unknown key: %r." % key)

KeyError: "Unknown key: 'ar_order'."

Basestring error?

I'm trying to run this in a virtual environment (with Anaconda) on Python 2.7.3 or 2.7.4. I get the error "name 'basestring' is not defined" when I try to run biosppy.ecg.ecg(signal=mydata, sampling_rate=1000, show=True).

What version of Python is recommended for use with this?

Documentation of "resolution" metadata

Can you document the use of the "resolution" metadata in the simple text format, like in the example files?

Future Warning from sklearn

The warning comes from storage.py. I think there is a two-line fix. I can create a pull request if appropriate. Thanks.

DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. 

Please import this functionality directly from joblib, which can be installed with: pip install joblib. 
If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.

  warnings.warn(msg, category=DeprecationWarning)

Possible error in timestamps

It seems to me endpoint should be True (the default).

BioSPPy/biosppy/signals/ecg.py

Lines 102 to 103 in 212c3dc

    
    T = (length - 1) / sampling_rate
    ts = np.linspace(0, T, length, endpoint=False)
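To make the concern concrete, here is a small check that assumes only NumPy and uses made-up values for length and sampling_rate (they are not taken from ecg.py). Because T is already the time of the last sample, endpoint=False spreads the timestamps over T / length steps instead of 1 / sampling_rate.

```python
import numpy as np

# Illustrative values only, not taken from ecg.py.
length = 5
sampling_rate = 1000.0

T = (length - 1) / sampling_rate  # time of the last sample: 0.004 s

# As in the quoted lines: the step becomes T / length = 0.0008 s.
ts_open = np.linspace(0, T, length, endpoint=False)

# With the default endpoint=True the step is T / (length - 1) = 0.001 s.
ts_closed = np.linspace(0, T, length)

step_open = ts_open[1] - ts_open[0]
step_closed = ts_closed[1] - ts_closed[0]
print(step_open, step_closed)
```

With the default endpoint, the step matches the sampling period and the last timestamp equals T.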

update to latest bidict

Hi @capcarr, I'm the author of bidict, and saw this project depends on an older version of it. Wanted to give you a heads up that there's a newer version available which fixes bugs in the version you're using, but which also has some breaking API changes (for the sake of a cleaner and safer API) — see https://bidict.readthedocs.org/en/master/changelog.html#id2.

FWIW, I don't expect any further breaking API changes, and hope to release bidict 1.0 soon.

Hope bidict has been working well for you, and please feel free to give feedback here or on Gitter.

Not enough heartbeats: HB detection issue

Hey guys,

I got this modified LEAD II ECG signal from which I'd like to extract several features. However, using the signals.ecg.ecg() function throws out a ValueError: Not enough beats to compute heart rate. I tried to narrow down the locus of the error and it seems to happen within the hamilton_segmenter() (which returns 0 peaks)...

What is causing this error? Is there a way to fix it?

Thanks!

Implications of using Lead II ECG data?

I noticed that the ECG routine assumes a single-channel Lead I ECG signal.

Are there any implications for using the code on Lead II data?

Amplitudes returned by EDA library are too small

I am facing the issue that the eda.eda method returns amplitudes that are significantly smaller than those in the EDA signal. I already analyzed where the problem is and summarized my findings in the following pdf. I might be wrong so I would be glad to hear your opinion on this.

SCR_Library_Amplitudes.pdf

UnboundLocalError: local variable 'twopeaks' referenced before assignment

Hi,
I get this for like 1% files from my DB, not entirely sure how to fix it myself:

File "/usr/local/lib/python2.7/dist-packages/biosppy/signals/ecg.py", line 1157, in hamilton_segmenter
posdiv = abs(twopeaks[0][0] - twopeaks[1][0])
UnboundLocalError: local variable 'twopeaks' referenced before assignment

Eda.eda falsely returns 0 onsets in some signals

I am facing the issue that the eda.eda method returns 0 onsets in some of my signals even though from visual inspection I can definitely see a lot of them. I already contacted Hugo Silva regarding this issue and he asked to raise my concern here on GitHub.

I already analyzed where the problem is and summarized my findings in the following pdf. I might be wrong so I would be glad to hear your opinion on this.
Issues_with_SCR_library.pdf

R peak detection failure

Hey, sorry to bother you again. I have a dataset that somehow causes R-peak detection to fail...

Reproducible example:

import biosppy
import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/neuropsychology/NeuroKit.py/master/examples/Bio/data/bio_rest.csv", index_col=0)[14097:314163]
biosppy_ecg = biosppy.ecg.ecg(df["ECG"])

It seems that it takes the negative peak for the positive. Why is that? How could we fix it?

Thanks for all!

FutureWarning from scipy

Hello,

Thank you for your work. I am currently working with biosppy to process ECG signals. It's been of great help so far.

However, I have some warnings thrown at me using signals.ecg:

lib/python3.6/site-packages/scipy/signal/_arraytools.py:45: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result. b = a[a_slice]

This same message appears for _arraytools.py (the example) and signaltools.py:1341, :1344 and :1350

The result from biosppy is correct but I wonder until when.

BVP filter

I was wondering why you are filtering the BVP signal with a lowpass 4Hz butterworth filter?
Is it meant to be a LP 40 Hz filter?

Zero R peaks found when capture starts with noise

Hi!

We're having some trouble with R-peak detection.
We've been using ecg.ecg() with some captures and observed that whenever a capture starts with sufficiently noisy data, it fails to find R peaks, regardless of what comes after the noisy part. This means that if a capture begins very noisy but later contains a clean ECG for a long time, ecg.ecg() still cannot find a single R peak (at least that's what we observe).

Example:

Capture 1) Good, clean samples. R peaks found correctly.

Capture 2) Bad, somehow noisy samples. No R peaks found. (plotted directly with matplotlib)

Noisy start, clean continuation. No R peaks found. (plotted directly with matplotlib) This is capture 1) followed by capture 2).

Is this behavior expected?
What can be done to detect the R peaks of the clean part?
Should we be the ones handling this?

Many thanks in advance!

EDA library detects peak positions imprecisely

I am facing the issue that the eda.eda method returns incorrect positions of the peaks. I already analyzed where the problem is and summarized my findings in the following pdf. I might be wrong so I would be glad to hear your opinion on this.

SCR_Library_PeakPositions.pdf

BCG support proposal

Dear Carlos Carreiras,

I appreciate the code that you provide with this signal processing library, especially the tools.
I am a PhD student working in Ballistocardiography (cardiac mechanical activity monitoring).

Would you like me to contribute and create a BCG processing section ? I found no other open source Python library in this domain.

Sincerely yours,

Guillaume Cathelain

Uncaught ValueError in basic_scr() method

When using the basic_scr() method of the EDA library I faced an uncaught ValueError for some of my signals. The error message is the following:

a = np.array([np.max(signal[i1[i]:i3[i]]) for i in range(li)])
File "/usr/local/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2320, in amax out=out, **kwargs)
File "/usr/local/lib/python3.6/site-packages/numpy/core/_methods.py", line 26, in _amax return umr_maximum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation maximum which has no identity

I tracked down the error to line 167 that contains

a = np.array([np.max(signal[i1[i]:i3[i]]) for i in range(li)])

More precisely the error occurs when signal[i1[i]:i3[i]] has a length of zero (which occurred a couple of times in almost half of my datasets). I figured this error is thrown when np.max() is called on an empty list. A possible solution would be to replace this line with the following:

a = np.array([np.max(signal[i1[i]:i3[i]]) for i in range(li) if len(signal[i1[i]:i3[i]]) > 0])

While analyzing this issue I recognized that the above list comprehension is supposed to yield the amplitudes of the peaks. But what it actually does is to yield the maximum value of the original signal in the given interval. This could be solved by yielding the difference between min and max value in the given interval. To combine both of my proposals the following line might be used:

a = np.array([np.max(signal[i1[i]:i3[i]]) - np.min(signal[i1[i]:i3[i]]) for i in range(li) if len(signal[i1[i]:i3[i]]) > 0])
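A minimal sketch of both proposals on synthetic data, assuming only NumPy. The names signal, i1 and i3 follow the traceback, but the values are made up and this is not the library's code.

```python
import numpy as np

# Synthetic stand-ins for the variables named in the traceback.
signal = np.array([0.0, 0.2, 0.9, 0.4, 0.1, 0.3, 0.7, 0.2])
i1 = np.array([0, 4, 6])  # interval starts
i3 = np.array([4, 4, 8])  # interval ends; the second interval is empty

# np.max on an empty slice raises the ValueError from the report.
try:
    np.max(signal[i1[1]:i3[1]])
    raised = False
except ValueError:
    raised = True

# Skip empty intervals and use max - min as the amplitude
# (np.ptp computes exactly that peak-to-peak range).
segments = [signal[a:b] for a, b in zip(i1, i3) if b > a]
amplitudes = np.array([np.ptp(seg) for seg in segments])
print(raised, amplitudes)
```

Note that filtering out empty intervals shortens the amplitude array, so the onset and peak index arrays would need the same mask to stay aligned with it.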

ImportError: cannot import name 'tools'

When I run the ecg.py code under BioSPPy-master/biosppy/signals, I get this error: ImportError: cannot import name 'tools'

Citation

Could you add a method to cite this package? Something like the SciPy style:

Carlos Carreiras. BioSPPy: Biosignal Processing in Python. 2015-, https://github.com/PIA-Group/BioSPPy [Online; accessed <year>-<month>-<day>].

Or Bibtex:

@Misc{biosppy,
       author = {Carlos Carreiras},
       title = {{BioSPPy}: Biosignal Processing in  {Python}},
       year = {2015--},
       url = "https://github.com/PIA-Group/BioSPPy",
}

Path to file not working on Windows

When I run the following:
ecg_signal, ecg_mdata = storage.load_txt('D:\Data\bitalinoECG.txt')

I obtain this error:
OSError: [Errno 22] Invalid argument: 'D:\\Data\x08italinoECG.txt'

Apparently the string is not properly escaped.
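The \x08 in the error message is the backspace character: in an ordinary Python string literal, "\b" is an escape sequence. The stdlib-only sketch below reproduces that and shows three common ways to write the same Windows path safely. The path is the one from the report; nothing is read from disk.

```python
from pathlib import PureWindowsPath

# In a normal string literal, "\b" is the backspace escape (0x08).
broken = "D:\\Data\bitalinoECG.txt"
print(repr(broken))  # 'D:\\Data\x08italinoECG.txt'

# Three equivalent ways to spell the intended path.
raw = r"D:\Data\bitalinoECG.txt"          # raw string: backslashes kept as-is
escaped = "D:\\Data\\bitalinoECG.txt"     # every backslash doubled
forward = "D:/Data/bitalinoECG.txt"       # forward slashes also work on Windows

same_text = raw == escaped
same_path = PureWindowsPath(raw) == PureWindowsPath(forward)
print(same_text, same_path)
```

Any of the three spellings can then be passed to storage.load_txt in place of the original literal.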

Walktree not working

I can't seem to obtain any result from the walktree function.

I'm doing this: filename = walktree(top= '../../data/Trials', spec=r'\.XML$')

and the function acts as if there were no files in that directory at all (see images for better understanding).
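One thing worth ruling out, offered as an assumption rather than a diagnosis of walktree: under Python's re module a pattern such as \.XML$ is case-sensitive, so files ending in .xml would not match it. The stdlib-only sketch below shows the difference. The helper find_files is hypothetical and is not BioSPPy's implementation.

```python
import os
import re
import tempfile

def find_files(top, spec, flags=0):
    """Hypothetical helper: yield paths under `top` whose file name matches `spec`."""
    pattern = re.compile(spec, flags)
    for root, _dirs, files in os.walk(top):
        for name in sorted(files):
            if pattern.search(name):
                yield os.path.join(root, name)

with tempfile.TemporaryDirectory() as top:
    # Create three empty files with differently cased extensions.
    for name in ("trial1.xml", "trial2.XML", "notes.txt"):
        open(os.path.join(top, name), "w").close()

    strict = [os.path.basename(p) for p in find_files(top, r"\.XML$")]
    relaxed = [os.path.basename(p)
               for p in find_files(top, r"\.XML$", re.IGNORECASE)]

print(strict, relaxed)
```

If the files on disk use a lowercase extension, only the case-insensitive search finds them. It is also worth checking that the relative path '../../data/Trials' resolves from the current working directory.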

Thank you

Multi-lead ECG processing

For ECG module, I think it's better to have the ability to process multi-lead (or channel) data at the same time as n-d array.

For R peak detection and correction, it should be more reliable to consider multiple channels together, because heartbeat cycles should be synchronized across all channels. There may be existing research and algorithms on this, though I don't know of any.

Paper reference of how the bvp algorithm was implemented

I am trying to understand how BVP peaks are detected. Is the BVP onsets detection algorithm based on any published work? If there is, I would love to read it to make sense of how the algorithms were implemented.

eda basic_scr mismatch in returned indices

test_biosppy_basic_scr.zip

eda.basic_scr method

The basic_scr method seems to trim the returned indices incorrectly. After detecting the null-derivative singular points, the onset and peak indices are arranged to be returned in pairs.

However, the way peaks and onsets are discarded leads to (peak, onset, peak, onset) returns instead of the more meaningful (onset, peak, onset, peak).

https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/eda.py

Pan-Tompkins QRS Segmenter

I'm doing HRV research. My team is most comfortable with the Pan-Tompkins R peak detection algorithm, since it is old and established. While BioSPPy has many good segmenters, including the Hamilton segmenter used by default in biosppy.signals.ecg.ecg, it would be nice to have an old workhorse like Pan-Tompkins available in biosppy.

Would I be welcome to submit a fork adding Pan-Tompkins to biosppy?

Simple store the processed data

Hello,
I'm pretty new on GitHub but I love trying things out...
I have been analysing my EDA data with biosppy. The plotting works perfectly.
To get more insight into my data I want to store the processed data (meaning the "table") that the plot is based on.
But how can I do this?
I have tried a lot of things (some of them stupid)... but none worked...
Maybe there is someone who can help me with this problem.
Thank you!
regards
portas525

Source for ECG Data

Thanks for an awesome tool! Is there possibly a library of different ECG signals to test with this? I'm thinking of like arrhythmia etc. Is it possible to use something like physionet data for example?

Loading Acqknowledge files

Hi this might be a stupid question - I don't have much experience working with Python/Signal processing...

I have acquired my data with Biopac Acqknowledge and converted the signals into a text file. When trying to load it with the np.loadtxt command I get errors 'ValueError: could not convert string to float:' . Does Biosppy not support the acqknowledge outputs?

Thanks,

ValueError raised in respiration methods

I'm trying to run the resp() method in Python 2.7.11 with measured respiration data, but I get the error "v cannot be empty" from the convolve call in the smoother method (line 548 of tools.py).
I also ran my data through the ecg.ecg method, and that ran correctly.
Do you have any idea how to solve this problem? Could it be the Python, numpy, or scipy version, or something else?

Is there a version that works with Python 3.5, or should we use Python 2.7?

Unable to get Rpeaks with ecg.engzee_segmenter

I'm trying to get R peaks from an ECG with all the segmenters implemented in biosppy.signals.ecg.
All of them work except for the engzee_segmenter, which returns no R peaks at all. The other ones correctly return 30 R peaks.

I tried changing the threshold, but none of my tests was successful.

Is there a method for choosing the threshold?

Find respiratory cycles

It's me again 😅
I'm looking for a way to locate respiratory cycles. To get a list of indices corresponding to the beginning of each cycle. Currently, the rsp.rsp() function returns zero-crossings, which are not exactly the same... do you have any idea on how to achieve that?
again, thanks
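One possible approach, sketched under stated assumptions: if the respiration signal is roughly zero-centred, keeping only the negative-to-positive zero crossings gives one index per cycle (every second zero crossing). The signal below is synthetic, and whether a rising crossing corresponds to the start of inhalation depends on the sensor.

```python
import numpy as np

# Synthetic breathing-like signal: 0.25 Hz sine sampled at 10 Hz for 12 s.
fs = 10.0
t = np.arange(120) / fs
signal = np.sin(2 * np.pi * 0.25 * t - 0.3)

# Keep index i + 1 when the signal is negative at i and non-negative at i + 1.
negative = signal < 0
rising = np.flatnonzero(negative[:-1] & ~negative[1:]) + 1

# Duration of each cycle in seconds, from consecutive rising crossings.
cycle_lengths = np.diff(rising) / fs
print(rising, cycle_lengths)
```

On real recordings, some filtering or a minimum-distance rule between crossings would likely be needed so that noise around zero does not create spurious cycles.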

ZeroDivisionError when plotting a clustering

plotting.plot_clustering fails when a cluster has no elements.

ValueError exception when linkage method is 'centroid'

The method clustering.hierarchical raises a ValueError exception when the linkage parameter is set to 'centroid'. It seems that the underlying linkage method from scipy needs the raw data with this setting.

Add unit testing

Add unit testing coverage.

TypeError in windower method

The signals.tools.windower method raises a TypeError when the fcn_kwargs argument is not specified. The problem can be overcome by specifying an empty dict, e.g. fcn_kwargs={}.
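A likely pattern behind this kind of error, shown as a plain-Python sketch: unpacking a None default with ** raises TypeError. The functions below are hypothetical stand-ins, not the windower source, and the guard is one common way to handle the default.

```python
def unguarded(window, fcn, fcn_kwargs=None):
    """Hypothetical stand-in: forwards keyword arguments without a guard."""
    return fcn(window, **fcn_kwargs)  # fails when fcn_kwargs is None

def guarded(window, fcn, fcn_kwargs=None):
    """Same call, but a missing fcn_kwargs is treated as an empty dict."""
    if fcn_kwargs is None:
        fcn_kwargs = {}
    return fcn(window, **fcn_kwargs)

try:
    unguarded([1, 2, 3], sum)
    failed = False
except TypeError:
    failed = True

total = guarded([1, 2, 3], sum)
print(failed, total)
```

Until such a guard exists in the library, passing fcn_kwargs={} from the calling side has the same effect.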

ICP

Hi!
I'm a medical researcher studying intracranial pressure (ICP) responses to various treatment interventions in intensive care unit patients. I was wondering if I can use BioSPPy to process and visualize ICP signals. I wouldn't expect it to be very different from other signals such as ECG or EEG, but it's not among the biosignals covered in the documentation.
Thanks in advance!

The function biosppy.signals.tools.signal_stats() is not reliable.

I noticed the function signal_stats() in biosppy.signals.tools computes wrong max values.
When I call stats = tools.signal_stats( data[ dataName ] ), stats["max"] is different from max(data[dataName]).
The array data[] contains doubles.
Just to let you know.

Phasic EDA

I have another question. Does the filtered EDA returned by the eda.eda function correspond to phasic EDA? If not, do you know any way of computing it?

AttributeError: module 'neurokit' has no attribute 'bio_process'

I just wanted to try the features of your neurokit.
I´m using Anaconda installed on Windows 10.
The pip install of neurokit was successful.
Then I copied your following sample code into a Spyder window:

Import packages

import neurokit as nk
import pandas as pd
import numpy as np
import seaborn as sns

Download data

df = pd.read_csv("https://raw.githubusercontent.com/neuropsychology/NeuroKit.py/master/examples/Bio/bio_100Hz.csv")

Plot it

df.plot()

Process the signals

bio = nk.bio_process(ecg=df["ECG"], rsp=df["RSP"], eda=df["EDA"], add=df["Photosensor"], sampling_rate=100)

Plot the processed dataframe, normalizing all variables for viewing purpose

nk.z_score(bio["df"]).plot()
pd.DataFrame(bio["ECG"]["Cardiac_Cycles"]).plot(legend=False)

The first part, including the first plot, runs well until nk.bio_process, where I get this error message:
bio = nk.bio_process(ecg=df["ECG"], rsp=df["RSP"], eda=df["EDA"], add=df["Photosensor"], sampling_rate=100)
AttributeError: module 'neurokit' has no attribute 'bio_process'

What went wrong ?

ecg.ecg sometimes duplicates R peaks

The function ecg.ecg sometimes reports the same R peak twice in a row, which may cause users' own downstream processing to fail (by calculating an R-R distance of zero). The corresponding heartbeat wave templates also get duplicated.
This happens, for example, at index 33 & 34 of the returned out['rpeaks'] and out['templates'] when the attached ecg is given as input.

ecg_resulting_in_duplicated_rpeaks.npy.zip
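Until the duplication is fixed, a downstream workaround could look like the sketch below. It assumes only NumPy, and the R-peak indices and templates are made up for illustration (they are not the attached recording).

```python
import numpy as np

# Made-up R-peak indices with one repeated entry, plus matching templates.
rpeaks = np.array([120, 910, 1705, 1705, 2490])
templates = np.arange(len(rpeaks) * 4).reshape(len(rpeaks), 4)

# np.unique returns the sorted unique values and, with return_index=True,
# the position of each value's first occurrence.
unique_rpeaks, first_idx = np.unique(rpeaks, return_index=True)
unique_templates = templates[first_idx]  # keep templates aligned with peaks

rr = np.diff(unique_rpeaks)  # R-R intervals in samples, none of them zero
print(unique_rpeaks, rr)
```

Selecting the templates with the same index array keeps them aligned with the de-duplicated peaks.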

Heart Rate Variability

I was just wondering, are there any plans for implementing HRV computation?

Thanks!

Confusion of implementation of utils.random_fraction

In utils.random_fraction, I find the following code,

    # copy because shuffle works in place
    aux = copy.deepcopy(indx)

    # shuffle
    np.random.shuffle(indx)

    # select
    use = aux[:nb]
    unuse = aux[nb:]

About the shuffle: should it shuffle aux instead of indx?
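The effect can be checked in isolation. The sketch below assumes only NumPy and mirrors the quoted lines with made-up values for indx and nb. Because the copy is taken before the shuffle, slicing aux returns the first nb entries in their original order.

```python
import copy
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable
nb = 3

# As quoted: copy first, then shuffle the original.
indx = np.arange(10)
aux = copy.deepcopy(indx)
np.random.shuffle(indx)
use_quoted = aux[:nb]  # still [0, 1, 2]: the copy was never shuffled

# Variant that shuffles the copy and leaves the caller's array untouched.
indx2 = np.arange(10)
aux2 = copy.deepcopy(indx2)
np.random.shuffle(aux2)
use_fixed, unuse_fixed = aux2[:nb], aux2[nb:]
print(use_quoted, use_fixed)
```

Shuffling aux instead gives a random selection and also avoids modifying the caller's array in place.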

Inconsistent decibel use

When plotting the frequency response of a filter, the power is not expressed in decibels, as was expected.
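For reference, the usual conversions are 20·log10 of the magnitude for an amplitude response and 10·log10 for a power value. A stdlib-only sketch follows; the function names are illustrative and are not part of BioSPPy.

```python
import math

def amplitude_to_db(h):
    """Gain in dB from an amplitude (magnitude) response value."""
    return 20.0 * math.log10(abs(h))

def power_to_db(p):
    """Gain in dB from a power value (magnitude squared)."""
    return 10.0 * math.log10(p)

h = 0.5
amp_db = amplitude_to_db(h)
pow_db = power_to_db(h ** 2)
print(round(amp_db, 2), round(pow_db, 2))  # both about -6.02
```

Both routes give the same figure for the same filter gain, which is a quick way to check a plot's axis.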

BVP: Allow d1 and d2 parameters to be configured

I'm struggling a little bit to understand what's going on, so it may be that I'm barking up the wrong tree.

I'm using data from physionet.org, and have found that to find any onsets in the BVP data I have to adjust d2_th (https://github.com/PIA-Group/BioSPPy/blob/master/biosppy/signals/bvp.py#L128).

Should I be needing to do this? I am using a significantly lower sample rate (125Hz), so perhaps it could be related to that?

Thanks for the great library!

Scikit-Learn cross-validation module deprecation

The cross_validation and grid_search modules were deprecated in sklearn 0.18 and replaced with the model_selection module.
