neuropsychology / neurokit

NeuroKit2: The Python Toolbox for Neurophysiological Signal Processing

Home Page: https://neuropsychology.github.io/NeuroKit

License: MIT License

Python 99.87% TeX 0.13%
python ecg eda emg eeg ppg signal physiology biosignals scr

neurokit's Introduction


The Python Toolbox for Neurophysiological Signal Processing

NeuroKit2 is a user-friendly package providing easy access to advanced biosignal processing routines. Researchers and clinicians without extensive knowledge of programming or biomedical signal processing can analyze physiological data with only two lines of code.

Quick Example

import neurokit2 as nk

# Download example data
data = nk.data("bio_eventrelated_100hz")

# Preprocess the data (filter, find peaks, etc.)
processed_data, info = nk.bio_process(ecg=data["ECG"], rsp=data["RSP"], eda=data["EDA"], sampling_rate=100)

# Compute relevant features
results = nk.bio_analyze(processed_data, sampling_rate=100)

And boom 💥 your analysis is done 😎

Download

You can download NeuroKit2 from PyPI

pip install neurokit2

or conda-forge

conda install -c conda-forge neurokit2

If you're not sure what to do, read our installation guide.

Contributing

NeuroKit2 is the most welcoming project, with a large community of contributors of all levels of programming expertise. But the package is still far from perfect! So if you have ideas for improvements or new features, or just want to learn Python and do something useful at the same time, do not hesitate to check out the following guide:

Also, if you have developed new signal processing methods or algorithms and want to increase their usage, popularity, and citations, get in touch with us about adding them to NeuroKit. A great opportunity for the users as well as for the original developers!

Have you spotted a mistake? An error in a formula or in the code? Or is there just a step that seems strange and that you don't understand? Please let us know! We are human beings, and we'll appreciate any inquiry.

Documentation

API

Tutorials

Click on the links above and check out our tutorials:

General

Examples

Don't know which tutorial is suited for your case? Follow this flowchart:

[Flowchart: which tutorial to follow]

Citation

The NeuroKit2 paper can be found here 🎉 Additionally, you can get the reference directly from Python by running:

nk.cite()

You can cite NeuroKit2 as follows:

- Makowski, D., Pham, T., Lau, Z. J., Brammer, J. C., Lespinasse, F., Pham, H.,
Schölzel, C., & Chen, S. A. (2021). NeuroKit2: A Python toolbox for neurophysiological signal processing.
Behavior Research Methods, 53(4), 1689–1696. https://doi.org/10.3758/s13428-020-01516-y

Full bibtex reference:

@article{Makowski2021neurokit,
    author = {Dominique Makowski and Tam Pham and Zen J. Lau and Jan C. Brammer and Fran{\c{c}}ois Lespinasse and Hung Pham and Christopher Schölzel and S. H. Annabel Chen},
    title = {{NeuroKit}2: A Python toolbox for neurophysiological signal processing},
    journal = {Behavior Research Methods},
    volume = {53},
    number = {4},
    pages = {1689--1696},
    publisher = {Springer Science and Business Media {LLC}},
    doi = {10.3758/s13428-020-01516-y},
    url = {https://doi.org/10.3758%2Fs13428-020-01516-y},
    year = 2021,
    month = {feb}
}

Let us know if you used NeuroKit2 in a publication! Open a new discussion (select the NK in publications category) and link the paper. The community would be happy to know about how you used it and learn about your research. We could also feature it once we have a section on the website for papers that used the software.

Physiological Data Preprocessing

Simulate physiological signals

You can easily simulate artificial ECG (also 12-Lead multichannel ECGs), PPG, RSP, EDA, and EMG signals to test your scripts and algorithms.

import numpy as np
import pandas as pd
import neurokit2 as nk

# Generate synthetic signals
ecg = nk.ecg_simulate(duration=10, heart_rate=70)
ppg = nk.ppg_simulate(duration=10, heart_rate=70)
rsp = nk.rsp_simulate(duration=10, respiratory_rate=15)
eda = nk.eda_simulate(duration=10, scr_number=3)
emg = nk.emg_simulate(duration=10, burst_number=2)

# Visualise biosignals
data = pd.DataFrame({"ECG": ecg,
                     "PPG": ppg,
                     "RSP": rsp,
                     "EDA": eda,
                     "EMG": emg})
nk.signal_plot(data, subplots=True)

Electrodermal Activity (EDA/GSR)

# Generate 10 seconds of EDA signal (recorded at 250 samples / second) with 2 SCR peaks
eda = nk.eda_simulate(duration=10, sampling_rate=250, scr_number=2, drift=0.01)

# Process it
signals, info = nk.eda_process(eda, sampling_rate=250)

# Visualise the processing
nk.eda_plot(signals, info)

Cardiac activity (ECG)

# Generate 15 seconds of ECG signal (recorded at 250 samples/second)
ecg = nk.ecg_simulate(duration=15, sampling_rate=250, heart_rate=70)

# Process it
signals, info = nk.ecg_process(ecg, sampling_rate=250)

# Visualise the processing
nk.ecg_plot(signals, info)

Respiration (RSP)

# Generate one minute of respiratory (RSP) signal (recorded at 250 samples / second)
rsp = nk.rsp_simulate(duration=60, sampling_rate=250, respiratory_rate=15)

# Process it
signals, info = nk.rsp_process(rsp, sampling_rate=250)

# Visualise the processing
nk.rsp_plot(signals, info)

Photoplethysmography (PPG/BVP)

# Generate 15 seconds of PPG signal (recorded at 250 samples/second)
ppg = nk.ppg_simulate(duration=15, sampling_rate=250, heart_rate=70)

# Process it
signals, info = nk.ppg_process(ppg, sampling_rate=250)

# Visualize the processing
nk.ppg_plot(signals, info)

Electromyography (EMG)

# Generate 10 seconds of EMG signal (recorded at 250 samples/second)
emg = nk.emg_simulate(duration=10, sampling_rate=250, burst_number=3)

# Process it
signals, info = nk.emg_process(emg, sampling_rate=250)

# Visualise the processing
nk.emg_plot(signals, info)

Electrooculography (EOG)

# Import EOG data
eog_signal = nk.data("eog_100hz")

# Process it
signals, info = nk.eog_process(eog_signal, sampling_rate=100)

# Plot
nk.eog_plot(signals, info)

Electrogastrography (EGG)

Consider helping us develop it!

Physiological Data Analysis

The analysis of physiological data usually comes in two types, event-related or interval-related.

[Figure: a physiological signal with stimulus onsets (dotted lines) and event-locked epochs (orange boxes)]

Event-related Analysis

This type of analysis refers to physiological changes immediately occurring in response to an event. For instance, physiological changes following the presentation of a stimulus (e.g., an emotional stimulus) are indicated by the dotted lines in the figure above. In this situation, the analysis is epoch-based. An epoch is a short chunk of the physiological signal (usually < 10 seconds) that is locked to a specific stimulus, and hence the physiological signals of interest are time-segmented accordingly. This is represented by the orange boxes in the figure above. In this case, using bio_analyze() will compute features like rate changes, peak characteristics, and phase characteristics.

Interval-related Analysis

This type of analysis refers to the physiological characteristics and features that occur over longer periods of time (from a few seconds to days of activity). Typical use cases are either periods of resting state, in which the activity is recorded for several minutes while the participant is at rest, or different conditions in which there is no specific time-locked event (e.g., watching movies, listening to music, engaging in physical activity, etc.). For instance, this type of analysis is used when people want to compare the physiological activity under different intensities of physical exercise, different types of movies, or different intensities of stress. To compare event-related and interval-related analysis, we can refer to the example figure above. For example, a participant might be watching a 20s-long short film where particular stimuli of interest in the movie appear at certain time points (marked by the dotted lines). While event-related analysis pertains to the segments of signals within the orange boxes (to understand the physiological changes pertaining to the appearance of stimuli), interval-related analysis can be applied on the entire 20s duration to investigate how physiology fluctuates in general. In this case, using bio_analyze() will compute features such as rate characteristics (in particular, variability metrics) and peak characteristics.
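The epoching logic described above can be sketched in plain NumPy (the signal, sampling rate, and event onsets here are made up for illustration; NeuroKit provides dedicated functions such as epochs_create() for this):

```python
import numpy as np

# Hypothetical 20-second recording at 100 Hz (values are made up for illustration)
sampling_rate = 100
signal = np.sin(np.linspace(0, 40 * np.pi, 20 * sampling_rate))

# Stimulus onsets (in samples) marking the events of interest
event_onsets = [200, 800, 1400]

# Event-related: cut a short epoch (here 3 s) time-locked to each event
epoch_length = 3 * sampling_rate
epochs = [signal[onset:onset + epoch_length] for onset in event_onsets]

# Interval-related: analyze the entire recording as one interval
whole_interval = signal

print([len(epoch) for epoch in epochs])  # → [300, 300, 300]
```

Each epoch is the same length and aligned to its stimulus, which is what makes averaging and comparing responses across events meaningful.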

Heart Rate Variability (HRV)

Check out our Heart Rate Variability in Psychology: A Review of HRV Indices and an Analysis Tutorial paper for:

  • a comprehensive review of the most up-to-date HRV indices
  • a discussion of their significance in psychological research and practices
  • a step-by-step guide for HRV analysis using NeuroKit2
You can cite the paper as follows:

- Pham, T., Lau, Z. J., Chen, S. H. A., & Makowski, D. (2021).
Heart Rate Variability in Psychology: A Review of HRV Indices and an Analysis Tutorial.
Sensors, 21(12), 3998. https://doi.org/10.3390/s21123998
  • Compute HRV indices using Python
    • Time domain: RMSSD, MeanNN, SDNN, SDSD, CVNN, etc.
    • Frequency domain: Spectral power density in various frequency bands (Ultra low/ULF, Very low/VLF, Low/LF, High/HF, Very high/VHF), Ratio of LF to HF power, Normalized LF (LFn) and HF (HFn), Log transformed HF (LnHF).
    • Nonlinear domain: Spread of RR intervals (SD1, SD2, ratio of SD2 to SD1), Cardiac Sympathetic Index (CSI), Cardiac Vagal Index (CVI), Modified CSI, Sample Entropy (SampEn).
# Download data
data = nk.data("bio_resting_8min_100hz")

# Find peaks
peaks, info = nk.ecg_peaks(data["ECG"], sampling_rate=100)

# Compute HRV indices
nk.hrv(peaks, sampling_rate=100, show=True)
>>>    HRV_RMSSD  HRV_MeanNN   HRV_SDNN  ...   HRV_CVI  HRV_CSI_Modified  HRV_SampEn
>>> 0  69.697983  696.395349  62.135891  ...  4.829101        592.095372    1.259931

Miscellaneous

ECG Delineation

  • Delineate the QRS complex of an electrocardiac signal (ECG) including P-peaks, T-peaks, as well as their onsets and offsets.
# Download data
ecg_signal = nk.data(dataset="ecg_3000hz")

# Extract R-peaks locations
_, rpeaks = nk.ecg_peaks(ecg_signal, sampling_rate=3000)

# Delineate
signal, waves = nk.ecg_delineate(ecg_signal, rpeaks, sampling_rate=3000, method="dwt", show=True, show_type='all')

Signal Processing

  • Signal processing functionalities
    • Filtering: Using different methods.
    • Detrending: Remove the baseline drift or trend.
    • Distorting: Add noise and artifacts.
# Generate original signal
original = nk.signal_simulate(duration=6, frequency=1)

# Distort the signal (add noise, linear trend, artifacts, etc.)
distorted = nk.signal_distort(original,
                              noise_amplitude=0.1,
                              noise_frequency=[5, 10, 20],
                              powerline_amplitude=0.05,
                              artifacts_amplitude=0.3,
                              artifacts_number=3,
                              linear_drift=0.5)

# Clean (filter and detrend)
cleaned = nk.signal_detrend(distorted)
cleaned = nk.signal_filter(cleaned, lowcut=0.5, highcut=1.5)

# Compare the 3 signals
plot = nk.signal_plot([original, distorted, cleaned])

Complexity (Entropy, Fractal Dimensions, ...)

  • Optimize complexity parameters (delay tau, dimension m, tolerance r)
# Generate signal
signal = nk.signal_simulate(frequency=[1, 3], noise=0.01, sampling_rate=200)

# Find optimal time delay, embedding dimension, and r
parameters = nk.complexity_optimize(signal, show=True)

  • Compute complexity features
    • Entropy: Sample Entropy (SampEn), Approximate Entropy (ApEn), Fuzzy Entropy (FuzzEn), Multiscale Entropy (MSE), Shannon Entropy (ShEn)
    • Fractal dimensions: Correlation Dimension D2, ...
    • Detrended Fluctuation Analysis
nk.entropy_sample(signal)
nk.entropy_approximate(signal)

Signal Decomposition

# Create complex signal
signal = nk.signal_simulate(duration=10, frequency=1)  # High freq
signal += 3 * nk.signal_simulate(duration=10, frequency=3)  # Higher freq
signal += 3 * np.linspace(0, 2, len(signal))  # Add baseline and linear trend
signal += 2 * nk.signal_simulate(duration=10, frequency=0.1, noise=0)  # Non-linear trend
signal += np.random.normal(0, 0.02, len(signal))  # Add noise

# Decompose signal using Empirical Mode Decomposition (EMD)
components = nk.signal_decompose(signal, method='emd')
nk.signal_plot(components)  # Visualize components

# Recompose merging correlated components
recomposed = nk.signal_recompose(components, threshold=0.99)
nk.signal_plot(recomposed)  # Visualize components

Signal Power Spectrum Density (PSD)

# Generate complex signal
signal = nk.signal_simulate(duration=20, frequency=[0.5, 5, 10, 15], amplitude=[2, 1.5, 0.5, 0.3], noise=0.025)

# Get the PSD using different methods
welch = nk.signal_psd(signal, method="welch", min_frequency=1, max_frequency=20, show=True)
multitaper = nk.signal_psd(signal, method="multitapers", max_frequency=20, show=True)
lomb = nk.signal_psd(signal, method="lomb", min_frequency=1, max_frequency=20, show=True)
burg = nk.signal_psd(signal, method="burg", min_frequency=1, max_frequency=20, order=10, show=True)

Statistics

  • Highest Density Interval (HDI)
x = np.random.normal(loc=0, scale=1, size=100000)

ci_min, ci_max = nk.hdi(x, ci=0.95, show=True)

Popularity

NeuroKit2 is one of the most welcoming packages for new contributors and users, as well as the fastest-growing package. So stop hesitating and hop on board 🤗

Used at

NTU (Nanyang Technological University), Université de Paris, Duke University, University of Auckland, University of Pittsburgh, University of Washington

Disclaimer

The authors do not provide any warranty. If this software causes your keyboard to blow up, your brain to liquefy, your toilet to clog or a zombie plague to break loose, the authors CANNOT IN ANY WAY be held responsible.

neurokit's People

Contributors

anshu-97, awwong1, danibene, deranderejohannes, dominiquemakowski, duylp, elaineteo2000, hungpham2511, jancbrammer, jonasemrich, jukka, levsilva, lrydin, max-ziliang, mperreir, mscheltienne, nattapong-oneplanet, peterhcharlton, pierreelias, purpl3f0x, raimonpv, richrobe, rostro36, sangfrois, sokolmarek, sourcery-ai-bot, stasinos, tam-pham, tiagotostas, zen-juen


neurokit's Issues

Basic EMG processing

Add a basic EMG processing pipeline:

  • emg_clean()
  • emg_amplitude()
  • emg_process()
  • emg_plot()

For an initial implementation, most of it can be taken from neurokit1

signal_rate()

@JohnDoenut There seems to be a lot of overlap between rsp_rate and ecg_rate; maybe we could put their common basis in a generic signal_rate() utility function?
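The common basis of such a generic rate function essentially boils down to converting inter-peak intervals into a rate (a minimal sketch of the idea, not the actual NeuroKit implementation):

```python
import numpy as np

def generic_rate(peaks, sampling_rate=1000):
    """Convert peak indices (in samples) into an instantaneous rate (per minute)."""
    periods = np.diff(peaks) / sampling_rate  # inter-peak intervals in seconds
    return 60 / periods                       # events per minute

# A steady 60 bpm signal: one peak every 1000 samples at 1000 Hz
peaks = np.array([0, 1000, 2000, 3000])
print(generic_rate(peaks))  # → [60. 60. 60.]
```

The same logic applies whether the peaks are R-peaks (ECG) or breathing peaks (RSP), which is what makes the factoring proposed here attractive.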

ECG simulation with ecgsyn

  • ecgsyn: Better simulation of the QRS complex: there is a new algorithm implemented in different languages, including matlab code here.

I started translating the code here, but the results are not what they should look like. IMO something goes wrong around here.

Processing ECG leads

Hello, I have 3 ECG leads (different signals) loaded into a pandas DataFrame.

How can I process them, and what can I calculate with them using NeuroKit?

Scope and Dependencies?

Dear all,

Dependencies

Neurokit 1 wrapped a lot of functionality of BioSPPy and MNE. Do we want to continue these dependencies in Neurokit 2 or implement our own (all this under the assumption that fewer dependencies are a good thing)?

Scope

I feel like including functionality for EEG/MEG is quite ambitious, especially since MNE already covers a lot of ground in this area. What would be our addition to MNE's functionality?
Alternatively, couldn't Neurokit 2 cover all biosignals that aren't EEG/MEG? That is, doing ECG, PPG, respiration etc. really well, making Neurokit 2 the go-to place for these signals while leaving EEG/MEG to MNE?

Curious about everyone's thoughts!

Easily switchable default settings

So the way it takes shape currently, users that want to fine-tune their processing parameters can do so easily by calling directly the subfunctions *_clean(), *_rate() etc. They could even build their own custom *_customprocess() function.

Otherwise, people would use the high-level *_process() functions, which offer a quick and easy way of getting results with good and sensible default parameters.

However, these default parameters can differ from guidelines to guidelines and from package to package. Thus, it would be interesting to have easily switchable default settings. For instance, rsp_process() could have defaults="khodadad2018" (the current implementation), which could be switched to, for example, defaults="biosppy" to get the same behaviour as biosppy (bandpass filter at [0.1, 0.35], no detrending, etc.). This way, people could easily try different sets of settings on their data and also be able to reproduce/validate/test existing and old methods.

@JohnDoenut does that make sense to you?
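Such switchable presets could be implemented as simple dictionaries of parameters dispatched on the defaults argument. A hypothetical sketch; only the biosppy values come from this issue, and the khodadad2018 entry uses placeholders:

```python
# Hypothetical preset registry for rsp_process()-like functions.
# Only the biosppy values are taken from this issue; khodadad2018 uses placeholders.
PRESETS = {
    "khodadad2018": {"lowcut": None, "highcut": None, "detrend": True},
    "biosppy": {"lowcut": 0.1, "highcut": 0.35, "detrend": False},
}

def get_settings(defaults="khodadad2018", **overrides):
    """Fetch a named preset, letting the user override individual parameters."""
    settings = dict(PRESETS[defaults])
    settings.update(overrides)
    return settings

print(get_settings("biosppy"))                # bandpass [0.1, 0.35], no detrending
print(get_settings("biosppy", detrend=True))  # same preset with one override
```

Keeping presets as plain data makes it easy to add new guideline sets and to document exactly what each named method does.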

method="engzeemod2012" does not work for ecg_process

I tried:

ecg_signal = nk.ecg_simulate(duration=30, sampling_rate=250)

signals, info = nk.ecg_process(ecg_signal, method="engzeemod2012")

and it doesn't seem to work.

Was given the error of

  File "C:\Users\Pham Thanh Tam\Desktop\WPy-3710b\python-3.7.1.amd64\lib\site-packages\neurokit2\signal\signal_formatpeaks.py", line 35, in _signal_from_indices
    if isinstance(indices[0], np.float):

IndexError: list index out of range

Import a package in tests that is not a dependency

In one of the test files, I'd like to import a package to test our solution against it:

# from pyentrp import entropy as pyentrp

Problem

However, this package is not supposed to be a dependency of neurokit. Thus, I added it to the setup.py:

test_requirements = requirements + ['pytest', 'coverage', 'pyentrp']

To the tox file:

pyentrp

And to the pipfile:

pyentrp = "*"

But still, if I import the package in testing, travis fails, saying that "the module couldn't be found" 😠 Any ideas?

@hungpham2511 @Tam-Pham

*_clean: add sanity checks

Following this issue: #100, it might be good to add more checks about the size and shape of the input to throw more informative error messages to the clean functions, which are usually the point of entry of the user input.
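A check at the point of entry of the *_clean() functions could look like this (a hypothetical sketch; the function name and error wording are illustrative, not the actual NeuroKit code):

```python
import numpy as np

def _sanitize_input(signal):
    """Validate user input before cleaning, raising informative errors."""
    signal = np.asarray(signal, dtype=float).squeeze()
    if signal.ndim != 1:
        raise ValueError(
            "NeuroKit error: expected a 1-dimensional signal, got shape "
            f"{signal.shape}. Did you pass a whole DataFrame instead of a column?"
        )
    if len(signal) == 0:
        raise ValueError("NeuroKit error: the input signal is empty.")
    return signal

print(_sanitize_input([[1.0, 2.0, 3.0]]))  # squeezed to 1-D: [1. 2. 3.]
```

Raising a message that names the likely mistake (e.g., passing a DataFrame instead of a column) is what makes the error informative rather than a bare shape mismatch.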

Publish NeuroKit

The NeuroKit2 project grew (and keeps growing) faster than expected, thanks to the involvement and contributions of talented people. In fact, it already contains more lines of code than its older brother, and has better documentation. And it can only get better 🎉

We are reaching a stage where the package is maintained, usable, useful, and documented, and we can now think about working toward publishing the software in a few months (end of spring?). The question is where to publish it.

Here's a list of potential journals (feel free to share other journals or any experience):


Existing packages: hrv (JOSS), HeartPy (JORS), pyphysio (SoftwareX), unfold (PERJ), pygaze (Behaviour Research Methods), FieldTrip (Computational Intelligence and Neuroscience), MNE (NeuroImage, Front. Neurosci), eeglab (Journal of Neuroscience Methods)

Indices of complexity, entropy and fractal dimension

This is a summary of the progress of implementing a comprehensive framework for complexity analysis. The goal is to carefully reimplement all indices so that they are efficient, validated, and tested.

Measures

Entropy

Fractal Dimension

Other

Helpers

References

ecg_plot(): add all heart beats

We could also add, either below or on the side (related to the example on how to "manually" get this plot, #87), a plot with all the heartbeats plotted on top of each other (with markers on the different waves). Related to delineation (#89).
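Stacking heartbeats on top of each other essentially amounts to epoching the signal around each R-peak. A minimal NumPy sketch of the idea, independent of NeuroKit's actual plotting code (the window lengths are illustrative):

```python
import numpy as np

def stack_heartbeats(ecg, rpeaks, sampling_rate=1000, before=0.3, after=0.4):
    """Extract a fixed window around each R-peak so beats can be overlaid."""
    pre, post = int(before * sampling_rate), int(after * sampling_rate)
    beats = [ecg[r - pre:r + post]
             for r in rpeaks
             if r - pre >= 0 and r + post <= len(ecg)]
    return np.array(beats)  # shape: (n_beats, pre + post)

ecg = np.random.randn(5000)  # stand-in for a real ECG signal
beats = stack_heartbeats(ecg, rpeaks=[1000, 2000, 3000], sampling_rate=1000)
print(beats.shape)  # (3, 700)
```

Each row of the resulting array is one beat aligned on its R-peak, so plotting the rows on the same axes gives the overlaid-heartbeats view discussed here.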

Improve EDA/SCR plot

  • Include (half) recovery-time
  • Include a vertical segment below the peak
  • Dashed horizontal lines between this segment and the onset, and the recovery time

Examples

Implementing the new method for EDA decomposition by Amin (2019)

Simulate synthetic physiological signals (ECG, RSP, EDA/SCR)

I want to add functions to simulate synthetic bio signals (e.g., ECG, RSP and EDA), which would be convenient for documentation, tests and all. I found some useful materials and implementations, but they are all in MATLAB... I'd like to translate them to Python, but I don't know MATLAB well enough... Anybody knows 😅? Or knows someone who does to give it a look?

ECG

  • Basic algorithm
  • Better simulation of the QRS complex: there is a new algorithm implemented in different languages, including matlab. The code doesn't look too complicated to translate, but I don't know where to start (@suklamaa)

RSP

EDA

  • I found Bach's (2010) paper which attempts to model the canonical SCR. The mathematical definition is in Appendix A, and some of the pieces are apparently implemented in matlab (here and here)

EMG

PPG

Banerjee, R., Ghose, A., Choudhury, A. D., Sinha, A., & Pal, A. (2015, April). Noise cleaning and Gaussian modeling of smart phone photoplethysmogram to improve blood pressure estimation. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 967-971). IEEE.

EDA decomposition: SCRanalyze and Ledalab

Currently, we have cvxEDA and Acqknowledge as methods for EDA decomposition, and plan to have the update by Amin (2019) #36

For the sake of completeness, it would be good to have the two others, namely Ledalab and SCRanalyze (the latter being deemed superior by Bach, 2014, and thus a priority).

Existing implementations:

  • pyphysio (Ledalab - python adaptation)
  • Pypsy (Ledalab - python adaptation)
  • PsPM (SCRanalyze - original matlab code)

Add EMG minimal example to the README

@zen-juen could you add a minimal EMG example (below the others) to the README? you have to 1) edit the README.rst and 2) add the code in README_examples.py and then run it to create and save the figures.

Note that you have to run it with the updated package. For that, I usually open my try_stuff.py file at the root of the repo and run it (so it loads the local version of the pkg). Then, with the same console, I comment out the import neurokit2 as nk line in README_examples.py and run this file (so it runs but does not re-load neurokit). Finally, I uncomment the import line (otherwise, codeclimate complains).

Neurokit2: Call for help

The original neurokit package started as a small project as I was learning python, but ended up having far more success than I originally expected, far beyond its quality of implementation.

Thus, I think it's better to restart the project from scratch, in a more open and collaborative manner, to create the top toolbox for neurophysiological data processing. If you're interested in such a project, whether it is because you need it, because you want to learn python, or because you need published papers or technical achievements to add to your CV, you're very welcome to hop aboard! Now is the right time; just let us know with a 👍 reaction or a response to this thread, mentioning which features you would like to see, or how you could eventually help!

@hcp4715 @sangfrois @Tam-Pham @zen-juen

Docs: Tutorials/Example Ideas (and how to contribute)

Examples ideas

Examples are short, specific and self-contained articles about specific topics or possibilities.

  • Understanding NeuroKit: how to see what a function does in the docs, then its code on github, then where is the code located on your machine, and where you can try to make changes/fixes
  • How to contribute: once some changes/fixes are made, how to push them back to github. With screenshots etc. (all steps from when github desktop is installed to PR)
  • How to help with the documentation: Pretty much what I wrote below, + also different types of documentation (i.e., examples, docstrings, etc)
  • Building a custom processing pipeline: Where to see how a pipeline is made, and how to build a new function with a custom routine
  • Get familiar with Python in 10min: Crash course to python presenting the basics useful for using neurokit
  • How to record ECG: Documenting how to position electrodes etc.
  • Heart rate and artefacts detection: About the peaks artefacts detection, visualization etc. (@JohnDoenut do you have some ideas?)
  • Event-related analysis: there is a basis in NK1's doc
  • Interval-related analysis: how to do get physiological features for longer periods (example with resting state)
  • How to create epochs: How to create epochs of data, either from one signal (showing events detection, and documenting the epochs_create function) or from multiple data (when people have different files for each subject), how to read them in a loop and put them in a dict and format it as epochs so it can be further analyzed.
  • Extract and visualize heartbeats (QRS): Check if R peaks correctly detected, and adjust the method if need be. Then, extract and plot all R peaks and QRS complexes.
  • Find average respiration rate: Get average heart rate and RSP rate (e.g., https://python-heart-rate-analysis-toolkit.readthedocs.io/en/latest/heartrateanalysis.html)
  • Heart Rate Variability (HRV): once we have HRV, write about cardiac coherence (what it is, what it isn't, how to estimate it) (inspiration)
  • Electrodermal Activity (EDA) analysis: phasic, tonic, SCR's features etc. (https://imotions.com/blog/skin-conductance-response/; https://support.mindwaretech.com/2018/04/all-about-eda-part-2-components-of-skin-conductance/)
  • Working with non-human data: Showing another applied example of "building a custom pipeline" (#104)
  • Derive respiration from ECG: ECG-Derived Respiration (EDR).
  • Respiration Rate Variability (RRV): Small example showing how to compute RRV indices. (@zen-juen). With a "hint box" saying "what if I don't have respiration data?" and linking to the above example on EDR.
  • Simulate Physiological Signals using Python: Showcasing simulation function and their parameters.
  • ...

How to contribute

How to write

The documentation that is on the website is automatically built by the hosting website, readthedocs, from reStructured Text (RST) files (a syntax similar to markdown) or from jupyter notebooks (.ipynb). Notebooks are preferred if your example contains code and images.

Where to add the files

These documentation files that we need to write are located in the /docs/ folder. For instance, if you want to add an example, you need to create a new file, for instance myexample.rst, in the docs/examples/ folder.

If you want to add images to an .rst file, best is to put them in the /docs/img/ folder and to reference their link.

However, in order for this file to be easily accessible from the website, you also need to add it to the table of content located in the index file (just add the name of the file without the extension).

Examples of good documentation

Multiscale Entropy (MSE)

I am trying to add MSE. Current python implementations that I found are:

Unfortunately, they give different results, but I'm not sure if this is reliable as they also give a different result for sample entropy 🤔

As the implementation seems fairly straightforward and heavily depends on sample entropy, I was wondering detective @CSchoel if you could take a quick look to know if it made sense to you 🕵 ☺️

here's the code:

max_scale = len(signal)  # Set to max
# Initialize mse vector
mse = np.zeros(max_scale)
for i in range(max_scale):
    temp = _entropy_multiscale_granularizesignal(signal, i + 1)
    if len(temp) >= 4:
        mse[i] = entropy_sample(temp, order, r)

thanks 🙌

bio_plot(): what would it look like?

Now that bio_process() exists, should we add a bio_plot()? What would it look like? We could plot all the different "signal plots" (from ecg_plot, eda_plot, etc.) side by side?

Error in calculation of breathing amplitude in rsp_rate()

# Add amplitude if troughs are available
if troughs is not None:
    # TODO: normalize amplitude?
    amplitude = peaks - troughs

We currently return breathing period instead of breathing amplitude from rsp_rate(). Also visible here (compare the y-axis scale of the first and last plot and also note how the second and last plot contain very similar information).

https://github.com/neuropsychology/NeuroKit/blob/master/docs/img/README_respiration.png?raw=true

Fixing this requires a bit of refactoring, since the calculation of breathing amplitude requires both the raw breathing signal and the breathing extrema. Currently, we only have access to the extrema in rsp_rate().

We could return the signal value at the extrema alongside the extrema in rsp_findpeaks() in order to fix this. @DominiqueMakowski, what are your thoughts?
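Once the raw signal is available, the amplitude could be computed from the signal values at the extrema rather than from their indices. A sketch of the proposed fix (function name is illustrative):

```python
import numpy as np

def rsp_amplitude(signal, peak_indices, trough_indices):
    """Breathing amplitude: signal value at each peak minus the preceding trough."""
    signal = np.asarray(signal)
    return signal[peak_indices] - signal[trough_indices]

# Toy signal: troughs at indices 0 and 2, peaks at indices 1 and 3
signal = np.array([0.0, 1.0, 0.2, 1.5, 0.1])
print(rsp_amplitude(signal, peak_indices=[1, 3], trough_indices=[0, 2]))
```

Note the difference from the buggy snippet above: subtracting the *indices* (peaks - troughs) gives the period between extrema in samples, whereas subtracting the *signal values* at those indices gives the actual amplitude.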

Resources: ECG processing and features

Add different R-peaks detectors

I think it would be useful to support different ECG segmenters and R-peaks detection algorithms (useful also for comparing them to base ourselves on the best).

@marianpetruk based on your repo showing a comparison between different detectors, would you be interested in eventually copying and documenting some of them to neurokit2?

# =============================================================================
# Pantompkins
# =============================================================================
def _ecg_findpeaks_pantompkins():
    raise ValueError("NeuroKit error: ecg_findpeaks(): pantompkins 'method' "
                     "is not implemented yet.")

Aside from biosppy, many of them are also implemented in py-ecg-detectors (so we can cross-validate our implementations)

Progress

See also

EMG threshold methods: emg_activation()

The default threshold of emg_activation() is specified as a tenth of the standard deviation of the amplitude signal. However, there should be other methods of finding the appropriate threshold, one of which is implemented in BioSPPy's find_onsets().

Give a shout if there are other implementations of this!
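The current default described above is straightforward to express (a minimal sketch; the actual emg_activation() implementation may differ in details, and the function name here is illustrative):

```python
import numpy as np

def emg_activation_threshold(amplitude, threshold_ratio=0.1):
    """Binary activation: amplitude above a tenth of its standard deviation."""
    threshold = threshold_ratio * np.std(amplitude)
    return amplitude > threshold

# Toy amplitude envelope with one clear burst (samples 3 and 4 are active)
amplitude = np.array([0.01, 0.02, 1.0, 1.2, 0.015])
print(emg_activation_threshold(amplitude))  # activations at the burst samples only
```

Alternative methods would replace the threshold line, e.g., with an adaptive criterion like the one used by BioSPPy's find_onsets().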

First step: setup the repository (pytest)

The first step is to set up the repository in a clean manner. I have been struggling quite a bit since yesterday, and one of the steps I am having trouble with is testing with pytest.

It seems that codecov (https://codecov.io/gh/neuropsychology/NeuroKit), as triggered by travis, does not run through the (minimal) test (see here). It says No coverage report found 😕 It is the first time I use pytest, and I have no experience with it. Any idea what might be wrong?

Design

Alright folks, now that we started to add some functions and all, the time has come to talk design 😏

One very good point was brought up by @JohnDoenut, related to their great implementation of rsp processing (btw, thanks again for the effort you put into documenting the code, it was a joy to read 😌)

I guess the discussion ultimately comes down to how we want the API to look. I prefer giving the user more granularity in their choice of what they want to get out of the signal. I.e., they instantiate the breathing object (Rsp) and can then choose to return only those attributes that they're interested in. E.g., a user might only want to look at the signal, or only have the extrema returned. This would be in contrast to a rsp_process() function that returns all attributes at once in a nested dict, as was the case in Nk1. The latter feels less intuitive to me.

Basically this boils down to the question of whether we want most of the API to be functions or classes. Your comment made me think a lot; here are my current thoughts (they might be subject to change though 😁)

When I started python, I didn't understand classes (as I had no background in programming whatsoever). As I didn't understand them, I didn't like them and didn't use them. Eventually, I managed to understand them, and it felt so powerful and more like "legit programming" (it looked more sophisticated). As a result... I started using them everywhere. Now, after having widened my experience in other languages, I feel like I tend to come back to functions... besides intuition and feeling, here are some arguments that crossed my mind:

IMO functions are more intuitive, especially for new users without much programming background, in the sense that you apply a function on an input and you get an output, y = f(x), as opposed to creating an entity that both contains information and also functions to apply on itself.

Importantly, functions are more maintainable, meaning easier to understand, debug, test, and contribute to, due to their self-contained nature and because they can be nicely organised in different files or subfunctions. For example, incredible packages like mne for EEG are quite tough to understand and contribute to if you're not a core dev, having their powerful (and complex) classes defined in verrrry long files (with a lot of internal methods and all).

All in all, I feel like this preference for functions or classes has strong interindividual variability, depending on our background and experience, and that in the end both are equivalent in terms of functionality. Most importantly, please do challenge my opinion and try to change my mind! Here's how the two designs could look (these are examples of names and functions):

class based API

# Create object
rsp_object = nk.Rsp(signal)  

# Call methods
rsp_object.stats()
rsp_object.findpeaks()
rsp_object.plot()

function based API

I think that part of the success of nk1 was due to the ease with which you get results. Anybody, without any understanding of ECG, could run process_ecg() and get tons and tons of stuff out (this famous dict with all the values). More than most of us understood, which had the benefit of making you happy (who doesn't like having some result values) and hopefully motivated you to learn more about the values that are spat out.

While I don't necessarily want to reproduce nk1's design, having this "master" function that returns a lot of stuff is appealing, especially as a point of entry for new users. However, if something were to be changed, it would be the way this master function is defined. Indeed, I'd like nk2 to be more flexible and easier to use for advanced users with personalised flows. This means that the different steps that ecg_process does (preprocessing, segmentation, peak finding, rate computation, feature extraction, etc.) should be available and accessible independently and directly. So while a user would typically start by using the master function, they could then take more control over their pipeline with the individual functions as they gain expertise. Here's how it could look:

Beginner

# Get results
results = nk.rsp_process(signal)

# Extract results
results["Stats"]
results["Peaks"]
nk.rsp_plot(results)

Advanced

As rsp_process would be a high-level function calling other functions, an advanced user who desires more control over the processing flow would be able to call the other subfunctions separately.

filtered = nk.rsp_prepare(signal)
peaks = nk.rsp_findpeaks(filtered)
stats = nk.rsp_stats(peaks)

Critically, having the subfunctions individualised would also make them easier to document (as we could have specific documentation for subfunctions and refer to them from the master function, instead of having one big docstring), debug, and improve. I might be repeating myself a lot here, but this aspect cannot be neglected if we want our package to have high standards in terms of quality and safety (and we want to be "professional" 😁), as it was ultimately what pushed me to reimplement nk2 instead of trying to improve nk1.

Let me know your thoughts!!

smarter bio_process()

if the first argument is a dataframe, try to recover the relevant signals from its columns (ecg will be the column that has "ECG" or "EKG" in the name, etc.)
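The column-guessing idea could be sketched like this (the alias table and helper name are illustrative assumptions, not bio_process's actual logic):

```python
import pandas as pd

def guess_signals(df):
    # Hypothetical name-matching rules: a signal is recovered from the first
    # column whose name contains one of its aliases (case-insensitive)
    aliases = {"ecg": ("ECG", "EKG"), "rsp": ("RSP", "RESP"), "eda": ("EDA", "GSR")}
    found = {}
    for signal, keys in aliases.items():
        for column in df.columns:
            if any(key in column.upper() for key in keys):
                found[signal] = df[column]
                break
    return found

df = pd.DataFrame({"EKG_1": [0.1, 0.2], "Respiration": [1.0, 1.1], "Trigger": [0, 1]})
signals = guess_signals(df)  # recovers "ecg" (from "EKG_1") and "rsp" (from "Respiration")
```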

QRS delineation algorithms

Progress

Resources

Simulation study to identify the best preprocessing pipeline for RSP and ECG

In the meantime, we could already think about a testing framework akin to wfdb, maybe using ecg_simulate() instead of a database such as MITDB or GUDB. In any case, it would be nice to have some principled way of comparing different detectors without depending on third-party databases.

@JohnDoenut Related to that: last week I got hyped about the simulation functions and the preprocessing tools that were added. I was also looking at the different sets of defaults (#48) existing in other software, and I was once again struck by the discrepancies in the suggested preprocessing parameters, i.e., what type of filter, what order, what frequency bands to filter, detrending before filtering or after, what type of detrending, etc. etc.

Although this lack of standard, well-established, and validated routines is common in neurophysiology (and even worse in EEG), it has always annoyed me a bit not to have some kind of demonstration showing that a Butterworth 2nd-order filter is better than a Bessel 9th-order one with linear detrending.

I thought, well, we have functions to simulate signals, as well as functions to distort them in a controlled fashion (see signal_distord(), which can now also add artefacts and powerline noise). And (at work) I have access to a server that can run some code for an extended period of time without burning up (in theory).

So why not run a brute-force simulation study to test different preprocessing parameters in a (very) large number of situations?

And from there I started struggling a lot with the simulation code, at first by trying all combinations of preprocessing parameters, but it was just too much. So I then tried to narrow it down, starting with the filtering parameters (for RSP only).

Basically, the idea is to iteratively generate a large number of signals (of different possible rates for healthy humans), then distort them by adding a random amount of noise at random frequencies, then clean them using given parameters, and 1) compare the cleaned signal to the original signal and 2) compare the rate extracted from the cleaned signal to the rate extracted from the original signal.
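The loop could be sketched like this, using simple numpy stand-ins in place of rsp_simulate(), signal_distord(), and rsp_clean() (all parameters and helper names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
sampling_rate = 100
t = np.arange(0, 30, 1 / sampling_rate)

def simulate_rsp(rate):
    # Stand-in for nk.rsp_simulate(): a sinusoid at `rate` breaths/min
    return np.sin(2 * np.pi * (rate / 60) * t)

def distort(signal, noise_amplitude):
    # Stand-in for nk.signal_distord(): additive Gaussian noise
    return signal + noise_amplitude * rng.standard_normal(len(signal))

def clean(signal, window=25):
    # Stand-in for nk.rsp_clean(): a simple moving-average filter
    return np.convolve(signal, np.ones(window) / window, mode="same")

scores = []
for rate in (10, 15, 20):  # plausible rates for healthy humans
    original = simulate_rsp(rate)
    noisy = distort(original, noise_amplitude=0.5)
    cleaned = clean(noisy)
    # Compare noisy and cleaned signals against the ground truth
    scores.append((np.corrcoef(original, noisy)[0, 1],
                   np.corrcoef(original, cleaned)[0, 1]))
```

In the actual study, the `clean()` step would be swapped for each candidate set of filtering parameters, and a rate-estimation comparison would be added on top of the signal-similarity score.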

Then, basically, look at the best set of parameters. This morning I tested the simulation code from home by generating - just to see if it works - 100 iterations (the plots are shown on the README). Next week I can try running a lot more.

But in general, the goal would be to have a working framework for testing stuff. Moreover, aside from the theoretical justifications of the parameters, having some visual "evidence" would be quite useful for the users and could make for a straightforward but appreciated study.

What do you think?

Example data

I'd say that a start would be to 1) implement some data-reading function(s) [we can start with read_acqknowledge from NeuroKit1] and 2) upload some test data, which we will be able to use as a test case, and also for documentation and tutorials.

For 1), we could also think about a wrapper around the wfdb reader to do tests on data from https://physionet.org/

For 2), I will probably be able to record and upload some data using Biopac and the Empatica wristband, for a simple paradigm (e.g., neutral vs. emotional pictures), in the next few weeks.

@hcp4715 @sangfrois may I know if you have any particular recording device that you used/use/might use?

rsp_clean run time and memory error

New user here, trying to use this package with my rodent whole-body plethysmography data. The processing methods I've used previously were rudimentary and written entirely on my own, and your package provides much more robust methods.

The issue

I'm not sure if this is just a problem on my end, but when running this:

rsp_cleaned = nk.rsp_clean(raw_signal_vals[:20000], sampling_rate=5000, method="khodadad")

the command has quite a long run time (>10 min on an i7 processor w/ 16GB RAM), and the signal I passed is just a fraction of the actual file (118,000,000 samples, about 6.5 hours of recording at 5 kHz). When I run it on the whole signal, the kernel crashes due to a memory error. The traceback shows it failing during the detrending step. Is my data simply too much for these methods to handle, or is there something I can do to help here?

*Note: The signal is imported from CED Spike2 files using NeoIO and comes out as a NumPy array.
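If downsampling is acceptable, one workaround (an assumption, untested on this dataset) is to decimate the 5 kHz recording before cleaning, since respiration contains little energy above a few Hz. Here sketched with scipy and a stand-in sinusoid in place of the real recording:

```python
import numpy as np
import scipy.signal

sampling_rate = 5000
t = np.arange(0, 10, 1 / sampling_rate)
raw = np.sin(2 * np.pi * 1.5 * t)  # stand-in for a ~90 cpm breathing trace

# Anti-aliased decimation 5 kHz -> 100 Hz, in two stages (factor 10, then 5),
# since scipy recommends keeping each decimation factor small
downsampled = scipy.signal.decimate(scipy.signal.decimate(raw, 10), 5)
```

After decimating to 100 Hz, rsp_clean(downsampled, sampling_rate=100) would have 50x fewer samples to detrend, which may avoid the memory error.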

Access data folder from testing

In tests, we will need to access the data stored in the data folder. Unfortunately, it seems that on Travis, the following way of accessing it doesn't work (it says there is no such folder):

# df, sampling_rate = nk.read_acqknowledge("../data/example_acqknowledge", sampling_rate=2000)

Any idea how to access (or upload) the data folder so that accessing it works in travis?

savior @hungpham2511, any thoughts 😅 ?
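One common fix (an assumption, not verified on this CI setup) is to resolve the data folder relative to the test file itself rather than the working directory, which differs on Travis. A small hypothetical helper:

```python
import os

def data_path(test_file, filename):
    # Resolve the data folder relative to the test file, not the working
    # directory. `test_file` would normally be the test module's __file__.
    tests_dir = os.path.dirname(os.path.abspath(test_file))
    return os.path.join(tests_dir, "..", "data", filename)
```

A test could then call `data_path(__file__, "example_acqknowledge")` instead of hard-coding a relative path like "../data/example_acqknowledge".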

signal_erp(): a basis for event related analysis

So now that the processing of raw signals is getting into shape, the last piece of the short-term puzzle is the implementation of functions to facilitate event-related analysis, meaning the extraction of relevant features for each event (obtained, for instance, via epochs_create()).

Before going into the implementation of ecg_erp, eda_erp etc., we could have a generic basis in signal_erp() that can extract from epochs things like maximum, minimum, time of max, time of min, presence of peak or not, etc. etc.
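Such a generic extractor could be sketched like this (the function name and feature keys are hypothetical, not a final API):

```python
import numpy as np

def signal_erp_sketch(epoch, sampling_rate=100):
    # Extract simple event-related features from one epoch:
    # the extrema and the times (in seconds) at which they occur
    epoch = np.asarray(epoch, dtype=float)
    return {
        "Max": float(np.max(epoch)),
        "Min": float(np.min(epoch)),
        "Time_Max": float(np.argmax(epoch) / sampling_rate),
        "Time_Min": float(np.argmin(epoch) / sampling_rate),
    }
```

Signal-specific functions (ecg, eda, ...) could then build on this common basis and add their own features.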

What would the most appropriate name for this set of functions be? _erp()? _eventrelated()? _events()?

Discrepancy in sample entropy (SampEn) between two python implementations (nolds and entro-py); which one is correct?

Trying to add Sample Entropy (SampEn). Found two (working) Python implementations:

I added and adapted the former here (currently on dev) for quick testing, but unfortunately, they give slightly different results:

import numpy as np
import neurokit2 as nk
import nolds

signal = np.cos(np.linspace(start=0, stop=30, num=100))

nk.entropy_sample(signal, order=2, r=0.2)
# 0.26809106297461127
nolds.sampen(signal, emb_dim=2, tolerance=0.2)
# 0.2767230586620615

Unfortunately, I wasn't able to trace down where the discrepancy appears, and I don't know which implementation is the "correct" one, as I wasn't able to compare them with other implementations.

If someone has a hint on where/why/what is the correct solution, do not hesitate ☺️
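For cross-checking, a minimal SampEn implementation could look like this. It is a sketch: the conventions chosen here (Chebyshev distance, absolute tolerance r, self-matches excluded, equal template counts for m and m+1) are exactly the kind of choices on which implementations differ, which is the likely source of small discrepancies like the one above:

```python
import numpy as np

def sampen_sketch(signal, m=2, r=0.2):
    # Note: some implementations scale r by the signal's SD; this one
    # treats it as an absolute tolerance, like nolds does by default.
    x = np.asarray(signal, dtype=float)
    n = len(x)

    def count_matches(templates):
        total = 0
        for i in range(len(templates)):
            # Chebyshev (max-coordinate) distance to all other templates
            distances = np.max(np.abs(templates - templates[i]), axis=1)
            total += np.sum(distances <= r) - 1  # exclude the self-match
        return total

    templates_m = np.array([x[i:i + m] for i in range(n - m)])
    templates_m1 = np.array([x[i:i + m + 1] for i in range(n - m)])
    return -np.log(count_matches(templates_m1) / count_matches(templates_m))
```

Comparing this against both nolds and entro-py on the same signal, while toggling the tolerance scaling and the template-count convention, should reveal which choice each package makes.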

Other implementations:

rsp_clean for high frequency (rodent) respiration

High-frequency breathing, like that done by rodents during sniffing, is flattened out by the filters used in both cleaning methods.
This may be outside the scope of this package, but if we could extend the rsp_clean function to handle a wider range of respiratory frequencies, it would be very useful to my lab. I'd love to contribute here if you're interested.

Below is an example of the Khodadad filter flattening out the high-frequency sniffing respiratory cycles of a rat. Rats usually breathe in the range of 85-100 cpm, but during sniffing this can get as high as 600 cpm. A very similar, in fact more drastic, effect is seen with the biosppy filter.

[Screenshot (2020-01-24): the Khodadad filter flattening the high-frequency sniffing cycles of a rat recording]

How could we do it?
I'm still looking into the details of how these filters work, but perhaps we could pass an optional boolean parameter to the clean function (e.g. rodent=True or something) that changes the highcut value in signal_filter from the default of 2 to something more appropriate for the animal (~10). On that note, we could even extend this to a range of animals and have default highcut and lowcut values for a range of typical model organisms.
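The effect of the highcut value can be illustrated with a scipy stand-in for the low-pass stage (illustrative parameters, not signal_filter's actual internals):

```python
import numpy as np
import scipy.signal

sampling_rate = 1000
t = np.arange(0, 5, 1 / sampling_rate)
sniffing = np.sin(2 * np.pi * 6 * t)  # stand-in for a ~360 cpm sniffing component

def lowpass(signal, highcut, sampling_rate, order=2):
    # Stand-in for the low-pass stage of signal_filter()
    b, a = scipy.signal.butter(order, highcut / (sampling_rate / 2), btype="low")
    return scipy.signal.filtfilt(b, a, signal)

# The human default (highcut=2 Hz) flattens the sniffing oscillation,
# whereas a rodent-appropriate highcut (~10 Hz) preserves it
human_default = lowpass(sniffing, highcut=2, sampling_rate=sampling_rate)
rodent = lowpass(sniffing, highcut=10, sampling_rate=sampling_rate)
```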

`rsp_plot`: improvements?

import neurokit2 as nk

rsp = nk.rsp_simulate(duration=60, respiratory_rate=15)
signals, info = nk.rsp_process(rsp)
nk.rsp_plot(signals)

[Figure: current rsp_plot output, from the README respiration example]

I suppose some improvements could be made here, for instance:

  • Different colours for each signal
  • Improved titles
  • Better title spacing so they don't overlap
  • Plotting both the raw and cleaned signal so one can appreciate the cleaning
  • Replacing x-axis marks with seconds instead of samples
  • Making sure it works if there is no amplitude data
  • ...

@zen-juen care to transfer your ggplot skills and taste to matplotlib 😏 ?
