Hi,
I have all the required packages listed as requirements, with the right versions, but I still get the following:
ERROR: Could not find a version that satisfies the requirement mne-features (from versions: none)
ERROR: No matching distribution found for mne-features
On trying:
pip3 install git+https://github.com/mne-tools/mne-features.git#egg=mne_features --no-cache-dir
I get the following:
ERROR: Complete output from command /home/venkat/miniconda3/bin/python -u -c 'import setuptools, tokenize;file='"'"'/tmp/pip-install-wpgqjkz0/python-mne-features/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' clean --all:
ERROR: Traceback (most recent call last):
File "", line 1, in
File "/home/venkat/miniconda3/lib/python3.6/tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-install-wpgqjkz0/python-mne-features/setup.py'
ERROR: Failed cleaning build dir for python-mne-features
Successfully built python-mne-features
Failed to build python-mne-features
Hi. I was working on my project and wanted to use compute_svd_entropy, but it was taking too much time.
I stumbled upon the library cupy, which is a drop-in replacement for numpy that runs on CUDA.
I modified compute_svd_entropy locally along those lines, and in my case it sped up from 37 s per chunk to 3-4 s per chunk.
I can create a PR for that, covering every applicable function. Would you be willing to merge something like that? I know that it introduces one more dependency (it can be optional), so I wanted to check first before submitting the PR. Thanks in advance.
PS. The n_jobs handling was motivated by mne, which uses n_jobs in that way.
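The modified code itself was not included in the report. A minimal sketch of what a cupy-backed SVD entropy might look like (the function name, embedding parameters, and the numpy fallback are assumptions, not mne-features code):

```python
import numpy as np

try:
    import cupy as xp  # optional GPU drop-in for numpy
except ImportError:
    xp = np  # fall back to numpy when cupy / CUDA is unavailable

def _embed(x, d, tau):
    # Delay-embed a 1D signal: output shape (n_times - (d - 1) * tau, d)
    n = x.shape[-1] - (d - 1) * tau
    return xp.stack([x[i * tau:i * tau + n] for i in range(d)], axis=-1)

def compute_svd_entropy_xp(data, tau=2, emb=10):
    """SVD entropy per channel; data has shape (n_channels, n_times)."""
    out = xp.empty(data.shape[0])
    for j in range(data.shape[0]):
        s = xp.linalg.svd(_embed(data[j], emb, tau), compute_uv=False)
        s_norm = s / xp.sum(s)
        out[j] = -xp.sum(s_norm * xp.log2(s_norm))
    return out
```

Since cupy mirrors the numpy API, the same code runs on CPU or GPU depending on which module is bound to `xp`, which is one way to keep the dependency optional.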
I have been using the compute_samp_entropy function. After updating my sklearn to version 1.3.0, the function now fails with: argument of type 'builtin_function_or_method' is not iterable. This is related to a change sklearn made to KDTree, where the attribute valid_metrics is now a method. The only fix needed is to add parentheses where KDTree.valid_metrics is accessed.
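A version-tolerant variant of that fix might look like this sketch, which simply branches on whether valid_metrics is callable:

```python
from sklearn.neighbors import KDTree

# sklearn < 1.3 exposes valid_metrics as a class attribute (a list);
# sklearn >= 1.3 exposes it as a classmethod, so it must be called.
vm = KDTree.valid_metrics
valid_metrics = vm() if callable(vm) else vm
print(sorted(valid_metrics))
```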
The description of the optional edge parameter of univariate.compute_spect_edge_freq is as follows:
If not None, the values of edge are assumed to be positive and will be normalized to values between 0 and 1. Each entry of edge corresponds to a percentage. The spectral edge frequency will be computed for each different value in edge. If None, edge = [0.5] is used.
There are two issues that make the actual behavior unexpected when compared to the documentation:
The sentence "If None, edge = [0.5] is used" implies that passing edge = [0.5] to the function should give the same results as calling the function without the parameter. This is, however, not the case: to get the same results, edge = [50] must be passed to the function.
If edge is not None, its values are not actually "normalized" as the documentation says, but are simply divided by 100, as shown in the code below.
I suggest fixing this either by removing the explanation about normalization and the division (letting the user enter a value between 0 and 1), or by making the current behavior clearer in the documentation. A better explanation of the current behavior could be as follows:
If not None, the entries of edge should be positive values between 0 and 100, each corresponding to a percentage. The spectral edge frequency will be computed for each value in edge. If None, edge = [50] is used.
I think it would be nice to have the possibility to compute the intrinsic mode functions via the Empirical Mode Decomposition (EMD) method. It is highly efficient for analyzing nonlinear and non-stationary data such as EEG signals.
This could be later integrated into a sub-module of mne-features containing all the functions allowing a signal decomposition.
Greetings, everyone. First of all, thanks for such an amazing package. I am trying to extract features from data of shape (100, 64, 3072) and it takes a lot of time, so I decided to go for multiprocessing.
I have tried multiprocessing across different subjects, and it works faster than the sequential version, but with a reduced number of channels (9 channels).
Now I am trying multiprocessing on a single subject, but across multiple channels.
Here is my code:
from mne_features.feature_extraction import FeatureExtractor
import numpy as np
import time
from multiprocessing import Pool

feature=['app_entropy', 'decorr_time', 'energy_freq_bands', 'higuchi_fd', 'hjorth_complexity', 'hjorth_complexity_spect', 'hjorth_mobility', 'hjorth_mobility_spect', 'hurst_exp', 'katz_fd', 'kurtosis', 'line_length', 'mean', 'pow_freq_bands', 'ptp_amp', 'samp_entropy', 'skewness', 'spect_entropy', 'spect_slope', 'std', 'svd_entropy', 'svd_fisher_info', 'teager_kaiser_energy', 'variance', 'wavelet_coef_energy', 'zero_crossings', 'max_cross_corr', 'nonlin_interdep', 'phase_lock_val', 'spect_corr', 'time_corr']
fe = FeatureExtractor(sfreq=1024, selected_funcs=feature)
# print("Number of processors: ", mp.cpu_count())

def cal_feature(ch):
    data = np.random.randint(0, 100, size=(100, 64, 3072))
    return fe.fit_transform(data[:, ch, :])

if __name__ == '__main__':
    start = time.time()
    ch = [i for i in range(64)]
    with Pool(12) as p:
        result = p.map(cal_feature, ch)
    print(time.time() - start)
If I replace fe.fit_transform(data[:, ch, :]) with np.mean(data[:, ch, :], -1), everything works fine. Hence I assumed that there is no issue with the multiprocessing code.
I am getting the following error:
Traceback (most recent call last):
File "<ipython-input-3-1b73aa731887>", line 1, in <module>
runfile('F:/Scizhophrenia/4- feature extraction multiscript a.py', wdir='F:/Scizhophrenia')
File "C:\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "C:\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "F:/Scizhophrenia/4- feature extraction multiscript a.py", line 33, in <module>
result = p.map(cal_feature,(ch) )
File "C:\Anaconda3\lib\multiprocessing\pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Anaconda3\lib\multiprocessing\pool.py", line 657, in get
raise self._value
IndexError: too many indices for array
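A likely cause (an assumption, judging from the traceback): indexing with a plain integer channel drops the channel axis, so each worker passes a 2D array to fit_transform, which expects 3D input of shape (n_epochs, n_channels, n_times). Keeping the index as a list preserves the axis:

```python
import numpy as np

data = np.random.randint(0, 100, size=(100, 64, 3072))
ch = 5
print(data[:, ch, :].shape)    # (100, 3072): 2D, the channel axis is gone
print(data[:, [ch], :].shape)  # (100, 1, 3072): still 3D, as the extractor expects
```

Note also that the bivariate functions in the selected list (e.g. max_cross_corr, spect_corr) operate on pairs of channels, so single-channel chunks may still be problematic for those even after the shape fix.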
import numpy as np
from mne_features.feature_extraction import extract_features
rng = np.random.RandomState(42)
sfreq = 256.
X = rng.standard_normal((2, 1, int(sfreq)))
# data with only 2 epochs and 3 channels
data = np.concatenate([X, X ** 2, X ** 3], axis=1)
sel_funcs = ['spect_corr']
Xnew = extract_features(data, sfreq, sel_funcs, {'spect_corr__db': False})
print(Xnew)
This example shows that, with spect_corr as the feature function, some features (column indices 0, 3, 5) are constant. These constant features were computed from pairs of channels of the form (data[i, :], data[i, :]). The same is true of other bivariate feature functions.
In some cases (using Xnew for classification, for instance), one may wish to remove these constant features. This could be done by post-processing the extracted features with sklearn.feature_selection.VarianceThreshold. However, having an optional parameter (like include_diag=False) in bivariate features might be a good idea to remove features corresponding to pairs of channels of the form (data[i, :], data[i, :]).
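The post-processing mentioned above could look like this sketch, using a small synthetic matrix in place of the extracted Xnew:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Stand-in for Xnew: columns 0 and 2 are constant across epochs
Xnew = np.array([[1.0, 0.3, 5.0, 0.7],
                 [1.0, 0.9, 5.0, 0.1]])
# With the default threshold of 0, only zero-variance (constant) columns drop
Xreduced = VarianceThreshold(threshold=0.0).fit_transform(Xnew)
print(Xreduced.shape)  # (2, 2): the two constant columns are removed
```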
After installing through both conda install -c conda-forge mne-features and pip install mne-features, I'm still having trouble importing the module into my Python program with import mne_features. I get the following error:
ModuleNotFoundError: No module named 'mne_features'
It would be really useful to have a function that works in the same way as sklearn's GridSearchCV, allowing fine-tuning of the parameters of the univariate.py and bivariate.py functions. In order to do so, a classification problem to solve is needed, which could be seizure detection.
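Since FeatureExtractor follows the scikit-learn transformer interface, the mechanics could mirror a standard Pipeline + GridSearchCV setup. A sketch with a toy transformer standing in for the feature extractor (everything below is illustrative, not mne-features API):

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

class ToyFeatures(BaseEstimator, TransformerMixin):
    """Stand-in for a feature extractor with one tunable parameter."""
    def __init__(self, kmax=5):
        self.kmax = kmax

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # e.g. keep only the first `kmax` columns as "features"
        return X[:, :self.kmax]

rng = np.random.RandomState(42)
X = rng.standard_normal((40, 20))
y = rng.randint(0, 2, 40)

pipe = Pipeline([('fe', ToyFeatures()), ('clf', LogisticRegression())])
# The 'fe__kmax' syntax is how GridSearchCV would reach a feature parameter
grid = GridSearchCV(pipe, {'fe__kmax': [5, 10, 20]}, cv=3)
grid.fit(X, y)
print(grid.best_params_)
```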
As suggested by @agramfort in #37, the PSD estimation with power_spectrum should use a different method than FFT. Indeed, the PSD estimated from the FFT can be very noisy. We could probably go for something like:
from mne.time_frequency import psd_array_multitaper, psd_array_welch
import numpy as np

def power_spectrum(sfreq, data, fmin=0., fmax=np.inf, method='welch',
                   verbose=None):
    ...
    if method == 'welch':
        return psd_array_welch(data, sfreq, fmin=fmin, fmax=fmax,
                               verbose=verbose)
    elif method == 'multitaper':
        return psd_array_multitaper(data, sfreq, fmin=fmin, fmax=fmax,
                                    verbose=verbose)
    elif method == 'fft':
        ...
    else:
        raise ValueError('PSD estimation is not implemented for the given '
                         'method (%s).' % str(method))
As @agramfort noted, the tests for the (univariate and bivariate) feature functions only check that the returned arrays have the right shape. This is enough to check that a function does not fail on random input data, but such tests do not ensure that the features are correctly implemented...
And the result z of compute_zero_crossings is array([8.]).
Depending on whether the last value of y should be treated as 0, compute_zero_crossings might give wrong results. To address this problem, compute_zero_crossings should use a threshold to determine whether a value is treated as 0. For instance:
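The suggested threshold variant was not spelled out. A minimal sketch (the function name and default threshold are assumptions) could be:

```python
import numpy as np

def compute_zero_crossings_thresh(data, threshold=np.finfo(np.float64).eps):
    """Zero crossings per channel, treating |x| <= threshold as exactly 0."""
    x = np.where(np.abs(data) <= threshold, 0., data)
    # a crossing is a change of sign between consecutive samples
    return np.sum(np.diff(np.sign(x), axis=-1) != 0, axis=-1)

y = np.array([[1., -1., 1., -1.]])
print(compute_zero_crossings_thresh(y))  # [3]
```

One caveat of this sketch: a sample that lands exactly on zero between two sign changes is counted on both sides, so the convention for such samples would still need to be pinned down.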
from mne_features.feature_extraction import extract_features
import numpy as np
sfreq = 256.
rng = np.random.RandomState(seed=42)
data = rng.standard_normal((10, 20, int(sfreq)))
selected_features = ['pow_freq_bands', 'hjorth_mobility_spect', 'hjorth_complexity_spect', 'spect_entropy']
features = extract_features(data, sfreq, selected_funcs=selected_features)
With this code, each feature function contains ps, freqs = power_spectrum(sfreq, data). Therefore, the power spectrum of the data is computed 4 times... which is not optimal! It would be good if some utility function (such as power_spectrum) could benefit from a memory parameter. This way, the code above would only compute the power spectrum of the data once.
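A sketch of how such a memory parameter might work internally, using joblib.Memory with a naive FFT PSD standing in for mne_features' power_spectrum utility:

```python
import tempfile
import numpy as np
from joblib import Memory

memory = Memory(location=tempfile.mkdtemp(), verbose=0)

@memory.cache
def power_spectrum(sfreq, data):
    # Naive FFT-based PSD, a stand-in for the real utility function
    ps = np.abs(np.fft.rfft(data, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(data.shape[-1], 1. / sfreq)
    return ps, freqs

data = np.random.RandomState(0).standard_normal((3, 256))
ps1, _ = power_spectrum(256., data)   # computed and written to the cache
ps2, _ = power_spectrum(256., data)   # loaded from the cache, not recomputed
```

With the cached function shared by the four feature functions, the power spectrum would only be computed once per (sfreq, data) pair.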
The function extract_features extracts features from a data array X of shape (n_epochs, n_channels, n_times), epoch by epoch. If an epoch is flat (that is, if np.allclose(X[j, :, :], 0) is True), then most feature functions will either return arrays of 0 or fail (log of negative values, division by 0, ...).
Possible fix:
The utility function _apply_extractor should check if the epoch is flat and, if so, deal with it (for example: return 0 for all the features + warning to the user).
Problem: I just updated the version of mne-features. I am using mne_features.feature_extraction.extract_features(), but no matter what the input arguments are, it raises the following error:
TypeError: _check_input() got an unexpected keyword argument 'reset'
In its current form, extract_features accepts a selected_funcs parameter, which is a list of str. Each str in selected_funcs is an alias for a given feature function. However, feature functions such as compute_higuchi_fd, compute_svd_entropy or compute_spect_edge_freq accept optional keyword parameters. Still, the values of these optional parameters cannot be changed when using extract_features. I think it would be good to give the user the ability to change these optional parameters.
Possible fix:
Instead of having a list of str for selected_funcs, one could have a list of tuples, each tuple being of the form (alias_of_the_function, dict_of_optional_arguments). For example: selected_funcs = [('mean', dict()), ('higuchi_fd', {'kmax': 5}), ('higuchi_fd', {'kmax': 10})]. If heterogeneous lists are allowed, one could have: selected_funcs = ['mean', ('higuchi_fd', {'kmax': 5}), ('higuchi_fd', {'kmax': 10})].
Dataset consists of some 20 files, 10 control and 10 typical/healthy individuals. Each file is read and the raw data is epoched (With overlap) and squashed into one huge array of dimension (n_epochs, n_channels, n_times).
For instance, if a certain file is read in using read_raweeg(), it is subsequently epoched using make_fixed_length_events() into a matrix of dimension - (n_epochs, n_channels, n_times). Each file is epoched in this manner and n_epochs and n_times vary for each file. We select the min(n_times) (of all files) and truncate the epochs. What we have left is a list of epochs for each file where n_times is the same across all epochs. We prepare a corresponding 'y' binary variable whose length is (sum(n_epochs),).
We aim to call csp.fit_transform(all_epochs, y) to classify between control and healthy subjects and to plot the respective patterns.
On doing so, we get the following error:
LinAlgError: the leading minor of order 4 of 'b' is not positive definite. The factorization of 'b' could not be completed and no eigenvalues or eigenvectors were computed.
Although MNE-features offers a large number of feature functions, a user might want to use MNE-features with their own feature functions. The function extract_features or the class FeatureExtractor should allow the user to work with their own feature functions.
Greetings. I am working on Kaggle kernels and everything was working well. I created a new kernel, downloaded the Jupyter notebook file from the previous kernel and uploaded it to the new kernel.
I do
!pip install git+https://github.com/mne-tools/mne-features.git#egg=mne_features
When I do
from mne_features.feature_extraction import FeatureExtractor
I get the following error:
/opt/conda/lib/python3.6/site-packages/mne_features/univariate.py:856: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "_higuchi_fd" failed type inference due to: Invalid use of Function() with argument(s) of type(s): (tuple(OptionalType(int64) i.e. the type 'int64 or None' x 1))
parameterized
In definition 0:
All templates rejected with literals.
In definition 1:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function()
[2] During: typing of call at /opt/conda/lib/python3.6/site-packages/mne_features/univariate.py (874)
File "../../opt/conda/lib/python3.6/site-packages/mne_features/univariate.py", line 874:
def _higuchi_fd(data, kmax):
@nb.jit([nb.float64[:](nb.float64[:, :], nb.optional(nb.int64)),
/opt/conda/lib/python3.6/site-packages/mne_features/univariate.py:856: NumbaWarning:
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "_higuchi_fd" failed type inference due to: cannot determine Numba type of <class 'numba.dispatcher.LiftedLoop'>
File "../../opt/conda/lib/python3.6/site-packages/mne_features/univariate.py", line 873:
def _higuchi_fd(data, kmax):
@nb.jit([nb.float64[:](nb.float64[:, :], nb.optional(nb.int64)),
/opt/conda/lib/python3.6/site-packages/numba/compiler.py:725: NumbaWarning: Function "_higuchi_fd" was compiled in object mode without forceobj=True, but has lifted loops.
self.func_ir.loc))
/opt/conda/lib/python3.6/site-packages/numba/compiler.py:734: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.
warnings.warn(errors.NumbaDeprecationWarning(msg, self.func_ir.loc))
/opt/conda/lib/python3.6/site-packages/mne_features/univariate.py:856: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "_higuchi_fd" failed type inference due to: Invalid use of Function() with argument(s) of type(s): (tuple(OptionalType(int32) i.e. the type 'int32 or None' x 1))
parameterized
In definition 0:
All templates rejected with literals.
In definition 1:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function()
[2] During: typing of call at /opt/conda/lib/python3.6/site-packages/mne_features/univariate.py (874)
File "../../opt/conda/lib/python3.6/site-packages/mne_features/univariate.py", line 874:
def _higuchi_fd(data, kmax):
@nb.jit([nb.float64[:](nb.float64[:, :], nb.optional(nb.int64)),
/opt/conda/lib/python3.6/site-packages/mne_features/univariate.py:856: NumbaWarning:
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "_higuchi_fd" failed type inference due to: cannot determine Numba type of <class 'numba.dispatcher.LiftedLoop'>
File "../../opt/conda/lib/python3.6/site-packages/mne_features/univariate.py", line 873:
def _higuchi_fd(data, kmax):
@nb.jit([nb.float64[:](nb.float64[:, :], nb.optional(nb.int64)),
/opt/conda/lib/python3.6/site-packages/numba/compiler.py:725: NumbaWarning: Function "_higuchi_fd" was compiled in object mode without forceobj=True, but has lifted loops.
self.func_ir.loc))
/opt/conda/lib/python3.6/site-packages/numba/compiler.py:734: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.
Once features are extracted as a matrix X of shape (n_epochs, n_features), it is difficult to determine which column of X corresponds to a given feature. It would be nice to have an optional parameter return_as_dataframe=True in extract_features to get a Pandas DataFrame like this:
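The intended DataFrame layout was not shown; one possibility (a sketch, the column naming is an assumption) is a MultiIndex of (feature alias, channel):

```python
import numpy as np
import pandas as pd

n_epochs = 4
X = np.random.RandomState(0).standard_normal((n_epochs, 6))
columns = pd.MultiIndex.from_product([['mean', 'variance', 'kurtosis'],
                                      ['ch0', 'ch1']])
df = pd.DataFrame(X, columns=columns)
print(df['mean'])  # all columns for the 'mean' feature, one per channel
```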
This is not an "issue", but I wanted to note that since the compute_spect_entropy function does not normalize the power distribution by the total number of frequencies, the returned spectral entropy values can be higher than 1, which should not happen.
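Normalizing by the log of the number of frequency bins would bound the result by 1; a sketch (the function name is made up):

```python
import numpy as np

def spect_entropy_normalized(psd):
    """Spectral entropy of a PSD (per channel), normalized to [0, 1]."""
    p = psd / np.sum(psd, axis=-1, keepdims=True)
    # treat 0 * log(0) as 0 to avoid warnings on empty bins
    h = -np.sum(np.where(p > 0, p * np.log2(np.where(p > 0, p, 1.)), 0.),
                axis=-1)
    return h / np.log2(psd.shape[-1])

print(spect_entropy_normalized(np.ones((1, 8))))  # [1.]: flat spectrum
```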
I think that _embed, _filt and _wavelet_coefs (in utils.py) should not be private functions and should be added (along with power_spectrum) to MNE-features' API. The reason is that those functions are "building blocks" for user-defined feature functions. For instance, it would be great if users could do:
from mne_features.utils import power_spectrum
from mne_features.feature_extraction import extract_features
def myspectralfeature(sfreq, data):
    psd, _ = power_spectrum(sfreq, data)
    ...
    return ...
selected_funcs = ['mean', 'variance', ('myfunc', myspectralfeature)]
features = extract_features(epochs_data, sfreq, selected_funcs)
It is implicit that, for any feature function, if sfreq is one of the parameters, then it should be placed first, because otherwise feature_funcs[alias] = partial(func, sfreq) (in _get_feature_funcs) will not work as expected. This should be mentioned in the doc... or partial(func, sfreq) should be changed to partial(func, **{'sfreq': sfreq}).
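A tiny illustration of the constraint (function names are made up):

```python
import numpy as np
from functools import partial

def good_feature(sfreq, data):   # sfreq first: positional binding works
    return np.mean(data) * sfreq

def bad_feature(data, sfreq):    # sfreq not first: positional binding misassigns
    return np.mean(data) * sfreq

f = partial(good_feature, 256.)       # behaves like f(data), as intended
g = partial(bad_feature, sfreq=256.)  # keyword binding sidesteps the ordering
print(f(np.ones((2, 3))), g(np.ones((2, 3))))
```

With partial(bad_feature, 256.), the sampling frequency would silently be passed where the data is expected, which is why binding sfreq by keyword is the safer option.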
mne_features.feature_extraction.extract_features() now returns the error TypeError: _check_input() missing 1 required keyword-only argument: 'reset'.
Checking the arguments of this method on line 87 of feature_extraction.py does show that 'reset' is an existing argument, as self._check_input.__code__.co_varnames returns ('self', 'X', 'reset').
As @agramfort suggested in #17, it would be a good idea to have a memory parameter (like in sklearn.pipeline.Pipeline) in extract_features and FeatureExtractor to allow caching the fitted transformers.
Should we include functionality measuring the deviation from the inverse-frequency curve (the prototypical power spectral density)?
If so, do we do a linear regression on the log-log curve, in which case our estimator is formally:
estimated power = k1 / f^lambda, where k1 = exp(estimated intercept) and lambda = -(estimated slope)
Or, as in the Winkler 2011 paper, do we add a term k2 to our estimator such that:
estimated power = (k1 / f^lambda) - k2
In other words, do we assume or not that the power converges to 0 for very high frequencies?
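The first option above (plain linear regression on the log-log PSD) can be checked on synthetic noiseless data:

```python
import numpy as np

freqs = np.arange(1., 101.)          # avoid f = 0 before taking logs
true_k1, true_lambda = 2.0, 1.5
psd = true_k1 / freqs ** true_lambda  # exact k1 / f^lambda spectrum

slope, intercept = np.polyfit(np.log(freqs), np.log(psd), 1)
lam = -slope            # lambda = -(estimated slope)
k1 = np.exp(intercept)  # k1 = exp(estimated intercept)
print(lam, k1)          # recovers 1.5 and 2.0
```

The second estimator, with the additional k2 term, is no longer linear in log-log space, so it would need a nonlinear least-squares fit (e.g. scipy.optimize.curve_fit) rather than a polyfit.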
Given a ndarray x of shape (n_channels, n_times), embed(x, d, tau) should return an array of shape (n_channels, n_times - (d - 1) * tau, d). Instead, it returns an array of shape (n_channels, n_times - 1 - (d - 1) * tau, d).
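A reference implementation with the expected shape (a sketch, not the library's code):

```python
import numpy as np

def embed_expected(x, d, tau):
    # Expected delay embedding: (n_channels, n_times - (d - 1) * tau, d)
    n = x.shape[-1] - (d - 1) * tau
    return np.stack([x[..., i * tau:i * tau + n] for i in range(d)], axis=-1)

x = np.zeros((3, 100))
print(embed_expected(x, d=5, tau=2).shape)  # (3, 92, 5), not (3, 91, 5)
```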
The functions compute_app_entropy and compute_samp_entropy do not seem to fully implement Approximate and Sample Entropy as defined in [1, 2, 3]. These functions should be rewritten to conform to the definitions from the literature.
1. It has a compile-time dependency on NumPy, but doesn't declare it.
2. It has a source distribution for version 0.2, but no built distributions for this version, so pip or setuptools will blindly try to build the package while installing it, and will fail because of (1).
3. Finally, even if you manage to fix the NumPy dependency, the package also depends on the mne package but is required to install mne, which creates a dependency loop. This somehow works with pip, which is either able to break dependency loops or (more likely) ignores the error, but it fails with setuptools. Thus any package that tries to depend on mne-features also becomes impossible to install.
I am trying to extract frequency band features in the following way:
feature = ['energy_freq_bands', 'pow_freq_bands', 'line_length', 'kurtosis', 'ptp_amp', 'skewness']
fe = FeatureExtractor(sfreq=128, selected_funcs=feature)
X_new = fe.fit_transform(X)
It throws the following error: 'The entries of the given freq_bands parameter (%s) must be positive and less than the Nyquist frequency.'
So how can I remove the 100 Hz frequency value limit from ['energy_freq_bands', 'pow_freq_bands'] while extracting features?
I mean, how can I just set freq_bands=np.array([0.5, 4., 8., 13., 30.]) without changing much?
Just looking at the code, there might be an issue with the normalization when log=True and normalize=True are combined in the function compute_pow_freq_bands() at
I am trying to calculate the spectral edge frequency of my EEG data using mne-features. I am currently using Jupyter notebook via Anaconda-Navigator and I have installed the mne environment, but I keep getting this "ModuleNotFoundError: No module named 'utils'" error... I don't have much experience with coding and Python, so I would greatly appreciate it if you could explain thoroughly how to import the features below.
I am trying to extract all the available features from mne-features using the FeatureExtractor class. I cannot actually find the list of all available features, even in the documentation. Can someone guide me on how to extract all the available features with that class?
The line of code below uses two of them, but I want to use all of them:
fe = FeatureExtractor(sfreq=100., selected_funcs=['std', 'kurtosis'])
In extract_features or FeatureExtractor, the parameter selected_funcs is a list of aliases (str) which correspond to feature functions. For simplicity (and generalizability), each alias should correspond to the name of the feature function, without the compute_ prefix.
Each feature function should have a name of the form: compute_[suffix]
For each feature function, the corresponding alias should simply be: [suffix].
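With that convention, the alias could be derived mechanically from the function name; a sketch:

```python
def alias_of(func_name):
    # 'compute_higuchi_fd' -> 'higuchi_fd'
    prefix = 'compute_'
    if not func_name.startswith(prefix):
        raise ValueError('Feature function names should start with %r' % prefix)
    return func_name[len(prefix):]

print(alias_of('compute_higuchi_fd'))  # higuchi_fd
```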