Hi,
I have all the required packages listed as requirements, with the right versions, but I still get the following:
ERROR: Could not find a version that satisfies the requirement mne-features (from versions: none)
ERROR: No matching distribution found for mne-features
On trying:
pip3 install git+https://github.com/mne-tools/mne-features.git#egg=mne_features --no-cache-dir
I get the following:
ERROR: Complete output from command /home/venkat/miniconda3/bin/python -u -c 'import setuptools, tokenize;file='"'"'/tmp/pip-install-wpgqjkz0/python-mne-features/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' clean --all:
ERROR: Traceback (most recent call last):
File "", line 1, in
File "/home/venkat/miniconda3/lib/python3.6/tokenize.py", line 452, in open
buffer = _builtin_open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pip-install-wpgqjkz0/python-mne-features/setup.py'
ERROR: Failed cleaning build dir for python-mne-features
Successfully built python-mne-features
Failed to build python-mne-features
Hi. I was working on my project and wanted to use compute_svd_entropy, but it was taking too much time.
I stumbled upon the library cupy, which is a drop-in replacement for numpy that runs on CUDA.
I modified compute_svd_entropy locally along those lines, and in my case it sped up from 37 s per chunk to 3-4 s per chunk.
I can create a PR for that, covering every applicable function. Would you be willing to merge something like that? I know that it introduces one more dependency (it can be optional), so I wanted to check first before submitting the PR. Thanks in advance.
PS. The n_jobs handling was motivated by mne, which uses n_jobs in that way.
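The modified code itself was not included in the report. A minimal sketch of what a cupy-backed SVD entropy might look like (the function name, embedding parameters, and the numpy fallback are assumptions, not mne-features code):

```python
import numpy as np

try:
    import cupy as xp  # optional GPU drop-in for numpy
except ImportError:
    xp = np  # fall back to numpy when cupy / CUDA is unavailable

def _embed(x, d, tau):
    # Delay-embed a 1D signal: output shape (n_times - (d - 1) * tau, d)
    n = x.shape[-1] - (d - 1) * tau
    return xp.stack([x[i * tau:i * tau + n] for i in range(d)], axis=-1)

def compute_svd_entropy_xp(data, tau=2, emb=10):
    """SVD entropy per channel; data has shape (n_channels, n_times)."""
    out = xp.empty(data.shape[0])
    for j in range(data.shape[0]):
        s = xp.linalg.svd(_embed(data[j], emb, tau), compute_uv=False)
        s_norm = s / xp.sum(s)
        out[j] = -xp.sum(s_norm * xp.log2(s_norm))
    return out
```

Since cupy mirrors the numpy API, the same code runs on CPU or GPU depending on which module is bound to `xp`, which is one way to keep the dependency optional.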
I have been using the compute_samp_entropy function. After updating my sklearn to version 1.3.0, the function now fails with: argument of type 'builtin_function_or_method' is not iterable. This is related to a change sklearn made to KDTree, where the attribute valid_metrics is now a method. The only fix needed is to add parentheses where KDTree.valid_metrics is accessed.
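A version-tolerant variant of that fix might look like this sketch, which simply branches on whether valid_metrics is callable:

```python
from sklearn.neighbors import KDTree

# sklearn < 1.3 exposes valid_metrics as a class attribute (a list);
# sklearn >= 1.3 exposes it as a classmethod, so it must be called.
vm = KDTree.valid_metrics
valid_metrics = vm() if callable(vm) else vm
print(sorted(valid_metrics))
```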
The description of the optional edge parameter of univariate.compute_spect_edge_freq is as follows:
If not None, the values of edge are assumed to be positive and will be normalized to values between 0 and 1. Each entry of edge corresponds to a percentage. The spectral edge frequency will be computed for each different value in edge. If None, edge = [0.5] is used.
There are two issues that make the actual behavior unexpected when compared to the documentation:
The sentence "If None, edge = [0.5] is used" implies that passing edge = [0.5] to the function should give the same results as calling the function without the parameter. This is, however, not the case: to get the same results, edge = [50] must be passed to the function.
If edge is not None, its values are not actually "normalized" as the documentation says, but are simply divided by 100, as shown in the code below.
I suggest fixing this either by removing the explanation about normalization and the division (letting the user enter a value between 0 and 1), or by making the current behavior clearer in the documentation. A better explanation of the current behavior could be as follows:
If not None, the entries of edge should be positive values between 0 and 100, each corresponding to a percentage. The spectral edge frequency will be computed for each value in edge. If None, edge = [50] is used.
I think it would be nice to have the possibility to compute the intrinsic mode functions via the Empirical Mode Decomposition (EMD) method. It is highly efficient for analyzing nonlinear and non-stationary data such as EEG signals.
This could be later integrated into a sub-module of mne-features containing all the functions allowing a signal decomposition.
Greetings, everyone. First of all, thanks for such an amazing package. I am trying to extract features from data of shape (100, 64, 3072) and it takes a lot of time, so I decided to go for multiprocessing.
I have tried multiprocessing across different subjects, and it works faster than the sequential version, but with a reduced number of channels (9 channels).
Now I am trying multiprocessing on a single subject, but across multiple channels.
Here is my code:
from mne_features.feature_extraction import FeatureExtractor
import numpy as np
import time
from multiprocessing import Pool

feature=['app_entropy', 'decorr_time', 'energy_freq_bands', 'higuchi_fd', 'hjorth_complexity', 'hjorth_complexity_spect', 'hjorth_mobility', 'hjorth_mobility_spect', 'hurst_exp', 'katz_fd', 'kurtosis', 'line_length', 'mean', 'pow_freq_bands', 'ptp_amp', 'samp_entropy', 'skewness', 'spect_entropy', 'spect_slope', 'std', 'svd_entropy', 'svd_fisher_info', 'teager_kaiser_energy', 'variance', 'wavelet_coef_energy', 'zero_crossings', 'max_cross_corr', 'nonlin_interdep', 'phase_lock_val', 'spect_corr', 'time_corr']
fe = FeatureExtractor(sfreq=1024, selected_funcs=feature)
# print("Number of processors: ", mp.cpu_count())

def cal_feature(ch):
    data = np.random.randint(0, 100, size=(100, 64, 3072))
    return fe.fit_transform(data[:, ch, :])

if __name__ == '__main__':
    start = time.time()
    ch = [i for i in range(64)]
    with Pool(12) as p:
        result = p.map(cal_feature, ch)
    print(time.time() - start)
If I replace fe.fit_transform(data[:, ch, :]) with np.mean(data[:, ch, :], -1), everything works fine. Hence I assumed that there is no issue with the multiprocessing code.
I am getting the following error:
Traceback (most recent call last):
File "<ipython-input-3-1b73aa731887>", line 1, in <module>
runfile('F:/Scizhophrenia/4- feature extraction multiscript a.py', wdir='F:/Scizhophrenia')
File "C:\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 827, in runfile
execfile(filename, namespace)
File "C:\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "F:/Scizhophrenia/4- feature extraction multiscript a.py", line 33, in <module>
result = p.map(cal_feature,(ch) )
File "C:\Anaconda3\lib\multiprocessing\pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Anaconda3\lib\multiprocessing\pool.py", line 657, in get
raise self._value
IndexError: too many indices for array
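A likely cause (an assumption, judging from the traceback): indexing with a plain integer channel drops the channel axis, so each worker passes a 2D array to fit_transform, which expects 3D input of shape (n_epochs, n_channels, n_times). Keeping the index as a list preserves the axis:

```python
import numpy as np

data = np.random.randint(0, 100, size=(100, 64, 3072))
ch = 5
print(data[:, ch, :].shape)    # (100, 3072): 2D, the channel axis is gone
print(data[:, [ch], :].shape)  # (100, 1, 3072): still 3D, as the extractor expects
```

Note also that the bivariate functions in the selected list (e.g. max_cross_corr, spect_corr) operate on pairs of channels, so single-channel chunks may still be problematic for those even after the shape fix.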
import numpy as np
from mne_features.feature_extraction import extract_features
rng = np.random.RandomState(42)
sfreq = 256.
X = rng.standard_normal((2, 1, int(sfreq)))
# data with only 2 epochs and 3 channels
data = np.concatenate([X, X ** 2, X ** 3], axis=1)
sel_funcs = ['spect_corr']
Xnew = extract_features(data, sfreq, sel_funcs, {'spect_corr__db': False})
print(Xnew)
This example shows that, with spect_corr as the feature function, some features (column indices 0, 3, 5) are constant. These constant features were computed from pairs of channels of the form (data[i, :], data[i, :]). The same is true of other bivariate feature functions.
In some cases (using Xnew for classification, for instance), one may wish to remove these constant features. This could be done by post-processing the extracted features with sklearn.feature_selection.VarianceThreshold. However, having an optional parameter (like include_diag=False) in bivariate features might be a good idea to remove features corresponding to pairs of channels of the form (data[i, :], data[i, :]).
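The post-processing mentioned above could look like this sketch, using a small synthetic matrix in place of the extracted Xnew:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Stand-in for Xnew: columns 0 and 2 are constant across epochs
Xnew = np.array([[1.0, 0.3, 5.0, 0.7],
                 [1.0, 0.9, 5.0, 0.1]])
# With the default threshold of 0, only zero-variance (constant) columns drop
Xreduced = VarianceThreshold(threshold=0.0).fit_transform(Xnew)
print(Xreduced.shape)  # (2, 2): the two constant columns are removed
```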
After installing through both conda install -c conda-forge mne-features and pip install mne-features, I'm still having trouble importing the module into my Python program with import mne_features. I get the following error:
ModuleNotFoundError: No module named 'mne_features'
It would be really useful to have a function that works in the same way as sklearn's GridSearchCV, allowing fine-tuning of the parameters of the univariate.py and bivariate.py functions. In order to do so, a classification problem to solve is needed, which could be seizure detection.
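Since FeatureExtractor follows the scikit-learn transformer interface, the mechanics could mirror a standard Pipeline + GridSearchCV setup. A sketch with a toy transformer standing in for the feature extractor (everything below is illustrative, not mne-features API):

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

class ToyFeatures(BaseEstimator, TransformerMixin):
    """Stand-in for a feature extractor with one tunable parameter."""
    def __init__(self, kmax=5):
        self.kmax = kmax

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # e.g. keep only the first `kmax` columns as "features"
        return X[:, :self.kmax]

rng = np.random.RandomState(42)
X = rng.standard_normal((40, 20))
y = rng.randint(0, 2, 40)

pipe = Pipeline([('fe', ToyFeatures()), ('clf', LogisticRegression())])
# The 'fe__kmax' syntax is how GridSearchCV would reach a feature parameter
grid = GridSearchCV(pipe, {'fe__kmax': [5, 10, 20]}, cv=3)
grid.fit(X, y)
print(grid.best_params_)
```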
As suggested by @agramfort in #37, the PSD estimation with power_spectrum should use a different method than FFT. Indeed, the PSD estimated from the FFT can be very noisy. We could probably go for something like:
from mne.time_frequency import psd_array_multitaper, psd_array_welch
import numpy as np

def power_spectrum(sfreq, data, fmin=0., fmax=np.inf, method='welch',
                   verbose=None):
    ...
    if method == 'welch':
        return psd_array_welch(data, sfreq, fmin=fmin, fmax=fmax,
                               verbose=verbose)
    elif method == 'multitaper':
        return psd_array_multitaper(data, sfreq, fmin=fmin, fmax=fmax,
                                    verbose=verbose)
    elif method == 'fft':
        ...
    else:
        raise ValueError('PSD estimation is not implemented for the given '
                         'method (%s).' % str(method))
As @agramfort noted, the tests for the (univariate and bivariate) feature functions only check that the returned arrays have the right shape. This is enough to check that a function does not fail on random input data, but such tests do not ensure that the features are correctly implemented...
And the result z of compute_zero_crossings is array([8.]).
Depending on whether the last value of y should be treated as 0, compute_zero_crossings might give wrong results. To address this problem, compute_zero_crossings should use a threshold to determine whether a value is treated as 0. For instance:
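The suggested threshold variant was not spelled out. A minimal sketch (the function name and default threshold are assumptions) could be:

```python
import numpy as np

def compute_zero_crossings_thresh(data, threshold=np.finfo(np.float64).eps):
    """Zero crossings per channel, treating |x| <= threshold as exactly 0."""
    x = np.where(np.abs(data) <= threshold, 0., data)
    # a crossing is a change of sign between consecutive samples
    return np.sum(np.diff(np.sign(x), axis=-1) != 0, axis=-1)

y = np.array([[1., -1., 1., -1.]])
print(compute_zero_crossings_thresh(y))  # [3]
```

One caveat of this sketch: a sample that lands exactly on zero between two sign changes is counted on both sides, so the convention for such samples would still need to be pinned down.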
from mne_features.feature_extraction import extract_features
import numpy as np
sfreq = 256.
rng = np.random.RandomState(seed=42)
data = rng.standard_normal((10, 20, int(sfreq)))
selected_features = ['pow_freq_bands', 'hjorth_mobility_spect', 'hjorth_complexity_spect', 'spect_entropy']
features = extract_features(data, sfreq, selected_funcs=selected_features)
With this code, each feature function contains ps, freqs = power_spectrum(sfreq, data). Therefore, the power spectrum of the data is computed 4 times... which is not optimal! It would be good if some utility function (such as power_spectrum) could benefit from a memory parameter. This way, the code above would only compute the power spectrum of the data once.
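A sketch of how such a memory parameter might work internally, using joblib.Memory with a naive FFT PSD standing in for mne_features' power_spectrum utility:

```python
import tempfile
import numpy as np
from joblib import Memory

memory = Memory(location=tempfile.mkdtemp(), verbose=0)

@memory.cache
def power_spectrum(sfreq, data):
    # Naive FFT-based PSD, a stand-in for the real utility function
    ps = np.abs(np.fft.rfft(data, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(data.shape[-1], 1. / sfreq)
    return ps, freqs

data = np.random.RandomState(0).standard_normal((3, 256))
ps1, _ = power_spectrum(256., data)   # computed and written to the cache
ps2, _ = power_spectrum(256., data)   # loaded from the cache, not recomputed
```

With the cached function shared by the four feature functions, the power spectrum would only be computed once per (sfreq, data) pair.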
The function extract_features extracts features from a data array X of shape (n_epochs, n_channels, n_times), epoch by epoch. If an epoch is flat (that is, if np.allclose(X[j, :, :], 0) is True), then most feature functions will either return arrays of 0 or fail (log of negative values, division by 0, ...).
Possible fix:
The utility function _apply_extractor should check if the epoch is flat and, if so, deal with it (for example: return 0 for all the features + warning to the user).
Problem: I just updated the version of mne-features. I am using mne_features.feature_extraction.extract_features(), but no matter what the input arguments are, it raises the following error:
TypeError: _check_input() got an unexpected keyword argument 'reset'
In its current form, extract_features accepts a selected_funcs parameter, which is a list of str. Each str in selected_funcs is an alias for a given feature function. However, feature functions such as compute_higuchi_fd, compute_svd_entropy or compute_spect_edge_freq accept optional keyword parameters. Still, the values of these optional parameters cannot be changed when using extract_features. I think it would be good to give the user the ability to change these optional parameters.
Possible fix:
Instead of having a list of str for selected_funcs, one could have a list of tuples, each tuple being of the form (alias_of_the_function, dict_of_optional_arguments). For example: selected_funcs = [('mean', dict()), ('higuchi_fd', {'kmax': 5}), ('higuchi_fd', {'kmax': 10})]. If heterogeneous lists are allowed, one could have: selected_funcs = ['mean', ('higuchi_fd', {'kmax': 5}), ('higuchi_fd', {'kmax': 10})].
Dataset consists of some 20 files, 10 control and 10 typical/healthy individuals. Each file is read and the raw data is epoched (With overlap) and squashed into one huge array of dimension (n_epochs, n_channels, n_times).
For instance, if a certain file is read in using read_raweeg(), it is subsequently epoched using make_fixed_length_events() into a matrix of dimension - (n_epochs, n_channels, n_times). Each file is epoched in this manner and n_epochs and n_times vary for each file. We select the min(n_times) (of all files) and truncate the epochs. What we have left is a list of epochs for each file where n_times is the same across all epochs. We prepare a corresponding 'y' binary variable whose length is (sum(n_epochs),).
We aim to call csp.fit_transform(all_epochs, y) to classify between control and healthy subjects and to plot the respective patterns.
On doing so, we get the following error:
LinAlgError: the leading minor of order 4 of 'b' is not positive definite. The factorization of 'b' could not be completed and no eigenvalues or eigenvectors were computed.
Although MNE-features offers a large number of feature functions, a user might want to use MNE-features with their own feature functions. The function extract_features or the class FeatureExtractor should allow the user to work with their own feature functions.
Greetings. I am working on Kaggle kernels and everything was working well. I created a new kernel, downloaded the Jupyter notebook file from the previous kernel and uploaded it to the new kernel.
I do
!pip install git+https://github.com/mne-tools/mne-features.git#egg=mne_features
When I do
from mne_features.feature_extraction import FeatureExtractor
I get the following error:
/opt/conda/lib/python3.6/site-packages/mne_features/univariate.py:856: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "_higuchi_fd" failed type inference due to: Invalid use of Function() with argument(s) of type(s): (tuple(OptionalType(int64) i.e. the type 'int64 or None' x 1))
parameterized
In definition 0:
All templates rejected with literals.
In definition 1:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function()
[2] During: typing of call at /opt/conda/lib/python3.6/site-packages/mne_features/univariate.py (874)
File "../../opt/conda/lib/python3.6/site-packages/mne_features/univariate.py", line 874:
def _higuchi_fd(data, kmax):
@nb.jit([nb.float64[:](nb.float64[:, :], nb.optional(nb.int64)),
/opt/conda/lib/python3.6/site-packages/mne_features/univariate.py:856: NumbaWarning:
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "_higuchi_fd" failed type inference due to: cannot determine Numba type of <class 'numba.dispatcher.LiftedLoop'>
File "../../opt/conda/lib/python3.6/site-packages/mne_features/univariate.py", line 873:
def _higuchi_fd(data, kmax):
@nb.jit([nb.float64[:](nb.float64[:, :], nb.optional(nb.int64)),
/opt/conda/lib/python3.6/site-packages/numba/compiler.py:725: NumbaWarning: Function "_higuchi_fd" was compiled in object mode without forceobj=True, but has lifted loops.
self.func_ir.loc))
/opt/conda/lib/python3.6/site-packages/numba/compiler.py:734: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.
warnings.warn(errors.NumbaDeprecationWarning(msg, self.func_ir.loc))
/opt/conda/lib/python3.6/site-packages/mne_features/univariate.py:856: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "_higuchi_fd" failed type inference due to: Invalid use of Function() with argument(s) of type(s): (tuple(OptionalType(int32) i.e. the type 'int32 or None' x 1))
parameterized
In definition 0:
All templates rejected with literals.
In definition 1:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function()
[2] During: typing of call at /opt/conda/lib/python3.6/site-packages/mne_features/univariate.py (874)
File "../../opt/conda/lib/python3.6/site-packages/mne_features/univariate.py", line 874:
def _higuchi_fd(data, kmax):
@nb.jit([nb.float64[:](nb.float64[:, :], nb.optional(nb.int64)),
/opt/conda/lib/python3.6/site-packages/mne_features/univariate.py:856: NumbaWarning:
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "_higuchi_fd" failed type inference due to: cannot determine Numba type of <class 'numba.dispatcher.LiftedLoop'>
File "../../opt/conda/lib/python3.6/site-packages/mne_features/univariate.py", line 873:
def _higuchi_fd(data, kmax):
@nb.jit([nb.float64[:](nb.float64[:, :], nb.optional(nb.int64)),
/opt/conda/lib/python3.6/site-packages/numba/compiler.py:725: NumbaWarning: Function "_higuchi_fd" was compiled in object mode without forceobj=True, but has lifted loops.
self.func_ir.loc))
/opt/conda/lib/python3.6/site-packages/numba/compiler.py:734: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.
Once features are extracted as a matrix X of shape (n_epochs, n_features), it is difficult to determine which column of X corresponds to a given feature. It would be nice to have an optional parameter return_as_dataframe=True in extract_features to get a Pandas DataFrame like this:
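The intended DataFrame layout was not shown; one possibility (a sketch, the column naming is an assumption) is a MultiIndex of (feature alias, channel):

```python
import numpy as np
import pandas as pd

n_epochs = 4
X = np.random.RandomState(0).standard_normal((n_epochs, 6))
columns = pd.MultiIndex.from_product([['mean', 'variance', 'kurtosis'],
                                      ['ch0', 'ch1']])
df = pd.DataFrame(X, columns=columns)
print(df['mean'])  # all columns for the 'mean' feature, one per channel
```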
This is not an "issue", but I wanted to note that since the compute_spect_entropy function does not normalize the power distribution by the total number of frequencies, the returned spectral entropy values can be higher than 1, which should not happen.
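Normalizing by the log of the number of frequency bins would bound the result by 1; a sketch (the function name is made up):

```python
import numpy as np

def spect_entropy_normalized(psd):
    """Spectral entropy of a PSD (per channel), normalized to [0, 1]."""
    p = psd / np.sum(psd, axis=-1, keepdims=True)
    # treat 0 * log(0) as 0 to avoid warnings on empty bins
    h = -np.sum(np.where(p > 0, p * np.log2(np.where(p > 0, p, 1.)), 0.),
                axis=-1)
    return h / np.log2(psd.shape[-1])

print(spect_entropy_normalized(np.ones((1, 8))))  # [1.]: flat spectrum
```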
I think that _embed, _filt and _wavelet_coefs (in utils.py) should not be private functions and should be added (along with power_spectrum) to MNE-features' API. The reason is that those functions are "building blocks" for user-defined feature functions. For instance, it would be great if users could do:
from mne_features.utils import power_spectrum
from mne_features.feature_extraction import extract_features
def myspectralfeature(sfreq, data):
    psd, _ = power_spectrum(sfreq, data)
    ...
    return ...
selected_funcs = ['mean', 'variance', ('myfunc', myspectralfeature)]
features = extract_features(epochs_data, sfreq, selected_funcs)
It is implicit that, for any feature function, if sfreq is one of the parameters, then it should be placed first, because otherwise feature_funcs[alias] = partial(func, sfreq) (in _get_feature_funcs) will not work as expected. This should be mentioned in the doc... or partial(func, sfreq) should be changed to partial(func, **{'sfreq': sfreq}).
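A tiny illustration of the constraint (function names are made up):

```python
import numpy as np
from functools import partial

def good_feature(sfreq, data):   # sfreq first: positional binding works
    return np.mean(data) * sfreq

def bad_feature(data, sfreq):    # sfreq not first: positional binding misassigns
    return np.mean(data) * sfreq

f = partial(good_feature, 256.)       # behaves like f(data), as intended
g = partial(bad_feature, sfreq=256.)  # keyword binding sidesteps the ordering
print(f(np.ones((2, 3))), g(np.ones((2, 3))))
```

With partial(bad_feature, 256.), the sampling frequency would silently be passed where the data is expected, which is why binding sfreq by keyword is the safer option.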
mne_features.feature_extraction.extract_features() now returns the error TypeError: _check_input() missing 1 required keyword-only argument: 'reset'.
Checking the arguments of this method on line 87 of feature_extraction.py does show that 'reset' is an existing argument, as self._check_input.__code__.co_varnames returns ('self', 'X', 'reset').
As @agramfort suggested in #17, it would be a good idea to have a memory parameter (like in sklearn.pipeline.Pipeline) in extract_features and FeatureExtractor to allow caching the fitted transformers.
Should we include functionality measuring the deviation from the inverse-frequency curve (the prototypical power spectral density)?
If so, do we do a linear regression on the log-log curve, in which case our estimator is formally:
estimated power = k1 / f^lambda, where k1 = exp(estimated intercept) and lambda = -(estimated slope)
Or, as in the Winkler 2011 paper, do we add a term k2 to our estimator such that:
estimated power = (k1 / f^lambda) - k2
In other words, do we assume or not that the power converges to 0 for very high frequencies?
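The first option above (plain linear regression on the log-log PSD) can be checked on synthetic noiseless data:

```python
import numpy as np

freqs = np.arange(1., 101.)          # avoid f = 0 before taking logs
true_k1, true_lambda = 2.0, 1.5
psd = true_k1 / freqs ** true_lambda  # exact k1 / f^lambda spectrum

slope, intercept = np.polyfit(np.log(freqs), np.log(psd), 1)
lam = -slope            # lambda = -(estimated slope)
k1 = np.exp(intercept)  # k1 = exp(estimated intercept)
print(lam, k1)          # recovers 1.5 and 2.0
```

The second estimator, with the additional k2 term, is no longer linear in log-log space, so it would need a nonlinear least-squares fit (e.g. scipy.optimize.curve_fit) rather than a polyfit.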
Given a ndarray x of shape (n_channels, n_times), embed(x, d, tau) should return an array of shape (n_channels, n_times - (d - 1) * tau, d). Instead, it returns an array of shape (n_channels, n_times - 1 - (d - 1) * tau, d).
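A reference implementation with the expected shape (a sketch, not the library's code):

```python
import numpy as np

def embed_expected(x, d, tau):
    # Expected delay embedding: (n_channels, n_times - (d - 1) * tau, d)
    n = x.shape[-1] - (d - 1) * tau
    return np.stack([x[..., i * tau:i * tau + n] for i in range(d)], axis=-1)

x = np.zeros((3, 100))
print(embed_expected(x, d=5, tau=2).shape)  # (3, 92, 5), not (3, 91, 5)
```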
The functions compute_app_entropy and compute_samp_entropy do not seem to fully implement Approximate and Sample Entropy as defined in [1, 2, 3]. These functions should be rewritten to conform to the definitions from the literature.
1. It has a compile-time dependency on NumPy, but doesn't declare it.
2. It has a source distribution for version 0.2, but no built distributions for this version, so pip or setuptools will blindly try to build the package while installing it, and will fail because of (1).
3. Finally, even if you manage to fix the NumPy dependency, the package also depends on the mne package but is required to install mne, which creates a dependency loop. This somehow works with pip, which is either able to break dependency loops or (more likely) ignores the error, but it fails with setuptools. Thus any package that tries to depend on mne-features also becomes impossible to install.
I am trying to extract frequency band features in the following way:
feature = ['energy_freq_bands', 'pow_freq_bands', 'line_length', 'kurtosis', 'ptp_amp', 'skewness']
fe = FeatureExtractor(sfreq=128, selected_funcs=feature)
X_new = fe.fit_transform(X)
It throws the following error: 'The entries of the given freq_bands parameter (%s) must be positive and less than the Nyquist frequency.'
So how can I remove the 100 Hz frequency value limit from ['energy_freq_bands', 'pow_freq_bands'] while extracting features?
I mean, how can I just set freq_bands=np.array([0.5, 4., 8., 13., 30.]) without changing much?
Just looking at the code, there might be an issue with the normalization when log=True and normalize=True are combined in the function compute_pow_freq_bands() at
I am trying to calculate the spectral edge frequency of my EEG data using mne-features. I am currently using Jupyter notebook via Anaconda-Navigator and I have installed the mne environment, but I keep getting this "ModuleNotFoundError: No module named 'utils'" error... I don't have much experience with coding and Python, so I would greatly appreciate it if you could explain thoroughly how to import the features below.
I am trying to extract all the available features from mne-features using the FeatureExtractor class. I cannot actually find the list of all available features, even in the documentation. Can someone guide me on how to extract all the available features with that class?
The line of code below uses two of them, but I want to use all of them:
fe = FeatureExtractor(sfreq=100., selected_funcs=['std', 'kurtosis'])
In extract_features or FeatureExtractor, the parameter selected_funcs is a list of aliases (str) which correspond to feature functions. For simplicity (and generalizability), each alias should correspond to the name of the feature function, without the compute_ prefix.
Each feature function should have a name of the form: compute_[suffix]
For each feature function, the corresponding alias should simply be: [suffix].
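With that convention, the alias could be derived mechanically from the function name; a sketch:

```python
def alias_of(func_name):
    # 'compute_higuchi_fd' -> 'higuchi_fd'
    prefix = 'compute_'
    if not func_name.startswith(prefix):
        raise ValueError('Feature function names should start with %r' % prefix)
    return func_name[len(prefix):]

print(alias_of('compute_higuchi_fd'))  # higuchi_fd
```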