ambitious-octopus / mi-eeg-1d-cnn

A new approach based on a 10-layer one-dimensional convolution neural network (1D-CNN) to classify five brain states (four MI classes plus a 'baseline' class) using a data augmentation algorithm and a limited number of EEG channels. Paper: https://doi.org/10.1088/1741-2552/ac4430

License: GNU General Public License v3.0

Python 98.98% Dockerfile 1.02%

brain-computer-interface artificial-neural-networks neuroscience

mi-eeg-1d-cnn's Introduction

A 1D CNN for high accuracy classification and transfer learning in motor imagery EEG-based brain-computer interface

Please read here

As @zewail-liu pointed out in issue #22, this code contains a bug that strongly affects the results of the paper. Please read the issue. The fix folder contains two files that fix the problem, but with the fix applied the results of the paper change drastically. The journal was promptly informed.

Reference paper

Mattioli F, Porcaro C, Baldassarre G. A 1D CNN for high accuracy classification and transfer learning in motor imagery EEG-based brain-computer interface. J Neural Eng. 2022 Jan 6; 18(6). doi: 10.1088/1741-2552/ac4430. PMID: 34920443.

Usage

Install the dependencies

To train the network (and perform inference) you need to install the dependencies. There are two ways to do so: (1) using a Docker container (recommended), or (2) using a Python environment. An NVIDIA GPU with at least 6 GB of memory is also recommended. The network was trained on an NVIDIA RTX 2060 and on an NVIDIA Tesla P100; training took about 30 minutes on the former and about 15 minutes on the latter.

Using a docker container (recommended)

What is a docker container?

In addition to Docker, we recommend installing NVIDIA Container Runtime (v2), which lets you create containers that use your NVIDIA GPU. Guide: NVIDIA Container Runtime (v2) installation guide. The container build process has been tested on Ubuntu 20.04. On a Windows machine you can try Windows Subsystem for Linux, but this has not been tested yet (if you test it, tell us about your experience by opening an issue).

Assuming you have Docker installed, building the container is straightforward.

Clone this repository git clone https://github.com/Kubasinska/MI-EEG-1D-CNN.git
Open a shell and cd into the docker folder MI-EEG-1D-CNN/docker
Run docker build -t eegcnn . (the trailing dot is part of the command). Depending on the permissions set on your machine, you may need to run this command as root by adding sudo at the beginning. This command builds a new container image called eegcnn, which may take a few minutes. During the build, the original dataset is downloaded and the paper's dataset is generated as described in the methods section of the original paper. Once the build finishes, everything is ready.
Run the container to check that everything has been installed correctly. To launch the container, go to the working directory of the MI-EEG-1D-CNN repository. You can launch the container in several ways; we recommend two: one that lets you plot graphs and one that does not.
1. I want to see some graphs. If you want to see graphs, you must allow the container to access your screen. To do this, run xhost +local:root; this lets the container render to your display by reading and writing through the X11 unix socket. Then, from the working directory of the repository, launch: docker run -it --gpus all -v $(pwd):/workspace -v /tmp/.X11-unix:/tmp/.X11-unix:rw -e DISPLAY=unix$DISPLAY --device /dev/dri --privileged -v /home/$USER/.Xauthority:/root/.Xauthority eegcnn bash. This opens a bash shell inside the container, and you are good to go! If you don't have an NVIDIA GPU or don't have NVIDIA Container Runtime (v2) installed, omit --gpus all. When you close the container, remember to run xhost -local:root.
2. I don't care about graphs. In this case, it is much easier! Just run: docker run -it --gpus all -v $(pwd):/workspace eegcnn bash and you're in.

NOTE: All paths in the code are already set for the container, so you don't have to change anything when using it. For example, the /dataset directory of the container has three sub-folders: /dataset/original, with the original dataset; /dataset/paper, with the data generated by the script dataset_generator/generator.py using the method described in the paper; and /dataset/saved_models, where trained models are saved automatically after training. When you run an inference script, you don't have to change anything because the script already knows where to find the trained model.

Using a python environment.

This procedure is more straightforward but can create dependency issues based on your machine or operating system. Using this procedure, you must manually download the original dataset and generate the dataset used in the paper. You also need Anaconda or Miniconda to create a separate python environment. The following guide assumes you have Anaconda or Miniconda installed on your system.

Open a terminal and cd into the MI-EEG-1D-CNN/docs folder. Run conda env create -f environment.yml. This creates a new Python environment called eeg containing almost all the necessary dependencies. The only missing dependencies are CUDA and cuDNN, which TensorFlow needs to use your GPU. If you don't have an NVIDIA GPU, skip this part. If you have an NVIDIA GPU, you need to install CUDA 10.1 and cuDNN 7.6 (be careful with the versions: CUDA 10.2 or 10.0 will not work, you need exactly 10.1, and the same applies to cuDNN). Please refer to the official NVIDIA website for installation; here is a guide for Windows.
Download the EEG Motor Movement/Imagery Dataset here. The dataset is quite large (3.4 GB), so the download will take a while. Once downloaded, extract it. If you have wget, you can download it from the terminal with the command wget -r -N -c -np https://physionet.org/files/eegmmidb/1.0.0/
Generate the dataset; this procedure simply takes the raw data and breaks it into the input dimension of the neural network. Use the script MI-EEG-1D-CNN/dataset_generator/generator.py. Change the dataset path to the path of the dataset you downloaded, and you are ready! Don't forget to run the script with the new conda environment eeg.

Train the network(s)

From here on, we assume that all dependencies have been installed correctly.

Directory structure

.
└── MI-EEG-1D-CNN/
    ├── data_processing/ # A module with useful functions
    │   └── general_processor.py
    ├── dataset_generator/ # Script that generates the dataset
    │   └── generator.py
    ├── docker/ # All the useful things to build the container
    │   ├── Dockerfile
    │   ├── environment.yml
    │   └── generator.py
    ├── docs/ # Scripts for inference and plotting
    │   ├── inference
    │   └── environment.yml
    ├── models/ # Scripts for training networks
    │   ├── hand_test
    │   └── transfer
    └── model_set/ # A module with all models
        └── models.py

The data_processing folder is a module that contains helper functions for assigning labels, generating the dataset as described in the paper, and other essential tasks. The dataset_generator folder contains the script that generates the dataset. Running generator.py with the correct paths creates the dataset described in the paper. The script saves the data of each subject and each channel combination separately, so the data can be loaded into memory subject by subject. The docker folder contains the Dockerfile and the environment.yml file.

The docs folder contains the scripts for inference and plotting. Inside the inference subfolder, you can find one script for each ROI described in the paper (including the test without batch normalization and the test without SMOTE). There are also scripts to do transfer learning on the seven random subjects. Be careful: these scripts assume that a trained model exists, so you must first train the models and save them.

The scripts to train the models are in the models folder, one script per ROI. The name pattern is train_ followed by the letter of the ROI (a, b, c, d, e, f). For example, running train_a.py starts training the network on ROI-A, composed of the channel pairs ([FC1, FC2], [FC3, FC4], [FC5, FC6]).

If you are using the container, no changes are needed: the scripts already know where to get the data and where to save the model. If you are not using the Docker container, you must change the paths manually. source_path refers to the dataset generated by the generator.py script. save_path refers to the path where the trained model will be saved.

The hand_test subfolder contains the tests done on the network to evaluate the importance of data augmentation, checkpointing, and batch normalization. As usual, if you don't use our container, change the paths. The transfer subdirectory contains seven scripts, each of which trains the model on ROI-E while excluding a single subject. These scripts are used to evaluate transfer learning. The model_set folder contains the neural network (informally called HopefullNet), written in TensorFlow using the Keras subclassing API.

Problems?

If you experience any problems, feel free to contact me ([email protected]) or open an issue.

Cite this paper

@article{mattioli20211d,
  title={A 1D CNN for high accuracy classification and transfer learning in motor imagery EEG-based brain-computer interface},
  author={Mattioli, Francesco and Porcaro, Camillo and Baldassarre, Gianluca},
  journal={Journal of Neural Engineering},
  year={2021},
  publisher={IOP Publishing}
}

mi-eeg-1d-cnn's Issues

Test #2

Trained network, single subject, sliding windows [step 0.5 seconds]. Show Plot.

Theoretical aspects, open discussion

First issue: Do we have a physical limit, in motor imagery?

Questions about the operation of script train_d.py and train_e.py

I'm sorry to bother you. I set up the required environment on an Ubuntu 18 computer a few days ago and ran the train_a, train_b, and train_c scripts without any problems.

I use JupyterLab to view and run the code. train_a, train_b, and train_c all run smoothly and train to completion.

When I run the scripts train_d and train_e, the script stops running when it prints out " before oversampling=[ ] " (as if it had already finished).

Confusingly, no error is reported either.

I am unable to solve this problem on my own for the time being and would appreciate your suggestions in your free time.

Additional note: I used PyCharm to run these scripts in a Windows 10 environment and they all worked fine.

2 second window, changelog

changelog -> 08/02
learning_rate = 1e-4 -> 1e-3
Added callbacks, Early Stopping and Checkpoints
epochs = 150 -> 200

Installing the packages with the environment YML

Hi! I'm trying to replicate your results to also test & compare the model with some data I personally recorded.

I have a Windows PC, so I chose the second solution of cloning the repo and creating the environment through the YML file. But I ran into some problems:

Firstly, most of the packages listed in the YML file raise a PackageNotFound error. This is strange because, for example, one of those packages is matplotlib. This is the full error:

ResolvePackageNotFound:
  - sip==4.19.13=py38he6710b0_0
  - matplotlib==3.4.2=py38h06a4308_0
  - zstd==1.4.9=haebb681_0
  - libgfortran4==7.5.0=ha8ba4b0_17
  - lz4-c==1.9.3=h295c915_1
  - libxml2==2.9.12=h03d6c58_0
  - intel-openmp==2021.3.0=h06a4308_3350
  - libstdcxx-ng==9.3.0=hd4cf53a_17
  - mkl==2021.3.0=h06a4308_520
  - qt==5.9.7=h5867ecd_1
  - gst-plugins-base==1.14.0=h8213a91_2
  - libpng==1.6.37=hbc83047_0
  - ca-certificates==2021.5.30=ha878542_0
  - pip==21.2.2=py38h06a4308_0
  - setuptools==58.0.4=py38h06a4308_0
  - fontconfig==2.13.1=h6c09931_0
  - ld_impl_linux-64==2.35.1=h7274673_9
  - libgomp==9.3.0=h5101ec6_17
  - xz==5.2.5=h7b6447c_0
  - glib==2.69.1=h5202010_0
  - libgfortran-ng==7.5.0=ha8ba4b0_17
  - openssl==1.1.1l=h7f8727e_0
  - gstreamer==1.14.0=h28cd5cc_2
  - libuuid==1.0.3=h1bed415_2
  - libgcc-ng==9.3.0=h5101ec6_17
  - libtiff==4.2.0=h85742a9_0
  - readline==8.1=h27cfd23_0
  - brotli==1.0.9=he6710b0_2
  - mkl-service==2.4.0=py38h7f8727e_0
  - pyqt==5.9.2=py38h05f1152_4
  - dbus==1.13.18=hb2f20db_0
  - libwebp-base==1.2.0=h27cfd23_0
  - sqlite==3.36.0=hc218d9a_0
  - tornado==6.1=py38h27cfd23_0
  - pcre==8.45=h295c915_0
  - libopenblas==0.3.13=h4367d64_0
  - matplotlib-base==3.4.2=py38hab158f2_0
  - freetype==2.10.4=h5ab3b9f_0
  - icu==58.2=he6710b0_3
  - openjpeg==2.4.0=h3ad879b_0
  - zlib==1.2.11=h7b6447c_3
  - lcms2==2.12=h3be6417_0
  - certifi==2021.5.30=py38h578d9bd_0
  - libffi==3.3=he6710b0_2
  - ncurses==6.2=he6710b0_1
  - numpy-base==1.19.2=py38h75fe3a5_0
  - pillow==8.3.1=py38h2c7a002_0
  - kiwisolver==1.3.1=py38h2531618_0
  - _openmp_mutex==4.5=1_gnu
  - python==3.8.3=hcff3b4d_2
  - jpeg==9d=h7f8727e_0
  - tk==8.6.11=h1ccaba5_0
  - libxcb==1.14=h7b6447c_0
  - expat==2.4.1=h2531618_2
  - cudatoolkit==10.1.243=h6bb024c_0

Then I tried putting all packages under the pip line. I know this could cause trouble in the future, but I was trying to at least make it run once; if that worked, I could have done it again, putting under pip only the packages Anaconda couldn't find (there are a lot of them, though). But I got another error:

Pip subprocess error:
ERROR: Invalid requirement: '_libgcc_mutex=0.1=main' (from line 1 of C:\Users\ottaaproject\Desktop\Lixi\MI GPU\MI-EEG-1D-CNN\docs\condaenv.amrlke0r.requirements.txt)
Hint: = is not a valid operator. Did you mean == ?
failed
CondaEnvException: Pip failed

At first glance, I feel like I might be using a different Anaconda version. I just updated to the latest release of Anaconda, but nothing changed. I hope you can help me solve this; I believe I could get a lot out of this comparison (and of course I can send you my results too!)

Thanks in advance. Greetings from Argentina!
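
The build strings in the error above (for example py38h06a4308_0) are platform-specific, which is a likely reason conda cannot resolve them on Windows, and the pip error arises because name=version=build is conda's spec syntax rather than pip's. A common workaround, not one confirmed by the authors, is to drop the build strings and skip packages that exist only on Linux. The sketch below illustrates the idea; the helper names strip_build and portable_specs are hypothetical, and the LINUX_ONLY set is an assumption that should be checked against your own environment.yml.

```python
# Assumption: these packages are Linux-only and can be skipped on Windows.
LINUX_ONLY = {"ld_impl_linux-64", "libgcc-ng", "libstdcxx-ng", "libgomp"}


def strip_build(spec):
    """Turn 'name==version=build' or 'name=version=build' into 'name=version'."""
    name, _, rest = spec.partition("=")
    version = rest.lstrip("=").split("=")[0]
    return f"{name}={version}"


def portable_specs(specs):
    """Remove build strings and skip packages assumed to exist only on Linux."""
    return [strip_build(s) for s in specs if s.partition("=")[0] not in LINUX_ONLY]


specs = [
    "matplotlib==3.4.2=py38h06a4308_0",
    "libgcc-ng==9.3.0=h5101ec6_17",
    "python==3.8.3=hcff3b4d_2",
]
for line in portable_specs(specs):
    print(f"  - {line}")
```

If a working Linux environment is available, conda env export --no-builds produces a file without build strings directly.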

LSTM single-subject approach

Code

Generate data and understand how tf.data works
How to split train and test!
Clean the repo.

Theory:

Understand how the hell an LSTM works.

ICA on Real Movement

ICA with template matching for runs [3,7,11]

A problem with the path to manually download the dataset

Hello, I am a postgraduate student who has just started studying BCI.
I tried running this code on my computer (Windows 10, using Anaconda to install the Python environment).
I now have the Python environment installed, but I encountered the following error while performing the "Generate the dataset" step:

FileNotFoundError: [WinError 3],The system could not find the specified path: './MI-EEG-1D-CNN-master/files/paper\FC1FC2'

I downloaded and unpacked the dataset myself and put it under the D:\CODE\MI-EEG-1D-CNN-master path.
I saw that the README says: "Change the dataset path to the path of the dataset you downloaded, and you are ready!"
I have tried several times and failed, so I have come here for help. What should I do?
I sincerely hope to get your reply.
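
The path in the error is relative to the current working directory and mixes forward slashes with a backslash. One possible cause, stated here as an assumption because the generator script is not shown, is that the output folder does not exist relative to the directory the script is run from. A minimal sketch of building an absolute path and creating the per-channel-pair output folder before writing to it; ensure_pair_dir and the example base path are hypothetical.

```python
from pathlib import Path


def ensure_pair_dir(out_root, pair_name):
    """Create <out_root>/<pair_name> (and any missing parents) and return it."""
    pair_dir = Path(out_root).resolve() / pair_name
    pair_dir.mkdir(parents=True, exist_ok=True)
    return pair_dir


# Example with the locations mentioned in the issue; adjust to your machine.
# base = Path(r"D:\CODE\MI-EEG-1D-CNN-master")
# ensure_pair_dir(base / "files" / "paper", "FC1FC2")
```

Using absolute paths also makes the result independent of the folder from which the script is launched.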

Original HopefullNet Architecture

import tensorflow as tf

class HopefullNet(tf.keras.Model):
    """
    Original HopefullNet
    """
    def __init__(self, inp_shape = (640,2)):
        super(HopefullNet, self).__init__()
        self.inp_shape = inp_shape

        self.kernel_size_0 = 20
        self.kernel_size_1 = 6
        self.drop_rate = 0.5

        self.conv1 = tf.keras.layers.Conv1D(filters=32,
                                            kernel_size=self.kernel_size_0,
                                            activation='relu',
                                            padding= "same",
                                            input_shape=self.inp_shape)
        self.batch_n_1 = tf.keras.layers.BatchNormalization()
        self.conv2 = tf.keras.layers.Conv1D(filters=32,
                                            kernel_size=self.kernel_size_0,
                                            activation='relu',
                                            padding= "valid")
        self.batch_n_2 = tf.keras.layers.BatchNormalization()
        self.spatial_drop_1 = tf.keras.layers.SpatialDropout1D(self.drop_rate)
        self.conv3 = tf.keras.layers.Conv1D(filters=32,
                                            kernel_size=self.kernel_size_1,
                                            activation='relu',
                                            padding= "valid")
        self.avg_pool1 = tf.keras.layers.AvgPool1D(pool_size=2)
        self.conv4 = tf.keras.layers.Conv1D(filters=32,
                                            kernel_size=self.kernel_size_1,
                                            activation='relu',
                                            padding= "valid")
        self.spatial_drop_2 = tf.keras.layers.SpatialDropout1D(self.drop_rate)
        self.flat = tf.keras.layers.Flatten()
        self.dense1 = tf.keras.layers.Dense(296, activation='relu')
        self.dropout1 = tf.keras.layers.Dropout(self.drop_rate)
        self.dense2 = tf.keras.layers.Dense(148, activation='relu')
        self.dropout2 = tf.keras.layers.Dropout(self.drop_rate)
        self.dense3 = tf.keras.layers.Dense(74, activation='relu')
        self.dropout3 = tf.keras.layers.Dropout(self.drop_rate)
        self.out = tf.keras.layers.Dense(5, activation='softmax')
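
The listing above shows only the constructor, so the order in which the layers are applied is not visible here. Assuming they are applied in declaration order (an assumption, since the forward pass is not shown), the number of features reaching the first Dense layer can be worked out with plain arithmetic. Batch normalization and dropout do not change the shape, so only the convolutions and the pooling layer matter. The function names are hypothetical.

```python
def conv1d_out(length, kernel_size, padding):
    """Output length of a stride-1 Conv1D for 'same' or 'valid' padding."""
    return length if padding == "same" else length - kernel_size + 1


def flatten_size(input_length=640, filters=32):
    """Feature count after Flatten, assuming layers run in declaration order."""
    n = conv1d_out(input_length, 20, "same")   # conv1: length unchanged
    n = conv1d_out(n, 20, "valid")             # conv2
    n = conv1d_out(n, 6, "valid")              # conv3
    n = n // 2                                 # avg_pool1: pool_size=2, stride 2
    n = conv1d_out(n, 6, "valid")              # conv4
    return n * filters


print(flatten_size())  # 9696
```

Under this assumption, the first Dense layer maps 9696 inputs to 296 units, which is where most of the network's parameters would sit.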

Prediction Time

Prediction time for the original HopefullNet with 4 second windows:
n = 10999

Mean -> 0.03883716804090982 s (38.83716804090982 ms)
STD -> 0.014576505798775771 s
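
How these timings were collected is not stated. A minimal sketch of one way to measure per-prediction time and report it in both seconds and milliseconds (1 s = 1000 ms); time_predictions is a hypothetical helper, and the lambda stands in for a real call such as a model's predict on one window.

```python
import statistics
import time


def time_predictions(predict, samples):
    """Return (mean, std) of per-call wall-clock time in seconds.

    Needs at least two samples, because the sample standard deviation is used.
    """
    durations = []
    for sample in samples:
        start = time.perf_counter()
        predict(sample)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


# Stand-in workload: summing a 640-value window, repeated 100 times.
mean_s, std_s = time_predictions(lambda x: sum(x), [[1.0] * 640] * 100)
print(f"mean {mean_s:.6f} s = {mean_s * 1000:.3f} ms, std {std_s * 1000:.3f} ms")
```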

Question consultation

Dear authors:
I have recently been looking at this project and find it interesting. Could I have the dataset used in this project? Thanks a lot! I am looking forward to your reply.

Best regards.

Test

Test, Camillo suggests taking only the first 2 seconds of each stimulus.

Question about the train_test_spliter

Hi there, inspiring method and great paper.
I have been trying to apply your work to another dataset for a couple of days, but I can't achieve good results.
After rechecking the code, I think it may be a data-split problem.
For example, see line 45 of MI-EEG-1D-CNN/models/train_a.py.

x is the loaded data, already shaped (events_num, 2, 640).
As we know, in a specific MI task, different channel pairs in one ROI behave similarly.
In line 52, splitting reshape_x may put channel pairs from the same task into both the training set and the test set, so the accuracy may rise for reasons unrelated to the model.

The data loading code of your work is a little hard for me to read, so I am writing my own data loading function (a humble one, without the baseline event type or SMOTE). It splits the data into training and test sets first, then reshapes it from (events_num, channels_num, 640) to (events_num, 2, 640). Fitting HopefullNet on that data did not end well.
I will paste my function below once I figure out how.

I hope to get your response; instructions on how to transfer HopefullNet to another dataset would be more than great.
Best wishes
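
The split-first approach described in this issue can be sketched as follows. This is an illustration of the idea only, not the repository's loading code: split_then_pair is a hypothetical name, and the sketch assumes consecutive channels along axis 1 form a pair. Because the split happens on whole events, all channel pairs from one event land on the same side.

```python
import numpy as np


def split_then_pair(x, y, test_fraction=0.2, seed=0):
    """Split by event first, then expand each event into channel pairs.

    x: (events, channels, samples), even channel count, consecutive
       channels forming a pair (an assumption of this sketch).
    y: (events,) labels.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_test = int(round(len(x) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]

    def to_pairs(idx):
        events, channels, samples = x[idx].shape
        pairs = channels // 2
        xp = x[idx].reshape(events * pairs, 2, samples)
        yp = np.repeat(y[idx], pairs)  # one label per channel pair
        return xp, yp

    return to_pairs(train_idx), to_pairs(test_idx)


x = np.random.default_rng(1).normal(size=(10, 6, 640))
y = np.arange(10)
(x_train, y_train), (x_test, y_test) = split_then_pair(x, y)
print(x_train.shape, x_test.shape)
```

For a network declared with an input shape of (640, 2), the arrays would still need a transpose, for example x_train.transpose(0, 2, 1).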

New strategy for real time classification

In order to achieve real-time classification, we are working on a new strategy:
Instead of creating a sliding window, the new strategy is to split each epoch into 2 parts (for now) and then train the net with double the number of examples. This avoids incoherent samples, that is, windows for which more than one label is possible.
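
A minimal sketch of the epoch-splitting idea, assuming epochs are stored as an array of shape (epochs, samples, channels); split_epochs is a hypothetical name and not a function from this repository. Each epoch is cut along time into equal parts and its label is repeated for every part.

```python
import numpy as np


def split_epochs(x, y, parts=2):
    """Split each epoch along time into equal parts, repeating its label.

    x: (epochs, samples, channels); samples must be divisible by parts.
    """
    epochs, samples, channels = x.shape
    if samples % parts:
        raise ValueError("samples must be divisible by parts")
    x_split = x.reshape(epochs * parts, samples // parts, channels)
    y_split = np.repeat(y, parts)
    return x_split, y_split


x = np.arange(3 * 640 * 2).reshape(3, 640, 2)
y = np.array([0, 1, 2])
x_half, y_half = split_epochs(x, y)
print(x_half.shape, y_half)
```

With 640-sample epochs this yields twice as many 320-sample examples, each carrying the single label of the epoch it came from.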

ICA on Imagery movement

ICA with template matching on runs [4,8,12] for all subjects.

To-Dos week April 26th

Generate for all subjects (for all runs) a database with a window size of 1/2 second and strides of 1/2 and 1/4 second, for all pairs of specular channels in the corresponding hemisphere
Generate for all subjects (for all runs) a database with a window size of 2 seconds and strides of 1/2 and 1/2 second, for all pairs of specular channels in the corresponding hemisphere
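
A minimal sketch of cutting a recording into windows with a given size and stride, both in seconds. It assumes a (samples, channels) array and the 160 Hz sampling rate of the EEG Motor Movement/Imagery Dataset; sliding_windows is a hypothetical helper, not the repository's generator.

```python
import numpy as np

SAMPLING_RATE_HZ = 160  # EEG Motor Movement/Imagery Dataset sampling rate


def sliding_windows(signal, window_s, stride_s, fs=SAMPLING_RATE_HZ):
    """Cut a (samples, channels) array into windows.

    Returns an array of shape (n_windows, window_samples, channels).
    The signal must be at least one window long.
    """
    window = int(round(window_s * fs))
    stride = int(round(stride_s * fs))
    starts = range(0, signal.shape[0] - window + 1, stride)
    return np.stack([signal[s:s + window] for s in starts])


recording = np.zeros((640, 2))  # 4 seconds of a two-channel pair
short = sliding_windows(recording, window_s=0.5, stride_s=0.25)
long = sliding_windows(recording, window_s=2, stride_s=0.5)
print(short.shape, long.shape)
```

A stride smaller than the window gives overlapping windows, so a window near an event boundary can cover more than one label; any labelling rule for such windows would be an additional design decision.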

Questions about the SavedModel

Hello, I am a college student. When I reproduce your code, I have not found a trained model. Where can I get the trained model?
Thank you very much