erikbjare / thesis

MSc thesis on: Classifying brain activity using EEG and automated time tracking of computer use (using ActivityWatch)

Home Page: https://erik.bjareholt.com/thesis/

Makefile 1.83% Python 54.90% Jupyter Notebook 43.27%
openbci activitywatch thesis muse msc-thesis eeg research neurosity machine-learning

thesis's Issues

Fulfill requirements for goal document

The process for CS students: http://cs.lth.se/examensarbete/hur-gaar-det-till/
General CS dep resource: http://cs.lth.se/examensarbete/
General LTH resource: http://www.student.lth.se/studieinformation/examensarbete/examensarbetsprocessen/

  • Working title, names and contact details of those involved, and preliminary start and end dates.
  • Background/context and motivation for the thesis.
  • Overall goals and problem statements/research questions.
  • Approach/methodology and methods.
  • Scientific basis and proven experience that the thesis will build on. This can be described, for example, as a couple of key references to articles or other material.
  • How is the thesis expected to contribute to the development of knowledge?
  • Preliminary description of the resources required to carry out the work, e.g. workplace and equipment, and how these will be arranged and made available.

Make better use of MNE

I ran into some tricky issues and found that MNE already has tools for exactly these cases.

After browsing the documentation again, it seems I've duplicated a lot of work by not using MNE when I probably should have.

A partial rewrite is in order.

  • See if MNE is useful for our arbitrary-length epochs and window-sliding approach.
  • Complete restructure of dataset into BIDS (#10)
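
The window-sliding approach over arbitrary-length epochs mentioned above can be sketched without any library. This is a minimal illustration, not the code used in the thesis: the function names, the window length, and the step size are all hypothetical, and a trailing remainder shorter than one window is simply dropped.

```python
from typing import Iterator, List, Sequence, Tuple


def sliding_windows(n_samples: int, window: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) sample indices of fixed-size windows inside one epoch.

    Hypothetical helper: windows that would run past the end are skipped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    for start in range(0, n_samples - window + 1, step):
        yield start, start + window


def window_epoch(samples: Sequence[float], window: int, step: int) -> List[Sequence[float]]:
    """Cut one arbitrary-length epoch into (possibly overlapping) fixed-length windows."""
    return [samples[a:b] for a, b in sliding_windows(len(samples), window, step)]
```

An epoch shorter than one window yields no windows at all, so very short epochs would need to be padded or discarded before this step.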

Muse data is frequently -1000 for TP9 and TP10

Not sure what's up with that, or how to deal with it.

From looking at the raw data, it looks like it's -1000 exactly every 5th row. Sometimes there are 2 in a row, and then it repeats every 5th row again.

Edit: Maybe this is just powerline noise? At the sampling frequency of 250 Hz, the powerline peak would occur roughly every 4th to 5th sample. Why are TP9 and TP10 so much more sensitive, though?

Example:

1603711387.314,-1000.000,-44.434,-38.574,-1000.000,0.000
1603711387.318,-609.375,-29.297,-27.344,-574.707,0.000
1603711387.322,787.109,-19.531,-23.926,814.941,0.000
1603711387.325,-852.051,-27.832,-22.461,-858.887,0.000
1603711387.329,184.082,-37.598,-23.926,189.941,0.000
1603711387.333,-1000.000,-45.410,-39.062,-1000.000,0.000
1603711387.337,-836.914,-34.668,-30.762,-804.688,0.000
1603711387.341,519.043,-18.555,-11.719,561.523,0.000
1603711387.345,-801.758,-18.555,-20.508,-808.105,0.000
1603711387.349,150.391,-23.438,-27.832,155.762,0.000
1603711387.353,-1000.000,-29.785,-26.367,-1000.000,0.000
1603711387.357,-1000.000,-30.762,-27.832,-1000.000,0.000
1603711387.361,178.711,-21.484,-20.508,231.934,0.000
1603711387.365,-764.648,-22.949,-21.484,-768.555,0.000
1603711387.368,222.168,-33.203,-31.250,198.242,0.000
1603711387.372,-1000.000,-39.551,-32.715,-1000.000,0.000
1603711387.376,-1000.000,-33.203,-31.250,-1000.000,0.000
1603711387.380,-36.133,-21.484,-26.367,-6.836,0.000
1603711387.384,-789.062,-16.113,-27.832,-781.738,0.000
1603711387.388,409.668,-18.066,-33.691,378.906,0.000
1603711387.392,-909.180,-28.320,-34.180,-925.293,0.000
1603711387.396,-1000.000,-26.855,-33.203,-1000.000,0.000
1603711387.400,-213.867,-16.113,-29.785,-186.523,0.000
1603711387.404,-873.535,-15.137,-23.438,-854.492,0.000
1603711387.407,650.879,-21.484,-28.320,603.516,0.000
1603711387.411,-738.281,-66.406,-38.086,-779.785,0.000
1603711387.415,-1000.000,-78.613,-34.180,-1000.000,0.000
1603711387.419,-204.102,-25.879,-19.531,-210.449,0.000
1603711387.423,-943.848,-7.812,-18.555,-930.664,0.000
1603711387.427,878.418,-9.277,-25.879,835.938,0.000
1603711387.431,-439.453,-28.320,-37.109,-500.977,0.000
1603711387.435,-1000.000,-36.621,-27.344,-1000.000,0.000
1603711387.439,-177.734,-18.066,-20.996,-181.641,0.000
1603711387.443,-991.699,-17.578,-37.598,-982.910,0.000
1603711387.447,-974.121,-29.297,-38.574,-996.094,0.000
1603711387.450,-139.160,-31.738,-42.969,-194.824,0.000
1603711387.454,-1000.000,-18.066,-48.340,-1000.000,0.000
1603711387.458,-303.223,-12.695,-33.691,-268.066,0.000
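
One way to quantify the pattern in the rows above is to count the samples where a channel sits exactly at -1000 and to estimate the effective sampling rate from the timestamps. The sketch below rests on two assumptions that the issue does not confirm: that the columns are timestamp, TP9, AF7, AF8, TP10, Right AUX (inferred from which columns show -1000), and that -1000 marks a clipped sample. All helper names are hypothetical.

```python
import csv
import io
from typing import List, Sequence, Tuple

CLIP_VALUE = -1000.0  # assumption: this exact value marks a saturated sample
TP9_COLUMN = 1        # assumption: column order is timestamp, TP9, AF7, AF8, TP10, AUX
TP10_COLUMN = 4

# First rows of the example above, copied verbatim.
SAMPLE = """\
1603711387.314,-1000.000,-44.434,-38.574,-1000.000,0.000
1603711387.318,-609.375,-29.297,-27.344,-574.707,0.000
1603711387.322,787.109,-19.531,-23.926,814.941,0.000
1603711387.325,-852.051,-27.832,-22.461,-858.887,0.000
1603711387.329,184.082,-37.598,-23.926,189.941,0.000
1603711387.333,-1000.000,-45.410,-39.062,-1000.000,0.000
1603711387.337,-836.914,-34.668,-30.762,-804.688,0.000
1603711387.341,519.043,-18.555,-11.719,561.523,0.000
1603711387.345,-801.758,-18.555,-20.508,-808.105,0.000
1603711387.349,150.391,-23.438,-27.832,155.762,0.000
1603711387.353,-1000.000,-29.785,-26.367,-1000.000,0.000
"""


def load_rows(text: str) -> List[Tuple[float, ...]]:
    """Parse headerless CSV text into tuples of floats."""
    reader = csv.reader(io.StringIO(text))
    return [tuple(float(x) for x in row) for row in reader if row]


def clipped_indices(rows: Sequence[Sequence[float]], column: int,
                    clip_value: float = CLIP_VALUE) -> List[int]:
    """Return the row indices where the given column equals the clip value."""
    return [i for i, row in enumerate(rows) if row[column] == clip_value]


def mean_sampling_rate(rows: Sequence[Sequence[float]]) -> float:
    """Estimate the sampling rate in Hz from the first and last timestamps."""
    duration = rows[-1][0] - rows[0][0]
    return (len(rows) - 1) / duration
```

Calling `clipped_indices(load_rows(SAMPLE), TP9_COLUMN)` gives the affected row positions, and the gaps between consecutive positions show how regular the repetition is. Comparing `mean_sampling_rate` against the nominal rate is a cheap sanity check before reasoning about how many samples one powerline cycle spans.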

Phase 3: Analysis

Tasks

  • Split phase into issues/tasks
  • Clean/align data
  • Set up a pipeline using MNE-Python and scikit-learn
  • Try different classifiers (one-vs-all and multiclass)
  • Implement classifier for codeprose (#25)
    • More or less done
  • More?
    • Get feedback from someone who knows their EEG

GQM

TODO (fetch from goaldoc)

Metrics

  • Confusion matrix (which activities are hard to classify/discern?)
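
As an illustration of the confusion-matrix metric, the sketch below counts how often each true activity is predicted as each other activity and derives per-class recall, which points at the activities that are hard to discern. It is a plain-Python sketch and not the analysis code from the thesis; the function names are hypothetical, and any activity labels passed in are placeholders.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def confusion_matrix(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> List[List[int]]:
    """Rows are true labels, columns are predicted labels, in the order of `labels`."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_recall(matrix: List[List[int]],
                     labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Fraction of each true class that was classified correctly."""
    recall = {}
    for i, label in enumerate(labels):
        total = sum(matrix[i])
        # A class with no true samples has undefined recall.
        recall[label] = matrix[i][i] / total if total else float("nan")
    return recall
```

Large off-diagonal entries in a row show which activity the classifier confuses that class with, which a single accuracy number would hide.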

Implement code vs prose comprehension task

Issue in eeg-notebooks: NeuroTechX/EEG-ExPy#70

Publicize work

Places to publicize thesis once done:

  • Personal Twitter, Facebook, LinkedIn
  • Any relevant subreddits?
  • OpenBCI channels (their LinkedIn content gets decent engagement)
  • NeuroTechX meetups/hacknights
  • Conferences (Markus might know)
  • Kaggle? (if parts of dataset can be made public)
  • Journals
    • Journal of Open Source Software: https://joss.theoj.org/
      • Probably a good fit for ActivityWatch. Should probably consider publishing a paper on it there, eventually.

PM

Hello Erik

Sorry to abuse your repo, but I emailed you on 10 Dec ("Muse pipeline wrapper") and didn't get a reply. Maybe it hit your spam folder? I'd be happy to get a ping back to know whether you're into it.

Thanks, and again - sorry for this repo-lution.
Oori

Collect PPG data from Muse S

  • Basic support
  • Test resilience during longer recordings
    • There seem to be some issues, see #11.
  • Convert PPG data to actionable features (such as HR, HRV).
    • Unclear how to do this from the PPG1, PPG2, PPG3 columns in CSV.
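
One possible route from a PPG trace to HR and HRV is to detect pulse peaks, take the intervals between them, and summarize those intervals. The sketch below is only a rough illustration under several assumptions: it works on a single channel (which of PPG1, PPG2, PPG3 carries the usable signal is not settled here), the sampling rate `fs` must be supplied by the caller, the peak detector is a naive local-maximum rule with a refractory period, and RMSSD is used as one common HRV measure. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def find_peaks(signal: Sequence[float], fs: float, min_interval_s: float = 0.3) -> List[int]:
    """Naive peak detector: local maxima above the mean, at least min_interval_s apart."""
    mean = sum(signal) / len(signal)
    min_gap = int(min_interval_s * fs)
    peaks: List[int] = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] > mean:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return peaks


def inter_beat_intervals_ms(peaks: Sequence[int], fs: float) -> List[float]:
    """Time between consecutive peaks, in milliseconds."""
    return [1000.0 * (b - a) / fs for a, b in zip(peaks, peaks[1:])]


def heart_rate_bpm(peaks: Sequence[int], fs: float) -> float:
    """Mean heart rate from the mean inter-beat interval."""
    ibis = inter_beat_intervals_ms(peaks, fs)
    if not ibis:
        raise ValueError("need at least two peaks")
    return 60000.0 / (sum(ibis) / len(ibis))


def rmssd_ms(peaks: Sequence[int], fs: float) -> float:
    """Root mean square of successive inter-beat interval differences."""
    ibis = inter_beat_intervals_ms(peaks, fs)
    diffs = [b - a for a, b in zip(ibis, ibis[1:])]
    if not diffs:
        raise ValueError("need at least three peaks")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))
```

Real PPG recordings would likely need band-pass filtering and motion-artifact rejection before peak detection, so this should be read as a starting point for experimentation only.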

Phase 1: Pilot study

Tasks

GQM

TODO (fetch from goaldoc)

Phase 2: Controlled multi-subject study

Tasks

  • Split phase into issues/tasks
  • Design experiment
    • Which activities?
    • Electrode placement?
    • Duration?
  • Enlist volunteers
  • Collect data (#27)

GQM

TODO (fetch from goaldoc)

Investigate classification tasks/datasets

We'll investigate previous approaches to classifying EEG data in tasks similar to ours.

See also #17.

Classifying tasks

Likely the most similar type of classification.

Synchronized Brainwave Dataset (2015)

Dataset on Kaggle: https://www.kaggle.com/berkeley-biosense/synchronized-brainwave-dataset

Stimuli:

They both follow the same process:

  • Blinking
  • Relax (closed eyes, focus on breathing)
  • Arithmetic
  • Relax with music
  • Video clip
  • Come up with examples from category
  • Count squares of chosen color

A popular subset of the stimuli is relaxation vs math.

Reading prose vs code (Fucci et al)

No publicly available dataset. Ask Fucci?

Classifying sleep stages

Shares some similarities (long recordings, "organic" data).

See the excellent YASA: https://github.com/raphaelvallat/yasa

Classifying emotion

Might be somewhat similar. Often uses 1-minute clips of happy/sad movie scenes as stimuli.

Sometimes split into arousal/valence.

Classifying mental states (focus etc)

Classifying things like focus is sometimes considered a simpler task where acceptable classification can be achieved with a simple power band ratio.
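
To make the power band ratio idea concrete, the sketch below computes the power of one channel in two frequency bands with a naive DFT and divides them. It is an illustration under assumptions: which bands to compare is not specified above, so the band edges are parameters, and the function names are hypothetical. The naive DFT is only practical for short windows; a real pipeline would use an FFT or Welch estimate.

```python
import cmath
import math
from typing import Sequence, Tuple


def band_power(signal: Sequence[float], fs: float, low: float, high: float) -> float:
    """Sum of squared DFT magnitudes for bins with low <= frequency < high (in Hz)."""
    n = len(signal)
    mean = sum(signal) / n  # remove the DC offset before transforming
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if low <= freq < high:
            coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            power += abs(coeff) ** 2
    return power


def band_ratio(signal: Sequence[float], fs: float,
               numerator_band: Tuple[float, float],
               denominator_band: Tuple[float, float]) -> float:
    """Ratio of power in two bands, e.g. (4, 8) over (13, 30) as a theta/beta ratio."""
    return band_power(signal, fs, *numerator_band) / band_power(signal, fs, *denominator_band)
```

A single ratio per window gives a one-dimensional feature that can be thresholded directly, which is why it serves as a simple baseline before trying a full classifier.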

EEG data for Mental Attention State Detection (focused, unfocused, drowsy)

Dataset on Kaggle (MATLAB files): https://www.kaggle.com/inancigdem/eeg-data-for-mental-attention-state-detection

Switch to using BIDS as primary data structure

BIDS spec & common principles: https://bids-specification.readthedocs.io/en/stable/02-common-principles.html

Might differ for different devices. Only Muse S has been tested so far.
