wendtke / psyphr

legacy repo for R package suite for psychophysiological data; see github.com/psyphr-dev

blood-pressure electrodermal-activity electromyography heart-rate-variability impedance-cardiography mindware-technologies psychophysiology

psyphr's People

Contributors

ajmcoqui, almccombs, iqis, wendtke

psyphr's Issues

Extra materials

What is the best location (in repo or out) for extra materials like the templates and example data from MindWare and the BIOPAC editing steps that one lab shared? Do we need a cloud folder, @iqis ?

Update description file (authors, contributors, funders, acknowledgements)

See #48 and #11

See here

Navigating authorship and contributions (from discussion with GBA)

  • Amanda, Audrey, rOpenSci peer-reviewers as possible contributors; acknowledge in README and Wiki
  • Mallory likely as third author
  • Brooke as either author or contributor -- TBD
  • NSF GRFP (and other funders) in description section (acknowledgement and disclosure)

"This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. 006784-00002 [to KEW]. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation."

  • unconf as original spot
  • acknowledgements, attribution, and/or disclaimers for and from supported companies (MW; BIOPAC) and maybe CSU. Ask them what they would like to include! Their legal teams might need to be in the loop, especially IF we use proprietary code from one of the supported companies; how does this influence the license (#48) selection?

Follow {tidyverse} principles

In addition to reading

consider (@geanders suggestions):

  • 1. input/output data in same format (allows functions to retain order)
  • 2. common prefix to function name (psyphr_read_wb())
  • 3. check tidy eval book on how to manage column-naming conventions within mutate to allow users to bring in non-MindWare data
  • 4. create unique geom
  • 5. check lubridate or tidyr for examples of maintaining consistency across functions and packages (e.g., verbose = TRUE option within function)
  • 6. create umbrella package with modular packages within it to wrangle raw and output, visualize, analyze (see #52)

Common Data Quality Expectations

Hi @wendtke , as we've talked on the phone, it seems a valuable proposition to take some of the data QA work into the package.

Could you please make a short list of common expectations, starting with those we've talked about?

For example:

  • HRV Stats : Respiration Peak Frequency is expected to be within the range of Settings : HF/RSA Frequency Band
  • HRV Stats : Segment Duration is expected to have a consistent value across segments and to match Settings : Segment Time
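
The two example expectations could be checked roughly as follows. This is only a sketch: the function name, the column names (resp_peak_freq, segment_duration), and the numbers in the demo data are illustrative assumptions, not MindWare's actual field names or recommended thresholds.

```r
# Hypothetical helper: flag HRV segments that violate two expectations.
# Column names are assumptions, not MindWare's actual field names.
check_hrv_expectations <- function(hrv_stats, hf_band, segment_time) {
  stopifnot(is.data.frame(hrv_stats),
            length(hf_band) == 2,
            hf_band[1] < hf_band[2])

  # Expectation 1: respiration peak frequency lies inside the HF/RSA band.
  resp_ok <- !is.na(hrv_stats$resp_peak_freq) &
    hrv_stats$resp_peak_freq >= hf_band[1] &
    hrv_stats$resp_peak_freq <= hf_band[2]

  # Expectation 2: segment duration matches the configured segment time.
  duration_ok <- !is.na(hrv_stats$segment_duration) &
    abs(hrv_stats$segment_duration - segment_time) < 1e-8

  data.frame(
    segment = seq_len(nrow(hrv_stats)),
    resp_in_band = resp_ok,
    duration_matches = duration_ok
  )
}

# Made-up demo values, for illustration only.
hrv_stats <- data.frame(
  resp_peak_freq = c(0.25, 0.10, 0.30),
  segment_duration = c(60, 60, 45)
)
qa <- check_hrv_expectations(hrv_stats, hf_band = c(0.12, 0.40), segment_time = 60)
print(qa)
```

Returning one row per segment, rather than stopping at the first failure, would let the user decide whether to drop or just inspect the flagged segments.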

Check file before reading

check:

  • Whether the file is indeed from the specified vendor (MindWare)
  • Whether the file is indeed the specified type ("HRV")
    • auto guess file type?

Visualization Schemes

We would need some examples of commonly applicable visualizations, within a subject or across a study.

Implement GBA code suggestions

@iqis I met with @geanders today. She recommended making the following changes:

  • 1. restructure expected study directory to separate folders (e.g., subject_1/task_1 and task_2 and task_....) rather than file name with subject_task
  • 2. Use importFrom magrittr %>% in roxygen notes to specify pipe
  • 3. Don’t put suppressWarnings() within function; instead, use purrr::quietly or purrr::safely
  • 4. For exported functions, put an example in roxygen notes (deferred, pending better sample data)
  • 5. Define precise parameters (e.g., a character string that gives path to... rather than path only)
  • 6. user-friendly output of print(psyphr_workbook); see #43
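
As an illustration of suggestion 1, subject and task could be inferred from the folder structure instead of the file name. The sketch below assumes a layout of <study_root>/<subject>/<task>/<file>; the function name and the example path are hypothetical.

```r
# Hypothetical helper: infer subject and task from a path laid out as
# <study_root>/<subject>/<task>/<file>. The layout itself is an assumption.
parse_study_path <- function(path) {
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  n <- length(parts)
  if (n < 3) {
    stop("expected at least <subject>/<task>/<file> in: ", path)
  }
  list(subject = parts[n - 2], task = parts[n - 1], file = parts[n])
}

info <- parse_study_path("my_study/subject_1/task_2/hrv.xlsx")
str(info)
```

On Windows, paths would first need forward slashes, for example via normalizePath(path, winslash = "/").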

data munging example

brief description of data munging process (from start to end with data examples)

@iqis do you still want/need this? You mentioned it in the phone call.

Compare MW output

compare output data across all MindWare applications; create example data from demo software

TIMELINE: rOpenSci > CRAN > journal

See #15 for rOpenSci info
See #45 for author discussion

Proposed dissemination timeline

  1. rOpenSci: data munging
  2. CRAN
     • Commitment of maintainer (one person; can transfer responsibility) depends on package (complexity, reach, and versions)
     • 1-2 emails/month about bugs or feature requests
     • Biggest thing: Keep up with tidyverse and system application versions
  3. Journal: Compare JOSS vs. The R Journal vs. content area journal (e.g., Psychophysiology)

Study/file size

We want psyphr to work on a normal laptop, which nowadays has somewhere between 4 and 12 GB of usable memory, and R normally should not use more than half of the total memory. Currently read_study() reads everything at once, so a really big study can create a problem.

If the problem exists, there are at least two ways to mitigate the problem:

  • Construct a promise in lieu of reading in the data; the data is read from disk as needed.
  • Read the study and cache the resulting R object onto disk incrementally.
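
The first option could look something like the sketch below. It uses a closure that reads on first use and caches the result, as a stand-in for a true promise (base R's delayedAssign() is an alternative). The names lazy_workbook and counting_reader are hypothetical, and a temporary CSV file stands in for a real workbook.

```r
# Hypothetical sketch: wrap a file path in a function that reads on first use
# and caches the result. `reader` stands in for whatever parser psyphr uses.
lazy_workbook <- function(path, reader) {
  cache <- NULL
  loaded <- FALSE
  function() {
    if (!loaded) {
      cache <<- reader(path)
      loaded <<- TRUE
    }
    cache
  }
}

# Demonstration with a temporary CSV file instead of a real workbook.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3), tmp, row.names = FALSE)

# Count how often the file is actually read from disk.
n_reads <- 0
counting_reader <- function(path) {
  n_reads <<- n_reads + 1
  read.csv(path)
}

wb <- lazy_workbook(tmp, counting_reader)  # nothing is read yet
first <- wb()                              # triggers the read
second <- wb()                             # served from the cache
```

A study object could then hold one such function per workbook, so memory is only used for the workbooks that are actually touched.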

What is the likely total size of a study? I'm looking for a figure at about the 80th percentile, and I surely hope it will be small enough.

Best way to rename/recreate repo

@iqis @ajmcoqui @almccombs

I would like to change the repo/package name to psyphyr. Do you think the best approach to this would be to (eventually) recreate a new repo, transfer the content and collaborators, and delete psyr?

Any suggestions would be helpful. This is not a vital change at the moment, but I figured changing the name earlier (before making the repo public or trying to publish the package) would be better.

consolidate read_MW_*() family functions

read_MW() ->
validate data format ->
dispatch corresponding parsing function

Automatically detect and parse workbook format, using:

  • Unique names of the worksheets
  • Unique fields in the Setting worksheet.
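
A minimal sketch of the detect-then-dispatch flow is below. It assumes that each format has at least one worksheet name unique to it. The signatures listed, the placeholder parsers, and the function names are illustrative assumptions; the real worksheet names would have to be confirmed against sample workbooks.

```r
# Hypothetical signatures: sheet names assumed to be unique to each format.
# The real MindWare sheet names may differ.
format_signatures <- list(
  HRV = c("HRV Stats"),
  EDA = c("SCR Stats")
)

# Return the single format whose signature sheets are all present.
detect_mw_format <- function(sheet_names) {
  hits <- vapply(
    format_signatures,
    function(sig) all(sig %in% sheet_names),
    logical(1)
  )
  if (sum(hits) != 1) {
    stop("could not identify a unique workbook format")
  }
  names(format_signatures)[hits]
}

# Placeholder parsers; real ones would read and tidy each worksheet.
parsers <- list(
  HRV = function(path) paste("parsed HRV workbook:", path),
  EDA = function(path) paste("parsed EDA workbook:", path)
)

# Validate the format, then dispatch to the corresponding parser.
read_mw_sketch <- function(path, sheet_names) {
  format <- detect_mw_format(sheet_names)
  parsers[[format]](path)
}

result <- read_mw_sketch("subject_1.xlsx", c("HRV Stats", "Settings"))
print(result)
```

In practice the sheet names could be obtained with readxl::excel_sheets(path) rather than passed in by hand.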

README Page

create a README page with:

Package name, Badges

Brief Introduction

Installation instructions

Minimal working code example (pick one use case)

TODO

License

Survey questions

Read Issue #19 first for more ideas.

Some ideas for questions. There are a lot, so we will probably have to cut some.

  • What kind of data collection system do you use? [multiple choice + other?]
  • If you use MindWare, which analysis applications do you use? [MC]
  • Which physiological measures do you include in your research? [MC?]
  • Please describe your study design(s) in terms of number of participants and laboratory procedure sequence (e.g., 200 healthy adults; physiological baseline period of X minutes, challenge period of Y minutes, and recovery period of Z minutes).
  • What is the length of your segments/bins/epochs? [MC - 30s, 1m, etc.]
  • What is your file directory structure for data editing and compilation?
  • What are your file naming conventions (e.g., subjectID_task)?
  • What is your desired output data structure?
  • Do you employ data filters based on empirical guidelines? If so, which ones? [MC + open]
    -- Segment length 30s+ for valid RSA [HRV]
    -- Drop segments >10% of estimated R-peaks [HRV]
    -- Exclude segments outside expected range of respiratory peak frequency [HRV]
  • What common visualization and exploratory data analysis techniques do you employ? (e.g., time series of RSA; average RSA per time period - baseline, challenge, recovery)

MW BioLab Epoch File?

An Epoch File contains the metadata of a subject's activity period; manually tagged? How to integrate with measurement data?

incorporate other MW formats

Workbook formats:

  • BPV

  • EMG

  • Startle EMG : @wendtke Are you familiar with this type? In the sample data there is no information on "Right Eye". Is this expected?

  • IMP

  • BSA: Unstable format, need a closer look

Evaluate BIDS Schema

Tom Johnston on Twitter suggested "BIDS", Brain Imaging Data Structure here.

  • Promotes creation of portable, open analysis pipelines & software.
  • spec on physiological data

Goals:

  • find useful fields for psychophysiology in BIDS schema,
  • explore higher level compatibility

Python API:

Validation Tool:

Cast data into correct type

Currently all data are read in verbatim as "character".
Make a parse_MW_() function family to address all kinds, then call from read_MW_() family

Use dplyr::mutate_*() family.
Keep categorical variables as "character" or press into "factor"? @wendtke This also begs another question, what are the possible levels of a factor? e.g. SCR Type in SCR Stats from EDA databook.

  • EDA
  • HRV
  • EMG
  • Startle EMG
  • IMP
  • BPV
  • BSA
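
One possible shape for the casting step is sketched below. It uses base R's utils::type.convert() in place of the dplyr::mutate_*() family, purely to keep the example dependency-free. The column names, the values, and the "N/A" missing-value marker are made-up placeholders, not actual MindWare output.

```r
# Placeholder data: everything arrives as character, as described above.
raw <- data.frame(
  segment = c("1", "2", "3"),
  rsa = c("6.21", "5.87", "N/A"),
  scr_type = c("type_a", "type_b", "type_a"),
  stringsAsFactors = FALSE
)

# Convert each column to the most specific type that fits its values.
# as.is = TRUE keeps categorical columns as character rather than factor.
parsed <- raw
parsed[] <- lapply(
  raw,
  utils::type.convert,
  as.is = TRUE,
  na.strings = c("NA", "N/A")
)

str(parsed)

# If factors are preferred once the possible levels are known:
# parsed$scr_type <- factor(parsed$scr_type, levels = c("type_a", "type_b"))
```

Keeping categorical columns as character until the full set of levels is confirmed avoids silently dropping or mislabeling unexpected values.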

MindWare info/input

I have a video chat with a MW representative on Tuesday, June 4. I had to reschedule from a few weeks back.

I will ask about

  • MindWare's file naming conventions for output data
  • Best approach for end-user to set study schema (number of subjects, tasks, and files)
  • MindWare's structure for the other output data files per analysis application (i.e., are all of them structured similarly to EDA and HRV? can we have a sample of each output type for our package development and testing?)
  • quality control criteria for Electrodermal Activity (I asked them this in the past, and they were not that helpful. I am re-reading the EDA chapter from the Handbook of Psychophysiology.)
  • common visualization needs for end-user (I have gotten some insight on this from my recent analyses of respiratory sinus arrhythmia for a poster.)

@iqis Do you have any other questions for MindWare?

(Suspended) Add study/subject/activity information to workbook objs

Per Discord conversation @iqis @wendtke 20190625:

A workbook generally has three ID dimensions:

  • What subject(participant)...
  • doing what activity....
  • in what study.
    ... and may potentially be more.

This information shall be inferred from folder/name structure. See: #21.
This information is key to downstream analysis.

Issue Suspension:

  • It is already possible to identify workbooks through read_study(), with the mechanism proposed here. Is it necessary to repeat this on individual workbooks?

Design and Implement a _Study_ Object

... composed of many psyphr_workbook objects, with subject/activity identification inferred from file structure of workbooks (See #21).

Able to:

  • Generate high-level summaries/visualization across subjects/activities
  • Output all data in desired format to the file system
    • save_study()
  • Helper function to saveRDS() (?)

S3 object, class name: psyphr_study

generics:

  • print.psyphr_study()
  • summary.psyphr_study()
  • ...
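
A minimal sketch of such an S3 class is below, assuming the study is stored as a data frame with one row per workbook. The constructor name, the column names, and the empty placeholder workbooks are assumptions made for illustration.

```r
# Hypothetical constructor: a psyphr_study is sketched here as a data frame
# with one row per workbook. The column names are assumptions.
new_psyphr_study <- function(subject, activity, workbook) {
  stopifnot(length(subject) == length(activity),
            length(subject) == length(workbook))
  study <- data.frame(subject = subject, activity = activity,
                      stringsAsFactors = FALSE)
  study$workbook <- workbook  # list column holding workbook objects
  class(study) <- c("psyphr_study", class(study))
  study
}

# Compact one-line overview instead of dumping every workbook.
print.psyphr_study <- function(x, ...) {
  cat("<psyphr_study>", nrow(x), "workbooks,",
      length(unique(x$subject)), "subjects,",
      length(unique(x$activity)), "activities\n")
  invisible(x)
}

# Cross-tabulate which subject has data for which activity.
summary.psyphr_study <- function(object, ...) {
  table(subject = object$subject, activity = object$activity)
}

study <- new_psyphr_study(
  subject = c("s1", "s1", "s2"),
  activity = c("baseline", "task", "baseline"),
  workbook = list(list(), list(), list())  # empty placeholders
)
print(study)
tab <- summary(study)
```

Building on a data frame keeps the object compatible with ordinary subsetting, and the summary table makes missing subject/activity combinations easy to spot.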

Use a control file in YAML or DCF.

Downstream analyses: Common approaches and use cases

Right now, as I'm trying to figure out the best approach, I need to know some common characteristics of downstream analyses. Some detailed use cases will help. For example, what are some frequently used statistical models? Is modeling usually done for each and every subject, or across some kind of summary of a group?

Originally posted by @iqis in #58 (comment)

Consider existing resources

Look at this and other related resources for ideas -- of what NOT to do. This kind of package is to clean raw heart rate variability data, not to wrangle existing data.

psyphr is different than what this project offers.

Background on MindWare Technologies

MindWare Technologies, Ohio sells 6 analysis applications that provide output data we are interested in wrangling. These include Basic Signal Analysis (BSA), Blood Pressure Variability (BPV) Analysis, Electrodermal Activity (EDA) Analysis, Electromyography (EMG) Analysis, Heart Rate Variability (HRV) Analysis, and Impedance Cardiography (IMP) Analysis. So far, I am only familiar with EDA and HRV. Eventually, I would like psyphr to wrangle data from all MindWare analysis applications and then move to add options for data from BIOPAC Research Solutions.

Here is some more information on EDA and HRV.
EDA Analysis 3.2 Manual
HRV Analysis 3.2 Manual

Aside: BioLab is the data acquisition software, which provides the raw data files for the analysis applications. The analysis applications then export the edited output data for compilation, analysis, and visualization.

Interesting tidbit: Years ago, MindWare had its own proprietary study compilation tool for use across analysis applications. They do not offer it to clients anymore, but maybe there is content in the manual that might inform our approach in managing the file naming problem or other things. It looks like they required users to enter subject ID, etc.

License

Update #11 and #45 with final decision

  • End-user license (MIT vs. GPL)
    "Worst" case scenario: company takes psyphr, puts GUI on it, and sells it (could be with or without attribution). Let's talk through the scenario with each license and consider if we are comfortable with which/either outcome.

  • Can we change license after making repo public or submitting/publishing?
    GBA 20190714 suggested not to do so:

Yes, I think you should be able to change licenses down the line, with all coauthors’ agreement. I’d try not to too often, though—if people ever use it within other things they make, I think a change from MIT to GPL might affect what they can do (if they’re creating under a license that isn’t open source). The general consideration, when you are maintaining a package, is to try to limit the changes you make that could break a lot of things “downstream” for people who might be building off your package. This, of course, is only a big issue when the package has a lot of users, which isn’t the case for plenty of packages (although I think yours could get a lot of downstream development, where people are using your package as a dependency in their own package). But it’s not the end of the world if you change your license later, I think.

Resources from GBA
R Packages
Understanding Open Source and Free Software Licensing

Set up continuous deployment (integration + delivery)

Continuous integration is a service that automatically checks your code for errors each time a new commit is pushed to GitHub. A badge can show whether the build passes its tests. At the current stage, our code will most likely fail the CI's stringent standards. But don't be discouraged.

Before we can implement free CI service, our repo needs to be open.

Set up:

  • TravisCI
  • AppVeyor

Hi!

Introducing myself again! My name is Siqi Zhang. I've been using R since 2012 and am a freelance R developer. Rather than using the language for analysis, my edge is sharper on the language itself. It is my pleasure to be on board this open source project.

I'm excited that you've already made very substantial progress. I think when we're ready to take it further, we should exchange opinion on each other's thoughts and situations. Hit me up at [email protected].

In the meantime, I'm going to branch it off and start poking around. Looking forward to hearing from you!
