rymc / bhealth
This library is designed to be used with accelerometer and RSS data sets, collected as part of longitudinal studies in Digital Health

Python 7.80% Shell 0.02% Jupyter Notebook 92.18%
accelerometer activity-recognition digitalhealth localization

bhealth's People

Contributors: farnooshheidarivincheh, lucyhaixiabi, mkoz71, perellonieto, rafaelpo, rymc, weisongyang

bhealth's Issues

Add periods of non-wear detection / removal

A nice feature of the library would be a method to detect, and remove, periods of non-wear for wearable devices.

How I have done it in the past with the EurValve wearable is:
"Periods of time where the patient is not wearing the wearable are excluded by measuring the variance in arm angle changes over 20 minute blocks of time. If the variance in a block is less than 1×10−7, then that block of time is excluded from analysis."

Though we may want to try other methods.
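As a starting point, the EurValve approach described above could be sketched as follows. This is only one reading of the description (the exact arm-angle formula is an assumption, as are the function and parameter names), not bhealth code:

```python
import numpy as np

def non_wear_mask(acc, fs, block_minutes=20, var_threshold=1e-7):
    """Return a per-sample boolean mask, True where the device appears worn.

    acc is an (n, 3) accelerometer array; fs is the sample rate in Hz.
    Blocks whose variance of arm-angle changes falls below the threshold
    are marked as non-wear, per the EurValve description.
    """
    # Arm angle of the z-axis relative to the horizontal plane (radians).
    angle = np.arctan2(acc[:, 2], np.sqrt(acc[:, 0]**2 + acc[:, 1]**2))
    block = int(block_minutes * 60 * fs)
    mask = np.ones(len(acc), dtype=bool)
    for start in range(0, len(acc), block):
        changes = np.diff(angle[start:start + block])
        if changes.size and np.var(changes) < var_threshold:
            mask[start:start + block] = False
    return mask
```

The mask can then be used to drop non-wear samples before windowing.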

ValueError: setting an array element with a sequence in localisation_example.py

Running localisation_example.py results in the following error.

Traceback (most recent call last):
  File "examples/localisation_example.py", line 149, in <module>
    clf_grid.fit(X_train, y_train)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 722, in fit
    self._run_search(evaluate_candidates)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 1191, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 711, in evaluate_candidates
    cv.split(X, y, groups)))
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 917, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 528, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/pipeline.py", line 265, in fit
    Xt, fit_params = self._fit(X, y, **fit_params)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/pipeline.py", line 230, in _fit
    **fit_params_steps[name])
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/externals/joblib/memory.py", line 342, in __call__
    return self.func(*args, **kwargs)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/pipeline.py", line 614, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/base.py", line 467, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/impute.py", line 223, in fit
    X = self._validate_input(X)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/impute.py", line 197, in _validate_input
    raise ve
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/impute.py", line 190, in _validate_input
    force_all_finite=force_all_finite, copy=self.copy)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/sklearn/utils/validation.py", line 527, in check_array
    array = np.asarray(array, dtype=dtype, order=order)
  File "/home/rmc/anaconda2/lib/python3.6/site-packages/numpy/core/numeric.py", line 538, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
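The usual cause of this ValueError is a ragged array: rows of different lengths reaching sklearn's check_array, which calls np.asarray internally. A minimal, dependency-free repro (illustrative only; the fix is to ensure every feature row has the same length before fitting):

```python
import numpy as np

# Rows of unequal length cannot form a 2-D float array; sklearn's
# check_array hits the same np.asarray call internally.
rows = [[1.0, 2.0], [3.0]]
try:
    np.asarray(rows, dtype=np.float64)
except ValueError as exc:
    print(type(exc).__name__)  # ValueError
```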

Add a sliding window class

If the transform class is only a set of functions:

  1. Should we have a sliding window class that applies the given transformations to the data?
  2. Or should the transformation class use the sliding window and apply the transformations?
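Option 1 could be sketched like this (a hypothetical API, assuming the transformations are plain callables; not a proposal for bhealth's final interface):

```python
import numpy as np

class SlidingWindow:
    """Sketch of option 1: the window object applies the given transforms."""

    def __init__(self, window_length, step):
        self.window_length = window_length
        self.step = step

    def apply(self, x, transforms):
        # One feature row per window; one column per transform.
        out = []
        for start in range(0, len(x) - self.window_length + 1, self.step):
            window = x[start:start + self.window_length]
            out.append([t(window) for t in transforms])
        return np.asarray(out)
```

Usage would then be `SlidingWindow(250, 125).apply(X, [np.mean, np.std])`, keeping the transform functions themselves window-agnostic.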

Consider other feature modalities apart from Acc. and RSSI

The current examples only consider accelerometer and RSSI values. However, if the library is to be general and more useful to other research groups, it would be worth thinking about other modalities and seeing whether they fit the current framework, or how it should be modified.

Some modalities that may be non-trivial to add are RGB video data or silhouettes.

bhealth.metric_wrappers.Wrapper with the argument csv_prep generates exception

On every example in the examples folder, if the csv_prep argument is set when creating a Wrapper, plot_metrics raises the following exception:

Traceback (most recent call last):
  File "synthetic_long_example.py", line 210, in <module>
    figures_dict = plot_metrics(metric_container_daily, date_container_daily, labels_=labels)
  File "../bhealth/visualisations.py", line 96, in plot_metrics
    for key in proportion:
TypeError: 'float' object is not iterable

To avoid the exception I have removed the optional argument from all the examples, but this needs to be investigated.

Add mathematical & textual descriptions to documentation

For the documentation it would be nice if we added both a mathematical description of each function (where sensible) and a textual description.

Currently the transform functions (e.g. spectral entropy) contain no information about what the function does.
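As an illustration, a docstring combining both could look like the sketch below. This computes spectral entropy via the normalised power spectral density; it is not necessarily how bhealth's transform implements it:

```python
import numpy as np

def spectral_entropy(x):
    """Shannon entropy of the normalised power spectral density of ``x``.

    The PSD P(f) is estimated from the magnitude-squared FFT and
    normalised to sum to one, then

        H = -sum_f P(f) * log2(P(f))

    Low values mean power is concentrated in a few frequencies (e.g. a
    steady periodic movement); high values indicate broadband activity.
    """
    psd = np.abs(np.fft.rfft(x)) ** 2
    psd = psd / psd.sum()
    psd = psd[psd > 0]  # drop zero bins to avoid log(0)
    return -np.sum(psd * np.log2(psd))
```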

Add Python TSFEL library support for feature extraction

TSFEL, the Time Series Feature Extraction Library for Python, was released recently. It may provide very useful feature extraction methods that we do not currently have.

Is it possible to incorporate the library into some of our feature extraction functions? Perhaps as a wrapper around TSFEL, or as examples of how this can be done.

Marília Barandas, Duarte Folgado, Letícia Fernandes, Sara Santos, Mariana Abreu, Patrícia Bota, Hui Liu, Tanja Schultz, Hugo Gamboa, TSFEL: Time Series Feature Extraction Library, SoftwareX, Volume 11, 2020, https://doi.org/10.1016/j.softx.2020.100456.
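TSFEL's documented entry points are tsfel.get_features_by_domain and tsfel.time_series_features_extractor. To keep the sketch below dependency-free, it shows a generic per-window extractor interface that a TSFEL-backed callable could plug into alongside bhealth's own transforms (names here are illustrative, not bhealth's API):

```python
import numpy as np

def extract_features(windows, extractors):
    """Apply a list of per-window feature extractors to windowed data.

    Each extractor is any callable mapping a window to a scalar; a
    closure wrapping a TSFEL call could be passed in the same way as
    a plain NumPy function.
    """
    return np.array([[f(w) for f in extractors] for w in windows])
```

Usage: `extract_features(windows, [np.mean, np.std, spectral_entropy])`, where any of the entries could be replaced by a TSFEL wrapper.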

Issue in accelerometer_example.py

In accelerometer_example.py when using the provided example dataset:

while True:
    windowed_raw = transform.slide(X)
    if len(windowed_raw) > 0:
        try:
            windowed_features = [ts[transform.current_position][0]]
        except Exception as e:
            print(e)
            break

I get an out-of-bounds access error. The error likely originates in transform.slide(), where "windowed_raw" is assigned first and "current_position" is advanced afterwards:

window = x[int(self.current_position-self.window_length):int(self.current_position)]
if len(window) > 0:
    if len(window.shape) > 1:
        window = window[~np.isnan(window).any(axis=1)]
    else:
        window = window[~np.isnan(window)]
if update:
    # TODO Check that this does not break anything
    self.current_position += self.step
return window

UnboundLocalError: local variable 'X_new' referenced before assignment

When running accelerometer_example.py I get the following error.

(base) rmc@gamma:~/NewLib/digihealth$ python examples/accelerometer_example.py
Found 4 house folders.
Found 4 experiment folders.
Running folder:  1
Running folder:  2
Running folder:  3
Running folder:  4
Window size of 10 seconds and overlap of 0.1%
Use number of mean crossings, spectral entropy as features...
index 69100 is out of bounds for axis 0 with size 69091
Traceback (most recent call last):
  File "examples/accelerometer_example.py", line 153, in <module>
    X, y = preprocess_X_y(ts, X, y)
  File "examples/accelerometer_example.py", line 66, in preprocess_X_y
    new_X = transform.feature_selection(new_X, new_y, 'uni')
  File "../digihealth/transforms.py", line 246, in feature_selection
    return X_new
UnboundLocalError: local variable 'X_new' referenced before assignment

It appears that the feature_selection method in transforms.py only handles two cases, 'l1' and 'tree', while the example file calls the method with 'uni'.
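A possible fix (a sketch, not the library's actual code): add a univariate branch, e.g. via scikit-learn's SelectKBest, and raise explicitly on unknown methods so X_new can never be unbound. The value of k here is an illustrative assumption:

```python
from sklearn.feature_selection import SelectKBest, f_classif

def feature_selection(X, y, method):
    """Sketch of a 'uni' branch using univariate F-test selection."""
    if method == 'uni':
        X_new = SelectKBest(f_classif, k=min(10, X.shape[1])).fit_transform(X, y)
    else:
        # The existing 'l1' and 'tree' branches from transforms.py would
        # go here; anything else fails loudly instead of reaching an
        # unbound X_new.
        raise ValueError("unknown feature selection method: %r" % method)
    return X_new
```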

Addition of Metrics

We currently do not have any metrics implemented in this library.

Here is a list of metrics we currently use elsewhere:

Room Transfers - Daily average
Duration Outside - Daily average
Times Exited Home - Daily average
Typically Sleeps In - Daily
Sleep Efficiency - Daily average
Sleep Quality - Daily average
Main Sleep Length - Daily average
Total Sleep Length - Daily average
Walking - Hourly average
Sitting - Hourly average
Lying - Hourly average
Compliance (duration of wear)
Number of times bathroom visited during the night
Number of times kitchen visited during the night
Average speed walking - Daily average
Maximum speed of walking - Daily average
Speed of stand up/sit down - Daily average
Number of sit-to-stand transitions - Daily average
Number of times stairs used - Daily average
Speed travelling upstairs - Daily average
Time to go from room down stairs to upstairs - Daily average
Time to go from room upstairs to downstairs - Daily average
Number of times activities undertaken (e.g. cooking / cleaning) - Daily average

Label mappings return different structures depending on whether a CSV is specified

Both label_mappings and label_mappings_localisation return different formats depending on whether a CSV was specified when the Wrapper was instantiated. I would opt to always return the same format and convert it to the required format, if necessary, while exporting the CSV file. However, I am not familiar with the code, so I am not sure which format would be more generic, or what implications the change may have.

@mkoz71 could you help me with this decision?

Also, this may be related to issue #13

def label_mappings(self, container, is_duration, label_to_extract=None):

Human friendly functions

Currently in the example scripts we have function names such as metr.average_labels_per_window(labs, times).

While these abstractions make sense from a design perspective, I don't think they are clear to a human user of the library.

Can we create 'human friendly' wrapper functions, such as get_duration_walking(data, period='daily', level='hours'), for each of the metrics?
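Such a wrapper might look like the sketch below. The label id and the per-window-duration input are illustrative assumptions, and the function delegates to nothing but NumPy here rather than bhealth's actual primitives:

```python
import numpy as np

WALKING = 1  # illustrative label id, not bhealth's

def get_duration_walking(labs, times, label=WALKING):
    """Hypothetical human-friendly wrapper: total hours spent walking.

    labs: per-window label ids; times: per-window durations in seconds.
    Internally this would call the generic metric primitives; the point
    is that the caller only sees a plainly-named question.
    """
    labs = np.asarray(labs)
    times = np.asarray(times)
    return times[labs == label].sum() / 3600.0
```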

Public Datasets needed

Hi,

I think we need to identify and use some public datasets for activity recognition and localization.

Ideally these datasets would be multi-day datasets, so we can easily calculate daily average metrics from them. Even better if they were multi-week datasets, so we can calculate weekly metrics from them, and so on.

Alternatively we could use a dataset that doesn't meet these criteria, and modify the timestamps to simulate multi-day/multi-week datasets for testing. For example, using https://www.nature.com/articles/sdata2018168

What do you think?
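The timestamp-shifting idea could be sketched as follows (a hypothetical helper for testing only; it simply tiles a one-day recording across several days):

```python
import numpy as np

def replicate_days(ts, X, y, n_days):
    """Tile a single-day recording across n_days by shifting timestamps.

    ts: per-sample Unix timestamps (seconds); X: (n, d) features;
    y: (n,) labels. The data is repeated verbatim, only time moves.
    """
    day = 24 * 60 * 60  # seconds per day
    ts_out = np.concatenate([ts + i * day for i in range(n_days)])
    X_out = np.tile(X, (n_days, 1))
    y_out = np.tile(y, n_days)
    return ts_out, X_out, y_out
```

Daily or weekly metrics can then be exercised on the replicated data, with the caveat that every simulated day is identical.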

Issue in pre-processing data and labels

When extracting features in the code below:

while True:
    windowed_raw = transform.slide(X)
    if len(windowed_raw) > 0:
        try:
            windowed_features = [ts[transform.current_position][0]]
        except Exception as e:
            print(e)
            break
        for function in feature_transforms:
            windowed_features.extend((np.apply_along_axis(function, 0, windowed_raw).tolist()))
        new_X.append(windowed_features)
        windowed_raw_labels = transform.slide(y, update=False)
        most_freq_label = np.bincount(windowed_raw_labels).argmax()
        new_y.append(most_freq_label)

There seems to be a misalignment between "windowed_raw" (for X) and "windowed_raw_labels" (for y): after "transform.slide(X)" is applied in line 52, "transform.current_position" is shifted forward by the stride value. When transform.slide(y) is then applied in line 63, it uses the new "transform.current_position", which does not match the positions "X" was extracted from.
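One possible fix (a sketch, shown against a minimal stand-in for the transform class rather than bhealth's code): window the labels first with update=False, then window the data, so both slices are cut from the same cursor position.

```python
import numpy as np

class Slider:
    # Minimal re-implementation of the slide() behaviour discussed
    # above (illustrative only).
    def __init__(self, window_length, step):
        self.window_length = window_length
        self.step = step
        self.current_position = window_length

    def slide(self, x, update=True):
        start = self.current_position - self.window_length
        window = x[start:self.current_position]
        if update:
            self.current_position += self.step
        return window

X = np.arange(100)
y = np.arange(100)
t = Slider(window_length=10, step=5)
# Labels first (no cursor update), then the data: both windows now
# cover the same sample positions.
windowed_raw_labels = t.slide(y, update=False)
windowed_raw = t.slide(X)
```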

Calculate 'compliance'

Often in interventions it is useful to know the compliance rate of a patient with regard to the wearable. For example, over a two week period, they were wearing the wearable for 95% of the time.

Further, this compliance value can be calculated daily, and used to exclude days where the compliance value is below a threshold.
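A per-day compliance calculation could be sketched as follows, assuming a boolean wear mask (e.g. from non-wear detection) and per-sample timestamps; the function and parameter names are illustrative:

```python
import numpy as np

def daily_compliance(wear_mask, ts, threshold=0.8):
    """Fraction of worn samples per day, plus the days below threshold.

    wear_mask: per-sample boolean wear indicator; ts: per-sample Unix
    timestamps in seconds. Returns ({day: rate}, [excluded days]).
    """
    days = (np.asarray(ts) // 86400).astype(int)
    rates = {}
    for d in np.unique(days):
        rates[d] = wear_mask[days == d].mean()
    excluded = [d for d, rate in rates.items() if rate < threshold]
    return rates, excluded
```

Days in the excluded list would then be dropped before computing daily-average metrics.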

Report generation

I think it would be useful if we had report generation functionality. That is, a method of wrapping up all the results (data quality, metrics, visualizations etc.) into a nice HTML or PDF report.

If anyone has any examples it would be great to post them below. Otherwise I can try and sketch out what I have in mind.

Unify format of descriptor maps (or not)

The localisation and activity examples use different formats of descriptor_map.

e.g. the localisation example uses arrays of integers denoting the labels

descriptor_map = {
    'foyer': [0],
    'bedroom': [1],
    'living_room': [2],
    'bathroom': [3]
}

while the accelerometer example uses integers

descriptor_map = {
    'sitting': 77,
    'walking': 78,
    'washing': 79,
    'eating': 80,
    'sleeping': 81,
    'studying': 82
}

What is the reasoning behind this, and which format should we choose as the standard? A benefit of the list format is that it can group labels; for example, "upstairs" could be associated with [0, 1, 2], and likewise "sedentary".

I am not sure whether this modification would have collateral implications.
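Until a standard is chosen, a small normalisation helper (a sketch, assuming the values are either ints or lists of ints) could make both formats interchangeable:

```python
def normalise_descriptor_map(descriptor_map):
    # Coerce scalar label ids into single-element lists so downstream
    # code only ever sees the list format.
    return {key: value if isinstance(value, list) else [value]
            for key, value in descriptor_map.items()}
```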
