oxwearables / stepcount


A step-counting model based on self-supervised learning for wrist-worn accelerometer data.

License: Other

Python 8.05% Jupyter Notebook 91.91% Shell 0.04%

fitness-tracker gait-analysis healthcare deep-learning machine-learning self-supervised-learning wearables

stepcount's People

xiangnandang yonbrand amitsal khanutbj rakeshpilkar82 aidanacquah sayanmitra jeremy-pdh muschellij2 ashreves

stepcount's Issues

Data dictionary missing adjusted variable columns

The data dictionary is incomplete: it is missing the columns for the adjusted variables.

Installation error

On macOS, when I did a clean installation with Python 3.9, I received the error below:

(step_count) hangy@ADMINS-MACBOOK-PRO ~ %  pip install stepcount 
Collecting stepcount
  Using cached stepcount-2.0.5-py3-none-any.whl (24 kB)
Collecting imbalanced-learn==0.9.1
  Using cached imbalanced_learn-0.9.1-py3-none-any.whl (199 kB)
Collecting hmmlearn==0.2.7
  Using cached hmmlearn-0.2.7-cp39-cp39-macosx_10_15_x86_64.whl (101 kB)
Collecting scikit-learn==1.1.1
  Using cached scikit_learn-1.1.1-cp39-cp39-macosx_10_13_x86_64.whl (8.6 MB)
Collecting numpy>=1.22
  Using cached numpy-1.24.2-cp39-cp39-macosx_10_9_x86_64.whl (19.8 MB)
Collecting joblib>=1.2.0
  Using cached joblib-1.2.0-py3-none-any.whl (297 kB)
Collecting pandas>=1.4
  Using cached pandas-1.5.3-cp39-cp39-macosx_10_9_x86_64.whl (12.0 MB)
Collecting torchvision~=0.13.1
  Using cached torchvision-0.13.1-cp39-cp39-macosx_10_9_x86_64.whl (1.3 MB)
Collecting transforms3d~=0.4.1
  Using cached transforms3d-0.4.1.tar.gz (1.4 MB)
  Preparing metadata (setup.py) ... done
Collecting tqdm>=4.64
  Using cached tqdm-4.64.1-py2.py3-none-any.whl (78 kB)
Collecting actipy==2.0.1
  Using cached actipy-2.0.1-py3-none-any.whl (47 kB)
Collecting scipy>=1.9
  Using cached scipy-1.10.1-cp39-cp39-macosx_10_9_x86_64.whl (35.2 MB)
Collecting torch~=1.12.1
  Using cached torch-1.12.1-cp39-none-macosx_10_9_x86_64.whl (133.8 MB)
Collecting Jpype1>=1.3
  Downloading JPype1-1.4.1-cp39-cp39-macosx_10_9_x86_64.whl (381 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 381.8/381.8 kB 4.7 MB/s eta 0:00:00
Collecting statsmodels>=0.13
  Downloading statsmodels-0.13.5-cp39-cp39-macosx_10_9_x86_64.whl (9.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.7/9.7 MB 12.5 MB/s eta 0:00:00
Collecting threadpoolctl>=2.0.0
  Using cached threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
Collecting python-dateutil>=2.8.1
  Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting pytz>=2020.1
  Downloading pytz-2022.7.1-py2.py3-none-any.whl (499 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 499.4/499.4 kB 8.0 MB/s eta 0:00:00
Collecting typing-extensions
  Downloading typing_extensions-4.5.0-py3-none-any.whl (27 kB)
Collecting pillow!=8.3.*,>=5.3.0
  Using cached Pillow-9.4.0-2-cp39-cp39-macosx_10_10_x86_64.whl (3.3 MB)
Collecting requests
  Downloading requests-2.28.2-py3-none-any.whl (62 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.8/62.8 kB 1.7 MB/s eta 0:00:00
Collecting packaging
  Downloading packaging-23.0-py3-none-any.whl (42 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.7/42.7 kB 999.7 kB/s eta 0:00:00
Collecting six>=1.5
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting patsy>=0.5.2
  Downloading patsy-0.5.3-py2.py3-none-any.whl (233 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 233.8/233.8 kB 5.5 MB/s eta 0:00:00
Collecting idna<4,>=2.5
  Downloading idna-3.4-py3-none-any.whl (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.5/61.5 kB 1.6 MB/s eta 0:00:00
Collecting charset-normalizer<4,>=2
  Downloading charset_normalizer-3.0.1-cp39-cp39-macosx_10_9_x86_64.whl (124 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.0/124.0 kB 3.1 MB/s eta 0:00:00
Requirement already satisfied: certifi>=2017.4.17 in ./opt/miniconda3/envs/step_count/lib/python3.9/site-packages (from requests->torchvision~=0.13.1->stepcount) (2022.12.7)
Collecting urllib3<1.27,>=1.21.1
  Downloading urllib3-1.26.14-py2.py3-none-any.whl (140 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 140.6/140.6 kB 3.8 MB/s eta 0:00:00
Building wheels for collected packages: transforms3d
  Building wheel for transforms3d (setup.py) ... done
  Created wheel for transforms3d: filename=transforms3d-0.4.1-py3-none-any.whl size=1376757 sha256=f223421fe0475e15ab17bf83fe42032b58a62ff2fd0952864d34543df77cc016
  Stored in directory: /Users/hangy/Library/Caches/pip/wheels/c6/f4/11/e68752d386554db04e4b39de9da6bb4daf7c23281dae21b845
Successfully built transforms3d
Installing collected packages: pytz, charset-normalizer, urllib3, typing-extensions, transforms3d, tqdm, threadpoolctl, six, pillow, packaging, numpy, joblib, idna, torch, scipy, requests, python-dateutil, patsy, Jpype1, torchvision, scikit-learn, pandas, statsmodels, imbalanced-learn, hmmlearn, actipy, stepcount
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pyfew 0.0.4 requires catch22, which is not installed.
pyfew 0.0.4 requires pyyaml, which is not installed.
Successfully installed Jpype1-1.4.1 actipy-2.0.1 charset-normalizer-3.0.1 hmmlearn-0.2.7 idna-3.4 imbalanced-learn-0.9.1 joblib-1.2.0 numpy-1.24.2 packaging-23.0 pandas-1.5.3 patsy-0.5.3 pillow-9.4.0 python-dateutil-2.8.2 pytz-2022.7.1 requests-2.28.2 scikit-learn-1.1.1 scipy-1.10.1 six-1.16.0 statsmodels-0.13.5 stepcount-2.0.5 threadpoolctl-3.1.0 torch-1.12.1 torchvision-0.13.1 tqdm-4.64.1 transforms3d-0.4.1 typing-extensions-4.5.0 urllib3-1.26.14

Failed to build wheel for hmmlearn

I could not install the package using the normal install routine because pip failed to build the wheel for hmmlearn v0.2.7. I worked around this by cloning the repo and changing the required hmmlearn version in setup.py to 0.3.0. It then installed without a problem using python -m pip install /path/to/package.

FYI, I am using macOS 13.4.1 on a MacBook Pro with the Apple M1 Pro chip.

column header for output files

@chanshing, would it be possible to get "time" headers on all output files?

Currently have "time" header:
-DailyStepsAdjusted.csv
-DailyWalkAdjusted.csv
-HourlyStepsAdjusted.csv

Still needs "time" header:
-DailySteps.csv
-DailyWalk.csv
-HourlySteps.csv
-Steps.csv

This is probably most important for anyone wanting to seamlessly use our -Steps.csv output with the new step.metrics package for extracting cadence metrics.

missing `StartTime`, `EndTime`, and other field names when processing CSV files

Add Stepcount version info to Info.json

For posterity's sake and for potential future debugging, can we output something like "StepCountVersion": "3.7.2" in each xxxxx-Info.json that is generated, like in this example.

Error Processing Multiple Files

I have a text file called 'commands.txt' with 3 .cwa filenames.
When I try to run it with the command cmd < commands.txt, as per your instructions (https://github.com/OxWearables/stepcount#windows ), I get this error:

(base) PS C:\Users\dlevin> cmd < commands.txt
At line:1 char:5
+ cmd < commands.txt
+     ~
The '<' operator is reserved for future use.
    + CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
    + FullyQualifiedErrorId : RedirectionNotSupported
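
The `PS` prompt shows that the command was typed into PowerShell, which rejects the `<` input-redirection operator, as the message says. One shell-independent alternative is a small Python script that runs the CLI once per file. This is only a minimal sketch: it assumes the `stepcount <file>` invocation shown in these issues, and the helper names `build_commands` and `run_all` are hypothetical.

```python
import subprocess


def build_commands(paths, exe="stepcount"):
    """Return one argument list per input file, e.g. ["stepcount", "a.cwa"]."""
    return [[exe, str(p)] for p in paths]


def run_all(paths, runner=subprocess.run):
    """Run the CLI on each file in turn and collect the exit codes."""
    codes = {}
    for argv in build_commands(paths):
        # check=False: keep going if one file fails, but record its exit code.
        result = runner(argv, check=False)
        codes[argv[1]] = result.returncode
    return codes


# Example call (file names are placeholders):
# run_all(["sample1.cwa", "sample2.cwa"])
```

Because the runner is passed in as an argument, the loop can be exercised without the real CLI being installed.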

Unable to read .csv files

Hi,

I am trying to read an ActiGraph .csv file using stepcount version 3.2.3 and I am getting this error.

Traceback (most recent call last):
  File "/home/freemanjr/.conda/envs/stepcount/bin/stepcount", line 8, in <module>
    sys.exit(main())
  File "/home/freemanjr/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/stepcount.py", line 50, in main
    data, info = read(args.filepath, resample_hz, sample_rate=args.sample_rate, verbose=verbose)
  File "/home/freemanjr/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/stepcount.py", line 357, in read
    freq = infer_freq(data.index)
  File "/home/freemanjr/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/stepcount.py", line 402, in infer_freq
    freq, _ = stats.mode(np.diff(x), keepdims=False)
  File "<__array_function__ internals>", line 200, in diff
  File "/home/freemanjr/.conda/envs/stepcount/lib/python3.9/site-packages/numpy/lib/function_base.py", line 1448, in diff
    a = op(a[slice1], a[slice2])
TypeError: unsupported operand type(s) for -: 'str' and 'str'

Any idea what may be going wrong here?

For additional clarification, this file has been successfully run using a previous version of stepcount and the column names are specified as time, x, y, z.

Handle corrupted model files due to bad downloads

If the model file download is interrupted for some reason, load_model() will fail forever, because the partial file is left in place and is never downloaded again.

stepcount/stepcount/stepcount.py

Lines 176 to 190 in 5c96190

    
def load_model():
    """ Load trained model. Download if not exists. """
    pth = pathlib.Path(MODEL_PATH)
    if not pth.exists():
        url = f"https://wearables-files.ndph.ox.ac.uk/files/models/stepcounter/{__model_version__}.joblib"
        print(f"Downloading {url}...")
        with urllib.request.urlopen(url) as f_src, open(pth, "wb") as f_dst:
            shutil.copyfileobj(f_src, f_dst)
    return joblib.load(pth)
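
One common way to avoid leaving a half-written file behind is to download into a temporary file in the same directory and rename it into place only after the copy has finished. The sketch below illustrates that pattern with the standard library; `download_atomically` and its `opener` parameter are hypothetical names, not part of stepcount.

```python
import os
import shutil
import tempfile
import urllib.request


def download_atomically(url, dest, opener=urllib.request.urlopen):
    """Download url to dest so that dest only ever holds a complete copy."""
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # The temporary file lives next to dest so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as f_dst, opener(url) as f_src:
            shutil.copyfileobj(f_src, f_dst)
        # Reached only if the copy finished without raising.
        os.replace(tmp_path, dest)
    except BaseException:
        # An interrupted or failed download: remove the partial file and re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return dest
```

This only protects against downloads that fail with an exception. Comparing a checksum of the downloaded file would additionally catch a transfer that ends early without raising.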

Retrain models after feature changes

A ticket to remember to retrain the RF model since one feature has been changed: #29

Error Running Stepcount in Pycharm

I am trying to run stepcount using PyCharm

Following your instructions on processing multiple files on windows https://github.com/OxWearables/stepcount#windows
I have a file commands.txt with 2 cwa files sample1 & sample2 :
stepcount sample1.cwa &
stepcount sample2.cwa
:END

On running cmd < commands.txt I get the following error:

C:\Users\dlevin>cmd < commands.txt
Microsoft Windows [Version 10.0.17763.4377]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\Users\dlevin>stepcount sample1.cwa &
'stepcount' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\dlevin>stepcount sample2.cwa
'stepcount' is not recognized as an internal or external command,
operable program or batch file.

Any ideas how to solve this?
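
The message means that the shell started by `cmd` cannot find a `stepcount` executable on its `PATH`. A likely (but unconfirmed) cause is that the package was installed into a Python environment that is not active in that shell. A quick check, sketched with the standard library; the helper names are hypothetical:

```python
import shutil


def find_cli(name="stepcount"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


def describe(name="stepcount"):
    """Summarize in one line whether the executable can be found."""
    path = find_cli(name)
    if path is None:
        return f"{name!r} is not on PATH in this environment"
    return f"{name!r} resolves to {path}"


# print(describe())  # run this from the same shell or interpreter that shows the error
```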

Sample Rate Mis-Estimated

Issue

The sample rate is mis-estimated when it is inferred from the timestamps.

Reproducible Example

Data Set

The sample rate on this data is 80Hz.
test_sample_rate_80.csv

Errant Behavior

The timestamps have only millisecond precision (three decimal places), and the sample rate is estimated using infer_freq:

stepcount/src/stepcount/stepcount.py

Line 352 in 2f98d72

freq = infer_freq(data.index)

uses the mode of the sub-second differences as the correct sampling frequency:

stepcount/src/stepcount/stepcount.py

Line 394 in 2f98d72

freq, _ = stats.mode(np.diff(x), keepdims=False)

Using the mean instead would likely give a better estimate in some cases, but not all. In this case, the mode-based estimate of the sample rate is 83.

head test_sample_rate_80.csv

The data:

x,y,z,time
-0.151428,0.981339,-0.448547,2023-10-23 10:00:00.000
-0.548889,0.745132,-0.269226,2023-10-23 10:00:00.012
-0.67157,0.528214,-0.273987,2023-10-23 10:00:00.025
-0.436584,0.598526,-0.242859,2023-10-23 10:00:00.037
-0.559753,0.776505,-0.264099,2023-10-23 10:00:00.049
-0.549255,0.919083,-0.132141,2023-10-23 10:00:00.062
-0.275208,1.039322,-0.162659,2023-10-23 10:00:00.075
-0.395447,1.01503,-0.236877,2023-10-23 10:00:00.087
-0.415222,0.972427,0.015564,2023-10-23 10:00:00.099

from stepcount.stepcount import read
data, info = read("test_sample_rate_80.csv", resample_hz = "uniform", verbose=True)
#> Gravity calibration...
#> Gravity calibration... Done! (0.01s)
#> Nonwear detection...
#> Nonwear detection... Done! (0.01s)
#> Resampling...
#> Resampling... Done! (0.03s)
#> /Users/johnmuschelli/miniconda3/envs/stepcount/lib/python3.9/site-packages/actipy/processing.py:294: UserWarning: Skipping calibration: Insufficient stationary samples: 0 < 50
#>   warnings.warn(f"Skipping calibration: Insufficient stationary samples: {len(xyz)} < {calib_min_samples}")
info["SampleRate"]
#> 83

Proposed Solutions

Allow for an explicit sample rate argument. Downsides - if this is not the same across all files, will not be correct.
Use the mean for the frequency estimate. Downsides - may fail in other cases than above.
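
The effect can be reproduced with only the nine timestamps listed above. The sketch below compares the two estimators using the standard library; it is an illustration of the arithmetic, not stepcount's `infer_freq`, and `estimate_rates` is a hypothetical name.

```python
from datetime import datetime
from statistics import mode

# The "time" column of the rows shown above.
TIMESTAMPS = [
    "2023-10-23 10:00:00.000", "2023-10-23 10:00:00.012", "2023-10-23 10:00:00.025",
    "2023-10-23 10:00:00.037", "2023-10-23 10:00:00.049", "2023-10-23 10:00:00.062",
    "2023-10-23 10:00:00.075", "2023-10-23 10:00:00.087", "2023-10-23 10:00:00.099",
]


def estimate_rates(timestamps):
    """Return (rate from the modal gap, rate from the mean gap) in Hz."""
    times = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S.%f") for t in timestamps]
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    # Rounded to milliseconds, a true 12.5 ms gap shows up as 12 ms or 13 ms.
    rate_from_mode = 1.0 / mode(gaps)
    # The mean gap is the total elapsed time divided by the number of gaps.
    rate_from_mean = len(gaps) / (times[-1] - times[0]).total_seconds()
    return rate_from_mode, rate_from_mean


rate_mode, rate_mean = estimate_rates(TIMESTAMPS)
print(f"mode-based: {rate_mode:.1f} Hz, mean-based: {rate_mean:.1f} Hz")
```

On these rows the modal gap is 12 ms, which gives about 83.3 Hz, while the mean gap of 99 ms / 8 gives about 80.8 Hz. The mean gets closer to 80 Hz as more rows are included, but it would be thrown off by gaps in the recording.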

TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions

Issue

Model fails to generate step counts/step times and returns TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions
The error happens for both the 'rf' and 'ssl' model versions, but not on the same datasets.

Reproducible example (SSL version)

Example dataset for SSL
The sample rate is 15Hz.

Command:

$ conda activate stepcount
stepcount example_data_rf.csv.gz --model-type=ssl --sample-rate=15

Output

/Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/actipy/processing.py:294: UserWarning: Skipping calibration: Insufficient stationary samples: 0 < 50
  warnings.warn(f"Skipping calibration: Insufficient stationary samples: {len(xyz)} < {calib_min_samples}")
Gravity calibration... Done! (0.02s)
Nonwear detection... Done! (0.01s)
Resampling... Done! (0.03s)
Loading model...
Running step counter...
Defining windows...
100%|██████████████████████████████████████████████████████████████████████████| 58/58 [00:00<00:00, 862.53it/s]
Using local /Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/torch_hub_cache/OxWearables_ssl-wearables_v1.0.0
Classifying windows...
100%|█████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.76it/s]
Traceback (most recent call last):
  File "/Users/lilykoffman/anaconda3/envs/stepcount/bin/stepcount", line 8, in <module>
    sys.exit(main())
  File "/Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/stepcount.py", line 74, in main
    Y, W, T_steps = model.predict_from_frame(data)
  File "/Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/models.py", line 243, in predict_from_frame
    Y, W, Z = self.predict(X, return_walk=True, return_step_times=True)
  File "/Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/models.py", line 199, in predict
    (Y_[W_], Z_[W_]) = batch_count_peaks(
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions

Reproducible example (RF version)

Example dataset for RF
The sample rate is 15Hz.

Command:

$ conda activate stepcount
stepcount example_data_rf.csv.gz --model-type=rf --sample-rate=15

Output

/Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/actipy/processing.py:294: UserWarning: Skipping calibration: Insufficient stationary samples: 0 < 50
  warnings.warn(f"Skipping calibration: Insufficient stationary samples: {len(xyz)} < {calib_min_samples}")
Gravity calibration... Done! (0.02s)
Nonwear detection... Done! (0.01s)
Loading model...
Running step counter...
Defining windows...
100%|███████████████████████████████████████████████████████████████████████| 147/147 [00:00<00:00, 2733.77it/s]
Extracting features...
100%|█████████████████████████████████████████████████████████████████████████| 146/146 [00:01<00:00, 86.02it/s]
Traceback (most recent call last):
  File "/Users/lilykoffman/anaconda3/envs/stepcount/bin/stepcount", line 8, in <module>
    sys.exit(main())
  File "/Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/stepcount.py", line 74, in main
    Y, W, T_steps = model.predict_from_frame(data)
  File "/Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/models.py", line 243, in predict_from_frame
    Y, W, Z = self.predict(X, return_walk=True, return_step_times=True)
  File "/Users/lilykoffman/anaconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/models.py", line 199, in predict
    (Y_[W_], Z_[W_]) = batch_count_peaks(
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions

ValueError: cannot reindex a non-unique index with a method or limit

I'm trying to get output using the CSV below, but I get an error.
source.csv:
time,x,y,z
2022-01-27 12:21:52.123000,2.9240294,4.808078,8.586046
2022-01-27 12:21:52.144000,2.9706893,4.9196434,8.013563
2022-01-27 12:21:52.164000,3.728915,5.374579,7.7967134
2022-01-27 12:21:52.184000,3.4977086,4.6226344,7.84128
2022-01-27 12:21:52.204000,3.3768709,3.8751762,8.340483
2022-01-27 12:21:52.225000,3.789932,4.10399,8.355139

error:

Getting stationary points... Done! (0.03s)
/home/chris/influx-testing/.venv/lib/python3.8/site-packages/actipy/processing.py:195: UserWarning: Skipping calibration: Insufficient stationary samples
  warnings.warn(f"Skipping calibration: Insufficient stationary samples")
Gravity calibration... Done! (0.01s)
Nonwear detection... Done! (0.03s)
Traceback (most recent call last):
  File "/home/chris/influx-testing/.venv/bin/stepcount", line 8, in <module>
    sys.exit(main())
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/stepcount/stepcount.py", line 45, in main
    data, info = read(args.filepath, resample_hz)
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/stepcount/stepcount.py", line 276, in read
    data, info = actipy.process(
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/actipy/reader.py", line 130, in process
    data, info_resample = P.resample(data, resample_hz)
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/actipy/processing.py", line 42, in resample
    data = data.reindex(t, method='nearest', tolerance=pd.Timedelta('1s'), limit=1)
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/pandas/util/_decorators.py", line 347, in wrapper
    return func(*args, **kwargs)
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/pandas/core/frame.py", line 5205, in reindex
    return super().reindex(**kwargs)
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/pandas/core/generic.py", line 5289, in reindex
    return self._reindex_axes(
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/pandas/core/frame.py", line 5004, in _reindex_axes
    frame = frame._reindex_index(
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/pandas/core/frame.py", line 5020, in _reindex_index
    new_index, indexer = self.index.reindex(
  File "/home/chris/influx-testing/.venv/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 4419, in reindex
    raise ValueError(
ValueError: cannot reindex a non-unique index with a method or limit
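
pandas raises this error when the index being reindexed contains repeated values, so the time column of the file presumably has duplicate timestamps somewhere, even though the six rows shown are unique. The sketch below is one way to check for and drop repeated timestamps before processing. It uses only the standard library, the helper names are hypothetical, and keeping the first row per timestamp is an arbitrary choice.

```python
import csv
import io
from collections import Counter


def duplicated_times(csv_text, time_col="time"):
    """Return the timestamps that occur more than once, in first-seen order."""
    rows = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[time_col] for row in rows)
    return [t for t, n in counts.items() if n > 1]


def drop_duplicate_times(csv_text, time_col="time"):
    """Return the CSV text with only the first row kept for each timestamp."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    seen = set()
    for row in reader:
        if row[time_col] in seen:
            continue  # a later row with an already-seen timestamp is skipped
        seen.add(row[time_col])
        writer.writerow(row)
    return out.getvalue()
```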

Failure with new `hmmlearn 0.3.0`

I get errors with the new hmmlearn and the rf model:
MultinomialHMM has undergone major changes. The previous version was implementing a CategoricalHMM (a special case of MultinomialHMM). This new implementation follows the standard definition for a Multinomial distribution (e.g. as in https://en.wikipedia.org/wiki/Multinomial_distribution). See these issues for details:
hmmlearn/hmmlearn#335
hmmlearn/hmmlearn#340

Traceback (most recent call last):
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/models.py", line 244, in predict_from_frame
    Y, W, Z = self.predict(X, return_walk=True, return_step_times=True)
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/models.py", line 195, in predict
    W_ = self.wd.predict(X_, groups).astype('bool')
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/models.py", line 358, in predict
    W = self.hmms.predict(W, groups=groups)
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/hmm_utils.py", line 48, in predict
    return self.hmmlearn_fit_predict(Y, groups=groups, method='predict')
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/hmm_utils.py", line 98, in hmmlearn_fit_predict
    _hmm, _score, _Y_pred = hmmlearn_fit_predict(Y[groups == g], groups=None, **hmm_params)
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/stepcount/hmm_utils.py", line 195, in hmmlearn_fit_predict
    hmm.fit(Y_train, lengths_from_groups(groups_train))
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/hmmlearn/base.py", line 468, in fit
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/hmmlearn/hmm.py", line 895, in _init
    res = np.zeros((n_samples, self.n_components))
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/hmmlearn/base.py", line 908, in _init
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/hmmlearn/_emissions.py", line 317, in _check_and_set_n_features
  File "/users/jmuschel/.conda/envs/stepcount/lib/python3.9/site-packages/hmmlearn/base.py", line 527, in _check_and_set_n_features
    "transition from the state was ever observed.")
ValueError: Unexpected number of dimensions, got 1 but expected 2

Reverting to 0.2.7 fixed this.

How to deal with data in real time?

I want to use this computation on an Android device. How should I analyze the data in real time?

ENH Allow multiple files passing

I see the information for running multiple files at https://github.com/OxWearables/stepcount?tab=readme-ov-file#processing-multiple-files, but it would be great if you can pass in multiple files directly.

The benefits:

Downloading the model only once. Ideally people are using an explicit model path or the torch cache path (only available for SSL, not RF), but that may slow things down.
Loading the model only once - this is the big speedup. Even with a cached model, the model must load every time in stepcount, which is a slowdown. If the lines

stepcount/src/stepcount/stepcount.py

Line 67 in 58797b1

model.verbose = verbose

are moved, then a loop only needs to set sample_rate, window_len, wd.device, and wd.sample_rate on the model as needed for the specific file. The process would be: load the model once, then loop over the files, reading each file and its info, setting the file-specific attributes on the model, and then calling predict_from_frame and summarizing.
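
The proposed flow can be sketched as follows. The loader and reader are passed in as callables because their real signatures differ between stepcount versions; `process_files`, `load_model`, and `read_file` here are hypothetical stand-ins, and only `sample_rate` is set per file to keep the example short.

```python
def process_files(paths, load_model, read_file):
    """Load the model once, then reuse it for every input file."""
    model = load_model()  # the expensive step, done a single time
    results = {}
    for path in paths:
        data, info = read_file(path)
        # Per-file attributes are set just before prediction.
        model.sample_rate = info["SampleRate"]
        results[path] = model.predict_from_frame(data)
    return results
```

The design point is that model loading sits outside the loop, so its cost is paid once regardless of how many files are passed.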

`avgpow` and `pentropy` not included

This bug was caught by @aidanacquah.

stepcount/stepcount/features.py

Line 93 in 2e61ace

feats = {}

hourly stats in Info.json

AttributeError: 'Timedelta' object has no attribute 'astype'

Description

The actipy.process call fails when estimating non-wear periods, on this line:

stepcount/src/stepcount/stepcount.py

Line 358 in 794b79e

data, info = actipy.process(

This is due to a failure of detect_nonwear

stepcount/src/stepcount/stepcount.py

Line 362 in 794b79e

detect_nonwear=True,

File to Reproduce

P30_wrist100.csv.gz
This is from https://ora.ox.ac.uk/objects/uuid:19d3cb34-e2b3-4177-91b6-1bad0e0163e7.

$ conda activate stepcount
(stepcount) $ stepcount P30_wrist100.csv.gz

And the resulting output is:

actipy/processing.py:294: UserWarning: Skipping calibration: Insufficient stationary samples: 1 < 50
  warnings.warn(f"Skipping calibration: Insufficient stationary samples: {len(xyz)} < {calib_min_samples}")
Gravity calibration... Done! (0.04s)
Traceback (most recent call last):
  File "/Users/johnmuschelli/miniconda3/envs/stepcount/bin/stepcount", line 8, in <module>
    sys.exit(main())
  File "/Users/johnmuschelli/miniconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/stepcount.py", line 48, in main
    data, info = read(args.filepath, resample_hz, verbose=verbose)
  File "/Users/johnmuschelli/miniconda3/envs/stepcount/lib/python3.9/site-packages/stepcount/stepcount.py", line 356, in read
    data, info = actipy.process(
  File "/Users/johnmuschelli/miniconda3/envs/stepcount/lib/python3.9/site-packages/actipy/reader.py", line 144, in process
    data, info_nonwear = P.detect_nonwear(data)
  File "/Users/johnmuschelli/miniconda3/envs/stepcount/lib/python3.9/site-packages/actipy/processing.py", line 201, in detect_nonwear
    stationary_segment_ids
AttributeError: 'Timedelta' object has no attribute 'astype'

The process cannot access the file because it is being used by another process.

Getting the following error, but script still runs fine:

Error: C:\Users\myusername\AppData\Local\Temp\tmpgillei9i\tmpout.npy - The process cannot access the file because it is being used by another process.

This is an issue with actipy, not this pkg per se.

IndexError when processing (old ?) GENEActiv .bin files

Hello,
I work with Python 3.8.19 on Windows 10 - 64 bits.

An error appeared after running the following command line to process GENEActiv .bin files, only in some cases: stepcount "E:\file\directory\GENEActiv_file.bin" -o "E:\output\directory"

Here is the output message:
java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
    at GENEActivReader.parseBinFileHeader(GENEActivReader.java:221)
    at GENEActivReader.main(GENEActivReader.java:75)
Reading file... Done! (0.16s)
Error: C:\Users\***\AppData\Local\Temp\tmphxr_w9yo\data.npy - Le processus ne peut pas accéder au fichier car ce fichier est utilisé par un autre processus.
Traceback (most recent call last):
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\***\Anaconda3\envs\stepcount\Scripts\stepcount.exe\__main__.py", line 7, in <module>
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\site-packages\stepcount\stepcount.py", line 58, in main
    data, info = read(
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\site-packages\stepcount\stepcount.py", line 730, in read
    data, info = actipy.read_device(
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\site-packages\actipy\reader.py", line 50, in read_device
    data, info = _read_device(input_file, verbose)
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\site-packages\actipy\reader.py", line 220, in _read_device
    info['StartTime'] = t.iloc[0].strftime(strftime)
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\site-packages\pandas\core\indexing.py", line 1103, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\site-packages\pandas\core\indexing.py", line 1656, in _getitem_axis
    self._validate_integer(key, axis)
  File "C:\Users\***\Anaconda3\envs\stepcount\lib\site-packages\pandas\core\indexing.py", line 1589, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds

The first error ("Error: ... data.npy") does not matter: it appears every time, but the files can still be processed. The IndexError, however, stops the process.
I noted that the error did not appear for recent files (collected in 2023) but did appear for old files (collected in 2018), even though the same devices were used to record the data in both years.

SSL: CERTIFICATE_VERIFY_FAILED

I had a fresh install of Windows 11. After downloading the most recent versions of Python and Java as mentioned in the prerequisites, I followed the instructions mentioned here. However, when attempting to run the program on the example, I get this error.

(base) C:\Users\grims>stepcount sample.cwa.gz
Decompressing... Done! (0.73s)
Reading file... Done! (8.28s)
Converting to dataframe... Done! (2.63s)
Getting stationary points... Done! (5.96s)
Gravity calibration... Done! (2.08s)
Nonwear detection... Done! (5.11s)
Resampling... Done! (1.92s)
Downloading https://wearables-files.ndph.ox.ac.uk/files/models/stepcount/ssl-20230208.joblib.lzma...
Traceback (most recent call last):
  File "C:\Users\grims\miniconda3\lib\urllib\request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "C:\Users\grims\miniconda3\lib\http\client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Users\grims\miniconda3\lib\http\client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Users\grims\miniconda3\lib\http\client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Users\grims\miniconda3\lib\http\client.py", line 1037, in _send_output
    self.send(msg)
  File "C:\Users\grims\miniconda3\lib\http\client.py", line 975, in send
    self.connect()
  File "C:\Users\grims\miniconda3\lib\http\client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "C:\Users\grims\miniconda3\lib\ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "C:\Users\grims\miniconda3\lib\ssl.py", line 1071, in _create
    self.do_handshake()
  File "C:\Users\grims\miniconda3\lib\ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\grims\miniconda3\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\grims\miniconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\grims\miniconda3\Scripts\stepcount.exe\__main__.py", line 7, in <module>
  File "C:\Users\grims\miniconda3\lib\site-packages\stepcount\stepcount.py", line 55, in main
    model = load_model(args.model_path or model_path, args.model_type, check_md5, args.force_download)
  File "C:\Users\grims\miniconda3\lib\site-packages\stepcount\stepcount.py", line 335, in load_model
    with urllib.request.urlopen(url) as f_src, open(pth, "wb") as f_dst:
  File "C:\Users\grims\miniconda3\lib\urllib\request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\grims\miniconda3\lib\urllib\request.py", line 519, in open
    response = self._open(req, data)
  File "C:\Users\grims\miniconda3\lib\urllib\request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "C:\Users\grims\miniconda3\lib\urllib\request.py", line 496, in _call_chain
    result = func(*args)
  File "C:\Users\grims\miniconda3\lib\urllib\request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "C:\Users\grims\miniconda3\lib\urllib\request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)>
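
"self signed certificate in certificate chain" often means that a proxy or security product on the network re-signs HTTPS traffic with its own certificate authority, which Python does not trust by default. If that is the case here (an assumption, since the issue does not say), one option is to fetch the model file manually with a TLS context that also trusts that authority. A minimal standard-library sketch; `make_context`, `fetch`, and the CA bundle path are hypothetical:

```python
import shutil
import ssl
import urllib.request


def make_context(extra_cafile=None):
    """Build a verifying TLS context, optionally trusting an extra CA bundle."""
    ctx = ssl.create_default_context()  # system CAs, verification stays enabled
    if extra_cafile:
        # Adds the given PEM bundle on top of the default trust store.
        ctx.load_verify_locations(cafile=extra_cafile)
    return ctx


def fetch(url, dest, extra_cafile=None):
    """Download url to dest using the context built above."""
    ctx = make_context(extra_cafile)
    with urllib.request.urlopen(url, context=ctx) as src, open(dest, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dest
```

The traceback shows `load_model` receiving `args.model_path`, so a file downloaded this way can presumably be supplied to stepcount instead of triggering the built-in download.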

Facing Issues with CSV Files

I am facing many issues with CSV files. Maybe I am missing something or running it incorrectly.

My CSV files are as below.

sample1.csv
time,x,y,z
2013-03-01 15:15:01.010,-1,65,5
2013-03-01 15:15:01.020,-1,65,5
2013-03-01 15:15:01.030,-1,65,4
2013-03-01 15:15:01.040,-1,65,5
2013-03-01 15:15:01.050,-1,64,5
2013-03-01 15:15:01.060,-1,64,5
2013-03-01 15:15:01.070,-1,65,5
2013-03-01 15:15:01.080,-1,65,4
2013-03-01 15:15:01.090,-1,65,5

sample2.csv
time,x,y,z
1683684921,34.0,-5.0,54.0
1683684922,34.0,-5.0,55.0
1683684923,33.0,-5.0,55.0
1683684924,34.0,-5.0,54.0
1683684925,34.0,-5.0,55.0
1683684926,33.0,-5.0,55.0
1683684927,34.0,-5.0,55.0
1683684928,33.0,-5.0,55.0
1683684929,33.0,-5.0,55.0
1683684910,34.0,-4.0,55.0
1683684911,34.0,-4.0,55.0
1683684912,33.0,-5.0,55.0
1683684913,34.0,-5.0,55.0
1683684914,34.0,-5.0,55.0
1683684915,33.0,-5.0,55.0
1683684916,33.0,-5.0,54.0
1683684917,34.0,-5.0,54.0

As I am exploring this as a beginner, it would be great if someone could share sample CSV data, notes, and any example, article, or video on using CSV files.

Thanks In Advance.
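
For experimenting, a synthetic file can be generated with the `time,x,y,z` header used in other reports in this tracker. The sketch below writes evenly spaced millisecond timestamps at a fixed sample rate. The units (g, with roughly 1 g on the z axis) and the timestamp format are assumptions taken from the other examples in these issues, and `make_sample_csv` is a hypothetical helper, not part of stepcount.

```python
import csv
import io
import math
from datetime import datetime, timedelta


def make_sample_csv(n_samples=300, sample_rate=100,
                    start=datetime(2023, 1, 1, 12, 0, 0)):
    """Return CSV text with a time,x,y,z header and evenly spaced rows."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["time", "x", "y", "z"])
    for i in range(n_samples):
        t = start + timedelta(seconds=i / sample_rate)
        # Small 2 Hz oscillation on x, gravity (assumed 1 g) on z.
        x = 0.1 * math.sin(2 * math.pi * 2 * i / sample_rate)
        stamp = t.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]  # millisecond precision
        writer.writerow([stamp, f"{x:.6f}", "0.000000", "1.000000"])
    return out.getvalue()


# with open("synthetic.csv", "w", newline="") as f:
#     f.write(make_sample_csv())
```

Such a file only exercises the reading path; it does not contain realistic walking data.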

output csv column names

The column headers in the output CSVs are titled "0" instead of "steps" or "minutes".

Java dependency

Be explicit about the Java dependency in the README.

SSL torch model significantly slower in MacOS M1

Step count fails for Newcastle subject 14, 29 geneactive

Reading file... Done! (3.75s)
Converting to dataframe... Done! (0.24s)
Getting stationary points... Done! (0.71s)
Gravity calibration... Done! (1.09s)
Nonwear detection... Done! (0.95s)
Resampling... Done! (0.23s)
Traceback (most recent call last):
  File "/Users/hangy/opt/miniconda3/envs/step_count/bin/stepcount", line 8, in <module>
    sys.exit(main())
  File "/Users/hangy/opt/miniconda3/envs/step_count/lib/python3.9/site-packages/stepcount/stepcount.py", line 55, in main
    model = load_model(args.model_path or model_path, args.model_type, check_md5, args.force_download)
  File "/Users/hangy/opt/miniconda3/envs/step_count/lib/python3.9/site-packages/stepcount/stepcount.py", line 286, in load_model
    return joblib.load(pth)
  File "/Users/hangy/opt/miniconda3/envs/step_count/lib/python3.9/site-packages/joblib/numpy_pickle.py", line 587, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/Users/hangy/opt/miniconda3/envs/step_count/lib/python3.9/site-packages/joblib/numpy_pickle.py", line 506, in _unpickle
    obj = unpickler.load()
  File "/Users/hangy/opt/miniconda3/envs/step_count/lib/python3.9/pickle.py", line 1212, in load
    dispatch[key[0]](self)
KeyError: 255
(step_count) ha

The data can be downloaded here: https://zenodo.org/record/1160410#.Y__0Gi-l1qs

Attribute error not allowing creation of summary/adjusted files

In some participants (not all, seemingly those with significant amounts of non-wear), an attribute error is flagged when summarising data.

daily_avg = daily.mean().round()
AttributeError: 'float' object has no attribute 'round'.
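
A likely explanation, offered as an assumption rather than a confirmed diagnosis: when every value is missing, the pandas mean can come back as a plain NaN float, and Python floats have no .round() method. The sketch below uses made-up data and np.round, which accepts both plain floats and NumPy scalars.

```python
import numpy as np
import pandas as pd

# Stand-in for a participant whose daily totals are all missing (heavy non-wear).
daily = pd.Series([np.nan, np.nan], dtype=float)
daily_avg = daily.mean()      # NaN; may be a plain float with no .round() method
safe_avg = np.round(daily_avg)  # works for plain floats and NumPy scalars alike

# The same call leaves ordinary cases unchanged.
ok = pd.Series([1000.4, 2000.2])
ok_avg = np.round(ok.mean())
```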

`make_windows` maybe belongs to model logic

Maybe make_windows should be the responsibility of the model logic?

stepcount/stepcount/stepcount.py

Lines 39 to 47 in 5c96190

    
    # Run model
    model = load_model()
    window_sec = model.window_sec
    print("Splitting data into windows...")
    X, T = make_windows(data, window_sec=window_sec)
    print("Running step counter...")
    Y = model.predict(X)
    Y = pd.DataFrame({'steps': Y}, index=T)
    Y.index.name = 'time'
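
To make the suggestion concrete, here is a rough sketch of a model object that owns its windowing, so the caller only hands over the raw data frame. The class name WindowedModelSketch, the sample_rate parameter, and the zero-filled predictions are all hypothetical placeholders; the real model's interface may differ.

```python
import numpy as np
import pandas as pd


class WindowedModelSketch:
    """Hypothetical wrapper: the model splits the data into windows itself."""

    def __init__(self, window_sec, sample_rate):
        self.window_sec = window_sec
        self.sample_rate = sample_rate

    def make_windows(self, data):
        n = int(self.window_sec * self.sample_rate)  # samples per window
        n_windows = len(data) // n                   # drop the trailing partial window
        xyz = data[["x", "y", "z"]].to_numpy()[: n_windows * n]
        X = xyz.reshape(n_windows, n, 3)
        T = data.index[: n_windows * n : n]          # start time of each window
        return X, T

    def predict(self, data):
        X, T = self.make_windows(data)
        Y = np.zeros(len(X), dtype=int)  # placeholder for the real step predictions
        return pd.DataFrame({"steps": Y}, index=T).rename_axis("time")


# 25 seconds of dummy 1 Hz data gives two complete 10-second windows.
index = pd.Timestamp("2024-01-01") + pd.to_timedelta(np.arange(25), unit="s")
data = pd.DataFrame(np.zeros((25, 3)), columns=["x", "y", "z"], index=index)
Y = WindowedModelSketch(window_sec=10, sample_rate=1).predict(data)
```

With this shape, the calling code shrinks to a single predict call and no longer needs to know window_sec.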

add CONTRIBUTING.md

output non-wear time

In the terminal output, the non-wear time and the number of non-wear episodes are reported as zero even when non-wear and imputation occur. The processing seems correct; only the reporting is wrong.

AttributeError: 'float' object has no attribute 'round'

I get the following error when running stepcount on a CSV file with data:

Traceback (most recent call last):
File "anaconda3\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "anaconda3\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "Scripts\stepcount.exe\__main__.py", line 7, in <module>
File "anaconda3\lib\site-packages\stepcount\stepcount.py", line 70, in main
summary = summarize(Y, model.steptol)
File "anaconda3\lib\site-packages\stepcount\stepcount.py", line 147, in summarize
daily_med = daily.median().round()
AttributeError: 'float' object has no attribute 'round'

documentation

ActiGraph .gt3x error

I am able to run stepcount on .cwa files but when I have tried running stepcount on .gt3x files it does not work. This Actigraph data was collected back in 2018.

It spends 15-20 minutes on "Reading file...".
The exported files are dailySteps.csv, hourlySteps.csv, steps.csv, and time.csv, but they contain no data, and there is no JSON file and no adjusted files.

Have you seen this problem before? Is there a workaround?

I also have CSV files of the raw x, y, z data. They have no time column, but the initial rows give the start time and start date. Would working with these files be easier?

Here is some metadata from the ActiGraph CSV (not used for processing): ActiGraph GT3X+ ActiLife v6.13.3 Firmware v1.9.2.

Below is the error output:

Reading file... Done! (2593.21s)
Converting to dataframe... Done! (8.32s)
Error: C:\Users\danas\AppData\Local\Temp\tmpthzt2197\data.npy - The process cannot access the file because it is being used by another process.
C:\Python310\lib\site-packages\actipy\processing.py:294: UserWarning: Skipping calibration: Insufficient stationary samples: 0 < 50
warnings.warn(f"Skipping calibration: Insufficient stationary samples: {len(xyz)} < {calib_min_samples}")
Gravity calibration... Done! (20.16s)
Nonwear detection... Done! (15.73s)
Resampling... Done! (3.48s)
Loading model...
Running step counter...
Defining windows...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 498.85it/s]
Using local C:\Python310\lib\site-packages\stepcount\torch_hub_cache\OxWearables_ssl-wearables_v1.0.0
Classifying windows...
C:\Python310\lib\site-packages\numpy\core\fromnumeric.py:3464: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
C:\Python310\lib\site-packages\numpy\core\_methods.py:192: RuntimeWarning: divide by zero encountered in divide
ret = ret.dtype.type(ret / rcount)
C:\Python310\lib\site-packages\numpy\lib\nanfunctions.py:1215: RuntimeWarning: Mean of empty slice
return np.nanmean(a, axis, out=out, keepdims=keepdims)
C:\Python310\lib\site-packages\numpy\core\fromnumeric.py:3464: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
C:\Python310\lib\site-packages\numpy\core\_methods.py:192: RuntimeWarning: divide by zero encountered in divide
ret = ret.dtype.type(ret / rcount)
Traceback (most recent call last):
File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\Python310\Scripts\stepcount.exe\__main__.py", line 7, in <module>
File "C:\Python310\lib\site-packages\stepcount\stepcount.py", line 105, in main
summary_adj = summarize(Y, model.steptol, adjust_estimates=True)
File "C:\Python310\lib\site-packages\stepcount\stepcount.py", line 146, in summarize
Y = impute_missing(Y)
File "C:\Python310\lib\site-packages\stepcount\stepcount.py", line 292, in impute_missing
freq=to_offset(infer_freq(data.index)),
File "pandas\_libs\tslibs\offsets.pyx", line 4102, in pandas._libs.tslibs.offsets.to_offset
File "pandas\_libs\tslibs\offsets.pyx", line 4203, in pandas._libs.tslibs.offsets.to_offset
ValueError: Invalid frequency: NaT
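
The log shows a single window (1/1) and "Mean of empty slice" warnings, which is consistent with there being too few timestamps to work out a spacing, so the inferred frequency ends up as NaT. This is an interpretation of the log, not a confirmed cause. The sketch below shows a guard that returns None in that situation instead of passing NaT onward; infer_step is a hypothetical helper, not the infer_freq used in stepcount.

```python
import pandas as pd


def infer_step(index):
    """Return the median spacing of a DatetimeIndex, or None if it cannot be inferred."""
    if len(index) < 2:
        return None  # a single timestamp has no spacing
    step = index.to_series().diff().median()
    return None if pd.isna(step) else step


single = pd.DatetimeIndex(["2024-01-01 00:00:00"])
regular = pd.DatetimeIndex(
    ["2024-01-01 00:00:00", "2024-01-01 00:00:10", "2024-01-01 00:00:20"]
)

print(infer_step(single))
print(infer_step(regular))
```

A caller could then skip imputation, or raise a clearer error, when the result is None.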

def summarize(Y):

Period in filename causes failure

If the filename contains periods, stepcount is likely to fail, as in this example:

conda activate stepcount
wget https://github.com/OxWearables/stepcount/files/14117327/example_data_ssl.csv.gz -O example.data.csv.gz
stepcount example.data.csv.gz

Traceback (most recent call last):
  File "/Users/johnmuschelli/miniconda3/envs/stepcount/bin/stepcount", line 8, in <module>
    sys.exit(main())
  File "stepcount/stepcount.py", line 50, in main
    data, info = read(args.filepath, resample_hz, sample_rate=args.sample_rate, verbose=verbose)
  File "stepcount/stepcount.py", line 389, in read
    if 'ResampleRate' not in info:
UnboundLocalError: local variable 'info' referenced before assignment

This is due to parsing at

stepcount/src/stepcount/stepcount.py

Line 336 in 85ee34d

ftype = p.suffixes[0].lower()

which does not handle filenames that contain additional periods.

import pathlib
filepath = "example.data.csv.gz"
p = pathlib.Path(filepath)
ftype = p.suffixes[0].lower()
ftype
p.suffixes

Resolution

This may be resolved by inspecting all of p.suffixes, but at a minimum there should be an else branch on the if statement at

stepcount/src/stepcount/stepcount.py

Line 378 in 85ee34d

elif ftype in (".cwa", ".gt3x", ".bin"):

so that an error such as "Failed to parse extension, getting .data" (or whatever the extension is) is raised.

$ python
Python 3.9.18 (main, Sep 11 2023, 08:38:23) 
[Clang 14.0.6 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pathlib
>>> filepath = "example.data.csv.gz"
>>> p = pathlib.Path(filepath)
>>> ftype = p.suffixes[0].lower()
>>> ftype
'.data'
>>> p.suffixes
['.data', '.csv', '.gz']
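
One way to sketch the proposed resolution: strip trailing compression suffixes, take the last remaining suffix, and raise a clear error when it is not recognised. The function name detect_ftype and the two sets below are assumptions for illustration; the set of types stepcount actually supports may be different.

```python
import pathlib

# Assumed lists for illustration only.
COMPRESSION = {".gz"}
KNOWN_TYPES = {".csv", ".cwa", ".gt3x", ".bin"}


def detect_ftype(filepath):
    """Return the data file type, ignoring extra periods and compression suffixes."""
    suffixes = [s.lower() for s in pathlib.Path(filepath).suffixes]
    # Drop trailing compression suffixes such as ".gz".
    while suffixes and suffixes[-1] in COMPRESSION:
        suffixes.pop()
    ftype = suffixes[-1] if suffixes else ""
    if ftype not in KNOWN_TYPES:
        raise ValueError(f"Failed to parse extension, got {ftype!r} from {filepath!r}")
    return ftype


print(detect_ftype("example.data.csv.gz"))
```

Taking the last suffix rather than the first means "example.data.csv.gz" resolves to ".csv", while a file named "example.data" fails with an explicit message instead of an UnboundLocalError.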

PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DIS~1.DES\\AppData\\Local\\Temp\\tmppoz7de8s'

I get a permission error when running:

stepcount .\data.csv

on Windows 10.

The CSV file has this format:

time,x,y,z
2020-05-14 17:57:53.359,-10.2578125,2.03769534,-1.85380855
2020-05-14 17:57:53.379,-10.305397727272728,1.8335404909090909,-1.5747514181818183
2020-05-14 17:57:53.399,-10.583806818181818,0.8554687327272728,-1.8620383636363635
2020-05-14 17:57:53.419,-10.034801136363637,-0.491788,-1.5311612454545451
2020-05-14 17:57:53.439,-9.323153409090908,-0.9483753545454544,-1.0940162999999998
2020-05-14 17:57:53.459,-8.506392045454545,-0.5928178490909091,-0.7536399181818182
2020-05-14 17:57:53.479,-8.5546875,-0.15795898,0.18908691
2020-05-14 17:57:53.499,-8.9140625,0.08618164,-0.05505371
2020-05-14 17:57:53.519,-9.03125,-0.11968994,-0.60791016
2020-05-14 17:57:53.539,-9.5,-1.2490233999999998,-0.15319824
2020-05-14 17:57:53.559,-9.921875,-0.61279297,1.2783203

PermissionError: [Errno 13] Permission denied: 'C:\Users\DIS~1.DES\AppData\Local\Temp\tmppoz7de8s'
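
A common reason for this kind of error on Windows is that a file inside the temporary directory is still open (for example as a memory map) when the directory is cleaned up. Whether that is what happens here is an assumption; the thread does not confirm it. The sketch below shows the general pattern of copying the data out and releasing the memory map before the directory is removed.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.npy")

    # Create a memory-mapped .npy file inside the temporary directory.
    mmap = np.lib.format.open_memmap(path, mode="w+", dtype=np.float32, shape=(10, 3))
    mmap[:] = 1.0
    mmap.flush()

    # Copy into ordinary memory, then drop the only reference to the map
    # so the file handle is released before the directory is deleted.
    data = np.array(mmap)
    del mmap

# The directory and the file are gone, but the in-memory copy remains usable.
print(data.shape, os.path.exists(path))
```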

	def load_model():
	    """ Load trained model. Download if not exists. """
	    pth = pathlib.Path(MODEL_PATH)
	    if not pth.exists():
	        url = f"https://wearables-files.ndph.ox.ac.uk/files/models/stepcounter/{__model_version__}.joblib"
	        print(f"Downloading {url}...")
	        with urllib.request.urlopen(url) as f_src, open(pth, "wb") as f_dst:
	            shutil.copyfileobj(f_src, f_dst)
	    return joblib.load(pth)

	# Run model
	model = load_model()
	window_sec = model.window_sec
	print("Splitting data into windows...")
	X, T = make_windows(data, window_sec=window_sec)
	print("Running step counter...")
	Y = model.predict(X)
	Y = pd.DataFrame({'steps': Y}, index=T)
	Y.index.name = 'time'
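
One thing worth noting about load_model as quoted: it writes the download straight to its final path, so an interrupted download would leave a partial file that the pth.exists() check later accepts. The sketch below downloads to a temporary name and renames it only on success. download_atomic is a hypothetical helper, and the demonstration uses a local file:// URL in place of the real model URL.

```python
import os
import pathlib
import shutil
import tempfile
import urllib.request


def download_atomic(url, dest):
    """Download url to dest, so dest only ever holds a complete file."""
    dest = pathlib.Path(dest)
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as f_src, open(tmp, "wb") as f_dst:
        shutil.copyfileobj(f_src, f_dst)
    os.replace(tmp, dest)  # rename happens only after the copy has finished
    return dest


# Demonstration with a local file standing in for the remote model.
with tempfile.TemporaryDirectory() as d:
    src = pathlib.Path(d) / "model.bin"
    src.write_bytes(b"fake model bytes")
    out = download_atomic(src.as_uri(), pathlib.Path(d) / "copy.bin")
    copied = out.read_bytes()
    leftover = (pathlib.Path(d) / "copy.bin.part").exists()
```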
