Loading Digital Mobility Outcome (DMO) data

Besides raw data, mobgap also has utilities to load pre-calculated DMO data in the format published by Mobilise-D. Specifically, this is the data of the TVS (technical validation study) and the individual CVS (clinical validation study) visits. For each CVS visit, the data is available as a single CSV file that contains the DMO data for all walking bouts of all participants.

In addition, you might have access to the weartime reports, published by McRoberts. They can be loaded optionally together with the DMO data (see below).

To get started, the data should be organised as follows:

  1. The main dmo file with a name like cvs-T1-wb-dmo-27-11-2023.csv. The important part is that the second element (separated by -) indicates the visit-id (T1, T2, …).

  2. A file that contains the mapping from the p-id to the measurement site. This file should have at least the columns Local.Participant and Participant.Site.

  3. If you are planning to load the weartime reports, you need to have a folder with the individual weartime reports and a “compliance report” that contains the total weartime per day. The file should follow the naming schema CVS-wear-complicance-*.xlsx and should be placed in the same folder as the weartime reports.
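
The naming and column requirements above can be checked before loading anything. The following is a minimal sketch using only the standard library. The helper names `visit_id_from_dmo_filename` and `missing_mapping_columns` are hypothetical and not part of mobgap, the mapping file is assumed to be a comma-separated CSV, and the parsing shown here is not necessarily how mobgap extracts the visit ID internally.

```python
import csv
import io
from pathlib import Path

# Columns the mapping file needs, as listed in item 2 above.
REQUIRED_MAPPING_COLUMNS = {"Local.Participant", "Participant.Site"}


def visit_id_from_dmo_filename(path):
    """Return the second dash-separated element of the file name (hypothetical helper)."""
    parts = Path(path).stem.split("-")
    if len(parts) < 2:
        raise ValueError(f"Cannot derive a visit ID from {path!r}")
    return parts[1]


def missing_mapping_columns(csv_file):
    """Return the required columns that are absent from the CSV header (hypothetical helper)."""
    header = next(csv.reader(csv_file), [])
    return REQUIRED_MAPPING_COLUMNS - set(header)


visit_id = visit_id_from_dmo_filename("cvs-T1-wb-dmo-27-11-2023.csv")

# Toy in-memory mapping file; a real file would be opened with open(path, newline="").
toy_mapping = io.StringIO("Local.Participant,Participant.Site\n10004,CAU\n")
missing = missing_mapping_columns(toy_mapping)

print(visit_id, missing)  # T1 set()
```

An empty set means both required columns are present. Any names in the returned set point to what needs to be added to the mapping file.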

If the data is organised as described above, you can load the data using the MobilisedCvsDmoDataset class. Below, we will use some example data that is included in the mobgap package containing the data from two participants.

Loading data using these classes handles a lot of common edge cases, in particular the correct handling of timezones, and is hence the recommended way to load the data.

We will only show loading the data without the weartime reports, as no example weartime reports are included in the package at the moment.

from mobgap.data import MobilisedCvsDmoDataset, get_example_cvs_dmo_data_path

example_data_base_path = get_example_cvs_dmo_data_path()
dmo_data_path = example_data_base_path / "cvs-T1-test_data.csv"
mapping_path = example_data_base_path / "cvs-T1-test_data_mapping.csv"

dataset = MobilisedCvsDmoDataset(
    dmo_path=dmo_data_path, site_pid_map_path=mapping_path
)
dataset

MobilisedCvsDmoDataset [14 groups/rows]

visit_type participant_id measurement_date
0 T1 10004 2021-04-13
1 T1 10004 2021-04-14
2 T1 10004 2021-04-15
3 T1 10004 2021-04-16
4 T1 10004 2021-04-17
5 T1 10004 2021-04-18
6 T1 10004 2021-04-19
7 T1 10005 2021-04-14
8 T1 10005 2021-04-15
9 T1 10005 2021-04-16
10 T1 10005 2021-04-17
11 T1 10005 2021-04-18
12 T1 10005 2021-04-19
13 T1 10005 2021-04-20


We can access all DMO data (i.e. all individual DMOs per walking bout) of the entire dataset using the following line. This might take a second, as the data is loaded from the CSV file (in particular when using the full dataset instead of the example data).

wbday duration_s n_steps n_turns cadence_spm walking_speed_mps stride_length_m stride_duration_s visit_date_utc site timezone
visit_type participant_id measurement_date wb_id
T1 10004 2021-04-13 wb_2021_4_13_0001_1 1 4.823538 8 0 90.888440 0.530627 0.774706 1.383975 2021-04-12 22:00:00+00:00 CAU Europe/Berlin
wb_2021_4_13_0002_1 1 8.039229 11 0 80.188857 0.529198 0.876706 1.634663 2021-04-12 22:00:00+00:00 CAU Europe/Berlin
wb_2021_4_13_0003_1 1 13.817425 19 1 91.352944 0.708731 1.027429 1.481877 2021-04-12 22:00:00+00:00 CAU Europe/Berlin
wb_2021_4_13_0004_1 1 13.264728 21 0 99.749524 0.668984 0.894819 1.420747 2021-04-12 22:00:00+00:00 CAU Europe/Berlin
wb_2021_4_13_0005_1 1 32.207162 37 1 79.089075 0.609849 1.029953 1.495227 2021-04-12 22:00:00+00:00 CAU Europe/Berlin
... ... ... ... ... ... ... ... ... ... ... ... ... ...
10005 2021-04-20 wb_2021_4_20_0366_7 7 4.748170 6 0 87.497580 0.459190 0.694687 0.994593 2021-04-19 22:00:00+00:00 CAU Europe/Berlin
wb_2021_4_20_0367_7 7 7.360919 10 1 78.690165 0.546124 0.908011 1.445640 2021-04-19 22:00:00+00:00 CAU Europe/Berlin
wb_2021_4_20_0368_7 7 7.788003 12 0 85.342554 0.734981 1.124061 1.320190 2021-04-19 22:00:00+00:00 CAU Europe/Berlin
wb_2021_4_20_0369_7 7 9.194868 10 1 81.725056 0.752255 1.202214 1.651615 2021-04-19 22:00:00+00:00 CAU Europe/Berlin
wb_2021_4_20_0370_7 7 11.254921 10 7 76.918853 1.290913 2.199377 2.174182 2021-04-19 22:00:00+00:00 CAU Europe/Berlin

4753 rows × 11 columns



We can also access a data_mask that represents potential data quality issues in the data. If a value in the mask is False, the corresponding DMO value is outside expert-defined thresholds. Depending on the analysis, you might want to exclude these values. Further methods like the MobilisedAggregator allow you to pass this data mask so that these values are correctly excluded from further analysis.

duration_s n_steps n_turns cadence_spm walking_speed_mps stride_length_m stride_duration_s
visit_type participant_id measurement_date wb_id
T1 10004 2021-04-13 wb_2021_4_13_0001_1 True True True True True True True
wb_2021_4_13_0002_1 True True True True True True True
wb_2021_4_13_0003_1 True True True True True True True
wb_2021_4_13_0004_1 True True True True True True True
wb_2021_4_13_0005_1 True True True True True True True
... ... ... ... ... ... ... ... ... ...
10005 2021-04-20 wb_2021_4_20_0366_7 True True True True True True True
wb_2021_4_20_0367_7 True True True True True True True
wb_2021_4_20_0368_7 True True True True True True True
wb_2021_4_20_0369_7 True True True True True True True
wb_2021_4_20_0370_7 True True True True True True True

4753 rows × 7 columns
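
To illustrate what excluding flagged values can look like, here is a minimal pandas sketch. It does not use mobgap and does not show how the MobilisedAggregator applies the mask internally. The two small tables are made-up toy data with the same layout as the outputs above, and the implausible cadence of 250 steps per minute is invented purely for the demonstration.

```python
import pandas as pd

# Toy stand-ins for the per-walking-bout DMO table and its boolean mask.
wb_index = pd.Index(["wb_1", "wb_2", "wb_3"], name="wb_id")
dmos = pd.DataFrame(
    {
        "cadence_spm": [90.9, 80.2, 250.0],
        "walking_speed_mps": [0.53, 0.53, 0.71],
    },
    index=wb_index,
)
mask = pd.DataFrame(
    {
        "cadence_spm": [True, True, False],  # False marks a flagged value
        "walking_speed_mps": [True, True, True],
    },
    index=wb_index,
)

# Keep values where the mask is True; flagged values become NaN.
masked = dmos.where(mask)

# pandas aggregations skip NaN by default, so the flagged value is ignored.
mean_cadence = masked["cadence_spm"].mean()
n_valid = masked.notna().sum()
```

Masking per value instead of dropping whole rows keeps the unflagged DMOs of a walking bout available: here, the walking speed of `wb_3` still contributes even though its cadence is excluded.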



We can see in the index that each day of the recording is listed as a separate entry in the dataset index and hence can be easily accessed individually.

MobilisedCvsDmoDataset [14 groups/rows]

visit_type participant_id measurement_date
0 T1 10004 2021-04-13
1 T1 10004 2021-04-14
2 T1 10004 2021-04-15
3 T1 10004 2021-04-16
4 T1 10004 2021-04-17
5 T1 10004 2021-04-18
6 T1 10004 2021-04-19
7 T1 10005 2021-04-14
8 T1 10005 2021-04-15
9 T1 10005 2021-04-16
10 T1 10005 2021-04-17
11 T1 10005 2021-04-18
12 T1 10005 2021-04-19
13 T1 10005 2021-04-20


MobilisedCvsDmoDataset [7 groups/rows]

visit_type participant_id measurement_date
0 T1 10004 2021-04-13
1 T1 10004 2021-04-14
2 T1 10004 2021-04-15
3 T1 10004 2021-04-16
4 T1 10004 2021-04-17
5 T1 10004 2021-04-18
6 T1 10004 2021-04-19


For a subset that contains a single participant, like the one above, we can access the measurement site and timezone of that participant. Note that this is usually not that important, as the class handles timezone conversions internally and provides all time values (e.g. the start of a walking bout) in the local time of the measurement site.

'CAU'
'Europe/Berlin'
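
The `visit_date_utc` column in the DMO table further up shows why the timezone matters: the first measurement day of participant 10004 is stored as 2021-04-12 22:00 UTC, although the measurement date is 2021-04-13. The following sketch uses only the standard library to show the conversion. It illustrates the relationship between the two values and is not a description of how the dataset class performs the conversion internally.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Value of visit_date_utc for the first measurement day of participant 10004.
visit_date_utc = datetime(2021, 4, 12, 22, 0, tzinfo=timezone.utc)

# Convert to the timezone reported for the measurement site.
local = visit_date_utc.astimezone(ZoneInfo("Europe/Berlin"))

print(local.isoformat())  # 2021-04-13T00:00:00+02:00
```

In local time, the timestamp is midnight at the start of 2021-04-13, which matches the measurement_date in the dataset index.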
