Threshold Checker

This example shows how to use the apply_thresholds function to check whether DMOs fall within the thresholds for a specific cohort and height.

import pandas as pd
from mobgap import PACKAGE_ROOT
from mobgap.aggregation import apply_thresholds, get_mobilised_dmo_thresholds

Selecting Example Data

We use some example DMO data. Note that the apply_thresholds function can only be used with the data of a single participant at a time, as meta-data such as the participant’s height is required. Luckily, all the example data is from the same participant, just recorded at different times.

DATA_PATH = (
    PACKAGE_ROOT.parent / "example_data/original_results/mobilised_aggregator"
)

data = pd.read_csv(
    DATA_PATH / "aggregation_test_input.csv", index_col=0
).set_index(["visit_type", "participant_id", "measurement_date", "wb_id"])
data.head()
                                                  duration_s  n_steps  cadence_spm  walking_speed_mps  stride_length_m  stride_duration_s  n_turns
visit_type participant_id measurement_date wb_id
T1         12345          2023-01-01       0         4.74702        8     99.82188            1.16079          2.51885            1.58675        0
                                           1         5.13150        7    101.16429            2.57881          1.57243            1.46537        0
                                           2         8.52727       12     86.53527            1.60044          1.66305            2.56092        2
                                           3        16.24554       27     91.49977            0.95558          0.88961            3.14549        1
                                           4         6.09907        8     93.69895            2.33230          1.95969            2.35295        0
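
Because apply_thresholds expects the data of one participant per call, a table with several participants has to be split first. The following is a minimal sketch of such a loop and is not part of mobgap: apply_per_participant, participant_info, and dummy_check are hypothetical names, and the toy values are made up. The check function is passed in as an argument so that the sketch runs without mobgap; in practice it would wrap apply_thresholds, as hinted in the comment.

```python
import pandas as pd


def apply_per_participant(dmos, participant_info, check_fn):
    """Call ``check_fn`` once per participant and stack the resulting masks."""
    masks = []
    for participant_id, participant_dmos in dmos.groupby(level="participant_id"):
        info = participant_info[participant_id]
        masks.append(
            check_fn(
                participant_dmos,
                cohort=info["cohort"],
                height_m=info["height_m"],
            )
        )
    return pd.concat(masks)


# Toy data for two participants (made-up values).
index = pd.MultiIndex.from_tuples(
    [(1, 0), (1, 1), (2, 0)], names=["participant_id", "wb_id"]
)
toy_dmos = pd.DataFrame({"walking_speed_mps": [1.1, 2.6, 0.9]}, index=index)

# Hypothetical per-participant meta-data.
toy_info = {
    1: {"cohort": "CHF", "height_m": 1.75},
    2: {"cohort": "PD", "height_m": 1.62},
}

seen = []


def dummy_check(df, *, cohort, height_m):
    # Stand-in for the real check: records the meta-data it received and
    # applies a fixed, made-up upper limit instead of cohort thresholds.
    seen.append((cohort, height_m))
    return df < 2.0


toy_mask = apply_per_participant(toy_dmos, toy_info, dummy_check)

# With mobgap, check_fn could wrap the call used in this example, e.g.:
# check_fn = lambda df, **meta: apply_thresholds(
#     df, thresholds, measurement_condition="free_living", **meta
# )
```

The stacked mask keeps the index of the input, so it can be used in the same way as the mask of a single participant.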


Load Thresholds

The thresholds shown below are loaded with get_mobilised_dmo_thresholds. They provide a minimum and a maximum per DMO and cohort for each measurement condition.

thresholds = get_mobilised_dmo_thresholds()
thresholds

condition                 free_living              laboratory              literature              global
threshold_type                    min         max         min         max         min      max        min      max
dmo               cohort
cadence_spm       CHF       40.594770  167.942930   53.617703  122.657158       36.20  150.900  40.000000  172.900
                  COPD      38.880484  151.558718   59.451004  164.969764       71.20  154.800  40.000000  172.900
                  HA        35.751898  156.946955   60.176644  146.309393       90.40  169.200  40.000000  172.900
                  MS        38.443211  155.394496   44.016149  140.939597       45.60  164.400  40.000000  172.900
                  PD        40.957969  142.121947   47.681576  147.054849       33.80   77.600  40.000000  172.900
                  PFF       42.191719  157.613665   46.839404  133.129276       42.90  172.900  40.000000  172.900
walking_speed_mps CHF        0.112478    1.757103    0.273401    1.528913        0.63    1.830   0.081515    2.220
                  COPD       0.090731    1.641773    0.159689    1.918146        0.60    1.840   0.081515    2.220
                  HA         0.097413    1.965728    0.207887    1.583779        0.78    2.220   0.081515    2.220
                  MS         0.085918    1.920262    0.203213    1.877691        0.52    1.740   0.081515    2.220
                  PD         0.081515    1.735348    0.139654    1.939697        0.56    2.100   0.081515    2.220
                  PFF        0.104547    1.553186    0.143724    1.627383        0.11    1.590   0.081515    2.220
stride_length_m   CHF        0.185308    2.166646    0.444041    1.568583        0.53    1.860   0.150523    2.190
                  COPD       0.176403    1.711645    0.223537    1.623733        1.03    2.190   0.150523    2.190
                  HA         0.155126    2.024694    0.290283    1.669609        0.87    1.980   0.150523    2.190
                  MS         0.191099    1.940537    0.363004    1.721322        0.72    2.100   0.150523    2.190
                  PD         0.150523    1.982926    0.253069    1.772536        0.56    1.620   0.150523    2.190
                  PFF        0.206787    1.697251    0.233460    1.587144        0.71    1.930   0.150523    2.190
stride_duration_s CHF        0.702000    3.030857    0.980000    2.274545        0.46    1.320   0.460000    3.000
                  COPD       0.770000    3.000000    0.727500    2.160000        0.68    1.410   0.460000    3.000
                  HA         0.735435    3.254400    0.829091    2.142000        0.69    2.304   0.460000    3.000
                  MS         0.775597    3.057750    0.855000    2.731667        0.69    2.304   0.460000    3.000
                  PD         0.836253    2.928000    0.818571    2.522000        0.76    1.690   0.460000    3.000
                  PFF        0.744000    2.769000    0.902727    2.584545        0.14    0.980   0.460000    3.000
step_duration_s   CHF        0.376000    1.504500    0.498000    1.144167        0.14    0.980   0.140000    2.124
                  COPD       0.390400    1.788000    0.366000    1.181429        0.16    0.920   0.140000    2.124
                  HA         0.367972    1.730400    0.410000    1.137500        0.25    1.040   0.140000    2.124
                  MS         0.388084    1.905000    0.432857    1.360000        0.25    1.090   0.140000    2.124
                  PD         0.417216    1.605600    0.415000    1.450000        0.19    1.100   0.140000    2.124
                  PFF        0.371429    2.124000    0.430000    1.320000        0.35    1.100   0.140000    2.124
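
To make the layout of this table concrete, the sketch below rebuilds two of its rows by hand, looks up the free-living limits for the CHF cohort, and compares them with the walking speeds of the five walking bouts shown earlier. This is a simplified illustration under stated assumptions: toy_thresholds is a hand-built stand-in, the limits are treated as inclusive, and height_m is ignored entirely, so it is not a substitute for apply_thresholds.

```python
import pandas as pd

# Two rows of the threshold table above, rebuilt by hand for illustration.
columns = pd.MultiIndex.from_product(
    [["free_living", "laboratory"], ["min", "max"]],
    names=["condition", "threshold_type"],
)
index = pd.MultiIndex.from_tuples(
    [("walking_speed_mps", "CHF"), ("walking_speed_mps", "COPD")],
    names=["dmo", "cohort"],
)
toy_thresholds = pd.DataFrame(
    [
        [0.112478, 1.757103, 0.273401, 1.528913],
        [0.090731, 1.641773, 0.159689, 1.918146],
    ],
    index=index,
    columns=columns,
)

# Pick one measurement condition, then one (dmo, cohort) row.
limits = toy_thresholds["free_living"].loc[("walking_speed_mps", "CHF")]
lower, upper = limits["min"], limits["max"]

# Walking speeds of the five walking bouts shown in data.head().
walking_speed = pd.Series([1.16079, 2.57881, 1.60044, 0.95558, 2.33230])

# Assumption: both limits are inclusive.
within = walking_speed.between(lower, upper)
print(within.tolist())  # [True, False, True, True, False]
```

Walking bouts 1 and 4 exceed the CHF free-living maximum of 1.757103 m/s and are therefore flagged as outside the thresholds.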


Apply Thresholds

data_mask = apply_thresholds(
    data,
    thresholds,
    cohort="CHF",
    height_m=1.75,
    measurement_condition="free_living",
)
data_mask.head()
                                                  duration_s  n_steps  cadence_spm  walking_speed_mps  stride_length_m  stride_duration_s  n_turns
visit_type participant_id measurement_date wb_id
T1         12345          2023-01-01       0             NaN      NaN         True               True             True               True      NaN
                                           1             NaN      NaN         True              False             True               True      NaN
                                           2             NaN      NaN         True               True             True               True      NaN
                                           3             NaN      NaN         True               True             True              False      NaN
                                           4             NaN      NaN         True              False             True               True      NaN


The output has exactly the same structure as the input data, but contains boolean values indicating whether each DMO value falls within the thresholds. Columns for which no thresholds were provided contain only NaN, indicating that no check was applied. How you use this information is up to you.

The mask can be used to filter the input data so that only DMOs within the thresholds remain. Combined with the aggregation algorithm, this ensures that only those DMOs are included in the aggregated results. See the mobilised_aggregator example for more information.
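
One possible way to do this filtering with plain pandas is sketched below. toy_data and toy_mask are hand-built stand-ins for data and data_mask, with values rounded from the tables above. The sketch assumes that NaN in the mask means "not checked" and should therefore not reject a value; whether that is the right choice depends on your analysis.

```python
import pandas as pd

nan = float("nan")

# Toy stand-ins for `data` and `data_mask` (first three walking bouts only).
toy_data = pd.DataFrame(
    {
        "n_steps": [8, 7, 12],
        "cadence_spm": [99.82, 101.16, 86.54],
        "walking_speed_mps": [1.16, 2.58, 1.60],
    }
)
toy_mask = pd.DataFrame(
    {
        "n_steps": [nan, nan, nan],  # no thresholds available for this column
        "cadence_spm": [True, True, True],
        "walking_speed_mps": [True, False, True],
    }
)

# Assumption: NaN means "not checked", so only an explicit False rejects a value.
keep = toy_mask.isna() | toy_mask.eq(True)

# Option 1: replace individual values that failed their check with NaN.
filtered_values = toy_data.where(keep)

# Option 2: drop every walking bout with at least one failed check.
filtered_bouts = toy_data[keep.all(axis=1)]
```

Option 1 keeps all walking bouts and only removes the offending values, while option 2 discards a walking bout as soon as one of its DMOs is outside the thresholds.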
