Performance of the gait sequences algorithm on the TVS dataset

This example analyzes and compares the performance of the gait sequence detection (GSD) algorithms on the TVS dataset (lab and free-living). We evaluate each algorithm against the reference data and compare the results with the performance of the original MATLAB implementations.

Note

If you are interested in how these results are calculated, head over to the processing page.

We focus on the single_results (i.e., the performance per trial) and aggregate them over multiple levels.

Below is the list of algorithms that we compare. Note that we use the prefix “MobGap” to refer to the reimplemented Python algorithms and “Original Implementation” to refer to the original MATLAB algorithms. For the GsdIluz algorithm, we have two reimplemented versions: the MobGap version uses a slightly modified peak detection algorithm, while the MobGap (original peak) version tries to emulate the original peak detection algorithm as closely as possible.

algorithms = {
    "GsdIonescu": ("GsdIonescu", "MobGap"),
    "GsdAdaptiveIonescu": ("GsdAdaptiveIonescu", "MobGap"),
    "GsdIluz": ("GsdIluz", "MobGap"),
    "GsdIluz_orig_peak": ("GsdIluz", "MobGap (original peak)"),
}
# We only load the matlab algorithms that were also reimplemented
algorithms.update(
    {
        "matlab_EPFL_V1-improved_th": ("GsdIonescu", "Original Implementation"),
        "matlab_EPFL_V2-original": (
            "GsdAdaptiveIonescu",
            "Original Implementation",
        ),
        "matlab_TA_Iluz-original": ("GsdIluz", "Original Implementation"),
    }
)

The code below loads the data and prepares it for the analysis. By default, the data is downloaded from an online repository (and cached locally). If you want to use a local copy of the data, set the MOBGAP_VALIDATION_DATA_PATH environment variable and set MOBGAP_VALIDATION_USE_LOCAL_DATA to 1.

The file download prints some log output, which can usually be ignored. You can also change the version parameter to load a different version of the data.
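
As a minimal sketch of the local-data setup described above, the two environment variables can be set from within Python before the loader is created. The path is a placeholder, and the snippet reads the variables with `os.environ` rather than with mobgap's `get_env_var` helper, so it only mirrors the selection logic of the loading code and is not the library's own implementation.

```python
import os
from pathlib import Path

# Placeholder path (assumption): point this to your local copy of the data.
os.environ["MOBGAP_VALIDATION_DATA_PATH"] = "/path/to/mobgap_validation"
os.environ["MOBGAP_VALIDATION_USE_LOCAL_DATA"] = "1"

# Same decision as in the loading code: use the local "results" folder only
# if the switch is set to a non-zero integer, otherwise fall back to None
# (which lets the loader download the data).
use_local = bool(int(os.environ.get("MOBGAP_VALIDATION_USE_LOCAL_DATA", "0")))
local_results_path = (
    Path(os.environ["MOBGAP_VALIDATION_DATA_PATH"]) / "results"
    if use_local
    else None
)
print(local_results_path)
```

In practice, you would more likely export both variables in your shell or IDE run configuration so that they are set before Python starts.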

from pathlib import Path

import pandas as pd
from mobgap.data.validation_results import ValidationResultLoader
from mobgap.utils.misc import get_env_var

local_data_path = (
    Path(get_env_var("MOBGAP_VALIDATION_DATA_PATH")) / "results"
    if int(get_env_var("MOBGAP_VALIDATION_USE_LOCAL_DATA", 0))
    else None
)
__RESULT_VERSION = "v1.0.0"
loader = ValidationResultLoader(
    "gsd", result_path=local_data_path, version=__RESULT_VERSION
)


free_living_index_cols = [
    "cohort",
    "participant_id",
    "time_measure",
    "recording",
    "recording_name",
    "recording_name_pretty",
]

results = {
    v: loader.load_single_results(k, "free_living")
    for k, v in algorithms.items()
}
results = pd.concat(
    results, names=["algo", "version", *free_living_index_cols]
).assign(
    # We convert all relative errors to percentages
    gs_absolute_relative_duration_error=lambda df: df[
        "gs_absolute_relative_duration_error"
    ]
    * 100,
    gs_relative_duration_error=lambda df: df["gs_relative_duration_error"]
    * 100,
)
results_long = results.reset_index().assign(
    algo_with_version=lambda df: df["algo"] + " (" + df["version"] + ")",
    _combined="combined",
)

lab_index_cols = [
    "cohort",
    "participant_id",
    "time_measure",
    "test",
    "trial",
    "test_name",
    "test_name_pretty",
]

lab_results = {
    v: loader.load_single_results(k, "laboratory")
    for k, v in algorithms.items()
}
lab_results = pd.concat(
    lab_results, names=["algo", "version", *lab_index_cols]
).assign(
    # We convert all relative errors to percentages
    gs_absolute_relative_duration_error=lambda df: df[
        "gs_absolute_relative_duration_error"
    ]
    * 100,
    gs_relative_duration_error=lambda df: df["gs_relative_duration_error"]
    * 100,
)
lab_results_long = lab_results.reset_index().assign(
    algo_with_version=lambda df: df["algo"] + " (" + df["version"] + ")",
    _combined="combined",
)

cohort_order = ["HA", "CHF", "COPD", "MS", "PD", "PFF"]
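
The loading code above relies on `pd.concat` turning the tuple keys of a dictionary into additional index levels. The following toy example sketches this step with made-up data; the participant IDs and F1 values are purely illustrative and are not taken from the validation results.

```python
import pandas as pd

# Two hypothetical per-algorithm result tables, indexed by participant.
toy_results = {
    ("GsdIluz", "MobGap"): pd.DataFrame(
        {"f1_score": [0.80, 0.90]},
        index=pd.Index(["P1", "P2"], name="participant_id"),
    ),
    ("GsdIluz", "Original Implementation"): pd.DataFrame(
        {"f1_score": [0.70, 0.85]},
        index=pd.Index(["P1", "P2"], name="participant_id"),
    ),
}

# Each tuple key becomes two outer index levels ("algo" and "version").
combined = pd.concat(toy_results, names=["algo", "version", "participant_id"])

# The long format turns all index levels into columns, which is convenient
# for plotting with seaborn.
combined_long = combined.reset_index().assign(
    algo_with_version=lambda df: df["algo"] + " (" + df["version"] + ")"
)
print(combined_long)
```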


Performance metrics

For each participant, performance metrics were calculated by classifying each sample in the recording as a true positive (TP), false positive (FP), true negative (TN), or false negative (FN). Based on these values, recall (sensitivity), precision (positive predictive value), F1 score, accuracy, specificity, and many other metrics were calculated. In addition, the overall duration of detected gait per participant was calculated. From this, we calculate the mean and confidence interval for both systems (the INDIP reference system and the wearable device, WD), the bias and limits of agreement (LoA) between the algorithm output and the reference data, the absolute error, and the intraclass correlation coefficient (ICC).
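
To make the sample-wise classification concrete, here is a minimal sketch that derives the main metrics from two boolean masks (gait / no gait per sample). The function `samplewise_metrics` and the toy masks are hypothetical and only illustrate the standard definitions; they are not the mobgap implementation used to produce the results on this page.

```python
import numpy as np


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN if the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def samplewise_metrics(reference, detected) -> dict:
    """Compute sample-wise classification metrics from two boolean gait masks."""
    reference = np.asarray(reference, dtype=bool)
    detected = np.asarray(detected, dtype=bool)

    # Classify every sample of the recording.
    tp = int(np.sum(reference & detected))
    fp = int(np.sum(~reference & detected))
    fn = int(np.sum(reference & ~detected))
    tn = int(np.sum(~reference & ~detected))

    return {
        "recall": _ratio(tp, tp + fn),
        "precision": _ratio(tp, tp + fp),
        "f1_score": _ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "specificity": _ratio(tn, tn + fp),
    }


# Hypothetical 10-sample recording: 1 = gait, 0 = no gait.
reference_mask = [0, 1, 1, 1, 1, 0, 0, 0, 0, 0]
detected_mask = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
metrics = samplewise_metrics(reference_mask, detected_mask)
print(metrics)
```

In this toy recording, three samples are TP, one is FP, one is FN, and five are TN, which gives a recall, precision, and F1 score of 0.75 and an accuracy of 0.8.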

The functions that calculate these metrics are defined below.

from functools import partial

from mobgap.pipeline.evaluation import CustomErrorAggregations as A
from mobgap.utils.df_operations import (
    CustomOperation,
    apply_aggregations,
    apply_transformations,
    multilevel_groupby_apply_merge,
)
from mobgap.utils.tables import FormatTransformer as F
from mobgap.utils.tables import (
    RevalidationInfo,
    revalidation_table_styles,
)
from mobgap.utils.tables import StatsFunctions as S

custom_aggs = [
    CustomOperation(
        identifier=None,
        function=A.n_datapoints,
        column_name=[("n_datapoints", "all")],
    ),
    ("recall", ["mean", A.conf_intervals]),
    ("precision", ["mean", A.conf_intervals]),
    ("f1_score", ["mean", A.conf_intervals]),
    ("accuracy", ["mean", A.conf_intervals]),
    ("specificity", ["mean", A.conf_intervals]),
    ("reference_gs_duration_s", ["mean", A.conf_intervals]),
    ("detected_gs_duration_s", ["mean", A.conf_intervals]),
    ("gs_duration_error_s", ["mean", A.loa]),
    ("gs_absolute_duration_error_s", ["mean", A.conf_intervals]),
    ("gs_absolute_relative_duration_error", ["mean", A.conf_intervals]),
    CustomOperation(
        identifier=None,
        function=partial(
            A.icc,
            reference_col_name="reference_gs_duration_s",
            detected_col_name="detected_gs_duration_s",
            icc_type="icc2",
        ),
        column_name=[("icc", "gs_duration_s"), ("icc_ci", "gs_duration_s")],
    ),
]

format_transforms = [
    CustomOperation(
        identifier=None,
        function=lambda df_: df_[("n_datapoints", "all")].astype(int),
        column_name=("General", "n_datapoints"),
    ),
    *(
        CustomOperation(
            identifier=None,
            function=partial(
                F.value_with_metadata,
                value_col=("mean", c),
                other_columns={
                    "range": ("conf_intervals", c),
                    "stats_metadata": ("stats_metadata", c),
                },
            ),
            column_name=("GSD", c),
        )
        for c in [
            "recall",
            "precision",
            "f1_score",
            "accuracy",
            "specificity",
        ]
    ),
    *(
        CustomOperation(
            identifier=None,
            function=partial(
                F.value_with_metadata,
                value_col=("mean", c),
                other_columns={
                    "range": ("conf_intervals", c),
                    "stats_metadata": ("stats_metadata", c),
                },
            ),
            column_name=("GS duration", c),
        )
        for c in [
            "reference_gs_duration_s",
            "detected_gs_duration_s",
            "gs_absolute_duration_error_s",
            "gs_absolute_relative_duration_error",
        ]
    ),
    CustomOperation(
        identifier=None,
        function=partial(
            F.value_with_metadata,
            value_col=("mean", "gs_duration_error_s"),
            other_columns={"range": ("loa", "gs_duration_error_s")},
        ),
        column_name=("GS duration", "gs_duration_error_s"),
    ),
    CustomOperation(
        identifier=None,
        function=partial(
            F.value_with_metadata,
            value_col=("icc", "gs_duration_s"),
            other_columns={"range": ("icc_ci", "gs_duration_s")},
        ),
        column_name=("GS duration", "icc"),
    ),
]

final_names = {
    "n_datapoints": "# recordings",
    "recall": "Recall",
    "precision": "Precision",
    "f1_score": "F1 Score",
    "accuracy": "Accuracy",
    "specificity": "Specificity",
    "reference_gs_duration_s": "INDIP mean and CI [s]",
    "detected_gs_duration_s": "WD mean and CI [s]",
    "gs_duration_error_s": "Bias and LoA [s]",
    "gs_absolute_duration_error_s": "Abs. Error [s]",
    "gs_absolute_relative_duration_error": "Abs. Rel. Error [%]",
    "icc": "ICC",
}
stats_transform = [
    CustomOperation(
        identifier=None,
        function=partial(
            S.pairwise_tests,
            value_col=c,
            between="version",
            reference_group_key="Original Implementation",
        ),
        column_name=[("stats_metadata", c)],
    )
    for c in [
        "recall",
        "precision",
        "f1_score",
        "accuracy",
        "specificity",
        "gs_absolute_relative_duration_error",
    ]
]


validation_thresholds = {
    ("GSD", "Recall"): RevalidationInfo(threshold=0.7, higher_is_better=True),
    ("GSD", "Precision"): RevalidationInfo(
        threshold=0.7, higher_is_better=True
    ),
    ("GSD", "F1 Score"): RevalidationInfo(threshold=0.7, higher_is_better=True),
    ("GSD", "Accuracy"): RevalidationInfo(threshold=0.7, higher_is_better=True),
    ("GSD", "Specificity"): RevalidationInfo(
        threshold=0.7, higher_is_better=True
    ),
    ("GS duration", "Abs. Error [s]"): RevalidationInfo(
        threshold=None, higher_is_better=False
    ),
    ("GS duration", "Abs. Rel. Error [%]"): RevalidationInfo(
        threshold=20, higher_is_better=False
    ),
    ("GS duration", "ICC"): RevalidationInfo(
        threshold=0.7, higher_is_better=True
    ),
}


def format_results(df: pd.DataFrame) -> pd.DataFrame:
    return (
        df.pipe(apply_transformations, format_transforms)
        .rename(columns=final_names)
        .loc[:, pd.IndexSlice[:, list(final_names.values())]]
    )
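
The "Bias and LoA" and "Abs. Rel. Error" columns configured above summarize the per-recording difference between detected and reference gait duration. The sketch below shows how such values can be computed, assuming the conventional Bland-Altman definition of the limits of agreement (bias ± 1.96 standard deviations of the error). Whether `A.loa` uses exactly these settings is an assumption, and the function names and durations are made up for illustration.

```python
import numpy as np


def bias_and_loa(detected, reference, agreement: float = 1.96):
    """Return the mean error (bias) and the Bland-Altman limits of agreement."""
    error = np.asarray(detected, dtype=float) - np.asarray(reference, dtype=float)
    bias = float(error.mean())
    # Sample standard deviation (ddof=1) of the per-recording errors.
    spread = agreement * float(error.std(ddof=1))
    return bias, (bias - spread, bias + spread)


def mean_abs_rel_error_percent(detected, reference) -> float:
    """Return the mean absolute error relative to the reference, in percent."""
    detected = np.asarray(detected, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(detected - reference) / reference) * 100)


# Hypothetical gait durations in seconds for four recordings.
reference_s = [100.0, 200.0, 300.0, 400.0]
detected_s = [110.0, 190.0, 320.0, 400.0]

bias, loa = bias_and_loa(detected_s, reference_s)
abs_rel_error = mean_abs_rel_error_percent(detected_s, reference_s)
print(f"bias = {bias:.2f} s, LoA = [{loa[0]:.2f}, {loa[1]:.2f}] s")
print(f"abs. rel. error = {abs_rel_error:.2f} %")
```

The sign convention (detected minus reference) matches the tables on this page, where a positive bias means that the algorithm detects more gait than the reference system.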

Free-Living Comparison

We focus the comparison on the free-living data, as it is the most relevant for our final use case. The free-living data contains one 2.5-hour recording per participant. This means that each data point in the plots and summary statistics below represents one participant.

All results across all cohorts

Note that the MobGap (original peak) version is a variant of the new GsdIluz algorithm in which we tried to emulate the original peak detection algorithm as closely as possible. The regular MobGap version uses a slightly modified peak detection algorithm.

import matplotlib.pyplot as plt
import seaborn as sns

hue_order = ["Original Implementation", "MobGap", "MobGap (original peak)"]

fig, ax = plt.subplots()
sns.boxplot(
    data=results_long,
    x="algo",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    ax=ax,
)
fig.show()

perf_metrics_all = results_long.pipe(
    multilevel_groupby_apply_merge,
    [
        (
            ["algo", "version"],
            partial(apply_aggregations, aggregations=custom_aggs),
        ),
        (
            ["algo"],
            partial(apply_transformations, transformations=stats_transform),
        ),
    ],
).pipe(format_results)
perf_metrics_all.style.pipe(
    revalidation_table_styles,
    validation_thresholds,
    ["algo"],
)

		General	GSD					GS duration
		# recordings	Recall	Precision	F1 Score	Accuracy	Specificity	INDIP mean and CI [s]	WD mean and CI [s]	Bias and LoA [s]	Abs. Error [s]	Abs. Rel. Error [%]	ICC
algo	version
GsdAdaptiveIonescu	MobGap	101	0.93 [0.91, 0.95]	0.67 [0.62, 0.72]	0.74 [0.70, 0.78]	0.93 [0.92, 0.94]	0.94 [0.92, 0.95]	1264.26 [1088.67, 1439.85]	1695.22 [1548.42, 1842.02]	430.95 [-986.86, 1848.76]	560.23 [437.84, 682.63]	138.91 [36.10, 241.72]	0.55 [0.27, 0.72]
GsdAdaptiveIonescu	Original Implementation	101	0.89 [0.86, 0.92]	0.70 [0.65, 0.75]	0.75 [0.70, 0.79]	0.94 [0.93, 0.95]	0.95 [0.94, 0.96]	1264.26 [1088.67, 1439.85]	1514.09 [1370.66, 1657.53]	249.83 [-1067.83, 1567.49]	461.74 [354.98, 568.50]	122.52 [26.84, 218.21]	0.64 [0.48, 0.75]
GsdIluz	MobGap	101	0.91 [0.88, 0.94]^**	0.74 [0.71, 0.77]	0.81 [0.78, 0.83]	0.95 [0.94, 0.96]	0.95 [0.95, 0.96]^**	1264.26 [1088.67, 1439.85]	1544.08 [1348.16, 1739.99]	279.81 [-321.86, 881.49]	315.00 [262.27, 367.72]	30.86 [24.93, 36.80]^**	0.91 [0.59, 0.97]
	MobGap (original peak)	101	0.71 [0.66, 0.75]^**	0.87 [0.82, 0.91]^**	0.77 [0.72, 0.81]	0.96 [0.95, 0.97]	0.99 [0.99, 0.99]^**	1264.26 [1088.67, 1439.85]	1043.89 [871.73, 1216.04]	-220.38 [-781.49, 340.73]	241.42 [189.04, 293.80]	23.80 [19.28, 28.32]	0.92 [0.72, 0.97]
	Original Implementation	101	0.82 [0.77, 0.86]	0.78 [0.75, 0.82]	0.79 [0.75, 0.83]	0.96 [0.95, 0.96]	0.97 [0.96, 0.97]	1264.26 [1088.67, 1439.85]	1315.86 [1127.03, 1504.69]	51.59 [-554.17, 657.36]	185.98 [136.93, 235.04]	18.72 [14.29, 23.15]	0.94 [0.92, 0.96]
GsdIonescu	MobGap	101	0.85 [0.81, 0.88]	0.82 [0.79, 0.85]	0.82 [0.79, 0.85]	0.96 [0.96, 0.97]	0.97 [0.97, 0.98]	1264.26 [1088.67, 1439.85]	1322.33 [1135.84, 1508.83]	58.07 [-501.45, 617.59]	172.65 [127.00, 218.30]	17.65 [13.71, 21.59]	0.95 [0.93, 0.97]
GsdIonescu	Original Implementation	101	0.82 [0.78, 0.86]	0.82 [0.79, 0.85]	0.81 [0.77, 0.84]	0.96 [0.95, 0.97]	0.98 [0.97, 0.98]	1264.26 [1088.67, 1439.85]	1262.37 [1078.65, 1446.09]	-1.90 [-620.40, 616.61]	168.66 [116.74, 220.57]	16.98 [12.83, 21.12]	0.94 [0.92, 0.96]

Per Cohort

While this provides a good overview, it does not fully reflect how these algorithms perform on the different cohorts.

fig, ax = plt.subplots()
sns.boxplot(
    data=results_long, x="cohort", y="f1_score", hue="algo_with_version", ax=ax
)
fig.show()

perf_metrics_per_cohort = (
    results_long.pipe(
        multilevel_groupby_apply_merge,
        [
            (
                ["cohort", "algo", "version"],
                partial(apply_aggregations, aggregations=custom_aggs),
            ),
            (
                ["cohort", "algo"],
                partial(apply_transformations, transformations=stats_transform),
            ),
        ],
    )
    .pipe(format_results)
    .loc[cohort_order]
)
perf_metrics_per_cohort.style.pipe(
    revalidation_table_styles,
    validation_thresholds,
    ["cohort", "algo"],
)

			General	GSD					GS duration
			# recordings	Recall	Precision	F1 Score	Accuracy	Specificity	INDIP mean and CI [s]	WD mean and CI [s]	Bias and LoA [s]	Abs. Error [s]	Abs. Rel. Error [%]	ICC
cohort	algo	version
HA	GsdAdaptiveIonescu	MobGap	20	0.90 [0.87, 0.93]	0.80 [0.72, 0.87]	0.83 [0.79, 0.87]	0.94 [0.92, 0.96]	0.95 [0.93, 0.98]	1684.35 [1352.20, 2016.50]	1878.34 [1610.60, 2146.08]	193.99 [-894.88, 1282.86]	408.54 [226.49, 590.60]	30.09 [14.40, 45.78]	0.66 [0.33, 0.85]
	GsdAdaptiveIonescu	Original Implementation	20	0.82 [0.73, 0.91]	0.79 [0.68, 0.90]	0.79 [0.70, 0.88]	0.94 [0.92, 0.95]	0.97 [0.95, 0.99]	1684.35 [1352.20, 2016.50]	1638.47 [1329.01, 1947.92]	-45.88 [-1217.67, 1125.92]	423.40 [242.20, 604.60]	29.19 [15.19, 43.19]	0.68 [0.34, 0.86]
	GsdIluz	MobGap	20	0.94 [0.92, 0.96]^*	0.77 [0.73, 0.82]^*	0.85 [0.81, 0.88]	0.94 [0.93, 0.95]	0.94 [0.93, 0.95]^**	1684.35 [1352.20, 2016.50]	2021.26 [1679.70, 2362.82]	336.91 [-72.79, 746.62]	336.91 [245.30, 428.53]	22.78 [15.88, 29.67]^*	0.88 [0.03, 0.97]
		MobGap (original peak)	20	0.77 [0.71, 0.82]	0.93 [0.92, 0.95]^**	0.84 [0.80, 0.87]	0.95 [0.95, 0.96]	0.99 [0.99, 0.99]^**	1684.35 [1352.20, 2016.50]	1405.44 [1066.12, 1744.76]	-278.91 [-630.01, 72.19]	278.91 [200.40, 357.42]	18.17 [12.75, 23.59]	0.91 [0.10, 0.98]
		Original Implementation	20	0.84 [0.77, 0.92]	0.83 [0.80, 0.87]	0.83 [0.77, 0.88]	0.95 [0.94, 0.96]	0.97 [0.96, 0.97]	1684.35 [1352.20, 2016.50]	1709.24 [1341.94, 2076.54]	24.89 [-579.36, 629.15]	174.99 [64.57, 285.40]	12.13 [4.64, 19.63]	0.93 [0.83, 0.97]
	GsdIonescu	MobGap	20	0.88 [0.84, 0.91]	0.86 [0.83, 0.90]	0.87 [0.84, 0.89]	0.96 [0.95, 0.97]	0.97 [0.97, 0.98]	1684.35 [1352.20, 2016.50]	1703.38 [1363.55, 2043.21]	19.03 [-333.52, 371.58]	133.16 [81.17, 185.15]	9.06 [5.31, 12.81]	0.97 [0.93, 0.99]
	GsdIonescu	Original Implementation	20	0.82 [0.72, 0.91]	0.83 [0.74, 0.92]	0.82 [0.73, 0.91]	0.95 [0.94, 0.96]	0.98 [0.97, 0.98]	1684.35 [1352.20, 2016.50]	1575.46 [1206.80, 1944.13]	-108.89 [-850.55, 632.78]	205.32 [59.11, 351.53]	13.78 [4.24, 23.32]	0.89 [0.74, 0.95]
CHF	GsdAdaptiveIonescu	MobGap	10	0.94 [0.90, 0.99]	0.65 [0.50, 0.81]	0.74 [0.64, 0.84]	0.93 [0.90, 0.96]	0.94 [0.90, 0.98]	1552.02 [579.11, 2524.93]	1994.35 [1245.80, 2742.90]	442.33 [-1143.88, 2028.54]	684.85 [317.53, 1052.17]	79.60 [28.46, 130.75]	0.81 [0.40, 0.95]
	GsdAdaptiveIonescu	Original Implementation	10	0.92 [0.87, 0.97]	0.69 [0.55, 0.84]	0.76 [0.67, 0.85]	0.93 [0.91, 0.96]	0.95 [0.92, 0.98]	1552.02 [579.11, 2524.93]	1832.00 [1103.66, 2560.34]	279.98 [-1164.23, 1724.18]	571.66 [251.44, 891.88]	63.23 [20.64, 105.82]	0.85 [0.54, 0.96]
	GsdIluz	MobGap	10	0.95 [0.93, 0.98]	0.76 [0.67, 0.85]	0.84 [0.77, 0.90]	0.96 [0.94, 0.97]	0.95 [0.92, 0.98]	1552.02 [579.11, 2524.93]	1874.87 [849.58, 2900.16]	322.85 [-192.66, 838.36]	322.85 [159.83, 485.87]	31.12 [12.65, 49.59]	0.97 [0.55, 0.99]
		MobGap (original peak)	10	0.80 [0.74, 0.85]	0.93 [0.89, 0.96]^*	0.86 [0.81, 0.90]	0.97 [0.96, 0.98]	0.99 [0.98, 1.00]	1552.02 [579.11, 2524.93]	1413.81 [431.42, 2396.20]	-138.21 [-275.80, -0.62]	138.21 [94.70, 181.72]	14.04 [9.83, 18.26]	1.00 [0.65, 1.00]
		Original Implementation	10	0.84 [0.70, 0.98]	0.81 [0.74, 0.89]	0.81 [0.69, 0.93]	0.96 [0.94, 0.98]	0.97 [0.94, 0.99]	1552.02 [579.11, 2524.93]	1595.96 [555.96, 2635.96]	43.94 [-693.30, 781.18]	239.58 [64.44, 414.73]	20.25 [5.65, 34.86]	0.98 [0.91, 0.99]
	GsdIonescu	MobGap	10	0.91 [0.89, 0.94]	0.80 [0.71, 0.89]	0.85 [0.79, 0.91]	0.96 [0.94, 0.98]	0.96 [0.94, 0.98]	1552.02 [579.11, 2524.93]	1739.49 [749.21, 2729.77]	187.47 [-333.59, 708.53]	200.00 [41.64, 358.36]	19.03 [1.23, 36.82]	0.98 [0.90, 1.00]
	GsdIonescu	Original Implementation	10	0.90 [0.88, 0.93]	0.82 [0.73, 0.90]	0.85 [0.79, 0.91]	0.96 [0.94, 0.98]	0.97 [0.95, 0.99]	1552.02 [579.11, 2524.93]	1690.52 [709.69, 2671.35]	138.50 [-334.17, 611.17]	173.73 [40.90, 306.57]	16.84 [1.81, 31.87]	0.99 [0.94, 1.00]
COPD	GsdAdaptiveIonescu	MobGap	17	0.94 [0.88, 1.00]	0.63 [0.57, 0.70]	0.74 [0.69, 0.79]	0.92 [0.90, 0.95]	0.93 [0.91, 0.95]	1124.95 [889.96, 1359.93]	1614.72 [1392.44, 1836.99]	489.77 [-388.13, 1367.67]	569.49 [411.19, 727.80]	59.12 [39.86, 78.39]	0.38 [-0.10, 0.73]
	GsdAdaptiveIonescu	Original Implementation	17	0.90 [0.84, 0.96]	0.68 [0.62, 0.75]	0.76 [0.71, 0.82]	0.94 [0.91, 0.96]	0.95 [0.93, 0.96]	1124.95 [889.96, 1359.93]	1417.39 [1233.14, 1601.64]	292.44 [-531.89, 1116.78]	418.60 [282.83, 554.37]	42.60 [27.67, 57.52]	0.46 [0.01, 0.76]
	GsdIluz	MobGap	17	0.87 [0.79, 0.95]	0.69 [0.63, 0.75]	0.76 [0.69, 0.82]	0.93 [0.91, 0.96]	0.95 [0.93, 0.97]	1124.95 [889.96, 1359.93]	1353.32 [1147.54, 1559.10]	228.37 [-821.39, 1278.13]	413.10 [222.39, 603.82]	37.82 [21.48, 54.16]	0.31 [-0.13, 0.67]
		MobGap (original peak)	17	0.63 [0.56, 0.71]^**	0.88 [0.81, 0.95]^*	0.72 [0.64, 0.79]	0.94 [0.92, 0.97]	0.99 [0.97, 1.00]^*	1124.95 [889.96, 1359.93]	790.23 [631.57, 948.88]	-334.72 [-1397.73, 728.29]	419.70 [193.70, 645.71]	34.04 [23.32, 44.76]	0.14 [-0.22, 0.52]
		Original Implementation	17	0.80 [0.73, 0.88]	0.75 [0.69, 0.82]	0.76 [0.69, 0.83]	0.94 [0.92, 0.97]	0.96 [0.95, 0.98]	1124.95 [889.96, 1359.93]	1152.40 [971.15, 1333.64]	27.45 [-991.47, 1046.37]	264.08 [53.11, 475.05]	22.47 [6.94, 38.00]	0.32 [-0.20, 0.69]
	GsdIonescu	MobGap	17	0.83 [0.75, 0.91]	0.79 [0.76, 0.82]	0.79 [0.73, 0.86]	0.95 [0.93, 0.97]	0.97 [0.97, 0.98]	1124.95 [889.96, 1359.93]	1112.17 [921.13, 1303.21]	-12.77 [-980.07, 954.52]	219.68 [11.13, 428.22]	14.89 [6.70, 23.08]	0.41 [-0.09, 0.74]
	GsdIonescu	Original Implementation	17	0.80 [0.72, 0.88]	0.80 [0.78, 0.83]	0.79 [0.73, 0.85]	0.95 [0.93, 0.97]	0.98 [0.97, 0.98]	1124.95 [889.96, 1359.93]	1059.43 [880.85, 1238.01]	-65.52 [-1017.04, 886.00]	175.11 [-41.52, 391.73]	10.46 [1.97, 18.95]	0.40 [-0.10, 0.73]
MS	GsdAdaptiveIonescu	MobGap	18	0.95 [0.93, 0.98]	0.70 [0.58, 0.82]	0.77 [0.67, 0.87]	0.94 [0.92, 0.96]	0.95 [0.92, 0.97]	1251.49 [877.21, 1625.78]	1587.22 [1314.72, 1859.71]	335.72 [-473.80, 1145.25]	394.59 [231.21, 557.97]	93.47 [11.78, 175.15]	0.75 [0.26, 0.91]
	GsdAdaptiveIonescu	Original Implementation	18	0.93 [0.90, 0.95]	0.73 [0.61, 0.85]	0.78 [0.69, 0.88]	0.95 [0.93, 0.96]	0.96 [0.93, 0.98]	1251.49 [877.21, 1625.78]	1474.04 [1215.71, 1732.36]	222.54 [-583.23, 1028.32]	335.31 [187.48, 483.15]	81.38 [8.11, 154.65]	0.79 [0.49, 0.92]
	GsdIluz	MobGap	18	0.96 [0.92, 0.99]	0.71 [0.66, 0.77]^*	0.81 [0.77, 0.85]	0.95 [0.94, 0.96]^*	0.95 [0.93, 0.96]^*	1251.49 [877.21, 1625.78]	1631.56 [1174.19, 2088.93]	380.07 [-97.44, 857.57]	380.07 [267.52, 492.62]	37.78 [26.16, 49.41]^*	0.89 [0.05, 0.97]
		MobGap (original peak)	18	0.82 [0.74, 0.91]	0.91 [0.88, 0.93]^**	0.85 [0.79, 0.92]	0.97 [0.97, 0.98]^*	0.99 [0.98, 0.99]^**	1251.49 [877.21, 1625.78]	1185.03 [801.15, 1568.90]	-66.47 [-361.78, 228.84]	100.19 [40.39, 159.98]	11.65 [3.40, 19.89]	0.98 [0.95, 0.99]
		Original Implementation	18	0.92 [0.86, 0.97]	0.79 [0.74, 0.83]	0.84 [0.80, 0.89]	0.96 [0.96, 0.97]	0.96 [0.95, 0.97]	1251.49 [877.21, 1625.78]	1452.55 [1023.62, 1881.47]	201.05 [-185.38, 587.49]	228.60 [153.63, 303.58]	20.21 [14.84, 25.58]	0.95 [0.62, 0.99]
	GsdIonescu	MobGap	18	0.93 [0.89, 0.96]	0.79 [0.74, 0.84]	0.85 [0.81, 0.89]	0.96 [0.95, 0.97]	0.96 [0.95, 0.98]	1251.49 [877.21, 1625.78]	1463.24 [1011.32, 1915.16]	211.75 [-273.11, 696.61]	232.25 [127.35, 337.15]	20.34 [13.21, 27.46]	0.94 [0.68, 0.98]
	GsdIonescu	Original Implementation	18	0.91 [0.87, 0.95]	0.80 [0.75, 0.85]	0.85 [0.81, 0.89]	0.96 [0.96, 0.97]	0.97 [0.96, 0.98]	1251.49 [877.21, 1625.78]	1417.05 [981.70, 1852.40]	165.55 [-262.59, 593.69]	193.15 [104.00, 282.29]	17.22 [10.54, 23.91]	0.95 [0.79, 0.99]
PD	GsdAdaptiveIonescu	MobGap	19	0.94 [0.90, 0.97]	0.70 [0.57, 0.83]	0.75 [0.65, 0.86]	0.94 [0.92, 0.97]	0.95 [0.92, 0.98]	1157.40 [701.98, 1612.83]	1417.78 [1054.61, 1780.95]	260.38 [-1176.91, 1697.67]	466.07 [188.87, 743.26]	124.15 [7.36, 240.93]	0.66 [0.32, 0.85]
	GsdAdaptiveIonescu	Original Implementation	19	0.92 [0.88, 0.96]	0.73 [0.60, 0.86]	0.76 [0.66, 0.86]	0.95 [0.92, 0.97]	0.96 [0.94, 0.98]	1157.40 [701.98, 1612.83]	1327.51 [984.69, 1670.32]	170.10 [-1091.46, 1431.67]	413.48 [182.28, 644.68]	108.13 [9.44, 206.83]	0.74 [0.45, 0.89]
	GsdIluz	MobGap	19	0.95 [0.93, 0.98]^*	0.79 [0.72, 0.85]	0.86 [0.80, 0.91]	0.96 [0.95, 0.98]	0.96 [0.94, 0.98]	1157.40 [701.98, 1612.83]	1360.28 [850.90, 1869.65]	202.88 [-169.75, 575.51]	204.11 [119.25, 288.97]	27.15 [9.48, 44.83]	0.97 [0.70, 0.99]
		MobGap (original peak)	19	0.75 [0.64, 0.86]	0.90 [0.80, 1.00]	0.81 [0.71, 0.91]	0.97 [0.96, 0.98]	0.99 [0.99, 1.00]^**	1157.40 [701.98, 1612.83]	979.61 [540.22, 1418.99]	-177.80 [-477.08, 121.49]	181.69 [115.23, 248.14]	21.79 [11.93, 31.65]	0.97 [0.68, 0.99]
		Original Implementation	19	0.86 [0.78, 0.93]	0.84 [0.77, 0.91]	0.84 [0.77, 0.91]	0.97 [0.96, 0.98]	0.98 [0.97, 0.99]	1157.40 [701.98, 1612.83]	1177.75 [691.49, 1664.01]	20.35 [-271.37, 312.07]	103.50 [55.74, 151.26]	13.38 [8.08, 18.68]	0.99 [0.98, 1.00]
	GsdIonescu	MobGap	19	0.86 [0.79, 0.93]	0.85 [0.77, 0.92]	0.84 [0.78, 0.91]	0.97 [0.96, 0.98]	0.98 [0.97, 0.99]	1157.40 [701.98, 1612.83]	1181.56 [689.78, 1673.34]	24.16 [-341.15, 389.47]	139.33 [84.52, 194.15]	20.05 [10.65, 29.46]	0.98 [0.96, 0.99]
	GsdIonescu	Original Implementation	19	0.84 [0.76, 0.91]	0.86 [0.79, 0.94]	0.84 [0.77, 0.90]	0.97 [0.96, 0.98]	0.98 [0.97, 0.99]	1157.40 [701.98, 1612.83]	1146.39 [664.16, 1628.61]	-11.02 [-359.25, 337.21]	136.08 [86.51, 185.64]	19.84 [10.58, 29.11]	0.99 [0.96, 0.99]
PFF	GsdAdaptiveIonescu	MobGap	17	0.91 [0.80, 1.02]	0.51 [0.36, 0.67]	0.60 [0.44, 0.75]	0.89 [0.85, 0.93]	0.89 [0.85, 0.93]	873.05 [560.20, 1185.89]	1808.75 [1401.95, 2215.54]	935.70 [-1174.63, 3046.03]	936.77 [425.41, 1448.13]	465.43 [-129.92, 1060.79]	0.00 [-0.23, 0.34]
	GsdAdaptiveIonescu	Original Implementation	17	0.89 [0.78, 1.00]	0.54 [0.38, 0.70]	0.61 [0.46, 0.76]	0.91 [0.88, 0.95]	0.92 [0.88, 0.96]	873.05 [560.20, 1185.89]	1528.41 [1119.58, 1937.25]	655.37 [-1282.37, 2593.10]	673.15 [209.26, 1137.05]	424.54 [-133.18, 982.26]	0.13 [-0.21, 0.51]
	GsdIluz	MobGap	17	0.80 [0.66, 0.93]	0.69 [0.58, 0.81]	0.73 [0.61, 0.84]	0.96 [0.95, 0.98]	0.96 [0.95, 0.98]	873.05 [560.20, 1185.89]	1091.64 [689.35, 1493.93]	218.60 [-326.07, 763.26]	241.51 [119.37, 363.66]	30.05 [13.41, 46.68]	0.90 [0.58, 0.97]
		MobGap (original peak)	17	0.48 [0.32, 0.64]	0.65 [0.44, 0.86]	0.54 [0.37, 0.72]	0.96 [0.94, 0.97]	0.99 [0.99, 1.00]^*	873.05 [560.20, 1185.89]	576.99 [339.50, 814.48]	-296.06 [-813.04, 220.93]	296.06 [170.67, 421.44]	42.12 [26.54, 57.70]	0.80 [0.11, 0.94]
		Original Implementation	17	0.63 [0.45, 0.80]	0.66 [0.51, 0.82]	0.62 [0.45, 0.79]	0.96 [0.95, 0.97]	0.98 [0.97, 0.99]	873.05 [560.20, 1185.89]	861.37 [515.16, 1207.58]	-11.68 [-383.47, 360.12]	136.35 [75.52, 197.18]	26.66 [11.19, 42.13]	0.96 [0.91, 0.99]
	GsdIonescu	MobGap	17	0.70 [0.56, 0.84]	0.79 [0.68, 0.91]	0.72 [0.59, 0.85]	0.97 [0.96, 0.98]	0.98 [0.98, 0.99]	873.05 [560.20, 1185.89]	846.95 [539.60, 1154.30]	-26.09 [-367.42, 315.23]	130.15 [75.84, 184.46]	24.57 [10.94, 38.19]	0.97 [0.91, 0.99]
	GsdIonescu	Original Implementation	17	0.68 [0.54, 0.82]	0.80 [0.69, 0.92]	0.71 [0.58, 0.84]	0.97 [0.96, 0.98]	0.99 [0.98, 0.99]	873.05 [560.20, 1185.89]	810.96 [518.48, 1103.43]	-62.09 [-408.99, 284.81]	126.58 [62.10, 191.05]	24.31 [10.75, 37.86]	0.96 [0.89, 0.98]

Per relevant cohort

The overview across all cohorts is useful, but it does not reflect how the GSD algorithms are used in our main pipeline. There, the HA, CHF, and COPD cohorts use the GsdIluz algorithm, while the MS, PD, and PFF cohorts use the GsdIonescu algorithm. Let’s look at the performance of these algorithms on the respective cohorts.

from mobgap.pipeline import MobilisedPipelineHealthy, MobilisedPipelineImpaired

low_impairment_algo = "GsdIluz"
low_impairment_cohorts = list(MobilisedPipelineHealthy().recommended_cohorts)

low_impairment_results = results_long[
    results_long["cohort"].isin(low_impairment_cohorts)
].query("algo == @low_impairment_algo")

fig, ax = plt.subplots()
sns.boxplot(
    data=low_impairment_results,
    x="cohort",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    ax=ax,
)
sns.boxplot(
    data=low_impairment_results,
    x="_combined",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    legend=False,
    ax=ax,
)
fig.suptitle(f"Low Impairment Cohorts ({low_impairment_algo})")
fig.show()

perf_metrics_per_cohort.copy().loc[
    pd.IndexSlice[low_impairment_cohorts, low_impairment_algo], :
].reset_index("algo", drop=True).style.pipe(
    revalidation_table_styles,
    validation_thresholds,
    ["cohort"],
)

		General	GSD					GS duration
		# recordings	Recall	Precision	F1 Score	Accuracy	Specificity	INDIP mean and CI [s]	WD mean and CI [s]	Bias and LoA [s]	Abs. Error [s]	Abs. Rel. Error [%]	ICC
cohort	version
HA	MobGap	20	0.94 [0.92, 0.96]^*	0.77 [0.73, 0.82]^*	0.85 [0.81, 0.88]	0.94 [0.93, 0.95]	0.94 [0.93, 0.95]^**	1684.35 [1352.20, 2016.50]	2021.26 [1679.70, 2362.82]	336.91 [-72.79, 746.62]	336.91 [245.30, 428.53]	22.78 [15.88, 29.67]^*	0.88 [0.03, 0.97]
	MobGap (original peak)	20	0.77 [0.71, 0.82]	0.93 [0.92, 0.95]^**	0.84 [0.80, 0.87]	0.95 [0.95, 0.96]	0.99 [0.99, 0.99]^**	1684.35 [1352.20, 2016.50]	1405.44 [1066.12, 1744.76]	-278.91 [-630.01, 72.19]	278.91 [200.40, 357.42]	18.17 [12.75, 23.59]	0.91 [0.10, 0.98]
	Original Implementation	20	0.84 [0.77, 0.92]	0.83 [0.80, 0.87]	0.83 [0.77, 0.88]	0.95 [0.94, 0.96]	0.97 [0.96, 0.97]	1684.35 [1352.20, 2016.50]	1709.24 [1341.94, 2076.54]	24.89 [-579.36, 629.15]	174.99 [64.57, 285.40]	12.13 [4.64, 19.63]	0.93 [0.83, 0.97]
COPD	MobGap	17	0.87 [0.79, 0.95]	0.69 [0.63, 0.75]	0.76 [0.69, 0.82]	0.93 [0.91, 0.96]	0.95 [0.93, 0.97]	1124.95 [889.96, 1359.93]	1353.32 [1147.54, 1559.10]	228.37 [-821.39, 1278.13]	413.10 [222.39, 603.82]	37.82 [21.48, 54.16]	0.31 [-0.13, 0.67]
	MobGap (original peak)	17	0.63 [0.56, 0.71]^**	0.88 [0.81, 0.95]^*	0.72 [0.64, 0.79]	0.94 [0.92, 0.97]	0.99 [0.97, 1.00]^*	1124.95 [889.96, 1359.93]	790.23 [631.57, 948.88]	-334.72 [-1397.73, 728.29]	419.70 [193.70, 645.71]	34.04 [23.32, 44.76]	0.14 [-0.22, 0.52]
	Original Implementation	17	0.80 [0.73, 0.88]	0.75 [0.69, 0.82]	0.76 [0.69, 0.83]	0.94 [0.92, 0.97]	0.96 [0.95, 0.98]	1124.95 [889.96, 1359.93]	1152.40 [971.15, 1333.64]	27.45 [-991.47, 1046.37]	264.08 [53.11, 475.05]	22.47 [6.94, 38.00]	0.32 [-0.20, 0.69]
CHF	MobGap	10	0.95 [0.93, 0.98]	0.76 [0.67, 0.85]	0.84 [0.77, 0.90]	0.96 [0.94, 0.97]	0.95 [0.92, 0.98]	1552.02 [579.11, 2524.93]	1874.87 [849.58, 2900.16]	322.85 [-192.66, 838.36]	322.85 [159.83, 485.87]	31.12 [12.65, 49.59]	0.97 [0.55, 0.99]
	MobGap (original peak)	10	0.80 [0.74, 0.85]	0.93 [0.89, 0.96]^*	0.86 [0.81, 0.90]	0.97 [0.96, 0.98]	0.99 [0.98, 1.00]	1552.02 [579.11, 2524.93]	1413.81 [431.42, 2396.20]	-138.21 [-275.80, -0.62]	138.21 [94.70, 181.72]	14.04 [9.83, 18.26]	1.00 [0.65, 1.00]
	Original Implementation	10	0.84 [0.70, 0.98]	0.81 [0.74, 0.89]	0.81 [0.69, 0.93]	0.96 [0.94, 0.98]	0.97 [0.94, 0.99]	1552.02 [579.11, 2524.93]	1595.96 [555.96, 2635.96]	43.94 [-693.30, 781.18]	239.58 [64.44, 414.73]	20.25 [5.65, 34.86]	0.98 [0.91, 0.99]

high_impairment_algo = "GsdIonescu"
high_impairment_cohorts = list(MobilisedPipelineImpaired().recommended_cohorts)

high_impairment_results = results_long[
    results_long["cohort"].isin(high_impairment_cohorts)
].query("algo == @high_impairment_algo")

hue_order = ["Original Implementation", "MobGap"]

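fig, ax = plt.subplots()
# One box per cohort and version.
sns.boxplot(
    data=high_impairment_results,
    x="cohort",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    ax=ax,
)
# Second pass on the same axes using the `_combined` column as x;
# the legend is suppressed to avoid a duplicate entry.
sns.boxplot(
    data=high_impairment_results,
    x="_combined",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    legend=False,
    ax=ax,
)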
fig.suptitle(f"High Impairment Cohorts ({high_impairment_algo})")
fig.show()

perf_metrics_per_cohort.copy().loc[
    pd.IndexSlice[high_impairment_cohorts, high_impairment_algo], :
].reset_index("algo", drop=True).style.pipe(
    revalidation_table_styles,
    validation_thresholds,
    ["cohort"],
)

		General	GSD					GS duration
		# recordings	Recall	Precision	F1 Score	Accuracy	Specificity	INDIP mean and CI [s]	WD mean and CI [s]	Bias and LoA [s]	Abs. Error [s]	Abs. Rel. Error [%]	ICC
cohort	version
PD	MobGap	19	0.86 [0.79, 0.93]	0.85 [0.77, 0.92]	0.84 [0.78, 0.91]	0.97 [0.96, 0.98]	0.98 [0.97, 0.99]	1157.40 [701.98, 1612.83]	1181.56 [689.78, 1673.34]	24.16 [-341.15, 389.47]	139.33 [84.52, 194.15]	20.05 [10.65, 29.46]	0.98 [0.96, 0.99]
PD	Original Implementation	19	0.84 [0.76, 0.91]	0.86 [0.79, 0.94]	0.84 [0.77, 0.90]	0.97 [0.96, 0.98]	0.98 [0.97, 0.99]	1157.40 [701.98, 1612.83]	1146.39 [664.16, 1628.61]	-11.02 [-359.25, 337.21]	136.08 [86.51, 185.64]	19.84 [10.58, 29.11]	0.99 [0.96, 0.99]
MS	MobGap	18	0.93 [0.89, 0.96]	0.79 [0.74, 0.84]	0.85 [0.81, 0.89]	0.96 [0.95, 0.97]	0.96 [0.95, 0.98]	1251.49 [877.21, 1625.78]	1463.24 [1011.32, 1915.16]	211.75 [-273.11, 696.61]	232.25 [127.35, 337.15]	20.34 [13.21, 27.46]	0.94 [0.68, 0.98]
MS	Original Implementation	18	0.91 [0.87, 0.95]	0.80 [0.75, 0.85]	0.85 [0.81, 0.89]	0.96 [0.96, 0.97]	0.97 [0.96, 0.98]	1251.49 [877.21, 1625.78]	1417.05 [981.70, 1852.40]	165.55 [-262.59, 593.69]	193.15 [104.00, 282.29]	17.22 [10.54, 23.91]	0.95 [0.79, 0.99]
PFF	MobGap	17	0.70 [0.56, 0.84]	0.79 [0.68, 0.91]	0.72 [0.59, 0.85]	0.97 [0.96, 0.98]	0.98 [0.98, 0.99]	873.05 [560.20, 1185.89]	846.95 [539.60, 1154.30]	-26.09 [-367.42, 315.23]	130.15 [75.84, 184.46]	24.57 [10.94, 38.19]	0.97 [0.91, 0.99]
PFF	Original Implementation	17	0.68 [0.54, 0.82]	0.80 [0.69, 0.92]	0.71 [0.58, 0.84]	0.97 [0.96, 0.98]	0.99 [0.98, 0.99]	873.05 [560.20, 1185.89]	810.96 [518.48, 1103.43]	-62.09 [-408.99, 284.81]	126.58 [62.10, 191.05]	24.31 [10.75, 37.86]	0.96 [0.89, 0.98]

Laboratory Comparison#

Every datapoint below is one trial of a test. Note that each datapoint is weighted equally in the calculation of the performance metrics. This is a limitation of this simple approach, as the number of strides per trial and the complexity of the context can vary significantly. For a full picture, different groups of tests should be analyzed separately. Even so, the approach below should provide a good overview for comparing the algorithms.
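To make the effect of equal weighting concrete, the following minimal sketch contrasts an unweighted mean of per-trial F1 scores with a mean weighted by the number of strides per trial. The trial values, the stride counts, and both helper functions are hypothetical and only serve as an illustration. The analysis on this page uses the equal-weighted approach and does not apply any stride weighting.

```python
# Hypothetical per-trial results: (f1_score, number of strides in the trial).
# These numbers are made up for illustration and are not taken from the TVS dataset.
trials = [
    (0.90, 40),
    (0.50, 4),
    (0.80, 16),
]


def unweighted_mean(trials):
    """Mean F1 score where every trial counts equally."""
    return sum(f1 for f1, _ in trials) / len(trials)


def stride_weighted_mean(trials):
    """Mean F1 score where each trial is weighted by its stride count (an assumption)."""
    total_strides = sum(n for _, n in trials)
    return sum(f1 * n for f1, n in trials) / total_strides


print(f"unweighted:      {unweighted_mean(trials):.3f}")
print(f"stride-weighted: {stride_weighted_mean(trials):.3f}")
```

In this made-up example the short trial with a low F1 score pulls the unweighted mean down noticeably, while it barely affects the stride-weighted mean. This is the kind of difference to keep in mind when reading the equal-weighted tables below.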

All results across all cohorts#

hue_order = ["Original Implementation", "MobGap", "MobGap (original peak)"]

fig, ax = plt.subplots()
sns.boxplot(
    data=lab_results_long,
    x="algo",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    ax=ax,
)
fig.show()

perf_metrics_all = lab_results_long.pipe(
    multilevel_groupby_apply_merge,
    [
        (
            ["cohort", "algo", "version"],
            partial(apply_aggregations, aggregations=custom_aggs),
        ),
        (
            ["cohort", "algo"],
            partial(apply_transformations, transformations=stats_transform),
        ),
    ],
).pipe(format_results)
perf_metrics_all.style.pipe(
    revalidation_table_styles,
    validation_thresholds,
    ["algo"],
)

			General	GSD					GS duration
			# recordings	Recall	Precision	F1 Score	Accuracy	Specificity	INDIP mean and CI [s]	WD mean and CI [s]	Bias and LoA [s]	Abs. Error [s]	Abs. Rel. Error [%]	ICC
cohort	algo	version
CHF	GsdAdaptiveIonescu	MobGap	106	0.82 [0.76, 0.88]	0.73 [0.67, 0.79]	0.76 [0.71, 0.82]	0.84 [0.82, 0.87]	0.83 [0.80, 0.86]	11.68 [9.20, 14.16]	12.88 [10.10, 15.66]	1.20 [-8.87, 11.27]	2.88 [2.04, 3.72]	26.68 [21.64, 31.73]	0.93 [0.89, 0.95]
	GsdAdaptiveIonescu	Original Implementation	106	0.78 [0.71, 0.84]	0.71 [0.65, 0.77]	0.73 [0.67, 0.79]	0.85 [0.83, 0.88]	0.85 [0.82, 0.88]	11.68 [9.20, 14.16]	11.74 [9.20, 14.28]	0.06 [-11.59, 11.70]	3.21 [2.26, 4.16]	30.60 [25.05, 36.14]	0.90 [0.86, 0.93]
	GsdIluz	MobGap	106	0.90 [0.85, 0.95]^**	0.61 [0.57, 0.66]	0.72 [0.67, 0.76]	0.72 [0.68, 0.76]^**	0.53 [0.48, 0.57]^**	11.68 [9.20, 14.16]	16.02 [13.29, 18.75]	4.33 [-4.51, 13.18]	4.61 [3.81, 5.42]	58.82 [49.28, 68.37]	0.90 [0.52, 0.96]
		MobGap (original peak)	106	0.78 [0.71, 0.86]	0.62 [0.56, 0.68]	0.67 [0.61, 0.73]	0.77 [0.73, 0.80]	0.70 [0.65, 0.74]	11.68 [9.20, 14.16]	11.24 [9.59, 12.89]	-0.44 [-18.32, 17.45]	4.93 [3.46, 6.39]	49.91 [41.96, 57.86]	0.66 [0.54, 0.76]
		Original Implementation	106	0.76 [0.69, 0.84]	0.60 [0.53, 0.66]	0.66 [0.59, 0.72]	0.79 [0.76, 0.82]	0.71 [0.67, 0.75]	11.68 [9.20, 14.16]	12.97 [10.41, 15.53]	1.29 [-7.91, 10.49]	3.40 [2.74, 4.06]	46.68 [39.14, 54.21]	0.93 [0.90, 0.96]
	GsdIonescu	MobGap	106	0.90 [0.84, 0.95]	0.69 [0.64, 0.73]	0.77 [0.72, 0.82]	0.82 [0.80, 0.84]	0.70 [0.67, 0.73]	11.68 [9.20, 14.16]	14.34 [11.74, 16.94]	2.65 [-3.01, 8.32]	3.00 [2.52, 3.48]	35.76 [31.02, 40.51]	0.96 [0.77, 0.98]
	GsdIonescu	Original Implementation	106	0.89 [0.84, 0.94]	0.69 [0.64, 0.73]	0.77 [0.72, 0.82]	0.82 [0.80, 0.84]	0.70 [0.67, 0.73]	11.68 [9.20, 14.16]	13.93 [11.57, 16.29]	2.25 [-4.95, 9.44]	3.11 [2.54, 3.67]	35.79 [31.09, 40.50]	0.94 [0.86, 0.97]
COPD	GsdAdaptiveIonescu	MobGap	214	0.75 [0.70, 0.81]	0.65 [0.60, 0.70]	0.69 [0.64, 0.74]	0.90 [0.88, 0.91]	0.89 [0.88, 0.91]	10.85 [9.36, 12.34]	13.44 [11.34, 15.55]	2.59 [-13.40, 18.57]	3.39 [2.34, 4.44]	33.24 [24.81, 41.66]	0.81 [0.73, 0.86]
	GsdAdaptiveIonescu	Original Implementation	214	0.78 [0.73, 0.83]	0.68 [0.63, 0.73]	0.71 [0.67, 0.76]	0.90 [0.88, 0.91]	0.89 [0.88, 0.91]	10.85 [9.36, 12.34]	13.04 [11.12, 14.97]	2.19 [-11.35, 15.73]	3.02 [2.14, 3.90]	28.96 [21.31, 36.62]	0.84 [0.78, 0.89]
	GsdIluz	MobGap	214	0.79 [0.73, 0.84]^**	0.54 [0.50, 0.58]	0.63 [0.58, 0.67]	0.78 [0.76, 0.80]^**	0.69 [0.67, 0.71]^**	10.85 [9.36, 12.34]	14.89 [13.32, 16.45]	4.03 [-4.30, 12.37]	4.52 [4.02, 5.02]	59.31 [52.47, 66.14]^**	0.88 [0.47, 0.95]
		MobGap (original peak)	214	0.66 [0.59, 0.72]	0.55 [0.50, 0.60]	0.58 [0.53, 0.63]	0.86 [0.85, 0.88]	0.85 [0.83, 0.87]	10.85 [9.36, 12.34]	10.47 [9.30, 11.63]	-0.39 [-14.05, 13.28]	3.78 [2.99, 4.56]	45.54 [40.19, 50.89]	0.76 [0.69, 0.81]
		Original Implementation	214	0.66 [0.60, 0.72]	0.53 [0.48, 0.58]	0.58 [0.53, 0.64]	0.87 [0.86, 0.88]	0.85 [0.83, 0.87]	10.85 [9.36, 12.34]	12.18 [10.58, 13.78]	1.33 [-6.72, 9.38]	2.78 [2.34, 3.22]	42.60 [37.14, 48.05]	0.93 [0.90, 0.95]
	GsdIonescu	MobGap	214	0.83 [0.78, 0.87]	0.62 [0.58, 0.66]	0.70 [0.66, 0.75]	0.85 [0.84, 0.87]	0.79 [0.77, 0.80]	10.85 [9.36, 12.34]	14.20 [12.52, 15.88]	3.34 [-4.85, 11.54]	3.52 [2.98, 4.06]	37.80 [33.73, 41.86]	0.90 [0.66, 0.96]
	GsdIonescu	Original Implementation	214	0.82 [0.77, 0.87]	0.63 [0.58, 0.67]	0.70 [0.66, 0.75]	0.85 [0.84, 0.87]	0.79 [0.77, 0.80]	10.85 [9.36, 12.34]	14.01 [12.41, 15.61]	3.16 [-4.84, 11.15]	3.44 [2.92, 3.95]	38.17 [33.66, 42.68]	0.90 [0.68, 0.96]
HA	GsdAdaptiveIonescu	MobGap	227	0.78 [0.73, 0.83]	0.65 [0.61, 0.70]	0.70 [0.65, 0.74]	0.86 [0.84, 0.87]^*	0.84 [0.82, 0.85]	9.34 [8.00, 10.67]	12.26 [10.27, 14.25]	2.92 [-11.31, 17.15]	3.51 [2.60, 4.42]	31.56 [26.73, 36.38]	0.82 [0.74, 0.88]
	GsdAdaptiveIonescu	Original Implementation	227	0.77 [0.72, 0.82]	0.64 [0.59, 0.68]	0.69 [0.64, 0.73]	0.88 [0.87, 0.89]	0.85 [0.84, 0.87]	9.34 [8.00, 10.67]	11.20 [9.53, 12.87]	1.86 [-8.25, 11.97]	2.71 [2.09, 3.33]	32.85 [27.62, 38.08]	0.89 [0.84, 0.92]
	GsdIluz	MobGap	227	0.80 [0.75, 0.85]^**	0.53 [0.49, 0.56]	0.63 [0.59, 0.67]	0.75 [0.73, 0.77]^**	0.61 [0.59, 0.64]^**	9.34 [8.00, 10.67]	13.53 [11.94, 15.11]	4.19 [-5.35, 13.73]	4.50 [3.90, 5.10]	60.68 [55.08, 66.28]^**	0.85 [0.48, 0.93]
		MobGap (original peak)	227	0.71 [0.65, 0.76]	0.55 [0.51, 0.60]	0.61 [0.56, 0.65]	0.83 [0.81, 0.84]	0.77 [0.74, 0.79]	9.34 [8.00, 10.67]	9.54 [8.60, 10.48]	0.20 [-12.55, 12.96]	3.82 [3.14, 4.51]	48.22 [43.50, 52.95]	0.73 [0.66, 0.79]
		Original Implementation	227	0.69 [0.64, 0.75]	0.53 [0.49, 0.57]	0.59 [0.55, 0.64]	0.84 [0.82, 0.85]	0.78 [0.76, 0.81]	9.34 [8.00, 10.67]	10.93 [9.45, 12.41]	1.59 [-6.00, 9.19]	2.71 [2.30, 3.13]	45.40 [40.48, 50.32]	0.93 [0.88, 0.95]
	GsdIonescu	MobGap	227	0.82 [0.77, 0.86]	0.61 [0.57, 0.65]	0.69 [0.65, 0.74]	0.83 [0.82, 0.84]	0.75 [0.73, 0.77]	9.34 [8.00, 10.67]	12.64 [11.02, 14.27]	3.31 [-4.64, 11.26]	3.43 [2.91, 3.94]	38.94 [35.03, 42.85]	0.90 [0.64, 0.96]
	GsdIonescu	Original Implementation	227	0.81 [0.77, 0.86]	0.62 [0.58, 0.65]	0.70 [0.65, 0.74]	0.83 [0.82, 0.85]	0.75 [0.73, 0.77]	9.34 [8.00, 10.67]	12.20 [10.75, 13.65]	2.86 [-2.95, 8.67]	3.02 [2.65, 3.38]	36.93 [33.25, 40.61]	0.93 [0.62, 0.97]
MS	GsdAdaptiveIonescu	MobGap	228	0.87 [0.83, 0.90]	0.79 [0.76, 0.82]	0.81 [0.78, 0.84]	0.86 [0.84, 0.88]	0.85 [0.83, 0.87]	12.85 [11.34, 14.37]	14.76 [12.53, 16.99]	1.91 [-17.00, 20.82]	3.89 [2.72, 5.06]	27.31 [22.62, 32.01]	0.78 [0.72, 0.83]
	GsdAdaptiveIonescu	Original Implementation	228	0.86 [0.83, 0.89]	0.77 [0.74, 0.81]	0.80 [0.77, 0.83]	0.86 [0.84, 0.87]	0.84 [0.82, 0.86]	12.85 [11.34, 14.37]	14.30 [12.31, 16.29]	1.45 [-14.66, 17.55]	3.71 [2.74, 4.68]	28.57 [23.99, 33.16]	0.81 [0.76, 0.85]
	GsdIluz	MobGap	228	0.95 [0.92, 0.97]^**	0.64 [0.62, 0.67]	0.75 [0.73, 0.78]	0.76 [0.74, 0.77]^**	0.54 [0.52, 0.57]^**	12.85 [11.34, 14.37]	18.26 [16.11, 20.41]	5.40 [-10.45, 21.25]	5.59 [4.56, 6.63]	57.94 [52.34, 63.54]^**	0.79 [0.53, 0.88]
		MobGap (original peak)	228	0.86 [0.82, 0.90]	0.70 [0.66, 0.73]	0.75 [0.72, 0.79]	0.81 [0.79, 0.83]	0.73 [0.70, 0.75]	12.85 [11.34, 14.37]	13.39 [12.07, 14.72]	0.54 [-13.52, 14.60]	4.10 [3.33, 4.87]	42.59 [37.75, 47.43]	0.79 [0.73, 0.83]
		Original Implementation	228	0.86 [0.82, 0.90]	0.66 [0.62, 0.70]	0.74 [0.70, 0.77]	0.81 [0.80, 0.83]	0.72 [0.69, 0.74]	12.85 [11.34, 14.37]	15.28 [13.40, 17.15]	2.42 [-11.07, 15.91]	4.04 [3.25, 4.83]	43.82 [38.94, 48.69]	0.85 [0.78, 0.89]
	GsdIonescu	MobGap	228	0.94 [0.91, 0.96]	0.72 [0.70, 0.75]	0.81 [0.78, 0.83]	0.83 [0.82, 0.85]	0.72 [0.70, 0.74]	12.85 [11.34, 14.37]	15.92 [14.06, 17.78]	3.07 [-9.41, 15.55]	3.87 [3.10, 4.64]	37.62 [33.46, 41.77]	0.86 [0.75, 0.91]
	GsdIonescu	Original Implementation	228	0.93 [0.90, 0.96]	0.72 [0.70, 0.75]	0.80 [0.78, 0.83]	0.83 [0.82, 0.85]	0.72 [0.70, 0.74]	12.85 [11.34, 14.37]	15.42 [13.72, 17.12]	2.56 [-8.23, 13.36]	3.60 [2.97, 4.24]	37.13 [33.12, 41.15]	0.88 [0.80, 0.93]
PD	GsdAdaptiveIonescu	MobGap	224	0.73 [0.68, 0.78]	0.67 [0.63, 0.72]	0.68 [0.64, 0.73]	0.84 [0.82, 0.86]	0.87 [0.85, 0.89]	11.83 [10.03, 13.62]	13.35 [11.13, 15.56]	1.52 [-14.61, 17.66]	4.04 [3.08, 5.00]	33.45 [27.88, 39.01]	0.85 [0.81, 0.89]
	GsdAdaptiveIonescu	Original Implementation	224	0.75 [0.70, 0.79]	0.69 [0.65, 0.74]	0.70 [0.66, 0.75]	0.86 [0.85, 0.88]	0.87 [0.85, 0.89]	11.83 [10.03, 13.62]	12.81 [10.78, 14.85]	0.99 [-12.59, 14.57]	3.43 [2.63, 4.23]	29.43 [24.73, 34.13]	0.89 [0.85, 0.91]
	GsdIluz	MobGap	224	0.85 [0.80, 0.89]^**	0.58 [0.55, 0.62]	0.68 [0.64, 0.72]	0.75 [0.73, 0.78]^**	0.60 [0.57, 0.63]^**	11.83 [10.03, 13.62]	16.53 [14.43, 18.64]	4.71 [-5.16, 14.58]	4.98 [4.36, 5.60]	54.21 [49.04, 59.38]^**	0.90 [0.54, 0.96]
		MobGap (original peak)	224	0.71 [0.65, 0.76]	0.60 [0.55, 0.65]	0.63 [0.58, 0.68]	0.82 [0.79, 0.84]	0.79 [0.77, 0.82]	11.83 [10.03, 13.62]	10.92 [9.76, 12.07]	-0.91 [-19.57, 17.75]	4.52 [3.42, 5.62]	42.79 [38.23, 47.36]	0.66 [0.58, 0.73]
		Original Implementation	224	0.72 [0.66, 0.77]	0.58 [0.53, 0.62]	0.63 [0.58, 0.68]	0.83 [0.82, 0.85]	0.79 [0.77, 0.82]	11.83 [10.03, 13.62]	13.25 [11.41, 15.10]	1.43 [-8.13, 10.99]	3.15 [2.62, 3.67]	40.07 [35.68, 44.45]	0.93 [0.91, 0.95]
	GsdIonescu	MobGap	224	0.84 [0.79, 0.88]	0.65 [0.61, 0.69]	0.73 [0.69, 0.77]	0.83 [0.82, 0.84]	0.74 [0.72, 0.76]	11.83 [10.03, 13.62]	14.68 [12.84, 16.53]	2.86 [-6.05, 11.77]	3.58 [3.06, 4.10]	35.07 [31.50, 38.64]	0.93 [0.82, 0.96]
	GsdIonescu	Original Implementation	224	0.83 [0.79, 0.87]	0.66 [0.62, 0.69]	0.72 [0.69, 0.76]	0.83 [0.81, 0.84]	0.74 [0.72, 0.76]	11.83 [10.03, 13.62]	14.23 [12.54, 15.92]	2.40 [-5.97, 10.78]	3.32 [2.85, 3.79]	33.51 [30.24, 36.78]	0.93 [0.86, 0.96]
PFF	GsdAdaptiveIonescu	MobGap	169	0.89 [0.86, 0.92]	0.81 [0.78, 0.84]	0.83 [0.80, 0.85]	0.84 [0.82, 0.86]	0.81 [0.78, 0.83]	17.62 [14.73, 20.51]	20.51 [16.49, 24.52]	2.89 [-25.49, 31.27]	6.45 [4.45, 8.45]	31.05 [25.52, 36.59]	0.80 [0.74, 0.85]
	GsdAdaptiveIonescu	Original Implementation	169	0.87 [0.84, 0.91]	0.79 [0.75, 0.82]	0.81 [0.78, 0.84]	0.82 [0.80, 0.84]	0.80 [0.77, 0.83]	17.62 [14.73, 20.51]	19.69 [16.04, 23.34]	2.07 [-22.44, 26.57]	6.08 [4.41, 7.76]	32.81 [27.38, 38.23]	0.83 [0.78, 0.87]
	GsdIluz	MobGap	169	0.93 [0.90, 0.95]^**	0.70 [0.67, 0.73]^*	0.77 [0.75, 0.80]^**	0.77 [0.75, 0.78]	0.59 [0.56, 0.62]^**	17.62 [14.73, 20.51]	20.54 [17.41, 23.68]	2.92 [-18.67, 24.51]	6.64 [5.25, 8.03]	50.80 [44.61, 57.00]	0.84 [0.78, 0.88]
		MobGap (original peak)	169	0.67 [0.60, 0.73]	0.58 [0.53, 0.64]	0.59 [0.54, 0.65]	0.72 [0.69, 0.75]	0.80 [0.77, 0.83]	17.62 [14.73, 20.51]	10.24 [8.73, 11.76]	-7.38 [-45.37, 30.62]	10.23 [7.51, 12.95]	56.51 [50.51, 62.51]	0.18 [0.03, 0.31]
		Original Implementation	169	0.74 [0.68, 0.79]	0.64 [0.59, 0.69]	0.66 [0.61, 0.71]	0.76 [0.73, 0.78]	0.78 [0.75, 0.81]	17.62 [14.73, 20.51]	12.86 [10.95, 14.78]	-4.76 [-39.26, 29.74]	8.42 [5.98, 10.86]	48.22 [42.13, 54.32]	0.40 [0.26, 0.52]
	GsdIonescu	MobGap	169	0.89 [0.86, 0.92]	0.80 [0.77, 0.82]	0.82 [0.80, 0.85]	0.84 [0.83, 0.86]	0.79 [0.77, 0.81]	17.62 [14.73, 20.51]	16.49 [14.20, 18.79]	-1.13 [-18.98, 16.73]	4.77 [3.59, 5.95]	30.17 [26.53, 33.81]	0.86 [0.82, 0.89]
	GsdIonescu	Original Implementation	169	0.87 [0.84, 0.91]	0.80 [0.78, 0.83]	0.81 [0.79, 0.84]	0.84 [0.82, 0.85]	0.80 [0.78, 0.82]	17.62 [14.73, 20.51]	15.48 [13.50, 17.46]	-2.14 [-22.17, 17.90]	5.17 [3.80, 6.53]	31.18 [27.31, 35.06]	0.80 [0.74, 0.85]

Per Cohort#

While the aggregated results above provide a good overview, they do not fully reflect how these algorithms perform on the different cohorts.

fig, ax = plt.subplots()
sns.boxplot(
    data=lab_results_long,
    x="cohort",
    y="f1_score",
    hue="algo_with_version",
    ax=ax,
)
fig.show()

perf_metrics_per_cohort = (
    lab_results_long.pipe(
        multilevel_groupby_apply_merge,
        [
            (
                ["cohort", "algo", "version"],
                partial(apply_aggregations, aggregations=custom_aggs),
            ),
            (
                ["cohort", "algo"],
                partial(apply_transformations, transformations=stats_transform),
            ),
        ],
    )
    .pipe(format_results)
    .loc[cohort_order]
)
perf_metrics_per_cohort.style.pipe(
    revalidation_table_styles,
    validation_thresholds,
    ["cohort", "algo"],
)

			General	GSD					GS duration
			# recordings	Recall	Precision	F1 Score	Accuracy	Specificity	INDIP mean and CI [s]	WD mean and CI [s]	Bias and LoA [s]	Abs. Error [s]	Abs. Rel. Error [%]	ICC
cohort	algo	version
HA	GsdAdaptiveIonescu	MobGap	227	0.78 [0.73, 0.83]	0.65 [0.61, 0.70]	0.70 [0.65, 0.74]	0.86 [0.84, 0.87]^*	0.84 [0.82, 0.85]	9.34 [8.00, 10.67]	12.26 [10.27, 14.25]	2.92 [-11.31, 17.15]	3.51 [2.60, 4.42]	31.56 [26.73, 36.38]	0.82 [0.74, 0.88]
	GsdAdaptiveIonescu	Original Implementation	227	0.77 [0.72, 0.82]	0.64 [0.59, 0.68]	0.69 [0.64, 0.73]	0.88 [0.87, 0.89]	0.85 [0.84, 0.87]	9.34 [8.00, 10.67]	11.20 [9.53, 12.87]	1.86 [-8.25, 11.97]	2.71 [2.09, 3.33]	32.85 [27.62, 38.08]	0.89 [0.84, 0.92]
	GsdIluz	MobGap	227	0.80 [0.75, 0.85]^**	0.53 [0.49, 0.56]	0.63 [0.59, 0.67]	0.75 [0.73, 0.77]^**	0.61 [0.59, 0.64]^**	9.34 [8.00, 10.67]	13.53 [11.94, 15.11]	4.19 [-5.35, 13.73]	4.50 [3.90, 5.10]	60.68 [55.08, 66.28]^**	0.85 [0.48, 0.93]
		MobGap (original peak)	227	0.71 [0.65, 0.76]	0.55 [0.51, 0.60]	0.61 [0.56, 0.65]	0.83 [0.81, 0.84]	0.77 [0.74, 0.79]	9.34 [8.00, 10.67]	9.54 [8.60, 10.48]	0.20 [-12.55, 12.96]	3.82 [3.14, 4.51]	48.22 [43.50, 52.95]	0.73 [0.66, 0.79]
		Original Implementation	227	0.69 [0.64, 0.75]	0.53 [0.49, 0.57]	0.59 [0.55, 0.64]	0.84 [0.82, 0.85]	0.78 [0.76, 0.81]	9.34 [8.00, 10.67]	10.93 [9.45, 12.41]	1.59 [-6.00, 9.19]	2.71 [2.30, 3.13]	45.40 [40.48, 50.32]	0.93 [0.88, 0.95]
	GsdIonescu	MobGap	227	0.82 [0.77, 0.86]	0.61 [0.57, 0.65]	0.69 [0.65, 0.74]	0.83 [0.82, 0.84]	0.75 [0.73, 0.77]	9.34 [8.00, 10.67]	12.64 [11.02, 14.27]	3.31 [-4.64, 11.26]	3.43 [2.91, 3.94]	38.94 [35.03, 42.85]	0.90 [0.64, 0.96]
	GsdIonescu	Original Implementation	227	0.81 [0.77, 0.86]	0.62 [0.58, 0.65]	0.70 [0.65, 0.74]	0.83 [0.82, 0.85]	0.75 [0.73, 0.77]	9.34 [8.00, 10.67]	12.20 [10.75, 13.65]	2.86 [-2.95, 8.67]	3.02 [2.65, 3.38]	36.93 [33.25, 40.61]	0.93 [0.62, 0.97]
CHF	GsdAdaptiveIonescu	MobGap	106	0.82 [0.76, 0.88]	0.73 [0.67, 0.79]	0.76 [0.71, 0.82]	0.84 [0.82, 0.87]	0.83 [0.80, 0.86]	11.68 [9.20, 14.16]	12.88 [10.10, 15.66]	1.20 [-8.87, 11.27]	2.88 [2.04, 3.72]	26.68 [21.64, 31.73]	0.93 [0.89, 0.95]
	GsdAdaptiveIonescu	Original Implementation	106	0.78 [0.71, 0.84]	0.71 [0.65, 0.77]	0.73 [0.67, 0.79]	0.85 [0.83, 0.88]	0.85 [0.82, 0.88]	11.68 [9.20, 14.16]	11.74 [9.20, 14.28]	0.06 [-11.59, 11.70]	3.21 [2.26, 4.16]	30.60 [25.05, 36.14]	0.90 [0.86, 0.93]
	GsdIluz	MobGap	106	0.90 [0.85, 0.95]^**	0.61 [0.57, 0.66]	0.72 [0.67, 0.76]	0.72 [0.68, 0.76]^**	0.53 [0.48, 0.57]^**	11.68 [9.20, 14.16]	16.02 [13.29, 18.75]	4.33 [-4.51, 13.18]	4.61 [3.81, 5.42]	58.82 [49.28, 68.37]	0.90 [0.52, 0.96]
		MobGap (original peak)	106	0.78 [0.71, 0.86]	0.62 [0.56, 0.68]	0.67 [0.61, 0.73]	0.77 [0.73, 0.80]	0.70 [0.65, 0.74]	11.68 [9.20, 14.16]	11.24 [9.59, 12.89]	-0.44 [-18.32, 17.45]	4.93 [3.46, 6.39]	49.91 [41.96, 57.86]	0.66 [0.54, 0.76]
		Original Implementation	106	0.76 [0.69, 0.84]	0.60 [0.53, 0.66]	0.66 [0.59, 0.72]	0.79 [0.76, 0.82]	0.71 [0.67, 0.75]	11.68 [9.20, 14.16]	12.97 [10.41, 15.53]	1.29 [-7.91, 10.49]	3.40 [2.74, 4.06]	46.68 [39.14, 54.21]	0.93 [0.90, 0.96]
	GsdIonescu	MobGap	106	0.90 [0.84, 0.95]	0.69 [0.64, 0.73]	0.77 [0.72, 0.82]	0.82 [0.80, 0.84]	0.70 [0.67, 0.73]	11.68 [9.20, 14.16]	14.34 [11.74, 16.94]	2.65 [-3.01, 8.32]	3.00 [2.52, 3.48]	35.76 [31.02, 40.51]	0.96 [0.77, 0.98]
	GsdIonescu	Original Implementation	106	0.89 [0.84, 0.94]	0.69 [0.64, 0.73]	0.77 [0.72, 0.82]	0.82 [0.80, 0.84]	0.70 [0.67, 0.73]	11.68 [9.20, 14.16]	13.93 [11.57, 16.29]	2.25 [-4.95, 9.44]	3.11 [2.54, 3.67]	35.79 [31.09, 40.50]	0.94 [0.86, 0.97]
COPD	GsdAdaptiveIonescu	MobGap	214	0.75 [0.70, 0.81]	0.65 [0.60, 0.70]	0.69 [0.64, 0.74]	0.90 [0.88, 0.91]	0.89 [0.88, 0.91]	10.85 [9.36, 12.34]	13.44 [11.34, 15.55]	2.59 [-13.40, 18.57]	3.39 [2.34, 4.44]	33.24 [24.81, 41.66]	0.81 [0.73, 0.86]
	GsdAdaptiveIonescu	Original Implementation	214	0.78 [0.73, 0.83]	0.68 [0.63, 0.73]	0.71 [0.67, 0.76]	0.90 [0.88, 0.91]	0.89 [0.88, 0.91]	10.85 [9.36, 12.34]	13.04 [11.12, 14.97]	2.19 [-11.35, 15.73]	3.02 [2.14, 3.90]	28.96 [21.31, 36.62]	0.84 [0.78, 0.89]
	GsdIluz	MobGap	214	0.79 [0.73, 0.84]^**	0.54 [0.50, 0.58]	0.63 [0.58, 0.67]	0.78 [0.76, 0.80]^**	0.69 [0.67, 0.71]^**	10.85 [9.36, 12.34]	14.89 [13.32, 16.45]	4.03 [-4.30, 12.37]	4.52 [4.02, 5.02]	59.31 [52.47, 66.14]^**	0.88 [0.47, 0.95]
		MobGap (original peak)	214	0.66 [0.59, 0.72]	0.55 [0.50, 0.60]	0.58 [0.53, 0.63]	0.86 [0.85, 0.88]	0.85 [0.83, 0.87]	10.85 [9.36, 12.34]	10.47 [9.30, 11.63]	-0.39 [-14.05, 13.28]	3.78 [2.99, 4.56]	45.54 [40.19, 50.89]	0.76 [0.69, 0.81]
		Original Implementation	214	0.66 [0.60, 0.72]	0.53 [0.48, 0.58]	0.58 [0.53, 0.64]	0.87 [0.86, 0.88]	0.85 [0.83, 0.87]	10.85 [9.36, 12.34]	12.18 [10.58, 13.78]	1.33 [-6.72, 9.38]	2.78 [2.34, 3.22]	42.60 [37.14, 48.05]	0.93 [0.90, 0.95]
	GsdIonescu	MobGap	214	0.83 [0.78, 0.87]	0.62 [0.58, 0.66]	0.70 [0.66, 0.75]	0.85 [0.84, 0.87]	0.79 [0.77, 0.80]	10.85 [9.36, 12.34]	14.20 [12.52, 15.88]	3.34 [-4.85, 11.54]	3.52 [2.98, 4.06]	37.80 [33.73, 41.86]	0.90 [0.66, 0.96]
	GsdIonescu	Original Implementation	214	0.82 [0.77, 0.87]	0.63 [0.58, 0.67]	0.70 [0.66, 0.75]	0.85 [0.84, 0.87]	0.79 [0.77, 0.80]	10.85 [9.36, 12.34]	14.01 [12.41, 15.61]	3.16 [-4.84, 11.15]	3.44 [2.92, 3.95]	38.17 [33.66, 42.68]	0.90 [0.68, 0.96]
MS	GsdAdaptiveIonescu	MobGap	228	0.87 [0.83, 0.90]	0.79 [0.76, 0.82]	0.81 [0.78, 0.84]	0.86 [0.84, 0.88]	0.85 [0.83, 0.87]	12.85 [11.34, 14.37]	14.76 [12.53, 16.99]	1.91 [-17.00, 20.82]	3.89 [2.72, 5.06]	27.31 [22.62, 32.01]	0.78 [0.72, 0.83]
	GsdAdaptiveIonescu	Original Implementation	228	0.86 [0.83, 0.89]	0.77 [0.74, 0.81]	0.80 [0.77, 0.83]	0.86 [0.84, 0.87]	0.84 [0.82, 0.86]	12.85 [11.34, 14.37]	14.30 [12.31, 16.29]	1.45 [-14.66, 17.55]	3.71 [2.74, 4.68]	28.57 [23.99, 33.16]	0.81 [0.76, 0.85]
	GsdIluz	MobGap	228	0.95 [0.92, 0.97]^**	0.64 [0.62, 0.67]	0.75 [0.73, 0.78]	0.76 [0.74, 0.77]^**	0.54 [0.52, 0.57]^**	12.85 [11.34, 14.37]	18.26 [16.11, 20.41]	5.40 [-10.45, 21.25]	5.59 [4.56, 6.63]	57.94 [52.34, 63.54]^**	0.79 [0.53, 0.88]
		MobGap (original peak)	228	0.86 [0.82, 0.90]	0.70 [0.66, 0.73]	0.75 [0.72, 0.79]	0.81 [0.79, 0.83]	0.73 [0.70, 0.75]	12.85 [11.34, 14.37]	13.39 [12.07, 14.72]	0.54 [-13.52, 14.60]	4.10 [3.33, 4.87]	42.59 [37.75, 47.43]	0.79 [0.73, 0.83]
		Original Implementation	228	0.86 [0.82, 0.90]	0.66 [0.62, 0.70]	0.74 [0.70, 0.77]	0.81 [0.80, 0.83]	0.72 [0.69, 0.74]	12.85 [11.34, 14.37]	15.28 [13.40, 17.15]	2.42 [-11.07, 15.91]	4.04 [3.25, 4.83]	43.82 [38.94, 48.69]	0.85 [0.78, 0.89]
	GsdIonescu	MobGap	228	0.94 [0.91, 0.96]	0.72 [0.70, 0.75]	0.81 [0.78, 0.83]	0.83 [0.82, 0.85]	0.72 [0.70, 0.74]	12.85 [11.34, 14.37]	15.92 [14.06, 17.78]	3.07 [-9.41, 15.55]	3.87 [3.10, 4.64]	37.62 [33.46, 41.77]	0.86 [0.75, 0.91]
	GsdIonescu	Original Implementation	228	0.93 [0.90, 0.96]	0.72 [0.70, 0.75]	0.80 [0.78, 0.83]	0.83 [0.82, 0.85]	0.72 [0.70, 0.74]	12.85 [11.34, 14.37]	15.42 [13.72, 17.12]	2.56 [-8.23, 13.36]	3.60 [2.97, 4.24]	37.13 [33.12, 41.15]	0.88 [0.80, 0.93]
PD	GsdAdaptiveIonescu	MobGap	224	0.73 [0.68, 0.78]	0.67 [0.63, 0.72]	0.68 [0.64, 0.73]	0.84 [0.82, 0.86]	0.87 [0.85, 0.89]	11.83 [10.03, 13.62]	13.35 [11.13, 15.56]	1.52 [-14.61, 17.66]	4.04 [3.08, 5.00]	33.45 [27.88, 39.01]	0.85 [0.81, 0.89]
	GsdAdaptiveIonescu	Original Implementation	224	0.75 [0.70, 0.79]	0.69 [0.65, 0.74]	0.70 [0.66, 0.75]	0.86 [0.85, 0.88]	0.87 [0.85, 0.89]	11.83 [10.03, 13.62]	12.81 [10.78, 14.85]	0.99 [-12.59, 14.57]	3.43 [2.63, 4.23]	29.43 [24.73, 34.13]	0.89 [0.85, 0.91]
	GsdIluz	MobGap	224	0.85 [0.80, 0.89]^**	0.58 [0.55, 0.62]	0.68 [0.64, 0.72]	0.75 [0.73, 0.78]^**	0.60 [0.57, 0.63]^**	11.83 [10.03, 13.62]	16.53 [14.43, 18.64]	4.71 [-5.16, 14.58]	4.98 [4.36, 5.60]	54.21 [49.04, 59.38]^**	0.90 [0.54, 0.96]
		MobGap (original peak)	224	0.71 [0.65, 0.76]	0.60 [0.55, 0.65]	0.63 [0.58, 0.68]	0.82 [0.79, 0.84]	0.79 [0.77, 0.82]	11.83 [10.03, 13.62]	10.92 [9.76, 12.07]	-0.91 [-19.57, 17.75]	4.52 [3.42, 5.62]	42.79 [38.23, 47.36]	0.66 [0.58, 0.73]
		Original Implementation	224	0.72 [0.66, 0.77]	0.58 [0.53, 0.62]	0.63 [0.58, 0.68]	0.83 [0.82, 0.85]	0.79 [0.77, 0.82]	11.83 [10.03, 13.62]	13.25 [11.41, 15.10]	1.43 [-8.13, 10.99]	3.15 [2.62, 3.67]	40.07 [35.68, 44.45]	0.93 [0.91, 0.95]
	GsdIonescu	MobGap	224	0.84 [0.79, 0.88]	0.65 [0.61, 0.69]	0.73 [0.69, 0.77]	0.83 [0.82, 0.84]	0.74 [0.72, 0.76]	11.83 [10.03, 13.62]	14.68 [12.84, 16.53]	2.86 [-6.05, 11.77]	3.58 [3.06, 4.10]	35.07 [31.50, 38.64]	0.93 [0.82, 0.96]
	GsdIonescu	Original Implementation	224	0.83 [0.79, 0.87]	0.66 [0.62, 0.69]	0.72 [0.69, 0.76]	0.83 [0.81, 0.84]	0.74 [0.72, 0.76]	11.83 [10.03, 13.62]	14.23 [12.54, 15.92]	2.40 [-5.97, 10.78]	3.32 [2.85, 3.79]	33.51 [30.24, 36.78]	0.93 [0.86, 0.96]
PFF	GsdAdaptiveIonescu	MobGap	169	0.89 [0.86, 0.92]	0.81 [0.78, 0.84]	0.83 [0.80, 0.85]	0.84 [0.82, 0.86]	0.81 [0.78, 0.83]	17.62 [14.73, 20.51]	20.51 [16.49, 24.52]	2.89 [-25.49, 31.27]	6.45 [4.45, 8.45]	31.05 [25.52, 36.59]	0.80 [0.74, 0.85]
	GsdAdaptiveIonescu	Original Implementation	169	0.87 [0.84, 0.91]	0.79 [0.75, 0.82]	0.81 [0.78, 0.84]	0.82 [0.80, 0.84]	0.80 [0.77, 0.83]	17.62 [14.73, 20.51]	19.69 [16.04, 23.34]	2.07 [-22.44, 26.57]	6.08 [4.41, 7.76]	32.81 [27.38, 38.23]	0.83 [0.78, 0.87]
	GsdIluz	MobGap	169	0.93 [0.90, 0.95]^**	0.70 [0.67, 0.73]^*	0.77 [0.75, 0.80]^**	0.77 [0.75, 0.78]	0.59 [0.56, 0.62]^**	17.62 [14.73, 20.51]	20.54 [17.41, 23.68]	2.92 [-18.67, 24.51]	6.64 [5.25, 8.03]	50.80 [44.61, 57.00]	0.84 [0.78, 0.88]
		MobGap (original peak)	169	0.67 [0.60, 0.73]	0.58 [0.53, 0.64]	0.59 [0.54, 0.65]	0.72 [0.69, 0.75]	0.80 [0.77, 0.83]	17.62 [14.73, 20.51]	10.24 [8.73, 11.76]	-7.38 [-45.37, 30.62]	10.23 [7.51, 12.95]	56.51 [50.51, 62.51]	0.18 [0.03, 0.31]
		Original Implementation	169	0.74 [0.68, 0.79]	0.64 [0.59, 0.69]	0.66 [0.61, 0.71]	0.76 [0.73, 0.78]	0.78 [0.75, 0.81]	17.62 [14.73, 20.51]	12.86 [10.95, 14.78]	-4.76 [-39.26, 29.74]	8.42 [5.98, 10.86]	48.22 [42.13, 54.32]	0.40 [0.26, 0.52]
	GsdIonescu	MobGap	169	0.89 [0.86, 0.92]	0.80 [0.77, 0.82]	0.82 [0.80, 0.85]	0.84 [0.83, 0.86]	0.79 [0.77, 0.81]	17.62 [14.73, 20.51]	16.49 [14.20, 18.79]	-1.13 [-18.98, 16.73]	4.77 [3.59, 5.95]	30.17 [26.53, 33.81]	0.86 [0.82, 0.89]
	GsdIonescu	Original Implementation	169	0.87 [0.84, 0.91]	0.80 [0.78, 0.83]	0.81 [0.79, 0.84]	0.84 [0.82, 0.85]	0.80 [0.78, 0.82]	17.62 [14.73, 20.51]	15.48 [13.50, 17.46]	-2.14 [-22.17, 17.90]	5.17 [3.80, 6.53]	31.18 [27.31, 35.06]	0.80 [0.74, 0.85]

Per relevant cohort#

low_impairment_results = lab_results_long[
    lab_results_long["cohort"].isin(low_impairment_cohorts)
].query("algo == @low_impairment_algo")

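fig, ax = plt.subplots()
# One box per cohort and version.
sns.boxplot(
    data=low_impairment_results,
    x="cohort",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    ax=ax,
)
# Second pass on the same axes using the `_combined` column as x;
# the legend is suppressed to avoid a duplicate entry.
sns.boxplot(
    data=low_impairment_results,
    x="_combined",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    legend=False,
    ax=ax,
)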
fig.suptitle(f"Low Impairment Cohorts ({low_impairment_algo})")
fig.show()

perf_metrics_per_cohort.copy().loc[
    pd.IndexSlice[low_impairment_cohorts, low_impairment_algo], :
].reset_index("algo", drop=True).style.pipe(
    revalidation_table_styles,
    validation_thresholds,
    ["cohort"],
)

		General	GSD					GS duration
		# recordings	Recall	Precision	F1 Score	Accuracy	Specificity	INDIP mean and CI [s]	WD mean and CI [s]	Bias and LoA [s]	Abs. Error [s]	Abs. Rel. Error [%]	ICC
cohort	version
HA	MobGap	227	0.80 [0.75, 0.85]^**	0.53 [0.49, 0.56]	0.63 [0.59, 0.67]	0.75 [0.73, 0.77]^**	0.61 [0.59, 0.64]^**	9.34 [8.00, 10.67]	13.53 [11.94, 15.11]	4.19 [-5.35, 13.73]	4.50 [3.90, 5.10]	60.68 [55.08, 66.28]^**	0.85 [0.48, 0.93]
	MobGap (original peak)	227	0.71 [0.65, 0.76]	0.55 [0.51, 0.60]	0.61 [0.56, 0.65]	0.83 [0.81, 0.84]	0.77 [0.74, 0.79]	9.34 [8.00, 10.67]	9.54 [8.60, 10.48]	0.20 [-12.55, 12.96]	3.82 [3.14, 4.51]	48.22 [43.50, 52.95]	0.73 [0.66, 0.79]
	Original Implementation	227	0.69 [0.64, 0.75]	0.53 [0.49, 0.57]	0.59 [0.55, 0.64]	0.84 [0.82, 0.85]	0.78 [0.76, 0.81]	9.34 [8.00, 10.67]	10.93 [9.45, 12.41]	1.59 [-6.00, 9.19]	2.71 [2.30, 3.13]	45.40 [40.48, 50.32]	0.93 [0.88, 0.95]
COPD	MobGap	214	0.79 [0.73, 0.84]^**	0.54 [0.50, 0.58]	0.63 [0.58, 0.67]	0.78 [0.76, 0.80]^**	0.69 [0.67, 0.71]^**	10.85 [9.36, 12.34]	14.89 [13.32, 16.45]	4.03 [-4.30, 12.37]	4.52 [4.02, 5.02]	59.31 [52.47, 66.14]^**	0.88 [0.47, 0.95]
	MobGap (original peak)	214	0.66 [0.59, 0.72]	0.55 [0.50, 0.60]	0.58 [0.53, 0.63]	0.86 [0.85, 0.88]	0.85 [0.83, 0.87]	10.85 [9.36, 12.34]	10.47 [9.30, 11.63]	-0.39 [-14.05, 13.28]	3.78 [2.99, 4.56]	45.54 [40.19, 50.89]	0.76 [0.69, 0.81]
	Original Implementation	214	0.66 [0.60, 0.72]	0.53 [0.48, 0.58]	0.58 [0.53, 0.64]	0.87 [0.86, 0.88]	0.85 [0.83, 0.87]	10.85 [9.36, 12.34]	12.18 [10.58, 13.78]	1.33 [-6.72, 9.38]	2.78 [2.34, 3.22]	42.60 [37.14, 48.05]	0.93 [0.90, 0.95]
CHF	MobGap	106	0.90 [0.85, 0.95]^**	0.61 [0.57, 0.66]	0.72 [0.67, 0.76]	0.72 [0.68, 0.76]^**	0.53 [0.48, 0.57]^**	11.68 [9.20, 14.16]	16.02 [13.29, 18.75]	4.33 [-4.51, 13.18]	4.61 [3.81, 5.42]	58.82 [49.28, 68.37]	0.90 [0.52, 0.96]
	MobGap (original peak)	106	0.78 [0.71, 0.86]	0.62 [0.56, 0.68]	0.67 [0.61, 0.73]	0.77 [0.73, 0.80]	0.70 [0.65, 0.74]	11.68 [9.20, 14.16]	11.24 [9.59, 12.89]	-0.44 [-18.32, 17.45]	4.93 [3.46, 6.39]	49.91 [41.96, 57.86]	0.66 [0.54, 0.76]
	Original Implementation	106	0.76 [0.69, 0.84]	0.60 [0.53, 0.66]	0.66 [0.59, 0.72]	0.79 [0.76, 0.82]	0.71 [0.67, 0.75]	11.68 [9.20, 14.16]	12.97 [10.41, 15.53]	1.29 [-7.91, 10.49]	3.40 [2.74, 4.06]	46.68 [39.14, 54.21]	0.93 [0.90, 0.96]

high_impairment_results = lab_results_long[
    lab_results_long["cohort"].isin(high_impairment_cohorts)
].query("algo == @high_impairment_algo")

hue_order = ["Original Implementation", "MobGap"]

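fig, ax = plt.subplots()
# One box per cohort and version.
sns.boxplot(
    data=high_impairment_results,
    x="cohort",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    ax=ax,
)
# Second pass on the same axes using the `_combined` column as x;
# the legend is suppressed to avoid a duplicate entry.
sns.boxplot(
    data=high_impairment_results,
    x="_combined",
    y="f1_score",
    hue="version",
    hue_order=hue_order,
    legend=False,
    ax=ax,
)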
fig.suptitle(f"High Impairment Cohorts ({high_impairment_algo})")
fig.show()

perf_metrics_per_cohort.copy().loc[
    pd.IndexSlice[high_impairment_cohorts, high_impairment_algo], :
].reset_index("algo", drop=True).style.pipe(
    revalidation_table_styles,
    validation_thresholds,
    ["cohort"],
)

		General	GSD					GS duration
		# recordings	Recall	Precision	F1 Score	Accuracy	Specificity	INDIP mean and CI [s]	WD mean and CI [s]	Bias and LoA [s]	Abs. Error [s]	Abs. Rel. Error [%]	ICC
cohort	version
PD	MobGap	224	0.84 [0.79, 0.88]	0.65 [0.61, 0.69]	0.73 [0.69, 0.77]	0.83 [0.82, 0.84]	0.74 [0.72, 0.76]	11.83 [10.03, 13.62]	14.68 [12.84, 16.53]	2.86 [-6.05, 11.77]	3.58 [3.06, 4.10]	35.07 [31.50, 38.64]	0.93 [0.82, 0.96]
PD	Original Implementation	224	0.83 [0.79, 0.87]	0.66 [0.62, 0.69]	0.72 [0.69, 0.76]	0.83 [0.81, 0.84]	0.74 [0.72, 0.76]	11.83 [10.03, 13.62]	14.23 [12.54, 15.92]	2.40 [-5.97, 10.78]	3.32 [2.85, 3.79]	33.51 [30.24, 36.78]	0.93 [0.86, 0.96]
MS	MobGap	228	0.94 [0.91, 0.96]	0.72 [0.70, 0.75]	0.81 [0.78, 0.83]	0.83 [0.82, 0.85]	0.72 [0.70, 0.74]	12.85 [11.34, 14.37]	15.92 [14.06, 17.78]	3.07 [-9.41, 15.55]	3.87 [3.10, 4.64]	37.62 [33.46, 41.77]	0.86 [0.75, 0.91]
MS	Original Implementation	228	0.93 [0.90, 0.96]	0.72 [0.70, 0.75]	0.80 [0.78, 0.83]	0.83 [0.82, 0.85]	0.72 [0.70, 0.74]	12.85 [11.34, 14.37]	15.42 [13.72, 17.12]	2.56 [-8.23, 13.36]	3.60 [2.97, 4.24]	37.13 [33.12, 41.15]	0.88 [0.80, 0.93]
PFF	MobGap	169	0.89 [0.86, 0.92]	0.80 [0.77, 0.82]	0.82 [0.80, 0.85]	0.84 [0.83, 0.86]	0.79 [0.77, 0.81]	17.62 [14.73, 20.51]	16.49 [14.20, 18.79]	-1.13 [-18.98, 16.73]	4.77 [3.59, 5.95]	30.17 [26.53, 33.81]	0.86 [0.82, 0.89]
PFF	Original Implementation	169	0.87 [0.84, 0.91]	0.80 [0.78, 0.83]	0.81 [0.79, 0.84]	0.84 [0.82, 0.85]	0.80 [0.78, 0.82]	17.62 [14.73, 20.51]	15.48 [13.50, 17.46]	-2.14 [-22.17, 17.90]	5.17 [3.80, 6.53]	31.18 [27.31, 35.06]	0.80 [0.74, 0.85]

Total running time of the script: (0 minutes 24.870 seconds)

Estimated memory usage: 80 MB
