.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_revalidation/full_pipeline/_02_pipeline_cad.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end ` to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_revalidation_full_pipeline__02_pipeline_cad.py:

.. _pipeline_val_results:

Cadence estimation
==================

.. warning:: On this page you will find preliminary results for a standardized revalidation of the pipeline and all
    of its algorithms.
    The current state is **TECHNICAL EXPERIMENTATION**.
    Do not use these results or make any assumptions based on them.
    We will update this page incrementally and provide further information as soon as the state of any of the
    validation steps changes.

The following provides an analysis and comparison of the Mobilise-D algorithm pipeline on the
`Mobilise-D Technical Validation Study (TVS) dataset `_ for the estimation of cadence (free-living).
In this example, we look into the performance of the Python implementation of the pipeline compared to the
reference data.
We also compare the actual performance to that obtained by the original Matlab-based implementation [1]_.

.. [1] Kirk, C., Küderle, A., Micó-Amigo, M.E. et al. Mobilise-D insights to estimate real-world walking speed in
    multiple conditions with a wearable device. Sci Rep 14, 1754 (2024).
    https://doi.org/10.1038/s41598-024-51766-5

.. note:: If you are interested in how these results are calculated, head over to the
    :ref:`processing page `.

.. GENERATED FROM PYTHON SOURCE LINES 28-31

.. code-block:: Python

    from typing import Optional

.. GENERATED FROM PYTHON SOURCE LINES 32-35

The pipelines that are compared are listed below.
Note that we use "MobGap" to refer to the reimplemented Python algorithms and "Original Implementation" to refer to
the original Matlab-based implementation.

.. GENERATED FROM PYTHON SOURCE LINES 35-43

.. code-block:: Python

    algorithms = {
        "Official_MobiliseD_Pipeline": ("Mobilise-D Pipeline", "MobGap"),
        "EScience_MobiliseD_Pipeline": (
            "Mobilise-D Pipeline",
            "Original Implementation",
        ),
    }

.. GENERATED FROM PYTHON SOURCE LINES 44-51

The code below loads the data and prepares it for the analysis.
By default, the data will be downloaded from an online repository (and cached locally).
If you want to use a local copy of the data, set the `MOBGAP_VALIDATION_DATA_PATH` environment variable to its
location and set `MOBGAP_VALIDATION_USE_LOCAL_DATA` to `1`.

The file download will print some log messages, which can usually be ignored.
You can also change the `version` parameter to load a different version of the data.

.. GENERATED FROM PYTHON SOURCE LINES 51-184

.. code-block:: Python

    from pathlib import Path

    import pandas as pd
    from mobgap.data.validation_results import ValidationResultLoader
    from mobgap.utils.misc import get_env_var


    def format_loaded_results(
        values: dict[tuple[str, str], pd.DataFrame],
        index_cols: list[str],
        col_prefix_filter: Optional[str],
        convert_rel_error: bool = False,
    ) -> pd.DataFrame:
        formatted = (
            pd.concat(values, names=["algo", "version", *index_cols])
            .pipe(
                lambda df: df.filter(like=col_prefix_filter)
                if col_prefix_filter
                else df
            )
            .reset_index()
            .assign(
                algo_with_version=lambda df: df["algo"]
                + " ("
                + df["version"]
                + ")",
                _combined="combined",
            )
        )

        if col_prefix_filter:
            formatted.columns = formatted.columns.str.removeprefix(
                col_prefix_filter
            )

        if convert_rel_error:
            rel_cols = [c for c in formatted.columns if "rel_error" in c]
            formatted[rel_cols] = formatted[rel_cols] * 100

        return formatted


    local_data_path = (
        Path(get_env_var("MOBGAP_VALIDATION_DATA_PATH")) / "results"
        if int(get_env_var("MOBGAP_VALIDATION_USE_LOCAL_DATA", 0))
        else None
    )
    __RESULT_VERSION = "v0.11.0"
    loader = ValidationResultLoader(
        "full_pipeline", result_path=local_data_path, version=__RESULT_VERSION
    )

    # Loading free-living data
    free_living_index_cols = [
        "cohort",
        "participant_id",
        "time_measure",
        "recording",
        "recording_name",
        "recording_name_pretty",
    ]

    _free_living_results = {
        # Matched and aggregate/combined per-recording results for the 2.5 h free-living recordings
        v: loader.load_single_results(k, "free_living")
        for k, v in algorithms.items()
    }
    _free_living_results_raw = {
        # Matched per-WB results for the 2.5 h free-living recordings
        v: loader.load_single_csv_file(k, "free_living", "raw_matched_errors.csv")
        for k, v in algorithms.items()
    }
    free_living_results_combined = format_loaded_results(
        _free_living_results,
        free_living_index_cols,
        "combined__",
        convert_rel_error=True,
    )
    free_living_results_matched = format_loaded_results(
        _free_living_results,
        free_living_index_cols,
        "matched__",
        convert_rel_error=True,
    )
    free_living_results_matched_raw = format_loaded_results(
        values=_free_living_results_raw,
        index_cols=free_living_index_cols,
        col_prefix_filter=None,
        convert_rel_error=True,
    )
    del _free_living_results, _free_living_results_raw

    # Loading laboratory data
    laboratory_index_cols = [
        "cohort",
        "participant_id",
        "time_measure",
        "test",
        "trial",
        "test_name",
        "test_name_pretty",
    ]

    _laboratory_results = {
        # Matched and aggregate/combined per-recording results for the laboratory recordings
        v: loader.load_single_results(k, "laboratory")
        for k, v in algorithms.items()
    }
    _laboratory_results_raw = {
        # Matched per-WB results for the laboratory recordings
        v: loader.load_single_csv_file(k, "laboratory", "raw_matched_errors.csv")
        for k, v in algorithms.items()
    }
    laboratory_results_combined = format_loaded_results(
        _laboratory_results,
        laboratory_index_cols,
        "combined__",
        convert_rel_error=True,
    )
    laboratory_results_matched = format_loaded_results(
        _laboratory_results,
        laboratory_index_cols,
        "matched__",
        convert_rel_error=True,
    )
    laboratory_results_matched_raw = format_loaded_results(
        values=_laboratory_results_raw,
        index_cols=laboratory_index_cols,
        col_prefix_filter=None,
        convert_rel_error=True,
    )
    del _laboratory_results, _laboratory_results_raw

    cohort_order = ["HA", "CHF", "COPD", "MS", "PD", "PFF"]

.. GENERATED FROM PYTHON SOURCE LINES 185-193

Performance metrics
-------------------
Below you can find the setup for all performance metrics that we will calculate.
We only use the `single__` results for the comparison.

.. note:: For the evaluation of the full pipeline performance, two types of aggregation are performed,
    which will be described later on in the example.

.. GENERATED FROM PYTHON SOURCE LINES 193-358

.. code-block:: Python

    from functools import partial

    from mobgap.pipeline.evaluation import CustomErrorAggregations as A
    from mobgap.utils.df_operations import (
        CustomOperation,
        apply_aggregations,
        apply_transformations,
        multilevel_groupby_apply_merge,
    )
    from mobgap.utils.tables import FormatTransformer as F
    from mobgap.utils.tables import RevalidationInfo, revalidation_table_styles
    from mobgap.utils.tables import StatsFunctions as S

    custom_aggs_combined = [
        CustomOperation(
            identifier=None,
            function=A.n_datapoints,
            column_name=[("n_datapoints", "all")],
        ),
        ("cadence_spm__detected", ["mean", A.conf_intervals]),
        ("cadence_spm__reference", ["mean", A.conf_intervals]),
        ("cadence_spm__error", ["mean", A.loa]),
        ("cadence_spm__abs_error", ["mean", A.conf_intervals]),
        ("cadence_spm__rel_error", ["mean", A.conf_intervals]),
        ("cadence_spm__abs_rel_error", ["mean", A.conf_intervals]),
        CustomOperation(
            identifier=None,
            function=partial(
                A.icc,
                reference_col_name="cadence_spm__reference",
                detected_col_name="cadence_spm__detected",
                icc_type="icc2",
                # For the lab data, some trials have no results for the old algorithms.
                nan_policy="omit",
            ),
            column_name=[("icc", "all"), ("icc_ci", "all")],
        ),
    ]

    custom_aggs_matched = [
        CustomOperation(
            identifier=None,
            function=lambda df_: df_["n_matched_wbs"].sum(),
            column_name=[("n_wbs_matched", "all")],
        ),
        *custom_aggs_combined,
    ]

    stats_transform = [
        CustomOperation(
            identifier=None,
            function=partial(
                S.pairwise_tests,
                value_col=c,
                between="version",
                reference_group_key="Original Implementation",
            ),
            column_name=[("stats_metadata", c)],
        )
        for c in [
            "cadence_spm__abs_error",
            "cadence_spm__abs_rel_error",
        ]
    ]

    format_transforms_combined = [
        CustomOperation(
            identifier=None,
            function=lambda df_: df_[("n_datapoints", "all")].astype(int),
            column_name="n_datapoints",
        ),
        *(
            CustomOperation(
                identifier=None,
                function=partial(
                    F.value_with_metadata,
                    value_col=("mean", c),
                    other_columns={
                        "range": ("conf_intervals", c),
                        "stats_metadata": ("stats_metadata", c),
                    },
                ),
                column_name=c,
            )
            for c in [
                "cadence_spm__reference",
                "cadence_spm__detected",
                "cadence_spm__abs_error",
                "cadence_spm__rel_error",
                "cadence_spm__abs_rel_error",
            ]
        ),
        CustomOperation(
            identifier=None,
            function=partial(
                F.value_with_metadata,
                value_col=("mean", "cadence_spm__error"),
                other_columns={"range": ("loa", "cadence_spm__error")},
            ),
            column_name="cadence_spm__error",
        ),
        CustomOperation(
            identifier=None,
            function=partial(
                F.value_with_metadata,
                value_col=("icc", "all"),
                other_columns={"range": ("icc_ci", "all")},
            ),
            column_name="icc",
        ),
    ]

    format_transforms_matched = [
        CustomOperation(
            identifier=None,
            function=lambda df_: df_[("n_wbs_matched", "all")].astype(int),
            column_name="n_wbs_matched",
        ),
        *format_transforms_combined,
    ]

    final_names_combined = {
        "n_datapoints": "# participants",
        "cadence_spm__detected": "WD mean and CI [steps/min]",
        "cadence_spm__reference": "INDIP mean and CI [steps/min]",
        "cadence_spm__error": "Bias and LoA [steps/min]",
        "cadence_spm__abs_error": "Abs. Error [steps/min]",
        "cadence_spm__rel_error": "Rel. Error [%]",
        "cadence_spm__abs_rel_error": "Abs. Rel. Error [%]",
        "icc": "ICC",
    }
    final_names_matched = {
        **final_names_combined,
        "n_wbs_matched": "# Matched WBs",
    }

    validation_thresholds = {
        "Abs. Error [steps/min]": RevalidationInfo(
            threshold=None, higher_is_better=False
        ),
        "Abs. Rel. Error [%]": RevalidationInfo(
            threshold=20, higher_is_better=False
        ),
        "ICC": RevalidationInfo(threshold=0.7, higher_is_better=True),
    }


    def format_tables_combined(df: pd.DataFrame) -> pd.DataFrame:
        return (
            df.pipe(apply_transformations, format_transforms_combined)
            .rename(columns=final_names_combined)
            .loc[:, list(final_names_combined.values())]
        )


    def format_tables_matched(df: pd.DataFrame) -> pd.DataFrame:
        return (
            df.pipe(apply_transformations, format_transforms_matched)
            .rename(columns=final_names_matched)
            .loc[:, list(final_names_matched.values())]
        )

.. GENERATED FROM PYTHON SOURCE LINES 359-376

Free-living dataset
-------------------
Combined/Aggregated Evaluation
******************************
To mimic the actual use of a wearable device, where decisions are based on measures aggregated over a longer
measurement period rather than on individual WBs, our primary comparison is based on the median gait metrics over
the entire recording.
We call this the combined or aggregated evaluation.
For this, we combined all WBs of a datapoint by taking the median of the calculated cadence.
These combined values were then compared between the systems.

.. note:: In the free-living dataset, each datapoint represents one 2.5 h recording.

All results across all cohorts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The results below represent the average performance across all participants, independent of the cohort, in terms of
error, relative error, absolute error, and absolute relative error.

.. GENERATED FROM PYTHON SOURCE LINES 376-415

.. code-block:: Python

    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.set_context("talk")

    metrics = {
        "abs_rel_error": "Abs. Rel. Error (%)",
        "error": "Error (steps/min)",
        "rel_error": "Rel. Error (%)",
        "abs_error": "Abs. Error (steps/min)",
    }


    def multi_metric_plot(data, metrics, nrows, ncols):
        fig, axs = plt.subplots(
            nrows, ncols, sharex=True, figsize=(ncols * 6, nrows * 4 + 2)
        )

        for ax, (metric, metric_label) in zip(axs.flatten(), metrics.items()):
            overall_df = data[["version", f"cadence_spm__{metric}"]].rename(
                columns={f"cadence_spm__{metric}": metric_label}
            )

            sns.boxplot(
                data=overall_df, x="version", hue="version", y=metric_label, ax=ax
            )

            ax.set_title(metric_label)
            ax.set_ylabel(metric_label)
            ax.tick_params(axis="both", which="major")
            ax.tick_params(axis="both", which="minor")
            ax.grid(True)

        plt.tight_layout()
        plt.show()


    free_living_results_combined.pipe(multi_metric_plot, metrics, 2, 2)

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_001.png
   :alt: Abs. Rel. Error (%), Error (steps/min), Rel. Error (%), Abs. Error (steps/min)
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 416-435

.. code-block:: Python

    free_living_combined_perf_metrics_all = free_living_results_combined.pipe(
        multilevel_groupby_apply_merge,
        [
            (
                ["algo", "version"],
                partial(apply_aggregations, aggregations=custom_aggs_combined),
            ),
            (
                ["algo"],
                partial(apply_transformations, transformations=stats_transform),
            ),
        ],
    ).pipe(format_tables_combined)
    free_living_combined_perf_metrics_all.style.pipe(
        revalidation_table_styles,
        validation_thresholds,
        ["algo"],
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]

.. raw:: html
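
    <table>
      <thead>
        <tr><th>algo</th><th>version</th><th># participants</th><th>WD mean and CI [steps/min]</th><th>INDIP mean and CI [steps/min]</th><th>Bias and LoA [steps/min]</th><th>Abs. Error [steps/min]</th><th>Rel. Error [%]</th><th>Abs. Rel. Error [%]</th><th>ICC</th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>101</td><td>86.49 [85.26, 87.72]</td><td>85.78 [83.76, 87.79]</td><td>0.87 [-12.50, 14.23]</td><td>4.76 [3.80, 5.72]</td><td>1.91 [0.03, 3.80]</td><td>5.92 [4.38, 7.45]</td><td>0.68 [0.56, 0.77]</td></tr>
        <tr><th>Original Implementation</th><td>101</td><td>87.03 [85.58, 88.48]</td><td>85.78 [83.76, 87.79]</td><td>1.14 [-12.13, 14.41]</td><td>5.02 [4.11, 5.93]</td><td>2.08 [0.42, 3.75]</td><td>6.10 [4.87, 7.33]</td><td>0.71 [0.60, 0.80]</td></tr>
      </tbody>
    </table>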
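
The combined evaluation described above can be sketched in a few lines of plain Python. This is a minimal illustration with made-up cadence values and hypothetical names (`wb_cadence`, `combine_recording`); it is not the code used by the pipeline. It assumes that no WB matching is needed for this evaluation, so the two systems may report a different number of WBs for the same recording.

```python
from statistics import median

# Hypothetical per-WB cadence values (steps/min) for two recordings.
# The number of WBs per system is allowed to differ.
wb_cadence = {
    "rec_01": {"detected": [88.0, 92.0, 101.0], "reference": [90.0, 95.0]},
    "rec_02": {"detected": [70.0, 74.0], "reference": [72.0, 73.0, 80.0]},
}


def combine_recording(wbs):
    """Collapse all WBs of one recording into one median value per system."""
    detected = median(wbs["detected"])
    reference = median(wbs["reference"])
    return {
        "detected": detected,
        "reference": reference,
        "error": detected - reference,
    }


# One combined datapoint per recording.
combined = {rec: combine_recording(wbs) for rec, wbs in wb_cadence.items()}

for rec, values in combined.items():
    print(rec, values)
```

Each recording contributes exactly one value to the summary statistics, regardless of how many WBs it contains.
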


.. GENERATED FROM PYTHON SOURCE LINES 436-437

Residual plots

.. GENERATED FROM PYTHON SOURCE LINES 437-469

.. code-block:: Python

    from mobgap.plotting import move_legend_outside, residual_plot


    def combo_residual_plot(data, name=None):
        name = name or data.name
        fig, axs = plt.subplots(
            ncols=2,
            sharey=True,
            sharex=True,
            figsize=(12, 9),
            constrained_layout=True,
        )
        fig.suptitle(name)
        for (version, subdata), ax in zip(data.groupby("version"), axs):
            residual_plot(
                subdata,
                "cadence_spm__reference",
                "cadence_spm__detected",
                "cohort",
                "steps/min",
                ax=ax,
                legend=ax == axs[-1],
            )
            ax.set_title(version)
        move_legend_outside(fig, axs[-1])
        plt.show()


    free_living_results_combined.query('algo == "Mobilise-D Pipeline"').pipe(
        combo_residual_plot, name="Aggregated Analysis - Cadence"
    )

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_002.png
   :alt: Aggregated Analysis - Cadence, MobGap, Original Implementation
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 470-475

Per-cohort analysis
~~~~~~~~~~~~~~~~~~~
The results below show the absolute error of the cadence estimation across all participants within each cohort.

.. GENERATED FROM PYTHON SOURCE LINES 475-488

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(12, 6))
    sns.boxplot(
        data=free_living_results_combined,
        x="cohort",
        y="cadence_spm__abs_error",
        hue="version",
        order=cohort_order,
        showmeans=True,
        ax=ax,
    ).legend().set_title(None)
    ax.set_ylabel("Absolute Error [steps/min]")
    ax.set_title("Absolute Error - Combined Analysis")
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_003.png
   :alt: Absolute Error - Combined Analysis
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_003.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 489-511

.. code-block:: Python

    free_living_combined_perf_metrics_cohort = (
        free_living_results_combined.pipe(
            multilevel_groupby_apply_merge,
            [
                (
                    ["cohort", "algo", "version"],
                    partial(apply_aggregations, aggregations=custom_aggs_combined),
                ),
                (
                    ["cohort", "algo"],
                    partial(apply_transformations, transformations=stats_transform),
                ),
            ],
        )
        .pipe(format_tables_combined)
        .loc[cohort_order]
    )
    free_living_combined_perf_metrics_cohort.style.pipe(
        revalidation_table_styles,
        validation_thresholds,
        ["cohort", "algo"],
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]

.. raw:: html
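
    <table>
      <thead>
        <tr><th>cohort</th><th>algo</th><th>version</th><th># participants</th><th>WD mean and CI [steps/min]</th><th>INDIP mean and CI [steps/min]</th><th>Bias and LoA [steps/min]</th><th>Abs. Error [steps/min]</th><th>Rel. Error [%]</th><th>Abs. Rel. Error [%]</th><th>ICC</th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">HA</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>20</td><td>85.21 [83.34, 87.08]</td><td>87.12 [83.67, 90.56]</td><td>-1.90 [-10.78, 6.97]</td><td>3.94 [2.71, 5.18]</td><td>-1.79 [-4.06, 0.48]</td><td>4.45 [3.11, 5.79]</td><td>0.72 [0.42, 0.88]</td></tr>
        <tr><th>Original Implementation</th><td>20</td><td>83.96 [81.93, 85.99]</td><td>87.12 [83.67, 90.56]</td><td>-3.15 [-12.74, 6.43]</td><td>4.85 [3.50, 6.21]</td><td>-3.23 [-5.65, -0.81]</td><td>5.43 [4.03, 6.84]</td><td>0.64 [0.23, 0.85]</td></tr>
        <tr><th rowspan="2">CHF</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>10</td><td>87.01 [83.29, 90.72]</td><td>90.03 [85.17, 94.89]</td><td>-3.02 [-11.23, 5.18]</td><td>4.35 [2.74, 5.95]</td><td>-3.13 [-6.06, -0.20]</td><td>4.80 [3.10, 6.51]</td><td>0.76 [0.24, 0.94]</td></tr>
        <tr><th>Original Implementation</th><td>10</td><td>93.28 [88.55, 98.01]</td><td>90.03 [85.17, 94.89]</td><td>3.29 [-3.49, 10.07]</td><td>4.05 [2.58, 5.52]</td><td>3.81 [1.33, 6.28]</td><td>4.62 [2.85, 6.40]</td><td>0.84 [0.23, 0.97]</td></tr>
        <tr><th rowspan="2">COPD</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>17</td><td>83.41 [81.21, 85.60]</td><td>82.55 [79.23, 85.86]</td><td>0.86 [-5.26, 6.98]</td><td>2.57 [1.68, 3.46]</td><td>1.31 [-0.56, 3.17]</td><td>3.20 [2.01, 4.40]</td><td>0.86 [0.66, 0.95]</td></tr>
        <tr><th>Original Implementation</th><td>17</td><td>81.84 [79.56, 84.12]</td><td>82.55 [79.23, 85.86]</td><td>-0.70 [-7.01, 5.60]</td><td>2.61 [1.71, 3.52]</td><td>-0.61 [-2.49, 1.28]</td><td>3.16 [2.05, 4.27]</td><td>0.86 [0.65, 0.95]</td></tr>
        <tr><th rowspan="2">MS</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>18</td><td>89.27 [86.69, 91.85]</td><td>87.51 [83.40, 91.62]</td><td>1.77 [-6.73, 10.26]</td><td>3.70 [2.42, 4.97]</td><td>2.50 [-0.06, 5.06]</td><td>4.51 [2.68, 6.34]</td><td>0.81 [0.57, 0.93]</td></tr>
        <tr><th>Original Implementation</th><td>18</td><td>90.00 [87.63, 92.36]</td><td>87.51 [83.40, 91.62]</td><td>2.49 [-6.45, 11.43]</td><td>3.78 [2.16, 5.39]</td><td>3.39 [0.61, 6.16]</td><td>4.71 [2.42, 7.01]</td><td>0.77 [0.44, 0.91]</td></tr>
        <tr><th rowspan="2">PD</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>19</td><td>89.80 [86.42, 93.18]</td><td>88.46 [82.92, 94.01]</td><td>1.34 [-16.62, 19.29]</td><td>6.58 [3.73, 9.43]</td><td>2.56 [-2.00, 7.13]</td><td>7.54 [4.35, 10.72]</td><td>0.61 [0.22, 0.83]</td></tr>
        <tr><th>Original Implementation</th><td>19</td><td>91.29 [88.12, 94.47]</td><td>88.46 [82.92, 94.01]</td><td>2.83 [-15.94, 21.60]</td><td>7.71 [4.95, 10.46]</td><td>4.37 [-0.43, 9.17]</td><td>8.91 [5.72, 12.10]</td><td>0.54 [0.14, 0.79]</td></tr>
        <tr><th rowspan="2">PFF</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>17</td><td>84.14 [80.77, 87.51]</td><td>79.74 [73.26, 86.22]</td><td>5.21 [-14.01, 24.43]</td><td>7.41 [3.53, 11.29]</td><td>8.91 [0.57, 17.24]</td><td>10.99 [3.28, 18.70]</td><td>0.53 [0.09, 0.80]</td></tr>
        <tr><th>Original Implementation</th><td>17</td><td>84.63 [80.34, 88.91]</td><td>79.74 [73.26, 86.22]</td><td>3.73 [-13.11, 20.56]</td><td>6.53 [3.41, 9.65]</td><td>6.43 [0.43, 12.42]</td><td>9.09 [3.99, 14.19]</td><td>0.68 [0.30, 0.87]</td></tr>
      </tbody>
    </table>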
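
The error columns reported in the tables can be illustrated with a small, self-contained sketch. The values below are made up, and two details are assumptions rather than confirmed behavior of `A.loa` or the loaded result files: relative errors are taken to be normalized by the reference value, and the limits of agreement (LoA) are taken to be the bias ± 1.96 sample standard deviations of the error.

```python
from statistics import mean, stdev

# Made-up per-recording cadence values in steps/min.
detected = [85.0, 90.0, 78.0, 95.0]
reference = [87.0, 88.0, 80.0, 91.0]

# Signed and absolute errors.
errors = [d - r for d, r in zip(detected, reference)]
abs_errors = [abs(e) for e in errors]

# Relative errors in percent (assumption: normalized by the reference value).
rel_errors = [100 * e / r for e, r in zip(errors, reference)]
abs_rel_errors = [abs(e) for e in rel_errors]

# Bias and limits of agreement (assumption: bias +/- 1.96 * sample SD).
bias = mean(errors)
sd = stdev(errors)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)

print(f"Bias and LoA: {bias:.2f} [{loa[0]:.2f}, {loa[1]:.2f}]")
print(f"Abs. Error: {mean(abs_errors):.2f}")
print(f"Abs. Rel. Error: {mean(abs_rel_errors):.2f} %")
```

Positive and negative errors cancel in the bias, which is why the absolute error columns are usually the more informative ones when comparing the two implementations.
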


.. GENERATED FROM PYTHON SOURCE LINES 512-516

Scatter plot

The plot below shows the detected cadence against the reference cadence for all participants, colored by cohort.
The correlation coefficient, p-value, and confidence interval of the regression line are shown in the plot.
Each datapoint represents one participant.

.. GENERATED FROM PYTHON SOURCE LINES 516-575

.. code-block:: Python

    from mobgap.plotting import calc_min_max_with_margin, make_square, plot_regline


    def combo_scatter_plot(data, name=None):
        name = name or data.name
        fig, axs = plt.subplots(
            ncols=2,
            sharey=True,
            sharex=True,
            figsize=(12, 8),
            constrained_layout=True,
        )
        fig.suptitle(name)

        min_max = calc_min_max_with_margin(
            data["cadence_spm__reference"],
            data["cadence_spm__detected"],
        )

        for (version, subdata), ax in zip(data.groupby("version"), axs):
            subdata = subdata[
                [
                    "cadence_spm__reference",
                    "cadence_spm__detected",
                    "cohort",
                ]
            ].dropna(how="any")

            sns.scatterplot(
                subdata,
                x="cadence_spm__reference",
                y="cadence_spm__detected",
                hue="cohort",
                ax=ax,
                legend=ax == axs[-1],
            )

            plot_regline(
                subdata["cadence_spm__reference"],
                subdata["cadence_spm__detected"],
                ax=ax,
            )

            make_square(ax, min_max, draw_diagonal=True)
            ax.set_title(version)
            ax.set_xlabel("Reference [steps/min]")
            ax.set_ylabel("Detected [steps/min]")
            ax.tick_params(axis="both", labelsize=20)

        move_legend_outside(fig, axs[-1])
        plt.show()


    free_living_results_combined.query('algo == "Mobilise-D Pipeline"').pipe(
        combo_scatter_plot, name="Mobilise-D Pipeline - Cadence"
    )

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_004.png
   :alt: Mobilise-D Pipeline - Cadence, MobGap, Original Implementation
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_004.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 576-598

Matched/True Positive Evaluation
********************************
The "Matched" evaluation directly compares the performance of the cadence estimation on only those WBs that were
detected by both systems (true positives).
WBs were included in the true-positive analysis if there was an overlap of more than 80% between the WBs detected by
the two systems (details about the selection of this threshold can be found in [1]_).
The threshold of 80% was selected as a trade-off that allows us
(i) to keep the comparison between the selected WBs (INDIP vs. wearable device) as like-for-like as possible, and at
the same time
(ii) to include the minimum number of WBs needed to ensure sufficient statistical power for the analyses
(i.e., at least 101 walking bouts for each cohort).
This target was based on the number of WBs rather than on a percentage of the total walking bouts, so that we meet
the criteria established by statistical experts for robust statistical analysis after sample-size re-evaluation
(total WB number > 101 corresponding to ICC > 0.7 and a CI = 0.2).

.. note:: Compared to the results published in [1]_, the primary analysis on the matched results is performed on
    the average performance metrics across all matched WBs **per recording/per participant**.
    The original publication considered the average performance metrics across all matched WBs without
    additional aggregation.

Results across all cohorts
~~~~~~~~~~~~~~~~~~~~~~~~~~
The results below represent the average performance across all participants, independent of the cohort, in terms of
error, relative error, absolute error, and absolute relative error.

.. GENERATED FROM PYTHON SOURCE LINES 598-600

.. code-block:: Python

    free_living_results_matched.pipe(multi_metric_plot, metrics, 2, 2)

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_005.png
   :alt: Abs. Rel. Error (%), Error (steps/min), Rel. Error (%), Abs. Error (steps/min)
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_005.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 601-603

As each pipeline version produces different WBs, it is important to compare the number of matched WBs to put all
other metrics into perspective.

.. GENERATED FROM PYTHON SOURCE LINES 603-614

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(12, 6))
    sns.barplot(
        data=free_living_results_matched.groupby(["version"])["n_matched_wbs"]
        .sum()
        .reset_index(),
        x="version",
        y="n_matched_wbs",
        ax=ax,
    )
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_006.png
   :alt: 02 pipeline cad
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_006.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 615-634

.. code-block:: Python

    free_living_matched_perf_metrics_all = free_living_results_matched.pipe(
        multilevel_groupby_apply_merge,
        [
            (
                ["algo", "version"],
                partial(apply_aggregations, aggregations=custom_aggs_matched),
            ),
            (
                ["algo"],
                partial(apply_transformations, transformations=stats_transform),
            ),
        ],
    ).pipe(format_tables_matched)
    free_living_matched_perf_metrics_all.style.pipe(
        revalidation_table_styles,
        validation_thresholds,
        ["algo"],
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]

.. raw:: html
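
    <table>
      <thead>
        <tr><th>algo</th><th>version</th><th># participants</th><th>WD mean and CI [steps/min]</th><th>INDIP mean and CI [steps/min]</th><th>Bias and LoA [steps/min]</th><th>Abs. Error [steps/min]</th><th>Rel. Error [%]</th><th>Abs. Rel. Error [%]</th><th>ICC</th><th># Matched WBs</th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>101</td><td>89.24 [87.85, 90.62]</td><td>89.04 [87.16, 90.93]</td><td>0.16 [-9.84, 10.16]</td><td>4.96 [4.10, 5.83]</td><td>1.22 [-0.06, 2.49]</td><td>5.89 [4.77, 7.01]</td><td>0.82 [0.74, 0.88]</td><td>1984</td></tr>
        <tr><th>Original Implementation</th><td>101</td><td>89.44 [87.94, 90.94]</td><td>90.60 [88.76, 92.44]</td><td>-1.16 [-9.55, 7.23]</td><td>4.72 [3.94, 5.49]</td><td>-0.54 [-1.40, 0.33]</td><td>5.28 [4.47, 6.09]</td><td>0.87 [0.81, 0.91]</td><td>1697</td></tr>
      </tbody>
    </table>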
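
To make the 80% overlap criterion of the matched evaluation more concrete, the following sketch pairs reference and detected WBs given as `(start, end)` tuples. It is an illustration only: the function names are hypothetical, and the overlap definition used here (the shared duration must exceed the threshold relative to *both* WBs) is an assumption that may differ from the matching logic actually used to produce the results.

```python
def overlap_length(a, b):
    """Length of the intersection of two (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def is_match(ref, det, threshold=0.8):
    """Assumed criterion: overlap must exceed the threshold for both WBs."""
    overlap = overlap_length(ref, det)
    return (
        overlap / (ref[1] - ref[0]) > threshold
        and overlap / (det[1] - det[0]) > threshold
    )


def match_wbs(reference, detected, threshold=0.8):
    """Return index pairs (reference index, detected index) of matched WBs."""
    return [
        (i, j)
        for i, ref in enumerate(reference)
        for j, det in enumerate(detected)
        if is_match(ref, det, threshold)
    ]


# Made-up WBs in samples.
reference_wbs = [(0, 100), (200, 300)]
detected_wbs = [(10, 105), (230, 300)]

matches = match_wbs(reference_wbs, detected_wbs)
print(matches)
```

In this example only the first pair is matched: the second detected WB covers just 70% of its reference WB and is therefore excluded from the true-positive analysis.
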


.. GENERATED FROM PYTHON SOURCE LINES 635-636

Residual plot

.. GENERATED FROM PYTHON SOURCE LINES 636-639

.. code-block:: Python

    free_living_results_matched.query('algo == "Mobilise-D Pipeline"').pipe(
        combo_residual_plot, name="Matched WBs - Cadence"
    )

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_007.png
   :alt: Matched WBs - Cadence, MobGap, Original Implementation
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_007.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 640-645

Per-cohort analysis
~~~~~~~~~~~~~~~~~~~
Boxplot

The results below show the absolute error of the cadence estimation across all participants within each cohort.

.. GENERATED FROM PYTHON SOURCE LINES 645-660

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(12, 6))
    sns.barplot(
        data=free_living_results_matched.groupby(["version", "cohort"])[
            "n_matched_wbs"
        ]
        .sum()
        .reset_index(),
        hue="version",
        y="n_matched_wbs",
        x="cohort",
        order=cohort_order,
        ax=ax,
    )
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_008.png
   :alt: 02 pipeline cad
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_008.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 661-673

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(12, 6))
    sns.boxplot(
        data=free_living_results_matched,
        x="cohort",
        y="cadence_spm__abs_error",
        hue="algo_with_version",
        order=cohort_order,
        ax=ax,
    ).legend().set_title(None)
    ax.set_ylabel("Absolute Error [steps/min]")
    ax.set_title("Absolute Error - Matched Analysis")
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_009.png
   :alt: Absolute Error - Matched Analysis
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_009.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 674-675

Processing the per-cohort performance table

.. GENERATED FROM PYTHON SOURCE LINES 675-698

.. code-block:: Python

    free_living_matched_perf_metrics_cohort = (
        free_living_results_matched.pipe(
            multilevel_groupby_apply_merge,
            [
                (
                    ["cohort", "algo", "version"],
                    partial(apply_aggregations, aggregations=custom_aggs_matched),
                ),
                (
                    ["cohort", "algo"],
                    partial(apply_transformations, transformations=stats_transform),
                ),
            ],
        )
        .pipe(format_tables_matched)
        .loc[cohort_order]
    )
    free_living_matched_perf_metrics_cohort.style.pipe(
        revalidation_table_styles,
        validation_thresholds,
        ["cohort", "algo"],
    )

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]

.. raw:: html
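
    <table>
      <thead>
        <tr><th>cohort</th><th>algo</th><th>version</th><th># participants</th><th>WD mean and CI [steps/min]</th><th>INDIP mean and CI [steps/min]</th><th>Bias and LoA [steps/min]</th><th>Abs. Error [steps/min]</th><th>Rel. Error [%]</th><th>Abs. Rel. Error [%]</th><th>ICC</th><th># Matched WBs</th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">HA</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>20</td><td>89.91 [87.15, 92.67]</td><td>90.47 [86.76, 94.17]</td><td>-0.69 [-7.81, 6.43]</td><td>5.02 [3.82, 6.23]</td><td>0.30 [-1.77, 2.37]</td><td>5.93 [4.32, 7.54]</td><td>0.88 [0.73, 0.95]</td><td>524</td></tr>
        <tr><th>Original Implementation</th><td>20</td><td>90.02 [87.23, 92.81]</td><td>92.72 [89.25, 96.20]</td><td>-2.71 [-7.65, 2.24]</td><td>4.64 [3.77, 5.51]</td><td>-2.48 [-3.55, -1.40]</td><td>4.94 [4.17, 5.71]</td><td>0.88 [0.30, 0.97]</td><td>410</td></tr>
        <tr><th rowspan="2">CHF</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>10</td><td>91.98 [88.22, 95.75]</td><td>93.24 [88.49, 97.99]</td><td>-1.06 [-5.77, 3.66]</td><td>4.50 [2.92, 6.08]</td><td>-0.38 [-1.81, 1.05]</td><td>4.84 [3.34, 6.34]</td><td>0.93 [0.74, 0.98]</td><td>220</td></tr>
        <tr><th>Original Implementation</th><td>10</td><td>93.67 [88.03, 99.30]</td><td>94.64 [88.71, 100.57]</td><td>-0.98 [-3.56, 1.61]</td><td>3.44 [1.69, 5.20]</td><td>-0.78 [-1.60, 0.05]</td><td>3.55 [1.83, 5.27]</td><td>0.99 [0.91, 1.00]</td><td>176</td></tr>
        <tr><th rowspan="2">COPD</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>17</td><td>85.45 [82.97, 87.94]</td><td>84.42 [81.47, 87.36]</td><td>0.88 [-2.99, 4.74]</td><td>3.79 [3.05, 4.53]</td><td>1.55 [0.27, 2.83]</td><td>4.75 [3.72, 5.77]</td><td>0.92 [0.78, 0.97]</td><td>410</td></tr>
        <tr><th>Original Implementation</th><td>17</td><td>83.78 [81.42, 86.14]</td><td>86.14 [83.24, 89.05]</td><td>-2.36 [-6.25, 1.52]</td><td>4.23 [3.45, 5.00]</td><td>-2.43 [-3.46, -1.40]</td><td>4.82 [4.04, 5.60]</td><td>0.86 [0.17, 0.96]</td><td>323</td></tr>
        <tr><th rowspan="2">MS</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>18</td><td>90.71 [87.18, 94.24]</td><td>90.39 [86.01, 94.76]</td><td>0.32 [-8.22, 8.87]</td><td>6.10 [3.96, 8.24]</td><td>1.64 [-0.81, 4.09]</td><td>7.23 [4.78, 9.68]</td><td>0.88 [0.70, 0.95]</td><td>327</td></tr>
        <tr><th>Original Implementation</th><td>18</td><td>91.23 [87.35, 95.10]</td><td>89.68 [84.75, 94.60]</td><td>1.55 [-5.52, 8.61]</td><td>5.33 [3.87, 6.79]</td><td>2.98 [0.64, 5.33]</td><td>6.72 [4.45, 8.99]</td><td>0.92 [0.79, 0.97]</td><td>355</td></tr>
        <tr><th rowspan="2">PD</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>19</td><td>92.01 [88.88, 95.14]</td><td>91.72 [86.38, 97.06]</td><td>0.29 [-16.84, 17.42]</td><td>5.32 [1.99, 8.66]</td><td>1.67 [-3.60, 6.94]</td><td>6.45 [1.78, 11.11]</td><td>0.61 [0.22, 0.83]</td><td>267</td></tr>
        <tr><th>Original Implementation</th><td>19</td><td>92.47 [89.67, 95.27]</td><td>93.66 [89.63, 97.70]</td><td>-1.20 [-13.06, 10.67]</td><td>5.42 [2.48, 8.36]</td><td>-0.14 [-2.57, 2.29]</td><td>5.90 [2.99, 8.81]</td><td>0.70 [0.37, 0.87]</td><td>256</td></tr>
        <tr><th rowspan="2">PFF</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>17</td><td>85.52 [81.84, 89.20]</td><td>84.59 [79.54, 89.65]</td><td>0.92 [-10.07, 11.92]</td><td>4.69 [2.56, 6.81]</td><td>2.05 [-0.96, 5.06]</td><td>5.55 [3.13, 7.96]</td><td>0.82 [0.56, 0.94]</td><td>236</td></tr>
        <tr><th>Original Implementation</th><td>17</td><td>86.12 [82.23, 90.01]</td><td>87.14 [81.64, 92.64]</td><td>-1.02 [-13.12, 11.08]</td><td>4.47 [1.94, 7.01]</td><td>-0.35 [-2.96, 2.25]</td><td>4.70 [2.65, 6.76]</td><td>0.82 [0.51, 0.94]</td><td>177</td></tr>
      </tbody>
    </table>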


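The "Bias and LoA" column above can be read as a Bland-Altman style summary. The following is a minimal sketch of such a computation, assuming that the bias is the mean of detected minus reference and that the limits of agreement are the bias plus or minus 1.96 standard deviations of the error. The function name ``bias_and_loa`` and the example values are hypothetical, and the exact computation used for this page may differ.

.. code-block:: Python

    import pandas as pd


    def bias_and_loa(detected: pd.Series, reference: pd.Series, agreement: float = 1.96):
        """Return (bias, lower LoA, upper LoA) for paired measurements.

        Hypothetical helper: bias is the mean error, the limits of agreement
        are assumed to be bias +/- agreement * standard deviation of the error.
        """
        error = (detected - reference).dropna()
        bias = error.mean()
        spread = agreement * error.std(ddof=1)
        return bias, bias - spread, bias + spread


    # Invented cadence values in steps/min, not taken from the TVS dataset.
    detected = pd.Series([88.0, 92.0, 95.0, 101.0])
    reference = pd.Series([90.0, 91.0, 97.0, 100.0])
    bias, lower, upper = bias_and_loa(detected, reference)
    print(f"{bias:.2f} [{lower:.2f}, {upper:.2f}]")

Under these assumptions the interval is always symmetric around the bias, which matches the pattern of the values in the table.
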
.. GENERATED FROM PYTHON SOURCE LINES 699-707

Deep dive investigation: Do errors depend on WB duration or walking speed?
**************************************************************************

Effect of WB duration
~~~~~~~~~~~~~~~~~~~~~
We investigate how the absolute cadence error of all true-positive WBs from the real-world recording depends on the
WB duration reported by the reference system.
In the top panel, WB errors are grouped into duration bins.
The bottom panel shows the number of WBs within each duration bin.

.. GENERATED FROM PYTHON SOURCE LINES 707-764

.. code-block:: Python

    import numpy as np

    from mobgap.utils.df_operations import cut_into_overlapping_bins


    def plot_wb_duration_analysis(df):
        """Generates a single figure with:
        - First row: Grouped boxplots of the absolute cadence error per pipeline version.
        - Second row: A grouped bar chart comparing WB counts per pipeline version.

        df: DataFrame containing a 'version' column (e.g. "MobGap" or
        "Original Implementation") to distinguish data
        """
        fig, axs = plt.subplot_mosaic(
            [["v"], ["v"], ["v"], ["n"]], sharex=True, figsize=(12, 9)
        )
        # Compute WB durations in seconds (start/end are in samples; the division
        # by 100 assumes a sampling rate of 100 Hz)
        df_with_durations = df.assign(
            duration_s=lambda df_: (df_["end__reference"] - df_["start__reference"])
            / 100
        )
        bins = {
            "All": (-np.inf, np.inf),
            "> 10 s": (10, np.inf),
            "<= 10 s": (0, 10),
            "10 - 30 s": (10, 30),
            "30 - 60 s": (30, 60),
            "60 - 120 s": (60, 120),
            "> 120 s": (120, np.inf),
        }
        binned_df = cut_into_overlapping_bins(
            df_with_durations, "duration_s", bins
        ).reset_index()
        n = sns.countplot(
            data=binned_df, x="bin", hue="version", ax=axs["n"], legend=False
        )
        for container in n.containers:
            n.bar_label(container, size=10)
        sns.boxplot(
            data=binned_df,
            x="bin",
            y="cadence_spm__abs_error",
            hue="version",
            ax=axs["v"],
        )
        sns.despine(fig)
        axs["v"].set_ylabel("Absolute Cadence Error (steps/min)")
        axs["n"].set_ylabel("WB Count")
        axs["n"].set_xlabel("Ref. WB Duration")
        fig.show()


    free_living_results_matched_raw.query("algo == 'Mobilise-D Pipeline'").pipe(
        plot_wb_duration_analysis
    )

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_010.png
   :alt: 02 pipeline cad
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_010.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 765-772

Effect of walking speed on error
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
One important aspect of algorithm performance is its dependency on walking speed,
i.e., how well the algorithms perform at different walking speeds.
For this, we plot the absolute cadence error against the walking speed of the reference data.
For better granularity, we use the values per WB instead of the aggregates per participant.
The overlaid dots represent the trend line, calculated as the median of the absolute error within
walking-speed bins of 0.05 m/s.

.. GENERATED FROM PYTHON SOURCE LINES 772-864

.. code-block:: Python

    # For plotting all participants at the end
    free_living_combined = free_living_results_matched_raw.copy()
    free_living_combined["cohort"] = "Combined"
    ws_level_results = pd.concat(
        [free_living_results_matched_raw, free_living_combined]
    ).reset_index(drop=True)
    algo_names = ws_level_results["algo_with_version"].unique()
    cohort_names = ws_level_results["cohort"].unique()
    ws_level_results["cohort"] = pd.Categorical(
        ws_level_results["cohort"], categories=cohort_names, ordered=True
    )
    ws_level_results["algo_with_version"] = pd.Categorical(
        ws_level_results["algo_with_version"], categories=algo_names, ordered=True
    )
    # Create the figure with subplots
    fig = plt.figure(constrained_layout=True, figsize=(24, 5 * len(algo_names)))
    subfigs = fig.subfigures(len(algo_names), 1, wspace=0.1, hspace=0.1)
    # Define the min and max limits for x and y axes
    min_max_x = calc_min_max_with_margin(
        ws_level_results["walking_speed_mps__reference"]
    )
    min_max_y = calc_min_max_with_margin(ws_level_results["cadence_spm__abs_error"])
    # Plotting each algorithm version
    for subfig, (algo, data) in zip(
        subfigs, ws_level_results.groupby("algo_with_version", observed=True)
    ):
        subfig.suptitle(algo)
        subfig.supxlabel("Walking Speed (m/s)")
        subfig.supylabel("Absolute Error (steps/min)")
        # Create subplots for each cohort
        axs = subfig.subplots(1, len(cohort_names), sharex=True, sharey=True)
        for ax, (cohort, cohort_data) in zip(
            axs, data.groupby("cohort", observed=True)
        ):
            # Scatter plot for the cohort data
            sns.scatterplot(
                data=cohort_data,
                x="walking_speed_mps__reference",  # Reference walking speed
                y="cadence_spm__abs_error",  # Absolute error
                ax=ax,
                alpha=0.3,
            )
            # Define bins for walking speed
            bins = np.arange(
                0, cohort_data["walking_speed_mps__reference"].max() + 0.05, 0.05
            )
            cohort_data["speed_bin"] = pd.cut(
                cohort_data["walking_speed_mps__reference"], bins=bins
            )
            # Calculate bin centers
            cohort_data["bin_center"] = cohort_data["speed_bin"].apply(
                lambda x: x.mid
            )
            # Calculate median error per bin and cohort
            binned_data = (
                cohort_data.groupby("bin_center", observed=True)[
                    "cadence_spm__abs_error"
                ]
                .median()
                .reset_index()
            )
            # Plot the median error of each bin as overlaid dots
            sns.scatterplot(
                data=binned_data,
                x="bin_center",
                y="cadence_spm__abs_error",  # Median error
                ax=ax,
            )
            ax.set_title(cohort)
            ax.set_xlabel(None)
            ax.set_ylabel(None)
            # Set axis limits
            ax.set_xlim(*min_max_x)
            ax.set_ylim(*min_max_y)
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_011.png
   :alt: CHF, COPD, HA, MS, PD, PFF, Combined, CHF, COPD, HA, MS, PD, PFF, Combined
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_011.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 865-882

Laboratory dataset
------------------

Combined/Aggregated Evaluation
******************************
To mimic the actual use of a wearable device, where decisions are made on measures aggregated over a longer
measurement period rather than WB by WB, our primary comparison is based on the median gait metrics over the entire
recording.
We call this the combined or aggregated evaluation.
For this, we combined all WBs of a datapoint by taking the median of the calculated cadence.
These combined values were then compared between the systems.

.. note::
   In the laboratory dataset, each datapoint represents one trial.

All results across all cohorts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The results below represent the average performance across all participants independent of the
cohort in terms of error, relative error, absolute error, and absolute relative error.

.. GENERATED FROM PYTHON SOURCE LINES 882-921

.. code-block:: Python

    import matplotlib.pyplot as plt
    import seaborn as sns

    sns.set_context("talk")
    metrics = {
        "abs_rel_error": "Abs. Rel. Error (%)",
        "error": "Error (steps/min)",
        "rel_error": "Rel. Error (%)",
        "abs_error": "Abs. Error (steps/min)",
    }


    def multi_metric_plot(data, metrics, nrows, ncols):
        fig, axs = plt.subplots(
            nrows, ncols, sharex=True, figsize=(ncols * 6, nrows * 4 + 2)
        )
        for ax, (metric, metric_label) in zip(axs.flatten(), metrics.items()):
            overall_df = data[["version", f"cadence_spm__{metric}"]].rename(
                columns={f"cadence_spm__{metric}": metric_label}
            )
            sns.boxplot(
                data=overall_df, x="version", hue="version", y=metric_label, ax=ax
            )
            ax.set_title(metric_label)
            ax.set_ylabel(metric_label)
            ax.tick_params(axis="both", which="major")
            ax.tick_params(axis="both", which="minor")
            ax.grid(True)
        plt.tight_layout()
        plt.show()


    laboratory_results_combined.pipe(multi_metric_plot, metrics, 2, 2)

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_012.png
   :alt: Abs. Rel. Error (%), Error (steps/min), Rel. Error (%), Abs. Error (steps/min)
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_012.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 922-942

.. code-block:: Python

    laboratory_combined_perf_metrics_all = laboratory_results_combined.pipe(
        multilevel_groupby_apply_merge,
        [
            (
                ["algo", "version"],
                partial(apply_aggregations, aggregations=custom_aggs_combined),
            ),
            (
                ["algo"],
                partial(apply_transformations, transformations=stats_transform),
            ),
        ],
    ).pipe(format_tables_combined)
    laboratory_combined_perf_metrics_all.style.pipe(
        revalidation_table_styles,
        validation_thresholds,
        ["algo"],
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]
    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]

.. raw:: html

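    <table>
      <thead>
        <tr><th></th><th></th><th># participants</th><th>WD mean and CI [steps/min]</th><th>INDIP mean and CI [steps/min]</th><th>Bias and LoA [steps/min]</th><th>Abs. Error [steps/min]</th><th>Rel. Error [%]</th><th>Abs. Rel. Error [%]</th><th>ICC</th></tr>
        <tr><th>algo</th><th>version</th><th colspan="8"></th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>1169</td><td>94.94 [94.18, 95.70]</td><td>96.39 [95.39, 97.39]</td><td>-2.17 [-21.52, 17.18]</td><td>4.87 [4.36, 5.38]*</td><td>-1.17 [-1.76, -0.58]</td><td>5.01 [4.49, 5.53]*</td><td>0.78 [0.75, 0.81]</td></tr>
        <tr><th>Original Implementation</th><td>1169</td><td>94.22 [93.46, 94.99]</td><td>96.39 [95.39, 97.39]</td><td>-2.40 [-22.81, 18.01]</td><td>5.75 [5.23, 6.26]</td><td>-1.32 [-1.98, -0.65]</td><td>6.01 [5.44, 6.59]</td><td>0.75 [0.71, 0.78]</td></tr>
      </tbody>
    </table>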


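The aggregation behind the combined evaluation (one median cadence value per datapoint and system) can be sketched as follows. This is a minimal illustration with invented values. The column name ``datapoint`` and the two input tables are hypothetical and do not mirror the actual data structures used by the pipeline.

.. code-block:: Python

    import pandas as pd

    # Hypothetical per-WB cadence values (steps/min) for two datapoints.
    # Both systems may report a different number of WBs per datapoint.
    detected_wbs = pd.DataFrame(
        {
            "datapoint": ["p1", "p1", "p1", "p2", "p2"],
            "cadence_spm": [90.0, 100.0, 95.0, 80.0, 84.0],
        }
    )
    reference_wbs = pd.DataFrame(
        {
            "datapoint": ["p1", "p1", "p2", "p2", "p2"],
            "cadence_spm": [92.0, 99.0, 83.0, 85.0, 86.0],
        }
    )

    # One median value per datapoint and system.
    combined = pd.DataFrame(
        {
            "cadence_spm__detected": detected_wbs.groupby("datapoint")[
                "cadence_spm"
            ].median(),
            "cadence_spm__reference": reference_wbs.groupby("datapoint")[
                "cadence_spm"
            ].median(),
        }
    )
    # The aggregated values are then compared between the systems.
    combined["cadence_spm__error"] = (
        combined["cadence_spm__detected"] - combined["cadence_spm__reference"]
    )
    print(combined)

Because the medians are taken per system before the comparison, no WB-level matching is needed for this type of evaluation.
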
.. GENERATED FROM PYTHON SOURCE LINES 943-944

Residual plots

.. GENERATED FROM PYTHON SOURCE LINES 944-975

.. code-block:: Python

    def combo_residual_plot(data, name=None):
        name = name or data.name
        fig, axs = plt.subplots(
            ncols=2,
            sharey=True,
            sharex=True,
            figsize=(12, 9),
            constrained_layout=True,
        )
        fig.suptitle(name)
        for (version, subdata), ax in zip(data.groupby("version"), axs):
            residual_plot(
                subdata,
                "cadence_spm__reference",
                "cadence_spm__detected",
                "cohort",
                "steps/min",
                ax=ax,
                legend=ax == axs[-1],
            )
            ax.set_title(version)
        move_legend_outside(fig, axs[-1])
        plt.show()


    laboratory_results_combined.query('algo == "Mobilise-D Pipeline"').pipe(
        combo_residual_plot, name="Aggregated Analysis - Cadence"
    )

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_013.png
   :alt: Aggregated Analysis - Cadence, MobGap, Original Implementation
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_013.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 976-981

Per-cohort analysis
~~~~~~~~~~~~~~~~~~~
The results below represent the average absolute error on cadence estimation across all participants within a cohort.

.. GENERATED FROM PYTHON SOURCE LINES 981-994

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(12, 6))
    sns.boxplot(
        data=laboratory_results_combined,
        x="cohort",
        y="cadence_spm__abs_error",
        hue="version",
        order=cohort_order,
        showmeans=True,
        ax=ax,
    ).legend().set_title(None)
    ax.set_ylabel("Absolute Error [steps/min]")
    ax.set_title("Absolute Error - Combined Analysis")
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_014.png
   :alt: Absolute Error - Combined Analysis
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_014.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 995-1017

.. code-block:: Python

    laboratory_combined_perf_metrics_cohort = (
        laboratory_results_combined.pipe(
            multilevel_groupby_apply_merge,
            [
                (
                    ["cohort", "algo", "version"],
                    partial(apply_aggregations, aggregations=custom_aggs_combined),
                ),
                (
                    ["cohort", "algo"],
                    partial(apply_transformations, transformations=stats_transform),
                ),
            ],
        )
        .pipe(format_tables_combined)
        .loc[cohort_order]
    )
    laboratory_combined_perf_metrics_cohort.style.pipe(
        revalidation_table_styles,
        validation_thresholds,
        ["cohort", "algo"],
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]
    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]

.. raw:: html

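    <table>
      <thead>
        <tr><th></th><th></th><th></th><th># participants</th><th>WD mean and CI [steps/min]</th><th>INDIP mean and CI [steps/min]</th><th>Bias and LoA [steps/min]</th><th>Abs. Error [steps/min]</th><th>Rel. Error [%]</th><th>Abs. Rel. Error [%]</th><th>ICC</th></tr>
        <tr><th>cohort</th><th>algo</th><th>version</th><th colspan="8"></th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">HA</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>227</td><td>96.73 [95.16, 98.30]</td><td>101.68 [99.54, 103.81]</td><td>-5.10 [-23.82, 13.62]</td><td>6.01 [4.83, 7.18]*</td><td>-4.28 [-5.28, -3.29]</td><td>5.38 [4.49, 6.28]*</td><td>0.73 [0.54, 0.83]</td></tr>
        <tr><th>Original Implementation</th><td>227</td><td>93.41 [91.75, 95.06]</td><td>101.68 [99.54, 103.81]</td><td>-7.16 [-29.50, 15.19]</td><td>8.11 [6.71, 9.51]</td><td>-6.36 [-7.55, -5.17]</td><td>7.41 [6.32, 8.49]</td><td>0.61 [0.32, 0.76]</td></tr>
        <tr><th rowspan="2">CHF</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>106</td><td>95.23 [92.75, 97.71]</td><td>95.69 [92.38, 98.99]</td><td>-2.33 [-18.63, 13.98]</td><td>4.80 [3.44, 6.17]</td><td>-1.26 [-3.11, 0.59]</td><td>5.07 [3.48, 6.66]</td><td>0.84 [0.76, 0.89]</td></tr>
        <tr><th>Original Implementation</th><td>106</td><td>97.06 [94.27, 99.85]</td><td>95.69 [92.38, 98.99]</td><td>-3.35 [-21.59, 14.89]</td><td>5.67 [4.13, 7.21]</td><td>-2.21 [-4.30, -0.13]</td><td>5.91 [4.11, 7.71]</td><td>0.80 [0.68, 0.88]</td></tr>
        <tr><th rowspan="2">COPD</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>214</td><td>94.76 [93.18, 96.34]</td><td>98.25 [96.04, 100.45]</td><td>-4.18 [-28.88, 20.51]</td><td>5.67 [4.06, 7.28]</td><td>-3.33 [-4.50, -2.17]</td><td>5.13 [4.08, 6.17]</td><td>0.56 [0.43, 0.67]</td></tr>
        <tr><th>Original Implementation</th><td>214</td><td>92.54 [90.95, 94.14]</td><td>98.25 [96.04, 100.45]</td><td>-4.56 [-24.73, 15.60]</td><td>5.56 [4.25, 6.87]</td><td>-4.05 [-5.09, -3.00]</td><td>5.19 [4.25, 6.14]</td><td>0.67 [0.51, 0.77]</td></tr>
        <tr><th rowspan="2">MS</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>228</td><td>94.18 [92.22, 96.15]</td><td>94.82 [92.50, 97.15]</td><td>-1.40 [-18.31, 15.50]</td><td>3.97 [2.96, 4.98]</td><td>-0.66 [-1.75, 0.43]</td><td>4.11 [3.16, 5.06]</td><td>0.85 [0.81, 0.89]</td></tr>
        <tr><th>Original Implementation</th><td>228</td><td>94.64 [92.75, 96.54]</td><td>94.82 [92.50, 97.15]</td><td>-0.97 [-19.20, 17.26]</td><td>4.78 [3.74, 5.82]</td><td>-0.08 [-1.27, 1.12]</td><td>5.02 [4.02, 6.02]</td><td>0.83 [0.78, 0.86]</td></tr>
        <tr><th rowspan="2">PD</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>225</td><td>94.09 [92.38, 95.80]</td><td>93.64 [91.54, 95.74]</td><td>-0.61 [-13.72, 12.50]</td><td>3.73 [3.00, 4.46]</td><td>0.05 [-0.90, 0.99]</td><td>4.10 [3.33, 4.88]</td><td>0.89 [0.86, 0.92]</td></tr>
        <tr><th>Original Implementation</th><td>225</td><td>93.76 [92.17, 95.36]</td><td>93.64 [91.54, 95.74]</td><td>-0.63 [-15.38, 14.13]</td><td>4.65 [3.88, 5.43]</td><td>0.16 [-0.89, 1.20]</td><td>5.02 [4.20, 5.83]</td><td>0.85 [0.81, 0.89]</td></tr>
        <tr><th rowspan="2">PFF</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>169</td><td>94.80 [92.65, 96.95]</td><td>94.08 [91.18, 96.97]</td><td>0.45 [-21.71, 22.61]</td><td>5.33 [3.82, 6.83]</td><td>2.52 [0.00, 5.04]</td><td>6.68 [4.34, 9.02]</td><td>0.77 [0.71, 0.83]</td></tr>
        <tr><th>Original Implementation</th><td>169</td><td>95.30 [93.20, 97.41]</td><td>94.08 [91.18, 96.97]</td><td>0.90 [-22.80, 24.60]</td><td>6.12 [4.54, 7.70]</td><td>3.36 [0.52, 6.19]</td><td>7.86 [5.24, 10.49]</td><td>0.74 [0.66, 0.80]</td></tr>
      </tbody>
    </table>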


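The four error metrics reported in the plots and tables above can be sketched as follows. This is a minimal illustration, assuming that the error is defined as detected minus reference and that relative errors are expressed in percent of the reference value. The function name ``cadence_error_metrics`` and the example values are hypothetical.

.. code-block:: Python

    import pandas as pd


    def cadence_error_metrics(detected: pd.Series, reference: pd.Series) -> pd.DataFrame:
        """Compute error, relative error, and their absolute values.

        Hypothetical helper; errors are in steps/min, relative errors in percent.
        """
        error = detected - reference
        rel_error = error / reference * 100
        return pd.DataFrame(
            {
                "error": error,
                "rel_error": rel_error,
                "abs_error": error.abs(),
                "abs_rel_error": rel_error.abs(),
            }
        )


    # Invented cadence values in steps/min.
    example_metrics = cadence_error_metrics(
        detected=pd.Series([95.0, 110.0]),
        reference=pd.Series([100.0, 100.0]),
    )
    print(example_metrics)

A negative error in this convention means that the wearable device underestimates the cadence relative to the reference system.
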
.. GENERATED FROM PYTHON SOURCE LINES 1018-1022

Scatter plot

The results below show the detected and reference cadence values across all participants within a cohort.
The correlation coefficient, the p-value, and the confidence intervals of the regression line are shown in the plot.
Each datapoint represents one trial.

.. GENERATED FROM PYTHON SOURCE LINES 1022-1081

.. code-block:: Python

    from mobgap.plotting import calc_min_max_with_margin


    def combo_scatter_plot(data, name=None):
        name = name or data.name
        fig, axs = plt.subplots(
            ncols=2,
            sharey=True,
            sharex=True,
            figsize=(12, 8),
            constrained_layout=True,
        )
        fig.suptitle(name)
        min_max = calc_min_max_with_margin(
            data["cadence_spm__reference"],
            data["cadence_spm__detected"],
        )
        for (version, subdata), ax in zip(data.groupby("version"), axs):
            subdata = subdata[
                [
                    "cadence_spm__reference",
                    "cadence_spm__detected",
                    "cohort",
                ]
            ].dropna(how="any")
            sns.scatterplot(
                subdata,
                x="cadence_spm__reference",
                y="cadence_spm__detected",
                hue="cohort",
                ax=ax,
                legend=ax == axs[-1],
            )
            plot_regline(
                subdata["cadence_spm__reference"],
                subdata["cadence_spm__detected"],
                ax=ax,
            )
            make_square(ax, min_max, draw_diagonal=True)
            ax.set_title(version)
            ax.set_xlabel("Reference [steps/min]")
            ax.set_ylabel("Detected [steps/min]")
            ax.tick_params(axis="both", labelsize=20)
        move_legend_outside(fig, axs[-1])
        plt.show()


    laboratory_results_combined.query('algo == "Mobilise-D Pipeline"').pipe(
        combo_scatter_plot, name="Mobilise-D Pipeline - Cadence"
    )

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_015.png
   :alt: Mobilise-D Pipeline - Cadence, MobGap, Original Implementation
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_015.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 1082-1104

Matched/True Positive Evaluation
********************************
The "Matched" Evaluation directly compares the performance of cadence estimation on only the WBs that were detected in
both systems (true positives).
WBs were included in the true positive analysis if there was an overlap of more than 80% between the WBs detected by
the two systems (details about the selection of this threshold can be found in [1]_).
The threshold of 80% was selected as a trade-off that allows us
(i) to keep the comparison between selected WBs (INDIP vs. wearable device) as like-for-like as possible, and at the
same time
(ii) to include the minimum number of WBs required for sufficient statistical power in the analyses
(i.e., at least 101 walking bouts for each cohort).
This target was based on the number of WBs rather than on a percentage of the total walking bouts, so that the criteria
established by statistical experts for robust statistical analysis after sample-size re-evaluation could be met
(total WB number > 101, corresponding to ICC > 0.7 and a CI = 0.2).

.. note::
   Compared to the results published in [1]_, the primary analysis on the matched results is performed on the average
   performance metrics across all matched WBs **per trial**.
   The original publication considered the average performance metrics across all matched WBs without additional
   aggregation.

Results across all cohorts
~~~~~~~~~~~~~~~~~~~~~~~~~~
The results below represent the average performance across all participants independent of the
cohort in terms of error, relative error, absolute error, and absolute relative error.

.. GENERATED FROM PYTHON SOURCE LINES 1104-1106

.. code-block:: Python

    laboratory_results_matched.pipe(multi_metric_plot, metrics, 2, 2)

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_016.png
   :alt: Abs. Rel. Error (%), Error (steps/min), Rel. Error (%), Abs. Error (steps/min)
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_016.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 1107-1109

As each pipeline version produces different WBs, it is important to compare the number of matched WBs to put all
other metrics into perspective.

.. GENERATED FROM PYTHON SOURCE LINES 1109-1120

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(12, 6))
    sns.barplot(
        data=laboratory_results_matched.groupby(["version"])["n_matched_wbs"]
        .sum()
        .reset_index(),
        x="version",
        y="n_matched_wbs",
        ax=ax,
    )
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_017.png
   :alt: 02 pipeline cad
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_017.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 1121-1140

.. code-block:: Python

    laboratory_matched_perf_metrics_all = laboratory_results_matched.pipe(
        multilevel_groupby_apply_merge,
        [
            (
                ["algo", "version"],
                partial(apply_aggregations, aggregations=custom_aggs_matched),
            ),
            (
                ["algo"],
                partial(apply_transformations, transformations=stats_transform),
            ),
        ],
    ).pipe(format_tables_matched)
    laboratory_matched_perf_metrics_all.style.pipe(
        revalidation_table_styles,
        validation_thresholds,
        ["algo"],
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]
    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]

.. raw:: html

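    <table>
      <thead>
        <tr><th></th><th></th><th># participants</th><th>WD mean and CI [steps/min]</th><th>INDIP mean and CI [steps/min]</th><th>Bias and LoA [steps/min]</th><th>Abs. Error [steps/min]</th><th>Rel. Error [%]</th><th>Abs. Rel. Error [%]</th><th>ICC</th><th># Matched WBs</th></tr>
        <tr><th>algo</th><th>version</th><th colspan="9"></th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>1169</td><td>94.25 [93.55, 94.95]</td><td>95.28 [94.47, 96.08]</td><td>-1.16 [-12.98, 10.66]</td><td>3.16 [2.86, 3.47]*</td><td>-0.79 [-1.14, -0.44]</td><td>3.30 [3.00, 3.60]*</td><td>0.89 [0.87, 0.91]</td><td>675</td></tr>
        <tr><th>Original Implementation</th><td>1169</td><td>94.80 [94.06, 95.53]</td><td>96.60 [95.76, 97.44]</td><td>-1.80 [-14.56, 10.95]</td><td>3.86 [3.54, 4.19]</td><td>-1.38 [-1.76, -1.00]</td><td>3.95 [3.64, 4.27]</td><td>0.88 [0.85, 0.90]</td><td>715</td></tr>
      </tbody>
    </table>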


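The matching criterion described for this evaluation (an overlap of more than 80% between a detected and a reference WB) can be illustrated with a small sketch. Here the overlap is assumed to be the intersection over union of the two intervals, which is only one possible definition and not necessarily the one used by the pipeline. The function names and the example intervals are hypothetical.

.. code-block:: Python

    def overlap_ratio(a: tuple[int, int], b: tuple[int, int]) -> float:
        """Intersection over union of two (start, end) intervals given in samples."""
        intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
        union = max(a[1], b[1]) - min(a[0], b[0])
        return intersection / union if union > 0 else 0.0


    def match_wbs(detected, reference, threshold=0.8):
        """Return (detected index, reference index) pairs whose overlap exceeds the threshold."""
        matches = []
        for d_idx, d in enumerate(detected):
            for r_idx, r in enumerate(reference):
                if overlap_ratio(d, r) > threshold:
                    matches.append((d_idx, r_idx))
        return matches


    # Invented WBs as (start, end) in samples.
    detected = [(0, 1000), (2000, 2500)]
    reference = [(50, 1000), (2000, 3000)]
    matches = match_wbs(detected, reference)
    print(matches)

In this example only the first pair is matched (overlap 0.95). The second detected WB covers just half of its reference WB (overlap 0.5) and would be excluded from the true positive analysis.
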
.. GENERATED FROM PYTHON SOURCE LINES 1141-1142

Residual plot

.. GENERATED FROM PYTHON SOURCE LINES 1142-1145

.. code-block:: Python

    laboratory_results_matched.query('algo == "Mobilise-D Pipeline"').pipe(
        combo_residual_plot, name="Matched WBs - Cadence"
    )

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_018.png
   :alt: Matched WBs - Cadence, MobGap, Original Implementation
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_018.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 1146-1151

Per-cohort analysis
~~~~~~~~~~~~~~~~~~~
The first plot below shows the number of matched WBs per cohort and pipeline version.
The boxplot that follows represents the average absolute error on cadence estimation across all participants within a
cohort.

.. GENERATED FROM PYTHON SOURCE LINES 1151-1166

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(12, 6))
    sns.barplot(
        data=laboratory_results_matched.groupby(["version", "cohort"])[
            "n_matched_wbs"
        ]
        .sum()
        .reset_index(),
        hue="version",
        y="n_matched_wbs",
        x="cohort",
        order=cohort_order,
        ax=ax,
    )
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_019.png
   :alt: 02 pipeline cad
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_019.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 1167-1179

.. code-block:: Python

    fig, ax = plt.subplots(figsize=(12, 6))
    sns.boxplot(
        data=laboratory_results_matched,
        x="cohort",
        y="cadence_spm__abs_error",
        hue="algo_with_version",
        order=cohort_order,
        ax=ax,
    ).legend().set_title(None)
    ax.set_ylabel("Absolute Error [steps/min]")
    ax.set_title("Absolute Error - Matched Analysis")
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_020.png
   :alt: Absolute Error - Matched Analysis
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_020.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 1180-1181

Processing the per-cohort performance table

.. GENERATED FROM PYTHON SOURCE LINES 1181-1204

.. code-block:: Python

    laboratory_matched_perf_metrics_cohort = (
        laboratory_results_matched.pipe(
            multilevel_groupby_apply_merge,
            [
                (
                    ["cohort", "algo", "version"],
                    partial(apply_aggregations, aggregations=custom_aggs_matched),
                ),
                (
                    ["cohort", "algo"],
                    partial(apply_transformations, transformations=stats_transform),
                ),
            ],
        )
        .pipe(format_tables_matched)
        .loc[cohort_order]
    )
    laboratory_matched_perf_metrics_cohort.style.pipe(
        revalidation_table_styles,
        validation_thresholds,
        ["cohort", "algo"],
    )

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]
    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/src/mobgap/utils/df_operations.py:703: FutureWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
      results = [df.groupby(key).apply(func, **apply_kwargs) for key, func in groupbys]

.. raw:: html

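    <table>
      <thead>
        <tr><th></th><th></th><th></th><th># participants</th><th>WD mean and CI [steps/min]</th><th>INDIP mean and CI [steps/min]</th><th>Bias and LoA [steps/min]</th><th>Abs. Error [steps/min]</th><th>Rel. Error [%]</th><th>Abs. Rel. Error [%]</th><th>ICC</th><th># Matched WBs</th></tr>
        <tr><th>cohort</th><th>algo</th><th>version</th><th colspan="9"></th></tr>
      </thead>
      <tbody>
        <tr><th rowspan="2">HA</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>227</td><td>97.51 [96.11, 98.91]</td><td>99.00 [97.38, 100.62]</td><td>-1.88 [-11.93, 8.16]</td><td>3.38 [2.82, 3.93]*</td><td>-1.61 [-2.37, -0.85]</td><td>3.50 [2.86, 4.14]</td><td>0.89 [0.81, 0.93]</td><td>80</td></tr>
        <tr><th>Original Implementation</th><td>227</td><td>95.33 [93.95, 96.72]</td><td>99.82 [98.22, 101.43]</td><td>-4.49 [-15.48, 6.49]</td><td>4.94 [4.27, 5.62]</td><td>-4.20 [-4.86, -3.54]</td><td>4.72 [4.12, 5.32]</td><td>0.82 [0.47, 0.92]</td><td>102</td></tr>
        <tr><th rowspan="2">CHF</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>106</td><td>92.12 [89.98, 94.26]</td><td>92.10 [89.17, 95.03]</td><td>-1.14 [-12.62, 10.35]</td><td>3.61 [2.71, 4.51]</td><td>-0.49 [-1.89, 0.90]</td><td>3.95 [2.77, 5.13]</td><td>0.90 [0.82, 0.94]</td><td>53</td></tr>
        <tr><th>Original Implementation</th><td>106</td><td>96.24 [93.59, 98.88]</td><td>99.48 [96.46, 102.49]</td><td>-3.24 [-11.66, 5.18]</td><td>3.91 [3.13, 4.70]</td><td>-2.95 [-3.66, -2.24]</td><td>3.77 [3.07, 4.47]</td><td>0.94 [0.77, 0.97]</td><td>60</td></tr>
        <tr><th rowspan="2">COPD</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>214</td><td>96.25 [94.84, 97.66]</td><td>97.98 [96.42, 99.54]</td><td>-1.80 [-10.51, 6.90]</td><td>3.09 [2.61, 3.58]</td><td>-1.60 [-2.34, -0.85]</td><td>3.28 [2.65, 3.92]</td><td>0.91 [0.84, 0.95]</td><td>93</td></tr>
        <tr><th>Original Implementation</th><td>214</td><td>94.97 [93.46, 96.48]</td><td>97.89 [96.24, 99.53]</td><td>-2.92 [-9.37, 3.54]</td><td>3.35 [2.97, 3.73]</td><td>-2.86 [-3.28, -2.43]</td><td>3.35 [3.00, 3.70]</td><td>0.93 [0.69, 0.97]</td><td>106</td></tr>
        <tr><th rowspan="2">MS</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>228</td><td>93.49 [91.72, 95.27]</td><td>94.64 [92.70, 96.58]</td><td>-1.15 [-16.47, 14.17]</td><td>3.47 [2.52, 4.41]</td><td>-0.67 [-1.62, 0.29]</td><td>3.56 [2.69, 4.43]</td><td>0.85 [0.80, 0.89]</td><td>176</td></tr>
        <tr><th>Original Implementation</th><td>228</td><td>94.68 [92.81, 96.54]</td><td>95.65 [93.61, 97.68]</td><td>-0.97 [-17.41, 15.47]</td><td>3.96 [2.99, 4.93]</td><td>-0.37 [-1.45, 0.70]</td><td>4.17 [3.23, 5.10]</td><td>0.84 [0.79, 0.88]</td><td>182</td></tr>
        <tr><th rowspan="2">PD</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>225</td><td>92.94 [91.45, 94.42]</td><td>93.17 [91.52, 94.82]</td><td>-0.23 [-7.38, 6.91]</td><td>2.47 [2.12, 2.83]*</td><td>0.00 [-0.55, 0.56]</td><td>2.77 [2.34, 3.19]</td><td>0.95 [0.94, 0.97]</td><td>151</td></tr>
        <tr><th>Original Implementation</th><td>225</td><td>93.35 [91.81, 94.89]</td><td>93.48 [91.76, 95.19]</td><td>-0.12 [-8.34, 8.09]</td><td>3.18 [2.80, 3.56]</td><td>0.18 [-0.45, 0.81]</td><td>3.52 [3.07, 3.97]</td><td>0.94 [0.92, 0.96]</td><td>142</td></tr>
        <tr><th rowspan="2">PFF</th><th rowspan="2">Mobilise-D Pipeline</th><th>MobGap</th><td>169</td><td>94.14 [92.13, 96.15]</td><td>95.52 [93.16, 97.87]</td><td>-1.38 [-15.23, 12.48]</td><td>3.31 [2.34, 4.29]</td><td>-0.92 [-1.81, -0.03]</td><td>3.19 [2.41, 3.97]</td><td>0.88 [0.83, 0.92]</td><td>122</td></tr>
        <tr><th>Original Implementation</th><td>169</td><td>95.28 [93.22, 97.34]</td><td>96.21 [93.69, 98.72]</td><td>-0.93 [-16.91, 15.06]</td><td>4.01 [2.93, 5.10]</td><td>-0.13 [-1.44, 1.18]</td><td>4.11 [2.96, 5.27]</td><td>0.86 [0.80, 0.90]</td><td>123</td></tr>
      </tbody>
    </table>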


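The WB-duration analyses on this page group WBs into overlapping bins (for example, a 20 s WB counts towards "All", "> 10 s", and "10 - 30 s"). The sketch below illustrates this idea with plain pandas. It is a hypothetical stand-in and not the implementation of ``cut_into_overlapping_bins``. In particular, the left-open, right-closed bin edges are an assumption.

.. code-block:: Python

    import numpy as np
    import pandas as pd


    def assign_overlapping_bins(df: pd.DataFrame, column: str, bins: dict) -> pd.DataFrame:
        """Duplicate each row into every bin whose (lower, upper] range contains its value.

        Hypothetical helper for illustration only.
        """
        parts = []
        for label, (lower, upper) in bins.items():
            in_bin = df[(df[column] > lower) & (df[column] <= upper)]
            parts.append(in_bin.assign(bin=label))
        return pd.concat(parts, ignore_index=True)


    # Invented WB durations in seconds.
    durations = pd.DataFrame({"duration_s": [5.0, 10.0, 20.0, 150.0]})
    example_bins = {
        "All": (-np.inf, np.inf),
        "> 10 s": (10, np.inf),
        "<= 10 s": (0, 10),
    }
    binned = assign_overlapping_bins(durations, "duration_s", example_bins)
    print(binned["bin"].value_counts())

Because the bins overlap, the binned table has more rows than the input. This is why the WB counts in the bar chart do not add up to the total number of WBs.
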
.. GENERATED FROM PYTHON SOURCE LINES 1205-1213

Deep dive investigation: Do errors depend on WB duration or walking speed?
**************************************************************************

Effect of WB duration
~~~~~~~~~~~~~~~~~~~~~
We investigate how the absolute cadence error of all true-positive WBs from the laboratory recordings depends on the
WB duration reported by the reference system.
In the top panel, WB errors are grouped into duration bins.
The bottom panel shows the number of WBs within each duration bin.

.. GENERATED FROM PYTHON SOURCE LINES 1213-1269

.. code-block:: Python

    import numpy as np


    def plot_wb_duration_analysis(df):
        """Generates a single figure with:
        - First row: Grouped boxplots of the absolute cadence error per pipeline version.
        - Second row: A grouped bar chart comparing WB counts per pipeline version.

        df: DataFrame containing a 'version' column (e.g. "MobGap" or
        "Original Implementation") to distinguish data
        """
        fig, axs = plt.subplot_mosaic(
            [["v"], ["v"], ["v"], ["n"]], sharex=True, figsize=(12, 9)
        )
        # Compute WB durations in seconds (start/end are in samples; the division
        # by 100 assumes a sampling rate of 100 Hz)
        df_with_durations = df.assign(
            duration_s=lambda df_: (df_["end__reference"] - df_["start__reference"])
            / 100
        )
        bins = {
            "All": (-np.inf, np.inf),
            "> 10 s": (10, np.inf),
            "<= 10 s": (0, 10),
            "10 - 30 s": (10, 30),
            "30 - 60 s": (30, 60),
            "60 - 120 s": (60, 120),
            "> 120 s": (120, np.inf),
        }
        binned_df = cut_into_overlapping_bins(
            df_with_durations, "duration_s", bins
        ).reset_index()
        n = sns.countplot(
            data=binned_df, x="bin", hue="version", ax=axs["n"], legend=False
        )
        for container in n.containers:
            n.bar_label(container, size=10)
        sns.boxplot(
            data=binned_df,
            x="bin",
            y="cadence_spm__abs_error",
            hue="version",
            ax=axs["v"],
        )
        sns.despine(fig)
        axs["v"].set_ylabel("Absolute Cadence Error (steps/min)")
        axs["n"].set_ylabel("WB Count")
        axs["n"].set_xlabel("Ref. WB Duration")
        fig.show()


    laboratory_results_matched_raw.query("algo == 'Mobilise-D Pipeline'").pipe(
        plot_wb_duration_analysis
    )

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_021.png
   :alt: 02 pipeline cad
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_021.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 1270-1277

Effect of walking speed on error
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
One important aspect of algorithm performance is its dependency on walking speed,
i.e., how well the algorithms perform at different walking speeds.
For this, we plot the absolute cadence error against the walking speed of the reference data.
For better granularity, we use the values per WB instead of the aggregates per participant.
The overlaid dots represent the trend line, calculated as the median of the absolute error within
walking-speed bins of 0.05 m/s.

.. GENERATED FROM PYTHON SOURCE LINES 1277-1367

.. code-block:: Python

    # For plotting all participants at the end
    laboratory_combined = laboratory_results_matched_raw.copy()
    laboratory_combined["cohort"] = "Combined"
    ws_level_results = pd.concat(
        [laboratory_results_matched_raw, laboratory_combined]
    ).reset_index(drop=True)
    algo_names = ws_level_results["algo_with_version"].unique()
    cohort_names = ws_level_results["cohort"].unique()
    ws_level_results["cohort"] = pd.Categorical(
        ws_level_results["cohort"], categories=cohort_names, ordered=True
    )
    ws_level_results["algo_with_version"] = pd.Categorical(
        ws_level_results["algo_with_version"], categories=algo_names, ordered=True
    )
    # Create the figure with subplots
    fig = plt.figure(constrained_layout=True, figsize=(24, 5 * len(algo_names)))
    subfigs = fig.subfigures(len(algo_names), 1, wspace=0.1, hspace=0.1)
    # Define the min and max limits for x and y axes
    min_max_x = calc_min_max_with_margin(
        ws_level_results["walking_speed_mps__reference"]
    )
    min_max_y = calc_min_max_with_margin(ws_level_results["cadence_spm__abs_error"])
    # Plotting each algorithm version
    for subfig, (algo, data) in zip(
        subfigs, ws_level_results.groupby("algo_with_version", observed=True)
    ):
        subfig.suptitle(algo)
        subfig.supxlabel("Walking Speed (m/s)")
        subfig.supylabel("Absolute Error (steps/min)")
        # Create subplots for each cohort
        axs = subfig.subplots(1, len(cohort_names), sharex=True, sharey=True)
        for ax, (cohort, cohort_data) in zip(
            axs, data.groupby("cohort", observed=True)
        ):
            # Scatter plot for the cohort data
            sns.scatterplot(
                data=cohort_data,
                x="walking_speed_mps__reference",  # Reference walking speed
                y="cadence_spm__abs_error",  # Absolute error
                ax=ax,
                alpha=0.3,
            )
            # Define bins for walking speed
            bins = np.arange(
                0, cohort_data["walking_speed_mps__reference"].max() + 0.05, 0.05
            )
            cohort_data["speed_bin"] = pd.cut(
                cohort_data["walking_speed_mps__reference"], bins=bins
            )
            # Calculate bin centers
            cohort_data["bin_center"] = cohort_data["speed_bin"].apply(
                lambda x: x.mid
            )
            # Calculate median error per bin and cohort
            binned_data = (
                cohort_data.groupby("bin_center", observed=True)[
                    "cadence_spm__abs_error"
                ]
                .median()
                .reset_index()
            )
            # Plot the median error of each bin as overlaid dots
            sns.scatterplot(
                data=binned_data,
                x="bin_center",
                y="cadence_spm__abs_error",  # Median error
                ax=ax,
            )
            ax.set_title(cohort)
            ax.set_xlabel(None)
            ax.set_ylabel(None)
            # Set axis limits
            ax.set_xlim(*min_max_x)
            ax.set_ylim(*min_max_y)
    fig.show()

.. image-sg:: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_022.png
   :alt: CHF, COPD, HA, MS, PD, PFF, Combined, CHF, COPD, HA, MS, PD, PFF, Combined
   :srcset: /auto_revalidation/full_pipeline/images/sphx_glr__02_pipeline_cad_022.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 20.286 seconds)

**Estimated memory usage:** 91 MB

.. _sphx_glr_download_auto_revalidation_full_pipeline__02_pipeline_cad.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: _02_pipeline_cad.ipynb <_02_pipeline_cad.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: _02_pipeline_cad.py <_02_pipeline_cad.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: _02_pipeline_cad.zip <_02_pipeline_cad.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_