.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_revalidation/laterality/_01_lrc_analysis.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_revalidation_laterality__01_lrc_analysis.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_revalidation_laterality__01_lrc_analysis.py:

.. _lrc_val_results:

Performance of the laterality classification algorithms on the TVS dataset
==========================================================================

.. warning:: On this page you will find preliminary results for a standardized revalidation of the pipeline and all
  of its algorithms.
  The current state is **TECHNICAL EXPERIMENTATION**.
  Don't use these results or make any assumptions based on them.
  We will update this page incrementally and provide further information as soon as the state of any of the
  validation steps changes.

The following provides an analysis and comparison of the laterality classification algorithms on the TVS dataset
(lab and free-living).
We look into the actual performance of the algorithms compared to the reference data.
Unlike the other revalidation scripts, this one does not load the old "matlab" results, as there are no old
results.
The laterality algorithm by Ullrich et al. was validated independently and was already written in Python.
The implemented version follows the old version very closely.

The goal of this revalidation is to validate the re-trained model (with the updated training code) on the TVS
dataset.
We compare it against the old model and the McCamley algorithm.

.. note:: If you are interested in how these results are calculated, head over to the
    :ref:`processing page `.

.. GENERATED FROM PYTHON SOURCE LINES 30-36

Below is the list of algorithms that we will compare.
Note that we use the label "MobGap" to refer to the newly trained model, while "Original Implementation" refers to
the models trained as part of previous work.
We compare all the available models.
For context, the "MS_ALL" models are used by default in the pipelines.
For the McCamley algorithm, only a single version exists.

.. GENERATED FROM PYTHON SOURCE LINES 36-44

.. code-block:: Python

    algorithms = {
        "McCamley": ("McCamley", "-"),
        "UllrichOld__ms_all": ("Ullrich - MS-ALL", "Original Implementation"),
        "UllrichOld__ms_ms": ("Ullrich - MS-MS", "Original Implementation"),
        "UllrichNew__ms_all": ("Ullrich - MS-ALL", "MobGap"),
    }

.. GENERATED FROM PYTHON SOURCE LINES 45-52

The code below loads the data and prepares it for the analysis.
By default, the data will be downloaded from an online repository (and cached locally).
If you want to use a local copy of the data, you can set the `MOBGAP_VALIDATION_DATA_PATH` environment variable
and set `MOBGAP_VALIDATION_USE_LOCAL_DATA` to `1`.

The file download will print some log information, which can usually be ignored.
You can also change the `version` parameter to load a different version of the data.

.. GENERATED FROM PYTHON SOURCE LINES 52-124

.. code-block:: Python

    from pathlib import Path

    import pandas as pd
    from mobgap.data.validation_results import ValidationResultLoader
    from mobgap.utils.misc import get_env_var


    def format_loaded_results(
        values: dict[tuple[str, str], pd.DataFrame],
        index_cols: list[str],
    ) -> pd.DataFrame:
        formatted = (
            pd.concat(values, names=["algo", "version", *index_cols])
            .reset_index()
            .assign(
                algo_with_version=lambda df: df["algo"]
                + " ("
                + df["version"]
                + ")",
                _combined="combined",
            )
        )
        return formatted


    local_data_path = (
        Path(get_env_var("MOBGAP_VALIDATION_DATA_PATH")) / "results"
        if int(get_env_var("MOBGAP_VALIDATION_USE_LOCAL_DATA", 0))
        else None
    )
    __RESULT_VERSION = "v0.11.0"
    loader = ValidationResultLoader(
        "lrc", result_path=local_data_path, version=__RESULT_VERSION
    )

    free_living_index_cols = [
        "cohort",
        "participant_id",
        "time_measure",
        "recording",
        "recording_name",
        "recording_name_pretty",
    ]

    free_living_results = format_loaded_results(
        {
            v: loader.load_single_results(k, "free_living")
            for k, v in algorithms.items()
        },
        free_living_index_cols,
    )

    lab_index_cols = [
        "cohort",
        "participant_id",
        "time_measure",
        "test",
        "trial",
        "test_name",
        "test_name_pretty",
    ]

    lab_results = format_loaded_results(
        {
            v: loader.load_single_results(k, "laboratory")
            for k, v in algorithms.items()
        },
        lab_index_cols,
    )

    cohort_order = ["HA", "CHF", "COPD", "MS", "PD", "PFF"]

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    0%| | 0.00/2.44k [00:00

.. code-block:: Python

    # ...


    def format_tables(df: pd.DataFrame) -> pd.DataFrame:
        return (
            df.pipe(apply_transformations, format_transforms)
            .rename(columns=final_names)
            .loc[:, list(final_names.values())]
        )

.. GENERATED FROM PYTHON SOURCE LINES 202-210

Free-Living Comparison
----------------------
We focus on the free-living data for the comparison, as this is the expected use case for the algorithms.

All results across all cohorts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The results below represent the average performance across all participants, independent of the cohort.

.. GENERATED FROM PYTHON SOURCE LINES 210-236

.. code-block:: Python

    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots()
    sns.boxplot(
        data=free_living_results, x="algo_with_version", y="accuracy", ax=ax
    )
    fig.show()

    fig, ax = plt.subplots()
    sns.boxplot(
        data=free_living_results,
        x="algo_with_version",
        y="accuracy_pairwise",
        ax=ax,
    )
    fig.show()

    perf_metrics_all = (
        free_living_results.groupby(["algo", "version"])
        .apply(apply_aggregations, custom_aggs, include_groups=False)
        .pipe(format_tables)
    )
    perf_metrics_all.style.pipe(
        revalidation_table_styles, validation_thresholds, ["algo"]
    )

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_001.png
         :alt: 01 lrc analysis
         :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_001.png
         :class: sphx-glr-single-img

    *

      .. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_002.png
         :alt: 01 lrc analysis
         :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_002.png
         :class: sphx-glr-single-img

.. csv-table::
   :header: "algo", "version", "# participants", "Accuracy", "Accuracy IC-pairs"

   "McCamley", "-", 101, "0.78 [0.75, 0.82]", "0.77 [0.75, 0.79]"
   "Ullrich - MS-ALL", "MobGap", 101, "0.80 [0.77, 0.84]", "0.81 [0.79, 0.83]"
   "Ullrich - MS-ALL", "Original Implementation", 101, "0.77 [0.74, 0.81]", "0.74 [0.71, 0.77]"
   "Ullrich - MS-MS", "Original Implementation", 101, "0.76 [0.72, 0.80]", "0.76 [0.73, 0.78]"

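The table cells above have the form "mean [lower, upper]".
The aggregation code that produces them is not part of this section, so the following is only a minimal sketch of how such a cell could be built, assuming a normal-approximation 95% interval around the mean.
The helper `mean_with_ci` and the toy numbers are hypothetical and are not taken from the TVS results or from the mobgap code base.

.. code-block:: Python

    import numpy as np
    import pandas as pd


    def mean_with_ci(values: pd.Series, z: float = 1.96) -> str:
        """Format values as "mean [lower, upper]" (normal-approximation CI)."""
        values = values.dropna()
        mean = values.mean()
        # Standard error of the mean, based on the sample standard deviation.
        sem = values.std(ddof=1) / np.sqrt(len(values))
        return f"{mean:.2f} [{mean - z * sem:.2f}, {mean + z * sem:.2f}]"


    # Toy per-participant accuracies (made-up numbers for illustration only).
    toy_results = pd.DataFrame(
        {
            "algo": ["A", "A", "A", "A", "B", "B", "B", "B"],
            "accuracy": [0.70, 0.80, 0.90, 0.80, 0.60, 0.60, 0.80, 0.80],
        }
    )

    # One row per algorithm: number of participants and the formatted metric.
    summary = toy_results.groupby("algo")["accuracy"].agg(
        n_participants="count", accuracy=mean_with_ci
    )
    print(summary)

Other interval definitions (for example a t-based or bootstrapped interval) would give slightly different bounds, in particular for the small cohorts.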
.. GENERATED FROM PYTHON SOURCE LINES 237-240

Per Cohort
~~~~~~~~~~
The results below represent the average performance across all participants within a cohort.

.. GENERATED FROM PYTHON SOURCE LINES 240-272

.. code-block:: Python

    fig, ax = plt.subplots()
    sns.boxplot(
        data=free_living_results,
        x="cohort",
        y="accuracy",
        hue="algo_with_version",
        order=cohort_order,
        ax=ax,
    )
    ax.set_title("Accuracy")
    fig.show()

    fig, ax = plt.subplots()
    sns.boxplot(
        data=free_living_results,
        x="cohort",
        y="accuracy_pairwise",
        hue="algo_with_version",
        order=cohort_order,
        ax=ax,
    )
    ax.set_title("Accuracy IC-pairs")
    fig.show()

    perf_metrics_cohort = (
        free_living_results.groupby(["cohort", "algo", "version"])
        .apply(apply_aggregations, custom_aggs, include_groups=False)
        .pipe(format_tables)
        .loc[cohort_order]
    )
    perf_metrics_cohort.style.pipe(
        revalidation_table_styles, validation_thresholds, ["cohort", "algo"]
    )

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_003.png
         :alt: Accuracy
         :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_003.png
         :class: sphx-glr-single-img

    *

      .. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_004.png
         :alt: Accuracy IC-pairs
         :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_004.png
         :class: sphx-glr-single-img

.. csv-table::
   :header: "cohort", "algo", "version", "# participants", "Accuracy", "Accuracy IC-pairs"

   "HA", "McCamley", "-", 20, "0.85 [0.80, 0.90]", "0.81 [0.76, 0.86]"
   "HA", "Ullrich - MS-ALL", "MobGap", 20, "0.86 [0.81, 0.90]", "0.84 [0.80, 0.88]"
   "HA", "Ullrich - MS-ALL", "Original Implementation", 20, "0.84 [0.79, 0.88]", "0.78 [0.73, 0.84]"
   "HA", "Ullrich - MS-MS", "Original Implementation", 20, "0.82 [0.76, 0.87]", "0.78 [0.73, 0.83]"
   "CHF", "McCamley", "-", 10, "0.82 [0.73, 0.91]", "0.80 [0.72, 0.88]"
   "CHF", "Ullrich - MS-ALL", "MobGap", 10, "0.83 [0.73, 0.93]", "0.85 [0.80, 0.90]"
   "CHF", "Ullrich - MS-ALL", "Original Implementation", 10, "0.80 [0.71, 0.90]", "0.79 [0.72, 0.85]"
   "CHF", "Ullrich - MS-MS", "Original Implementation", 10, "0.72 [0.61, 0.84]", "0.72 [0.63, 0.82]"
   "COPD", "McCamley", "-", 17, "0.70 [0.62, 0.78]", "0.71 [0.66, 0.76]"
   "COPD", "Ullrich - MS-ALL", "MobGap", 17, "0.76 [0.68, 0.84]", "0.78 [0.74, 0.82]"
   "COPD", "Ullrich - MS-ALL", "Original Implementation", 17, "0.70 [0.62, 0.78]", "0.69 [0.64, 0.73]"
   "COPD", "Ullrich - MS-MS", "Original Implementation", 17, "0.76 [0.70, 0.83]", "0.74 [0.70, 0.78]"
   "MS", "McCamley", "-", 18, "0.78 [0.68, 0.87]", "0.79 [0.74, 0.84]"
   "MS", "Ullrich - MS-ALL", "MobGap", 18, "0.79 [0.69, 0.88]", "0.82 [0.77, 0.86]"
   "MS", "Ullrich - MS-ALL", "Original Implementation", 18, "0.75 [0.66, 0.84]", "0.73 [0.67, 0.79]"
   "MS", "Ullrich - MS-MS", "Original Implementation", 18, "0.75 [0.65, 0.86]", "0.80 [0.75, 0.85]"
   "PD", "McCamley", "-", 19, "0.79 [0.71, 0.87]", "0.78 [0.72, 0.84]"
   "PD", "Ullrich - MS-ALL", "MobGap", 19, "0.81 [0.71, 0.91]", "0.83 [0.76, 0.90]"
   "PD", "Ullrich - MS-ALL", "Original Implementation", 19, "0.79 [0.70, 0.88]", "0.78 [0.70, 0.85]"
   "PD", "Ullrich - MS-MS", "Original Implementation", 19, "0.78 [0.68, 0.87]", "0.78 [0.71, 0.84]"
   "PFF", "McCamley", "-", 17, "0.75 [0.67, 0.84]", "0.73 [0.68, 0.79]"
   "PFF", "Ullrich - MS-ALL", "MobGap", 17, "0.76 [0.65, 0.86]", "0.76 [0.69, 0.83]"
   "PFF", "Ullrich - MS-ALL", "Original Implementation", 17, "0.75 [0.65, 0.84]", "0.70 [0.61, 0.79]"
   "PFF", "Ullrich - MS-MS", "Original Implementation", 17, "0.70 [0.60, 0.81]", "0.70 [0.62, 0.77]"

.. GENERATED FROM PYTHON SOURCE LINES 273-277

Deep Dive Analysis of Main Algorithms
-------------------------------------
Below, we show the direct correlation between the results from the old and the new implementation.
Each datapoint represents one participant.

.. GENERATED FROM PYTHON SOURCE LINES 277-325

.. code-block:: Python

    from mobgap.plotting import (
        calc_min_max_with_margin,
        make_square,
        move_legend_outside,
        plot_regline,
    )


    def compare_scatter_plot(data, name):
        fig, ax = plt.subplots(figsize=(8, 8), constrained_layout=True)
        reformated_data = (
            data.pivot_table(
                values="accuracy",
                index=("cohort", "participant_id"),
                columns="version",
            )
            .reset_index()
            .dropna(how="any")
        )

        min_max = calc_min_max_with_margin(
            reformated_data["Original Implementation"], reformated_data["MobGap"]
        )

        sns.scatterplot(
            reformated_data,
            x="Original Implementation",
            y="MobGap",
            hue="cohort",
            ax=ax,
        )
        plot_regline(
            reformated_data["Original Implementation"],
            reformated_data["MobGap"],
            ax=ax,
        )

        make_square(ax, min_max, draw_diagonal=True)
        move_legend_outside(fig, ax)
        ax.set_title(name)
        ax.set_xlabel("Original Implementation")
        ax.set_ylabel("MobGap")
        plt.tight_layout()
        plt.show()


    free_living_results.query("algo == 'Ullrich - MS-ALL'").pipe(
        compare_scatter_plot, "Ullrich - MS-ALL"
    )

.. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_005.png
   :alt: Ullrich - MS-ALL
   :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_005.png
   :class: sphx-glr-single-img

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.11.0/revalidation/laterality/_01_lrc_analysis.py:317: UserWarning: The figure layout has changed to tight
      plt.tight_layout()

.. GENERATED FROM PYTHON SOURCE LINES 326-333

Conclusion Free-Living
~~~~~~~~~~~~~~~~~~~~~~
It is good to see that the new version of the algorithm performs slightly better than the old version.
However, it is unclear why the new model is different, as we used almost the same pipeline and the same data.
The non-ML algorithm (McCamley) performs surprisingly well, and much better than in the tests we did as part of
Mobilise-D.

Overall, the performance is not as good as we would like it to be, in particular for a couple of participants,
where the accuracy is as low as 0.1.

.. GENERATED FROM PYTHON SOURCE LINES 335-343

Laboratory Comparison
---------------------
Every datapoint below is one trial of a test.
Note that each datapoint is weighted equally in the calculation of the performance metrics.
This is a limitation of this simple approach, as the number of strides per trial and the complexity of the context
can vary significantly.
For a full picture, different groups of tests should be analyzed separately.
The approach below should still provide a good overview to compare the algorithms.

.. GENERATED FROM PYTHON SOURCE LINES 343-361

.. code-block:: Python

    fig, ax = plt.subplots()
    sns.boxplot(data=lab_results, x="algo_with_version", y="accuracy", ax=ax)
    fig.show()

    fig, ax = plt.subplots()
    sns.boxplot(
        data=lab_results, x="algo_with_version", y="accuracy_pairwise", ax=ax
    )
    fig.show()

    perf_metrics_all = (
        lab_results.groupby(["algo", "version"])
        .apply(apply_aggregations, custom_aggs, include_groups=False)
        .pipe(format_tables)
    )
    perf_metrics_all.style.pipe(
        revalidation_table_styles, validation_thresholds, ["algo"]
    )

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_006.png
         :alt: 01 lrc analysis
         :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_006.png
         :class: sphx-glr-single-img

    *

      .. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_007.png
         :alt: 01 lrc analysis
         :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_007.png
         :class: sphx-glr-single-img

.. csv-table::
   :header: "algo", "version", "# participants", "Accuracy", "Accuracy IC-pairs"

   "McCamley", "-", 1169, "0.80 [0.79, 0.81]", "0.77 [0.76, 0.78]"
   "Ullrich - MS-ALL", "MobGap", 1169, "0.85 [0.84, 0.86]", "0.83 [0.82, 0.84]"
   "Ullrich - MS-ALL", "Original Implementation", 1169, "0.79 [0.78, 0.80]", "0.74 [0.73, 0.75]"
   "Ullrich - MS-MS", "Original Implementation", 1169, "0.78 [0.77, 0.79]", "0.75 [0.74, 0.76]"

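The equal weighting of trials noted at the start of this section can change the picture when trials differ strongly in length.
As a minimal sketch of that effect, the snippet below compares an unweighted mean with a mean weighted by the number of strides per trial.
The column `n_strides` and all numbers are hypothetical; they only illustrate the idea and are not part of the loaded results.

.. code-block:: Python

    import pandas as pd

    # Toy per-trial results (made-up numbers). `n_strides` is a hypothetical
    # column holding the number of strides in each trial.
    toy_trials = pd.DataFrame(
        {
            "algo": ["A", "A", "A", "B", "B", "B"],
            "accuracy": [1.0, 0.5, 0.9, 0.8, 0.8, 0.8],
            "n_strides": [2, 40, 8, 2, 40, 8],
        }
    )

    # Every trial counts the same, independent of its length.
    unweighted = toy_trials.groupby("algo")["accuracy"].mean()

    # Every stride counts the same: sum(accuracy * n_strides) / sum(n_strides).
    weighted_sum = (
        (toy_trials["accuracy"] * toy_trials["n_strides"])
        .groupby(toy_trials["algo"])
        .sum()
    )
    total_strides = toy_trials.groupby("algo")["n_strides"].sum()
    weighted = weighted_sum / total_strides

    comparison = pd.DataFrame({"unweighted": unweighted, "weighted": weighted})
    print(comparison)

In this toy case, algorithm "A" looks as good as "B" per trial, but clearly worse once the long trial with the low accuracy dominates the weighted average.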
.. GENERATED FROM PYTHON SOURCE LINES 362-365

Per Cohort
~~~~~~~~~~
The results below represent the average performance across all trials of all participants within a cohort.

.. GENERATED FROM PYTHON SOURCE LINES 365-396

.. code-block:: Python

    fig, ax = plt.subplots()
    sns.boxplot(
        data=lab_results,
        x="cohort",
        y="accuracy",
        hue="algo_with_version",
        order=cohort_order,
        ax=ax,
    )
    fig.show()

    fig, ax = plt.subplots()
    sns.boxplot(
        data=lab_results,
        x="cohort",
        y="accuracy_pairwise",
        hue="algo_with_version",
        order=cohort_order,
        ax=ax,
    )
    fig.show()

    perf_metrics_cohort = (
        lab_results.groupby(["cohort", "algo", "version"])
        .apply(apply_aggregations, custom_aggs, include_groups=False)
        .pipe(format_tables)
        .loc[cohort_order]
    )
    perf_metrics_cohort.style.pipe(
        revalidation_table_styles, validation_thresholds, ["cohort", "algo"]
    )

.. rst-class:: sphx-glr-horizontal

    *

      .. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_008.png
         :alt: 01 lrc analysis
         :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_008.png
         :class: sphx-glr-single-img

    *

      .. image-sg:: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_009.png
         :alt: 01 lrc analysis
         :srcset: /auto_revalidation/laterality/images/sphx_glr__01_lrc_analysis_009.png
         :class: sphx-glr-single-img

.. csv-table::
   :header: "cohort", "algo", "version", "# participants", "Accuracy", "Accuracy IC-pairs"

   "HA", "McCamley", "-", 227, "0.80 [0.77, 0.83]", "0.78 [0.75, 0.80]"
   "HA", "Ullrich - MS-ALL", "MobGap", 227, "0.82 [0.79, 0.84]", "0.82 [0.80, 0.84]"
   "HA", "Ullrich - MS-ALL", "Original Implementation", 227, "0.76 [0.73, 0.79]", "0.74 [0.72, 0.77]"
   "HA", "Ullrich - MS-MS", "Original Implementation", 227, "0.81 [0.79, 0.83]", "0.78 [0.75, 0.80]"
   "CHF", "McCamley", "-", 106, "0.85 [0.82, 0.87]", "0.80 [0.77, 0.83]"
   "CHF", "Ullrich - MS-ALL", "MobGap", 106, "0.88 [0.86, 0.90]", "0.86 [0.83, 0.88]"
   "CHF", "Ullrich - MS-ALL", "Original Implementation", 106, "0.81 [0.79, 0.84]", "0.75 [0.71, 0.79]"
   "CHF", "Ullrich - MS-MS", "Original Implementation", 106, "0.73 [0.69, 0.77]", "0.69 [0.65, 0.73]"
   "COPD", "McCamley", "-", 214, "0.80 [0.77, 0.83]", "0.77 [0.74, 0.80]"
   "COPD", "Ullrich - MS-ALL", "MobGap", 214, "0.87 [0.85, 0.90]", "0.85 [0.83, 0.88]"
   "COPD", "Ullrich - MS-ALL", "Original Implementation", 214, "0.79 [0.76, 0.82]", "0.76 [0.73, 0.78]"
   "COPD", "Ullrich - MS-MS", "Original Implementation", 214, "0.83 [0.81, 0.85]", "0.78 [0.75, 0.80]"
   "MS", "McCamley", "-", 228, "0.79 [0.76, 0.82]", "0.77 [0.74, 0.79]"
   "MS", "Ullrich - MS-ALL", "MobGap", 228, "0.86 [0.84, 0.88]", "0.83 [0.80, 0.85]"
   "MS", "Ullrich - MS-ALL", "Original Implementation", 228, "0.81 [0.79, 0.83]", "0.74 [0.71, 0.77]"
   "MS", "Ullrich - MS-MS", "Original Implementation", 228, "0.76 [0.73, 0.79]", "0.74 [0.71, 0.77]"
   "PD", "McCamley", "-", 225, "0.77 [0.74, 0.79]", "0.75 [0.73, 0.78]"
   "PD", "Ullrich - MS-ALL", "MobGap", 225, "0.81 [0.79, 0.84]", "0.82 [0.80, 0.84]"
   "PD", "Ullrich - MS-ALL", "Original Implementation", 225, "0.77 [0.74, 0.79]", "0.72 [0.69, 0.75]"
   "PD", "Ullrich - MS-MS", "Original Implementation", 225, "0.75 [0.73, 0.78]", "0.73 [0.70, 0.75]"
   "PFF", "McCamley", "-", 169, "0.83 [0.81, 0.85]", "0.77 [0.74, 0.80]"
   "PFF", "Ullrich - MS-ALL", "MobGap", 169, "0.86 [0.84, 0.88]", "0.81 [0.79, 0.84]"
   "PFF", "Ullrich - MS-ALL", "Original Implementation", 169, "0.81 [0.79, 0.83]", "0.74 [0.72, 0.76]"
   "PFF", "Ullrich - MS-MS", "Original Implementation", 169, "0.81 [0.79, 0.84]", "0.76 [0.73, 0.79]"

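The laboratory section suggests analyzing different groups of tests separately.
A minimal sketch of such a breakdown is shown below, assuming long-format results with a `test` column (as listed in `lab_index_cols`) and the `algo_with_version` column.
The test names and accuracy values are made up for illustration and do not correspond to actual TVS tests.

.. code-block:: Python

    import pandas as pd

    # Toy per-trial results (made-up test names and numbers).
    toy_lab_results = pd.DataFrame(
        {
            "test": [
                "short_walk",
                "short_walk",
                "short_walk",
                "long_walk",
                "long_walk",
                "long_walk",
            ],
            "algo_with_version": ["A", "A", "B", "A", "B", "B"],
            "accuracy": [0.9, 0.7, 0.6, 0.5, 0.7, 0.9],
        }
    )

    # Mean accuracy per test (rows) and algorithm (columns).
    per_test = toy_lab_results.pivot_table(
        values="accuracy",
        index="test",
        columns="algo_with_version",
        aggfunc="mean",
    )
    print(per_test)

With the real data, the same pivot could be applied to `lab_results` to see whether an algorithm's advantage holds across all tests or only in some of them.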
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.741 seconds)

**Estimated memory usage:** 80 MB


.. _sphx_glr_download_auto_revalidation_laterality__01_lrc_analysis.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: _01_lrc_analysis.ipynb <_01_lrc_analysis.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: _01_lrc_analysis.py <_01_lrc_analysis.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: _01_lrc_analysis.zip <_01_lrc_analysis.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_