.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/gsd/_01_gsd_iluz.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_gsd__01_gsd_iluz.py: GSD Iluz ======== This example shows how to use the GSD Iluz algorithm and some examples on how the results compare to the original matlab implementation. We start by defining some helpers for plotting and loading the data. You can skip them for now and jump directly to "Performance on a single lab trial", if you just want to see how to apply the algorithm. .. GENERATED FROM PYTHON SOURCE LINES 14-18 Plotting Helper --------------- We define a helper function to plot the results of the algorithm. Just ignore this function for now. .. GENERATED FROM PYTHON SOURCE LINES 18-42 .. code-block:: default import matplotlib.pyplot as plt def plot_gsd_outputs(data, **kwargs): fig, ax = plt.subplots() ax.plot(data["acc_x"].to_numpy(), label="acc_x") color_cycle = iter(plt.rcParams["axes.prop_cycle"]) y_max = 1.1 plot_props = [ {"data": v, "label": k, "alpha": 0.2, "ymax": (y_max := y_max - 0.1), "color": next(color_cycle)["color"]} for k, v in kwargs.items() ] for props in plot_props: for gsd in props.pop("data").itertuples(index=False): ax.axvspan(gsd.start, gsd.end, label=props.pop("label", None), **props) ax.legend() return fig, ax .. GENERATED FROM PYTHON SOURCE LINES 43-53 Loading some example data ------------------------- .. note :: More infos about data loading can be found in the :ref:`data loading example `. We load example data from the lab dataset together with the INDIP reference system. We will use the INDIP "WB" output as ground truth. Note, that the "WB" (Walking Bout) output is further processed than a normal "Gait Sequence". This means we expect Gait Sequences to contain some false positives compared to the "WB" output. However, a good gait sequence detection algorithm should have high sensitivity (i.e. contain all the "WBs" of the reference system). .. GENERATED FROM PYTHON SOURCE LINES 53-81 .. code-block:: default import json import pandas as pd from mobgap import PACKAGE_ROOT from mobgap.data import LabExampleDataset lab_example_data = LabExampleDataset(reference_system="INDIP") def load_matlab_output(datapoint): p = datapoint.group_label with ( PACKAGE_ROOT.parent / f"example_data/original_results/gsd_iluz/lab/{p.cohort}/{p.participant_id}/GSDA_Output.json" ).open() as f: original_results = json.load(f)["GSDA_Output"][p.time_measure][p.test][p.trial]["SU"]["LowerBack"]["GSD"] if not isinstance(original_results, list): original_results = [original_results] return ( pd.DataFrame.from_records(original_results).rename( {"GaitSequence_Start": "start", "GaitSequence_End": "end"}, axis=1 )[["start", "end"]] * datapoint.sampling_rate_hz ) .. GENERATED FROM PYTHON SOURCE LINES 82-85 Performance on a single lab trial --------------------------------- Below we apply the algorithm to a lab trail, where we only expect a single gait sequence. .. GENERATED FROM PYTHON SOURCE LINES 85-96 .. code-block:: default from mobgap.gsd import GsdIluz short_trial = lab_example_data.get_subset(cohort="HA", participant_id="001", test="Test5", trial="Trial2") short_trial_matlab_output = load_matlab_output(short_trial) short_trial_reference_parameters = short_trial.reference_parameters_.wb_list short_trial_output = GsdIluz().detect(short_trial.data["LowerBack"], sampling_rate_hz=short_trial.sampling_rate_hz) print("Reference Parameters:\n\n", short_trial_reference_parameters) print("\nMatlab Output:\n\n", short_trial_matlab_output) print("\nPython Output:\n\n", short_trial_output.gs_list_) .. rst-class:: sphx-glr-script-out .. code-block:: none Reference Parameters: start end ... avg_stride_length_m termination_reason wb_id ... 1 392 862 ... 1.211187 Pause [1 rows x 9 columns] Matlab Output: start end 0 299.0 1049.0 Python Output: start end 0 150 1052 .. GENERATED FROM PYTHON SOURCE LINES 97-100 When we plot the output, we can see that the python version is a little more sensitive than the matlab version. It includes a section of the signal before the region classified as WB by the reference system. Both algorithm implementations produce a gait sequence that extends beyond the end of the reference system. .. GENERATED FROM PYTHON SOURCE LINES 100-109 .. code-block:: default fig, ax = plot_gsd_outputs( short_trial.data["LowerBack"], reference=short_trial_reference_parameters, matlab=short_trial_matlab_output, python=short_trial_output.gs_list_, ) fig.show() .. image-sg:: /auto_examples/gsd/images/sphx_glr__01_gsd_iluz_001.png :alt: 01 gsd iluz :srcset: /auto_examples/gsd/images/sphx_glr__01_gsd_iluz_001.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 110-114 Performance on a longer lab trial --------------------------------- Below we apply the algorithm to a lab trail that contains activities of daily living. This is a more challenging scenario, as we expect multiple gait sequences. .. GENERATED FROM PYTHON SOURCE LINES 114-124 .. code-block:: default long_trial = lab_example_data.get_subset(cohort="MS", participant_id="001", test="Test11", trial="Trial1") long_trial_matlab_output = load_matlab_output(long_trial) long_trial_reference_parameters = long_trial.reference_parameters_.wb_list long_trial_output = GsdIluz().detect(long_trial.data["LowerBack"], sampling_rate_hz=long_trial.sampling_rate_hz) print("Reference Parameters:\n\n", long_trial_reference_parameters) print("\nMatlab Output:\n\n", long_trial_matlab_output) print("\nPython Output:\n\n", long_trial_output.gs_list_) .. rst-class:: sphx-glr-script-out .. code-block:: none Reference Parameters: start end ... avg_stride_length_m termination_reason wb_id ... 1 1019 1768 ... 0.942678 Pause 2 4534 5549 ... 0.483923 Pause 3 9665 10569 ... 0.506458 Pause 4 12337 14633 ... 0.803933 Pause 5 20151 20982 ... 0.507484 Pause 6 21378 22129 ... 0.599360 Pause [6 rows x 9 columns] Matlab Output: start end 0 899.0 1799.0 1 13049.0 13949.0 2 14099.0 14849.0 3 20249.0 21149.0 4 21299.0 22349.0 Python Output: start end 0 750 1652 2 4650 6152 5 12900 14852 6 20100 21152 7 21300 22502 .. GENERATED FROM PYTHON SOURCE LINES 125-127 When we plot the output, we can see again that the python version is more sensitive. It detects longer gait sequences and even one entire gait sequence that is not detected by the matlab version. .. GENERATED FROM PYTHON SOURCE LINES 127-136 .. code-block:: default fig, _ = plot_gsd_outputs( long_trial.data["LowerBack"], reference=long_trial_reference_parameters, matlab=long_trial_matlab_output, python=long_trial_output.gs_list_, ) fig.show() .. image-sg:: /auto_examples/gsd/images/sphx_glr__01_gsd_iluz_002.png :alt: 01 gsd iluz :srcset: /auto_examples/gsd/images/sphx_glr__01_gsd_iluz_002.png :class: sphx-glr-single-img .. GENERATED FROM PYTHON SOURCE LINES 137-148 Changing the parameters ----------------------- The Python version aims to expose all relevant parameters of the algorithm. The `GsdlIluz` algorithm has a lot of parameters that can be modified. Finding a combination of parameters that works well for all scenarios is difficult. Below we show, just how to modify them in general. We modify one of the basic parameters, the window length. This can effect all parts of the output. In this case, we can see that all GSDs are slightly longer and that we now detect a gait sequence that was not detected before. .. GENERATED FROM PYTHON SOURCE LINES 148-165 .. code-block:: default long_trial_output_modified = GsdIluz(window_length_s=5, window_overlap=0.8).detect( long_trial.data["LowerBack"], sampling_rate_hz=long_trial.sampling_rate_hz ) print("Reference Parameters:\n\n", long_trial_reference_parameters) print("\nPython Output:\n\n", long_trial_output.gs_list_) print("\nPython Output Modified:\n\n", long_trial_output_modified.gs_list_) fig, _ = plot_gsd_outputs( long_trial.data["LowerBack"], reference=long_trial_reference_parameters, python=long_trial_output.gs_list_, python_modified=long_trial_output_modified.gs_list_, ) fig.show() .. image-sg:: /auto_examples/gsd/images/sphx_glr__01_gsd_iluz_003.png :alt: 01 gsd iluz :srcset: /auto_examples/gsd/images/sphx_glr__01_gsd_iluz_003.png :class: sphx-glr-single-img .. rst-class:: sphx-glr-script-out .. code-block:: none Reference Parameters: start end ... avg_stride_length_m termination_reason wb_id ... 1 1019 1768 ... 0.942678 Pause 2 4534 5549 ... 0.483923 Pause 3 9665 10569 ... 0.506458 Pause 4 12337 14633 ... 0.803933 Pause 5 20151 20982 ... 0.507484 Pause 6 21378 22129 ... 0.599360 Pause [6 rows x 9 columns] Python Output: start end 0 750 1652 2 4650 6152 5 12900 14852 6 20100 21152 7 21300 22502 Python Output Modified: start end 0 600 1902 1 4500 6202 2 9700 10302 3 12700 14902 4 20100 22702 .. GENERATED FROM PYTHON SOURCE LINES 166-171 Validation of algorithm output against a reference -------------------------------------------------- Let's quantify how the Python output compares to the reference labels. To do this, we use the `categorize_intervals` function to identify overlappting regions between the detected gait sequences and the reference gait sequences. .. GENERATED FROM PYTHON SOURCE LINES 171-180 .. code-block:: default from mobgap.gsd.evaluation import categorize_intervals categorized_intervals = categorize_intervals( gsd_list_detected=long_trial_output.gs_list_, gsd_list_reference=long_trial_reference_parameters, n_overall_samples=len(long_trial.data["LowerBack"]), ) .. GENERATED FROM PYTHON SOURCE LINES 181-189 The function returns a DataFrame containing `start` and `end` index of the resulting matched intervals together with a `match_type` column that contains the type of match for each interval, i.e. `tp` for true positive, `fp` for false positive, and `fn` for false negative. These intervals can not be interpreted as gait sequences, but are rather subsequences of the detected gait sequences categorizing correctly detected samples (`tp`), falsely detected samples (`fp`), and samples from the reference gsd list that were not detected (`fn`). Note that the true negative intervals are not explicitly calculated, but are inferred from the other intervals and the total length of the data, as everything between them is considered as true negative. .. GENERATED FROM PYTHON SOURCE LINES 189-192 .. code-block:: default print("Matched Intervals:\n\n", categorized_intervals) .. rst-class:: sphx-glr-script-out .. code-block:: none Matched Intervals: start end match_type 0 0 750 tn 1 750 1019 fp 2 1019 1652 tp 3 1652 1768 fn 4 1768 4534 tn 5 4534 4650 fn 6 4650 5549 tp 7 5549 6152 fp 8 6152 9665 tn 9 9665 10569 fn 10 10569 12337 tn 11 12337 12900 fn 12 12900 14633 tp 13 14633 14852 fp 14 14852 20100 tn 15 20100 20151 fp 16 20151 20982 tp 17 20982 21152 fp 18 21152 21300 tn 19 21300 21378 fp 20 21378 22129 tp 21 22129 22502 fp 22 22502 22727 tn .. GENERATED FROM PYTHON SOURCE LINES 193-197 Based on the tp, fp, and fn intervals, common performance metrics such as F1 score, precision, and recall can be calculated. For this purpose, the :func:`~mobgap.gsd.evaluation.calculate_matched_gsd_performance_metrics` function can be used. It returns a dictionary containing the metrics for the specified categorized intervals DataFrame. .. GENERATED FROM PYTHON SOURCE LINES 197-203 .. code-block:: default from mobgap.gsd.evaluation import calculate_matched_gsd_performance_metrics matched_metrics_dict = calculate_matched_gsd_performance_metrics(categorized_intervals) print("Matched Performance Metrics:\n\n", matched_metrics_dict) .. rst-class:: sphx-glr-script-out .. code-block:: none Matched Performance Metrics: {'tp_samples': 4852, 'fp_samples': 1770, 'fn_samples': 1703, 'precision': 0.7327091513138024, 'recall': 0.7401983218916858, 'f1_score': 0.7364346968202169, 'tn_samples': 14425, 'specificity': 0.8907070083359061, 'accuracy': 0.8473406593406594, 'npv': 0.8944072420634921} .. GENERATED FROM PYTHON SOURCE LINES 204-209 Furthermore, there is a range of performance metrics that only require the overall amount of gait detected. These metrics can be calculated using the :func:`~mobgap.gsd.evaluation.calculate_unmatched_gsd_performance_metrics` function. It requires specifying the sampling frequency of the recorded data (to calculate the duration errors in seconds) and returns a dictionary containing all metrics for the specified detected and reference gait sequences. .. GENERATED FROM PYTHON SOURCE LINES 209-220 .. code-block:: default from mobgap.gsd.evaluation import calculate_unmatched_gsd_performance_metrics unmatched_metrics_dict = calculate_unmatched_gsd_performance_metrics( gsd_list_detected=long_trial_output.gs_list_, gsd_list_reference=long_trial_reference_parameters, sampling_rate_hz=long_trial.sampling_rate_hz, ) print("Unmatched Performance Metrics:\n\n", unmatched_metrics_dict) .. rst-class:: sphx-glr-script-out .. code-block:: none Unmatched Performance Metrics: {'reference_gs_duration_s': 65.52, 'detected_gs_duration_s': 66.15, 'gs_duration_error_s': 0.6300000000000097, 'gs_relative_duration_error': 0.009615384615384763, 'gs_absolute_duration_error_s': 0.6300000000000097, 'gs_absolute_relative_duration_error': 0.009615384615384763, 'gs_absolute_relative_duration_error_log': 0.009569451016150893, 'detected_num_gs': 5, 'reference_num_gs': 6, 'num_gs_error': -1, 'num_gs_relative_error': -0.16666666666666666, 'num_gs_absolute_error': 1, 'num_gs_absolute_relative_error': 0.16666666666666666, 'num_gs_absolute_relative_error_log': 0.15415067982725836} .. GENERATED FROM PYTHON SOURCE LINES 221-229 Another useful function for evaluation is :func:`~mobgap.gsd.evaluation.find_matches_with_min_overlap`. It returns all intervals of the detected gait sequences that overlap with the reference gait sequences by at least a given amount. We can see that with an overlap threshold of 0.7 (70%), three of the five detected gait sequences are considered as matches with the reference gait sequences. The remaining ones either contain too many false positive and/or false negative samples. This information can then be used further to compare aggregated parameters from within the respective gait sequences. .. GENERATED FROM PYTHON SOURCE LINES 229-239 .. code-block:: default from mobgap.gsd.evaluation import find_matches_with_min_overlap matches = find_matches_with_min_overlap( gsd_list_detected=long_trial_output.gs_list_, gsd_list_reference=long_trial_reference_parameters, overlap_threshold=0.7, ) print("Matches:\n\n", matches) .. rst-class:: sphx-glr-script-out .. code-block:: none Matches: start end id 0 750 1652 5 12900 14852 6 20100 21152 .. rst-class:: sphx-glr-timing **Total running time of the script:** (0 minutes 5.229 seconds) **Estimated memory usage:** 66 MB .. _sphx_glr_download_auto_examples_gsd__01_gsd_iluz.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: _01_gsd_iluz.py <_01_gsd_iluz.py>` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: _01_gsd_iluz.ipynb <_01_gsd_iluz.ipynb>` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_