.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/initial_contacts/_04_icd_evaluation.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_initial_contacts__04_icd_evaluation.py:

.. _icd_evaluation:

ICD Evaluation
==============

This example shows how to evaluate the output of an initial contact detection (ICD) algorithm against reference data, and thus how to rate the performance of an ICD algorithm.

.. GENERATED FROM PYTHON SOURCE LINES 9-12

.. code-block:: default

    import pandas as pd

.. GENERATED FROM PYTHON SOURCE LINES 13-14

Import useful modules and packages

.. GENERATED FROM PYTHON SOURCE LINES 14-19

.. code-block:: default

    from mobgap.data import LabExampleDataset
    from mobgap.initial_contacts import IcdIonescu
    from mobgap.pipeline import GsIterator
    from mobgap.utils.conversions import to_body_frame

.. GENERATED FROM PYTHON SOURCE LINES 20-26

Loading some example data
-------------------------
First, we load example data and apply the ICD Ionescu algorithm to it.
Any other ICD algorithm could be used instead.
To have a reference to compare the results to, we also load the corresponding ground truth data.
These steps are explained in more detail in the :ref:`ICD Ionescu example `.

.. GENERATED FROM PYTHON SOURCE LINES 26-67

.. code-block:: default

    def load_data():
        """Load example data and extract a single trial for demonstration purposes."""
        example_data = LabExampleDataset(
            reference_system="INDIP", reference_para_level="wb"
        )
        single_test = example_data.get_subset(
            cohort="HA", participant_id="001", test="Test11", trial="Trial1"
        )
        return single_test


    def calculate_icd_ionescu_output(single_test_data):
        """Calculate the ICD Ionescu output for one sensor from the test data."""
        imu_data = to_body_frame(single_test_data.data_ss)
        sampling_rate_hz = single_test_data.sampling_rate_hz
        reference_wbs = single_test_data.reference_parameters_.wb_list

        iterator = GsIterator()

        for (gs, data), result in iterator.iterate(imu_data, reference_wbs):
            result.ic_list = (
                IcdIonescu()
                .detect(data, sampling_rate_hz=sampling_rate_hz)
                .ic_list_
            )

        det_ics = iterator.results_.ic_list
        return det_ics


    def load_reference(single_test_data):
        """Load the reference initial contacts from the test data."""
        ref_ics = single_test_data.reference_parameters_.ic_list
        return ref_ics


    wb_data = load_data()
    detected_ics = calculate_icd_ionescu_output(wb_data)
    reference_ics = load_reference(wb_data)

.. GENERATED FROM PYTHON SOURCE LINES 68-71

As you can see, the detected and reference initial contacts are multiindexed dataframes.
The first level of the multiindex is the walking bout id and the second level is the index of the initial contact
within the walking bout.

.. GENERATED FROM PYTHON SOURCE LINES 71-73

.. code-block:: default

    detected_ics

.. raw:: html
ic
wb_id step_id
0 0 697
1 760
2 814
3 872
4 924
1 0 2929
1 2989
2 3056
3 3109
4 3176
5 3264
2 0 3913
1 3975
2 4085
3 4141
4 4201
5 4273
6 4355
7 4488
8 4553
9 4638
10 4758
11 4815
12 4863
13 4968
14 5038
3 0 7739
1 7846
2 7983
3 8096
4 8166
5 8229
6 8281
7 8336
8 8399
9 8461
10 8541
11 8606
4 0 9531
1 9593
2 9659
3 9723
4 9791
5 9853
5 0 12044
1 12099
2 12157
3 12211
4 12271
5 12414


.. GENERATED FROM PYTHON SOURCE LINES 74-76

.. code-block:: default

    reference_ics

.. raw:: html
ic lr_label
wb_id step_id
0 0 632 left
1 709 right
2 763 left
3 824 right
4 876 left
... ... ... ...
5 3 12162 left
4 12220 right
5 12277 left
6 12335 right
7 12516 left

63 rows × 2 columns
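
Both IC lists share the same ``(wb_id, step_id)`` index layout, so standard pandas indexing is enough to inspect them.
The following is a minimal sketch on a hand-built toy dataframe (``toy_ics`` is a hypothetical name; the values are copied from the first rows of ``detected_ics`` shown earlier), not part of the original example.

```python
import pandas as pd

# Toy IC list that mimics the (wb_id, step_id) layout of the real IC lists.
# The values are copied from the first rows of the detected ICs shown above.
toy_ics = pd.DataFrame(
    {"ic": [697, 760, 814, 2929, 2989]},
    index=pd.MultiIndex.from_tuples(
        [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)],
        names=["wb_id", "step_id"],
    ),
)

# Select all ICs of walking bout 0. The remaining index is the step_id.
ics_wb0 = toy_ics.loc[0]

# Count the number of ICs in each walking bout.
n_ics_per_wb = toy_ics.groupby(level="wb_id").size()

print(ics_wb0)
print(n_ics_per_wb)
```

The same two operations should work unchanged on ``detected_ics`` and ``reference_ics``, as they only rely on the index level names.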



.. GENERATED FROM PYTHON SOURCE LINES 77-94

Matching ICs between detected and reference lists
-------------------------------------------------
Let's quantify how the algorithm output compares to the reference labels.
To gain detailed insight into the performance of the algorithm, we can look at the individual matches between the
detected and reference initial contacts.
To do this, we use the :func:`~mobgap.initial_contacts.evaluation.categorize_ic_list` function to classify each
detected initial contact as a true positive or false positive, and each unmatched reference initial contact as a
false negative.
We can then use these results to calculate a range of higher-level performance metrics.

Note that we only want to match initial contacts within the same walking bout.
If we simply passed the detected and reference initial contacts to the matching function, it would match all ICs
independent of the walking bout, as it ignores the multiindex.
We will see below what this looks like and when it might be useful, but for now, let's perform
the matching within the walking bouts.
For this, we need to group the detected and reference initial contacts by the walking bout id.
This can be done using the :func:`~mobgap.utils.df_operations.create_multi_groupby` helper function.

.. GENERATED FROM PYTHON SOURCE LINES 94-100

.. code-block:: default

    from mobgap.utils.df_operations import create_multi_groupby

    per_wb_grouper = create_multi_groupby(
        detected_ics, reference_ics, groupby="wb_id"
    )

.. GENERATED FROM PYTHON SOURCE LINES 101-112

This provides us with a groupby object that is similar to the regular pandas groupby object created
from a single dataframe.
The ``MultiGroupBy`` object allows us to apply a function to each group across all dataframes, i.e. the function
receives the detected and reference initial contacts of each walking bout and can then perform some operation on them.
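
To make the idea of grouping several dataframes at once more concrete, here is a conceptual sketch in plain pandas.
It is not how ``create_multi_groupby`` is implemented; ``make_ic_list`` and ``apply_per_group`` are hypothetical helpers, the input values are toy data, and only groups present in both dataframes are handled.

```python
import pandas as pd


def make_ic_list(rows):
    """Build a toy IC list with a (wb_id, step_id) MultiIndex (hypothetical helper)."""
    index = pd.MultiIndex.from_tuples(
        [(wb, step) for wb, step, _ in rows], names=["wb_id", "step_id"]
    )
    return pd.DataFrame({"ic": [ic for _, _, ic in rows]}, index=index)


def apply_per_group(df1, df2, level, func):
    """Call ``func(group_of_df1, group_of_df2)`` for every group found in both dataframes."""
    keys_1 = df1.index.get_level_values(level).unique()
    keys_2 = df2.index.get_level_values(level).unique()
    shared_keys = keys_1.intersection(keys_2)
    return {
        key: func(df1.xs(key, level=level), df2.xs(key, level=level))
        for key in shared_keys
    }


# Toy values, made up for illustration.
toy_detected = make_ic_list([(0, 0, 697), (0, 1, 760), (1, 0, 2929)])
toy_reference = make_ic_list(
    [(0, 0, 632), (0, 1, 709), (0, 2, 763), (1, 0, 2935)]
)

# Example operation: how many more reference ICs than detected ICs per walking bout?
ic_count_difference = apply_per_group(
    toy_detected,
    toy_reference,
    level="wb_id",
    func=lambda det, ref: len(ref) - len(det),
)
print(ic_count_difference)
```

The function passed in only ever sees the rows of a single walking bout from each dataframe, which is the property the per-WB matching relies on.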

In our case, we want to apply the :func:`~mobgap.initial_contacts.evaluation.categorize_ic_list` function to each
walking bout.
This function returns a dataframe with the matches given a certain tolerance.
We don't assume that initial contacts are detected at exactly the same time in both systems.
Hence, we allow for a certain deviation in the matching process.

.. GENERATED FROM PYTHON SOURCE LINES 112-118

.. code-block:: default

    from mobgap.utils.conversions import as_samples

    tolerance_s = 0.2
    tolerance_samples = as_samples(tolerance_s, wb_data.sampling_rate_hz)
    tolerance_samples

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    20

.. GENERATED FROM PYTHON SOURCE LINES 119-125

Now we can apply the matching function to each walking bout.
Note that our matches retain the multiindex and provide matches for each walking bout separately.
The dataframe has 3 columns, containing the index value of the detected IC, the index value of the matched
reference IC, and the match type.
The two index columns contain tuples in our case, as they stem from the original multiindex that we provided.
So each of the tuples has the form ``(wb_id, step_id)``.

.. GENERATED FROM PYTHON SOURCE LINES 125-139

.. code-block:: default

    from mobgap.initial_contacts.evaluation import categorize_ic_list

    matches_per_wb = create_multi_groupby(
        detected_ics, reference_ics, groupby="wb_id"
    ).apply(
        lambda df1, df2: categorize_ic_list(
            ic_list_detected=df1,
            ic_list_reference=df2,
            tolerance_samples=tolerance_samples,
            multiindex_warning=False,
        )
    )
    matches_per_wb

.. raw:: html
ic_id_detected ic_id_reference match_type
wb_id
0 0 (0, 0) (0, 1) tp
1 (0, 1) (0, 2) tp
2 (0, 2) (0, 3) tp
3 (0, 3) (0, 4) tp
4 (0, 4) (0, 5) tp
... ... ... ... ...
5 4 (5, 4) (5, 5) tp
5 (5, 5) NaN fp
6 NaN (5, 0) fn
7 NaN (5, 6) fn
8 NaN (5, 7) fn

69 rows × 3 columns
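
To illustrate what tolerance-based matching means, the following sketch categorizes a handful of ICs with a simple greedy nearest-neighbour rule.
This is an assumption-laden toy version and not the algorithm used by ``categorize_ic_list``, whose tie handling and matching strategy may differ.
``categorize_toy`` is a hypothetical helper; the sample values are taken from walking bout 0 of the tables above, except for the last detected value, which is made up to produce a false positive.
The tolerance assumes a sampling rate of 100 Hz, which is what the conversion of 0.2 s to 20 samples implies.

```python
import numpy as np
import pandas as pd


def categorize_toy(detected, reference, tolerance_samples):
    """Greedy toy matching: each detected IC takes the closest free reference IC within the tolerance."""
    detected = np.asarray(detected)
    reference = np.asarray(reference)
    used_reference = set()
    rows = []
    for det_idx, det in enumerate(detected):
        distances = np.abs(reference - det)
        match = None
        # Walk through the reference ICs from closest to farthest.
        for ref_idx in np.argsort(distances):
            if distances[ref_idx] > tolerance_samples:
                break  # All remaining candidates are even farther away.
            if int(ref_idx) not in used_reference:
                match = int(ref_idx)
                break
        if match is None:
            rows.append((det_idx, None, "fp"))
        else:
            used_reference.add(match)
            rows.append((det_idx, match, "tp"))
    # Every reference IC that was never matched is a false negative.
    for ref_idx in range(len(reference)):
        if ref_idx not in used_reference:
            rows.append((None, ref_idx, "fn"))
    return pd.DataFrame(
        rows, columns=["ic_id_detected", "ic_id_reference", "match_type"]
    )


# 0.2 s tolerance at an assumed sampling rate of 100 Hz.
toy_tolerance_samples = round(0.2 * 100)

toy_detected = [697, 760, 814, 872, 1000]  # 1000 is a made-up extra detection
toy_reference = [632, 709, 763, 824, 876]

toy_matches = categorize_toy(toy_detected, toy_reference, toy_tolerance_samples)
toy_counts = toy_matches["match_type"].value_counts().to_dict()
print(toy_matches)
```

In this toy run, the first reference IC (632) has no detected IC within 20 samples and ends up as a false negative, mirroring the ``(0, 0)`` to ``(0, 1)`` offset visible in the first row of ``matches_per_wb``.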



.. GENERATED FROM PYTHON SOURCE LINES 140-145

Instead of matching the initial contacts within the same walking bout, we could also match all initial contacts
independent of the walking bout.
This can be done by passing the detected and reference initial contacts directly to the matching function.
This can be useful if the walking bouts of the two compared systems are not identical or if the multiindex has
other levels that should not be taken into account for the matching.

.. GENERATED FROM PYTHON SOURCE LINES 145-152

.. code-block:: default

    matched_all = categorize_ic_list(
        ic_list_detected=detected_ics,
        ic_list_reference=reference_ics,
        tolerance_samples=tolerance_samples,
    )
    matched_all

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.9.0/mobgap/initial_contacts/evaluation.py:172: UserWarning: The index of `ic_list_detected` or `ic_list_reference` is a MultiIndex. Please be aware that the index levels will not be regarded separately for the matching process, and initial contacts might be matched across different index groups, such as walking bouts or participants. If this is not the intended use case for you, consider grouping your input data before calling the evaluation function. This can be done using the `create_multi_groupby` function from the `mobgap.utils.array_handling`. Checkout the example of IC-evaluation for more information.
      warnings.warn(

.. raw:: html
ic_id_detected ic_id_reference match_type
0 (0, 0) (0, 1) tp
1 (0, 1) (0, 2) tp
2 (0, 2) (0, 3) tp
3 (0, 3) (0, 4) tp
4 (0, 4) (0, 5) tp
... ... ... ...
64 NaN (4, 0) fn
65 NaN (4, 7) fn
66 NaN (5, 0) fn
67 NaN (5, 6) fn
68 NaN (5, 7) fn

69 rows × 3 columns



.. GENERATED FROM PYTHON SOURCE LINES 153-175

Note that this did not make a difference in our case, as the individual WBs are identical between the two
systems and far enough apart that matches between different WBs are not possible.
In general, however, this can be a typical "foot-gun" for users, as they might not be aware that
the multiindex is ignored in the matching process.
Hence, as you can see above, a warning is raised if you pass a multiindex to the matching function.
It can be silenced by setting the ``multiindex_warning`` parameter to ``False``.

As we recommend matching the ICs per walking bout in our case, we will continue with ``matches_per_wb``
and ignore ``matched_all`` for the rest of this example.

Calculating performance metrics
-------------------------------
From these ``matches_per_wb``, a range of higher-level performance metrics
(including the total number of true positives, false positives, and false negatives, as well as precision, recall,
and F1-score) can be calculated.
For this purpose, we can use the :func:`~mobgap.initial_contacts.evaluation.calculate_matched_icd_performance_metrics`
function.
It returns a dictionary containing all metrics for the specified detected and reference initial contact lists.

We can again decide whether we want to calculate these metrics across all walking bouts or for each walking bout
separately.
We will quickly show both approaches below.

Across all walking bouts:

.. GENERATED FROM PYTHON SOURCE LINES 175-183

.. code-block:: default

    from mobgap.initial_contacts.evaluation import (
        calculate_matched_icd_performance_metrics,
    )

    metrics_all = calculate_matched_icd_performance_metrics(matches_per_wb)

    pd.Series(metrics_all)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    tp_samples    44.000000
    fp_samples     6.000000
    fn_samples    19.000000
    precision      0.880000
    recall         0.698413
    f1_score       0.778761
    dtype: float64

.. GENERATED FROM PYTHON SOURCE LINES 184-187

Per WB:
For this, we can use the regular pandas groupby to calculate the metrics for each walking bout separately.

.. GENERATED FROM PYTHON SOURCE LINES 187-193

.. code-block:: default

    metrics_per_wb = matches_per_wb.groupby(level="wb_id").apply(
        lambda df_: pd.Series(calculate_matched_icd_performance_metrics(df_))
    )

    metrics_per_wb

.. raw:: html
tp_samples fp_samples fn_samples precision recall f1_score
wb_id
0 5.0 0.0 2.0 1.000000 0.714286 0.833333
1 4.0 2.0 2.0 0.666667 0.666667 0.666667
2 13.0 2.0 5.0 0.866667 0.722222 0.787879
3 11.0 1.0 5.0 0.916667 0.687500 0.785714
4 6.0 0.0 2.0 1.000000 0.750000 0.857143
5 5.0 1.0 3.0 0.833333 0.625000 0.714286
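
The reported values can be reproduced by hand from the match counts using the standard definitions of precision, recall, and F1-score.
The sketch below is only a worked check of the numbers shown above; ``metrics_from_counts`` is a hypothetical helper and does not handle the zero-division cases that a library function has to deal with.

```python
def metrics_from_counts(tp, fp, fn):
    """Compute precision, recall and F1-score from raw match counts (no zero-division handling)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1_score = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1_score": f1_score}


# Counts across all walking bouts, taken from the output above.
overall = metrics_from_counts(tp=44, fp=6, fn=19)

# Counts of walking bout 1, taken from the per-WB table above.
wb_1 = metrics_from_counts(tp=4, fp=2, fn=2)

print({name: round(value, 6) for name, value in overall.items()})
print({name: round(value, 6) for name, value in wb_1.items()})
```

Note that the per-WB counts in the table sum to the overall counts (44 true positives, 6 false positives, 19 false negatives), so the overall metrics are pooled over all ICs and are in general not the same as the average of the per-WB metrics.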


.. GENERATED FROM PYTHON SOURCE LINES 194-195

Which of the two approaches makes more sense depends on the use case and what your multiindex represents.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 2.028 seconds)

**Estimated memory usage:** 9 MB

.. _sphx_glr_download_auto_examples_initial_contacts__04_icd_evaluation.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: _04_icd_evaluation.py <_04_icd_evaluation.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: _04_icd_evaluation.ipynb <_04_icd_evaluation.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_