Stride Length Evaluation#

This example shows how to apply evaluation algorithms to stride length results and thus how to rate the performance of a stride length calculation algorithm.

from pprint import pprint

import numpy as np
import pandas as pd
from mobgap.data import LabExampleDataset

Example Data#

We load example data from the lab dataset. We will use a longer trial from an “MS” participant for this example. Additionally, we will load the reference stride length values to compare the results and evaluate the performance of the stride length calculation algorithm. To calculate the stride length, we will then use the initial contacts measured from the reference system.

def load_data():
    """Load example data from the lab dataset."""
    lab_example_data = LabExampleDataset(reference_system="INDIP")
    long_trial = lab_example_data.get_subset(
        cohort="MS", participant_id="001", test="Test11", trial="Trial1"
    )
    sampling_rate_hz = long_trial.sampling_rate_hz
    sensor_height_m = long_trial.participant_metadata["sensor_height_m"]
    return long_trial, sampling_rate_hz, sensor_height_m


def load_reference(data):
    """Load reference data from the INDIP reference system."""
    reference_gs = data.reference_parameters_.wb_list
    reference_ic = data.reference_parameters_relative_to_wb_.ic_list
    return reference_gs, reference_ic


test_data, sampling_rate_hz, sensor_height_m = load_data()
reference_gs, reference_ic = load_reference(test_data)

/home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.9.0/mobgap/data/_mobilised_matlab_loader.py:1082: UserWarning: There were multiple ICs with the same index value, but different LR labels. This is likely an issue with the reference system you should further investigate. For now, we set the `lr_label` of the stride corresponding to this IC to Nan. However, both values still remain in the IC list.
  return parse_reference_parameters(
/home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.9.0/mobgap/data/_mobilised_matlab_loader.py:1091: UserWarning: There were multiple ICs with the same index value, but different LR labels. This is likely an issue with the reference system you should further investigate. For now, we set the `lr_label` of the stride corresponding to this IC to Nan. However, both values still remain in the IC list.
  return parse_reference_parameters(

From the reference data, we can see that our example data contains several gait sequences. For each gait sequence, the reference system provides an average stride length value.

reference_gs[["avg_stride_length_m"]]

	avg_stride_length_m
wb_id
0	0.942678
1	0.483923
2	0.506458
3	0.803933
4	0.507484
5	0.599360

Furthermore, the reference data includes a list of initial contacts for each gait sequence. We will use these reference initial contacts as input for the stride length calculation algorithm, to enable evaluating the performance of the stride length calculation in isolation.

reference_ic

		ic	lr_label
wb_id	step_id
0	0	0	right
	1	46	left
	2	110	right
	3	153	left
	4	217	right
...	...	...	...
5	7	518	right
	8	573	left
	9	664	right
	10	711	left
	11	750	right

93 rows × 2 columns

Applying the Stride Length Calculation Algorithm#

In this example, we will use the SlZijlstra algorithm to calculate the stride length from the reference initial contacts.

from mobgap.stride_length import SlZijlstra

sl_zijlstra_reoriented = SlZijlstra(
    **SlZijlstra.PredefinedParameters.step_length_scaling_factor_ms_ms,
)

As this algorithm is designed to work on a single gait sequence at a time, we will iterate over the gait sequences present in the example data and calculate the stride length for each of them. This is done using the GsIterator class. The dedicated GsIterator example explains in detail how the class works and shows further applications.

from mobgap.initial_contacts import refine_gs
from mobgap.pipeline import GsIterator
from mobgap.utils.conversions import to_body_frame

iterator = GsIterator()

for (gs, data), r in iterator.iterate(test_data.data_ss, reference_gs):
    r.ic_list = reference_ic.loc[gs.id]
    refined_gs, refined_ic_list = refine_gs(r.ic_list)
    with iterator.subregion(refined_gs) as ((_, refined_gs_data), rr):
        sl = sl_zijlstra_reoriented.clone().calculate(
            data=to_body_frame(refined_gs_data),
            initial_contacts=refined_ic_list,
            sensor_height_m=sensor_height_m,
            sampling_rate_hz=sampling_rate_hz,
        )
        rr.stride_length_per_sec = sl.stride_length_per_sec_
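
The `reference_ic.loc[gs.id]` lookup in the loop relies on the two-level index (`wb_id`, `step_id`) of the initial-contact list. The following minimal sketch rebuilds a small excerpt of that table with plain pandas and shows that selecting by the first level returns the initial contacts of one gait sequence, indexed by `step_id`. The rows are copied from the output shown earlier; the names `ic_excerpt` and `ic_of_wb_5` are illustrative and not part of mobgap.

```python
import pandas as pd

# Excerpt of the reference initial contacts shown above (wb_id 0 and 5 only).
index = pd.MultiIndex.from_tuples(
    [(0, 0), (0, 1), (0, 2), (5, 9), (5, 10), (5, 11)],
    names=["wb_id", "step_id"],
)
ic_excerpt = pd.DataFrame(
    {
        "ic": [0, 46, 110, 664, 711, 750],
        "lr_label": ["right", "left", "right", "right", "left", "right"],
    },
    index=index,
)

# Selecting by the first index level drops that level and keeps `step_id`.
ic_of_wb_5 = ic_excerpt.loc[5]
print(ic_of_wb_5)
```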

The detected stride lengths per second for all gait sequences can then be accessed from the results_ attribute of the GsIterator object.

stride_length_result = iterator.results_.stride_length_per_sec
stride_length_result

			stride_length_m
wb_id	r_gs_id	sec_center_samples
0	0	1069	1.105959
		1169	1.160042
		1269	1.088413
		1369	1.018586
		1469	0.835517
...	...	...	...
5	0	21728	0.861495
		21828	0.693402
		21928	0.572525
		22028	0.940278
		22128	0.813917

69 rows × 1 columns

Comparison with Reference#

To evaluate the performance of the stride length calculation algorithm, we can compare the calculated stride lengths with the reference stride lengths. As the reference system only provides an average stride length per gait sequence, we first average the calculated stride length values per gait sequence.

avg_stride_length_per_gs = stride_length_result.groupby("wb_id").mean()
avg_stride_length_per_gs

	stride_length_m
wb_id
0	0.883338
1	0.706788
2	0.653937
3	0.778489
4	0.887897
5	0.871213
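
As a minimal illustration of this averaging step, the sketch below applies the same `groupby("wb_id").mean()` call to a small hand-made table. The values and the names `toy_result` and `toy_avg` are invented for illustration and are not output of the algorithm.

```python
import pandas as pd

# Hypothetical per-second stride lengths for two gait sequences.
toy_index = pd.MultiIndex.from_tuples(
    [(0, 100), (0, 200), (1, 900), (1, 1000)],
    names=["wb_id", "sec_center_samples"],
)
toy_result = pd.DataFrame(
    {"stride_length_m": [1.0, 1.2, 0.6, 0.8]}, index=toy_index
)

# Grouping by the index level name yields one mean value per gait sequence.
toy_avg = toy_result.groupby("wb_id").mean()
print(toy_avg)
```

Because the grouping key is an index level, the result is indexed by `wb_id` alone, which is the same index the reference values use.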

Next, the detected and reference stride lengths are concatenated into a single DataFrame. As columns, a multilevel index is used that contains the type of metric (stride_length_m) in the first and the source of the data (detected or reference) in the second level.

reference_stride_length = reference_gs[["avg_stride_length_m"]].rename(
    columns={"avg_stride_length_m": "stride_length_m"}
)
combined_sl = {
    "detected": avg_stride_length_per_gs,
    "reference": reference_stride_length,
}
combined_sl = pd.concat(
    combined_sl, axis=1, keys=combined_sl.keys()
).reorder_levels((1, 0), axis=1)
combined_sl

	stride_length_m
	detected	reference
wb_id
0	0.883338	0.942678
1	0.706788	0.483923
2	0.653937	0.506458
3	0.778489	0.803933
4	0.887897	0.507484
5	0.871213	0.599360
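
To make the two-level column layout more concrete, the sketch below rebuilds the first two rows of this table with plain pandas and shows two ways of selecting from it. The name `toy_combined` is illustrative; only standard pandas indexing is used.

```python
import pandas as pd

# First two rows of the combined table shown above.
toy_combined = pd.DataFrame(
    {
        ("stride_length_m", "detected"): [0.883338, 0.706788],
        ("stride_length_m", "reference"): [0.942678, 0.483923],
    },
    index=pd.Index([0, 1], name="wb_id"),
)

# A full (metric, source) tuple selects a single column as a Series ...
detected = toy_combined[("stride_length_m", "detected")]
# ... while the first level alone selects all sources of that metric.
per_metric = toy_combined["stride_length_m"]
print(list(per_metric.columns))
```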

The concatenated DataFrame can then be used to evaluate the performance of the stride length calculation algorithm. As a first step, the errors between the detected and reference stride lengths are calculated.

Estimate Errors in Stride Length Data#

We can calculate a variety of error metrics to evaluate the performance of the stride length calculation algorithm: the simple difference between the estimated and ground truth values (error), its absolute value (abs_error), and both of these set in relation to the reference value (rel_error and abs_rel_error). To apply these errors, we first build a list specifying the error functions to be applied. All of the above-mentioned error functions are provided by the ErrorTransformFuncs class. Note that you can also apply custom functions instead of the predefined ones. For more information on how to define custom error functions, see the general example on DMO evaluation.

from mobgap.pipeline.evaluation import ErrorTransformFuncs as E

errors = [
    ("stride_length_m", [E.error, E.abs_error, E.rel_error, E.abs_rel_error])
]
pprint(errors)

[('stride_length_m',
  [<function error at 0x7fdc83ac5480>,
   <function abs_error at 0x7fdc82f9b640>,
   <function rel_error at 0x7fdc82f9b5b0>,
   <function abs_rel_error at 0x7fdc82f9b6d0>])]

The error functions can be applied to the combined stride length data using the apply_transformations function. The resulting DataFrame contains the error values for each gait sequence.

from mobgap.utils.df_operations import apply_transformations

sl_errors = apply_transformations(combined_sl, errors)
sl_errors.T

	wb_id	0	1	2	3	4	5
stride_length_m	error	-0.059340	0.222865	0.147479	-0.025443	0.380412	0.271853
	abs_error	0.059340	0.222865	0.147479	0.025443	0.380412	0.271853
	rel_error	-0.062948	0.460538	0.291196	-0.031649	0.749604	0.453572
	abs_rel_error	0.062948	0.460538	0.291196	0.031649	0.749604	0.453572
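
The four metrics can be checked by hand with plain pandas. The sketch below is not the mobgap implementation; it assumes that the relative error is the error divided by the reference value, which is consistent with the numbers in the table above. The input values are the rounded per-gait-sequence means from the combined table, so the results agree with the table only up to rounding.

```python
import pandas as pd

# Detected and reference averages per gait sequence (wb_id 0..5), as shown above.
detected = pd.Series([0.883338, 0.706788, 0.653937, 0.778489, 0.887897, 0.871213])
reference = pd.Series([0.942678, 0.483923, 0.506458, 0.803933, 0.507484, 0.599360])

error = detected - reference
manual_errors = pd.DataFrame(
    {
        "error": error,
        "abs_error": error.abs(),
        # Assumption: relative errors are normalised by the reference value.
        "rel_error": error / reference,
        "abs_rel_error": (error / reference).abs(),
    }
)
print(manual_errors.round(6))
```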

Before aggregating the results, we can combine the error metrics with the reference and detected values so that all the information is available in one DataFrame.

combined_sl_with_errors = pd.concat([combined_sl, sl_errors], axis=1)
combined_sl_with_errors

	stride_length_m
	detected	reference	error	abs_error	rel_error	abs_rel_error
wb_id
0	0.883338	0.942678	-0.059340	0.059340	-0.062948	0.062948
1	0.706788	0.483923	0.222865	0.222865	0.460538	0.460538
2	0.653937	0.506458	0.147479	0.147479	0.291196	0.291196
3	0.778489	0.803933	-0.025443	0.025443	-0.031649	0.031649
4	0.887897	0.507484	0.380412	0.380412	0.749604	0.749604
5	0.871213	0.599360	0.271853	0.271853	0.453572	0.453572

Aggregate Errors#

Finally, the estimated errors can be aggregated to provide a summary of the performance of the stride length calculation. For this purpose, different aggregation functions can be applied to the error metrics, ranging from simple, built-in aggregations like the mean or standard deviation to more complex functions like the limits of agreement or 5th and 95th percentiles. This can be done using the apply_aggregations function. Possible aggregations are provided by the CustomErrorAggregations class. There are two ways to define such aggregations:

First, as a list of tuples in the format (<identifier>, <aggregation>), with <identifier> being the key for accessing the column to evaluate and <aggregation> being the aggregation function(s) to apply. A valid list of aggregations could look like this:

from mobgap.pipeline.evaluation import CustomErrorAggregations as A

aggregations_simple = [
    *(
        (("stride_length_m", origin), [np.mean, A.quantiles])
        for origin in ["detected", "reference", "abs_error", "abs_rel_error"]
    ),
    *(
        (("stride_length_m", origin), [np.mean, A.loa])
        for origin in ["error", "rel_error"]
    ),
]
pprint(aggregations_simple)

[(('stride_length_m', 'detected'),
  [<function mean at 0x7fdca23e7a30>, <function quantiles at 0x7fdc82f9b880>]),
 (('stride_length_m', 'reference'),
  [<function mean at 0x7fdca23e7a30>, <function quantiles at 0x7fdc82f9b880>]),
 (('stride_length_m', 'abs_error'),
  [<function mean at 0x7fdca23e7a30>, <function quantiles at 0x7fdc82f9b880>]),
 (('stride_length_m', 'abs_rel_error'),
  [<function mean at 0x7fdca23e7a30>, <function quantiles at 0x7fdc82f9b880>]),
 (('stride_length_m', 'error'),
  [<function mean at 0x7fdca23e7a30>, <function loa at 0x7fdc82f9b910>]),
 (('stride_length_m', 'rel_error'),
  [<function mean at 0x7fdca23e7a30>, <function loa at 0x7fdc82f9b910>])]

Second, as a named tuple of type CustomOperation taking three values: identifier, function, and column_name. identifier is a valid loc identifier selecting one or more columns from the dataframe, function is the aggregation function or list of functions to apply, and column_name is the identifier of the resulting column in the output dataframe. This allows for more complex aggregations that require multiple columns as input, for example, the intra-class correlation coefficient (ICC). A valid aggregation list for calculating the ICC of the stride length would look like this:

from mobgap.utils.df_operations import CustomOperation

aggregations_custom = [
    CustomOperation(
        identifier="stride_length_m",
        function=A.icc,
        column_name=("stride_length_m", "all"),
    )
]
pprint(aggregations_custom)

[CustomOperation(identifier='stride_length_m', function=<function icc at 0x7fdc82f9b7f0>, column_name=('stride_length_m', 'all'))]
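
To illustrate what an aggregation over multiple columns can look like, the sketch below defines a hypothetical function `pearson_r` that needs both the detected and the reference column at once. It is written against plain pandas and called directly on a small table. Whether mobgap passes exactly such a sub-DataFrame to the function of a CustomOperation is an assumption here; the general example on DMO evaluation is the authoritative description.

```python
import pandas as pd


def pearson_r(df: pd.DataFrame) -> float:
    """Hypothetical aggregation: correlation of detected vs. reference."""
    return df["detected"].corr(df["reference"])


# Per-gait-sequence averages from the combined table shown earlier.
per_gs = pd.DataFrame(
    {
        "detected": [0.883338, 0.706788, 0.653937, 0.778489, 0.887897, 0.871213],
        "reference": [0.942678, 0.483923, 0.506458, 0.803933, 0.507484, 0.599360],
    }
)

r = pearson_r(per_gs)
print(r)
```

A single-column aggregation such as the mean could not express this, because the result depends on how the two columns vary together.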

For more detailed information on the aggregation types and their usage, see the general example on DMO evaluation.

Both types of aggregations can be combined and applied in a single call to the apply_aggregations function. This returns a pandas Series with the aggregated values for each aggregation function and origin for the metric stride length. For better readability, we sort the result and convert it to a DataFrame.

from mobgap.utils.df_operations import apply_aggregations

aggregations = aggregations_simple + aggregations_custom
agg_results = (
    apply_aggregations(combined_sl_with_errors, aggregations)
    .rename_axis(index=["aggregation", "metric", "origin"])
    .reorder_levels(["metric", "origin", "aggregation"])
    .sort_index(level=0)
    .to_frame("values")
)
agg_results

			values
metric	origin	aggregation
stride_length_m	abs_error	mean	0.184565
	abs_error	quantiles	(0.03391753005107545, 0.35327247945312534)
	abs_rel_error	mean	0.341585
	abs_rel_error	quantiles	(0.039473562184754674, 0.6773377272759125)
	all	icc	(0.12074454354629512, [-0.65, 0.8])
	detected	mean	0.796944
	detected	quantiles	(0.6671496372533257, 0.8867570948216703)
	error	loa	(-0.1804722809131206, 0.4930808350816448)
	error	mean	0.156304
	reference	mean	0.640639
	reference	quantiles	(0.4895567924265623, 0.9079919076564963)
	rel_error	loa	(-0.3052093545468356, 0.925313898778093)
	rel_error	mean	0.310052
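
Some of these aggregated values can be reproduced with NumPy alone. The sketch below is not the mobgap implementation. It assumes that the limits of agreement are the mean plus or minus 1.96 times the sample standard deviation, and that the quantiles are the 5th and 95th percentiles with linear interpolation; both assumptions are consistent with the values in the table above. The inputs are the rounded errors from the error table, so the results match only up to rounding.

```python
import numpy as np

# Error per gait sequence (wb_id 0..5), as shown in the error table above.
error = np.array([-0.059340, 0.222865, 0.147479, -0.025443, 0.380412, 0.271853])
abs_error = np.abs(error)

mean_error = error.mean()

# Assumption: limits of agreement = mean +/- 1.96 * sample standard deviation.
half_width = 1.96 * error.std(ddof=1)
loa = (mean_error - half_width, mean_error + half_width)

# Assumption: 5th and 95th percentiles with NumPy's default linear interpolation.
quantiles = tuple(np.quantile(abs_error, [0.05, 0.95]))

print(mean_error, loa, quantiles)
```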
