mobgap.stride_length.evaluation.sl_per_datapoint_score

mobgap.stride_length.evaluation.sl_per_datapoint_score(
pipeline: SlEmulationPipeline,
datapoint: BaseGaitDatasetWithReference,
) -> dict

Evaluate the performance of the stride length pipeline on a single datapoint.

Warning

This function is not meant to be called directly, but as a scoring function in a tpcp.validate.Scorer. If you are writing custom scoring functions, you can use this function as a template or wrap it in a new function.

This function calculates the stride length error at three levels: per stride, per walking bout (WB), and per datapoint.

The following metrics are calculated:

  • The error, absolute error, relative error, and absolute relative error for each stride (stride_level_values_with_errors). These are returned as a dataframe wrapped in no_agg.

  • The average stride-level error metrics on a per-datapoint level. These are returned as stride__<metric> and will be averaged over all datapoints in the Scorer.

  • The error, absolute error, relative error, and absolute relative error for each WB (wb_level_values_with_errors). For the algorithm output, the WB-level stride length is calculated as the average of the stride-level stride lengths within each WB. For the reference system, the average stride length is taken directly from the WB-level reference data. These are returned as a dataframe wrapped in no_agg. The dataframe also contains the average walking speed for each WB, extracted from the reference system, to provide context for further analysis.

  • The average WB-level error metrics on a per-datapoint level. These are returned as wb__<metric> and will be averaged over all datapoints in the Scorer.
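
The metrics listed above can be illustrated with a minimal, self-contained sketch. It is not the mobgap implementation. The sign convention (detected minus reference), the normalization of the relative error by the reference value, the metric key names, and all numbers are assumptions made for illustration only.

```python
from statistics import mean


def error_metrics(detected: float, reference: float) -> dict:
    """Return the four error metrics for one detected/reference pair.

    Assumption: error = detected - reference, and the relative error is
    normalized by the reference value.
    """
    error = detected - reference
    rel_error = error / reference
    return {
        "error": error,
        "abs_error": abs(error),
        "rel_error": rel_error,
        "abs_rel_error": abs(rel_error),
    }


# Hypothetical matched strides in metres: (wb_id, detected, reference).
strides = [
    ("wb_0", 1.25, 1.0),
    ("wb_0", 0.75, 1.0),
    ("wb_1", 1.5, 1.25),
]

# Stride level: one set of metrics per stride, then the per-datapoint average.
stride_level = [error_metrics(det, ref) for _, det, ref in strides]
stride_summary = {
    f"stride__{name}": mean(row[name] for row in stride_level)
    for name in stride_level[0]
}

# WB level: average the detected stride lengths within each WB and compare
# them with a WB-level reference value (hypothetical numbers).
wb_reference = {"wb_0": 1.0, "wb_1": 1.25}
wb_detected = {}
for wb_id, det, _ in strides:
    wb_detected.setdefault(wb_id, []).append(det)

wb_level = {
    wb_id: error_metrics(mean(values), wb_reference[wb_id])
    for wb_id, values in wb_detected.items()
}
wb_summary = {
    f"wb__{name}": mean(row[name] for row in wb_level.values())
    for name in ("error", "abs_error", "rel_error", "abs_rel_error")
}

print(stride_summary)
print(wb_summary)
```

In this toy data the two strides of wb_0 err in opposite directions, so the WB-level error for wb_0 is zero while the stride-level absolute error is not. This is one reason the two levels are reported separately.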

Parameters:
pipeline

An instance of SlEmulationPipeline that wraps the algorithm that should be evaluated.

datapoint

The datapoint to be evaluated.

Returns:
dict

A dictionary containing the performance metrics. Note that some results are wrapped in a no_agg object or other aggregators. The results of this function are not expected to be parsed manually. Instead, the function is expected to be used as a scorer in the context of the validate/cross_validate functions or similar. These functions will aggregate the results and provide a summary of the performance metrics.
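
To make the aggregation behavior concrete, here is a rough conceptual sketch of how per-datapoint score dictionaries might be combined. It does not use tpcp and does not reproduce its implementation. The PassThrough class is a hypothetical stand-in for a no_agg-wrapped value, the aggregate function and all key names and numbers are invented for illustration, and plain averaging of the numeric values is assumed.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean
from typing import Any


@dataclass
class PassThrough:
    """Hypothetical stand-in for a value that must not be aggregated."""

    value: Any


def aggregate(per_datapoint: list[dict[str, Any]]) -> tuple[dict, dict]:
    """Average plain numeric scores; collect wrapped values unchanged."""
    aggregated = {}
    raw = {}
    for key in per_datapoint[0]:
        values = [scores[key] for scores in per_datapoint]
        if isinstance(values[0], PassThrough):
            # Wrapped results are kept per datapoint, without averaging.
            raw[key] = [v.value for v in values]
        else:
            aggregated[key] = mean(values)
    return aggregated, raw


# Hypothetical score dictionaries for two datapoints.
scores = [
    {
        "wb__abs_error": 0.125,
        "wb_level_values_with_errors": PassThrough([{"wb_id": "wb_0"}]),
    },
    {
        "wb__abs_error": 0.375,
        "wb_level_values_with_errors": PassThrough([{"wb_id": "wb_1"}]),
    },
]

aggregated, raw = aggregate(scores)
print(aggregated)
print(raw)
```

The split into averaged and passed-through values mirrors the distinction between the stride__<metric> and wb__<metric> entries and the dataframes wrapped in no_agg.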