mobgap.gait_sequences.evaluation.categorize_intervals_per_sample

mobgap.gait_sequences.evaluation.categorize_intervals_per_sample(*, gsd_list_detected: DataFrame, gsd_list_reference: DataFrame, n_overall_samples: int | None = None) → DataFrame

Evaluate detected gait sequence intervals against a reference on a sample-wise level.

The detected and reference DataFrames are expected to have columns named "start" and "end" containing the start and end indices of the respective gait sequences. Each sample covered by the detected or reference intervals is categorized as true positive (tp), false positive (fp), or false negative (fn). If the total length of the recording (n_overall_samples) is provided, the remaining samples are categorized as true negative (tn). Consecutive samples with the same category are then merged into tp, fp, fn, and tn intervals and returned as a DataFrame.

The output of this method can be used to calculate performance metrics using the calculate_matched_gsd_performance_metrics method.
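The sample-wise idea described above can be sketched in a few lines. This is an illustration only, not mobgap's implementation: the helper name `sketch_categorize_per_sample` is hypothetical, and it assumes that `end` is exclusive and that the recording length is always known.

```python
import numpy as np
import pandas as pd


def sketch_categorize_per_sample(detected, reference, n_samples):
    """Hypothetical helper: label every sample, then merge runs of equal labels."""
    det_mask = np.zeros(n_samples, dtype=bool)
    ref_mask = np.zeros(n_samples, dtype=bool)
    # Assumption: intervals are half-open, i.e. `end` itself is not covered.
    for start, end in detected[["start", "end"]].itertuples(index=False):
        det_mask[start:end] = True
    for start, end in reference[["start", "end"]].itertuples(index=False):
        ref_mask[start:end] = True

    # Everything starts as true negative and is overwritten where a mask is set.
    labels = np.full(n_samples, "tn", dtype=object)
    labels[det_mask & ref_mask] = "tp"
    labels[det_mask & ~ref_mask] = "fp"
    labels[~det_mask & ref_mask] = "fn"

    # A new interval begins wherever the label differs from the previous sample.
    change = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [n_samples]))
    return pd.DataFrame({"start": starts, "end": ends, "match_type": labels[starts]})


detected = pd.DataFrame([[0, 10], [20, 30]], columns=["start", "end"])
reference = pd.DataFrame([[0, 10], [15, 25]], columns=["start", "end"])
sketch_result = sketch_categorize_per_sample(detected, reference, n_samples=30)
```

Note that this sketch merges directly adjacent intervals of the same category and always emits tn intervals, so its output may differ in detail from what the library function returns.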

Parameters:

gsd_list_detected: Each row contains a detected gait sequence interval as output by the GSD algorithms. The start index is stored in a column named start and the end index in a column named end.
gsd_list_reference: Gold standard to validate the detected gait sequences against. Should have the same format as gsd_list_detected.
n_overall_samples: Number of samples in the analyzed recording. If provided, true negative intervals will be added to the result.

Returns:

pd.DataFrame: A DataFrame containing the categorized intervals with their start and end index and the respective match_type. Keep in mind that the intervals are not identical to the intervals in gsd_list_detected, but are rather split into subsequences according to their match type with the reference.

See also

calculate_matched_gsd_performance_metrics: For calculating performance metrics based on the matches returned by this function.
calculate_unmatched_gsd_performance_metrics: For calculating performance metrics without matching the detected and reference gait sequences.
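To show how categorized intervals relate to performance metrics, the following sketch derives precision, recall, and F1 score by hand from a DataFrame shaped like the return value. It does not reproduce either function listed above; it assumes that each interval covers `end - start` samples (exclusive end), and the table values are made up for illustration.

```python
import pandas as pd

# Categorized intervals in the shape described under "Returns" (illustrative values).
matches = pd.DataFrame(
    {
        "start": [0, 15, 20, 25],
        "end": [10, 20, 25, 30],
        "match_type": ["tp", "fn", "tp", "fp"],
    }
)

# Assumption: an interval covers `end - start` samples (end treated as exclusive).
interval_lengths = matches["end"] - matches["start"]
samples_per_type = interval_lengths.groupby(matches["match_type"]).sum()
tp, fp, fn = (int(samples_per_type.get(key, 0)) for key in ("tp", "fp", "fn"))

# Standard definitions of the sample-wise metrics.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1_score = 2 * precision * recall / (precision + recall)
```

In practice, the library functions listed above are the better choice, since they define how interval boundaries and missing categories are handled.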

Examples

>>> import pandas as pd
>>> from mobgap.gait_sequences.evaluation import categorize_intervals_per_sample
>>> detected = pd.DataFrame([[0, 10], [20, 30]], columns=["start", "end"])
>>> reference = pd.DataFrame([[0, 10], [15, 25]], columns=["start", "end"])
>>> # All arguments are keyword-only, as shown in the signature.
>>> result = categorize_intervals_per_sample(
...     gsd_list_detected=detected, gsd_list_reference=reference
... )
>>> result
   start  end match_type
0      0   10         tp
1     15   20         fn
2     20   25         tp
3     25   30         fp