mobgap.gait_sequences.evaluation.categorize_intervals_per_sample
mobgap.gait_sequences.evaluation.categorize_intervals_per_sample(*, gsd_list_detected: DataFrame, gsd_list_reference: DataFrame, n_overall_samples: int | None = None) -> DataFrame
Evaluate detected gait sequence intervals against a reference on a sample-wise level.
The detected and reference DataFrames are expected to have columns named “start” and “end” containing the start and end indices of the respective gait sequences. Each sample is categorized as true positive (tp), false positive (fp), false negative (fn), or, if the total length of the recording (n_overall_samples) is provided, true negative (tn). Consecutive samples of the same category are merged into intervals of tp, fp, fn, and tn matches and returned as a DataFrame.
The output of this function can be used to calculate performance metrics using the calculate_matched_gsd_performance_metrics function.
- Parameters:
- gsd_list_detected
Each row contains a detected gait sequence interval as output by the GSD algorithms. The start index is stored in a column named start and the stop index in a column named end.
- gsd_list_reference
Gold standard to validate the detected gait sequences against. Should have the same format as gsd_list_detected.
- n_overall_samples
Number of samples in the analyzed recording. If provided, true negative intervals will be added to the result.
- Returns:
- pd.DataFrame
A DataFrame containing the categorized intervals with their start and end index and the respective match_type. Keep in mind that the intervals are not identical to the intervals in gsd_list_detected, but are rather split into subsequences according to their match type with the reference.
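To illustrate how intervals end up split by match type, here is a minimal pure-Python sketch of sample-wise categorization. It is not mobgap's implementation: the helper name categorize_samples is hypothetical, intervals are plain (start, end) tuples instead of DataFrame rows, and end indices are assumed to be exclusive.

```python
def categorize_samples(detected, reference, n_overall_samples=None):
    """Label every sample as tp/fp/fn/tn and merge runs into intervals.

    Hypothetical sketch: intervals are (start, end) tuples and `end` is
    assumed to be exclusive. Not the mobgap implementation.
    """

    def to_samples(intervals):
        samples = set()
        for start, end in intervals:
            samples.update(range(start, end))
        return samples

    det = to_samples(detected)
    ref = to_samples(reference)

    # Without a recording length, only samples covered by at least one
    # interval are labelled, so no true negatives can be reported.
    universe = det | ref
    if n_overall_samples is not None:
        universe |= set(range(n_overall_samples))

    merged = []  # entries are [start, end, match_type]
    for i in sorted(universe):
        if i in det and i in ref:
            label = "tp"
        elif i in det:
            label = "fp"
        elif i in ref:
            label = "fn"
        else:
            label = "tn"
        # Extend the previous interval if it is adjacent and has the same label.
        if merged and merged[-1][2] == label and merged[-1][1] == i:
            merged[-1][1] = i + 1
        else:
            merged.append([i, i + 1, label])
    return [tuple(row) for row in merged]


detected = [(0, 10), (20, 30)]
reference = [(0, 10), (15, 25)]
print(categorize_samples(detected, reference))
print(categorize_samples(detected, reference, n_overall_samples=35))
```

The second call additionally reports the uncovered stretches as tn intervals, which is the effect the n_overall_samples parameter is described as having.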
See also
calculate_matched_gsd_performance_metrics
For calculating performance metrics based on the matches returned by this function.
calculate_unmatched_gsd_performance_metrics
For calculating performance metrics without matching the detected and reference gait sequences.
Examples
>>> import pandas as pd
>>> from mobgap.gait_sequences.evaluation import categorize_intervals_per_sample
>>> detected = pd.DataFrame([[0, 10], [20, 30]], columns=["start", "end"])
>>> reference = pd.DataFrame([[0, 10], [15, 25]], columns=["start", "end"])
>>> result = categorize_intervals_per_sample(
...     gsd_list_detected=detected, gsd_list_reference=reference
... )
>>> result
   start  end match_type
0      0   10         tp
1     15   20         fn
2     20   25         tp
3     25   30         fp
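As a rough illustration of how the categorized intervals feed into performance metrics, the sketch below sums the samples per match type and derives precision, recall, and F1 score. It is not calculate_matched_gsd_performance_metrics: the helper name sample_metrics is hypothetical, the rows are copied by hand from the example output above, and interval length is assumed to be end minus start.

```python
def sample_metrics(categorized):
    """Compute sample-wise precision, recall and F1 from categorized intervals.

    Hypothetical sketch: `categorized` is a list of (start, end, match_type)
    rows and each interval is assumed to cover `end - start` samples.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for start, end, match_type in categorized:
        counts[match_type] += end - start

    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    nan = float("nan")
    return {
        "tp_samples": tp,
        "fp_samples": fp,
        "fn_samples": fn,
        "tn_samples": counts["tn"],
        # Ratios are undefined when their denominator is zero.
        "precision": tp / (tp + fp) if tp + fp else nan,
        "recall": tp / (tp + fn) if tp + fn else nan,
        "f1_score": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else nan,
    }


# Rows taken from the example output above.
rows = [(0, 10, "tp"), (15, 20, "fn"), (20, 25, "tp"), (25, 30, "fp")]
metrics = sample_metrics(rows)
print(metrics["precision"], metrics["recall"], metrics["f1_score"])
```

Metrics that depend on true negatives, such as specificity, can only be computed when tn intervals are present, that is, when n_overall_samples was provided.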