Evaluation
- class mobgap.utils.evaluation.Evaluation(
- dataset: BaseGaitDatasetWithReference,
- scoring: Callable[[T, BaseGaitDatasetWithReference], float | Aggregator[Any] | dict[str, float | Aggregator[Any]]] | Scorer[T, BaseGaitDatasetWithReference],
- *,
- validate_paras: dict | None = None,
- )
Generic Evaluation challenge for all algorithms.
This challenge wraps any valid gait pipeline together with a scoring function and runs and scores it on a dataset.
This is a suitable approach when you want to evaluate and compare algorithms that are not “trainable” in any way, for example traditional algorithms or pre-trained models. Note that if you plan to compare trainable algorithms with non-trainable algorithms, you should use EvaluationCV for all of them.
- Parameters:
- dataset
A gait dataset with reference information. Evaluation is performed across all datapoints within the dataset.
- scoring
A scoring function that evaluates the performance of the algorithm on a single datapoint. It should take a pipeline and a datapoint as input, run the pipeline on the datapoint and return a dictionary of performance metrics. These performance metrics are then aggregated across all datapoints.
- validate_paras
Dictionary with further parameters that are directly passed to validate. This can overwrite all parameters except pipeline, dataset, and scoring. A typical use case is to set n_jobs to activate multiprocessing.
- Other Parameters:
- pipeline
The pipeline passed to the run method.
- Attributes:
- results_
Dictionary with all results of the validation. The results are returned by validate. You can control what information is provided via validate_paras.
- perf_
A dictionary with the performance results of the action method. This includes:
start_datetime_utc_timestamp: The start time of the action in UTC as a timestamp.
start_datetime: The start time of the action as a string.
end_datetime_utc_timestamp: The end time of the action in UTC as a timestamp.
end_datetime: The end time of the action as a string.
runtime_s: The runtime of the action in seconds.
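The contract of the scoring parameter (take a pipeline and one datapoint, run the pipeline, return a dictionary of metrics that is then aggregated across datapoints) can be illustrated with a minimal sketch. It does not use mobgap itself: ToyDatapoint, ToyStepCounter, and toy_score are hypothetical stand-ins, and the mean is assumed as the aggregation purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ToyDatapoint:
    """Hypothetical stand-in for one datapoint of a dataset with reference."""

    signal: list[float]
    reference_n_steps: int


class ToyStepCounter:
    """Hypothetical stand-in for a non-trainable pipeline."""

    def __init__(self, threshold: float = 1.0) -> None:
        self.threshold = threshold

    def run(self, datapoint: ToyDatapoint) -> "ToyStepCounter":
        # Toy logic only: every sample above the threshold counts as a step.
        self.n_steps_ = sum(1 for v in datapoint.signal if v > self.threshold)
        return self


def toy_score(pipeline: ToyStepCounter, datapoint: ToyDatapoint) -> dict[str, float]:
    """Run the pipeline on a single datapoint and return its metrics."""
    result = pipeline.run(datapoint)
    error = result.n_steps_ - datapoint.reference_n_steps
    return {"abs_error": float(abs(error)), "error": float(error)}


dataset = [
    ToyDatapoint(signal=[0.2, 1.5, 1.7, 0.1], reference_n_steps=2),
    ToyDatapoint(signal=[1.2, 0.3, 0.4, 1.1, 1.3], reference_n_steps=4),
]

# One metrics dict per datapoint, then one aggregated value per metric.
single = [toy_score(ToyStepCounter(), dp) for dp in dataset]
aggregated = {f"agg__{name}": mean(s[name] for s in single) for name in single[0]}
```

In the real class, this loop and the aggregation are handled by run and validate; the sketch only shows the shape of the inputs and outputs a scoring function is expected to have.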
Methods
clone() Create a new instance of the class with all parameters copied over.
get_aggregated_results_as_df() Get the aggregated results as a pandas DataFrame.
get_params([deep]) Get parameters for this algorithm.
get_raw_results() Get the raw results of the cross-validation.
get_single_results_as_df([columns]) Get the results as a pandas DataFrame.
run(pipeline) Run the evaluation challenge.
set_params(**params) Set the parameters of this Algorithm.
- __init__(
- dataset: BaseGaitDatasetWithReference,
- scoring: Callable[[T, BaseGaitDatasetWithReference], float | Aggregator[Any] | dict[str, float | Aggregator[Any]]] | Scorer[T, BaseGaitDatasetWithReference],
- *,
- validate_paras: dict | None = None,
- )
- clone() -> Self
Create a new instance of the class with all parameters copied over.
This will create a new instance of the class itself and all nested objects.
- get_aggregated_results_as_df() -> DataFrame
Get the aggregated results as a pandas DataFrame.
This will return all agg__ columns that the scorer returned (see the results_ attribute) as a pandas DataFrame.
The returned DataFrame has a single row with the index 0, and each column represents one aggregated value. This shape is used to provide output equivalent to the results of the cross-validation.
- Returns:
- pd.DataFrame
The results as a pandas DataFrame.
- get_params(deep: bool = True) -> dict[str, Any]
Get parameters for this algorithm.
- Parameters:
- deep
Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).
- Returns:
- params
Parameter names mapped to their values.
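The double-underscore prefix convention for nested parameters can be sketched as follows. ToyAlgo is a hypothetical class written only for this illustration; it mimics the naming scheme described above and is not the library's implementation.

```python
from typing import Any


class ToyAlgo:
    """Hypothetical object that exposes get_params in the style described above."""

    def __init__(self, **params: Any) -> None:
        self._params = params

    def get_params(self, deep: bool = True) -> dict[str, Any]:
        out = dict(self._params)
        if deep:
            for name, value in self._params.items():
                if isinstance(value, ToyAlgo):
                    # Nested parameters are exposed as "<name>__<nested_param>".
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        out[f"{name}__{sub_name}"] = sub_value
        return out


inner = ToyAlgo(threshold=0.5)
outer = ToyAlgo(detector=inner, n_jobs=2)

shallow = outer.get_params(deep=False)  # only the top-level names
deep = outer.get_params(deep=True)  # additionally contains "detector__threshold"
```

With deep=False only detector and n_jobs appear; with deep=True the nested value is additionally reachable under detector__threshold.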
- get_raw_results() -> dict
Get the raw results of the cross-validation.
Get the direct output of the algorithms. These are usually handed down through the single__raw__ parameters of the scoring output.
The exact structure of the results depends on the scorer and the optimizer used. Usually, outputs are provided as pandas DataFrames.
If the individual outputs are DataFrames, they are concatenated along the cv_fold axis. Otherwise, they are simply returned as a list, where each element represents the output of one cv-fold.
- Returns:
- dict
Raw algorithm results from the cross-validation.
- get_single_results_as_df([columns]) -> DataFrame
Get the results as a pandas DataFrame.
This will return the results as a pandas DataFrame with the columns specified in the columns parameter. If no columns are specified, all columns are returned. We exclude single__raw__ columns, as they are by convention reserved for the direct output of the pipeline and usually don’t make sense to view together with the single results.
This will provide one row per data label.
- Parameters:
- columns
List of columns that should be included in the DataFrame. These need to be specified WITHOUT the “single__” prefix (e.g. f1_score instead of single__f1_score). If not specified, all columns are included.
- Returns:
- pd.DataFrame
The results as a pandas DataFrame.
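The column selection rules described for get_single_results_as_df can be sketched in plain Python. select_single_columns is a hypothetical helper operating on a plain dictionary, not part of mobgap; dropping the prefix from the output keys is an assumption of this sketch.

```python
from typing import Any, Optional


def select_single_columns(
    results: dict[str, Any], columns: Optional[list[str]] = None
) -> dict[str, Any]:
    """Pick per-datapoint results, addressed WITHOUT the "single__" prefix."""
    prefix = "single__"
    raw_prefix = "single__raw__"
    selected = {}
    for key, value in results.items():
        # Skip aggregated values and the raw pipeline outputs.
        if not key.startswith(prefix) or key.startswith(raw_prefix):
            continue
        short_name = key[len(prefix):]
        if columns is None or short_name in columns:
            selected[short_name] = value
    return selected


results = {
    "single__f1_score": [0.9, 0.8],
    "single__precision": [1.0, 0.7],
    "single__raw__detected": ["output_a", "output_b"],
    "agg__f1_score": 0.85,
}

all_single = select_single_columns(results)
only_f1 = select_single_columns(results, columns=["f1_score"])
```

Here each list holds one value per datapoint, which corresponds to one row per data label in the returned DataFrame.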
- run(
- pipeline: T,
- ) -> Self
Run the evaluation challenge.
This will call the pipeline for each datapoint in the dataset and evaluate the performance using the provided scoring function.
- Parameters:
- pipeline
A valid pipeline that is compatible with the provided dataset and scorer.
- Returns:
- self
The instance of the class with the results_ attribute set to the results of the validation.
Examples using mobgap.utils.evaluation.Evaluation
Revalidation of the gait sequence detection algorithms
Revalidation of the initial contact detection algorithms
Revalidation of the laterality classification algorithms