GsIterator

class mobgap.pipeline.GsIterator( data_type: type[FullPipelinePerGsResult] = ..., aggregations: Sequence[tuple[str, _aggregator_type[Any]]] = ..., )

class mobgap.pipeline.GsIterator( data_type: type[DataclassT] = ..., aggregations: Sequence[tuple[str, _aggregator_type[Any]]] = ..., )

Iterator to split data into gait-sequences and iterate over them individually.

This can be used to easily iterate over gait-sequences and apply algorithms to them, and then collect the results in a convenient way.

Note that you need to specify the expected results by creating a custom dataclass (learn more in the example linked at the bottom of this page). Each result can further be aggregated by providing an aggregation function.

Parameters:

data_type

A dataclass that defines the result type you expect from each iteration. By default, this is GsIterator.DEFAULT_DATA_TYPE, which should handle all typical results of a gait analysis pipeline.

aggregations

An optional list of aggregations to apply to the results. This has the form [(result_name, aggregation_function), ...]. If a result name is in the list, its aggregation is applied when you access the respective result attribute (i.e. {result_name}_). If no aggregation is defined for a result, a simple list of all results is returned. By default, this is GsIterator.DEFAULT_AGGREGATIONS.
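The dispatch rule for aggregations can be sketched in plain Python. This is a stdlib-only illustration and not mobgap's implementation: the helper `aggregate_results` and the dict-based results are hypothetical, and the two-argument aggregator call `(inputs, values)` is an assumption modeled on the callable type shown for `create_aggregate_df` on this page.

```python
from __future__ import annotations

from typing import Any, Callable, Sequence

# Assumed aggregator shape: (all inputs, all per-iteration values) -> aggregate.
Aggregator = Callable[[list, list], Any]


def aggregate_results(
    inputs: list,
    per_iteration: list[dict[str, Any]],
    aggregations: Sequence[tuple[str, Aggregator]],
) -> dict[str, Any]:
    """Apply an aggregator per result name, or fall back to a plain list."""
    agg_by_name = dict(aggregations)
    names = list(per_iteration[0].keys()) if per_iteration else []
    out: dict[str, Any] = {}
    for name in names:
        values = [result[name] for result in per_iteration]
        if name in agg_by_name:
            # An aggregation is registered for this result name.
            out[name] = agg_by_name[name](inputs, values)
        else:
            # No aggregation registered: return the list of all results.
            out[name] = values
    return out


inputs = ["gs_0", "gs_1"]
per_iteration = [{"n_steps": 4, "label": "a"}, {"n_steps": 6, "label": "b"}]
aggregated = aggregate_results(
    inputs, per_iteration, [("n_steps", lambda ins, vals: sum(vals))]
)
print(aggregated)
```

Only `n_steps` has a registered aggregation here, so `label` comes back as the unmodified list of per-iteration values.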

NULL_VALUE

(Class attribute) The value that is used to initialize the result dataclass. It remains in the results if no result was set for a specific attribute in one or more iterations.

PredefinedParameters

(Class attribute) Predefined parameters that can be used depending on which aggregation you want to use. In all provided cases the data_type is set to FullPipelinePerGsResult. This datatype provides the following attributes:

ic_list (pd.DataFrame with a column called ic): The initial contacts for each gait-sequence.
cad_per_sec (pd.DataFrame): The cadence values within each gait-sequence.
stride_length (pd.DataFrame): The stride length values within each gait-sequence.
gait_speed (pd.DataFrame): The gait speed values within each gait-sequence.

DefaultAggregators

(Class attribute) Class that holds some aggregator functions that can be used to create custom aggregations.

Attributes:

inputs_: List of all input elements that were iterated over.
raw_results_: List of all results as dataclass instances. The attribute of the dataclass instance will have the value of _NOT_SET if no result was set. To check for this, you can use isinstance(val, TypedIterator.NULL_VALUE).
{result_name}_: The aggregated results for the respective result name.
done_: True, if the iterator is done. If the iterator is not done, but you try to access the results, a warning will be raised.
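The idea of a "not set" marker in the raw results can be illustrated with a small stdlib-only sketch. Everything here is hypothetical: the `_NotSet` class, the module-level `NULL_VALUE`, the `Result` dataclass, and the identity check are stand-ins for the pattern and are not the objects exposed by tpcp or mobgap. For real code, use the check described for raw_results_ above.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Any


class _NotSet:
    """Hypothetical sentinel type marking 'no result was set'."""

    def __repr__(self) -> str:
        return "_NOT_SET"


# Hypothetical stand-in for the iterator's null value.
NULL_VALUE = _NotSet()


@dataclass
class Result:
    """Toy result dataclass; every attribute starts out as the null value."""

    ic_list: Any = NULL_VALUE
    cad_per_sec: Any = NULL_VALUE


def missing_fields(result: Result) -> list[str]:
    """Return the names of all attributes that were never filled."""
    return [f.name for f in fields(result) if getattr(result, f.name) is NULL_VALUE]


raw_results = [
    Result(ic_list=[10, 55]),  # cadence was never set in this iteration
    Result(ic_list=[3], cad_per_sec=[101.5]),
]
missing = [missing_fields(r) for r in raw_results]
print(missing)
```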

See also

tpcp.misc.TypedIterator: Generic version of this iterator
iter_gs: Functional interface to iterate over gs.

Methods

`DefaultAggregators`()	Available aggregators for the gait-sequence iterator.
`PredefinedParameters`()	Predefined parameters for the gait-sequence iterator.
`clone`()	Create a new instance of the class with all parameters copied over.
`get_params`([deep])	Get parameters for this algorithm.
`iterate`(data, gs_list)	Iterate over the gait sequences one by one.
`set_params`(**params)	Set the parameters of this Algorithm.

__init__( data_type: type[FullPipelinePerGsResult] = <class 'mobgap.pipeline._gs_iterator.FullPipelinePerGsResult'>, aggregations: Sequence[tuple[str, _aggregator_type[Any]]] = cf([('ic_list', <function create_aggregate_df.<locals>.aggregate_df>), ('cad_per_sec', <function create_aggregate_df.<locals>.aggregate_df>), ('stride_length', <function create_aggregate_df.<locals>.aggregate_df>), ('gait_speed', <function create_aggregate_df.<locals>.aggregate_df>)]), ) → None
__init__( data_type: type[DataclassT] = <class 'mobgap.pipeline._gs_iterator.FullPipelinePerGsResult'>, aggregations: Sequence[tuple[str, _aggregator_type[Any]]] = cf([('ic_list', <function create_aggregate_df.<locals>.aggregate_df>), ('cad_per_sec', <function create_aggregate_df.<locals>.aggregate_df>), ('stride_length', <function create_aggregate_df.<locals>.aggregate_df>), ('gait_speed', <function create_aggregate_df.<locals>.aggregate_df>)]), ) → None

class DefaultAggregators

Available aggregators for the gait-sequence iterator.

Note that all of them are constructors for aggregators, because they have configuration options. To use them as aggregators, you need to call them with the desired configuration.

Examples

>>> from mobgap.pipeline import GsIterator
>>> my_aggregation = [("my_result", GsIterator.DefaultAggregators.create_aggregate_df(["my_col"]))]
>>> iterator = GsIterator(aggregations=my_aggregation)

Methods

create_aggregate_df(*[, ...])

Create an aggregator for the GS iterator that aggregates dataframe results into a single dataframe.

create_aggregate_df( *, fix_gs_offset_index: bool = False, _potential_index_names: Sequence[str] = ('wb_id', 'gs_id'), ) → Callable[[list[tuple[tuple, DataFrame]], list[DataFrame]], DataFrame]

Create an aggregator for the GS iterator that aggregates dataframe results into a single dataframe.

The aggregator will also fix the offset of the given columns by adding the start value of the gait-sequence. This way the final dataframe will have all sample based time-values relative to the start of the recording.

Parameters:

fix_gs_offset_cols: The columns that should be adapted to be relative to the start of the recording. By default, this is ("start", "end"). If you don’t want to fix any columns, you can set this to an empty list.
fix_gs_offset_index: If True, the index of the dataframes will be adapted to be relative to the start of the recording. This only makes sense, if the index represents sample values relative to the start of the gs.
_potential_index_names: The potential names of the index columns. This usually does not need to be changed.
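The offset correction described above is simple arithmetic: values measured relative to the start of a gait-sequence are shifted by that gait-sequence's start sample. A minimal sketch using plain dictionaries instead of dataframes follows. The helper `shift_to_recording` and its arguments are hypothetical and only mirror the idea of the column-offset option; this is not mobgap's code.

```python
from __future__ import annotations

from typing import Sequence


def shift_to_recording(
    rows: list[dict[str, int]],
    gs_start: int,
    cols: Sequence[str] = ("start", "end"),
) -> list[dict[str, int]]:
    """Add the gait-sequence start to the selected columns of every row."""
    return [
        {key: (value + gs_start if key in cols else value) for key, value in row.items()}
        for row in rows
    ]


# Two strides, with sample indices relative to the start of the gait-sequence.
strides = [
    {"start": 20, "end": 130, "n": 1},
    {"start": 130, "end": 245, "n": 2},
]

# The gait-sequence begins at sample 500 of the recording.
shifted = shift_to_recording(strides, gs_start=500)
print(shifted)

# An empty column list leaves all values untouched.
unchanged = shift_to_recording(strides, gs_start=500, cols=[])
```

With a gait-sequence start of 500, a stride spanning samples 20 to 130 within the gait-sequence spans samples 520 to 630 within the recording. Columns that are not listed (here `n`) are left as they are.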

class PredefinedParameters

Predefined parameters for the gait-sequence iterator.

Attributes:

default_aggregation: The default of the TypedIterator using the FullPipelinePerGsResult as data_type and trying to aggregate all results so that the time values in the final outputs are relative to the start of the recording.
default_aggregation_rel_to_gs: Same as default_aggregation, but the time values in the final outputs are relative to the start of the respective gait-sequence (i.e. no modification of the time values is done).

clone() → Self

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

get_params(deep: bool = True) → dict[str, Any]

Get parameters for this algorithm.

Parameters:

deep: Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two "_" at the end).

Returns:

params: Parameter names mapped to their values.
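The double-underscore naming for nested parameters can be shown with a toy class hierarchy. This sketch only demonstrates the naming convention described above. The classes `SimpleAlgo`, `Filter`, and `Detector` are hypothetical and the lookup via `_param_names` is an assumption, not how tpcp collects parameters.

```python
from __future__ import annotations

from typing import Any


class SimpleAlgo:
    """Toy base class that lists its parameter names explicitly (hypothetical)."""

    _param_names: tuple = ()

    def get_params(self, deep: bool = True) -> dict[str, Any]:
        params = {name: getattr(self, name) for name in self._param_names}
        if deep:
            for name, value in list(params.items()):
                if hasattr(value, "get_params"):
                    # Nested params are exposed as "<name>__<sub_name>".
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        params[f"{name}__{sub_name}"] = sub_value
        return params


class Filter(SimpleAlgo):
    _param_names = ("cutoff",)

    def __init__(self, cutoff: float = 3.0) -> None:
        self.cutoff = cutoff


class Detector(SimpleAlgo):
    _param_names = ("filter", "threshold")

    def __init__(self, filter: Filter, threshold: float = 0.5) -> None:
        self.filter = filter
        self.threshold = threshold


detector = Detector(Filter(cutoff=3.0))
deep_params = detector.get_params(deep=True)
shallow_params = detector.get_params(deep=False)
print(sorted(deep_params))
print(sorted(shallow_params))
```

With deep=True the nested `cutoff` appears under the key `filter__cutoff` in addition to the `filter` object itself. With deep=False only the top-level names are returned.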

iterate( data: DataFrame, gs_list: DataFrame, ) → Iterator[tuple[tuple[WalkingBout | GaitSequence, DataFrame], DataclassT]]

Iterate over the gait sequences one by one.

Parameters:

data: The data to iterate over.
gs_list: The list of gait-sequences. The “start” and “end” columns are expected to match the units of the data index.

Yields:

gs_data (tuple[str, pd.DataFrame]): The data per gait-sequence. This is a tuple where the first element is the gait-sequence-id (i.e. the index from the gs-dataframe) and the second element is the data cut from the data dataframe.
result_object: The empty result object (instance of the provided Dataclass) that should be filled with the results during iteration.
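The yield contract above (a data tuple plus an empty result object that the caller fills) can be sketched without any dependencies. This is an illustration of the pattern only: `MiniGsIterator`, `PerGsResult`, the list-based data, and the half-open `data[start:end]` slicing are all assumptions and do not reproduce mobgap's implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Iterator


@dataclass
class PerGsResult:
    """Hypothetical result dataclass that is filled once per gait-sequence."""

    n_samples: Any = None
    first_sample: Any = None


class MiniGsIterator:
    """Toy iterator: cuts the data per gait-sequence and collects the results."""

    def __init__(self, data_type: type) -> None:
        self.data_type = data_type
        self.inputs_: list = []
        self.raw_results_: list = []

    def iterate(
        self, data: list, gs_list: dict[str, tuple[int, int]]
    ) -> Iterator[tuple[tuple[str, list], Any]]:
        for gs_id, (start, end) in gs_list.items():
            result = self.data_type()  # fresh, empty result object
            self.inputs_.append(gs_id)
            self.raw_results_.append(result)
            # The caller mutates `result`; the iterator keeps the reference.
            yield (gs_id, data[start:end]), result


data = list(range(100))
gs_list = {"gs_0": (10, 20), "gs_1": (50, 80)}

iterator = MiniGsIterator(PerGsResult)
for (gs_id, gs_data), result in iterator.iterate(data, gs_list):
    result.n_samples = len(gs_data)
    result.first_sample = gs_data[0]

print(iterator.raw_results_)
```

Because the iterator stores a reference to each yielded result object, everything assigned inside the loop is available afterwards without the caller collecting anything manually.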

property results_: DataclassT

The aggregated results.

Note that this returns an instance of the result object, even though the datatypes of the attributes might differ depending on the aggregation function. We still decided it makes sense to return an instance of the result object, because it allows autocompletion of the attributes, even though the associated types might not be correct.

set_params(**params: Any) → Self