MobilisedAggregator

class mobgap.aggregation.MobilisedAggregator(groupby: Sequence[str] | None = cf(None), *, unique_wb_id_column: str = cf('wb_id'), use_original_names: bool = cf(False))

Implementation of the aggregation algorithm utilized in Mobilise-D.

This algorithm aggregates DMO parameters across single walking bouts. The calculated parameters are divided into groups based on the duration of the walking bout. For every group, a different set of parameters is available. The groups are defined as follows ([original_mobilised_name/new_name]):

All walking bout [parameters with “all”], available parameters:
- Duration of all walking bouts in h [“walkdur_all_sum”/”total_walking_duration_h”]
- Number of steps in all walking bouts [“steps_all_sum”/”wb_all__n_steps__sum”]
- Number of turns in all walking bouts [“turns_all_sum”/”wb_all__n_turns__sum”]
- Number of walking bouts [“wb_all_sum”/”wb_all__count”]
- Median duration of walking bouts in s [“wbdur_all_avg”/”wb_all__duration_s__avg”]
- 90th percentile of duration of walking bouts in s [“wbdur_all_max”/”wb_all__duration_s__max”]
- Coefficient of variation in walking bout duration in s [“wbdur_all_var”/”wb_all__duration_s__var”]
- Average cadence of walking bouts in steps/min [“cadence_all_avg”/”wb_all__cadence_spm__avg”]
- Average stride duration of walking bouts in s [“strdur_all_avg”/”wb_all__stride_duration_s__avg”]
- Coefficient of variation in cadence of walking bouts in steps/min [“cadence_all_var”/”wb_all__cadence_spm__var”]
- Coefficient of variation in stride duration of walking bouts in s [“strdur_all_var”/”wb_all__stride_duration_s__var”]
Walking bouts with duration between 10 and 30 seconds [parameters with “1030”], available parameters:
- Average stride speed of walking bouts in m/s [“ws_1030_avg”/”wb_10_30__walking_speed_mps__avg”]
- Average stride length of walking bouts in cm [“strlen_1030_avg”/”wb_10_30__stride_length_m__avg”]
Walking bouts with duration longer than 10 seconds [parameters with “10”], available parameters:
- Number of walking bouts [“wb_10_sum”/”wb_10__count”]
- 90th percentile of stride speed of walking bouts in m/s [“ws_10_max”/”wb_10__walking_speed_mps__max”]
Walking bouts with duration longer than 30 seconds [parameters with “30”], available parameters:
- Number of walking bouts [“wb_30_sum”/”wb_30__count”]
- Average stride speed of walking bouts in m/s [“ws_30_avg”/”wb_30__walking_speed_mps__avg”]
- Average stride length of walking bouts in cm [“strlen_30_avg”/”wb_30__stride_length_m__avg”]
- Average cadence of walking bouts in steps/min [“cadence_30_avg”/”wb_30__cadence_spm__avg”]
- Average stride duration of walking bouts in s [“strdur_30_avg”/”wb_30__stride_duration_s__avg”]
- 90th percentile of stride speed of walking bouts in m/s [“ws_30_max”/”wb_30__walking_speed_mps__max”]
- 90th percentile of cadence of walking bouts in steps/min [“cadence_30_max”/”wb_30__cadence_spm__max”]
- Coefficient of variation in stride speed of walking bouts in m/s [“ws_30_var”/”wb_30__walking_speed_mps__var”]
- Coefficient of variation in stride length of walking bouts in cm [“strlen_30_var”/”wb_30__stride_length_m__var”]
Walking bouts with duration longer than 60 seconds [parameters with “60”], available parameters:
- Number of walking bouts [“wb_60_sum”/”wb_60__count”]
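
The duration groups above overlap: a single walking bout can contribute to several of them. The following minimal sketch illustrates this. The function name `duration_groups` is hypothetical, and whether each boundary is inclusive or exclusive is an assumption, as the text above does not specify it.

```python
def duration_groups(duration_s: float) -> list[str]:
    """Return the duration groups a walking bout of the given length falls into."""
    groups = ["all"]  # every walking bout counts towards the "all" parameters
    # Assumption: lower bounds are exclusive and the 30 s upper bound is inclusive.
    if 10 < duration_s <= 30:
        groups.append("10_30")
    if duration_s > 10:
        groups.append("10")
    if duration_s > 30:
        groups.append("30")
    if duration_s > 60:
        groups.append("60")
    return groups


for duration in (8.0, 25.0, 45.0, 90.0):
    print(duration, duration_groups(duration))
```

A 90 s bout, for example, counts towards the “all”, “10”, “30”, and “60” parameters, but not towards the “1030” parameters.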

Each of the above-mentioned parameters is added as a distinct column in the aggregated data. Which of the parameters are calculated depends on the columns available in the input data. All parameters are calculated when all of the following columns are available:

duration_s
n_steps
n_turns
walking_speed_mps
stride_length_m
cadence_spm
stride_duration_s

Otherwise, only parameters for which the corresponding DMO data is provided are added to the aggregation results. For example, if the input data does not contain a “stride_length_m” column, the “strlen_1030_avg”, “strlen_30_avg”, and “strlen_30_var” parameters are not calculated. Furthermore, if no “duration_s” column is provided, only the “all” parameters that do not depend on a duration filter are calculated.
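
The column dependency described above can be pictured as a lookup from each output parameter to the input columns it needs. This is only an illustrative sketch: the mapping `REQUIRED_COLUMNS` and the function `available_parameters` are hypothetical, the mapping covers only a subset of the parameters, and it reflects one reading of the lists above rather than the library's internal logic.

```python
# Assumed dependencies for a few original-name parameters, transcribed from the
# parameter lists above. Parameters of a duration-filtered group also need
# "duration_s", because the filter itself requires it.
REQUIRED_COLUMNS = {
    "steps_all_sum": {"n_steps"},
    "turns_all_sum": {"n_turns"},
    "cadence_all_avg": {"cadence_spm"},
    "strlen_1030_avg": {"duration_s", "stride_length_m"},
    "strlen_30_avg": {"duration_s", "stride_length_m"},
    "strlen_30_var": {"duration_s", "stride_length_m"},
    "ws_30_avg": {"duration_s", "walking_speed_mps"},
    "wb_60_sum": {"duration_s"},
}


def available_parameters(input_columns):
    """List the parameters whose required columns are all present."""
    columns = set(input_columns)
    return [name for name, needed in REQUIRED_COLUMNS.items() if needed <= columns]


# Without "stride_length_m", none of the "strlen_*" parameters are listed.
print(available_parameters(["duration_s", "n_steps", "cadence_spm"]))
```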

The aggregation parameters are calculated for every unique group of the groupby. By default, one set of aggregation results is calculated per participant and recording date. This can be changed by passing a different list of groupby columns.
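
To make the role of groupby concrete, here is a minimal plain-Python sketch of per-group aggregation for three of the “all” parameters. The helper `aggregate_per_group` and the sample rows are hypothetical; the real class operates on DataFrames and computes many more parameters.

```python
from collections import defaultdict

# Hypothetical per-walking-bout rows with metadata and two DMO columns.
walking_bouts = [
    {"subject_code": "P1", "visit_date": "2023-01-01", "wb_id": 0, "duration_s": 1800.0, "n_steps": 20},
    {"subject_code": "P1", "visit_date": "2023-01-01", "wb_id": 1, "duration_s": 1800.0, "n_steps": 30},
    {"subject_code": "P1", "visit_date": "2023-01-02", "wb_id": 0, "duration_s": 900.0, "n_steps": 15},
]


def aggregate_per_group(rows, groupby):
    """Aggregate rows per unique combination of the groupby columns."""
    grouped = defaultdict(list)
    for row in rows:
        # With groupby=None, every row lands in the same (empty-key) group.
        key = tuple(row[col] for col in groupby) if groupby else ()
        grouped[key].append(row)
    return {
        key: {
            "wb_all__count": len(group),
            "total_walking_duration_h": sum(r["duration_s"] for r in group) / 3600,
            "wb_all__n_steps__sum": sum(r["n_steps"] for r in group),
        }
        for key, group in grouped.items()
    }


daily = aggregate_per_group(walking_bouts, ["subject_code", "visit_date"])
overall = aggregate_per_group(walking_bouts, None)
print(daily)
print(overall)
```

Grouping by “subject_code” and “visit_date” yields one result per day, while passing None yields a single result over all walking bouts.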

Parameters:

groupby: A list of columns to group the data by. Based on the resulting groups, the aggregations are calculated. Possible groupings are e.g. by participant, recording date, or trial. To generate daily aggregations (the default), the groupby should contain the columns “subject_code” and “visit_date”. If groupby is set to None, the data is aggregated without grouping (i.e. all WBs are aggregated together). This is useful when the aggregation is run on a single recording.
unique_wb_id_column: The name of the column (or index level) containing a unique identifier for every walking bout. The id does not have to be unique globally, but only within the groups defined by groupby. In other words, wb_dmos.reset_index().set_index([*groupby, unique_wb_id_column]).index.is_unique must be True.
use_original_names: If True, the original names used in Mobilise-D are used for the aggregated data. They use shorthands for many parameters and are hence less easy to understand. The “new” names are more descriptive and are therefore the default.
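
The uniqueness requirement on unique_wb_id_column can be checked before aggregating. The sketch below is a plain-Python equivalent of the index check quoted above; the helper `duplicated_wb_keys` is hypothetical and not part of the library.

```python
from collections import Counter


def duplicated_wb_keys(rows, groupby, unique_wb_id_column="wb_id"):
    """Return every (groupby..., wb_id) key that occurs more than once."""
    key_columns = [*(groupby or []), unique_wb_id_column]
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


rows = [
    {"subject_code": "P1", "visit_date": "2023-01-01", "wb_id": 0},
    {"subject_code": "P1", "visit_date": "2023-01-02", "wb_id": 0},  # same id, different group
    {"subject_code": "P1", "visit_date": "2023-01-02", "wb_id": 1},
]

# The ids repeat across days, which is fine when grouping by day ...
print(duplicated_wb_keys(rows, ["subject_code", "visit_date"]))
# ... but not when all walking bouts are aggregated together.
print(duplicated_wb_keys(rows, None))
```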

Other Parameters:

wb_dmos

The DMO data per walking bout passed to the aggregate method.

wb_dmos_mask

A boolean DataFrame with the same shape as wb_dmos, indicating the validity of every measure. Like the data, the wb_dmos_mask must have the groupby columns and the unique_wb_id_column available as either index levels or columns. After setting all of them as index, the index must be identical to that of the data. Every column of the mask corresponds to a column of wb_dmos and has the same name. If an entry is False, the corresponding measure is implausible and should be ignored for the aggregations.

To exclude implausible data points from the input data, a wb_dmos_mask can be passed to the aggregate method. A column in wb_dmos_mask is only considered if a column with the same name exists in the input data. Note that depending on which DMO measure is flagged as implausible, different elimination steps are applied:

“duration_s”: The whole walking bout is not considered for the aggregation.
“n_steps”: The corresponding “n_steps” is not regarded.
“n_turns”: The corresponding “n_turns” is not regarded.
“walking_speed_mps”: The corresponding “walking_speed_mps” is not regarded.
“stride_length_m”: The corresponding “stride_length_m” AND the corresponding “walking_speed_mps” are not regarded.
“cadence_spm”: The corresponding “cadence_spm” AND the corresponding “walking_speed_mps” are not regarded.
“stride_duration_s”: The corresponding “stride_duration_s” is not regarded.
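
The elimination rules above can be summarised in a few lines of plain Python. This is a sketch only: `apply_mask` and `DEPENDENT_COLUMNS` are hypothetical names, a single walking bout is modelled as a dict, and ignored values are represented by None, which is an assumption about the representation rather than the library's behaviour.

```python
# Columns that are ignored in addition to the flagged column itself.
DEPENDENT_COLUMNS = {
    "stride_length_m": ["walking_speed_mps"],
    "cadence_spm": ["walking_speed_mps"],
}


def apply_mask(wb, mask):
    """Return the filtered walking bout, or None if the whole bout is dropped."""
    # An implausible duration removes the entire walking bout.
    if "duration_s" in wb and mask.get("duration_s", True) is False:
        return None
    filtered = dict(wb)
    for column, plausible in mask.items():
        # Mask columns without a matching data column are skipped.
        if plausible is False and column in filtered:
            for target in (column, *DEPENDENT_COLUMNS.get(column, [])):
                if target in filtered:
                    filtered[target] = None
    return filtered


wb = {"duration_s": 20.0, "stride_length_m": 1.2, "walking_speed_mps": 1.1, "cadence_spm": 100.0}
print(apply_mask(wb, {"stride_length_m": False}))  # stride length and walking speed ignored
print(apply_mask(wb, {"duration_s": False}))       # whole walking bout dropped
```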

Attributes:

aggregated_data_: A dataframe containing the aggregated results. The index of the dataframe contains the groupby columns. Consequently, there is one row with aggregation results for each group.
filtered_wb_dmos_: An updated version of wb_dmos with the implausible entries removed based on wb_dmos_mask. filtered_wb_dmos_ will have the groupby columns and the unique_wb_id_column set as index.

Notes

The outputs of this aggregation algorithm are analogous to the outputs of the original Mobilise-D R-Script for aggregation (when using use_original_names=True), with 3 exceptions: values are not rounded to 3 decimal places, stride length values are not converted to cm, and variance values are expressed as ratios instead of percentages. This is done for consistency within mobgap, but if you want to directly reproduce the original Mobilise-D results, you can round the values and convert stride length to cm manually.
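
The manual conversion could look roughly like the sketch below. It assumes the results of one group are available as a plain dict keyed by the original Mobilise-D names, that all “_var” parameters are ratios, and that the “strlen_*” averages are in metres; `to_original_conventions` is a hypothetical helper, not part of mobgap.

```python
def to_original_conventions(aggregated):
    """Convert one row of results to the conventions of the original R script."""
    converted = {}
    for name, value in aggregated.items():
        if name.endswith("_var"):
            value = value * 100  # ratio -> percent
        elif name.startswith("strlen_"):
            value = value * 100  # m -> cm
        converted[name] = round(value, 3)  # original outputs use 3 decimal places
    return converted


row = {"strlen_30_avg": 1.23456, "strlen_30_var": 0.12345, "ws_30_avg": 1.04567, "wb_30_sum": 4}
print(to_original_conventions(row))
```

Note that “strlen_30_var” is a coefficient of variation and is therefore converted to a percentage, not to cm.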

However, there can be small differences in the second/third decimal place range in the results. This is due to different outputs of the quantile function in Python and R.
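
Why two quantile implementations can disagree is easy to demonstrate with the standard library alone. The sketch below compares the two interpolation methods offered by `statistics.quantiles` on made-up durations. Which definitions mobgap and the R script actually use is not stated here, so this only illustrates the general effect.

```python
import statistics

wb_durations_s = [10, 20, 30, 40, 50]  # made-up walking bout durations

# With n=10, quantiles() returns the 9 decile cut points; the last one is the
# 90th percentile. The two methods interpolate between data points differently.
p90_inclusive = statistics.quantiles(wb_durations_s, n=10, method="inclusive")[-1]
p90_exclusive = statistics.quantiles(wb_durations_s, n=10, method="exclusive")[-1]

print(p90_inclusive, p90_exclusive)
```

The difference shrinks as the number of walking bouts grows, which is consistent with deviations appearing only in the later decimal places.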

Methods

`aggregate`(wb_dmos, *[, wb_dmos_mask])	Aggregate parameters across walking bouts.
`clone`()	Create a new instance of the class with all parameters copied over.
`get_params`([deep])	Get parameters for this algorithm.
`set_params`(**params)	Set the parameters of this Algorithm.

PredefinedParameters

__init__(groupby: Sequence[str] | None = cf(None), *, unique_wb_id_column: str = cf('wb_id'), use_original_names: bool = cf(False)) → None

aggregate(wb_dmos: DataFrame, *, wb_dmos_mask: DataFrame | None = None, **_: Unpack[dict[str, Any]]) → Self

Aggregate parameters across walking bouts.

Parameters:

wb_dmos: The DMO data per walking bout. This is a dataframe with one row for every walking bout and one column for every DMO parameter. This should further have relevant metadata (i.e. participant_id, visit_date, wb_id) as columns or indices. The specific requirements depend on the aggregation algorithm.
wb_dmos_mask: A boolean DataFrame with the same shape as wb_dmos, indicating the validity of every measure. If the DataFrame contains a NaN value, this is interpreted as True: it is assumed that no checks were applied to this value, and the corresponding measure is regarded as plausible.

Returns:

self: The instance of the class with the aggregated_data_ attribute set to the aggregation results.
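
The NaN handling of wb_dmos_mask described above amounts to treating a missing check as a passed check. A minimal sketch for a single mask entry follows. The helper `is_plausible` is hypothetical, and treating None like NaN is an additional assumption.

```python
import math


def is_plausible(mask_value):
    """Interpret one mask entry; missing entries count as plausible."""
    if mask_value is None:  # assumption: None is handled like NaN
        return True
    if isinstance(mask_value, float) and math.isnan(mask_value):
        return True  # no check was applied, so the value is kept
    return bool(mask_value)


print([is_plausible(v) for v in (True, False, float("nan"))])
```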

clone() → Self

Create a new instance of the class with all parameters copied over.

This will create a new instance of the class itself and of all nested objects.

get_params(deep: bool = True) → dict[str, Any]

Get parameters for this algorithm.

Parameters:

deep: Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).

Returns: