MobilisedAggregator
- class mobgap.aggregation.MobilisedAggregator(
- groupby: Sequence[str] | None = cf(None),
- *,
- unique_wb_id_column: str = cf('wb_id'),
- use_original_names: bool = cf(False),
Implementation of the aggregation algorithm utilized in Mobilise-D.
This algorithm aggregates DMO parameters across single walking bouts. The calculated parameters are divided into 5 groups based on the duration of the walking bout. For every group, a different set of possible parameters is available. The groups are defined as follows ([original_mobilised_name/new_name]):
All walking bout [parameters with “all”], available parameters:
Duration of all walking bouts in h [“walkdur_all_sum”/”total_walking_duration_h”]
Number of steps in all walking bouts [“steps_all_sum”/”wb_all__n_steps__sum”]
Number of turns in all walking bouts [“turns_all_sum”/”wb_all__n_turns__sum”]
Number of walking bouts [“wb_all_sum”/”wb_all__count”]
Median duration of walking bouts in s [“wbdur_all_avg”/”wb_all__duration_s__avg”]
90th percentile of duration of walking bouts in s [“wbdur_all_max”/”wb_all__duration_s__max”]
Coefficient of variation in walking bout duration in s [“wbdur_all_var”/”wb_all__duration_s__var”]
Average cadence of walking bouts in steps/min [“cadence_all_avg”/”wb_all__cadence_spm__avg”]
Average stride duration of walking bouts in s [“strdur_all_avg”/”wb_all__stride_duration_s__avg”]
Coefficient of variation in cadence of walking bouts in steps/min [“cadence_all_var”/”wb_all__cadence_spm__var”]
Coefficient of variation in stride duration of walking bouts in s [“strdur_all_var”/”wb_all__stride_duration_s__var”]
Walking bouts with duration between 10 and 30 seconds [parameters with “1030”], available parameters:
Average stride speed of walking bouts in m/s [“ws_1030_avg”/”wb_10_30__walking_speed_mps__avg”]
Average stride length of walking bouts in m (cm in the original Mobilise-D output) [“strlen_1030_avg”/”wb_10_30__stride_length_m__avg”]
Walking bouts with duration longer than 10 seconds [parameters with “10”], available parameters:
Number of walking bouts [“wb_10_sum”/”wb_10__count”]
90th percentile of stride speed of walking bouts in m/s [“ws_10_max”/”wb_10__walking_speed_mps__max”]
Walking bouts with duration longer than 30 seconds [parameters with “30”], available parameters:
Number of walking bouts [“wb_30_sum”/”wb_30__count”]
Average stride speed of walking bouts in m/s [“ws_30_avg”/”wb_30__walking_speed_mps__avg”]
Average stride length of walking bouts in m (cm in the original Mobilise-D output) [“strlen_30_avg”/”wb_30__stride_length_m__avg”]
Average cadence of walking bouts in steps/min [“cadence_30_avg”/”wb_30__cadence_spm__avg”]
Average stride duration of walking bouts in s [“strdur_30_avg”/”wb_30__stride_duration_s__avg”]
90th percentile of stride speed of walking bouts in m/s [“ws_30_max”/”wb_30__walking_speed_mps__max”]
90th percentile of cadence of walking bouts in steps/min [“cadence_30_max”/”wb_30__cadence_spm__max”]
Coefficient of variation in stride speed of walking bouts in m/s [“ws_30_var”/”wb_30__walking_speed_mps__var”]
Coefficient of variation in stride length of walking bouts in cm [“strlen_30_var”/”wb_30__stride_length_m__var”]
Walking bouts with duration longer than 60 seconds [parameters with “60”], available parameters:
Number of walking bouts [“wb_60_sum”/”wb_60__count”]
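As a rough illustration of the grouping above, the following sketch assigns walking bouts to their duration groups and computes the three kinds of summary listed for bout duration (median, 90th percentile, coefficient of variation). It is not mobgap's implementation: the function names are hypothetical, the handling of the exact 10/30/60 s boundaries is an assumption, and the quantile method and the use of the sample standard deviation are choices made for this example only. The durations are made-up values.

```python
import statistics


def duration_groups(duration_s: float) -> list[str]:
    """Return the labels of all duration groups a walking bout belongs to.

    Assumption: lower bounds are exclusive and the 30 s upper bound is inclusive.
    """
    groups = ["all"]  # every walking bout counts towards the "all" group
    if 10 < duration_s <= 30:
        groups.append("10_30")
    if duration_s > 10:
        groups.append("10")
    if duration_s > 30:
        groups.append("30")
    if duration_s > 60:
        groups.append("60")
    return groups


def summarise(values: list[float]) -> dict[str, float]:
    """Median, 90th percentile, and coefficient of variation of `values`."""
    return {
        "median": statistics.median(values),
        # quantiles(n=10) returns the 9 decile cut points; the last one is the 90th percentile
        "p90": statistics.quantiles(values, n=10, method="inclusive")[-1],
        # coefficient of variation as a ratio (not a percentage)
        "cv": statistics.stdev(values) / statistics.mean(values),
    }


durations_s = [5, 12, 20, 45, 90]  # made-up walking bout durations in seconds

# Count how many walking bouts fall into each group
counts: dict[str, int] = {}
for duration in durations_s:
    for group in duration_groups(duration):
        counts[group] = counts.get(group, 0) + 1

stats = summarise(durations_s)
```

Note that the groups overlap: a 45 s walking bout contributes to the "all", "10", and "30" groups at the same time.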
Each of the above-mentioned parameters is added as a distinct column to the aggregated data. Which of the parameters are calculated depends on the columns available in the input data. All parameters are calculated when all of the following columns are available:
duration_s
n_steps
n_turns
walking_speed_mps
stride_length_m
cadence_spm
stride_duration_s
Otherwise, only parameters for which the corresponding DMO data is provided are added to the aggregation results. For example, if the input data does not contain a “stride_length_m” column, the “strlen_1030_avg”, “strlen_30_avg”, and “strlen_30_var” parameters are not calculated. Furthermore, if no “duration_s” column is provided, only the “all”-parameters that do not require a duration filter are calculated.
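The column dependency described above can be sketched as a simple lookup. The mapping below is an illustrative subset using the original Mobilise-D names, and both the mapping and the helper `available_parameters` are hypothetical; they only mirror the rule stated in the text and are not taken from mobgap's source.

```python
# Illustrative subset: parameter name -> DMO column it is computed from.
# None means the parameter only counts walking bouts.
PARAMETER_SOURCE = {
    "wb_all_sum": None,
    "steps_all_sum": "n_steps",
    "cadence_all_avg": "cadence_spm",
    "walkdur_all_sum": "duration_s",
    "ws_1030_avg": "walking_speed_mps",
    "strlen_1030_avg": "stride_length_m",
    "wb_10_sum": None,
    "strlen_30_avg": "stride_length_m",
    "strlen_30_var": "stride_length_m",
    "cadence_30_avg": "cadence_spm",
}


def available_parameters(columns: list[str]) -> list[str]:
    """Return the parameters that can be computed from the given input columns."""
    present = set(columns)
    result = []
    for name, source in PARAMETER_SOURCE.items():
        # The DMO column the parameter is computed from must be present
        if source is not None and source not in present:
            continue
        # Every group except "all" filters by duration and hence needs duration_s
        if "_all_" not in name and "duration_s" not in present:
            continue
        result.append(name)
    return result


all_columns = [
    "duration_s", "n_steps", "n_turns", "walking_speed_mps",
    "stride_length_m", "cadence_spm", "stride_duration_s",
]
without_stride_length = available_parameters(
    [c for c in all_columns if c != "stride_length_m"]
)
without_duration = available_parameters(
    [c for c in all_columns if c != "duration_s"]
)
```

With `stride_length_m` missing, none of the `strlen_*` parameters remain; with `duration_s` missing, only the "all" parameters that do not depend on the duration survive.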
The aggregation parameters are calculated for every unique group of the groupby. By default, one set of aggregation results is calculated per participant and recording date. This can, however, be adapted by passing a different list as groupby.
- Parameters:
- groupby
A list of columns to group the data by. Based on the resulting groups, the aggregations are calculated. Possible groupings are e.g. by participant, recording date, or trial. To generate daily aggregations (the default), groupby should contain the columns “subject_code” and “visit_date”. If groupby is set to None, the data is aggregated without grouping (i.e. all WBs are aggregated together). This is useful when aggregation is run on a single recording.
- unique_wb_id_column
The name of the column (or index level) containing a unique identifier for every walking bout. The id does not have to be unique globally, but only within the groups defined by groupby. In other words, wb_dmos.reset_index().set_index([*groupby, unique_wb_id_column]).index.is_unique must be True.
- use_original_names
If True, the original names used in Mobilise-D are used for the aggregated data. They use shorthands for many parameters and are hence less easy to understand. The “new” names are more descriptive and are hence the default.
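The interplay between groupby and unique_wb_id_column can be illustrated with a plain-Python stand-in for the pandas uniqueness check quoted above. The helper `group_walking_bouts` and the rows are made up for this sketch; walking bouts are represented as dictionaries rather than DataFrame rows to keep the example dependency-free.

```python
from collections import defaultdict


def group_walking_bouts(rows, groupby, unique_wb_id_column="wb_id"):
    """Group walking bouts and enforce that the id is unique within each group."""
    group_columns = list(groupby or [])  # groupby=None means one single group
    # Equivalent in spirit to checking that the combined index is unique
    keys = [
        tuple(row[c] for c in [*group_columns, unique_wb_id_column])
        for row in rows
    ]
    if len(set(keys)) != len(keys):
        raise ValueError("Walking bout ids are not unique within the groups.")

    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in group_columns)].append(row)
    return dict(groups)


rows = [
    {"subject_code": "P1", "visit_date": "2023-01-01", "wb_id": 0, "n_steps": 20},
    {"subject_code": "P1", "visit_date": "2023-01-01", "wb_id": 1, "n_steps": 35},
    {"subject_code": "P2", "visit_date": "2023-01-01", "wb_id": 0, "n_steps": 50},
]

# Daily grouping: wb_id 0 appears twice overall, but only once per group
daily = group_walking_bouts(rows, ["subject_code", "visit_date"])

# Without grouping, the same ids collide because all rows form one group
try:
    group_walking_bouts(rows, None)
    ungrouped_is_valid = True
except ValueError:
    ungrouped_is_valid = False
```

The same data is therefore valid input for a daily grouping but would need globally unique walking bout ids before it could be aggregated with groupby set to None.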
- Other Parameters:
- wb_dmos
The DMO data per walking bout passed to the aggregate method.
- wb_dmos_mask
A boolean DataFrame with the same shape as wb_dmos indicating the validity of every measure. Like the data, the wb_dmos_mask must have the groupby columns and the unique_wb_id_column available either as index or as columns. After setting all of them as index, the index must be identical to that of the data. Every column of the data mask corresponds to a column of wb_dmos and has the same name. If an entry is False, the corresponding measure is implausible and should be ignored for the aggregations.
To exclude implausible data points from the input data, a wb_dmos_mask can be passed to the aggregate method. The columns in wb_dmos_mask are regarded if there exists a column in the input data with the same name. Note that depending on which DMO measure is flagged as implausible, different elimination steps are applied:
“duration_s”: The whole walking bout is not considered for the aggregation.
“n_steps”: The corresponding “n_steps” is not regarded.
“n_turns”: The corresponding “n_turns” is not regarded.
“walking_speed_mps”: The corresponding “walking_speed_mps” is not regarded.
“stride_length_m”: The corresponding “stride_length_m” AND the corresponding “walking_speed_mps” are not regarded.
“cadence_spm”: The corresponding “cadence_spm” AND the corresponding “walking_speed_mps” are not regarded.
“stride_duration_s”: The corresponding “stride_duration_s” is not regarded.
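The elimination rules above can be sketched for a single walking bout as follows. This is a simplified, hypothetical helper (`apply_mask`) working on dictionaries instead of DataFrames; it also treats a missing or NaN mask entry as valid, following the description of wb_dmos_mask for the aggregate method. It is meant to show the dependency between measures, not how mobgap implements the filtering.

```python
import math

# Measures that are dropped in addition to the flagged measure itself
ALSO_DROPPED = {
    "stride_length_m": ("walking_speed_mps",),
    "cadence_spm": ("walking_speed_mps",),
}


def apply_mask(wb: dict, mask: dict):
    """Return the walking bout with implausible values set to None.

    Returns None if the whole walking bout has to be discarded.
    """

    def is_valid(column: str) -> bool:
        flag = mask.get(column, True)  # no mask column: value is kept
        if isinstance(flag, float) and math.isnan(flag):
            return True  # NaN: no check was applied, regarded as plausible
        return bool(flag)

    # An implausible duration removes the whole walking bout
    if "duration_s" in wb and not is_valid("duration_s"):
        return None

    filtered = dict(wb)
    for column in wb:
        if is_valid(column):
            continue
        filtered[column] = None
        for dependent in ALSO_DROPPED.get(column, ()):
            if dependent in filtered:
                filtered[dependent] = None
    return filtered


# Made-up walking bout for illustration
wb = {
    "duration_s": 25.0,
    "stride_length_m": 1.2,
    "walking_speed_mps": 1.1,
    "cadence_spm": 105.0,
}

bad_stride_length = apply_mask(wb, {"stride_length_m": False})
bad_duration = apply_mask(wb, {"duration_s": False})
unchecked = apply_mask(wb, {"cadence_spm": float("nan")})
```

Flagging the stride length also removes the walking speed of that bout, while the cadence is left untouched.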
- Attributes:
- aggregated_data_
A dataframe containing the aggregated results. The index of the dataframe contains the groupby columns. Consequently, there is one row with aggregation results for each group.
- filtered_wb_dmos_
An updated version of wb_dmos with the implausible entries removed based on wb_dmos_mask. filtered_wb_dmos_ will have the groupby columns and the unique_wb_id_column set as index.
Notes
The outputs of this aggregation algorithm are analogous to the outputs of the original Mobilise-D R-Script for aggregation (when using use_original_names=True), with 3 exceptions: values are not rounded to 3 decimal places, stride length values are not converted to cm, and variance values are expressed as ratios instead of percentages. This is done for consistency within mobgap, but if you want to directly reproduce the original Mobilise-D results, you can round the values and convert stride length to cm manually.
However, there can be small differences in the second/third decimal place range in the results. This is due to different outputs of the quantile function in Python and R.
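A manual conversion towards the original output scale could look like the sketch below. It assumes that the original names are in use, that stride length parameters can be recognised by the "strlen_" prefix and variation parameters by the "_var" suffix, and that percentages are obtained by multiplying the ratio by 100. The helper `to_original_scale` and the input values are made up for illustration.

```python
def to_original_scale(name: str, value: float) -> float:
    """Convert one aggregated value to the scale of the original R-Script output."""
    if name.endswith("_var"):
        # Coefficient of variation: ratio -> percent (unit-free, so no m -> cm step)
        value = value * 100
    elif name.startswith("strlen_"):
        # Stride length: m -> cm
        value = value * 100
    # The original outputs are rounded to 3 decimal places
    return round(value, 3)


# Made-up aggregation results using the original Mobilise-D names
aggregated = {
    "strlen_30_avg": 1.23456,
    "strlen_30_var": 0.12345,
    "ws_30_avg": 1.23456,
}
converted = {name: to_original_scale(name, value) for name, value in aggregated.items()}
```

Even after this conversion, the quantile-based parameters can still differ slightly from the R results, as described above.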
Methods
aggregate(wb_dmos, *[, wb_dmos_mask]): Aggregate parameters across walking bouts.
clone(): Create a new instance of the class with all parameters copied over.
get_params([deep]): Get parameters for this algorithm.
set_params(**params): Set the parameters of this Algorithm.
PredefinedParameters
- __init__(
- groupby: Sequence[str] | None = cf(None),
- *,
- unique_wb_id_column: str = cf('wb_id'),
- use_original_names: bool = cf(False),
- aggregate( ) -> Self
Aggregate parameters across walking bouts.
- Parameters:
- wb_dmos
The DMO data per walking bout. This is a dataframe with one row for every walking bout and one column for every DMO parameter. This should further have relevant metadata (i.e. participant_id, visit_date, wb_id) as columns or indices. The specific requirements depend on the aggregation algorithm.
- wb_dmos_mask
A boolean DataFrame with the same shape as wb_dmos indicating the validity of every measure. If the DataFrame contains a NaN value, this is interpreted as True, assuming no checks were applied to this value and the corresponding measure is regarded as plausible.
- Returns:
- self
The instance of the class with the aggregated_data_ attribute set to the aggregation results.
- clone() -> Self
Create a new instance of the class with all parameters copied over.
This will create a new instance of the class itself and all nested objects.
- get_params(deep: bool = True) -> dict[str, Any]
Get parameters for this algorithm.
- Parameters:
- deep
Only relevant if object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).
- Returns:
- params
Parameter names mapped to their values.
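The double-underscore prefix for nested parameters can be illustrated with a toy class. `ToyAlgorithm` is entirely hypothetical and only mimics the naming convention described for deep=True; it is not how mobgap or its base classes are implemented.

```python
class ToyAlgorithm:
    """Minimal stand-in for an algorithm object that stores its parameters."""

    def __init__(self, **params):
        self._params = params

    def get_params(self, deep: bool = True) -> dict:
        params = dict(self._params)
        if deep:
            for name, value in self._params.items():
                # Nested algorithm objects expose their own parameters,
                # prefixed with "<name>__" (two underscores)
                if hasattr(value, "get_params"):
                    for sub_name, sub_value in value.get_params(deep=True).items():
                        params[f"{name}__{sub_name}"] = sub_value
        return params


inner = ToyAlgorithm(threshold=3)
outer = ToyAlgorithm(inner=inner, window=5)

deep_params = outer.get_params(deep=True)
shallow_params = outer.get_params(deep=False)
```

Here `deep_params` contains the key "inner__threshold" in addition to "inner" and "window", while `shallow_params` contains only the two top-level entries.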