.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/aggregation/_01_mobilised_aggregator.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_aggregation__01_mobilised_aggregator.py:

.. _mobilised_aggregator_example:

Mobilised Aggregator
====================

This example shows how to use the :class:`.MobilisedAggregator` class to aggregate digital mobility outcomes (DMOs)
over multiple walking bouts.

.. GENERATED FROM PYTHON SOURCE LINES 12-19

Loading some example data
-------------------------

.. note::
    This data is randomly generated and not physiologically meaningful.
    However, it has the same structure as any other typical input dataset for the :class:`.MobilisedAggregator`.

The input data for the aggregator is a :class:`pandas.DataFrame` with one row per walking bout.
The columns contain the DMO parameters estimated for each walking bout, such as duration and stride length.

.. GENERATED FROM PYTHON SOURCE LINES 19-32

.. code-block:: default


    import pandas as pd

    from mobgap import PACKAGE_ROOT
    from mobgap.aggregation import MobilisedAggregator

    # Location of the example input shipped with the repository.
    DATA_PATH = PACKAGE_ROOT.parent / "example_data/original_results/mobilised_aggregator"

    # One row per walking bout; the grouping columns and the walking bout id form the index.
    data = pd.read_csv(DATA_PATH / "aggregation_test_input.csv", index_col=0).set_index(
        ["visit_type", "participant_id", "measurement_date", "wb_id"]
    )

    data.head()


.. raw:: html

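    <table border="1" class="dataframe">
      <thead>
        <tr>
          <th></th><th></th><th></th><th></th>
          <th>duration_s</th><th>n_steps</th><th>cadence_spm</th><th>walking_speed_mps</th><th>stride_length_m</th><th>stride_duration_s</th><th>n_turns</th>
        </tr>
        <tr>
          <th>visit_type</th><th>participant_id</th><th>measurement_date</th><th>wb_id</th>
          <th colspan="7"></th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th rowspan="5">T1</th><th rowspan="5">12345</th><th rowspan="5">2023-01-01</th><th>0</th>
          <td>4.74702</td><td>8</td><td>99.82188</td><td>1.16079</td><td>2.51885</td><td>1.58675</td><td>0</td>
        </tr>
        <tr>
          <th>1</th>
          <td>5.13150</td><td>7</td><td>101.16429</td><td>2.57881</td><td>1.57243</td><td>1.46537</td><td>0</td>
        </tr>
        <tr>
          <th>2</th>
          <td>8.52727</td><td>12</td><td>86.53527</td><td>1.60044</td><td>1.66305</td><td>2.56092</td><td>2</td>
        </tr>
        <tr>
          <th>3</th>
          <td>16.24554</td><td>27</td><td>91.49977</td><td>0.95558</td><td>0.88961</td><td>3.14549</td><td>1</td>
        </tr>
        <tr>
          <th>4</th>
          <td>6.09907</td><td>8</td><td>93.69895</td><td>2.33230</td><td>1.95969</td><td>2.35295</td><td>0</td>
        </tr>
      </tbody>
    </table>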
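
To experiment with this layout without the bundled CSV file, a frame with the same index levels and DMO columns can be built by hand.
The following is a minimal sketch, assuming only the structure shown above; the name ``toy_data`` and all values in it are made up for illustration and are not part of the example dataset.

.. code-block:: python

    import pandas as pd

    # Hypothetical values that only mirror the layout of the example input.
    toy_data = pd.DataFrame(
        {
            "visit_type": ["T1", "T1", "T1"],
            "participant_id": [12345, 12345, 12345],
            "measurement_date": ["2023-01-01", "2023-01-01", "2023-01-01"],
            "wb_id": [0, 1, 2],
            "duration_s": [4.7, 5.1, 8.5],
            "n_steps": [8, 7, 12],
            "cadence_spm": [99.8, 101.2, 86.5],
            "walking_speed_mps": [1.16, 2.58, 1.60],
            "stride_length_m": [2.52, 1.57, 1.66],
            "stride_duration_s": [1.59, 1.47, 2.56],
            "n_turns": [0, 0, 2],
        }
    # The grouping columns plus the walking bout id become the index, as in the example.
    ).set_index(["visit_type", "participant_id", "measurement_date", "wb_id"])

Each row still describes one walking bout, and everything that is not a DMO lives in the index.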

.. GENERATED FROM PYTHON SOURCE LINES 33-45

Furthermore, the aggregator allows you to provide a data mask, which is a boolean :class:`pandas.DataFrame` with the
same dimensions as the input data.
The data mask indicates which DMOs of the input data should be used for the aggregation (marked as True) and which
should be ignored (marked as False).
For this example, we create this mask by applying the "standard" thresholds from Mobilise-D to the data.
To learn more about this, see the :ref:`threshold_check example `.

.. note::
    Calling ``apply_thresholds`` once on the whole dataset only works here because all the example data comes from the
    same participant.
    As some thresholds are cohort- or height-specific, you would otherwise have to apply the thresholds to each
    participant's data separately.

.. GENERATED FROM PYTHON SOURCE LINES 45-51

.. code-block:: default

    from mobgap.aggregation import apply_thresholds, get_mobilised_dmo_thresholds

    thresholds = get_mobilised_dmo_thresholds()

    # Note: The height is "artificially" set to 1.75m, as the example data does not contain this information.
    data_mask = apply_thresholds(data, thresholds, cohort="HA", height_m=1.75, measurement_condition="free_living")


.. GENERATED FROM PYTHON SOURCE LINES 52-58

Performing the aggregation
--------------------------
The :class:`.MobilisedAggregator` is now used to aggregate the input data over several walking bouts, e.g., over all
walking bouts from one participant, or over all walking bouts per participant and day, week, or other criteria.
The data is grouped using additional columns in the input data, which are not used for the aggregation itself.
In this example, the data is grouped by visit type (``visit_type``), participant (``participant_id``), and day
(``measurement_date``).

.. GENERATED FROM PYTHON SOURCE LINES 58-61

.. code-block:: default

    agg = MobilisedAggregator(groupby=("visit_type", "participant_id", "measurement_date"))
    agg.aggregate(data, wb_dmos_mask=data_mask)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    /home/docs/checkouts/readthedocs.org/user_builds/mobgap/checkouts/v0.3.0/mobgap/aggregation/_mobilised_aggregator.py:279: FutureWarning: Downcasting object dtype arrays on .fillna, .ffill, .bfill is deprecated and will change in a future version. Call result.infer_objects(copy=False) instead. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
      wb_dmos_mask.fillna(True)

    MobilisedAggregator(groupby=('visit_type', 'participant_id', 'measurement_date'),
                        unique_wb_id_column='wb_id')


.. GENERATED FROM PYTHON SOURCE LINES 62-64

The resulting :class:`pandas.DataFrame` of aggregated data contains one row per group.
In this case, there is only one participant and day, so the resulting dataframe contains only one row.

.. GENERATED FROM PYTHON SOURCE LINES 64-66

.. code-block:: default

    agg.aggregated_data_


.. raw:: html

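    <table border="1" class="dataframe">
      <thead>
        <tr>
          <th></th><th></th><th></th>
          <th>wb_all_sum</th><th>walkdur_all_sum</th><th>steps_all_sum</th><th>turns_all_sum</th>
          <th>wbdur_all_avg</th><th>wbdur_all_max</th><th>wbdur_all_var</th>
          <th>cadence_all_avg</th><th>strdur_all_avg</th><th>cadence_all_var</th><th>strdur_all_var</th>
          <th>wb_1030_sum</th><th>ws_1030_avg</th><th>strlen_1030_avg</th>
          <th>wb_10_sum</th><th>ws_10_max</th>
          <th>wb_30_sum</th><th>ws_30_avg</th><th>strlen_30_avg</th><th>cadence_30_avg</th><th>strdur_30_avg</th>
          <th>ws_30_max</th><th>cadence_30_max</th><th>ws_30_var</th><th>strlen_30_var</th>
          <th>wb_60_sum</th>
        </tr>
        <tr>
          <th>visit_type</th><th>participant_id</th><th>measurement_date</th>
          <th colspan="26"></th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>T1</th><th>12345</th><th>2023-01-01</th>
          <td>2378</td><td>10.534</td><td>59320</td><td>3012</td>
          <td>8.859</td><td>26.927</td><td>2.275</td>
          <td>94.673</td><td>2.213</td><td>0.127</td><td>0.261</td>
          <td>844</td><td>1.497</td><td>186.458</td>
          <td>1029</td><td>2.096</td>
          <td>185</td><td>1.619</td><td>197.523</td><td>102.807</td><td>2.101</td>
          <td>2.128</td><td>115.175</td><td>0.241</td><td>0.252</td>
          <td>62</td>
        </tr>
      </tbody>
    </table>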
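
The effect of a boolean mask on a grouped aggregation can be illustrated with plain pandas.
The following is a minimal sketch on made-up values and is not the internal implementation of :class:`.MobilisedAggregator`; the names ``toy``, ``toy_mask``, and ``per_day`` are hypothetical, and a simple mean stands in for the aggregator's metrics.

.. code-block:: python

    import pandas as pd

    # Three hypothetical walking bouts from one participant on two days.
    index = pd.MultiIndex.from_tuples(
        [("P1", "2023-01-01", 0), ("P1", "2023-01-01", 1), ("P1", "2023-01-02", 0)],
        names=["participant_id", "measurement_date", "wb_id"],
    )
    toy = pd.DataFrame(
        {"cadence_spm": [100.0, 250.0, 90.0], "stride_length_m": [1.5, 1.4, 1.6]},
        index=index,
    )
    # Same shape as the data: False marks a value that should be ignored.
    toy_mask = pd.DataFrame(
        {"cadence_spm": [True, False, True], "stride_length_m": [True, True, True]},
        index=index,
    )

    # Values flagged False become NaN, which the grouped mean skips.
    masked = toy.where(toy_mask)
    per_day = masked.groupby(level=["participant_id", "measurement_date"]).mean()

Only the flagged cadence value is excluded here; the stride length of the same walking bout still contributes to its group, because the mask works per value and not per row.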

.. GENERATED FROM PYTHON SOURCE LINES 67-80

Comparison with R aggregation script
------------------------------------
The outputs of this aggregation algorithm are analogous to the outputs of the original Mobilise-D R-Script, using the
same duration filters and aggregation metrics.
However, there can be small differences in the second or third decimal place of the results.
This is due to different outputs of the quantile function in Python and R.

Furthermore, the parameter "strlen_30_var" is converted to cm for consistency, while it is in m in the original
R-Script.

Grouping the data by participant and day reproduces the Daily Aggregations of the original R-Script.
To get the Weekly Aggregations, the Daily results are averaged over all recording days per participant and rounded
depending on the aggregation metric.
In this example, the results are identical to the Daily Aggregations, as the data covers only one day.

.. GENERATED FROM PYTHON SOURCE LINES 80-86

.. code-block:: default

    # Average the daily results over all recording days of each participant.
    weekly_agg = agg.aggregated_data_.groupby("participant_id").mean(numeric_only=True).reset_index()

    # Count-like metrics are rounded to whole numbers, all remaining columns to three decimals.
    round_to_int = ["steps_all_sum", "turns_all_sum", "wb_all_sum", "wb_10_sum", "wb_30_sum", "wb_60_sum"]
    round_to_three_decimals = weekly_agg.columns[~weekly_agg.columns.isin(round_to_int)]
    weekly_agg[round_to_int] = weekly_agg[round_to_int].round()
    weekly_agg[round_to_three_decimals] = weekly_agg[round_to_three_decimals].round(3)
    weekly_agg


.. raw:: html

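    <table border="1" class="dataframe">
      <thead>
        <tr>
          <th></th>
          <th>participant_id</th>
          <th>wb_all_sum</th><th>walkdur_all_sum</th><th>steps_all_sum</th><th>turns_all_sum</th>
          <th>wbdur_all_avg</th><th>wbdur_all_max</th><th>wbdur_all_var</th>
          <th>cadence_all_avg</th><th>strdur_all_avg</th><th>cadence_all_var</th><th>strdur_all_var</th>
          <th>wb_1030_sum</th><th>ws_1030_avg</th><th>strlen_1030_avg</th>
          <th>wb_10_sum</th><th>ws_10_max</th>
          <th>wb_30_sum</th><th>ws_30_avg</th><th>strlen_30_avg</th><th>cadence_30_avg</th><th>strdur_30_avg</th>
          <th>ws_30_max</th><th>cadence_30_max</th><th>ws_30_var</th><th>strlen_30_var</th>
          <th>wb_60_sum</th>
        </tr>
      </thead>
      <tbody>
        <tr>
          <th>0</th>
          <td>12345</td>
          <td>2378.0</td><td>10.534</td><td>59320.0</td><td>3012.0</td>
          <td>8.859</td><td>26.927</td><td>2.275</td>
          <td>94.673</td><td>2.213</td><td>0.127</td><td>0.261</td>
          <td>844.0</td><td>1.497</td><td>186.458</td>
          <td>1029.0</td><td>2.096</td>
          <td>185.0</td><td>1.619</td><td>197.523</td><td>102.807</td><td>2.101</td>
          <td>2.128</td><td>115.175</td><td>0.241</td><td>0.252</td>
          <td>62.0</td>
        </tr>
      </tbody>
    </table>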
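
Because the example data covers a single day, the averaging step has no visible effect on the values.
The following minimal sketch applies the same average-then-round pattern to three days of made-up daily results; the frame ``daily`` and its values are hypothetical, and only two of the metric columns are included to keep it short.

.. code-block:: python

    import pandas as pd

    # Hypothetical daily aggregates for one participant over three recording days.
    daily = pd.DataFrame(
        {
            "participant_id": [12345, 12345, 12345],
            "measurement_date": ["2023-01-01", "2023-01-02", "2023-01-03"],
            "wb_all_sum": [10, 11, 13],
            "ws_30_avg": [1.2345, 1.1111, 1.3000],
        }
    ).set_index(["participant_id", "measurement_date"])

    # Average over all recording days per participant.
    weekly = daily.groupby("participant_id").mean(numeric_only=True).reset_index()

    # Count-like metrics are rounded to whole numbers, the others to three decimals.
    weekly["wb_all_sum"] = weekly["wb_all_sum"].round()
    weekly["ws_30_avg"] = weekly["ws_30_avg"].round(3)

With more than one day per participant, the weekly count is no longer a whole number before rounding, which is why the count-like columns are rounded separately from the rest.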
