Preconfigured Mobilised Pipelines#
As part of the Mobilise-D project, two separate pipelines have been developed. Which one applies depends on the patient characteristics.
The first pipeline MobilisedPipelineHealthy (P1 in [1]) is designed for people that likely
still have a somewhat normal gait pattern.
In Mobilise-D, this pipeline is used for healthy controls and patients with “COPD” and “CHF”.
The second pipeline MobilisedPipelineImpaired (P2 in [1]) is designed for patients with
likely significantly impaired gait patterns.
In Mobilise-D, this pipeline is used for patients with “PD”, “PFF” and “MS”.
In this example we will show how to use these preconfigured pipelines. If you want to understand the details of the pipelines, please refer to the step-by-step example.
Data#
For this example, we will use the provided example data. It contains data from Lab tests from MS patients and healthy controls.
from mobgap.data import LabExampleDataset
data_ha = LabExampleDataset().get_subset(cohort="HA")
data_ha
data_ms = LabExampleDataset().get_subset(cohort="MS")
data_ms
Mobilised Pipeline Healthy#
from mobgap.pipeline import MobilisedPipelineHealthy, MobilisedPipelineUniversal
pipeline_ha = MobilisedPipelineHealthy()
We just apply the pipeline to the first long test in the data.
long_test_ha = data_ha.get_subset(test="Test11")[0]
pipeline_ha = pipeline_ha.safe_run(long_test_ha)
Now we can access the results. Note that the pipelines provide a large number of results, and not all of them are relevant for every use case. We only show the main outputs here:
The main output is the set of aggregated parameters. By default, this is a single row per recording. It describes the overall statistics and aggregated parameters over all WBs.
One level below, we have the WB parameters. This is a DataFrame with the parameters per WB.
Even more granular are the stride level parameters. They contain only the strides that are also part of a valid WB.
For other results, see the documentation of the pipeline class itself.
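To make the three levels of granularity concrete, here is a minimal sketch on toy data using plain pandas. It is not how mobgap computes its results. The column names `n_strides`, `avg_stride_duration_s` and `n_wbs` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Toy stride-level table: each row is one stride that belongs to a valid WB.
per_stride = pd.DataFrame(
    {
        "wb_id": [0, 0, 0, 1, 1],
        "stride_duration_s": [1.0, 1.1, 0.9, 1.2, 1.4],
    }
)

# One level up: one row per WB, summarising the strides of that WB.
per_wb = per_stride.groupby("wb_id").agg(
    n_strides=("stride_duration_s", "size"),
    avg_stride_duration_s=("stride_duration_s", "mean"),
)

# Top level: a single row per recording, summarising all WBs.
aggregated = pd.DataFrame(
    {
        "n_wbs": [len(per_wb)],
        "avg_stride_duration_s": [per_wb["avg_stride_duration_s"].mean()],
    }
)

print(per_wb)
print(aggregated)
```

In the real pipelines, the corresponding tables are available as result attributes on the pipeline object after `safe_run`, for example `per_wb_parameters_` and `aggregated_parameters_`.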
Mobilised Pipeline Impaired#
from mobgap.pipeline import MobilisedPipelineImpaired
pipeline_ms = MobilisedPipelineImpaired()
We just apply the pipeline to the first long test in the data.
long_test_ms = data_ms.get_subset(test="Test11")[0]
pipeline_ms = pipeline_ms.safe_run(long_test_ms)
Like before, we can access the results.
That’s it. As you can see, it is simple to run the preconfigured pipelines on your data, provided the data is structured as a valid gait dataset.
However, when running a larger study, you might want to process all data at once. Then it becomes inconvenient to manually run the pipeline for each recording separately.
Luckily, it is relatively easy to implement a loop that runs the pipeline for each recording.
We can even use MobilisedPipelineUniversal to automatically process all MS participants
with the impaired pipeline and all HA participants with the healthy pipeline.
meta_pipeline = MobilisedPipelineUniversal(
    pipelines=[
        ("healthy", MobilisedPipelineHealthy()),
        ("impaired", MobilisedPipelineImpaired()),
    ]
)
The meta-pipeline uses the recommended_cohorts parameter of the respective pipeline to determine which pipeline to
use.
MobilisedPipelineHealthy().recommended_cohorts
('HA', 'COPD', 'CHF')
MobilisedPipelineImpaired().recommended_cohorts
('PD', 'MS', 'PFF')
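The selection logic can be pictured as a lookup from cohort to pipeline. The following is a minimal sketch of that idea in plain Python, assuming the first matching pipeline wins. It is not mobgap's actual implementation, and `ToyPipeline` and `select_pipeline` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPipeline:
    """Hypothetical stand-in for a pipeline with a ``recommended_cohorts`` attribute."""

    name: str
    recommended_cohorts: tuple


def select_pipeline(pipelines, cohort):
    """Return the first (name, pipeline) pair whose recommended cohorts include ``cohort``."""
    for name, pipe in pipelines:
        if cohort in pipe.recommended_cohorts:
            return name, pipe
    raise ValueError(f"No pipeline recommends cohort {cohort!r}.")


# Same structure as the ``pipelines`` argument above: a list of (name, pipeline) tuples.
pipelines = [
    ("healthy", ToyPipeline("healthy", ("HA", "COPD", "CHF"))),
    ("impaired", ToyPipeline("impaired", ("PD", "MS", "PFF"))),
]

selected_for_ms, _ = select_pipeline(pipelines, "MS")
selected_for_ha, _ = select_pipeline(pipelines, "HA")
print(selected_for_ms, selected_for_ha)
```

A recording from a cohort that no pipeline recommends raises an error in this sketch. How the real meta-pipeline reacts in that case is not covered here.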
So we can simply loop over all the data and run the meta-pipeline. We add a little bit of logic to deal with trials for which we might not detect a valid WB. Then we aggregate the results.
For the aggregated parameters, we reindex the result with the labels of all trials, so that we get rows of NaNs for the trials that did not have
any valid WBs.
import pandas as pd
from tqdm.auto import tqdm
per_wb_paras = {}
aggregated_paras = {}
for trial in tqdm(LabExampleDataset()):
    pipe = meta_pipeline.clone().safe_run(trial)
    if not (per_wb := pipe.per_wb_parameters_).empty:
        per_wb_paras[trial.group_label] = per_wb
    if not (agg := pipe.aggregated_parameters_).empty:
        aggregated_paras[trial.group_label] = agg
per_wb_paras = pd.concat(per_wb_paras)
aggregated_paras = (
    pd.concat(aggregated_paras)
    .reset_index(-1, drop=True)
    .rename_axis(LabExampleDataset().index.columns)
    .reindex(pd.MultiIndex.from_tuples(LabExampleDataset().group_labels))
)
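The collect-then-reindex pattern used in the loop can be tried in isolation. The sketch below uses toy data with made-up participant and test labels. It only illustrates how `pd.concat` on a dictionary followed by `reindex` yields NaN rows for trials that produced no result.

```python
import pandas as pd

# Toy per-trial results. The trial ("003", "Test5") produced no valid WB,
# so it was never added to the dictionary.
results = {
    ("001", "Test5"): pd.DataFrame({"walking_speed_mps": [1.1]}),
    ("002", "Test5"): pd.DataFrame({"walking_speed_mps": [0.9]}),
}
all_trials = [("001", "Test5"), ("002", "Test5"), ("003", "Test5")]

combined = (
    # The tuple keys of the dictionary become the outer index levels.
    pd.concat(results)
    # Drop the innermost level, which is just the row counter of each frame.
    .reset_index(-1, drop=True)
    # Reindex with the labels of *all* trials; missing trials become NaN rows.
    .reindex(pd.MultiIndex.from_tuples(all_trials, names=["participant", "test"]))
)
print(combined)
```

Keeping the empty trials as NaN rows makes it explicit which recordings were processed but yielded no valid WB, instead of silently dropping them.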
And now we can simply access the results.
First the per WB parameters. Each row represents a WB and the multi-index tells us from which participant and which test it is.
And the aggregated parameters.
Note that many values are NaN, because only a single WB was detected per trial.
With a single WB, we cannot calculate the standard deviation or other measures of spread.
To learn more about what the different aggregated values mean, check MobilisedAggregator.
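Why a single WB leads to NaN can be seen with plain pandas. This sketch assumes a sample standard deviation (pandas' default, `ddof=1`), which is undefined for a single observation. Whether MobilisedAggregator computes its statistics in exactly this way is an assumption here.

```python
import pandas as pd

# One WB in a trial versus two WBs in a trial (toy walking speed values).
single_wb = pd.Series([1.12], name="walking_speed_mps")
two_wbs = pd.Series([1.12, 0.98], name="walking_speed_mps")

# The mean is defined for a single value, the sample standard deviation is not.
single_mean = single_wb.mean()
single_std = single_wb.std()
two_std = two_wbs.std()

print(single_mean, single_std, two_std)
```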
aggregated_paras
Modifying Parameters#
Both pipelines are basically the same, but the algorithms used for certain steps are different.
Both just reimplement BaseMobilisedPipeline with the respective algorithms as default parameters.
So we can easily modify the parameters of the pipeline either using the set_params method or by passing different parameters/algorithms to the constructor.
Warning
As part of Mobilise-D, we only validated the pipelines with their default values in exactly the cohorts we recommend them for.
If you change the parameters, or use them in a different cohort, we ask you to not call this approach "the Mobilised Pipeline" anymore when communicating your results.
Starting simple, let's say we simply don't want to filter and aggregate the final DMOs.
We just set the respective parameters to None.
from mobgap.pipeline import MobilisedPipelineHealthy
pipe_no_agg = MobilisedPipelineHealthy(
    dmo_thresholds=None, dmo_aggregation=None
)
pipe_no_agg.safe_run(long_test_ha)
MobilisedPipelineHealthy(cadence_calculation=CadFromIcDetector(ic_detector=IcdShinImproved(axis='norm'), max_interpolation_gap_s=3, silence_ic_warning=True, step_time_smoothing=HampelFilter(half_window_size=2, n_sigmas=3.0)), dmo_aggregation=None, dmo_thresholds=None, gait_sequence_detection=GsdIluz(acc_v_standing_threshold=4.903325, allowed_acc_v_change_per_window=0.15, allowed_steps_per_s=(0.5, 3), mean_activity_threshold=-0.980665, min_gsd_duration_s=5, pre_filter=FirFilter(cutoff_freq_hz=(0.5, 3), filter_type='bandpass', order=200, window='hamming', zero_phase=True), sin_template_freq_hz=2, std_activity_threshold=0.0980665, step_detection_thresholds=(3.92266, 14.709975), window_length_s=3, window_overlap=0.5), initial_contact_detection=IcdIonescu(cwt_width=9.0, pre_filter=EpflDedriftedGaitFilter(zero_phase=True)), laterality_classification=LrcUllrich(clf_pipe=Pipeline(steps=[('scaler_old', MinMaxScaler()),
('clf_old', SVC(C=0.1, kernel='linear'))]), smoothing_filter=ButterworthFilter(cutoff_freq_hz=(0.5, 2), filter_type='bandpass', order=4, zero_phase=True)), recommended_cohorts=('HA', 'COPD', 'CHF'), stride_length_calculation=SlZijlstra(acc_smoothing=ButterworthFilter(cutoff_freq_hz=0.1, filter_type='highpass', order=4, zero_phase=True), max_interpolation_gap_s=3, orientation_method=None, speed_smoothing=ButterworthFilter(cutoff_freq_hz=1, filter_type='highpass', order=4, zero_phase=True), step_length_scaling_factor=1.14675, step_length_smoothing=HampelFilter(half_window_size=2, n_sigmas=3.0)), stride_selection=StrideSelection(incompatible_rules='warn', rules=[('stride_duration_thres', IntervalDurationCriteria(inclusive=(False, True), max_duration_s=3.0, min_duration_s=0.2)), ('stride_length_thres', IntervalParameterCriteria(inclusive=(False, True), lower_threshold=0.15, parameter='stride_length_m', upper_threshold=None))]), turn_detection=TdElGohary(allowed_turn_angle_deg=(45, inf), allowed_turn_duration_s=(0.5, 10), lower_threshold_velocity_dps=5, min_gap_between_turns_s=0.05, min_peak_angle_velocity_dps=15, orientation_estimation=None, smoothing_filter=ButterworthFilter(cutoff_freq_hz=0.5, filter_type='lowpass', order=4, zero_phase=True)), walking_speed_calculation=WsNaive(), wba=WbAssembly(rules=[('min_strides', NStridesCriteria(min_strides=4, min_strides_left=3, min_strides_right=3)), ('max_break', MaxBreakCriteria(consider_end_as_break=True, max_break_s=3, remove_last_ic=False))]))
Now, the aggregated parameters are empty.
pipe_no_agg.aggregated_parameters_
And the per WB parameters are still there.
If you want to change the algorithm used for a certain step, you can simply pass a different algorithm to the constructor. For example, let’s say you want to use the Adaptive Ionescu GSD algorithm (GsdAdaptiveIonescu) instead of GsdIluz, which is the default for the healthy pipeline.
For the sake of this example, we will also modify the default parameters of the algorithm.
from mobgap.gait_sequences import GsdAdaptiveIonescu
pipe_adaptive_gsd = MobilisedPipelineHealthy(
    gait_sequence_detection=GsdAdaptiveIonescu(min_n_steps=3)
)
pipe_adaptive_gsd.safe_run(long_test_ha)
MobilisedPipelineHealthy(cadence_calculation=CadFromIcDetector(ic_detector=IcdShinImproved(axis='norm'), max_interpolation_gap_s=3, silence_ic_warning=True, step_time_smoothing=HampelFilter(half_window_size=2, n_sigmas=3.0)), dmo_aggregation=MobilisedAggregator(groupby=None, unique_wb_id_column='wb_id', use_original_names=False), dmo_thresholds=condition free_living ... global
threshold_type min max ... min max
dmo cohort ...
cadence_spm CHF 40.594770 167.942930 ... 40.000000 172.900
COPD 38.880484 151.558718 ... 40.000000 172.900
HA 35.751898 156.946955 ... 40.000000 172.900
MS 38.443211 155.394496 ... 40.000000 172.900
PD 40.957969 142.121947 ... 40.000000 172.900
PFF 42.191719 157.613665 ... 40.000000 172.900
walking_speed_mps CHF 0.112478 1.757103 ... 0.081515 2.220
COPD 0.090731 1.641773 ... 0.081515 2.220
HA 0.097413 1.965728 ... 0.081515 2.220
MS 0.085918 1.920262 ... 0.081515 2.220
PD 0.081515 1.735348 ... 0.081515 2.220
PFF 0.104547 1.553186 ... 0.081515 2.220
stride_length_m CHF 0.185308 2.166646 ... 0.150523 2.190
COPD 0.176403 1.711645 ... 0.150523 2.190
HA 0.155126 2.024694 ... 0.150523 2.190
MS 0.191099 1.940537 ... 0.150523 2.190
PD 0.150523 1.982926 ... 0.150523 2.190
PFF 0.206787 1.697251 ... 0.150523 2.190
stride_duration_s CHF 0.702000 3.030857 ... 0.460000 3.000
COPD 0.770000 3.000000 ... 0.460000 3.000
HA 0.735435 3.254400 ... 0.460000 3.000
MS 0.775597 3.057750 ... 0.460000 3.000
PD 0.836253 2.928000 ... 0.460000 3.000
PFF 0.744000 2.769000 ... 0.460000 3.000
step_duration_s CHF 0.376000 1.504500 ... 0.140000 2.124
COPD 0.390400 1.788000 ... 0.140000 2.124
HA 0.367972 1.730400 ... 0.140000 2.124
MS 0.388084 1.905000 ... 0.140000 2.124
PD 0.417216 1.605600 ... 0.140000 2.124
PFF 0.371429 2.124000 ... 0.140000 2.124
[30 rows x 8 columns], gait_sequence_detection=GsdAdaptiveIonescu(active_signal_fallback_threshold=1.4709975, max_gap_s=3.5, min_n_steps=3, min_step_margin_s=1.5, padding=0.75), initial_contact_detection=IcdIonescu(cwt_width=9.0, pre_filter=EpflDedriftedGaitFilter(zero_phase=True)), laterality_classification=LrcUllrich(clf_pipe=Pipeline(steps=[('scaler_old', MinMaxScaler()),
('clf_old', SVC(C=0.1, kernel='linear'))]), smoothing_filter=ButterworthFilter(cutoff_freq_hz=(0.5, 2), filter_type='bandpass', order=4, zero_phase=True)), recommended_cohorts=('HA', 'COPD', 'CHF'), stride_length_calculation=SlZijlstra(acc_smoothing=ButterworthFilter(cutoff_freq_hz=0.1, filter_type='highpass', order=4, zero_phase=True), max_interpolation_gap_s=3, orientation_method=None, speed_smoothing=ButterworthFilter(cutoff_freq_hz=1, filter_type='highpass', order=4, zero_phase=True), step_length_scaling_factor=1.14675, step_length_smoothing=HampelFilter(half_window_size=2, n_sigmas=3.0)), stride_selection=StrideSelection(incompatible_rules='warn', rules=[('stride_duration_thres', IntervalDurationCriteria(inclusive=(False, True), max_duration_s=3.0, min_duration_s=0.2)), ('stride_length_thres', IntervalParameterCriteria(inclusive=(False, True), lower_threshold=0.15, parameter='stride_length_m', upper_threshold=None))]), turn_detection=TdElGohary(allowed_turn_angle_deg=(45, inf), allowed_turn_duration_s=(0.5, 10), lower_threshold_velocity_dps=5, min_gap_between_turns_s=0.05, min_peak_angle_velocity_dps=15, orientation_estimation=None, smoothing_filter=ButterworthFilter(cutoff_freq_hz=0.5, filter_type='lowpass', order=4, zero_phase=True)), walking_speed_calculation=WsNaive(), wba=WbAssembly(rules=[('min_strides', NStridesCriteria(min_strides=4, min_strides_left=3, min_strides_right=3)), ('max_break', MaxBreakCriteria(consider_end_as_break=True, max_break_s=3, remove_last_ic=False))]))
This works as before and all parameters of the pipeline are still available.
If you are planning to modify many algorithms, we recommend not using the specific pipeline classes anymore,
to avoid the association (is it really still the Healthy pipeline if you change all algorithms?).
In this case, we recommend the unconfigured GenericMobilisedPipeline.
This class is also used as the base class for the preconfigured pipelines.
It has no algorithms set by default, so you have to set all algorithms yourself.
See the end of the step-by-step example for a demonstration.
If you want to reuse some of the defaults of the preconfigured pipelines, you can still use the
PredefinedParameters.
For example, we could get the same pipeline as before like this:
from mobgap.pipeline import GenericMobilisedPipeline
pipe_custom = GenericMobilisedPipeline(
    **dict(
        GenericMobilisedPipeline.PredefinedParameters.regular_walking,
        gait_sequence_detection=GsdAdaptiveIonescu(min_n_steps=3),
    )
)
pipe_custom.safe_run(long_test_ha)
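The `**dict(defaults, key=override)` idiom used above is plain Python and can be tried on its own. In the sketch below, the default values are placeholder strings rather than real algorithm objects, and `build_pipeline` is a hypothetical function standing in for a pipeline constructor.

```python
# Placeholder defaults; in the real case this would be a mapping of parameter
# names to algorithm instances.
defaults = {
    "gait_sequence_detection": "GsdIluz",
    "turn_detection": "TdElGohary",
}


def build_pipeline(**params):
    """Hypothetical constructor that simply returns the parameters it received."""
    return params


# dict(mapping, **overrides) creates a NEW dict: the defaults are copied first,
# then the keyword arguments replace entries with the same key.
custom = build_pipeline(
    **dict(defaults, gait_sequence_detection="GsdAdaptiveIonescu")
)

print(custom)
print(defaults)
```

Because `dict(...)` copies the mapping, the shared defaults are left untouched and can be reused for further variations.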
On the other hand, if you are only planning to change a single sub-parameter of a pipeline, it might be easier to use
the set_params method instead of passing all parameters to the constructor.
We show an extreme example of this here: starting from MobilisedPipelineUniversal, we change the filter order of the pre-processing filter of the GSD algorithm of the healthy pipeline that the meta-pipeline uses internally.
Note
The MobilisedPipelineUniversal is a special case, as it makes use of a tpcp feature called
composite_params.
This allows us to target the pipelines__healthy parameters, even though pipelines is not an object,
but a list of tuples.
Learn more about this feature in the tpcp documentation.
from mobgap.pipeline import MobilisedPipelineUniversal
meta_pipeline_modified = MobilisedPipelineUniversal().set_params(
    pipelines__healthy__gait_sequence_detection__pre_filter__order=50
)
This parameter name is a bit long, but it demonstrates that it is possible to change even deeply nested parameters. This is particularly useful when you want to run approaches like GridSearch.
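To illustrate how such a double-underscore path can be resolved, here is a minimal sketch in plain Python. It is not tpcp's implementation. `Node` and `set_nested` are hypothetical helpers, and the sketch mutates the objects in place for simplicity.

```python
class Node:
    """Hypothetical parameter container; keyword arguments become attributes."""

    def __init__(self, **params):
        self.__dict__.update(params)


def set_nested(obj, path, value):
    """Set a nested parameter addressed by a double-underscore separated path."""
    *parents, leaf = path.split("__")
    for key in parents:
        if isinstance(obj, list):
            # A list of (name, object) tuples: look the child up by its name.
            obj = dict(obj)[key]
        else:
            obj = getattr(obj, key)
    setattr(obj, leaf, value)


meta = Node(
    pipelines=[
        ("healthy", Node(gait_sequence_detection=Node(pre_filter=Node(order=200)))),
        ("impaired", Node(gait_sequence_detection=Node(pre_filter=Node(order=200)))),
    ]
)

set_nested(meta, "pipelines__healthy__gait_sequence_detection__pre_filter__order", 50)

healthy = dict(meta.pipelines)["healthy"]
impaired = dict(meta.pipelines)["impaired"]
print(healthy.gait_sequence_detection.pre_filter.order)
print(impaired.gait_sequence_detection.pre_filter.order)
```

Only the pipeline named in the path is changed; the other entry of the list keeps its original value.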
The algorithm works as before (note we don’t expect any change in output for this parameter change).
meta_pipeline_modified.safe_run(long_test_ha)
meta_pipeline_modified.aggregated_parameters_