LrcUllrich
- class mobgap.laterality.LrcUllrich(
- smoothing_filter: BaseFilter = cf(ButterworthFilter(cutoff_freq_hz=(0.5, 2), filter_type='bandpass', order=4, zero_phase=True)),
- clf_pipe: Pipeline = cf(Pipeline(steps=[('scaler_old', MinMaxScaler()), ('clf_old', SVC(C=0.1, kernel='linear'))])),
Machine-Learning based algorithm for laterality classification of initial contacts.
This algorithm uses the band-pass filtered vertical (“gyr_is”) and anterior-posterior (“gyr_pa”) angular velocity. For both axes, a set of features consisting of the value and its first and second derivatives is extracted at the time points of the ICs ([1]). This results in a 6-dimensional feature vector for each IC. This feature set is normalized using the provided scaler and then classified using the provided model.
We provide a set of pre-trained models that are based on the MS-Project ([2]) dataset. They all use a Min-Max Scaler in combination with a linear SVC classifier. The parameters of the SVC depend on the cohort and were tuned as explained in the paper ([1]).
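The scaler-plus-classifier layout described above can be sketched with scikit-learn alone. This is a minimal illustration, not one of the pre-trained models: the feature values and the "left"/"right" labels are synthetic assumptions, and only the pipeline structure (a MinMaxScaler followed by a linear SVC with C=0.1) mirrors the default shown in the signature.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 6-dimensional IC features (assumption: real
# features come from the filtered gyroscope signals, not random numbers).
rng = np.random.default_rng(0)
n_per_class = 50
left = rng.normal(loc=-1.0, scale=0.3, size=(n_per_class, 6))
right = rng.normal(loc=1.0, scale=0.3, size=(n_per_class, 6))
features = np.vstack([left, right])
labels = np.array(["left"] * n_per_class + ["right"] * n_per_class)

# Same layout as the default clf_pipe: scale each feature to [0, 1],
# then classify with a linear support vector classifier.
clf_pipe = Pipeline(
    steps=[("scaler_old", MinMaxScaler()), ("clf_old", SVC(C=0.1, kernel="linear"))]
)
clf_pipe.fit(features, labels)

predicted = clf_pipe.predict(features)
accuracy = float((predicted == labels).mean())
```

Because the scaler is part of the pipeline, the same min/max values learned during fitting are reused at prediction time.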
- Parameters:
- smoothing_filter
The bandpass filter used to smooth the data before feature extraction.
- clf_pipe
A sklearn pipeline used to perform the laterality classification based on the extracted features. All pretrained pipelines consist of a MinMaxScaler and a linear SVC classifier.
- Other Parameters:
- data
The raw IMU data passed to the detect method.
- ic_list
The list of initial contacts passed to the detect method.
- sampling_rate_hz
The sampling rate of the IMU data in Hz passed to the detect method.
- Attributes:
- ic_lr_list_
The predicted left and right foot initial contacts. The dataframe is identical to the input ic_list, but with the lr column added. The lr column specifies if the respective IC belongs to the left or the right foot.
- feature_matrix_
The 6-dimensional feature vector, containing the vertical and anterior-posterior filtered angular velocities and their first (gradient) and second (curvature) derivatives. This might be helpful for debugging or further analysis. This feature matrix will be passed into the provided pipeline.
Notes
Differences to original implementation:
Instead of using numpy.diff, this implementation uses numpy.gradient to calculate the first and second derivative of the filtered signals. Unlike diff, gradient returns an output of the same length as the input and estimates derivatives at the edges of the data, which is important when the ICs are close to the beginning or end of the data.
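The difference between the two NumPy functions can be seen on a tiny example. The signal below is an arbitrary illustration (samples of t² at t = 0..4), not gait data.

```python
import numpy as np

signal = np.array([0.0, 1.0, 4.0, 9.0, 16.0])  # samples of t**2 at t = 0..4

# np.diff returns forward differences and is one sample shorter than the
# input, so the last sample has no derivative value.
diff_first = np.diff(signal)

# np.gradient uses central differences in the interior and one-sided
# differences at the edges, so the output has the same length as the input.
grad_first = np.gradient(signal)
grad_second = np.gradient(grad_first)

print(diff_first.shape, grad_first.shape, grad_second.shape)  # (4,) (5,) (5,)
```

With gradient, an IC located at the first or last sample still maps to a valid index in both derivative arrays.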
[1] Ullrich M, Küderle A, Reggi L, Cereatti A, Eskofier BM, Kluge F. Machine learning-based distinction of left and right foot contacts in lower back inertial sensor gait data. Annu Int Conf IEEE Eng Med Biol Soc. 2021, available at: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9630653
[2] Angelini L, Hodgkinson W, Smith C, Dodd JM, Sharrack B, Mazzà C, Paling D. Wearable sensors can reliably quantify gait alterations associated with disability in people with progressive multiple sclerosis in a clinical setting. J Neurol. 2020 Oct;267(10):2897-2909. doi: 10.1007/s00415-020-09928-8. Epub 2020 May 28. PMID: 32468119; PMCID: PMC7501113.
Methods
Predefined parameters for the LrcUllrich class.
clone(): Create a new instance of the class with all parameters copied over.
extract_features(data, ics, sampling_rate_hz): Extract features from the provided gait data and initial contact list.
get_params([deep]): Get parameters for this algorithm.
predict(data, ic_list, *, sampling_rate_hz, ...): Assign a left/right label to each initial contact in the passed data.
self_optimize(data_sequences, ...): Retrain the classifier pipeline using the provided data.
set_params(**params): Set the parameters of this Algorithm.
- __init__(
- smoothing_filter: BaseFilter = cf(ButterworthFilter(cutoff_freq_hz=(0.5, 2), filter_type='bandpass', order=4, zero_phase=True)),
- clf_pipe: Pipeline = cf(Pipeline(steps=[('scaler_old', MinMaxScaler()), ('clf_old', SVC(C=0.1, kernel='linear'))])),
- clone() -> Self
Create a new instance of the class with all parameters copied over.
This will create a new instance of the class itself and all nested objects.
- extract_features(data, ics, sampling_rate_hz) -> DataFrame
Extract features from the provided gait data and initial contact list.
Here, the feature set is composed of the filtered signal values and their first (gradient) and second (curvature) derivatives at the time points of the ICs. Consequently, for a dataset containing a total of N ICs, this results in an N × 6 feature matrix. To ensure uniformity, the feature set is normalized.
Note
Usually you don’t want to call this method directly. Instead, use the detect method, which calls this method internally.
- Parameters:
- data
The gait data.
- ics
The initial contact list.
- sampling_rate_hz
The sampling rate in Hz.
- Returns:
- feature_df
The DataFrame containing the extracted features.
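As a rough sketch of how such a feature matrix (one row per IC, six columns) could be assembled, consider the NumPy-only example below. The helper name build_feature_matrix, the column order, and the synthetic sinusoidal signals are assumptions made for illustration. The band-pass filtering and normalization steps are omitted, and the actual method returns a DataFrame rather than an array.

```python
import numpy as np


def build_feature_matrix(gyr_is, gyr_pa, ic_indices):
    """Stack value, gradient, and curvature of both axes at each IC.

    Hypothetical helper; the column order is an assumption.
    """
    columns = []
    for axis in (gyr_is, gyr_pa):
        first = np.gradient(axis)    # first derivative (gradient)
        second = np.gradient(first)  # second derivative (curvature)
        columns.extend([axis[ic_indices], first[ic_indices], second[ic_indices]])
    return np.column_stack(columns)


# Synthetic signals standing in for already filtered data
# (assumption: 100 Hz sampling rate, 2 s duration, 1 Hz sinusoids).
t = np.arange(200) / 100.0
gyr_is = np.sin(2 * np.pi * 1.0 * t)
gyr_pa = np.cos(2 * np.pi * 1.0 * t)
ic_indices = np.array([0, 50, 100, 150, 199])

feature_matrix = build_feature_matrix(gyr_is, gyr_pa, ic_indices)
print(feature_matrix.shape)  # (5, 6)
```

Note that the ICs at index 0 and 199 sit on the edges of the signal and still receive derivative values.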
- get_params(deep: bool = True) -> dict[str, Any]
Get parameters for this algorithm.
- Parameters:
- deep
Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).
- Returns:
- params
Parameter names mapped to their values.
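The double-underscore prefix is the same convention scikit-learn uses for nested estimators. The sketch below demonstrates it on a plain scikit-learn Pipeline built like the default clf_pipe. It does not use the LrcUllrich class itself, so the exact keys returned by that class are not shown here.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

pipe = Pipeline(
    steps=[("scaler_old", MinMaxScaler()), ("clf_old", SVC(C=0.1, kernel="linear"))]
)

shallow = pipe.get_params(deep=False)  # only the pipeline's own parameters
deep = pipe.get_params(deep=True)      # also nested ones, e.g. "clf_old__C"

nested_c = deep["clf_old__C"]

# The same prefixed names can be used to change a nested parameter.
pipe.set_params(clf_old__C=1.0)
updated_c = pipe.get_params()["clf_old__C"]
```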
- predict(data, ic_list, *, sampling_rate_hz, ...) -> Self
Assign a left/right label to each initial contact in the passed data.
- Parameters:
- data
The raw IMU data of a single sensor. This should usually represent a single gait sequence or walking bout.
- ic_list
The list of initial contacts within the data. The ic_list is expected to have a column ic with the indices of the detected initial contacts relative to the start of the passed data.
- sampling_rate_hz
The sampling rate of the IMU data in Hz.
- kwargs
Additional kwargs that are passed to the self.clf_pipe.predict method.
- Returns:
- self
The instance of the class with the ic_lr_list_ attribute set to the passed ICs with a new left/right column.
- self_optimize(
- data_sequences: Iterable[DataFrame],
- ic_list_per_sequence: Iterable[DataFrame],
- ref_ic_lr_list_per_sequence: Iterable[DataFrame],
- *,
- sampling_rate_hz: float | Iterable[float],
- **kwargs: Unpack[dict[str, Any]],
Retrain the classifier pipeline using the provided data.
Note
We only support a full re-fit of the model and the scaler. Therefore, you have to pass an untrained pipeline instance (clf_pipe) to the algorithm instance before calling self_optimize.
- Parameters:
- data_sequences
A sequence/iterable/list of dataframes, each containing the raw IMU data of a single sensor. Each sequence should usually contain the data of a single gait sequence/walking bout. The optimization will be performed over all sequences combined.
- ic_list_per_sequence
A sequence/iterable/list of IC lists, each containing the detected ICs for the respective data sequence. Each ic_list is expected to have a column ic with the indices of the detected initial contacts relative to the start of the respective data sequence.
- ref_ic_lr_list_per_sequence
A sequence/iterable/list of reference ic_lr_list, each containing the reference left/right initial contacts. They are expected to have the exact same structure as the IC lists passed as ic_list_per_sequence, but should contain the ground-truth left/right labels in an additional column called lr_label. They are used as ground truth to validate the output of the algorithm during optimization.
- sampling_rate_hz
The sampling rate of the IMU data in Hz. This can either be a single float, in case all sequences have the same sampling rate, or a sequence of floats, in case the sampling rate differs between the sequences.
- kwargs
Additional keyword arguments that are passed to the self.clf_pipe.fit method.
- Returns:
- self
The instance of the class with the internal parameters optimized.
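Since sampling_rate_hz may be either a single float or one value per sequence, a caller-side helper that normalizes both forms can be handy. The function below is a hypothetical sketch of that idea using only the standard library. The name expand_sampling_rates is an assumption, and this is not how the library necessarily handles the argument internally.

```python
from collections.abc import Iterable
from itertools import repeat


def expand_sampling_rates(sampling_rate_hz, n_sequences):
    """Return one sampling rate per sequence (hypothetical helper)."""
    if isinstance(sampling_rate_hz, Iterable):
        rates = [float(rate) for rate in sampling_rate_hz]
        if len(rates) != n_sequences:
            raise ValueError("Need exactly one sampling rate per sequence.")
        return rates
    # A single number applies to every sequence.
    return list(repeat(float(sampling_rate_hz), n_sequences))


same_rate = expand_sampling_rates(100.0, 3)
mixed_rates = expand_sampling_rates([100.0, 128.0, 50.0], 3)
```

Pairing each data sequence with its own rate this way keeps per-sequence feature extraction explicit when recordings come from different devices.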