LabExampleDataset
- class mobgap.data.LabExampleDataset(
- *,
- raw_data_sensor: Literal['SU', 'INDIP', 'INDIP2'] = 'SU',
- reference_system: Literal['INDIP', 'Stereophoto'] | None = None,
- reference_para_level: Literal['wb', 'lwb'] = 'wb',
- sensor_positions: Sequence[str] = ('LowerBack',),
- sensor_types: Sequence[Literal['acc', 'gyr', 'mag', 'bar']] = ('acc', 'gyr'),
- missing_sensor_error_type: Literal['raise', 'warn', 'ignore'] = 'raise',
- memory: Memory = Memory(location=None),
- groupby_cols: list[str] | str | None = None,
- subset_index: DataFrame | None = None,
A dataset containing all lab example data provided with mobgap.
- Parameters:
- raw_data_sensor
Which sensor to load the raw data for. One of “SU”, “INDIP”, “INDIP2”. SU is usually the “normal” lower back sensor. INDIP and INDIP2 are only available under special circumstances for the Mobilise-D TVS data.
- reference_system
When specified, reference gait parameters are loaded using the specified reference system.
- sensor_positions
Which sensor positions to load the raw data for. For “SU”, only “LowerBack” is available, but for other sensors, more positions might be available. If a sensor position is not available, an error is raised.
- sensor_types
Which sensor types to load the raw data for. This can be used to reduce the amount of data loaded, if only e.g. acc and gyr data is required. Some sensors might only have a subset of the available sensor types. If a sensor type is not available, it is ignored.
- missing_sensor_error_type
Whether to throw an error (“raise”), a warning (“warn”) or ignore (“ignore”) when a sensor is missing.
- memory
A joblib memory object to cache the results of the data loading. This is highly recommended, if you have many large data files. Otherwise, the initial index creation can take a long time.
- reference_para_level
Whether to provide “wb” (walking bout) or “lwb” (level-walking bout) reference when loading reference_parameters_. raw_reference_parameters_ will always contain both in an unformatted way.
- groupby_cols
Columns to group the data by. See Dataset for details.
- subset_index
The selected subset of the index. See Dataset for details.
- Attributes:
- data
The raw IMU data.
- sampling_rate_hz
The sampling rate of the IMU data in Hz.
- participant_metadata
The participant metadata loaded from the infoForAlgo.mat file.
- reference_parameters_
Parsed reference parameters. This contains the reference parameters in a format that can be used as input and output to many of the mobgap algorithms. See ReferenceData for details. Note that these reference parameters are expected to be relative to the start of the recording and all timing parameters (like the start and end of a walking bout) are expected to be in samples.
- reference_parameters_relative_to_wb_
Same as reference_parameters_, but all timing parameters are relative to the start of the walking bout. This is useful for algorithms that only act in the context of a walking bout.
- reference_sampling_rate_hz_
The sampling rate of the reference data in Hz.
- raw_reference_parameters_
The raw reference parameters (if available). Check other attributes with a trailing underscore for the reference parameters converted into a more standardized format.
- metadata
The metadata of the selected test.
- UNITS
Representation of IMU units in gait datasets.
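The two timing conventions described for reference_parameters_ (relative to the start of the recording) and reference_parameters_relative_to_wb_ (relative to the start of the walking bout) differ only by an offset. The sketch below is plain arithmetic, not mobgap code: the sample values, the 100 Hz rate, and the helper names to_wb_relative and samples_to_seconds are assumptions chosen for illustration.

```python
# Illustrative arithmetic only. All numbers and helper names are assumptions,
# not values or functions taken from mobgap.


def to_wb_relative(sample: int, wb_start: int) -> int:
    """Shift a recording-relative sample index so it is relative to the walking bout start."""
    return sample - wb_start


def samples_to_seconds(sample: int, sampling_rate_hz: float) -> float:
    """Convert a sample count into seconds."""
    return sample / sampling_rate_hz


sampling_rate_hz = 100.0       # hypothetical sampling rate
wb_start, wb_end = 1200, 2000  # walking bout, in samples from the recording start
initial_contact = 1350         # an event inside the bout, same convention

# The same event expressed relative to the walking bout start.
ic_relative = to_wb_relative(initial_contact, wb_start)
ic_relative_s = samples_to_seconds(ic_relative, sampling_rate_hz)
wb_duration_s = samples_to_seconds(wb_end - wb_start, sampling_rate_hz)
```

With these made-up values, the event lies 150 samples (1.5 s) into a walking bout that lasts 8 s.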
See also
Dataset
For details about the groupby_cols and subset_index parameters.
load_mobilised_matlab_format
Methods
- UNITS()
Representation of IMU units in gait datasets.
- as_attrs()
Return a version of the Dataset class that can be subclassed using attrs defined classes.
- as_dataclass()
Return a version of the Dataset class that can be subclassed using dataclasses.
- assert_is_single(groupby_cols, property_name)
Raise an error if the index contains more than one group/row with the given groupby settings.
- assert_is_single_group(property_name)
Raise an error if the index contains more than one group/row.
- clone()
Create a new instance of the class with all parameters copied over.
- create_index()
Create the dataset index.
- create_string_group_labels(label_cols)
Generate a list of string labels for each group/row in the dataset.
- get_params([deep])
Get parameters for this algorithm.
- get_subset(*[, group_labels, index, bool_map])
Get a subset of the dataset.
- groupby(groupby_cols)
Return a copy of the dataset grouped by the specified columns.
- index_as_tuples()
Get all datapoint labels of the dataset (i.e. a list of the rows of the index as named tuples).
- is_single(groupby_cols)
Return True if index contains only one row/group with the given groupby settings.
- is_single_group()
Return True if index contains only one group.
- iter_level(level)
Return generator object containing a subset for every category from the selected level.
- set_params(**params)
Set the parameters of this Algorithm.
create_group_labels
- __init__(
- *,
- raw_data_sensor: Literal['SU', 'INDIP', 'INDIP2'] = 'SU',
- reference_system: Literal['INDIP', 'Stereophoto'] | None = None,
- reference_para_level: Literal['wb', 'lwb'] = 'wb',
- sensor_positions: Sequence[str] = ('LowerBack',),
- sensor_types: Sequence[Literal['acc', 'gyr', 'mag', 'bar']] = ('acc', 'gyr'),
- missing_sensor_error_type: Literal['raise', 'warn', 'ignore'] = 'raise',
- memory: Memory = Memory(location=None),
- groupby_cols: list[str] | str | None = None,
- subset_index: DataFrame | None = None,
- class UNITS
Representation of IMU units in gait datasets.
- Parameters:
- acc
acceleration unit, default = ms^-2
- gyr
gyroscope unit, default = deg/s
- mag
magnetometer unit, default = uT
- classmethod as_attrs()
Return a version of the Dataset class that can be subclassed using attrs defined classes.
Note, this requires attrs to be installed!
- classmethod as_dataclass()
Return a version of the Dataset class that can be subclassed using dataclasses.
- assert_is_single( ) -> None
Raise an error if the index contains more than one group/row with the given groupby settings.
This should be used when implementing access to data values that can only be accessed when only a single trial/participant/etc. exists in the dataset.
- Parameters:
- groupby_cols
None (no grouping) or a valid subset of the columns available in the dataset index.
- property_name
Name of the property this check is used in. Used to format the error message.
- assert_is_single_group(property_name) -> None
Raise an error if the index contains more than one group/row.
Note that this is different from assert_is_single as it is aware of the current grouping. Instead of checking that a certain combination of columns is left in the dataset, it checks that only a single group exists with the already selected grouping as defined by self.groupby_cols.
- Parameters:
- property_name
Name of the property this check is used in. Used to format the error message.
- clone() -> Self
Create a new instance of the class with all parameters copied over.
This will create a new instance of the class itself and all nested objects.
- create_index() -> DataFrame
Create the dataset index.
The index columns will consist of the metadata extracted from the columns and the test names.
- create_string_group_labels(label_cols: str | list[str]) -> list[str]
Generate a list of string labels for each group/row in the dataset.
Note
This has a different use case than the dataset-wide groupby. Using groupby reduces the effective size of the dataset to the number of groups. This method produces a group label for each group/row that is already in the dataset, without changing the dataset.
The output of this method can be used in combination with GroupKFold as the group label.
- Parameters:
- label_cols
The columns that should be included in the label. If the dataset is already grouped, this must be a subset of self.groupby_cols.
- get_params(deep: bool = True) -> dict[str, Any]
Get parameters for this algorithm.
- Parameters:
- deep
Only relevant if the object contains nested algorithm objects. If this is the case and deep is True, the params of these nested objects are included in the output using a prefix like nested_object_name__ (note the two “_” at the end).
- Returns:
- params
Parameter names mapped to their values.
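The nested_object_name__ prefix can be illustrated with a pair of toy classes. This is a minimal sketch of the naming convention only, not the implementation used by mobgap: the classes Filter and Pipeline and their parameters cutoff_hz, window_s and smoothing are hypothetical.

```python
from typing import Any, Optional

# Toy stand-ins that mimic the "name__param" convention. They are hypothetical
# and not part of mobgap.


class Filter:
    """Hypothetical nested object with a single parameter."""

    def __init__(self, cutoff_hz: float = 5.0) -> None:
        self.cutoff_hz = cutoff_hz

    def get_params(self, deep: bool = True) -> dict[str, Any]:
        return {"cutoff_hz": self.cutoff_hz}


class Pipeline:
    """Hypothetical outer object that holds a nested ``Filter``."""

    def __init__(self, window_s: float = 2.0, smoothing: Optional[Filter] = None) -> None:
        self.window_s = window_s
        self.smoothing = smoothing if smoothing is not None else Filter()

    def get_params(self, deep: bool = True) -> dict[str, Any]:
        params: dict[str, Any] = {"window_s": self.window_s, "smoothing": self.smoothing}
        if deep:
            # Nested parameters are exposed as "<attribute name>__<parameter name>".
            for name, value in self.smoothing.get_params(deep=True).items():
                params[f"smoothing__{name}"] = value
        return params


pipeline = Pipeline(smoothing=Filter(cutoff_hz=3.0))
shallow = pipeline.get_params(deep=False)
deep = pipeline.get_params(deep=True)
```

In this sketch, shallow only lists window_s and smoothing, while deep additionally contains the key smoothing__cutoff_hz.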
- get_subset(
- *,
- group_labels: list[tuple[str, ...]] | None = None,
- index: DataFrame | None = None,
- bool_map: Sequence[bool] | None = None,
- **kwargs: list[str] | str,
Get a subset of the dataset.
Note
All arguments are mutually exclusive!
- Parameters:
- group_labels
A valid row locator or slice that can be passed to self.grouped_index.loc[locator, :]. This basically needs to be a subset of self.group_labels. Note that this is the only indexer that works on the grouped index. All other indexers work on the pure index.
- index
pd.DataFrame that is a valid subset of the current dataset index.
- bool_map
Boolean map that is used to index the current index DataFrame. The list must be of the same length as the number of rows in the index.
- **kwargs
The key must be the name of an index column. The value is a list containing strings that correspond to the categories that should be kept. For examples see above.
- Returns:
- subset
New dataset object filtered by specified parameters.
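The selection semantics of bool_map and **kwargs, and the rule that only one kind of selector may be used per call, can be sketched with a toy index. The function toy_get_subset, the list-of-dicts index, and the column names participant and test are assumptions made for illustration. The real method operates on the dataset index and returns a new dataset object.

```python
# Toy illustration of the selection rules. ``toy_get_subset`` and the column
# names are hypothetical and not part of mobgap.

INDEX = [
    {"participant": "p1", "test": "Test5"},
    {"participant": "p1", "test": "Test11"},
    {"participant": "p2", "test": "Test5"},
]


def toy_get_subset(index, *, bool_map=None, **kwargs):
    """Filter ``index`` either by a boolean map or by column categories, never both."""
    if bool_map is not None and kwargs:
        raise ValueError("bool_map and column keywords are mutually exclusive.")
    if bool_map is None and not kwargs:
        raise ValueError("Provide either bool_map or at least one column keyword.")
    if bool_map is not None:
        if len(bool_map) != len(index):
            raise ValueError("bool_map must have one entry per index row.")
        return [row for row, keep in zip(index, bool_map) if keep]
    selected = list(index)
    for column, categories in kwargs.items():
        # A single string is treated as a one-element list of categories.
        wanted = [categories] if isinstance(categories, str) else list(categories)
        selected = [row for row in selected if row[column] in wanted]
    return selected


by_category = toy_get_subset(INDEX, participant=["p1"])
by_mask = toy_get_subset(INDEX, bool_map=[True, False, True])
```

Here by_category keeps the two rows of p1, and by_mask keeps the first and the third row. Passing bool_map together with a column keyword raises a ValueError in this sketch.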
- property group_label: tuple[str, ...]
Get the current group label.
The group is defined by the current groupby settings. If the dataset is not grouped, this is equivalent to datapoint_label.
Note, this attribute can only be used if there is just a single group. This will return a named tuple. The tuple will contain only one entry if there is only a single groupby column or column in the index. The elements of the named tuple will have the same names as the groupby columns and will be in the same order.
- property group_labels: list[tuple[str, ...]]
Get all group labels of the dataset based on the set groupby level.
This will return a list of named tuples. The tuples will contain only one entry if there is only one groupby level or index column.
The elements of the named tuples will have the same names as the groupby columns and will be in the same order.
Note that if one of the groupby levels/index columns is not a valid Python attribute name (e.g. it contains spaces or starts with a number), the named tuple will not contain the correct column name! For more information see the documentation of the rename parameter of collections.namedtuple.
For some examples and additional explanation see this example.
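The effect of the rename parameter can be reproduced with the standard library alone. The column names and the type name GroupLabel below are made up for illustration. How the dataset builds its named tuples internally is not shown here.

```python
from collections import namedtuple

# Hypothetical column names: the first contains a space, the second starts with a digit.
columns = ["participant id", "1st_test", "trial"]

# With rename=True, invalid field names are replaced by positional names.
GroupLabel = namedtuple("GroupLabel", columns, rename=True)
label = GroupLabel("p1", "Test5", "Trial1")
fields = GroupLabel._fields  # ('_0', '_1', 'trial')
```

The two invalid names become _0 and _1, so those values are reachable by position or as label._0 and label._1, while label.trial keeps its original name.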
- groupby( ) -> Self
Return a copy of the dataset grouped by the specified columns.
This does not change the order of the rows of the dataset index.
Each unique group represents a single data point in the resulting dataset.
- Parameters:
- groupby_cols
None (no grouping) or a valid subset of the columns available in the dataset index.
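How grouping changes the number of data points can be sketched with a toy index. The helper toy_group_labels, the column names, and the choice to list groups in order of first appearance are assumptions of this sketch, not a description of the actual implementation.

```python
# Toy illustration: each unique combination of the groupby columns is one data point.
# ``toy_group_labels`` and the column names are hypothetical.

COLUMNS = ("participant", "test", "trial")
INDEX = [
    ("p1", "Test5", "Trial1"),
    ("p1", "Test5", "Trial2"),
    ("p2", "Test5", "Trial1"),
    ("p1", "Test11", "Trial1"),
]


def toy_group_labels(index, columns, groupby_cols=None):
    """Return one label per group; without grouping, every row is its own group."""
    if groupby_cols is None:
        return list(index)
    if isinstance(groupby_cols, str):
        groupby_cols = [groupby_cols]
    positions = [columns.index(col) for col in groupby_cols]
    keys = (tuple(row[pos] for pos in positions) for row in index)
    # dict.fromkeys drops duplicates while keeping the order of first appearance.
    return list(dict.fromkeys(keys))


ungrouped = toy_group_labels(INDEX, COLUMNS)
by_participant = toy_group_labels(INDEX, COLUMNS, "participant")
by_participant_and_test = toy_group_labels(INDEX, COLUMNS, ["participant", "test"])
```

The four rows collapse to two data points when grouped by participant and to three when grouped by participant and test. The underlying rows stay untouched.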
- property groups: list[tuple[str, ...]]
Get the current group labels. Deprecated, use group_labels instead.
- index_as_tuples() -> list[tuple[str, ...]]
Get all datapoint labels of the dataset (i.e. a list of the rows of the index as named tuples).
- is_single(groupby_cols: list[str] | str | None) -> bool
Return True if index contains only one row/group with the given groupby settings.
If groupby_cols=None this checks if there is only a single row left. If you want to check if there is only a single group within the current grouping, use is_single_group instead.
- Parameters:
- groupby_cols
None (no grouping) or a valid subset of the columns available in the dataset index.
- iter_level(
- level: str,
Return generator object containing a subset for every category from the selected level.
- Parameters:
- level
Optional str that sets the level which shall be used for iterating. This must be one of the column names of the index.
- Returns:
- subset
New dataset object containing only one category in the specified level.