Dataset Handlers

Researcher-supplied Python modules passed to sacroml.attacks.target.Target must contain a dataset class (handling processing, splitting, etc.) that implements one of these abstract classes.

Scikit-learn models that use NumPy arrays should implement SklearnDataHandler.

PyTorch models that use DataLoaders should implement PyTorchDataHandler.

API Reference

Abstract data handler supporting both PyTorch and scikit-learn.

class sacroml.attacks.data.BaseDataHandler

Base data handling interface.

abstract __init__() → None

Instantiate a data handler.

class sacroml.attacks.data.PyTorchDataHandler

PyTorch dataset handling interface.

Methods

get_dataloader(dataset, indices[, ...])

Return a data loader with a requested subset of samples.

get_dataset()

Return a processed dataset.

get_raw_dataset()

Return a raw unprocessed dataset.

abstract __init__() → None

Instantiate a data handler.

abstract get_dataloader(dataset: Dataset, indices: Sequence[int], batch_size: int = 32, shuffle: bool = False) → DataLoader

Return a data loader with a requested subset of samples.

Parameters:
dataset : Dataset

A (processed) PyTorch dataset.

indices : Sequence[int]

The indices of the samples to load from the dataset.

batch_size : int

The number of samples per batch.

shuffle : bool

Whether to shuffle the data.

Returns:
DataLoader

A PyTorch DataLoader.

abstract get_dataset() → Dataset

Return a processed dataset.

Returns:
Dataset

A (processed) PyTorch dataset.

abstract get_raw_dataset() → Dataset | None

Return a raw unprocessed dataset.

Returns:
Dataset | None

An unprocessed PyTorch dataset, or None.
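The following is a minimal sketch of how the three PyTorchDataHandler methods might fit together, not a reference implementation. The class name `ToyTorchHandler`, the synthetic tensors, and the standardisation step are all hypothetical. Using `torch.utils.data.Subset` to select the requested indices is one possible approach, assumed here for illustration. The `try`/`except` fallback only lets the sketch run where sacroml is not installed.

```python
from __future__ import annotations

from collections.abc import Sequence

import torch
from torch.utils.data import DataLoader, Dataset, Subset, TensorDataset

try:
    from sacroml.attacks.data import PyTorchDataHandler
except ImportError:
    # Stand-in so the sketch runs without sacroml installed.
    PyTorchDataHandler = object


class ToyTorchHandler(PyTorchDataHandler):
    """Hypothetical handler wrapping a small synthetic dataset."""

    def __init__(self) -> None:
        gen = torch.Generator().manual_seed(0)
        # Synthetic raw data: 10 samples, 3 features, binary targets.
        X = torch.randn(10, 3, generator=gen)
        y = torch.randint(0, 2, (10,), generator=gen)
        self.raw = TensorDataset(X, y)
        # "Processing" here is column-wise standardisation (an assumption).
        X_proc = (X - X.mean(dim=0)) / X.std(dim=0)
        self.dataset = TensorDataset(X_proc, y)

    def get_raw_dataset(self) -> Dataset | None:
        return self.raw

    def get_dataset(self) -> Dataset:
        return self.dataset

    def get_dataloader(
        self,
        dataset: Dataset,
        indices: Sequence[int],
        batch_size: int = 32,
        shuffle: bool = False,
    ) -> DataLoader:
        # Restrict the dataset to the requested samples, then batch it.
        subset = Subset(dataset, list(indices))
        return DataLoader(subset, batch_size=batch_size, shuffle=shuffle)


handler = ToyTorchHandler()
loader = handler.get_dataloader(handler.get_dataset(), [0, 2, 4, 6], batch_size=2)
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in loader]
```

Here four indices with a batch size of two yield two batches, each holding two samples with three features.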

class sacroml.attacks.data.SklearnDataHandler

Scikit-learn data handling interface.

Methods

get_data()

Return the processed data arrays.

get_raw_data()

Return the original unprocessed data arrays.

get_subset(X, y, indices)

Return a subset of the data.

abstract __init__() → None

Instantiate a data handler.

abstract get_data() → tuple[ndarray, ndarray]

Return the processed data arrays.

Returns:
tuple[np.ndarray, np.ndarray]

Features (X) and targets (y) as NumPy arrays.

abstract get_raw_data() → tuple[ndarray, ndarray] | None

Return the original unprocessed data arrays.

Returns:
tuple[np.ndarray, np.ndarray] | None

Features (X) and targets (y) as NumPy arrays, or None.

abstract get_subset(X: ndarray, y: ndarray, indices: Sequence[int]) → tuple[ndarray, ndarray]

Return a subset of the data.

Parameters:
X : np.ndarray

Feature array.

y : np.ndarray

Target array.

indices : Sequence[int]


The indices to extract.

Returns:
tuple[np.ndarray, np.ndarray]

Subset of features and targets.
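As an illustration of the SklearnDataHandler interface, the sketch below implements its three methods over a small synthetic dataset. It is a sketch under stated assumptions rather than the library's own example: the class name `ToyHandler`, the random data, and the standardisation step are hypothetical. The `try`/`except` fallback only lets the sketch run where sacroml is not installed.

```python
from __future__ import annotations

from collections.abc import Sequence

import numpy as np

try:
    from sacroml.attacks.data import SklearnDataHandler
except ImportError:
    # Stand-in so the sketch runs without sacroml installed.
    SklearnDataHandler = object


class ToyHandler(SklearnDataHandler):
    """Hypothetical handler wrapping a small synthetic dataset."""

    def __init__(self) -> None:
        rng = np.random.default_rng(0)
        # Synthetic raw data: 10 samples, 3 features, binary targets.
        self.X_raw = rng.normal(size=(10, 3))
        self.y_raw = rng.integers(0, 2, size=10)
        # "Processing" here is column-wise standardisation (an assumption).
        self.X = (self.X_raw - self.X_raw.mean(axis=0)) / self.X_raw.std(axis=0)
        self.y = self.y_raw

    def get_data(self) -> tuple[np.ndarray, np.ndarray]:
        return self.X, self.y

    def get_raw_data(self) -> tuple[np.ndarray, np.ndarray] | None:
        return self.X_raw, self.y_raw

    def get_subset(
        self, X: np.ndarray, y: np.ndarray, indices: Sequence[int]
    ) -> tuple[np.ndarray, np.ndarray]:
        # Integer-array indexing selects the requested rows from both arrays.
        idx = np.asarray(indices, dtype=int)
        return X[idx], y[idx]


handler = ToyHandler()
X, y = handler.get_data()
X_train, y_train = handler.get_subset(X, y, [0, 2, 4])
```

In this sketch `get_subset` takes the arrays as arguments instead of reading them from the handler, matching the documented signature, so the same method can serve both processed and raw data.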