DataLoader

PyTorch and TensorFlow DataLoaders

Tip

To create a standard PyTorch or TensorFlow DataLoader from a dataset, use deeplake.DatasetView.pytorch or deeplake.DatasetView.tensorflow, which adapt a dataset or query result to the PyTorch or TensorFlow DataLoader API.

Data Streaming

deeplake.Prefetcher

The Prefetcher streams large amounts of data from a DeepLake dataset more efficiently, for example when feeding a DataLoader that in turn feeds a training framework.

Examples:

>>> import deeplake
>>> ds = deeplake.open("al://my_org/dataset")
>>> fetcher = deeplake.Prefetcher(ds, batch_size=2000)
>>> for batch in fetcher:
...     process_batch(batch["images"])
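
The general idea behind prefetching can be sketched without DeepLake at all: a background thread loads batches ahead of the consumer, so loading and processing overlap. The snippet below is only an illustration of that idea under stated assumptions. The `prefetch` helper, its `depth` parameter, and the in-memory batches are all hypothetical, and nothing here describes how deeplake.Prefetcher is implemented internally.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of the stream


def prefetch(source: Iterable[T], depth: int = 2) -> Iterator[T]:
    """Yield items from `source` while a worker thread loads up to `depth` items ahead.

    Hypothetical helper for illustration; not part of the deeplake API.
    """
    buffer: queue.Queue = queue.Queue(maxsize=depth)

    def producer() -> None:
        try:
            for item in source:
                buffer.put(item)  # blocks while the buffer is full
        finally:
            buffer.put(_DONE)  # always signal the end so the consumer cannot hang

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buffer.get()
        if item is _DONE:
            return
        yield item


# Stand-in batches shaped like the dicts in the example above.
fake_batches = ({"images": [i, i + 1]} for i in range(0, 6, 2))
batches = list(prefetch(fake_batches))
print(len(batches))  # 3
```

The bounded queue caps how far the producer can run ahead, which keeps memory use predictable. This sketch does not forward producer errors to the consumer; it only guarantees that the stream ends.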

__init__

__init__(
    dataset: DatasetView,
    batch_size: int = 1,
    drop_last: bool = False,
) -> None

Parameters:

Name        Type         Description                                     Default
dataset     DatasetView  The deeplake.DatasetView to stream from         required
batch_size  int          The number of rows to return in each iteration  1
drop_last   bool         If true, do not return a non-full final batch   False

__iter__

__iter__() -> Prefetcher

Iterate over the dataset view

__len__

__len__() -> int

The number of batches in the Prefetcher
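
As a rough illustration of how the batch count relates to batch_size and drop_last, the arithmetic can be sketched as follows. This assumes the usual convention that every batch except possibly the last is full. The `expected_batches` helper is hypothetical and is not part of the deeplake API.

```python
def expected_batches(num_rows: int, batch_size: int = 1, drop_last: bool = False) -> int:
    """Number of batches for `num_rows` rows (hypothetical helper, not a deeplake call)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    full_batches, remainder = divmod(num_rows, batch_size)
    if drop_last or remainder == 0:
        return full_batches
    return full_batches + 1  # count the final, partially filled batch


print(expected_batches(10_500, batch_size=2000))                  # 6
print(expected_batches(10_500, batch_size=2000, drop_last=True))  # 5
```

With 10,500 rows and a batch size of 2000 there are five full batches and one batch of 500 rows; drop_last=True discards that final batch.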

__next__

__next__() -> dict

Returns the next batch of rows from the dataset view

reset

reset() -> None

Reset the iterator
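
The method set above follows Python's iterator protocol. The stand-in class below mirrors those signatures over in-memory lists so that the interplay of __iter__, __next__, __len__, and reset can be tried without a dataset. It is a sketch only: `InMemoryBatcher` is a hypothetical name, the dict-of-lists batch layout is an assumption, and so is the reading that reset rewinds to the first batch.

```python
from typing import Dict, List


class InMemoryBatcher:
    """Hypothetical stand-in with the same method names; not DeepLake code."""

    def __init__(self, columns: Dict[str, List], batch_size: int = 1, drop_last: bool = False) -> None:
        self._columns = columns
        self._batch_size = batch_size
        self._drop_last = drop_last
        self._num_rows = len(next(iter(columns.values()))) if columns else 0
        self._pos = 0  # index of the first row of the next batch

    def __len__(self) -> int:
        full, remainder = divmod(self._num_rows, self._batch_size)
        return full if (self._drop_last or remainder == 0) else full + 1

    def __iter__(self) -> "InMemoryBatcher":
        return self  # the object is its own iterator

    def __next__(self) -> dict:
        end = self._pos + self._batch_size
        out_of_rows = self._pos >= self._num_rows
        partial_dropped = self._drop_last and end > self._num_rows
        if out_of_rows or partial_dropped:
            raise StopIteration
        batch = {name: values[self._pos:end] for name, values in self._columns.items()}
        self._pos = end
        return batch

    def reset(self) -> None:
        self._pos = 0  # assumption: reset rewinds to the first batch


batcher = InMemoryBatcher({"images": list(range(5))}, batch_size=2)
first_pass = [batch["images"] for batch in batcher]
exhausted = [batch["images"] for batch in batcher]  # nothing left without a reset
batcher.reset()
second_pass = [batch["images"] for batch in batcher]
print(first_pass)  # [[0, 1], [2, 3], [4]]
```

Because __iter__ returns the object itself, a second for loop over an exhausted instance yields nothing until reset is called.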