deeplake.core.tensor

Tensor

class deeplake.core.tensor.Tensor
__len__()

Returns the length of the primary axis of the tensor. Accounts for indexing into the tensor object.

Examples

>>> len(tensor)
0
>>> tensor.extend(np.zeros((100, 10, 10)))
>>> len(tensor)
100
>>> len(tensor[5:10])
5

Returns

The current length of this tensor.

Return type

int

__setitem__(item: Union[int, slice], value: Any)

Update samples with new values.

Example

>>> tensor.append(np.zeros((10, 10)))
>>> tensor.shape
(1, 10, 10)
>>> tensor[0] = np.zeros((3, 3))
>>> tensor.shape
(1, 3, 3)

property _config

Returns a summary of the configuration of the tensor.

_linked_sample()

Returns the linked sample at the given index. This is only applicable for tensors of link[] htype and can only be used for exactly one sample.

>>> ds.abc[0]._linked_sample().path
'https://picsum.photos/200/300'

append(sample: Union[Sample, ndarray, int, float, bool, dict, list, str, integer, floating, bool_])

Appends a single sample to the end of the tensor. The sample can be an array, a scalar value, or the return value of deeplake.read(), which can be used to load files. See the examples below.

Examples

Numpy input:

>>> len(tensor)
0
>>> tensor.append(np.zeros((28, 28, 1)))
>>> len(tensor)
1

File input:

>>> len(tensor)
0
>>> tensor.append(deeplake.read("path/to/file"))
>>> len(tensor)
1

Parameters

sample (InputSample) – The data to append to the tensor. Samples loaded from files are created with deeplake.read(). See the examples above.

property base_htype

Base htype of the tensor.

Example

>>> ds.create_tensor("video_seq", htype="sequence[video]", sample_compression="mp4")
>>> ds.video_seq.htype
sequence[video]
>>> ds.video_seq.base_htype
video
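
The relationship between htype and base_htype in the example above can be pictured as stripping one wrapper such as sequence[...] from the htype string. The following is a minimal sketch in plain Python, not Deep Lake's own parser; the helper name strip_htype_wrapper and the regular expression are assumptions made for illustration.

```python
import re

# Assumed pattern for this sketch: a wrapper name followed by the inner htype in brackets.
_WRAPPED_HTYPE = re.compile(r"^(\w+)\[(.+)\]$")


def strip_htype_wrapper(htype):
    """Hypothetical helper, not Deep Lake's own parser."""
    match = _WRAPPED_HTYPE.match(htype)
    # Return the inner htype when a wrapper is present, else the htype unchanged.
    return match.group(2) if match else htype


print(strip_htype_wrapper("sequence[video]"))  # video
print(strip_htype_wrapper("video"))            # video
```
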

clear()

Deletes all samples from the tensor.

data(aslist: bool = False, fetch_chunks: bool = False) → Any

Returns data in the tensor in a format based on the tensor’s base htype.

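  • If tensor has text base htype, returns a dict whose “value” is the result of text().
  • If tensor has json base htype, returns a dict whose “value” is the result of dict().
  • If tensor has list base htype, returns a dict whose “value” is the result of list().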
  • For video tensors, returns a dict with keys “frames”, “timestamps” and “sample_info”:

    • Value of dict[“frames”] will be same as numpy().

    • Value of dict[“timestamps”] will be same as timestamps corresponding to the frames.

    • Value of dict[“sample_info”] will be same as sample_info.

  • For class_label tensors, returns a dict with keys “value” and “text”.

    • Value of dict[“value”] will be same as numpy().

    • Value of dict[“text”] will be list of class labels as strings.

  • For image or dicom tensors, returns dict with keys “value” and “sample_info”.

    • Value of dict[“value”] will be same as numpy().

    • Value of dict[“sample_info”] will be same as sample_info.

  • For all else, returns dict with key “value” with value same as numpy().
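
As a quick reference, the keys described above can be summarised in a small lookup. This is only a sketch in plain Python: data_keys is a hypothetical helper, not part of Deep Lake, and it covers just the base htypes whose keys are spelled out in the list.

```python
def data_keys(base_htype):
    """Hypothetical helper: keys of the dict that data() is documented to return."""
    if base_htype == "video":
        return ("frames", "timestamps", "sample_info")
    if base_htype == "class_label":
        return ("value", "text")
    if base_htype in ("image", "dicom"):
        return ("value", "sample_info")
    # Remaining htypes are documented as a dict with a single "value" key.
    return ("value",)


print(data_keys("video"))        # ('frames', 'timestamps', 'sample_info')
print(data_keys("class_label"))  # ('value', 'text')
```
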

dict(fetch_chunks: bool = False)

Return json data. Only applicable for tensors with ‘json’ base htype.

property dtype: Optional[dtype]

Dtype of the tensor.

extend(samples: Union[ndarray, Sequence[Union[Sample, ndarray, int, float, bool, dict, list, str, integer, floating, bool_]], Tensor], progressbar: bool = False)

Extends the end of the tensor by appending multiple elements from a sequence. Accepts a sequence, a single batched numpy array, or a sequence of deeplake.read() outputs, which can be used to load files. See the examples below.

Example

Numpy input:

>>> len(tensor)
0
>>> tensor.extend(np.zeros((100, 28, 28, 1)))
>>> len(tensor)
100

File input:

>>> len(tensor)
0
>>> tensor.extend([
        deeplake.read("path/to/image1"),
        deeplake.read("path/to/image2"),
    ])
>>> len(tensor)
2

Parameters
  • samples (np.ndarray, Sequence, Sequence[Sample]) – The data to add to the tensor. The length should be equal to the number of samples to add.

  • progressbar (bool) – Specifies whether a progressbar should be displayed while extending.

Raises

TensorDtypeMismatchError – Dtype for array must be equal to or castable to this tensor’s dtype.

property hidden: bool

Whether this tensor is a hidden tensor.

property htype

Htype of the tensor.

property info: Info

Returns the information about the tensor. User can set info of tensor.

Returns

Information about the tensor.

Return type

Info

Example

>>> # update info
>>> ds.images.info.update(large=True, gray=False)
>>> # get info
>>> ds.images.info
{'large': True, 'gray': False}
>>> ds.images.info = {"complete": True}
>>> ds.images.info
{'complete': True}

property is_dynamic: bool

True if the samples in this tensor have unequal shapes.

property is_link

Whether this tensor is a link tensor.

property is_sequence

Whether this tensor is a sequence tensor.

list(fetch_chunks: bool = False)

Return list data. Only applicable for tensors with ‘list’ base htype.

property meta

Metadata of the tensor.

modified_samples(target_id: Optional[str] = None, return_indexes: Optional[bool] = False)

Returns a slice of the tensor with only those elements that were modified/added. By default the modifications are calculated relative to the previous commit made, but this can be changed by providing a target id.

Parameters
  • target_id (str, optional) – The commit id or branch name to calculate the modifications relative to. Defaults to None.

  • return_indexes (bool, optional) – If True, returns the indexes of the modified elements. Defaults to False.

Returns

If return_indexes is False, a new tensor containing only the modified elements. If return_indexes is True, a tuple of that tensor and the indexes of the modified elements.

Return type

Tensor or Tuple[Tensor, List[int]]

Raises

TensorModifiedError – If a target id is passed which is not an ancestor of the current commit.

property ndim: int

Number of dimensions of the tensor.

property num_samples: int

Returns the length of the primary axis of the tensor. Ignores any applied indexing and returns the total length.
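
The difference between len(tensor) and num_samples can be pictured with a plain-Python stand-in. The sketch below is not Deep Lake code: TensorView is a hypothetical class that models only a single level of slicing, to show that one value follows the applied index while the other does not.

```python
class TensorView:
    """Hypothetical stand-in for an indexed tensor; not part of Deep Lake."""

    def __init__(self, total, index=None):
        self.total = total  # samples stored along the primary axis
        self.index = index if index is not None else slice(None)

    def __getitem__(self, item):
        # Only one level of slicing is modelled in this sketch.
        return TensorView(self.total, item)

    def __len__(self):
        # Accounts for indexing: count the samples the slice selects.
        return len(range(self.total)[self.index])

    @property
    def num_samples(self):
        # Ignores indexing: always the full length.
        return self.total


view = TensorView(100)[5:10]
print(len(view), view.num_samples)  # 5 100
```
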

numpy(aslist=False, fetch_chunks=False) → Union[ndarray, List[ndarray]]

Computes the contents of the tensor in numpy format.

Parameters
  • aslist (bool) – If True, a list of np.ndarrays will be returned. Helpful for dynamic tensors. If False, a single np.ndarray will be returned unless the samples are dynamically shaped, in which case an error is raised.

  • fetch_chunks (bool) –

    If True, full chunks will be retrieved from the storage, otherwise only required bytes will be retrieved. This will always be True even if specified as False in the following cases:

    • The tensor is ChunkCompressed.

    • The chunk which is being accessed has more than 128 samples.

Raises
  • DynamicTensorNumpyError – If reading a dynamically-shaped array slice without aslist=True.

  • ValueError – If the tensor is a link and the credentials are not populated.

Returns

A numpy array containing the data represented by this tensor.

Note

For tensors of htype polygon, aslist is always True.
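
The fetch_chunks override described above amounts to a simple rule. The sketch below restates it in plain Python under the conditions given in the parameter description; effective_fetch_chunks and its arguments are hypothetical names, not part of the Deep Lake API.

```python
CHUNK_SAMPLE_THRESHOLD = 128  # from the text: chunks with more samples are fetched whole


def effective_fetch_chunks(requested, chunk_compressed, samples_in_chunk):
    """Hypothetical helper: whether full chunks end up being retrieved."""
    return bool(
        requested
        or chunk_compressed
        or samples_in_chunk > CHUNK_SAMPLE_THRESHOLD
    )


print(effective_fetch_chunks(False, False, 10))   # False
print(effective_fetch_chunks(False, True, 10))    # True
print(effective_fetch_chunks(False, False, 129))  # True
```
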

path(fetch_chunks: bool = False)

Return path data. Only applicable for linked tensors.

play()

Plays a video sample, either in a Jupyter notebook or in a web browser. The video is streamed directly from storage. This method fails for incompatible htypes.

Example

>>> ds = deeplake.load("./test/my_video_ds")
>>> # play the third sample (index 2)
>>> ds.videos[2].play()

Note

Video streaming is not yet supported on Colab.

pop(index: Optional[int] = None)

Removes an element at the given index.

property sample_indices

Returns all the indices pointed to by this tensor in the dataset view.

property sample_info: Union[Dict, List[Dict]]

Returns info about particular samples in a tensor. Returns a dict for a single sample, otherwise a list of dicts. The data in each dict depends on the tensor’s htype and the sample itself.

Example

>>> ds.videos[0].sample_info
{'duration': 400400, 'fps': 29.97002997002997, 'timebase': 3.3333333333333335e-05, 'shape': [400, 360, 640, 3], 'format': 'mp4', 'filename': '../deeplake/tests/dummy_data/video/samplemp4.mp4', 'modified': False}
>>> ds.images[:2].sample_info
[{'exif': {'Software': 'Google'}, 'shape': [900, 900, 3], 'format': 'jpeg', 'filename': '../deeplake/tests/dummy_data/images/cat.jpeg', 'modified': False}, {'exif': {}, 'shape': [495, 750, 3], 'format': 'jpeg', 'filename': '../deeplake/tests/dummy_data/images/car.jpg', 'modified': False}]

property shape: Tuple[Optional[int], ...]

Get the shape of this tensor. Length is included.

Example

>>> tensor.append(np.zeros((10, 10)))
>>> tensor.append(np.zeros((10, 15)))
>>> tensor.shape
(2, 10, None)

Returns

Tuple where each value is either None (if that axis is dynamic) or an int (if that axis is fixed).

Return type

tuple

Note

If you don’t want None in the output shape or want the lower/upper bound shapes, use shape_interval instead.
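
How None ends up in the shape can be shown by merging per-sample shapes axis by axis. This is a minimal sketch in plain Python, not Deep Lake's implementation; combined_shape is a hypothetical helper, and its handling of an empty input is this sketch's own choice.

```python
def combined_shape(sample_shapes):
    """Hypothetical helper: merge per-sample shapes into one tensor-style shape."""
    if not sample_shapes:
        return (0,)  # this sketch's choice for an empty tensor
    ndim = len(sample_shapes[0])
    if any(len(shape) != ndim for shape in sample_shapes):
        raise ValueError("all samples must have the same number of dimensions")
    axes = []
    for sizes in zip(*sample_shapes):
        # An axis is fixed when every sample agrees on its size, else dynamic.
        axes.append(sizes[0] if len(set(sizes)) == 1 else None)
    # The length (number of samples) is included as the first entry.
    return (len(sample_shapes), *axes)


print(combined_shape([(10, 10), (10, 15)]))  # (2, 10, None)
```
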

property shape_interval: ShapeInterval

Returns a ShapeInterval object that describes this tensor’s shape more accurately. Length is included.

Example

>>> tensor.append(np.zeros((10, 10)))
>>> tensor.append(np.zeros((10, 15)))
>>> tensor.shape_interval
ShapeInterval(lower=(2, 10, 10), upper=(2, 10, 15))
>>> str(tensor.shape_interval)
'(2, 10, 10:15)'

Returns

Object containing lower and upper properties.

Return type

ShapeInterval

Note

If you are expecting a tuple, use shape instead.
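
The lower and upper bounds in the example above are the per-axis minimum and maximum over all sample shapes, with the length prepended. The sketch below reproduces that in plain Python; Interval and shape_bounds are hypothetical stand-ins, not the real ShapeInterval class.

```python
from typing import NamedTuple, Tuple


class Interval(NamedTuple):
    """Hypothetical stand-in for ShapeInterval."""

    lower: Tuple[int, ...]
    upper: Tuple[int, ...]

    def __str__(self):
        # Fixed axes print as one number, dynamic axes as "lower:upper".
        parts = [
            str(lo) if lo == hi else f"{lo}:{hi}"
            for lo, hi in zip(self.lower, self.upper)
        ]
        return "(" + ", ".join(parts) + ")"


def shape_bounds(sample_shapes):
    """Per-axis min and max over all samples, with the length prepended."""
    n = len(sample_shapes)
    lower = (n, *(min(sizes) for sizes in zip(*sample_shapes)))
    upper = (n, *(max(sizes) for sizes in zip(*sample_shapes)))
    return Interval(lower, upper)


bounds = shape_bounds([(10, 10), (10, 15)])
print(bounds.lower, bounds.upper)  # (2, 10, 10) (2, 10, 15)
print(str(bounds))                 # (2, 10, 10:15)
```
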

summary()

Prints a summary of the tensor.

text(fetch_chunks: bool = False)

Return text data. Only applicable for tensors with ‘text’ base htype.

property timestamps: ndarray

Returns the timestamps (in seconds) of a video sample as a numpy array.

Example

>>> # Return timestamps for all frames of first video sample
>>> ds.videos[0].timestamps.shape
(400,)
>>> # Return timestamps for frames 5 through 9 (0-based) of first video sample
>>> ds.videos[0, 5:10].timestamps
array([0.2002    , 0.23356667, 0.26693332, 0.33366665, 0.4004    ],
      dtype=float32)

tobytes() → bytes

Returns the bytes of the tensor.

  • Only works for a single sample of the tensor.

  • If the tensor is uncompressed, this returns the bytes of the numpy array.

  • If the tensor is sample compressed, this returns the compressed bytes of the sample.

  • If the tensor is chunk compressed, this raises an error.

Returns

The bytes of the tensor.

Return type

bytes

Raises

ValueError – If the tensor has multiple samples.

property verify

Whether linked data will be verified when samples are added. Applicable only to tensors with htype link[htype].