Htypes

Htype is the class of a tensor: image, bounding box, generic tensor, etc.

The htype of a tensor can be specified at its creation

>>> ds.create_tensor("my_tensor", htype="...")

If not specified, the tensor’s htype defaults to “generic”.

Specifying an htype allows for strict settings and error handling, and it is critical for increasing the performance of Deep Lake datasets containing rich data such as images and videos.

Supported htypes and their respective defaults are:

Htype configs

HTYPE            DTYPE      COMPRESSION
generic          None       None
image            uint8      Required arg
image.rgb        uint8      Required arg
image.gray       uint8      Required arg
video            uint8      Required arg
audio            float64    Required arg
class_label      uint32     None
bbox             float32    None
bbox.3d          float32    None
intrinsics       float32    None
segment_mask     uint32     None
binary_mask      bool       None
keypoints_coco   int32      None
point            int32      None
polygon          float32    None
text             str        None
json             Any        None
list             List       None
dicom            None       dcm
nifti            None       Required arg
point_cloud      None       las
mesh             None       ply
instance_label   uint32     None
link             str        None
sequence         None       None

Image Htype

  • Sample dimensions: (height, width, # channels) or (height, width).

Images can be stored in Deep Lake as compressed bytes or as raw arrays. Due to the high compression ratio for most image formats, it is highly recommended to store compressed images using the sample_compression input to the create_tensor method.

Creating an image tensor

An image tensor can be created using

>>> ds.create_tensor("images", htype="image", sample_compression="jpg")

OR

>>> ds.create_tensor("images", htype="image", chunk_compression="jpg")
  • Optional args:
    • dtype: Defaults to uint8.

  • Supported compressions:

>>> [None, "bmp", "dib", "gif", "ico", "jpeg", "jpeg2000", "pcx", "png", "ppm", "sgi", "tga", "tiff",
... "webp", "wmf", "xbm", "eps", "fli", "im", "msp", "mpo"]

Appending image samples

  • Image samples can be of type np.ndarray or Deep Lake Sample which can be created using deeplake.read().

Examples

Appending pixel data with array

>>> ds.images.append(np.zeros((5, 5, 3), dtype=np.uint8))

Appending a Deep Lake image sample

>>> ds.images.append(deeplake.read("images/0001.jpg"))

You can append multiple samples at the same time using extend().

>>> ds.images.extend([deeplake.read(f"images/000{i}.jpg") for i in range(10)])

Note

If the compression format of the input sample does not match the sample_compression of the tensor, Deep Lake will decompress and recompress the image for storage, which may significantly slow down the upload process. The upload process is fastest when the image compression matches the sample_compression.
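The check described in the note above can be sketched as a small helper. This is a hypothetical utility, not part of the Deep Lake API; it compares a file's extension against the tensor's sample_compression before appending, so you can predict whether a decompress/recompress round trip will occur:

```python
from pathlib import Path

# Some formats have multiple common extensions that name the same codec;
# "jpg" and "jpeg" are treated as equal here.
EQUIVALENT = {"jpg": "jpeg"}

def matches_sample_compression(path, sample_compression):
    """Return True if the file's format matches the tensor's sample_compression."""
    ext = Path(path).suffix.lstrip(".").lower()
    ext = EQUIVALENT.get(ext, ext)
    target = EQUIVALENT.get(sample_compression, sample_compression)
    return ext == target

matches_sample_compression("images/0001.jpg", "jpeg")   # True: no recompression
matches_sample_compression("images/0001.png", "jpeg")   # False: will be recompressed
```

A real check would inspect the file header rather than trust the extension, but the extension test is usually sufficient for curated datasets.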

image.rgb and image.gray htypes

The image.rgb and image.gray htypes can be used to force your samples to be of RGB or grayscale type: if RGB images are appended to an image.gray tensor, Deep Lake converts them to grayscale, and if grayscale images are appended to an image.rgb tensor, Deep Lake converts them to RGB.

image.rgb and image.gray tensors can be created using

>>> ds.create_tensor("rgb_images", htype="image.rgb", sample_compression="...")
>>> ds.create_tensor("gray_images", htype="image.gray", sample_compression="...")

Video Htype

  • Sample dimensions: (# frames, height, width, # channels) or (# frames, height, width)

Creating a video tensor

A video tensor can be created using

>>> ds.create_tensor("videos", htype="video", sample_compression="mp4")
  • Optional args:
    • dtype: Defaults to uint8.

  • Supported compressions:

>>> [None, "mp4", "mkv", "avi"]

Appending video samples

  • Video samples can be of type np.ndarray or Sample which is returned by deeplake.read().

  • Deep Lake does not support compression of raw video frames. Therefore, arrays of raw frames can only be appended to tensors with None compression.

  • Recompression of samples read with deeplake.read is also not supported.

Examples

Appending Deep Lake video sample

>>> ds.videos.append(deeplake.read("videos/0012.mp4"))

Extending with multiple videos

>>> ds.videos.extend([deeplake.read(f"videos/00{i}.mp4") for i in range(10)])

Audio Htype

  • Sample dimensions: (# samples in audio, # channels) or (# samples in audio,)

Creating an audio tensor

An audio tensor can be created using

>>> ds.create_tensor("audios", htype="audio", sample_compression="mp3")
  • Optional args:
    • dtype: Defaults to float64.

  • Supported compressions:

>>> [None, "mp3", "wav", "flac"]

Appending audio samples

  • Audio samples can be of type np.ndarray or Sample which is returned by deeplake.read().

  • Like videos, Deep Lake does not support compression or recompression of input audio samples. Thus, samples of type np.ndarray can only be appended to tensors with None compression.

Examples

Appending Deep Lake audio sample

>>> ds.audios.append(deeplake.read("audios/001.mp3"))

Extending with Deep Lake audio samples

>>> ds.audios.extend([deeplake.read(f"audios/00{i}.mp3") for i in range(10)])

Class Label Htype

  • Sample dimensions: (# labels,)

Class labels are stored as numerical values in tensors, which are indices of the list tensor.info.class_names.
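Conceptually, the mapping between string labels and stored indices works like the sketch below. to_indices is a hypothetical helper, not a Deep Lake function; it mirrors how class_names is extended automatically when string labels are appended:

```python
import numpy as np

class_names = ["airplanes", "cars", "birds"]

def to_indices(labels, class_names):
    # Unseen names are appended to class_names, mirroring the automatic
    # update that happens when string labels are appended to the tensor.
    out = []
    for name in labels:
        if name not in class_names:
            class_names.append(name)
        out.append(class_names.index(name))
    return np.array(out, dtype=np.uint32)

idx = to_indices(["cars", "cats"], class_names)
# idx is array([1, 3]); class_names now ends with "cats"
```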

Creating a class label tensor

A class label tensor can be created using

>>> classes = ["airplanes", "cars", "birds", "cats", "deer", "dogs", "frogs", "horses", "ships", "trucks"]
>>> ds.create_tensor("labels", htype="class_label", class_names=classes, chunk_compression="lz4")
  • Optional args:
    • class_names: List of class names. tensor.info.class_names will be set to this list.

    • dtype: Defaults to uint32.

    • sample_compression or chunk_compression.

  • Supported compressions:

>>> ["lz4"]

You can also choose to set the class names after tensor creation.

>>> ds.labels.info.update(class_names = ["airplanes", "cars", "birds", "cats", "deer", "dogs", "frogs", "horses", "ships", "trucks"])

Note

If compression is specified, chunk_compression is the better option, since the number of labels in a single sample is typically too small for sample_compression to be effective.

Appending class labels

  • Class labels can be appended as int, str, np.ndarray or list of int or str.

  • In case of strings, tensor.info.class_names is updated automatically.

Examples

Appending index

>>> ds.labels.append(0)
>>> ds.labels.append(np.zeros((5,), dtype=np.uint32))

Extending with list of indices

>>> ds.labels.extend([[0, 1, 2], [1, 3]])

Appending text labels

>>> ds.labels.append(["cars", "airplanes"])

Bounding Box Htype

  • Sample dimensions: (# bounding boxes, 4)

Bounding boxes have a variety of conventions such as those used in YOLO, COCO, Pascal-VOC and others. In order for bounding boxes to be correctly displayed by the visualizer, the format of the bounding box must be specified in the coords key in tensor meta information mentioned below.

Creating a bbox tensor

A bbox tensor can be created using

>>> ds.create_tensor("boxes", htype="bbox", coords={"type": "fractional", "mode": "CCWH"})
  • Optional args:
    • coords: A dictionary with keys “type” and “mode”.
      • type: Specifies the units of the bounding box coordinates.
        • “pixel”: coordinates are in units of pixels.

        • “fractional”: coordinates are relative to the image width and height, as in the YOLO format.

      • mode: Specifies the convention for the 4 coordinates
        • “LTRB”: left_x, top_y, right_x, bottom_y

        • “LTWH”: left_x, top_y, width, height

        • “CCWH”: center_x, center_y, width, height

    • dtype: Defaults to float32.

    • sample_compression or chunk_compression.

  • Supported compressions:

>>> ["lz4"]

You can also choose to set the bounding box format after tensor creation.

>>> ds.boxes.info.update(coords = {"type": "pixel", "mode": "LTRB"})

Note

If the bounding box format is not specified, the visualizer will assume a YOLO format (fractional + CCWH) if the box coordinates are < 1 on average. Otherwise, it will assume the COCO format (pixel + LTWH).
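The three modes can be converted between with a few lines of NumPy. This is an illustrative sketch, not a Deep Lake API; boxes are (N, 4) arrays in the conventions defined above:

```python
import numpy as np

def ltrb_to_ltwh(boxes):
    # (left_x, top_y, right_x, bottom_y) -> (left_x, top_y, width, height)
    l, t, r, b = boxes.T
    return np.stack([l, t, r - l, b - t], axis=1)

def ltwh_to_ccwh(boxes):
    # (left_x, top_y, width, height) -> (center_x, center_y, width, height)
    l, t, w, h = boxes.T
    return np.stack([l + w / 2, t + h / 2, w, h], axis=1)

boxes_ltrb = np.array([[10, 20, 110, 70]], dtype=np.float32)
boxes_ltwh = ltrb_to_ltwh(boxes_ltrb)   # [[10, 20, 100, 50]]
boxes_ccwh = ltwh_to_ccwh(boxes_ltwh)   # [[60, 45, 100, 50]]
```

The same functions work for "fractional" coordinates, since the conversions are unit-agnostic.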

Appending bounding boxes

  • Bounding boxes can be appended as np.ndarray or as a list of lists or arrays.

Examples

Appending one bounding box

>>> box
array([[462, 123, 238,  98]])
>>> ds.boxes.append(box)

Appending sample with 3 bounding boxes

>>> boxes
array([[965, 110, 262,  77],
       [462, 123, 238,  98],
       [688, 108, 279, 116]])
>>> boxes.shape
(3, 4)
>>> ds.boxes.append(boxes)

3D Bounding Box Htype


Creating a 3d bbox tensor

Note

In order for 3D bounding boxes to be correctly displayed by the visualizer, the format of the bounding box must be specified in the coords key in tensor meta information mentioned below. In addition, for projecting 3D bounding boxes onto 2D data (such as an image), the intrinsics tensor must exist in the dataset, or the intrinsics matrix must be specified in the ds.img_tensor.info dictionary, where the key is "intrinsics" and the value is the matrix.

A 3d bbox tensor can be created using

>>> ds.create_tensor("3d_boxes", htype="bbox.3d", coords={"mode": "center"})
  • Optional args:
    • coords: A dictionary with key “mode”.
      • mode: Specifies the convention for the bbox coordinates.
        • “center”: [center_x, center_y, center_z, size_x, size_y, size_z, rot_x, rot_y, rot_z]
          • Sample dimensions: (# bounding boxes, 9)

          • size_x - the length of the bounding box along the x direction

          • size_y - the width of the bounding box along the y direction

          • size_z - the height of the bounding box along the z direction

          • rot_x - the rotation angle about the x axis, in degrees

          • rot_y - the rotation angle about the y axis, in degrees

          • rot_z - the rotation angle about the z axis, in degrees

        • “vertex”: 8 3D vertices - [[x0, y0, z0], [x1, y1, z1], [x2, y2, z2], …, [x7, y7, z7]]
          • Sample dimensions: (# bounding boxes, 8, 3)

          The vertex order is of the following form:

                4_____________________ 5
               /|                    /|
              / |                   / |
             /  |                  /  |
            /___|_________________/   |
          0|    |                 | 1 |
           |    |                 |   |
           |    |                 |   |
           |    |                 |   |
           |    |_________________|___|
           |   /  7               |   / 6
           |  /                   |  /
           | /                    | /
           |/_____________________|/
            3                      2
          
    • dtype: Defaults to float32.

    • sample_compression or chunk_compression.

  • Supported compressions:

>>> ["lz4"]

Note

Rotation angles are specified in degrees, not radians.
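Expanding a "center" mode box into "vertex" mode corners can be sketched as below. This is an illustrative conversion, not a Deep Lake API; for brevity it assumes zero rotation (rot_x = rot_y = rot_z = 0), and the corner ordering is arbitrary rather than matching the visualizer's vertex diagram. A full version would apply the three rotation matrices:

```python
import numpy as np

def center_to_vertices(box):
    """Convert [cx, cy, cz, sx, sy, sz, 0, 0, 0] to an (8, 3) corner array."""
    cx, cy, cz, sx, sy, sz = box[:6]
    # All sign combinations of the half-sizes give the 8 corners.
    offsets = np.array([[dx, dy, dz]
                        for dz in (sz / 2, -sz / 2)
                        for dy in (-sy / 2, sy / 2)
                        for dx in (-sx / 2, sx / 2)])
    return np.array([cx, cy, cz]) + offsets

verts = center_to_vertices(np.array([0, 0, 0, 2, 4, 6, 0, 0, 0]))
# verts has shape (8, 3): corners of a 2 x 4 x 6 box centered at the origin
```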

Appending 3d bounding boxes

  • Bounding boxes can be appended as np.ndarray or as a list of lists or arrays.

Examples

Appending one bounding box

>>> box
array([[462, 123, 238,  98,  22,  36,  44,  18,   0]])
>>> ds["3d_boxes"].append(box)

Appending sample with 3 bounding boxes

>>> boxes
array([[965, 110, 262,  77,  22,  36,  44,  18,   0],
       [462, 123, 238,  98,  26,  34,  24,  19,   0],
       [688, 108, 279, 116,  12,  32,  14,  38,   0]])
>>> boxes.shape
(3, 9)
>>> ds["3d_boxes"].append(boxes)

Intrinsics Htype

  • Sample dimensions: (# intrinsics matrices, 3, 3)

The intrinsic matrix represents a projective transformation from the 3-D camera coordinates into the 2-D image coordinates. The intrinsic parameters include the focal length and the optical center (also known as the principal point). The camera intrinsic matrix, \(K\), is defined as:

\[\begin{split}\begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}\end{split}\]
  • \([c_x, c_y]\) - Optical center (the principal point), in pixels.

  • \([f_x, f_y]\) - Focal length in pixels.

  • \(f_x = F / p_x\)

  • \(f_y = F / p_y\)

  • \(F\) - Focal length in world units, typically expressed in millimeters.

  • \((p_x, p_y)\) - Size of the pixel in world units.
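The projection that \(K\) performs can be demonstrated directly: a camera-space point \([X, Y, Z]\) maps to pixel coordinates \(u = f_x X / Z + c_x\), \(v = f_y Y / Z + c_y\). The matrix values below are illustrative, not from any real camera:

```python
import numpy as np

# Example intrinsics: focal length 1000 px, principal point at (320, 240).
K = np.array([[1000.0,    0.0, 320.0],
              [   0.0, 1000.0, 240.0],
              [   0.0,    0.0,   1.0]])

point_3d = np.array([0.5, 0.25, 2.0])   # point in camera coordinates
uvw = K @ point_3d                      # homogeneous image coordinates
u, v = uvw[:2] / uvw[2]                 # perspective divide -> (570, 365)
```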

Creating an intrinsics tensor

An intrinsics tensor can be created using

>>> ds.create_tensor("intrinsics", htype="intrinsics")
  • Supported compressions:

>>> ["lz4"]

Appending intrinsics matrices

>>> intrinsic_params = np.zeros((3, 3))
>>> ds.intrinsics.append(intrinsic_params)

Segmentation Mask Htype

  • Sample dimensions: (height, width)

Segmentation masks are 2D representations of class labels where the numerical label data is encoded in an array of same shape as the image. The numerical values are indices of the list tensor.info.class_names.

Creating a segment_mask tensor

A segment_mask tensor can be created using

>>> classes = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle"]
>>> ds.create_tensor("masks", htype="segment_mask", class_names=classes, sample_compression="lz4")
  • Optional args:
    • class_names: List of class names. tensor.info.class_names will be set to this list.

    • dtype: Defaults to uint32.

    • sample_compression or chunk_compression.

  • Supported compressions:

>>> ["lz4"]

You can also choose to set the class names after tensor creation.

>>> ds.masks.info.update(class_names = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle"])

Note

Since segmentation masks often contain large amounts of data, it is recommended to compress them using lz4.

Appending segmentation masks

  • Segmentation masks can be appended as np.ndarray.

Examples

>>> ds.masks.append(np.zeros((512, 512)))

Note

Since each pixel can only be labeled once, segmentation masks are not appropriate for datasets where objects might overlap, or where multiple objects within the same class must be distinguished. For these use cases, please use htype = “binary_mask”.

Binary Mask Htype

  • Sample dimensions: (height, width, # objects in a sample)

Binary masks are similar to segmentation masks, except that each object is represented by a channel in the mask. Each channel in the mask encodes values for a single object. A pixel in a mask channel should have a value of 1 if the pixel of the image belongs to this object and 0 otherwise. The labels corresponding to the channels should be stored in an adjacent tensor of htype class_label, in which the number of labels at a given index is equal to the number of objects (number of channels) in the binary mask.
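The relationship described above can be sketched as a conversion from a segmentation mask to a binary mask plus its matching class_label sample. This is an illustrative helper, not a Deep Lake function; it assumes each distinct non-background value in the segmentation mask becomes one channel:

```python
import numpy as np

def to_binary_mask(segment_mask, class_names, background=0):
    """Split a (H, W) segmentation mask into (H, W, # objects) channels."""
    values = [v for v in np.unique(segment_mask) if v != background]
    # One boolean channel per object value.
    channels = np.stack([segment_mask == v for v in values], axis=-1)
    # The labels tensor sample: one class name per channel.
    labels = [class_names[v] for v in values]
    return channels.astype(bool), labels

seg = np.array([[0, 1], [2, 2]])
mask, labels = to_binary_mask(seg, ["background", "bicycle", "bird"])
# mask.shape == (2, 2, 2); labels == ["bicycle", "bird"]
```

Note that a real dataset with overlapping objects cannot be produced this way, since a segmentation mask assigns each pixel to only one object; that limitation is exactly why the binary_mask htype exists.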

Creating a binary_mask tensor

A binary_mask tensor can be created using

>>> ds.create_tensor("masks", htype="binary_mask", sample_compression="lz4")
  • Optional args:
    • sample_compression or chunk_compression

    • dtype: Defaults to bool.

  • Supported compressions:

>>> ["lz4"]

Note

Since binary masks often contain large amounts of data, it is recommended to compress them using lz4.

Appending binary masks

  • Binary masks can be appended as np.ndarray.

Examples

Appending a binary mask with 5 objects

>>> ds.masks.append(np.zeros((512, 512, 5), dtype="bool"))
>>> ds.labels.append(["aeroplane", "aeroplane", "bottle", "bottle", "bird"])

COCO Keypoints Htype

  • Sample dimensions: (3 x # keypoints, # objects in a sample)

COCO keypoints are a convention for storing points of interest in an image. Each keypoint consists of 3 values: x - coordinate, y - coordinate and v - visibility. A set of K keypoints of an object is represented as:

[x1, y1, v1, x2, y2, v2, …, xk, yk, vk]

The visibility v can be one of three values:

  • 0: keypoint not in image.

  • 1: keypoint in image but not visible.

  • 2: keypoint in image and visible.

Creating a keypoints_coco tensor

A keypoints_coco tensor can be created using

>>> ds.create_tensor("keypoints", htype="keypoints_coco", keypoints=["knee", "elbow", "head"], connections=[[0, 1], [1, 2]])
  • Optional args:
    • keypoints: List of strings describing the i-th keypoint. tensor.info.keypoints will be set to this list.

    • connections: List of pairs of keypoint indices that should be connected by lines in the visualizer.

    • sample_compression or chunk_compression

    • dtype: Defaults to int32.

  • Supported compressions:

>>> ["lz4"]

You can also choose to set keypoints and / or connections after tensor creation.

>>> ds.keypoints.info.update(keypoints = ['knee', 'elbow',...])
>>> ds.keypoints.info.update(connections = [[0,1], [2,3], ...])

Appending keypoints

  • Keypoints can be appended as np.ndarray or list.

Examples

Appending keypoints sample with 3 keypoints and 4 objects

>>> ds.keypoints.info.update(keypoints = ["left ear", "right ear", "nose"])
>>> ds.keypoints.info.update(connections = [[0, 2], [1, 2]])
>>> kp_arr
array([[465, 398, 684, 469],
       [178, 363, 177, 177],
       [  2,   2,   2,   1],
       [454, 387, 646, 478],
       [177, 322, 137, 161],
       [  2,   2,   2,   2],
       [407, 379, 536, 492],
       [271, 335, 150, 143],
       [  2,   1,   2,   2]])
>>> kp_arr.shape
(9, 4)
>>> ds.keypoints.append(kp_arr)

Warning

In order to correctly use the keypoints and connections metadata, it is critical that all objects in every sample have the same K keypoints in the same order. Keypoints that are not present in an image can be stored with dummy coordinates of x = 0, y = 0, and v = 0; the visibility value of 0 prevents them from being drawn in the visualizer.
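The sample layout above (rows 3i through 3i+2 hold the x, y, v of keypoint i across all objects) can be assembled from per-object flat keypoint lists with a single transpose. This is an illustrative sketch, not a Deep Lake function:

```python
import numpy as np

def to_keypoints_sample(objects):
    """Build the (3 * K, # objects) array from per-object flat lists.

    Each object is a flat list in [x1, y1, v1, ..., xK, yK, vK] order
    and contributes one column of the resulting sample.
    """
    return np.array(objects, dtype=np.int32).T

# Two objects, each with the same 3 keypoints in the same order.
obj_a = [465, 178, 2, 454, 177, 2, 407, 271, 2]
obj_b = [398, 363, 2, 387, 322, 2, 379, 335, 1]
sample = to_keypoints_sample([obj_a, obj_b])   # shape (9, 2)
```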

Point Htype

  • Sample dimensions: (# points, 2) for 2-D (X, Y) coordinates or (# points, 3) for 3-D (X, Y, Z) coordinates.

The point htype does not assume a fixed mapping across samples between the point order and real-world objects (i.e., that point 0 is an elbow, point 1 is a knee, etc.). If you require such a mapping, use the COCO Keypoints htype.

Creating a point tensor

A point tensor can be created using

>>> ds.create_tensor("points", htype="point", sample_compression=None)
  • Supported compressions:

>>> ["lz4"]

Appending point samples

  • Points can be appended as np.ndarray or list.

Examples

Appending 2 2-D points

>>> ds.points.append([[0, 1], [1, 3]])

Appending 2 3-D points

>>> ds.points.append(np.zeros((2, 3)))

Polygon Htype

  • Sample dimensions: (# polygons, # points per polygon, # co-ordinates per point)

  • Each sample in a tensor of polygon htype is a list of polygons.

  • Each polygon is a list / array of points.

  • All points in a sample should have the same number of co-ordinates (e.g., cannot mix 2-D points with 3-D points).

  • Different samples can have different numbers of polygons.

  • Different polygons can have different numbers of points.
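The ragged structure described above can be validated before appending. check_polygon_sample is a hypothetical helper, not part of the Deep Lake API; it enforces the one constraint that must hold (uniform point dimensionality) while allowing everything that may vary:

```python
import numpy as np

def check_polygon_sample(sample):
    """Validate a polygon sample: list of polygons, each a list of points.

    Polygons may have different numbers of points, but every point in the
    sample must have the same number of coordinates (all 2-D or all 3-D).
    """
    dims = {len(point) for polygon in sample for point in polygon}
    if len(dims) != 1:
        raise ValueError(f"mixed point dimensionalities: {sorted(dims)}")
    return [np.asarray(polygon) for polygon in sample]

polys = check_polygon_sample([[(1, 2), (2, 3), (3, 4)], [(10, 12), (14, 19)]])
# polys[0].shape == (3, 2), polys[1].shape == (2, 2)
```

Mixing a 2-D point with a 3-D point in the same sample raises ValueError.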

Creating a polygon tensor

A polygon tensor can be created using

>>> ds.create_tensor("polygons", htype="polygon", sample_compression=None)
  • Supported compressions:

>>> ["lz4"]

Appending polygons

  • Polygons can be appended as a list of list of tuples or np.ndarray.

Examples

Appending polygons with 2-D points

>>> poly1 = [(1, 2), (2, 3), (3, 4)]
>>> poly2 = [(10, 12), (14, 19)]
>>> poly3 = [(33, 32), (54, 67), (67, 43), (56, 98)]
>>> sample = [poly1, poly2, poly3]
>>> ds.polygons.append(sample)

Appending polygons with 3-D points

>>> poly1 = [(10, 2, 9), (12, 3, 8), (12, 10, 4)]
>>> poly2 = [(10, 1, 8), (5, 17, 11)]
>>> poly3 = [(33, 33, 31), (45, 76, 13), (60, 24, 17), (67, 87, 83)]
>>> sample = [poly1, poly2, poly3]
>>> ds.polygons.append(sample)

Appending polygons with numpy arrays

>>> import numpy as np
>>> sample = np.random.randint(0, 10, (5, 7, 2))  # 5 polygons with 7 points each
>>> ds.polygons.append(sample)
>>> import numpy as np
>>> poly1 = np.random.randint(0, 10, (5, 2))
>>> poly2 = np.random.randint(0, 10, (8, 2))
>>> poly3 = np.random.randint(0, 10, (3, 2))
>>> sample = [poly1, poly2, poly3]
>>> ds.polygons.append(sample)

Nifti Htype

  • Sample dimensions: (height, width, # slices) or (height, width, # slices, # time units) in case of time-series data.

Creating a nifti tensor

A nifti tensor can be created using

>>> ds.create_tensor("patients", htype="nifti", sample_compression="nii.gz")
  • Supported compressions:

>>> ["nii.gz", "nii", None]

Appending nifti data

  • Nifti samples can be of type np.ndarray or Sample which is returned by deeplake.read().

  • Deep Lake does not support compression of raw nifti data. Therefore, raw arrays can only be appended to tensors with None compression.

Examples

>>> ds.patients.append(deeplake.read("data/patient0.nii.gz"))
>>> ds.patients.extend([deeplake.read(f"data/patient{i}.nii.gz") for i in range(10)])

Point Cloud Htype

  • Sample dimensions: (# num_points, 3)

  • Point cloud samples can be of type np.ndarray or Sample which is returned by deeplake.read().

  • Each point cloud is a list / array of points.

  • All points in a sample should have the same number of co-ordinates.

  • Different point clouds can have different numbers of points.

Creating a point cloud tensor

A point cloud tensor can be created using

>>> ds.create_tensor("point_clouds", htype="point_cloud", sample_compression="las")
  • Supported compressions:

>>> [None, "las"]

Appending point clouds

  • Point clouds can be appended as a np.ndarray.

Examples

Appending point clouds with numpy arrays

>>> import numpy as np
>>> point_cloud1 = np.random.randint(0, 10, (5, 3))
>>> ds.point_clouds.append(point_cloud1)
>>> point_cloud2 = np.random.randint(0, 10, (15, 3))
>>> ds.point_clouds.append(point_cloud2)
>>> ds.point_clouds.shape
(2, None, 3)

Alternatively, samples can be added using deeplake.read()

>>> import deeplake as dp
>>> sample = dp.read("example.las")  # point cloud with 100 points
>>> ds.point_cloud.append(sample)
>>> ds.point_cloud.shape
(1, 100, 3)

Mesh Htype

  • Sample dimensions: (# num_points, 3)

  • Mesh samples can be of type np.ndarray or Sample which is returned by deeplake.read().

  • Each sample in a tensor of mesh htype is a mesh array (3-D object data).

  • Each mesh is a list / array of points.

  • Different meshes can have different number of points.

Creating a mesh tensor

A mesh tensor can be created using

>>> ds.create_tensor("mesh", htype="mesh", sample_compression="ply")
  • Supported compressions:

>>> ["ply"]

Appending meshes

Examples

Appending a ply file containing mesh data to the tensor

>>> import deeplake as dp
>>> sample = dp.read("example.ply")  # mesh with 100 points and 200 faces
>>> ds.mesh.append(sample)
>>> ds.mesh.shape
(1, 100, 3)

Sequence htype

  • A special meta htype for tensors where each sample is a sequence. The items in the sequence are samples of another htype.

  • It is a wrapper htype that can wrap other htypes like sequence[image], sequence[video], sequence[text], etc.

Examples

>>> ds.create_tensor("seq", htype="sequence")
>>> ds.seq.append([1, 2, 3])
>>> ds.seq.append([4, 5, 6])
>>> ds.seq.numpy()
array([[[1],
        [2],
        [3]],
       [[4],
        [5],
        [6]]])
>>> ds.create_tensor("image_seq", htype="sequence[image]", sample_compression="jpg")
>>> ds.image_seq.append([deeplake.read("img01.jpg"), deeplake.read("img02.jpg")])