Working with Videos¶
Deeplake handles videos as a native column type. You can ingest video files, access individual frames without decompressing the full video, and stream large videos directly from cloud storage.
Objective¶
Ingest video files into a managed table, add annotations, access frames by index, and query your video dataset.
Prerequisites¶
- pip install deeplake
- A Deeplake API token; set your credentials first.
Complete Code¶
from deeplake import Client
client = Client()
# Ingest videos from local files
client.ingest("my_videos", {
    "video": ["./videos/clip1.mp4", "./videos/clip2.mp4"],
    "label": ["traffic", "parking"],
})
# Query your videos
results = client.query("SELECT * FROM my_videos WHERE label = 'traffic'")
print(results)
# Access the underlying dataset for frame-level operations
ds = client.open_table("my_videos")
print(f"Dataset has {len(ds)} samples")
import deeplake
import numpy as np
# Create a dataset with video columns
ds = deeplake.create("s3://my-bucket/video-dataset")
ds.add_column("videos", deeplake.types.Video())
ds.add_column("labels", "text")
ds.add_column("boxes", deeplake.types.BoundingBox())
# Append video data
ds.append([{
    "videos": deeplake.read("./videos/clip1.mp4"),
    "labels": "traffic",
    "boxes": np.array([[10, 20, 100, 150]], dtype=np.float32),
}])
ds.commit("Added first video")
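The boxes column above stores raw float32 arrays; the coordinate convention (for example (x, y, w, h) versus corner pairs) is up to you, and Deeplake persists whatever values you write. A minimal sketch, assuming the (x, y, w, h) convention used in the example (the `to_corners` helper is hypothetical, not part of the Deeplake API):

```python
import numpy as np

# Hypothetical helper: convert (x, y, w, h) boxes to (x1, y1, x2, y2) corners.
def to_corners(boxes: np.ndarray) -> np.ndarray:
    corners = boxes.copy()
    corners[:, 2] = boxes[:, 0] + boxes[:, 2]  # x2 = x + w
    corners[:, 3] = boxes[:, 1] + boxes[:, 3]  # y2 = y + h
    return corners

boxes = np.array([[10, 20, 100, 150]], dtype=np.float32)
print(to_corners(boxes))  # [[ 10.  20. 110. 170.]]
```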
# Create a table with a VIDEO column
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "CREATE TABLE IF NOT EXISTS \"'$DEEPLAKE_WORKSPACE'\".\"my_videos\" (id SERIAL PRIMARY KEY, video VIDEO, label TEXT) USING deeplake"
  }'
Frame-Level Access¶
Deeplake decompresses only the frames you request, not the entire video:
ds = client.open_table("my_videos")
# Get video shape: (num_frames, height, width, channels)
print(ds["videos"][0].shape) # e.g. (400, 360, 640, 3)
# Access a range of frames; only these frames are decompressed
frames = ds["videos"][0, 100:200].numpy() # shape: (100, 360, 640, 3)
# Access with step (every 5th frame)
sampled = ds["videos"][0, 0:200:5].numpy() # shape: (40, 360, 640, 3)
# Single frame
last_frame = ds["videos"][0, -1].numpy() # shape: (360, 640, 3)
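The frame counts in the comments above follow ordinary Python slice semantics, so you can predict how many frames a request will decompress before issuing it. A quick sanity check in plain Python:

```python
# Number of frames a start:stop:step slice selects, per Python range rules.
def num_frames(start, stop, step=1):
    return len(range(start, stop, step))

print(num_frames(100, 200))   # 100 frames for [0, 100:200]
print(num_frames(0, 200, 5))  # 40 frames for [0, 0:200:5]
```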
Timestamps¶
Access presentation timestamps (in seconds) for precise temporal alignment:
# Get timestamps for a frame range
ts = ds["videos"][0, 10:15].timestamp
print(ts) # e.g. array([0.367, 0.400, 0.434, 0.467, 0.500])
# Get both frames and timestamps together
data = ds["videos"][0, 15:20].data()
print(data["frames"].shape) # (5, 360, 640, 3)
print(data["timestamps"]) # array of 5 timestamps
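Because timestamps come from the container's presentation clock, consecutive frames are spaced roughly 1/fps apart (about 0.033 s at 30 fps), though the exact values need not equal frame_index / fps. A small check against the sample values above:

```python
fps = 30.0
ts = [0.367, 0.400, 0.434, 0.467, 0.500]  # sample timestamps from above

# Successive presentation timestamps should differ by about one frame period.
deltas = [round(b - a, 3) for a, b in zip(ts, ts[1:])]
print(deltas)             # [0.033, 0.034, 0.033, 0.033]
print(round(1 / fps, 3))  # 0.033
```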
Video Metadata¶
info = ds["videos"][0].sample_info
print(info)
# {'duration': 13.33, 'fps': 30.0, 'format': 'mp4', ...}
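The metadata alone is enough to estimate the frame count without touching pixel data: duration times fps gives roughly the number of frames, matching the (400, 360, 640, 3) shape shown earlier. A sketch using the sample values above:

```python
info = {"duration": 13.33, "fps": 30.0, "format": "mp4"}  # values from the sample above

# Estimate the total frame count from container metadata alone.
est_frames = round(info["duration"] * info["fps"])
print(est_frames)  # 400
```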
Linked Videos¶
Instead of uploading video files, you can link to videos stored in your own cloud:
ds = deeplake.create("al://my-org/linked-videos", token="...")
ds.add_column("video_links", deeplake.types.Link(deeplake.types.Video()))
# Link to remote videos
ds.append([{
    "video_links": deeplake.link("s3://my-bucket/video1.mp4", creds_key="my_s3_creds"),
}])
ds.commit()
# Access works the same way; frames are streamed from the source
frames = ds["video_links"][0, 0:10].numpy()
Streaming¶
Videos larger than 16 MB are automatically streamed from storage: only the packets needed for the requested frames are fetched, so no full download is required. This works for both uploaded and linked videos.
What to try next¶
- Video Retrieval: semantic search across video content.
- Dataloaders: stream video frames into PyTorch/TensorFlow training loops.
- Authentication: set up your API token.