History & Tags

Overview

Every transaction made to your dataset is recorded in a history log that you can view.

Historical versions can also be opened and read using Version.open().

Viewing History

The history of your dataset can be viewed with deeplake.Dataset.history.

print(ds.history)
0000000000v00000000 2024-10-10 01:06:29 Dataset created
0000000001v3743f11e 2024-10-11 16:43:07 Added columns
0000000001v610e041a 2024-10-11 16:43:07 (no message)
0000000002v41f5deb5 2024-10-13 03:15:53 (no message)
0000000003v99a77328 2024-10-13 03:23:51 (no message)
0000000004va8d94874 2024-10-16 12:13:09 Inserted customer A data
0000000005vfa7f1259 2024-10-17 16:51:31 (no message)
0000000006ve6d9f6d1 2024-11-03 05:12:21 (no message)
0000000007vbe164edb 2024-11-15 18:13:05 (no message)
0000000008vecddc410 2024-11-15 18:22:11 Cleaned up invalid dates
0000000009v102f3ee8 2024-11-16 06:23:25 (no message)
0000000010v203482cc 2024-11-21 07:44:53 (no message)

In the example above, each line represents a transaction with a unique version hash, timestamp, and the optional message provided at commit time.

Note

The portion of the version hash before the v is not necessarily unique. Depending on the concurrency of commits, multiple transactions may share the same prefix.
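As a small illustration of the note above, the sketch below groups version ids by the portion before the v and reports any prefix shared by more than one transaction. It works on plain strings copied from the example listing rather than on a live dataset; group_by_prefix is a hypothetical helper, not part of deeplake.

```python
from collections import defaultdict

# Version ids copied from the first lines of the example history listing.
version_ids = [
    "0000000000v00000000",
    "0000000001v3743f11e",
    "0000000001v610e041a",
    "0000000002v41f5deb5",
]


def group_by_prefix(ids):
    """Map each prefix (the text before the first 'v') to the full ids sharing it."""
    groups = defaultdict(list)
    for version_id in ids:
        prefix, _, _suffix = version_id.partition("v")
        groups[prefix].append(version_id)
    return dict(groups)


groups = group_by_prefix(version_ids)

# Keep only prefixes that belong to more than one transaction.
shared = {prefix: ids for prefix, ids in groups.items() if len(ids) > 1}
print(shared)
# prints {'0000000001': ['0000000001v3743f11e', '0000000001v610e041a']}
```

Because a prefix can repeat, refer to a version by its full hash when looking it up.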

Opening Historical Versions

Recent historical versions can be opened with Version.open().

old_ds = ds.history["0000000002v41f5deb5"].open()

After opening a historical version, you can read data from it and query it as you would with the current version.

Note

You cannot modify an opened historical version because committed versions are immutable.

For performance reasons, historical versions are not kept on disk indefinitely. To ensure you can always reference a particular version, tag it with deeplake.Dataset.tag(), described below.

Tagging a Version

For points in history you want to refer to long-term, you can "tag" particular versions.

ds.tag("v1.0")

Tagging allows you to check out a specific version by a meaningful name rather than the version hash, and ensures the particular version can always be opened regardless of how old the version is.

v1 = ds.tags["v1.0"].open()
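The two calls above can be combined into one step. The following is a minimal sketch, assuming ds is an already opened dataset; tag_and_open is a hypothetical helper name, and it uses only the ds.tag() and ds.tags[...].open() calls shown in this section.

```python
def tag_and_open(ds, name):
    """Tag the dataset under `name`, then open the tagged version.

    `ds` is assumed to be an already opened dataset object.
    This helper is illustrative and not part of deeplake.
    """
    ds.tag(name)                 # record the tag, as in ds.tag("v1.0")
    return ds.tags[name].open()  # open the version by its tag name
```

For example, release = tag_and_open(ds, "v1.0") would give you a handle to the tagged version that stays openable regardless of its age.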

Next Steps