Tables

All table operations go through SQL. Deeplake executes it and returns the result.

Setup

Set DEEPLAKE_API_KEY and DEEPLAKE_WORKSPACE as environment variables (see Quickstart).

Set credentials first

export DEEPLAKE_API_KEY="your-token-here"
export DEEPLAKE_WORKSPACE="your-workspace"  # optional, defaults to "default"

# Python client
from deeplake import Client

client = Client()

# Shell variables used by the curl examples
API_URL="https://api.deeplake.ai"
TABLE="documents"
export DEEPLAKE_ORG_ID="your-org-id"

Create a table

The recommended way to create a table is client.ingest(), which creates the table and inserts data in one call. Schema is inferred automatically:

client.ingest("documents", {
    "title": ["First doc", "Second doc"],
    "content": ["Hello world", "Another document"],
})

For explicit schema control, pass schema=:

client.ingest("documents", {
    "title": ["First doc"],
    "content": ["Hello world"],
}, schema={"title": "TEXT", "content": "TEXT"})

See client.ingest() for all options (files, HuggingFace, chunking, formats).
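If your data starts out as a list of row dictionaries, it has to be pivoted into the column-oriented mapping used in the ingest examples above. The following is a minimal sketch of that pivot; rows_to_columns is a hypothetical helper name, not part of the Deeplake client, and the final ingest call is left as a comment because it needs live credentials.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row dicts into {column_name: [values...]} (hypothetical helper)."""
    if not rows:
        return {}
    names = list(rows[0].keys())
    for index, row in enumerate(rows):
        # Every row must supply exactly the same columns as the first one.
        if set(row) != set(names):
            raise ValueError(f"row {index} has different columns: {sorted(row)}")
    return {name: [row[name] for row in rows] for name in names}


rows = [
    {"title": "First doc", "content": "Hello world"},
    {"title": "Second doc", "content": "Another document"},
]
columns = rows_to_columns(rows)

# Assumption: the result is passed exactly like the literal dict shown above.
# client.ingest("documents", columns)
```

The helper rejects rows with missing or extra keys so that every column list has the same length.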

Advanced: raw SQL

You can also create tables with raw SQL. For indexes to work, the table name must be schema-qualified and the statement must include USING deeplake:

client.query("""
    CREATE TABLE IF NOT EXISTS "YOUR_WORKSPACE"."documents" (
        id SERIAL PRIMARY KEY,
        title TEXT,
        content TEXT,
        metadata JSONB,
        created_at TIMESTAMPTZ DEFAULT NOW()
    ) USING deeplake
""")

# Equivalent curl request
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "CREATE TABLE IF NOT EXISTS \"YOUR_WORKSPACE\".\"documents\" (id SERIAL PRIMARY KEY, title TEXT, content TEXT, metadata JSONB, created_at TIMESTAMPTZ DEFAULT NOW()) USING deeplake"
  }'
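The curl call above can also be assembled from Python with only the standard library, which lets json.dumps handle the escaping of the double quotes inside the SQL. This is a sketch under assumptions: build_query_request is a hypothetical helper, the endpoint, headers, and "query"/"params" body keys are copied from the curl examples on this page, and the request is only constructed, not sent.

```python
import json
import os
import urllib.request

# Same default as in the setup section; override with the API_URL env variable.
API_URL = os.environ.get("API_URL", "https://api.deeplake.ai")


def build_query_request(sql, params=None):
    """Build (but do not send) a POST to the tables/query endpoint."""
    workspace = os.environ.get("DEEPLAKE_WORKSPACE", "default")
    payload = {"query": sql}
    if params is not None:
        payload["params"] = list(params)
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {os.environ.get('DEEPLAKE_API_KEY', '')}",
        "X-Activeloop-Org-Id": os.environ.get("DEEPLAKE_ORG_ID", ""),
    }
    return urllib.request.Request(
        f"{API_URL}/workspaces/{workspace}/tables/query",
        data=json.dumps(payload).encode("utf-8"),  # JSON escaping handled here
        headers=headers,
        method="POST",
    )


request = build_query_request('SELECT * FROM "YOUR_WORKSPACE"."documents" LIMIT 10')
print(request.full_url)
print(request.data.decode("utf-8"))

# To actually send it (requires valid credentials):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```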

Insert rows

Single row

client.query("""
    INSERT INTO "YOUR_WORKSPACE"."documents" (title, content, metadata)
    VALUES ('First doc', 'Hello world', '{"source": "manual"}'::jsonb)
""")

# Equivalent curl request
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "INSERT INTO \"YOUR_WORKSPACE\".\"documents\" (title, content, metadata) VALUES ('\''First doc'\'', '\''Hello world'\'', '\''{\"source\": \"manual\"}'\''::jsonb)"
  }'

Eventual consistency

After INSERT, data may take a few seconds to become visible in SELECT queries. This is normal behavior for Deeplake tables.
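Code that reads its own writes can poll until the new row shows up instead of sleeping for a fixed time. The sketch below is one way to do that; wait_until is a hypothetical helper, and the check function here is a stand-in that becomes true on the third poll. In real use it would run a SELECT through the client and inspect the result.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Call check() repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a real visibility check such as "does a SELECT return the row yet?"
attempts = {"count": 0}


def row_is_visible() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3  # pretend the row appears on the third poll


visible = wait_until(row_is_visible, timeout=5.0, interval=0.01)
print(visible, attempts["count"])
```

The timeout bounds the wait, so a row that never appears results in False rather than an endless loop.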

Dynamic insert

Pass dynamic values as query parameters ($1, $2, ...) rather than interpolating them into the SQL string:

title = "My title"
content = "My content"
metadata = '{"source": "api"}'
client.query("""
    INSERT INTO "YOUR_WORKSPACE"."documents" (title, content, metadata)
    VALUES ($1, $2, $3::jsonb)
""", (title, content, metadata))

# Equivalent curl request
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "INSERT INTO \"YOUR_WORKSPACE\".\"documents\" (title, content, metadata) VALUES ($1, $2, $3::jsonb)",
    "params": ["My title", "My content", "{\"source\": \"api\"}"]
  }'
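In the example above the metadata parameter is a hand-written JSON string. When the metadata lives in a Python dict, json.dumps can produce that string, and values containing quotes need no manual escaping because they travel as parameters. A minimal sketch follows; insert_params is a hypothetical helper, and the client.query call is commented out because it needs a live connection.

```python
import json

INSERT_SQL = """
    INSERT INTO "YOUR_WORKSPACE"."documents" (title, content, metadata)
    VALUES ($1, $2, $3::jsonb)
"""


def insert_params(title: str, content: str, metadata: dict) -> tuple:
    """Return the positional parameters for INSERT_SQL (hypothetical helper)."""
    # The third parameter is cast with ::jsonb in the SQL, so send JSON text.
    return (title, content, json.dumps(metadata))


params = insert_params("My title", "My content", {"source": "api"})
tricky = insert_params("O'Reilly notes", 'He said "hi"', {"source": "api"})

# Assumption: parameters are passed as the second argument, as shown above.
# client.query(INSERT_SQL, params)
```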

Batch insert

for title, content in [("Doc A", "Content A"), ("Doc B", "Content B")]:
    client.query("""INSERT INTO "YOUR_WORKSPACE"."documents" (title, content) VALUES ($1, $2)""", (title, content))

Tip

For bulk inserts into a new table, client.ingest() is simpler and faster.

Insert multiple rows in a single query:

curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "INSERT INTO \"YOUR_WORKSPACE\".\"documents\" (title, content) VALUES ($1, $2), ($3, $4), ($5, $6)",
    "params": ["Doc A", "Content A", "Doc B", "Content B", "Doc C", "Content C"]
  }'
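The multi-row statement above numbers its placeholders consecutively across rows and flattens the values into one parameter list. A sketch of generating both from a list of rows is shown below; build_multirow_insert is a hypothetical helper, and it interpolates the table and column names directly, so those must come from trusted code rather than user input.

```python
from typing import Any, Sequence


def build_multirow_insert(
    table: str, columns: Sequence[str], rows: Sequence[Sequence[Any]]
) -> tuple[str, list[Any]]:
    """Return (sql, params) for a single INSERT covering every row."""
    if not rows:
        raise ValueError("rows must not be empty")
    width = len(columns)
    groups: list[str] = []
    params: list[Any] = []
    for row in rows:
        if len(row) != width:
            raise ValueError(f"expected {width} values per row, got {len(row)}")
        start = len(params) + 1  # placeholders are 1-based and keep counting
        placeholders = ", ".join(f"${i}" for i in range(start, start + width))
        groups.append(f"({placeholders})")
        params.extend(row)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES {', '.join(groups)}"
    return sql, params


sql, params = build_multirow_insert(
    '"YOUR_WORKSPACE"."documents"',
    ("title", "content"),
    [("Doc A", "Content A"), ("Doc B", "Content B"), ("Doc C", "Content C")],
)
print(sql)
print(params)
```

The resulting sql and params correspond to the "query" and "params" fields of the request body above.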

Query data

Select all

# Fluent API
results = client.table("documents").select("*").limit(10)()
for row in results:
    print(row)

# Or raw SQL
rows = client.query("SELECT * FROM documents LIMIT 10")

# Equivalent curl request
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{"query": "SELECT * FROM \"YOUR_WORKSPACE\".\"documents\" LIMIT 10"}'

Filter with WHERE

results = (
    client.table("documents")
        .select("title", "content")
        .where("created_at > '2025-01-01'")
        .order_by("created_at DESC")
        .limit(5)
)()

# Equivalent curl request
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "SELECT title, content FROM \"YOUR_WORKSPACE\".\"documents\" WHERE created_at > '\''2025-01-01'\'' ORDER BY created_at DESC LIMIT 5"
  }'

Count rows

result = client.query("SELECT COUNT(*) FROM documents")
print(result)

# Equivalent curl request
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{"query": "SELECT COUNT(*) FROM \"YOUR_WORKSPACE\".\"documents\""}'

Update rows

client.query("""
    UPDATE "YOUR_WORKSPACE"."documents"
    SET content = 'Updated content', metadata = '{"source": "api", "version": 2}'::jsonb
    WHERE title = 'First doc'
""")

# Equivalent curl request
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "UPDATE \"YOUR_WORKSPACE\".\"documents\" SET content = '\''Updated content'\'', metadata = '\''{\"source\": \"api\", \"version\": 2}'\''::jsonb WHERE title = '\''First doc'\''"
  }'
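SQL string literals use single quotes, and so does the shell argument that wraps the JSON body, which makes hand-written curl payloads like this one easy to get wrong. One way to avoid manual escaping is to let Python generate the -d argument: json.dumps escapes the double quotes for JSON and shlex.quote escapes the single quotes for a POSIX shell. This is a sketch; the UPDATE statement is a shortened version of the one above.

```python
import json
import shlex

sql = """UPDATE "YOUR_WORKSPACE"."documents" SET content = 'Updated content' WHERE title = 'First doc'"""

# Step 1: JSON-encode the request body (escapes the double quotes).
body = json.dumps({"query": sql})

# Step 2: quote the whole body as one POSIX shell argument (escapes the single quotes).
data_arg = shlex.quote(body)
print(f"-d {data_arg}")

# Round-trip check: splitting the quoted argument the way a POSIX shell would
# must give back the original JSON body as a single token.
recovered = shlex.split(data_arg)[0]
print(recovered == body)
```

The printed -d argument can be pasted after the header lines of the curl command.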

Delete rows

client.query("""DELETE FROM "YOUR_WORKSPACE"."documents" WHERE title = 'First doc'""")

# Equivalent curl request
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "DELETE FROM \"YOUR_WORKSPACE\".\"documents\" WHERE title = '\''First doc'\''"
  }'

Drop a table

client.drop_table("documents")

# Via SQL
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{"query": "DROP TABLE IF EXISTS \"YOUR_WORKSPACE\".\"documents\""}'

# Or via the REST endpoint
curl -s -X DELETE "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/$TABLE" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID"

List tables

tables = client.list_tables()
print(tables)

# Equivalent curl request
curl -s "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID"

For the full list of supported column types, see Data Types.

Next steps

  • Indexes: create vector, BM25, and exact text indexes
  • Search: query your tables with vector and hybrid search
  • Massive Ingestion: ingest large-scale datasets
  • Hybrid RAG: build a RAG pipeline on your tables