SQL Operators and Types¶
Deep Lake extends standard Postgres with operators and types for similarity search.
The <#> operator¶
The primary search operator. Its behavior depends on the column type:
| Left side | Right side | Behavior |
|---|---|---|
FLOAT4[] column |
FLOAT4[] literal |
Vector similarity (cosine-like) |
FLOAT4[][] column |
FLOAT4[][] literal |
Multi-vector MaxSim |
TEXT column |
TEXT literal |
BM25 keyword score |
(FLOAT4[], TEXT) tuple |
deeplake_hybrid_record(...) |
Hybrid BM25 + vector |
Vector similarity¶
SELECT *, embedding <#> ARRAY[0.1, 0.2, 0.3]::float4[] AS score
FROM "my_workspace"."my_table"
ORDER BY score DESC
LIMIT 10
Higher score = better match. Sort DESC.
Multi-vector (MaxSim)¶
SELECT *, embedding <#> ARRAY[ARRAY[0.1, 0.2], ARRAY[0.3, 0.4]]::float4[][] AS score
FROM "my_workspace"."my_table"
ORDER BY score DESC
LIMIT 10
Each query token is matched to the best-matching token in the item.
BM25¶
SELECT *, content <#> 'search terms here' AS score
FROM "my_workspace"."my_table"
ORDER BY score ASC
LIMIT 10
Lower score = better match. Sort ASC.
deeplake_hybrid_record¶
Combines a vector and a text query into a single hybrid search.
deeplake_hybrid_record(
ARRAY[0.1, 0.2, ...]::float4[], -- vector query
'search text', -- BM25 query
0.5, -- vector weight
0.5 -- BM25 weight
)
Used with a tuple on the left side:
SELECT *,
(embedding, content) <#> deeplake_hybrid_record(
ARRAY[0.1, 0.2, ...]::float4[],
'search text',
0.5, 0.5
) AS score
FROM "my_workspace"."my_table"
ORDER BY score ASC
LIMIT 10
Weights must be between 0 and 1. They control the balance between semantic and keyword matching.
Tensor column types¶
| Type | Usage | Example |
|---|---|---|
FLOAT4[] |
Single embedding vector | ARRAY[0.1, 0.2, 0.3]::float4[] |
FLOAT4[][] |
Multi-vector embedding (bag of tokens) | ARRAY[ARRAY[0.1, 0.2], ARRAY[0.3, 0.4]]::float4[][] |
Formatting in Python¶
Single vector:
Multi-vector:
emb = [[0.1, 0.2], [0.3, 0.4]]
inner = ", ".join("ARRAY[" + ",".join(str(v) for v in row) + "]" for row in emb)
literal = f"ARRAY[{inner}]::float4[][]"
Index syntax¶
Create a vector index¶
Create a BM25 index¶
CREATE INDEX idx_name ON "workspace"."table" USING deeplake_index (column) WITH (index_type = 'bm25');
Create an exact text index¶
CREATE INDEX idx_name ON "workspace"."table" USING deeplake_index (column) WITH (index_type = 'exact_text');
Note: the table must be created with USING deeplake for these indexes to work.
Table engine¶
Tables must use the Deep Lake storage engine and be schema-qualified with the workspace name:
CREATE TABLE "my_workspace"."my_table" (
id SERIAL PRIMARY KEY,
content TEXT,
embedding FLOAT4[]
) USING deeplake;
The USING deeplake clause is required for indexes (vector, BM25, exact text) to work. The workspace name serves as the Postgres schema.
JSONB operators¶
Standard Postgres JSONB operators work in filters:
| Operator | Meaning | Example |
|---|---|---|
->>'key' |
Get value as text | metadata->>'source' = 'api' |
? |
Key exists | tags ? 'nature' |
@> |
Contains | metadata @> '{"version": 2}' |