Semantic Search¶
Semantic search finds documents by meaning rather than by exact keywords alone. Deeplake supports three search modes: vector similarity, BM25 keyword ranking, and hybrid search, which combines the two.
Objective¶
Build a searchable index of documents and demonstrate three search modes: Semantic (Vector), Keyword (BM25), and Hybrid.
Prerequisites¶
- Deeplake SDK installed: `pip install deeplake`, and Sentence Transformers: `pip install sentence-transformers` (Python SDK tab)
- `curl` and a terminal (REST API tab)
- A Deeplake API token.
Set your API credentials before running the examples (see the quickstart).
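Before making any API calls it helps to fail fast on missing credentials. A minimal sketch (the `require_env` helper is hypothetical, not part of the SDK):

```python
import os

def require_env(*names: str) -> list[str]:
    """Return the names of any required environment variables that are unset."""
    return [n for n in names if not os.environ.get(n)]

# Example: check before constructing the client.
missing = require_env("DEEPLAKE_API_KEY", "DEEPLAKE_ORG_ID")
if missing:
    print(f"Set these before running: {', '.join(missing)}")
```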
Complete Code¶
import json
from pathlib import Path
from deeplake import Client
from sentence_transformers import SentenceTransformer
# 1. Setup
client = Client()
model = SentenceTransformer("Qwen/Qwen3-Embedding-8B")
# 2. Prepare Data (simulated loading from local directory)
# data_dir = Path("./my_docs")
# documents = [f.read_text() for f in data_dir.glob("*.txt")]
documents = [
"Deeplake is a GPU-native database optimized for AI datasets.",
"PostgreSQL with pg_deeplake supports BM25 and vector search.",
"Autonomous agents need persistent memory to avoid amnesia."
]
# 3. Generate Embeddings (real model call)
print(f"Generating embeddings for {len(documents)} documents...")
embeddings = model.encode(documents).tolist()
# 4. Ingest: Creates table automatically and inserts data
# client.ingest() handles schema inference and batch insertion
client.ingest("knowledge_base", {
"text": documents,
"embedding": embeddings,
"metadata": [json.dumps({"source": "manual_input", "id": i}) for i in range(len(documents))]
})
# 5. Hybrid Search (Semantic + Keyword)
# We combine vector similarity (0.7) and BM25 ranking (0.3)
query_text = "how does deep lake help AI agents?"
query_emb = model.encode(query_text).tolist()
emb_pg = "{" + ",".join(str(x) for x in query_emb) + "}"
results = client.query("""
SELECT text, metadata,
(embedding, text)::deeplake_hybrid_record <#>
deeplake_hybrid_record($1::float4[], $2, 0.7, 0.3) AS score
FROM knowledge_base
ORDER BY score DESC
LIMIT 5
""", (emb_pg, query_text))
for r in results:
    print(f"[{r['score']:.4f}] {r['text']}")
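The `emb_pg` string in step 5 converts the embedding list into a Postgres `float4[]` array literal. As a standalone helper (a sketch; the SDK does not ship this function):

```python
def to_pg_array(values: list[float]) -> str:
    """Format a Python list as a Postgres array literal, e.g. {0.1,0.2,0.3}."""
    return "{" + ",".join(str(float(v)) for v in values) + "}"

print(to_pg_array([0.1, 0.2, 0.3]))  # {0.1,0.2,0.3}
```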
# Requires: export DEEPLAKE_API_KEY="..." (see quickstart)
# Requires: export DEEPLAKE_ORG_ID="your-org-id"
# Requires: export DEEPLAKE_WORKSPACE="your-workspace"
API_URL="https://api.deeplake.ai"
# 1. Create table with vector support
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $DEEPLAKE_API_KEY" \
-H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
-d '{
"query": "CREATE TABLE IF NOT EXISTS \"'$DEEPLAKE_WORKSPACE'\".\"knowledge_base\" (id BIGSERIAL PRIMARY KEY, text TEXT, embedding FLOAT4[], metadata JSONB) USING deeplake"
}'
# 2. Insert a document with embedding and metadata
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $DEEPLAKE_API_KEY" \
-H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
-d '{
"query": "INSERT INTO \"'$DEEPLAKE_WORKSPACE'\".\"knowledge_base\" (text, embedding, metadata) VALUES ($1, $2::float4[], $3::jsonb)",
"params": ["AI Agent Memory", "{0.1,0.2,0.3}", "{\"source\": \"api\"}"]
}'
# 3. Hybrid Search (vector + BM25)
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $DEEPLAKE_API_KEY" \
-H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
-d '{
"query": "SELECT text, (embedding, text) <#> deeplake_hybrid_record($1::float4[], $2, 0.7, 0.3) AS score FROM \"'$DEEPLAKE_WORKSPACE'\".\"knowledge_base\" ORDER BY score DESC LIMIT 5",
"params": ["{0.12,0.22,0.32}", "memory"]
}'
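The same hybrid-search request can be issued from Python instead of `curl`. The sketch below only builds the JSON body for the `/tables/query` endpoint shown above (no network call; the workspace name `"my-workspace"` is a placeholder):

```python
import json

def build_hybrid_query(workspace: str, embedding: list[float], text: str,
                       vec_w: float = 0.7, txt_w: float = 0.3) -> dict:
    """Build the JSON body for a hybrid-search POST to /tables/query."""
    emb_literal = "{" + ",".join(str(x) for x in embedding) + "}"
    return {
        "query": (
            f"SELECT text, (embedding, text) <#> "
            f"deeplake_hybrid_record($1::float4[], $2, {vec_w}, {txt_w}) AS score "
            f'FROM "{workspace}"."knowledge_base" ORDER BY score DESC LIMIT 5'
        ),
        "params": [emb_literal, text],
    }

payload = build_hybrid_query("my-workspace", [0.12, 0.22, 0.32], "memory")
print(json.dumps(payload, indent=2))
```

Send `payload` as the `-d` body with the same `Authorization` and `X-Activeloop-Org-Id` headers as the curl examples.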
Step-by-Step Breakdown¶
1. Vector Search¶
Uses the <#> operator on an EMBEDDING column to compute cosine similarity between the query vector and each stored vector. Higher scores mean closer matches, so results are sorted with ORDER BY score DESC.
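To make the vector score concrete, here is cosine similarity computed locally in plain Python (illustration only; the engine computes this server-side):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0 (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```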
2. BM25 Search¶
Uses the <#> operator on a TEXT column. BM25 scores relevance based on keyword frequency and document length, and assigns higher scores to more relevant documents, so we again use ORDER BY score DESC for the best matches.
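The classic BM25 formula behind this ranking can be sketched in a few lines. This toy scorer uses the common defaults k1=1.5 and b=0.75 and naive whitespace tokenization; the engine's tokenizer and parameters may differ:

```python
import math
from collections import Counter

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with classic BM25."""
    tokenized = [d.lower().split() for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for t in tokenized if term in t)  # document frequency
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            f = tf[term]  # term frequency in this document
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(tokens) / avg_len))
        scores.append(score)
    return scores

docs = ["agents need persistent memory",
        "vector search with embeddings",
        "memory is key for agents"]
print(bm25_scores("memory agents", docs))  # docs 0 and 2 outrank doc 1
```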
3. Hybrid Search¶
Combines both signals using deeplake_hybrid_record(embedding, text, vector_weight, text_weight). This is ideal for balancing conceptual meaning with exact keyword matches (e.g., product names or error codes).
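Conceptually, the 0.7/0.3 weighting blends the two signals into a single score. The sketch below illustrates one such blend with min-max-normalized BM25; the engine's actual fusion inside deeplake_hybrid_record happens server-side and may normalize differently:

```python
def hybrid_score(vector_sim: float, bm25: float, bm25_max: float,
                 vec_w: float = 0.7, txt_w: float = 0.3) -> float:
    """Blend a cosine similarity in [0, 1] with a normalized BM25 score."""
    bm25_norm = bm25 / bm25_max if bm25_max > 0 else 0.0
    return vec_w * vector_sim + txt_w * bm25_norm

# A strong semantic match with weak keyword overlap...
semantic_heavy = hybrid_score(0.92, 0.5, 4.0)
# ...vs exact keyword hits with weaker semantics: keywords can still win.
keyword_heavy = hybrid_score(0.60, 4.0, 4.0)
print(semantic_heavy, keyword_heavy)
```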
4. Indexing¶
Indexes are created once with the USING deeplake_index clause. They are essential for low-latency retrieval on large datasets (1M+ rows).
What to try next¶
- Hybrid RAG - combine search with an LLM for answering questions.
- Video retrieval - search inside video clips using similar logic.
- Search guide - deep dive into all four search modes.