Agent Memory (Evolving Brain)¶
Standard RAG pipelines handle static documents well but often fail at user session state and long-term learning. By using Deeplake as a persistent memory layer, agents can "remember" past decisions, tool outputs, and user preferences across thousands of turns.
Objective¶
Build a persistent memory store that allows an agent to retrieve relevant historical traces using Hybrid Search (Vector + BM25) to inform its next action.
Prerequisites¶
- Deeplake SDK:
- Deeplake SDK (`pip install deeplake`) and an agent framework (`pip install pydantic-ai python-dotenv`) for the Python SDK tab
- `curl`, `jq`, and a terminal for the REST API tab
- A Deeplake API token.
- An OpenRouter API key for embeddings.
Scenario: Multi-Agent Debugging Memory¶
A team of SWE agents works on a shared codebase. When an agent encounters a bug, it searches the team's collective memory for similar past issues, including error traces, tool outputs, and resolutions, before deciding how to fix it. This turns every debugging session into institutional knowledge.
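The store-then-recall rhythm of this scenario can be sketched as a small loop. The function names `search_memory`, `attempt_fix`, and `store_trace` are hypothetical placeholders for the concrete retrieval, fixing, and ingestion steps shown in the complete code:

```python
def handle_bug(bug_description, search_memory, attempt_fix, store_trace):
    """Minimal recall-then-act loop for the debugging scenario (illustrative).

    Each callable is a stand-in: `search_memory` runs hybrid search over the
    team memory, `attempt_fix` adapts a known fix if one was recalled, and
    `store_trace` writes the session back so it becomes shared knowledge.
    """
    prior = search_memory(bug_description)      # recall similar past issues
    fix = attempt_fix(bug_description, prior)   # adapt a known fix, or start fresh
    store_trace(bug_description, fix)           # every session becomes memory
    return fix
```

The key design point is the last step: memory is written unconditionally, so even novel fixes enrich the store for the next agent.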
Set credentials first: export `DEEPLAKE_API_KEY` and `OPENROUTER_API_KEY` as environment variables before running either tab (see the quickstart).
Complete Code¶
import os
import json
import requests
from datetime import datetime
from deeplake import Client
# --- Embedding via OpenRouter (google/gemini-embedding-001, ctx 20K) ---
OPENROUTER_API_KEY = os.environ["OPENROUTER_API_KEY"]
def embed(texts):
    """Generate embeddings using google/gemini-embedding-001 via OpenRouter."""
    res = requests.post(
        "https://openrouter.ai/api/v1/embeddings",
        headers={"Authorization": f"Bearer {OPENROUTER_API_KEY}"},
        json={"model": "google/gemini-embedding-001", "input": texts},
        timeout=60,
    )
    res.raise_for_status()  # surface auth/quota errors instead of a KeyError below
    return [item["embedding"] for item in res.json()["data"]]
# 1. Setup Client (workspace per agent team is recommended)
client = Client()
# 2. Store a debugging trace after resolving an issue
task = "Fix OOM crash in data pipeline when processing 50GB parquet files"
outcome = (
    "Root cause: pandas.read_parquet() loads entire file into memory. "
    "Fix: switched to pyarrow.dataset with batch_size=10000 and row-group "
    "filtering. Peak RAM dropped from 48GB to 3.2GB. Also added "
    "memory_profiler guard to CI to catch regressions."
)
error_trace = "MemoryError: Unable to allocate 47.3 GiB, triggered in ETL step 3 (transform_parquet)"
tools_used = "pyarrow.dataset, memory_profiler, htop"
metadata = {
    "agent_id": "swe-01",
    "repo": "data-pipeline",
    "error_type": "MemoryError",
    "files_changed": ["etl/transform.py", "ci/memory_guard.yml"],
    "resolution_time_min": 45,
}
task_emb = embed([task])[0]
client.ingest("debug_memory", {
    "task": [task],
    "outcome": [outcome],
    "error_trace": [error_trace],
    "tools_used": [tools_used],
    "metadata": [json.dumps(metadata)],
    "timestamp": [datetime.now().isoformat()],
    "embedding": [task_emb],
})
# 3. Later: a different agent hits a similar issue and searches the team memory
new_bug = "ETL job killed by OOM when loading large CSV exports"
search_emb = embed([new_bug])[0]
emb_pg = "{" + ",".join(str(x) for x in search_emb) + "}"
results = client.query("""
    SELECT task, outcome, error_trace, tools_used, metadata,
           (embedding, task)::deeplake_hybrid_record <#>
           deeplake_hybrid_record($1::float4[], $2, 0.5, 0.5) AS score
    FROM debug_memory
    ORDER BY score DESC
    LIMIT 3
""", (emb_pg, new_bug))
# 4. Use the recalled memory to inform the fix
if results:
    best = results[0]
    print(f"Similar past issue: {best['task']}")
    print(f"Resolution: {best['outcome']}")
    print(f"Tools that helped: {best['tools_used']}")
    # The agent now has context: "last time we hit OOM on large files,
    # we switched to streaming reads with pyarrow.dataset"
else:
    print("No prior experience found. Starting from scratch.")
# Requires: export DEEPLAKE_API_KEY="..." (see quickstart)
# Requires: export DEEPLAKE_ORG_ID="your-org-id"
# Requires: export DEEPLAKE_WORKSPACE="your-workspace-name"
# Requires: export OPENROUTER_API_KEY="..."
API_URL="https://api.deeplake.ai"
# --- Helper: get embeddings via OpenRouter (google/gemini-embedding-001) ---
embed() {
    curl -s "https://openrouter.ai/api/v1/embeddings" \
        -H "Authorization: Bearer $OPENROUTER_API_KEY" \
        -H "Content-Type: application/json" \
        -d "{\"model\": \"google/gemini-embedding-001\", \"input\": $1}"
}
# 1. Create memory table
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
    -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
    -d '{
        "query": "CREATE TABLE IF NOT EXISTS \"'$DEEPLAKE_WORKSPACE'\".\"debug_memory\" (id BIGSERIAL PRIMARY KEY, task TEXT, outcome TEXT, error_trace TEXT, tools_used TEXT, metadata JSONB, embedding FLOAT4[]) USING deeplake"
    }'
# 2. Get embedding for the task description
TASK="Fix OOM crash in data pipeline when processing 50GB parquet files"
TASK_EMB=$(embed "[\"$TASK\"]" | jq -c '.data[0].embedding' | tr '[]' '{}')
# 3. Insert a debugging trace
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
    -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
    -d '{
        "query": "INSERT INTO \"'$DEEPLAKE_WORKSPACE'\".\"debug_memory\" (task, outcome, error_trace, tools_used, metadata, embedding) VALUES ($1, $2, $3, $4, $5::jsonb, $6::float4[])",
        "params": [
            "Fix OOM crash in data pipeline when processing 50GB parquet files",
            "Root cause: pandas.read_parquet() loads entire file into memory. Fix: switched to pyarrow.dataset with batch_size=10000.",
            "MemoryError: Unable to allocate 47.3 GiB",
            "pyarrow.dataset, memory_profiler, htop",
            "{\"agent\": \"swe-01\", \"error_type\": \"MemoryError\"}",
            "'"$TASK_EMB"'"
        ]
    }'
# 4. Hybrid Search: a different agent searches for similar past issues
SEARCH="ETL job killed by OOM when loading large CSV exports"
SEARCH_EMB=$(embed "[\"$SEARCH\"]" | jq -c '.data[0].embedding' | tr '[]' '{}')
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
    -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
    -d '{
        "query": "SELECT task, outcome, error_trace, tools_used, (embedding, task) <#> deeplake_hybrid_record($1::float4[], $2, 0.5, 0.5) AS score FROM \"'$DEEPLAKE_WORKSPACE'\".\"debug_memory\" ORDER BY score DESC LIMIT 3",
        "params": [
            "'"$SEARCH_EMB"'",
            "ETL job killed by OOM when loading large CSV exports"
        ]
    }'
Step-by-Step Breakdown¶
1. Rich Debugging Traces¶
Each memory entry stores not just "what happened" but the full debugging context: the error trace, which tools were used, which files were changed, and how long it took. This gives future agents actionable detail instead of vague summaries.
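As a sketch, a small helper can package one session into the column-wise record shape the complete code ingests. The helper name `build_trace` is hypothetical; the field names mirror the `debug_memory` schema used above:

```python
import json
from datetime import datetime, timezone

def build_trace(task, outcome, error_trace, tools_used, metadata, embedding):
    """Package one debugging session as a row for the memory table (sketch).

    Lists wrap each value because the ingest call in the example above
    takes column-wise data; `metadata` is serialized to a JSON string.
    """
    return {
        "task": [task],
        "outcome": [outcome],
        "error_trace": [error_trace],
        "tools_used": [tools_used],
        "metadata": [json.dumps(metadata)],  # keeps structured fields queryable
        "timestamp": [datetime.now(timezone.utc).isoformat()],
        "embedding": [embedding],
    }

record = build_trace(
    task="Fix OOM crash in data pipeline",
    outcome="Switched to pyarrow.dataset streaming reads",
    error_trace="MemoryError: Unable to allocate 47.3 GiB",
    tools_used="pyarrow.dataset, memory_profiler",
    metadata={"agent_id": "swe-01", "error_type": "MemoryError"},
    embedding=[0.1, 0.2, 0.3],  # placeholder; use embed() in practice
)
```

Centralizing record construction like this keeps every agent writing the same schema, which is what makes the memory searchable across the team.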
2. Cross-Agent Recall with Hybrid Search¶
When agent swe-02 hits an OOM error loading CSVs, it searches the team's memory. The hybrid query combines vector similarity (finding conceptually similar issues like "large file memory problems") with BM25 (matching exact terms like "OOM", "MemoryError"). This retrieves the right prior experience even when the error message or file format differs.
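Deeplake computes the hybrid score server-side via `deeplake_hybrid_record`. Purely to build intuition, here is a toy fusion of a vector score with a crude lexical score (a stand-in for BM25, not the real algorithm), using the same 0.5/0.5 weights as the query above:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def keyword_overlap(query, doc):
    """Crude lexical score: fraction of query terms present in the doc.
    A real engine would use BM25 (term frequency, document length, IDF)."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def hybrid_score(q_emb, d_emb, query, doc, w_vec=0.5, w_lex=0.5):
    """Weighted fusion of semantic and lexical relevance (illustrative)."""
    return w_vec * cosine(q_emb, d_emb) + w_lex * keyword_overlap(query, doc)
```

With equal weights, a memory that is only semantically close still scores, and a memory sharing exact error terms like "MemoryError" gets a lexical boost, which is why the hybrid query recalls the parquet OOM fix for a CSV OOM bug.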
3. Memory-Informed Decisions¶
The retrieved trace tells the agent: "last time we hit OOM on large files, we switched to pyarrow.dataset with streaming reads." Instead of starting from scratch, the agent can apply (or adapt) the known fix, cutting resolution time from 45 minutes to seconds.
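One way to hand the recalled trace to the agent is to format it as extra context for the next model call. The `memory_context` helper and its prompt wording below are illustrative, not part of any SDK:

```python
def memory_context(best):
    """Format one recalled trace as context for the agent's next LLM call.

    `best` is one row from the hybrid-search results, with the same keys
    selected by the query in the complete code above.
    """
    return (
        "A similar issue was solved before.\n"
        f"Past task: {best['task']}\n"
        f"Resolution: {best['outcome']}\n"
        f"Tools that helped: {best['tools_used']}\n"
        "Prefer adapting this known fix before exploring new approaches."
    )

best = {
    "task": "Fix OOM crash in data pipeline",
    "outcome": "Switched to pyarrow.dataset streaming reads",
    "tools_used": "pyarrow.dataset, memory_profiler",
}
context = memory_context(best)
```

Appending `context` to the system or user prompt biases the agent toward the known fix while still leaving it free to adapt when the new bug differs.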
What to try next¶
- Autonomous Agent Store: an advanced version showing autonomous agent "existence" tracking.
- Hybrid RAG: how to feed these retrieved memories into an LLM prompt.
- Querying reference: stream millions of memory traces for batch analysis.