Content Cloning & Auto-Posting Agent

Viral content agents in 2026 detect trending videos (TikTok/YouTube), rewrite scripts using LLMs, and auto-post to multiple platforms. Deeplake acts as the "viral memory," storing multimodal assets and engagement feedback so the agent can learn what works.

Objective

Build a persistent store for a content agent that tracks viral videos, stores rewritten scripts, and maintains a "cross-post registry" to avoid duplicate posts.
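A cross-post registry can be as simple as per-row platform flags. A minimal sketch, assuming the `platforms_posted` shape used later in the Complete Code (`platforms_to_post` and `targets` are illustrative helpers, not part of the Deeplake SDK):

```python
def platforms_to_post(record, targets=("X", "IG", "TikTok")):
    """Return the platforms a stored record has not been posted to yet."""
    posted = record.get("platforms_posted", {})
    return [p for p in targets if not posted.get(p, False)]

record = {"platforms_posted": {"X": True, "IG": False, "TikTok": True}}
print(platforms_to_post(record))  # → ['IG']
```

Before each auto-post, the agent consults this registry instead of re-posting everywhere blindly.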

Prerequisites

  • Deeplake SDK: pip install deeplake
  • AI tools: pip install torch transformers accelerate
  • System dependency: ffmpeg (sudo apt-get install ffmpeg)
  • A Deeplake API token.
  • An OpenRouter API key for embeddings.

Set credentials first

export DEEPLAKE_API_KEY="your-token-here"
export DEEPLAKE_WORKSPACE="your-workspace"  # optional, defaults to "default"

Complete Code

import os
import requests
from deeplake import Client

# --- Embedding via OpenRouter (openai/text-embedding-3-large, ctx 8K) ---
OPENROUTER_API_KEY = os.environ["OPENROUTER_API_KEY"]

def embed(texts):
    """Generate embeddings using OpenAI text-embedding-3-large via OpenRouter."""
    res = requests.post(
        "https://openrouter.ai/api/v1/embeddings",
        headers={"Authorization": f"Bearer {OPENROUTER_API_KEY}"},
        json={"model": "openai/text-embedding-3-large", "input": texts},
        timeout=30,
    )
    res.raise_for_status()  # fail fast on auth or quota errors
    return [item["embedding"] for item in res.json()["data"]]

# 1. Setup
client = Client()

# 2. Log a Viral Discovery
# We store the original video, the rewritten caption, and platform-specific metadata
caption = "POV: You switched to a GPU-native DB"
rewritten = "Why your AI agent needs a better hard drive. #DeepLake #AI"
caption_emb = embed([caption])[0]

print("Logging viral content discovery...")
client.ingest("viral_memories", {
    "original_caption": [caption],
    "rewritten_caption": [rewritten],
    "platforms_posted": [{"X": True, "IG": False, "TikTok": True}],
    "viral_score": [98.5],
    "embedding": [caption_emb],
})

# 3. Avoid Duplicates: Search for similar content
# Before posting, the agent checks if it has already processed a similar concept
query_text = "GPU-native database for agents"
query_emb = embed([query_text])[0]

# Format the query embedding as a Postgres-style array literal: "{0.1,0.2,...}"
emb_pg = "{" + ",".join(str(x) for x in query_emb) + "}"
duplicates = client.query("""
    SELECT original_caption, rewritten_caption, platforms_posted,
           embedding <#> $1::float4[] AS score
    FROM viral_memories
    ORDER BY score DESC
    LIMIT 1
""", (emb_pg,))

if duplicates and duplicates[0].get("score", 0) > 0.9:
    print(f"Skipping: Already posted similar content: {duplicates[0]['original_caption']}")
else:
    print("Content is unique. Proceeding with auto-post loop...")

# 4. Performance Feedback Loop
# Update the viral score after 24 hours based on real-world engagement
client.query("""
    UPDATE viral_memories
    SET viral_score = 1500.0
    WHERE original_caption LIKE 'POV: You switched%'
""")
Shell (curl) variant

# Requires: export DEEPLAKE_API_KEY="..." (see quickstart)
# Requires: export DEEPLAKE_ORG_ID="your-org-id"
API_URL="https://api.deeplake.ai"
DEEPLAKE_WORKSPACE="${DEEPLAKE_WORKSPACE:-content-agent-01}"  # fall back if not exported

# --- Embedding via OpenRouter (openai/text-embedding-3-large, ctx 8K) ---
# Requires: export OPENROUTER_API_KEY="..."

# Helper: get embedding from OpenRouter
get_embedding() {
  curl -s "https://openrouter.ai/api/v1/embeddings" \
    -H "Authorization: Bearer $OPENROUTER_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"openai/text-embedding-3-large\", \"input\": [\"$1\"]}"
}

# 1. Create viral memories table
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "CREATE TABLE IF NOT EXISTS \"'$DEEPLAKE_WORKSPACE'\".\"viral_memories\" (id BIGSERIAL PRIMARY KEY, original_caption TEXT, rewritten_caption TEXT, platforms_posted JSONB, viral_score FLOAT4, embedding FLOAT4[]) USING deeplake"
  }'

# 2. Log a viral discovery with real embeddings
CAPTION="POV: You switched to a GPU-native DB"
REWRITTEN="Why your AI agent needs a better hard drive. #DeepLake #AI"
CAPTION_EMB=$(get_embedding "$CAPTION" | jq -c '.data[0].embedding' | tr '[]' '{}')


curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "INSERT INTO \"'$DEEPLAKE_WORKSPACE'\".\"viral_memories\" (original_caption, rewritten_caption, platforms_posted, viral_score, embedding) VALUES ($1, $2, $3::jsonb, 98.5, $4::float4[])",
    "params": [
      "'"$CAPTION"'",
      "'"$REWRITTEN"'",
      "{\"X\": true, \"IG\": false, \"TikTok\": true}",
      "'"$CAPTION_EMB"'"
    ]
  }'

# 3. Duplicate check via vector search
SEARCH_TEXT="GPU-native database for agents"
SEARCH_EMB=$(get_embedding "$SEARCH_TEXT" | jq -c '.data[0].embedding' | tr '[]' '{}')


curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "SELECT original_caption, rewritten_caption, embedding <#> $1::float4[] AS score FROM \"'$DEEPLAKE_WORKSPACE'\".\"viral_memories\" ORDER BY score DESC LIMIT 1",
    "params": ["'"$SEARCH_EMB"'"]
  }'

# 4. Feedback loop: update viral score
curl -s -X POST "$API_URL/workspaces/$DEEPLAKE_WORKSPACE/tables/query" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPLAKE_API_KEY" \
  -H "X-Activeloop-Org-Id: $DEEPLAKE_ORG_ID" \
  -d '{
    "query": "UPDATE \"'$DEEPLAKE_WORKSPACE'\".\"viral_memories\" SET viral_score = 1500.0 WHERE original_caption LIKE $1",
    "params": ["POV: You switched%"]
  }'

Step-by-Step Breakdown

1. Multimodal Viral Memory

Content agents deal with videos, images (thumbnails), and text (captions). Deeplake allows you to store all these in a single row. This is critical for Content Cloning, where you need to keep the source media perfectly synced with its AI-generated rewrites.
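To keep source media perfectly synced with its rewrites, everything for one discovery can go into a single ingest payload. A minimal sketch of assembling such a row (column names mirror the Complete Code; the `thumbnail` binary column is a hypothetical addition):

```python
def build_viral_record(caption, rewritten, platforms, score, embedding, thumbnail_bytes=b""):
    """Assemble one multimodal row: text, platform flags, score, vector, and media."""
    return {
        "original_caption": [caption],
        "rewritten_caption": [rewritten],
        "platforms_posted": [platforms],
        "viral_score": [score],
        "embedding": [embedding],
        "thumbnail": [thumbnail_bytes],  # hypothetical binary column for the source frame
    }

row = build_viral_record(
    "POV: You switched to a GPU-native DB",
    "Why your AI agent needs a better hard drive. #DeepLake #AI",
    {"X": True, "IG": False, "TikTok": True},
    98.5,
    [0.1, 0.2, 0.3],
)
```

Because the media and its rewrite live in the same row, a later `SELECT` always returns them together, never out of sync.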

2. Avoiding "Amnesia" & Duplicate Spam

Social media platforms penalize duplicate content. By using Hybrid Search on your viral_memories table, the agent can "recall" if it has already processed a similar video or used a similar caption style, ensuring high-quality, unique output.
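The Complete Code delegates similarity scoring to the server-side `<#>` operator; the underlying idea can be sketched client-side. A toy blend of keyword overlap and cosine similarity (the `alpha` weighting and 0.9 threshold are illustrative assumptions, not Deeplake's hybrid-search implementation):

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_score(query_text, caption, query_emb, caption_emb, alpha=0.5):
    """Blend keyword overlap with vector similarity; alpha weights the keyword part."""
    q_terms = set(query_text.lower().split())
    c_terms = set(caption.lower().split())
    kw = len(q_terms & c_terms) / max(len(q_terms), 1)
    return alpha * kw + (1 - alpha) * cosine(query_emb, caption_emb)

def is_duplicate(score, threshold=0.9):
    return score >= threshold
```

Keyword overlap catches near-verbatim captions that embeddings alone might score loosely, while the vector term catches paraphrases with no shared words.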

3. Feedback Loops for Learning

The agent can use SQL UPDATE statements to write engagement metrics (likes, shares) back into the same table. Over time, the agent can query its own history (ORDER BY viral_score DESC) to learn which caption styles or video topics perform best.
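The scoring side of this loop can be prototyped locally before wiring it to the UPDATE query. A sketch with a hypothetical engagement weighting (the 1.0/5.0/2.0 weights are illustrative assumptions, and `top_styles` mirrors ORDER BY viral_score DESC in plain Python):

```python
def engagement_score(likes, shares, comments):
    """Hypothetical weighting: shares signal reach more strongly than likes."""
    return likes * 1.0 + shares * 5.0 + comments * 2.0

def top_styles(history, k=3):
    """Rank past posts by viral_score, highest first."""
    return sorted(history, key=lambda r: r["viral_score"], reverse=True)[:k]

history = [
    {"rewritten_caption": "A", "viral_score": 1500.0},
    {"rewritten_caption": "B", "viral_score": 98.5},
    {"rewritten_caption": "C", "viral_score": 420.0},
]
```

The computed score is what the agent would write back via UPDATE after the 24-hour window, closing the feedback loop.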

What to try next