Cloudflare Agent Memory Beta 2026: How to Build AI Agents That Remember Across Sessions [Code Tutorial]

Step-by-step guide to building persistent memory AI agents using Cloudflare's Agent Memory service — code examples, five operations, and real-world implementation
Sk Jabedul Haque
May 5, 2026 | 5 min read

    Cloudflare Agent Memory is now in private beta. This managed service extracts information from agent conversations and makes it available when needed — without filling up your context window. You get persistent, retrievable memory across sessions with just five operations.

    What You Will Learn

    • The five core operations: ingest, remember, recall, list, and forget
    • How to integrate Agent Memory with Cloudflare Workers
    • Four memory types: facts, events, instructions, and tasks
    • Building an agent that remembers user preferences across sessions

    Why AI Agents Need Memory

    Agents, as they exist today, are ephemeral. They run for a session, tied to a single process, and then they are gone. A coding agent forgets what you asked it to build. A customer service bot forgets the user's preferences. Every conversation starts from zero.

    Within a session there is a related problem, often called context rot: traditional agents keep everything in the context window, so as a conversation grows it approaches the token limit and the agent starts losing track of important details. The solution isn't more context; it's smarter memory.

    Cloudflare Agent Memory solves this by moving memory out of the prompt entirely. Instead of keeping everything in context, it extracts useful information from conversations and stores it separately, making it available when needed without filling up the model's working window.

    Professional Recommendation

    Agent Memory is currently in private beta. You can request access through Cloudflare's developer documentation. The service integrates with the Cloudflare Agents SDK and is accessible via Worker bindings or REST API.

    The Five Core Operations

    Agent Memory exposes a simple API with five operations. Each operation handles a specific memory management task:

    • ingest: Extract and store memories from conversation turns automatically
    • remember: Store a single memory explicitly (direct tool use by the model)
    • recall: Retrieve relevant memories based on a query
    • list: List all stored memories for a session or user
    • forget: Remove specific memories that are no longer relevant
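
    To make the five operations concrete, here is a minimal in-memory stand-in you could use in unit tests. It is a sketch under stated assumptions, not the service's implementation: the class name InMemoryAgentMemory, the MemoryRecord shape, and the return types are hypothetical, and the parameter names simply mirror the ones used later in this tutorial. In particular, ingest here just stores user messages as facts, and recall uses naive keyword overlap.

```typescript
// Hypothetical shapes: field names mirror this tutorial, not a confirmed API.
type MemoryType = "facts" | "events" | "instructions" | "tasks";

interface MemoryRecord {
  id: string;
  sessionId: string;
  memoryType: MemoryType;
  content: string;
  createdAt: number; // epoch milliseconds
}

class InMemoryAgentMemory {
  private records: MemoryRecord[] = [];
  private nextId = 1;

  // remember: store one explicit memory
  async remember(input: {
    sessionId: string;
    memoryType: MemoryType;
    content: string;
  }): Promise<MemoryRecord> {
    const record: MemoryRecord = {
      id: `mem-${this.nextId++}`,
      createdAt: Date.now(),
      sessionId: input.sessionId,
      memoryType: input.memoryType,
      content: input.content,
    };
    this.records.push(record);
    return record;
  }

  // ingest: this stand-in stores each user message as a fact; it does not
  // attempt any model-based extraction.
  async ingest(input: {
    sessionId: string;
    messages: { role: string; content: string }[];
  }): Promise<MemoryRecord[]> {
    const stored: MemoryRecord[] = [];
    for (const msg of input.messages) {
      if (msg.role === "user") {
        stored.push(
          await this.remember({
            sessionId: input.sessionId,
            memoryType: "facts",
            content: msg.content,
          })
        );
      }
    }
    return stored;
  }

  // recall: rank by how many query terms appear in the memory text
  async recall(input: {
    sessionId: string;
    query: string;
    limit?: number;
  }): Promise<MemoryRecord[]> {
    const terms = input.query.toLowerCase().split(/\W+/).filter(Boolean);
    return this.records
      .filter((r) => r.sessionId === input.sessionId)
      .map((r) => ({
        r,
        score: terms.filter((t) => r.content.toLowerCase().includes(t)).length,
      }))
      .filter((x) => x.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, input.limit ?? 5)
      .map((x) => x.r);
  }

  // list: every memory stored for the session
  async list(input: { sessionId: string }): Promise<MemoryRecord[]> {
    return this.records.filter((r) => r.sessionId === input.sessionId);
  }

  // forget: delete one memory by id; returns true if something was removed
  async forget(input: { sessionId: string; memoryId: string }): Promise<boolean> {
    const before = this.records.length;
    this.records = this.records.filter(
      (r) => !(r.sessionId === input.sessionId && r.id === input.memoryId)
    );
    return this.records.length < before;
  }
}
```

    A stand-in like this lets you exercise handler logic locally before you have beta access, as long as you treat its behavior as a placeholder for the real ranking and extraction.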

    Four Memory Types

    Agent Memory organizes information into four distinct types, each suited to different kinds of knowledge:

    Memory Type  | Description                                             | Example
    Facts        | User preferences, personal details, learned information | "User prefers dark mode"
    Events       | Past interactions, completed tasks, significant moments | "User completed onboarding"
    Instructions | Custom rules, preferences, behavior guidelines          | "Always use TypeScript"
    Tasks        | Pending actions, goals, todo items                      | "Review pull request"
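
    In TypeScript, the four types can be modeled as a string union so that a typo such as "fact" is caught before it reaches the service. The sketch below is illustrative only: the names MEMORY_TYPES, isMemoryType, and groupByType are hypothetical helpers, and the string values are the ones used in this tutorial's code.

```typescript
// The four type names as used in this tutorial's examples.
const MEMORY_TYPES = ["facts", "events", "instructions", "tasks"] as const;
type MemoryType = (typeof MEMORY_TYPES)[number];

// Runtime guard for values that arrive as plain strings (e.g. from JSON).
function isMemoryType(value: string): value is MemoryType {
  return (MEMORY_TYPES as readonly string[]).includes(value);
}

// Hypothetical minimal memory shape for illustration.
interface Memory {
  memoryType: MemoryType;
  content: string;
}

// Group memory contents by type, e.g. to render separate prompt sections.
function groupByType(memories: Memory[]): Record<MemoryType, string[]> {
  const groups: Record<MemoryType, string[]> = {
    facts: [],
    events: [],
    instructions: [],
    tasks: [],
  };
  for (const m of memories) {
    groups[m.memoryType].push(m.content);
  }
  return groups;
}
```

    Grouping by type makes it easy to treat instructions as rules in the system prompt while presenting facts and tasks as background context.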

    Code Tutorial: Building a Remembering Agent

    Let's build a customer service agent that remembers user preferences across sessions. This example shows how to integrate Agent Memory with a Cloudflare Worker.

    Step 1: Set Up the Worker with Agent Memory Binding

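    # wrangler.toml
    name = "remembering-agent"
    main = "src/index.ts"
    compatibility_date = "2026-05-06"
    
    [observability]
    enabled = true
    
    # Add the Agent Memory binding here. The exact table name and keys come
    # from the private beta documentation; the Worker code in the following
    # steps expects the binding to be exposed as env.AGENT_MEMORY.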

    Step 2: Configure the Agent Memory Binding

    // In your Worker's type definitions (a .d.ts file, not tsconfig.json)
    interface Env {
      AGENT_MEMORY: AgentMemory; // binding type, assumed to ship with the beta's type definitions
    }
    
    // Agent Memory binding is automatically available
    // when you enable the feature in your Cloudflare account

    Step 3: Implement the Remembering Agent

    import { Agent } from "@cloudflare/agents";
    
    export default {
      async fetch(request: Request, env: Env): Promise<Response> {
        const url = new URL(request.url);
        
        // Handle API requests
        if (url.pathname === "/chat") {
          return handleChat(request, env);
        }
        
        return new Response("Not Found", { status: 404 });
      }
    };
    
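    async function handleChat(request: Request, env: Env): Promise<Response> {
      // Parse the JSON body; both fields are expected to be strings
      const { message, sessionId } = (await request.json()) as { message: string; sessionId: string };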
      const memory = env.AGENT_MEMORY;
      
      // 1. RECALL: Get relevant memories before processing
      const relevantMemories = await memory.recall({
        sessionId,
        query: message,
        limit: 5,
        memoryTypes: ["facts", "instructions"]
      });
      
      // 2. Build context from memories
      const contextFromMemory = relevantMemories.length > 0 
        ? `\nUser context from previous sessions:\n${relevantMemories.map(m => `- ${m.content}`).join('\n')}`
        : "";
      
      // 3. Process with the agent
      const agent = new Agent({
        model: "claude-3.7-sonnet",
        systemPrompt: `You are a helpful customer service agent.${contextFromMemory}`,
      });
      
      const response = await agent.run(message);
      
      // 4. INGEST: Automatically extract and store new memories
      await memory.ingest({
        sessionId,
        messages: [
          { role: "user", content: message },
          { role: "assistant", content: response }
        ]
      });
      
      // 5. EXPLICIT REMEMBER: Store specific important facts
      if (message.includes("I prefer")) {
        const preference = message.match(/I prefer (.+)/)?.[1];
        if (preference) {
          await memory.remember({
            sessionId,
            memoryType: "facts",
            content: `User prefers ${preference}`,
            importance: 0.8
          });
        }
      }
      
      return Response.json({ 
        response,
        memoriesRetrieved: relevantMemories.length 
      });
    }
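
    The context-building step above concatenates every recalled memory into the system prompt. Since the point of Agent Memory is to keep the context window small, it can help to deduplicate and cap that text. The helper below is a sketch: buildMemoryContext and the maxChars budget are hypothetical, and it assumes the memories arrive already ordered by relevance, so it stops at the first line that would exceed the budget.

```typescript
// Hypothetical minimal shape: only the text of a recalled memory is needed.
interface RecalledMemory {
  content: string;
}

// Build the prompt suffix from recalled memories, skipping duplicates and
// stopping once the character budget would be exceeded.
function buildMemoryContext(memories: RecalledMemory[], maxChars = 500): string {
  const seen = new Set<string>();
  const lines: string[] = [];
  let used = 0;

  for (const m of memories) {
    const text = m.content.trim();
    const key = text.toLowerCase();
    if (text === "" || seen.has(key)) continue; // skip blanks and repeats

    const line = `- ${text}`;
    if (used + line.length > maxChars) break; // budget exhausted

    seen.add(key);
    lines.push(line);
    used += line.length + 1; // +1 for the joining newline
  }

  return lines.length > 0
    ? `\nUser context from previous sessions:\n${lines.join("\n")}`
    : "";
}
```

    In handleChat, the result of this function could replace the inline template string assigned to contextFromMemory.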

    Step 4: Using the List and Forget Operations

    // List all memories for a user (admin functionality)
    async function listUserMemories(sessionId: string, env: Env) {
      const memory = env.AGENT_MEMORY;
      const memories = await memory.list({
        sessionId,
        memoryTypes: ["facts", "events", "instructions", "tasks"],
        limit: 100
      });
      
      return memories;
    }
    
    // Forget specific outdated information
    async function clearOutdatedMemories(sessionId: string, env: Env) {
      const memory = env.AGENT_MEMORY;
      // Get all memories first
      const memories = await memory.list({ sessionId, limit: 50 });
      
      // Find and remove outdated ones
      for (const mem of memories) {
        if (mem.createdAt && Date.now() - mem.createdAt > 90 * 24 * 60 * 60 * 1000) {
          // Older than 90 days
          await memory.forget({
            sessionId,
            memoryId: mem.id
          });
        }
      }
    }
    
    // Task management with memory
    async function completeTask(taskId: string, sessionId: string, env: Env) {
      const memory = env.AGENT_MEMORY;
      // Mark task as completed
      await memory.remember({
        sessionId,
        memoryType: "events",
        content: `Task ${taskId} completed`,
        importance: 0.6
      });
      
      // Remove from active tasks
      await memory.forget({
        sessionId,
        memoryId: taskId
      });
    }
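
    The 90-day check in clearOutdatedMemories assumes createdAt is a number of epoch milliseconds. The tutorial does not say what the service actually returns, so the sketch below normalizes a few plausible representations first and keeps the age test in a pure function that is easy to unit test. The names StoredMemory, toEpochMs, and selectExpired are hypothetical.

```typescript
// Hypothetical shape: createdAt may be epoch ms, an ISO string, or a Date.
interface StoredMemory {
  id: string;
  createdAt?: number | string | Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Normalize a timestamp to epoch milliseconds, or null if it is unusable.
function toEpochMs(value: number | string | Date | undefined): number | null {
  if (value === undefined) return null;
  const ms =
    value instanceof Date
      ? value.getTime()
      : typeof value === "number"
      ? value
      : Date.parse(value);
  return Number.isNaN(ms) ? null : ms;
}

// Return the ids of memories older than maxAgeDays. Memories with a missing
// or unparseable timestamp are kept rather than deleted.
function selectExpired(
  memories: StoredMemory[],
  maxAgeDays: number,
  now: number = Date.now()
): string[] {
  const cutoff = now - maxAgeDays * DAY_MS;
  return memories
    .filter((m) => {
      const ms = toEpochMs(m.createdAt);
      return ms !== null && ms < cutoff;
    })
    .map((m) => m.id);
}
```

    The returned ids can then be passed one by one to memory.forget, which keeps the deletion loop separate from the date arithmetic.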

    Common Mistake to Avoid

    Don't store everything in memory. Agent Memory works best when you let the system automatically extract important information (via ingest) and only explicitly store critical facts (via remember). Over-stuffing memory leads to retrieval noise and slower performance.

    How Retrieval Works

    When you call the recall operation, Agent Memory doesn't just do simple keyword matching. Behind the scenes, five parallel channels fetch what's relevant from different angles:

    • Semantic search — vector-based similarity matching
    • Keyword matching — traditional full-text search
    • Temporal weighting — recent memories ranked higher
    • Importance scoring — explicitly remembered facts ranked higher
    • Type filtering — memories from the right type (facts vs tasks)

    A Reciprocal Rank Fusion (RRF) algorithm combines the results from all five channels: each memory is scored by its rank in every channel's result list, so memories that rank well in several channels surface first. This multi-channel approach is meant to return relevant context without manual tuning.
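
    For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the standard formulation: each item earns 1 / (k + rank) from every list it appears in, and the sums are sorted in descending order. This illustrates the general technique, not Cloudflare's implementation; the function name, the constant k = 60 (a common default for RRF), and the channel names in the usage note are assumptions. It also assumes each ranked list contains an id at most once.

```typescript
// Fuse several ranked lists of ids into one ranking using Reciprocal Rank
// Fusion: score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60
): { id: string; score: number }[] {
  const scores = new Map<string, number>();

  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }

  // Highest fused score first
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score
  );
}
```

    For example, fusing a semantic list ["m1", "m2", "m3"], a keyword list ["m2", "m4", "m1"], and a recency list ["m2", "m3"] ranks m2 first, followed by m1, m3, and m4. m2 wins because it sits near the top of all three lists, even though it is not first in the semantic list.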

    Shared Memory for Teams

    One powerful feature of Agent Memory is shared memory capability. This allows teams to share a profile so knowledge learned by one engineer's coding agent is available to all. Imagine a team where everyone's coding assistant knows the team's coding standards, preferred libraries, and project architecture — without each agent having to learn it independently.

    To enable shared memory, you configure a team or organization-level session ID instead of individual user IDs. All agents in the team then query against the same memory store.

    Example: Team Coding Standards Memory

    // Store team coding standards (done once, shared across all agents)
    await memory.remember({
      sessionId: "team-engineering",  // Team-level session
      memoryType: "instructions",
      content: "Use TypeScript for all new projects. Prefer functional components in React. Use 2-space indentation.",
      importance: 1.0  // Maximum importance
    });
    
    // All agents can now recall these standards
    const standards = await memory.recall({
      sessionId: "team-engineering",
      query: "What are the team's React patterns?",
      limit: 3
    });
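
    An agent will often want both the shared team knowledge and the individual user's memories. One possible approach, sketched below, is to query both session IDs in parallel and merge the results. This goes beyond what the tutorial describes: recallWithTeamContext, the RecallClient interface, and the Recalled shape are hypothetical, and the recall parameters simply mirror the calls shown above.

```typescript
// Hypothetical shapes mirroring the recall calls in this tutorial.
interface Recalled {
  id: string;
  content: string;
}

interface RecallClient {
  recall(input: {
    sessionId: string;
    query: string;
    limit?: number;
  }): Promise<Recalled[]>;
}

// Query the user's own session and the team session in parallel, then merge.
// User memories come first so personal preferences take precedence over team
// defaults; entries with identical text (ignoring case) are dropped.
async function recallWithTeamContext(
  memory: RecallClient,
  userSessionId: string,
  teamSessionId: string,
  query: string,
  limit = 3
): Promise<Recalled[]> {
  const [userHits, teamHits] = await Promise.all([
    memory.recall({ sessionId: userSessionId, query, limit }),
    memory.recall({ sessionId: teamSessionId, query, limit }),
  ]);

  const seen = new Set<string>();
  const merged: Recalled[] = [];
  for (const hit of [...userHits, ...teamHits]) {
    const key = hit.content.trim().toLowerCase();
    if (seen.has(key)) continue;
    seen.add(key);
    merged.push(hit);
  }
  return merged;
}
```

    Putting personal memories first is a design choice: it gives an individual's stated preference priority over the team default when both address the same topic.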

    Final Verdict

    Cloudflare Agent Memory addresses a core limitation of today's AI agents: they forget everything between sessions. With just five operations (ingest, remember, recall, list, forget), four memory types, and multi-channel retrieval, you can build agents that remember what they learned. The service is in private beta, so request access to start building.

    Last Updated: May 06, 2026 | Source: Cloudflare Blog & Developer Documentation (Official Website)

    Frequently Asked Questions

    What is Cloudflare Agent Memory?
    Cloudflare Agent Memory is a managed service that gives AI agents persistent, retrievable memory across sessions. It extracts information from conversations and makes it available when needed without filling up the context window. Currently in private beta, it provides five core operations: ingest, remember, recall, list, and forget.

    What are the five core operations?
    The five operations are: 1) ingest, which automatically extracts and stores memories from conversation turns, 2) remember, which lets you explicitly store a single memory, 3) recall, which retrieves relevant memories based on a query, 4) list, which shows all stored memories, and 5) forget, which removes specific memories that are no longer relevant.

    What memory types does Agent Memory support?
    Agent Memory organizes information into four types: Facts (user preferences, personal details), Events (past interactions, completed tasks), Instructions (custom rules, behavior guidelines), and Tasks (pending actions, goals). Each type serves different retrieval purposes.

    Can I use Agent Memory outside of Cloudflare Workers?
    Yes. Agent Memory can be accessed via a binding from any Cloudflare Worker, or via REST API for agents running outside of Workers. This follows the same pattern as other Cloudflare developer platform APIs.

    How does retrieval work?
    Agent Memory uses five parallel channels to fetch relevant memories: semantic search (vector-based), keyword matching, temporal weighting (recent memories ranked higher), importance scoring, and type filtering. A Reciprocal Rank Fusion algorithm combines the results so that memories ranking well across channels surface first.

    Can teams share memory across agents?
    Yes. Agent Memory supports shared memory. Teams can configure a team or organization-level session ID instead of individual user IDs, so knowledge learned by one engineer's coding agent is available to all agents in the team.