Module 10: Memory

Without memory, every interaction starts from scratch. With it, the agent builds institutional knowledge that compounds over time.

Two Types of Memory

Short-Term Memory (Within a Session)

The working context for the current task. The agent remembers that it already extracted payment terms, so it doesn't re-parse the document when the compliance check needs them.

This is essentially the conversation history and intermediate results passed between steps in a prompt chain or between an orchestrator and its subagents.
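
As a rough illustration of the payment-terms example above, short-term memory can be as simple as a per-task cache that later steps read from. This is a minimal sketch, not a prescribed design: `SessionMemory`, `get_or_compute`, and `extract_payment_terms` are hypothetical names, and the extraction function is a stand-in for whatever model call or parser the workflow really uses.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class SessionMemory:
    """Short-term memory: intermediate results for one task, discarded afterwards."""
    results: dict[str, Any] = field(default_factory=dict)

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        # Run the expensive step only if no earlier step has stored its result.
        if key not in self.results:
            self.results[key] = compute()
        return self.results[key]


parse_calls = 0  # counts how often the document is actually parsed


def extract_payment_terms(contract_text: str) -> dict[str, str]:
    """Hypothetical stand-in for an LLM or parser extraction step."""
    global parse_calls
    parse_calls += 1
    return {"payment_terms": "Net 30"}


contract = "Payment is due within 30 days of invoice."
memory = SessionMemory()

# The extraction step parses the document once and stores the result.
terms = memory.get_or_compute("payment_terms", lambda: extract_payment_terms(contract))
# The compliance step reuses the stored result instead of re-parsing.
terms_again = memory.get_or_compute("payment_terms", lambda: extract_payment_terms(contract))
```

The same object could be handed from an orchestrator to its subagents, so each one sees what the others have already worked out.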

Long-Term Memory (Across Sessions)

Persistent knowledge that survives beyond the current task. The agent remembers that this vendor had compliance issues flagged three months ago. It remembers that the legal team always pushes back on indemnification caps below $500K.

Long-term memory lives in an external store (vector database, DynamoDB, or both) and gets retrieved when relevant.

In Our Contract Workflow

Memory Type | Example                                            | Storage
Short-term  | Extracted clauses from current contract            | In-context (passed between steps)
Long-term   | Past reviews of this vendor                        | Vector database (OpenSearch)
Long-term   | Legal team's common feedback patterns              | DynamoDB (structured)
Long-term   | Override history (when humans corrected the agent) | DynamoDB (structured)

When a new contract from Vendor X arrives, the agent retrieves:

  • Last 3 review summaries for Vendor X
  • Any flagged issues from past reviews
  • Legal team's notes on this vendor

That context goes into the compliance agent's prompt alongside the current contract. The agent doesn't just analyze the contract in isolation. It analyzes it with full institutional context.
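
The retrieval step above might look roughly like the following. It is a sketch under stated assumptions: an in-memory list stands in for the real stores, the `Review` fields and `build_vendor_context` are hypothetical names, and the sample reviews are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Review:
    vendor: str
    date: str  # ISO date, so string order matches chronological order
    summary: str
    flagged_issues: list[str]


def build_vendor_context(reviews: list[Review], notes: dict[str, str], vendor: str) -> str:
    """Collect the last 3 review summaries, all flagged issues, and legal notes."""
    past = sorted((r for r in reviews if r.vendor == vendor),
                  key=lambda r: r.date, reverse=True)
    lines = [f"Institutional context for {vendor}:"]
    lines += [f"- Review {r.date}: {r.summary}" for r in past[:3]]
    lines += [f"- Flagged issue: {issue}" for r in past for issue in r.flagged_issues]
    if vendor in notes:
        lines.append(f"- Legal notes: {notes[vendor]}")
    return "\n".join(lines)


# Illustrative data only; a real system would query its long-term stores.
reviews = [
    Review("Vendor X", "2023-01-10", "Renewal approved.", []),
    Review("Vendor X", "2023-06-02", "Cap raised after pushback.",
           ["Indemnification cap below $500K"]),
    Review("Vendor X", "2023-11-20", "Approved with amended SLA.", []),
    Review("Vendor X", "2024-04-05", "Compliance issue flagged.",
           ["Missing data-processing addendum"]),
    Review("Vendor Y", "2024-02-14", "Approved.", []),
]
notes = {"Vendor X": "Escalate any liability changes to senior counsel."}

context = build_vendor_context(reviews, notes, "Vendor X")
# The retrieved context is placed ahead of the current contract in the prompt.
prompt = f"{context}\n\nCurrent contract:\n<contract text goes here>"
```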

Memory Architecture

Current Task Context (short-term)
+
Vector Search Results (long-term, semantic)
+
Structured Query Results (long-term, exact)
↓
Agent Prompt

Vector memory is good for semantic retrieval: "Find past reviews where liability was an issue." It uses embeddings and similarity search.

Structured memory is good for exact lookups: "Get all reviews for Vendor X in the last 12 months." It uses traditional database queries.

Most production systems use both.
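
To make the difference between the two lookups concrete, here is a small sketch that runs both over the same records. Everything in it is an assumption for illustration: the three-dimensional "embeddings" are toy vectors rather than output from a real embedding model, and `semantic_search` and `exact_lookup` are hypothetical helpers standing in for a vector database query and a structured database query.

```python
import math
from datetime import date


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy records: the first embedding dimension loosely means "about liability".
REVIEWS = [
    {"id": "r1", "vendor": "Vendor X", "date": date(2024, 3, 1), "embedding": [0.9, 0.1, 0.0]},
    {"id": "r2", "vendor": "Vendor Y", "date": date(2024, 5, 10), "embedding": [0.8, 0.2, 0.1]},
    {"id": "r3", "vendor": "Vendor X", "date": date(2022, 1, 15), "embedding": [0.0, 0.1, 0.9]},
]


def semantic_search(query_embedding: list[float], k: int = 2) -> list[str]:
    # Vector memory: rank every record by similarity to the query and keep the top k.
    ranked = sorted(REVIEWS, key=lambda r: cosine(query_embedding, r["embedding"]), reverse=True)
    return [r["id"] for r in ranked[:k]]


def exact_lookup(vendor: str, since: date) -> list[str]:
    # Structured memory: filter on exact field values, no similarity involved.
    return [r["id"] for r in REVIEWS if r["vendor"] == vendor and r["date"] >= since]


liability_query = [1.0, 0.0, 0.0]  # stands in for an embedded "liability" question
semantic_ids = semantic_search(liability_query)
exact_ids = exact_lookup("Vendor X", since=date(2023, 6, 1))
```

Note that the semantic search returns a Vendor Y review because it is about a similar topic, while the exact lookup returns only Vendor X records inside the date window. That difference is one reason to run both.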

Memory Management

Memory isn't free. More context means longer prompts, higher costs, and slower responses. You need to manage what gets remembered and what gets forgotten.

Relevance filtering - Don't dump every past interaction into the prompt. Use RAG (Module 6) to retrieve only what's relevant to the current task.
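
One possible way to apply relevance filtering after retrieval is to drop low-scoring results and stop adding memories once a size budget is used up. The sketch below assumes the retriever returns `(score, text)` pairs; `select_relevant`, the 0.5 threshold, and the 200-character budget are illustrative choices, not recommended values.

```python
def select_relevant(candidates: list[tuple[float, str]],
                    min_score: float = 0.5,
                    max_chars: int = 200) -> list[str]:
    """Keep the most relevant memories that fit within a prompt budget."""
    selected: list[str] = []
    used = 0
    for score, text in sorted(candidates, key=lambda c: c[0], reverse=True):
        if score < min_score:
            break  # everything after this point is even less relevant
        if used + len(text) > max_chars:
            continue  # skip a memory that would overflow the budget
        selected.append(text)
        used += len(text)
    return selected


# Illustrative retrieval results: (similarity score, memory text).
candidates = [
    (0.92, "Vendor X: liability cap disputed in the 2023 review."),
    (0.15, "Office relocation notes."),
    (0.70, "Full transcript of kickoff call. " * 10),
    (0.60, "Vendor X: late-delivery penalties were added."),
]
selected = select_relevant(candidates)
```

Here the long transcript is relevant enough but too large for the budget, and the relocation notes fall below the threshold, so only the two short Vendor X memories reach the prompt.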

Summarization - Instead of storing full conversation histories, periodically summarize them. A 50-page interaction history becomes a 2-paragraph summary of key decisions and outcomes.
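
A rolling summary could be sketched as follows: once the history grows past a limit, older entries are collapsed into a single summary entry and the most recent ones are kept verbatim. `compact_history` and its parameters are hypothetical, and `naive_summarize` is only a placeholder; in practice the summary would most likely come from an LLM call.

```python
from typing import Callable


def compact_history(history: list[str],
                    summarize: Callable[[list[str]], str],
                    keep_recent: int = 2,
                    max_entries: int = 5) -> list[str]:
    """Replace older entries with one summary once the history gets too long."""
    if len(history) <= max_entries:
        return history
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [f"SUMMARY: {summarize(older)}"] + recent


def naive_summarize(entries: list[str]) -> str:
    """Hypothetical placeholder: a real system would call a model here."""
    return f"{len(entries)} earlier entries; last decision: {entries[-1]}"


history = [
    "Extracted payment terms: Net 30",
    "Flagged indemnification cap of $250K",
    "Legal requested cap of at least $500K",
    "Vendor agreed to $500K cap",
    "Compliance check passed",
    "Contract sent for signature",
]
compacted = compact_history(history, naive_summarize)
```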

TTL (Time-to-Live) - Some memories expire. A vendor's risk profile from 3 years ago is less relevant than one from 3 months ago. Set expiration policies.
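
An expiration policy can be expressed by stamping each memory with an expiry time and filtering on it at read time. The sketch below is store-agnostic and the names (`make_memory`, `live_memories`, `expires_at`) are assumptions. Storing the expiry as epoch seconds matches what DynamoDB's TTL feature expects, though DynamoDB removes expired items in the background rather than immediately, so a read-time filter like this one is still likely to be useful.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 86_400


def make_memory(text: str, ttl_days: int, now: Optional[float] = None) -> dict:
    """Create a memory record with an expiry timestamp in epoch seconds."""
    now = time.time() if now is None else now
    return {"text": text, "expires_at": int(now + ttl_days * SECONDS_PER_DAY)}


def live_memories(memories: list[dict], now: Optional[float] = None) -> list[dict]:
    """Return only the memories that have not expired yet."""
    now = time.time() if now is None else now
    return [m for m in memories if m["expires_at"] > now]


# A fixed clock keeps the example deterministic.
t0 = 1_700_000_000
risk_profile = make_memory("Vendor X risk profile: medium", ttl_days=365, now=t0)
scratch_note = make_memory("Temporary note about a draft clause", ttl_days=1, now=t0)

two_days_later = t0 + 2 * SECONDS_PER_DAY
still_live = live_memories([risk_profile, scratch_note], now=two_days_later)
```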

Priority ranking - Not all memories are equal. Human override events are high-priority memories (the agent was wrong and should learn from it). Routine successful completions are low priority.
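
Priority ranking could be sketched as a sort on a priority weight, with recency as a tiebreaker. The weights and the `flagged_issue` tier are assumptions added for illustration; the only ordering taken from the text above is that human overrides outrank routine completions.

```python
# Hypothetical weights: higher means more important to surface in the prompt.
PRIORITY = {"human_override": 3, "flagged_issue": 2, "routine_completion": 1}


def rank_memories(memories: list[dict], limit: int = 2) -> list[dict]:
    """Order memories by priority, then by most recent date, and keep the top few."""
    ranked = sorted(
        memories,
        key=lambda m: (PRIORITY.get(m["kind"], 0), m["date"]),  # ISO dates sort as strings
        reverse=True,
    )
    return ranked[:limit]


# Illustrative records only.
memories = [
    {"kind": "routine_completion", "date": "2024-05-01",
     "text": "Contract approved without changes."},
    {"kind": "human_override", "date": "2023-09-12",
     "text": "Attorney rejected the agent's approval of a low indemnification cap."},
    {"kind": "flagged_issue", "date": "2024-01-20",
     "text": "Missing data-processing addendum."},
]
top = rank_memories(memories, limit=2)
```

Even though the routine completion is the newest record, it is the one left out when only two memories fit.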

Why Memory Matters

People leave companies. When the attorney who reviewed every contract for 5 years moves on, their knowledge of vendor patterns, common issues, and negotiation history leaves with them.

An agent with well-structured long-term memory retains that institutional knowledge. The new attorney gets the same context the experienced one had.

What's Next

You've built an agent that connects to data, reasons through steps, collaborates with other agents, and remembers across sessions. How do you know it's working? In Module 11: Observability, we cover how to monitor and debug agents in production.

Premium

Memory Lab

Build short-term and long-term memory systems using DynamoDB and OpenSearch Serverless with summarization, TTL policies, and relevance-based retrieval.