# Long-Term Memory Design

This document describes the design of the long-term memory system for the Atlas voice agent.

## Overview

The memory system stores persistent facts about the user: their preferences, routines, and other important information that should be remembered across conversations.

## Goals

1. **Persistent Storage**: Facts survive across sessions and restarts
2. **Fast Retrieval**: Quick lookup of relevant facts during conversations
3. **Confidence Scoring**: Track how certain we are about each fact
4. **Source Tracking**: Know where each fact came from
5. **Privacy**: Memory is local-only, no external storage

## Data Model

### Memory Entry Schema

```python
{
    "id": "uuid",
    "category": "personal|family|preferences|routines|facts",
    "key": "fact_key",            # e.g., "favorite_color", "morning_routine"
    "value": "fact_value",        # e.g., "blue", "coffee at 7am"
    "confidence": 0.5,            # float in [0.0, 1.0]: how certain we are
    "source": "conversation|explicit|inferred|confirmed",
    "timestamp": "ISO8601",       # when the fact was written
    "last_accessed": "ISO8601",   # when the fact was last read
    "access_count": 0,
    "tags": ["tag1", "tag2"],     # For categorization
    "context": "additional context about the fact"
}
```

### Categories

- **personal**: Personal facts (name, age, location, etc.)
- **family**: Family member information
- **preferences**: User preferences (favorite foods, colors, etc.)
- **routines**: Daily/weekly routines
- **facts**: General facts about the user

## Storage

### SQLite Database

**Table: `memory`**

```sql
CREATE TABLE memory (
    id TEXT PRIMARY KEY,
    category TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT NOT NULL,
    confidence REAL DEFAULT 0.5,
    source TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    last_accessed TEXT,
    access_count INTEGER DEFAULT 0,
    tags TEXT,  -- JSON array
    context TEXT,
    UNIQUE(category, key)
);
```

**Indexes**:

- `(category, key)` - For fast lookups (SQLite creates this index implicitly for the `UNIQUE(category, key)` constraint)
- `category` - For category-based queries
- `last_accessed` - For relevance ranking

## Memory Write Policy

### When Memory Can Be Written

1. **Explicit User Statement**: "My favorite color is blue"
   - Confidence: 1.0
   - Source: "explicit"

2. **Inferred from Conversation**: "I always have coffee at 7am"
   - Confidence: 0.7-0.9
   - Source: "inferred"

3. **Confirmed Inference**: User confirms inferred fact
   - Confidence: 0.9-1.0
   - Source: "confirmed"

### When Memory Should NOT Be Written

- Uncertain information (confidence < 0.5)
- Temporary information (e.g., "I'm tired today")
- Work-related information (for family agent)
- Information from unreliable sources

## Retrieval Strategy

### Query Types

1. **By Key**: Direct lookup by category + key
2. **By Category**: All facts in a category
3. **By Tag**: Facts with specific tags
4. **Semantic Search**: Search by value/content (future: embeddings)

### Relevance Ranking

Facts are ranked by:

1. **Recency**: Recently accessed facts are more relevant
2. **Confidence**: Higher-confidence facts are preferred
3. **Access Count**: Frequently accessed facts are treated as more important
4. **Category Match**: Relevance of the fact's category to the query

### Integration with LLM

Memory facts are injected into prompts as context, grouped by category:

```
## User Memory

Preferences:
- Favorite color: blue (confidence: 1.0, source: explicit)
- Prefers metric units (confidence: 0.9, source: explicit)

Routines:
- Morning routine: coffee at 7am (confidence: 0.8, source: inferred)
```

## API Design

### Write Operations

```python
# Store explicit fact
memory.store(
    category="preferences",
    key="favorite_color",
    value="blue",
    confidence=1.0,
    source="explicit"
)

# Store inferred fact
memory.store(
    category="routines",
    key="morning_routine",
    value="coffee at 7am",
    confidence=0.8,
    source="inferred"
)
```

### Read Operations

```python
# Get specific fact
fact = memory.get(category="preferences", key="favorite_color")

# Get all facts in category
facts = memory.get_by_category("preferences")

# Search facts
facts = memory.search(query="coffee", category="routines")
```

### Update Operations

```python
# Update confidence
memory.update_confidence(id="uuid", confidence=0.9)

# Update value
memory.update_value(id="uuid", value="new_value", confidence=1.0)

# Delete fact
memory.delete(id="uuid")
```

## Privacy Considerations

1. **Local Storage Only**: All memory is stored locally in SQLite
2. **No External Sync**: No cloud backup or sync
3. **User Control**: Users can view, edit, and delete all memory
4. **Category Separation**: Work and family memory are kept separate
5. **Deletion Tools**: Easy memory deletion and export

## Future Enhancements

1. **Embeddings**: Semantic search using embeddings
2. **Memory Summarization**: Compress old facts into summaries
3. **Confidence Decay**: Reduce confidence over time if not accessed
4. **Memory Conflicts**: Handle conflicting facts
5. **Memory Validation**: Periodic validation of stored facts

## Integration Points

1. **LLM Prompts**: Inject relevant memory into system prompts
2. **Conversation Manager**: Track when facts are mentioned
3. **Tool Calls**: Tools can read/write memory
4. **Admin UI**: View and manage memory
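
## Appendix: Storage Sketch

The following is a minimal sketch of how the `store`, `get`, and `get_by_category` calls from the API Design section might map onto the `memory` table, using Python's built-in `sqlite3` module. Several details are assumptions made for illustration and are not part of the design: the `MemoryStore` class name, the `MIN_CONFIDENCE` constant, overwriting an existing `(category, key)` pair on conflict, and the specific `ORDER BY` used for ranking. The upsert syntax requires SQLite 3.24 or newer.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

# Same columns as the `memory` table in the Storage section.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memory (
    id TEXT PRIMARY KEY,
    category TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT NOT NULL,
    confidence REAL DEFAULT 0.5,
    source TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    last_accessed TEXT,
    access_count INTEGER DEFAULT 0,
    tags TEXT,
    context TEXT,
    UNIQUE(category, key)
)
"""

MIN_CONFIDENCE = 0.5  # assumption: mirrors the "confidence < 0.5" write rule


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


class MemoryStore:
    """Hypothetical wrapper around the memory table (name is an assumption)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.row_factory = sqlite3.Row
        self.db.execute(SCHEMA)

    def store(self, category, key, value, confidence, source,
              tags=None, context=None) -> bool:
        """Insert or overwrite a fact; return False if the policy rejects it."""
        if confidence < MIN_CONFIDENCE:
            return False
        self.db.execute(
            """
            INSERT INTO memory
                (id, category, key, value, confidence, source,
                 timestamp, tags, context)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
            ON CONFLICT(category, key) DO UPDATE SET
                value = excluded.value,
                confidence = excluded.confidence,
                source = excluded.source,
                timestamp = excluded.timestamp,
                tags = excluded.tags,
                context = excluded.context
            """,
            (str(uuid.uuid4()), category, key, value, confidence, source,
             _now(), json.dumps(tags or []), context),
        )
        self.db.commit()
        return True

    def get(self, category, key):
        """Return one fact as a dict (or None) and record the access."""
        row = self.db.execute(
            "SELECT * FROM memory WHERE category = ? AND key = ?",
            (category, key),
        ).fetchone()
        if row is None:
            return None
        self.db.execute(
            "UPDATE memory SET last_accessed = ?, "
            "access_count = access_count + 1 WHERE id = ?",
            (_now(), row["id"]),
        )
        self.db.commit()
        return dict(row)  # values as read, before the access was recorded

    def get_by_category(self, category):
        """Return all facts in a category, ranked by an assumed ordering."""
        rows = self.db.execute(
            "SELECT * FROM memory WHERE category = ? "
            "ORDER BY confidence DESC, access_count DESC, last_accessed DESC",
            (category,),
        ).fetchall()
        return [dict(r) for r in rows]


memory = MemoryStore()
memory.store(category="preferences", key="favorite_color",
             value="blue", confidence=1.0, source="explicit")
# A second write to the same (category, key) overwrites the first.
memory.store(category="preferences", key="favorite_color",
             value="green", confidence=1.0, source="explicit")
print(memory.get(category="preferences", key="favorite_color")["value"])  # prints "green"
```

Overwriting on conflict keeps the `UNIQUE(category, key)` constraint from raising on repeated statements of the same fact, but it silently discards the older value. A real implementation would need to decide how this interacts with the "Memory Conflicts" item under Future Enhancements.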