ilia/atlas

ilia bdbf09a9ac feat: Implement voice I/O services (TICKET-006, TICKET-010, TICKET-014)

2026-01-12 22:22:38 -05:00

5.0 KiB


Long-Term Memory Design

This document describes the design of the long-term memory system for the Atlas voice agent.

Overview

The memory system stores persistent facts about the user (preferences, routines, and other important information) that should be remembered across conversations.

Goals

Persistent Storage: Facts survive across sessions and restarts
Fast Retrieval: Quick lookup of relevant facts during conversations
Confidence Scoring: Track how certain we are about each fact
Source Tracking: Know where each fact came from
Privacy: Memory is local-only, no external storage

Data Model

Memory Entry Schema

{
    "id": "uuid",
    "category": "personal|family|preferences|routines|facts",
    "key": "fact_key",  # e.g., "favorite_color", "morning_routine"
    "value": "fact_value",  # e.g., "blue", "coffee at 7am"
    "confidence": 0.0-1.0,  # How certain we are (float between 0.0 and 1.0)
    "source": "conversation|explicit|inferred|confirmed",  # "confirmed" is used by the write policy
    "timestamp": "ISO8601",
    "last_accessed": "ISO8601",
    "access_count": 0,
    "tags": ["tag1", "tag2"],  # For categorization
    "context": "additional context about the fact"
}
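
As an illustration of the schema above, the entry could be modeled as a Python dataclass. This is a minimal sketch, not the project's implementation: the class name `MemoryEntry`, the `VALID_CATEGORIES` set, and the validation in `__post_init__` are assumptions; only the field names and the category list come from the schema.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

# Categories listed in the schema above.
VALID_CATEGORIES = {"personal", "family", "preferences", "routines", "facts"}


def _now_iso() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class MemoryEntry:
    """Hypothetical in-memory representation of one memory row."""

    category: str
    key: str
    value: str
    source: str
    confidence: float = 0.5
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(default_factory=_now_iso)
    last_accessed: Optional[str] = None
    access_count: int = 0
    tags: List[str] = field(default_factory=list)
    context: str = ""

    def __post_init__(self) -> None:
        # Assumed validation: reject unknown categories and out-of-range confidence.
        if self.category not in VALID_CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
```

Generating `id` and `timestamp` by default keeps call sites short while still allowing a row loaded from storage to pass its own values.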

Storage

SQLite Database

Table: memory

CREATE TABLE memory (
    id TEXT PRIMARY KEY,
    category TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT NOT NULL,
    confidence REAL DEFAULT 0.5,
    source TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    last_accessed TEXT,
    access_count INTEGER DEFAULT 0,
    tags TEXT,  -- JSON array
    context TEXT,
    UNIQUE(category, key)
);

Indexes:

(category, key) - For fast lookups (SQLite creates this index automatically for the UNIQUE constraint)
category - For category-based queries
last_accessed - For relevance ranking
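
The table and indexes above could be created from Python with the standard `sqlite3` module. This is a sketch under assumptions: the function name `open_memory_db` and the index names `idx_memory_category` and `idx_memory_last_accessed` are hypothetical. No explicit index is created for `(category, key)` because the `UNIQUE` constraint already provides one.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS memory (
    id TEXT PRIMARY KEY,
    category TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT NOT NULL,
    confidence REAL DEFAULT 0.5,
    source TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    last_accessed TEXT,
    access_count INTEGER DEFAULT 0,
    tags TEXT,  -- JSON array
    context TEXT,
    UNIQUE(category, key)
);
-- Index names below are illustrative, not taken from the design.
CREATE INDEX IF NOT EXISTS idx_memory_category ON memory(category);
CREATE INDEX IF NOT EXISTS idx_memory_last_accessed ON memory(last_accessed);
"""


def open_memory_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the memory database and make sure the schema exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows can be read by column name
    conn.executescript(SCHEMA)
    return conn
```

Passing a file path instead of the default `":memory:"` makes the facts persist across restarts, which is the first goal listed above.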

Memory Write Policy

When Memory Can Be Written

Explicit User Statement: "My favorite color is blue"
- Confidence: 1.0
- Source: "explicit"
Inferred from Conversation: "I always have coffee at 7am"
- Confidence: 0.7-0.9
- Source: "inferred"
Confirmed Inference: User confirms inferred fact
- Confidence: 0.9-1.0
- Source: "confirmed"

When Memory Should NOT Be Written

Uncertain information (confidence < 0.5)
Temporary information (e.g., "I'm tired today")
Work-related information (for family agent)
Information from unreliable sources
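
The write rules above can be expressed as a small gate that runs before anything is stored. This is a sketch only: the function name `should_store` and the flags `is_temporary`, `is_work_related`, and `family_agent` are hypothetical, and how a fact gets classified as temporary or work-related is outside its scope.

```python
# Threshold and source labels taken from the write policy above.
MIN_CONFIDENCE = 0.5
ALLOWED_SOURCES = {"explicit", "inferred", "confirmed"}


def should_store(
    confidence: float,
    source: str,
    is_temporary: bool = False,
    is_work_related: bool = False,
    family_agent: bool = True,
) -> bool:
    """Return True if a candidate fact passes the write policy."""
    if source not in ALLOWED_SOURCES:
        return False  # unknown or unreliable source
    if confidence < MIN_CONFIDENCE:
        return False  # uncertain information
    if is_temporary:
        return False  # e.g., "I'm tired today"
    if family_agent and is_work_related:
        return False  # the family agent does not keep work-related facts
    return True
```

For example, `should_store(1.0, "explicit")` passes, while `should_store(0.4, "inferred")` is rejected for low confidence.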

Retrieval Strategy

Query Types

By Key: Direct lookup by category + key
By Category: All facts in a category
By Tag: Facts with specific tags
Semantic Search: Search by value/content (text matching for now; embedding-based search is a future enhancement)

Relevance Ranking

Facts are ranked by:

Recency: Recently accessed facts are more relevant
Confidence: Higher confidence facts preferred
Access Count: Frequently accessed facts are important
Category Match: Category relevance to query
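
One way to combine the four ranking signals is a weighted sum. The design does not specify a formula, so everything numeric here is an assumption: the weights, the 30-day recency half-life, and the logarithmic damping of access counts. The function name `relevance_score` is hypothetical, and `last_accessed` is assumed to be a timezone-aware ISO 8601 string.

```python
import math
from datetime import datetime, timezone
from typing import Optional


def relevance_score(
    confidence: float,
    access_count: int,
    last_accessed: Optional[str],
    category: str,
    query_category: Optional[str] = None,
    now: Optional[datetime] = None,
    half_life_days: float = 30.0,
) -> float:
    """Score a fact in [0, 1]; higher means more relevant."""
    now = now or datetime.now(timezone.utc)

    # Recency: halves every `half_life_days`; never-accessed facts score 0.
    if last_accessed is None:
        recency = 0.0
    else:
        age = now - datetime.fromisoformat(last_accessed)
        age_days = max(age.total_seconds() / 86400.0, 0.0)
        recency = 0.5 ** (age_days / half_life_days)

    # Access count: grows with use but saturates below 1.
    frequency = 1.0 - 1.0 / (1.0 + math.log1p(access_count))

    # Category match: 1 when the fact's category is the one being queried.
    category_match = 1.0 if query_category and category == query_category else 0.0

    # Illustrative weights; they sum to 1 so the score stays in [0, 1].
    return 0.4 * recency + 0.3 * confidence + 0.2 * frequency + 0.1 * category_match
```

Candidate facts can then be sorted by this score in descending order, and only the top few injected into the prompt.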

Integration with LLM

Memory facts are injected into prompts as context:

## User Memory

Personal Facts:
- Favorite color: blue (confidence: 1.0, source: explicit)
- Morning routine: coffee at 7am (confidence: 0.8, source: inferred)

Preferences:
- Prefers metric units (confidence: 0.9, source: explicit)
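
A prompt block like the one above could be rendered from a list of fact dictionaries. This is a simplified sketch: the function name `format_memory_for_prompt` is hypothetical, section headings are derived directly from the category name, and labels are derived from the key, so the wording differs slightly from the hand-written example.

```python
from collections import defaultdict
from typing import Dict, List


def format_memory_for_prompt(facts: List[dict]) -> str:
    """Render facts as a '## User Memory' section grouped by category."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for fact in facts:
        grouped[fact["category"]].append(fact)

    lines = ["## User Memory", ""]
    for category, items in grouped.items():
        lines.append(f"{category.capitalize()}:")
        for fact in items:
            # "favorite_color" becomes "Favorite color"
            label = fact["key"].replace("_", " ").capitalize()
            lines.append(
                f"- {label}: {fact['value']} "
                f"(confidence: {fact['confidence']}, source: {fact['source']})"
            )
        lines.append("")  # blank line between categories
    return "\n".join(lines).rstrip() + "\n"
```

Including confidence and source in each line lets the LLM weigh an inferred fact differently from one the user stated explicitly.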

API Design

Write Operations

# Store explicit fact
memory.store(
    category="preferences",
    key="favorite_color",
    value="blue",
    confidence=1.0,
    source="explicit"
)

# Store inferred fact
memory.store(
    category="routines",
    key="morning_routine",
    value="coffee at 7am",
    confidence=0.8,
    source="inferred"
)

Read Operations

# Get specific fact
fact = memory.get(category="preferences", key="favorite_color")

# Get all facts in category
facts = memory.get_by_category("preferences")

# Search facts
facts = memory.search(query="coffee", category="routines")

Update and Delete Operations

# Update confidence
memory.update_confidence(id="uuid", confidence=0.9)

# Update value
memory.update_value(id="uuid", value="new_value", confidence=1.0)

# Delete fact
memory.delete(id="uuid")
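
To show how these calls might map onto the SQLite table, here is a minimal sketch of `store`, `get`, and `delete`. The class name `MemoryStore` and its behavior are assumptions rather than the project's implementation. In particular, storing an existing `(category, key)` pair overwrites the old value through an upsert (which needs SQLite 3.24 or newer), and `get` bumps `access_count` and `last_accessed`.

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone
from typing import List, Optional


def _now_iso() -> str:
    return datetime.now(timezone.utc).isoformat()


class MemoryStore:
    """Hypothetical SQLite-backed store for the memory table."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.row_factory = sqlite3.Row
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS memory (
                   id TEXT PRIMARY KEY, category TEXT NOT NULL, key TEXT NOT NULL,
                   value TEXT NOT NULL, confidence REAL DEFAULT 0.5,
                   source TEXT NOT NULL, timestamp TEXT NOT NULL,
                   last_accessed TEXT, access_count INTEGER DEFAULT 0,
                   tags TEXT, context TEXT, UNIQUE(category, key))"""
        )

    def store(self, category: str, key: str, value: str, confidence: float,
              source: str, tags: Optional[List[str]] = None,
              context: Optional[str] = None) -> None:
        """Insert a fact, or overwrite the existing one for (category, key)."""
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                """INSERT INTO memory
                       (id, category, key, value, confidence, source,
                        timestamp, tags, context)
                   VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
                   ON CONFLICT(category, key) DO UPDATE SET
                       value = excluded.value,
                       confidence = excluded.confidence,
                       source = excluded.source,
                       timestamp = excluded.timestamp,
                       tags = excluded.tags,
                       context = excluded.context""",
                (str(uuid.uuid4()), category, key, value, confidence, source,
                 _now_iso(), json.dumps(tags or []), context),
            )

    def get(self, category: str, key: str) -> Optional[dict]:
        """Return the fact as a dict (or None) and record the access."""
        with self.conn:
            self.conn.execute(
                """UPDATE memory
                   SET access_count = access_count + 1, last_accessed = ?
                   WHERE category = ? AND key = ?""",
                (_now_iso(), category, key),
            )
        row = self.conn.execute(
            "SELECT * FROM memory WHERE category = ? AND key = ?",
            (category, key),
        ).fetchone()
        return dict(row) if row else None

    def delete(self, id: str) -> bool:
        """Delete a fact by id; return True if a row was removed."""
        with self.conn:
            cur = self.conn.execute("DELETE FROM memory WHERE id = ?", (id,))
        return cur.rowcount > 0
```

With `memory = MemoryStore()`, the `memory.store(...)`, `memory.get(...)`, and `memory.delete(...)` calls shown above run unchanged. The upsert keeps the original `id` when a fact is overwritten, so ids held elsewhere stay valid.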

Privacy Considerations

Local Storage Only: All memory stored locally in SQLite
No External Sync: No cloud backup or sync
User Control: Users can view, edit, and delete all memory
Category Separation: Work and family memory are kept separate
Deletion Tools: Easy memory deletion and export

Future Enhancements

Embeddings: Semantic search using embeddings
Memory Summarization: Compress old facts into summaries
Confidence Decay: Reduce confidence over time if not accessed
Memory Conflicts: Handle conflicting facts
Memory Validation: Periodic validation of stored facts

Integration Points

LLM Prompts: Inject relevant memory into system prompts
Conversation Manager: Track when facts are mentioned
Tool Calls: Tools can read/write memory
Admin UI: View and manage memory
