- Enhanced `ARCHITECTURE.md` with details on LLM models for work (Llama 3.1 70B Q4) and family agents (Phi-3 Mini 3.8B Q4).
- Introduced new documents:
  - `ASR_EVALUATION.md` for ASR engine evaluation and selection.
  - `HARDWARE.md` outlining hardware requirements and purchase plans.
  - `IMPLEMENTATION_GUIDE.md` for Milestone 2 implementation steps.
  - `LLM_CAPACITY.md` assessing VRAM and context window limits.
  - `LLM_MODEL_SURVEY.md` surveying open-weight LLM models.
  - `LLM_USAGE_AND_COSTS.md` detailing LLM usage and operational costs.
  - `MCP_ARCHITECTURE.md` describing the Model Context Protocol architecture.
  - `MCP_IMPLEMENTATION_SUMMARY.md` summarizing MCP implementation status.

These updates provide comprehensive guidance for the next phases of development and ensure clarity in project documentation.
Ticket: Select ASR Engine and Target Hardware
Ticket Information
- ID: TICKET-009
- Title: Select ASR Engine and Target Hardware
- Type: Research
- Priority: High
- Status: Done
- Track: Voice I/O
- Milestone: Milestone 1 - Survey & Architecture
- Created: 2024-01-XX
Description
Decide on ASR (Automatic Speech Recognition) engine and deployment:
- Evaluate options: faster-whisper, Whisper.cpp, etc.
- Decide deployment: faster-whisper on the RTX 4080, CPU-only on a small box, or a shared deployment
- Consider model size vs latency trade-offs
- Document hardware requirements
Acceptance Criteria
- ASR engine selected: faster-whisper (primary)
- Target hardware decided: RTX 4080 (primary) or CPU always-on node (alternative)
- Model size selected: small (or medium if GPU headroom available)
- Latency requirements documented (< 2s target)
- Decision recorded in architecture docs
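The "< 2s target" above can be checked with a small timing harness. The following is a minimal sketch, not the project's benchmark: the names `timed_transcribe`, `make_faster_whisper_transcriber`, and `LATENCY_TARGET_S` are hypothetical, and it assumes latency means wall-clock time for one complete transcription of a clip.

```python
import time
from typing import Callable, Tuple

LATENCY_TARGET_S = 2.0  # from the ticket: "< 2s target"


def timed_transcribe(transcribe: Callable[[str], str], audio_path: str,
                     clock: Callable[[], float] = time.perf_counter
                     ) -> Tuple[str, float, bool]:
    """Run one transcription; return (text, elapsed_seconds, within_target)."""
    start = clock()
    text = transcribe(audio_path)
    elapsed = clock() - start
    return text, elapsed, elapsed < LATENCY_TARGET_S


def make_faster_whisper_transcriber(model) -> Callable[[str], str]:
    """Wrap a faster_whisper.WhisperModel so decoding is included in the timing."""
    def _transcribe(audio_path: str) -> str:
        segments, _info = model.transcribe(audio_path)
        # segments is a lazy generator; joining it forces the actual decoding.
        return " ".join(segment.text.strip() for segment in segments)
    return _transcribe
```

Consuming the segment generator inside the timed call matters here: `WhisperModel.transcribe` returns before decoding has run, so timing only that call would under-report latency.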
Technical Details
Considerations:
- faster-whisper on 4080: Lower latency, higher quality
- CPU-only on small box: Lower cost, higher latency
- Shared deployment: Resource contention considerations
- Model sizes: tiny/base/small/medium (smallest to largest) for latency/quality trade-off
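The considerations above can be sketched as a small selection function. This is an illustrative sketch only: `AsrConfig`, `choose_asr_config`, and the `medium_headroom_gb` threshold are hypothetical names and placeholder values, not measurements from this project. The `device` and `compute_type` values are ones faster-whisper's `WhisperModel` accepts; pairing `int8` with the CPU node is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsrConfig:
    """Settings that map onto faster-whisper's WhisperModel arguments."""
    model_size: str    # "small" or "medium", per the ticket
    device: str        # "cuda" for the RTX 4080, "cpu" for the always-on node
    compute_type: str  # precision/quantization used by the backend


def choose_asr_config(gpu_available: bool, free_vram_gb: float = 0.0,
                      medium_headroom_gb: float = 6.0) -> AsrConfig:
    """Pick a deployment following the ticket's primary/alternative split.

    medium_headroom_gb is a placeholder threshold, not a measured value.
    """
    if not gpu_available:
        # CPU fallback: lower cost, higher latency; int8 reduces the load.
        return AsrConfig("small", "cpu", "int8")
    if free_vram_gb >= medium_headroom_gb:
        # Enough GPU headroom: trade some latency for quality.
        return AsrConfig("medium", "cuda", "float16")
    return AsrConfig("small", "cuda", "float16")


def load_model(cfg: AsrConfig):
    """Create the model; requires the faster-whisper package to be installed."""
    from faster_whisper import WhisperModel  # imported lazily on purpose
    return WhisperModel(cfg.model_size, device=cfg.device,
                        compute_type=cfg.compute_type)
```

In a shared deployment, `free_vram_gb` would need to reflect what is left after the LLM has loaded, which is where the resource contention concern enters.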
Dependencies
- TICKET-004 (architecture) - helpful context
Related Files
- docs/ASR_EVALUATION.md (to be created)
- ARCHITECTURE.md
Notes
Can run in parallel with TTS and LLM work. The wake-word event flow still needs to be defined so that audio capture knows when to start and stop.
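One possible shape for that start/stop flow is a two-state machine. This is a hypothetical sketch: the event names (`wake_word_detected`, `end_of_speech`, `capture_timeout`) and the `CaptureState` type are assumptions for illustration, since the project has not defined the flow yet.

```python
from enum import Enum


class CaptureState(Enum):
    IDLE = "idle"            # waiting for the wake word
    CAPTURING = "capturing"  # streaming audio to the ASR engine


# Hypothetical event names; the project has not defined these yet.
TRANSITIONS = {
    (CaptureState.IDLE, "wake_word_detected"): CaptureState.CAPTURING,
    (CaptureState.CAPTURING, "end_of_speech"): CaptureState.IDLE,
    (CaptureState.CAPTURING, "capture_timeout"): CaptureState.IDLE,
}


def next_state(state: CaptureState, event: str) -> CaptureState:
    """Return the new state; unknown or out-of-order events are ignored."""
    return TRANSITIONS.get((state, event), state)
```

Ignoring out-of-order events keeps a repeated wake word during capture from restarting the session; whether that is the desired behavior is a design decision left open here.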
Progress Log
- 2024-01-XX - ASR evaluation document created (docs/ASR_EVALUATION.md)
- 2024-01-XX - Selected: faster-whisper with small model
- 2024-01-XX - Deployment: RTX 4080 (primary) or CPU always-on node (alternative)
- 2024-01-XX - Ready for implementation (TICKET-010)