Ticket: Implement Streaming Audio Capture → ASR Service
Ticket Information
- ID: TICKET-010
- Title: Implement Streaming Audio Capture → ASR Service
- Type: Feature
- Priority: High
- Status: Backlog
- Track: Voice I/O
- Milestone: Milestone 2 - Voice Chat MVP
- Created: 2024-01-XX
Description
Build a streaming ASR service:
- Implement audio capture (GStreamer/ffmpeg or browser getUserMedia)
- Create WebSocket endpoint for audio streaming
- Integrate selected ASR engine
- Stream audio chunks to ASR and return text segments
- Handle start/stop based on wake-word events
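The start/stop handling in the last bullet could be sketched as a small gate that forwards audio chunks to the ASR engine only while a wake-word session is open. This is a minimal sketch under stated assumptions: the class name `WakeWordGate`, the event names `"start"` and `"stop"`, and the `sink` callable are all hypothetical, since the actual event flow is defined in TICKET-006 and the engine is chosen in TICKET-009.

```python
from typing import Callable


class WakeWordGate:
    """Forwards audio chunks to `sink` only between start and stop events.

    `sink` is a hypothetical stand-in for whatever feeds the selected
    ASR engine; the event names are assumptions, not the TICKET-006 spec.
    """

    def __init__(self, sink: Callable[[bytes], None]) -> None:
        self._sink = sink
        self._active = False

    def handle_event(self, event: str) -> None:
        # Open or close the streaming session on wake-word events.
        if event == "start":
            self._active = True
        elif event == "stop":
            self._active = False
        else:
            raise ValueError(f"unknown wake-word event: {event!r}")

    def handle_chunk(self, chunk: bytes) -> bool:
        # Drop audio that arrives outside an active session.
        if not self._active:
            return False
        self._sink(chunk)
        return True
```

Keeping the gate separate from capture and transcription means the same logic applies whether chunks come from a local microphone or a WebSocket client.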
Acceptance Criteria
- Audio capture working (from mic or WebSocket)
- WebSocket endpoint for audio streaming
- ASR engine integrated
- Text segments returned with timestamps
- Handles wake-word start/stop events
- Streaming latency acceptable (< 2s end-to-end)
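One way to make the latency criterion checkable is to timestamp a chunk at capture and again when its text segment is delivered. The sketch below assumes "end-to-end" means capture to delivery, which the ticket does not spell out; `within_budget`, `timed`, and `LATENCY_BUDGET_S` are hypothetical names.

```python
import time
from typing import Callable, Tuple

LATENCY_BUDGET_S = 2.0  # the "< 2s end-to-end" target from the criteria


def within_budget(captured_at: float, delivered_at: float,
                  budget_s: float = LATENCY_BUDGET_S) -> bool:
    """Return True if delivery happened strictly inside the budget."""
    return (delivered_at - captured_at) < budget_s


def timed(process: Callable[[bytes], str], chunk: bytes) -> Tuple[str, float]:
    """Run `process` on one chunk and return (text, elapsed seconds).

    `process` is a placeholder for the real capture-to-text pipeline.
    """
    started = time.monotonic()  # monotonic clock is unaffected by clock changes
    text = process(chunk)
    return text, time.monotonic() - started
```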
Technical Details
Implementation:
- Audio capture: PyAudio, GStreamer, or browser MediaRecorder
- WebSocket server for real-time streaming
- ASR processing: faster-whisper or selected engine
- Return format: JSON with text, timestamps, confidence
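The return format might look like the following. This is only an illustrative shape: the field names `text`, `start`, `end`, and `confidence`, and the class name `TextSegment`, are assumptions, since the ticket lists the kinds of fields but not their names or units.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TextSegment:
    """One recognized segment; field names are illustrative assumptions."""
    text: str
    start: float       # segment start, seconds from stream start (assumed unit)
    end: float         # segment end, seconds from stream start (assumed unit)
    confidence: float  # assumed to be in the range 0.0 to 1.0

    def to_json(self) -> str:
        # Serialize to the JSON message sent back over the WebSocket.
        return json.dumps(asdict(self))
```

For example, `TextSegment("turn on the lights", 0.0, 1.8, 0.93).to_json()` yields a single JSON object that a client can parse per message.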
Dependencies
- TICKET-009 (ASR engine selection)
- Wake-word event flow defined (from TICKET-006)
Related Files
home-voice-agent/asr/ (to be created)
Notes
Can run in parallel with TTS and LLM work. Needs wake-word integration for start/stop.