atlas/tickets/backlog/TICKET-010_streaming-asr-service.md
ilia 7c633a02ed Initialize project structure with essential files and documentation
2026-01-05 20:09:44 -05:00

# Ticket: Implement Streaming Audio Capture → ASR Service
## Ticket Information
- **ID**: TICKET-010
- **Title**: Implement Streaming Audio Capture → ASR Service
- **Type**: Feature
- **Priority**: High
- **Status**: Backlog
- **Track**: Voice I/O
- **Milestone**: Milestone 2 - Voice Chat MVP
- **Created**: 2024-01-XX
## Description
Build a streaming ASR service:
- Implement audio capture (GStreamer/ffmpeg or browser getUserMedia)
- Create WebSocket endpoint for audio streaming
- Integrate selected ASR engine
- Stream audio chunks to ASR and return text segments
- Handle start/stop based on wake-word events
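
The start/stop behaviour described above could be sketched roughly as follows. This is a minimal illustration, not a committed design: the class name `WakeWordGate`, the event strings `"start"` and `"stop"`, and the `transcribe_chunk` callable are all hypothetical placeholders, since the actual wake-word event flow is defined in TICKET-006 and the engine in TICKET-009.

```python
from typing import Callable, Optional


class WakeWordGate:
    """Forwards audio chunks to an ASR callable only while a session is active.

    All names here are hypothetical; the real event format comes from TICKET-006.
    """

    def __init__(self, transcribe_chunk: Callable[[bytes], str]) -> None:
        # transcribe_chunk stands in for whichever ASR engine is selected.
        self._transcribe_chunk = transcribe_chunk
        self.active = False

    def handle_event(self, event: str) -> None:
        # Assumed event names: "start" opens a session, "stop" closes it.
        if event == "start":
            self.active = True
        elif event == "stop":
            self.active = False
        else:
            raise ValueError(f"unknown wake-word event: {event!r}")

    def feed(self, chunk: bytes) -> Optional[str]:
        # Audio that arrives outside an active session is dropped.
        if not self.active or not chunk:
            return None
        return self._transcribe_chunk(chunk)
```

In this sketch the WebSocket handler would call `handle_event` for wake-word messages and `feed` for every binary audio frame, so the gating logic stays independent of the transport and of the ASR engine.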
## Acceptance Criteria
- [ ] Audio capture working (from mic or WebSocket)
- [ ] WebSocket endpoint for audio streaming
- [ ] ASR engine integrated
- [ ] Text segments returned with timestamps
- [ ] Handles wake-word start/stop events
- [ ] Streaming latency acceptable (< 2s end-to-end)
## Technical Details
Implementation:
- Audio capture: PyAudio, GStreamer, or browser MediaRecorder
- WebSocket server for real-time streaming
- ASR processing: faster-whisper or selected engine
- Return format: JSON with text, timestamps, confidence
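
One possible shape for the JSON return format is sketched below. The field names (`text`, `start`, `end`, `confidence`), the use of seconds for timestamps, and the 0.0 to 1.0 confidence range are assumptions for illustration; the ticket only states that the payload carries text, timestamps, and confidence.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TranscriptSegment:
    """One recognized text segment (hypothetical schema)."""

    text: str
    start: float       # assumed: seconds from the start of the stream
    end: float         # assumed: seconds from the start of the stream
    confidence: float  # assumed: score between 0.0 and 1.0

    def to_json(self) -> str:
        # Serialize the segment as a single JSON object per message.
        return json.dumps(asdict(self))


# Example payload that could be sent back over the WebSocket.
segment = TranscriptSegment(text="turn on the lights", start=0.0, end=1.5, confidence=0.9)
payload = segment.to_json()
```

Sending one JSON object per segment keeps each WebSocket message independently parseable, which suits incremental display on the client.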
## Dependencies
- TICKET-009 (ASR engine selection)
- Wake-word event flow defined (from TICKET-006)
## Related Files
- `home-voice-agent/asr/` (to be created)
## Notes
Can run in parallel with TTS and LLM work. Needs wake-word integration for start/stop.