Voice Tutor
TutorQ's voice tutor provides real-time speech-to-speech tutoring powered by AWS Nova Sonic. Students speak naturally, and the AI responds with voice — grounded in the course materials.
How It Works
Student speaks a question
↓
Audio streamed to TutorQ (16kHz PCM)
↓
Nova Sonic processes speech and decides whether to search course materials
↓
If needed: RAG search finds relevant passages from uploaded materials
↓
Nova Sonic generates a spoken response using the found content
↓
Audio streamed back to student (24kHz)
Key Features
Curriculum-Grounded
Answers are grounded in the professor's uploaded materials. The AI uses a search_course_materials tool to find relevant passages before responding, which limits hallucinations from generic training data.
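The retrieval step can be illustrated with a minimal sketch. Only the tool name search_course_materials comes from this page; the tool description fields, the tokenize helper, and the keyword-overlap scoring are assumptions made for illustration. A production RAG search would more likely use embeddings, and this is not TutorQ's actual implementation.

```python
import re

# Hypothetical tool description; field names here are illustrative only.
SEARCH_TOOL = {
    "name": "search_course_materials",
    "description": "Find passages in the uploaded course materials relevant to a question.",
    "parameters": {"query": "string"},
}

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search_course_materials(query, passages, top_k=2):
    """Rank passages by the number of words shared with the query."""
    query_words = tokenize(query)
    scored = []
    for passage in passages:
        overlap = len(query_words & tokenize(passage))
        if overlap > 0:  # passages with no shared words are dropped
            scored.append((overlap, passage))
    # Highest overlap first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]
```

When the function returns an empty list, the tutor has nothing from the course to cite, which is the case where a grounded system should say so rather than answer from general knowledge.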
Voice-First
Students speak naturally — no typing. The AI responds with voice. This is especially valuable for:
- Students who struggle with written text
- Mobile users
- Hands-free study sessions
- Students with reading difficulties
Adaptive Teaching
Six discussion modes that adapt to how the student wants to learn — from direct explanation to Socratic dialogue to guided reading.
Barge-In Support
Students can interrupt the AI mid-response (barge-in), just like in a real conversation. The AI stops immediately and listens.
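One common way to handle barge-in on the client is to discard any audio that is still waiting to be played as soon as the student starts speaking. The sketch below assumes a simple in-memory queue; the PlaybackQueue class and its method names are hypothetical and are not taken from TutorQ.

```python
from collections import deque

class PlaybackQueue:
    """Holds AI audio chunks that have been received but not yet played."""

    def __init__(self):
        self._chunks = deque()
        self.interrupted = False

    def enqueue(self, chunk):
        # Chunks that arrive late for an interrupted response are ignored.
        if not self.interrupted:
            self._chunks.append(chunk)

    def next_chunk(self):
        """Return the next chunk to play, or None if nothing is queued."""
        return self._chunks.popleft() if self._chunks else None

    def barge_in(self):
        """The student started speaking: drop pending audio immediately."""
        self._chunks.clear()
        self.interrupted = True

    def start_response(self):
        """A new AI response begins: accept audio again."""
        self.interrupted = False
```

The interrupted flag matters because chunks from the cancelled response may still be in flight when the interruption is detected.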
Multi-Language
Voice selection adapts to the session language. Currently supported languages: English (voices: matthew, tiffany, gregory, stephen) and Hindi.
Technical Details
| Feature | Specification |
|---|---|
| Speech model | AWS Nova Sonic (amazon.nova-2-sonic-v1:0) |
| Input audio | 16kHz, 16-bit PCM, mono |
| Output audio | 24kHz, 16-bit PCM, mono |
| Latency | ~1-2 seconds for first response |
| Max session | 10 minutes (configurable) |
| VAD | Built-in voice activity detection |
| Protocol | WebSocket (bidirectional streaming) |
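The audio formats in the table translate directly into buffer sizes. As a rough sketch, the helpers below compute chunk sizes for 16-bit mono PCM and convert floating-point samples to PCM bytes. The function names are hypothetical, and little-endian byte order is an assumption, since this page does not state the byte order.

```python
import struct

INPUT_RATE_HZ = 16_000   # student audio, from the table above
OUTPUT_RATE_HZ = 24_000  # AI audio, from the table above
BYTES_PER_SAMPLE = 2     # 16-bit samples
CHANNELS = 1             # mono

def chunk_size_bytes(sample_rate_hz, duration_ms):
    """Size in bytes of a PCM chunk covering duration_ms of audio."""
    samples = sample_rate_hz * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS

def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to signed 16-bit PCM bytes.

    Values outside the range are clipped. Little-endian is assumed.
    """
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack("<%dh" % len(ints), *ints)
```

For example, 100 ms of input audio is 3,200 bytes and 100 ms of output audio is 4,800 bytes, so the downstream direction carries 1.5 times the data of the upstream direction.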
Session Flow
- Connect: WebSocket connection is established
- Init: Client sends an init message with the course ID and language
- Session start: Nova Sonic session is created with the system prompt and course context
- Audio loop: Bidirectional audio streaming (student ↔ AI)
- Tool calls: During the audio loop, the AI searches course materials when relevant
- End: Client sends end_session, and session metrics are saved to the database
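The ordering of the steps above can be sketched as a small state tracker. The message types init and end_session come from this page; the JSON layout, the course_id and language field names, the audio message type, and the SessionTracker class are assumptions for illustration and may not match the real wire format.

```python
import json

def build_init_message(course_id, language="en"):
    """Build a hypothetical init message; field names are assumed."""
    return json.dumps({"type": "init", "course_id": course_id, "language": language})

def build_end_message():
    """Build a hypothetical end_session message."""
    return json.dumps({"type": "end_session"})

class SessionTracker:
    """Tracks session state and rejects messages that arrive out of order."""

    def __init__(self):
        self.state = "connected"  # the WebSocket is already open

    def handle(self, raw_message):
        kind = json.loads(raw_message).get("type")
        if self.state == "connected" and kind == "init":
            self.state = "active"
        elif self.state == "active" and kind == "audio":
            pass  # audio is only valid once the session is active
        elif self.state == "active" and kind == "end_session":
            self.state = "ended"
        else:
            raise ValueError(f"unexpected {kind!r} message in state {self.state!r}")
        return self.state
```

A tracker like this makes protocol mistakes, such as sending audio before init, fail loudly on the client instead of silently on the server.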