Voice Input
Voice queries are a powerful feature of Billx-Agent, enabling users to speak naturally and receive database results without typing. To ensure accurate transcription and reliable responses, follow these best practices.
✅ DO
✔️ Speak Clearly and Naturally
Use a steady speaking pace
Avoid trailing off or mumbling
Speak as if you’re giving a command, like:
“Show me top 10 products by sales in the last month”
✔️ Use Concise Voice Prompts
Ideal voice prompts are under 15 seconds
Long, multi-part queries can reduce transcription accuracy
🎯 Good: “How many new users signed up this week?”
⚠️ Risky: “Give me users by signup date but only if they purchased in June and are from Canada or the US and they cancelled later”
✔️ Use High-Quality Audio
Preferred formats:
.mp3 or .wav (mono)
Sample rate: 16 kHz or higher
Limit background noise and echo
📱 On mobile? Use the built-in mic and speak close to the device.
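The audio guidance above can be checked before a file is uploaded. The following is a minimal sketch, not part of Billx-Agent: the function name `check_wav` and the constants are hypothetical, and it only handles .wav files because Python's standard library does not decode .mp3. It also applies the "under 15 seconds" guideline from earlier in this page.

```python
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # from "16 kHz or higher"
MAX_DURATION_S = 15.0        # from "under 15 seconds"


def check_wav(source):
    """Return a list of problems found in a WAV file; an empty list means it passes.

    `source` may be a file path or a binary file-like object.
    This is a hypothetical client-side helper, not a Billx-Agent API.
    """
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        duration_s = wav.getnframes() / rate

    problems = []
    if channels != 1:
        problems.append(f"expected mono audio, found {channels} channels")
    if rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if duration_s >= MAX_DURATION_S:
        problems.append(
            f"clip is {duration_s:.1f} s long; aim for under {MAX_DURATION_S:.0f} s"
        )
    return problems
```

Calling `check_wav("query.wav")` on a recording returns an empty list when the file is mono, sampled at 16 kHz or higher, and shorter than 15 seconds. Noise and echo cannot be judged from the file header, so those still need a listening check.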
✔️ Test Voice Equivalents of Text Prompts
Make sure your spoken prompts follow the same structure as effective text prompts:
Voice: “Orders over $1,000 in March 2024”
Text equivalent: "Show orders over $1000 from March 2024"
❌ AVOID
❌ Using Filler Words
Avoid words like:
“Uh...”, “Okay so like...”, “I guess I want to maybe...”
Filler words confuse the transcription and make the generated SQL less likely to match what you meant.
❌ Uploading Non-Speech Audio
Do not upload music, multi-speaker podcasts, or unclear voice recordings
The STT engine is optimized for single-speaker, query-style input
❌ Giving Update or Instructional Commands
❌ “Remove all the failed transactions from the last quarter”
❌ “Delete accounts that are inactive”
Voice input is treated as read-only intent and will not generate destructive SQL.
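Because destructive requests will not produce SQL, a client can warn the user before sending audio that is likely to be rejected. The sketch below is an illustrative pre-check on a transcript and is not how Billx-Agent itself classifies intent; the verb list, the polite-prefix list, and the function name `looks_destructive` are all assumptions.

```python
import re

# Hypothetical word lists for illustration; tune them to your own users' phrasing.
DESTRUCTIVE_VERBS = {"remove", "delete", "drop", "update", "insert", "truncate"}
POLITE_PREFIXES = {"please", "can", "could", "you", "kindly"}


def looks_destructive(transcript: str) -> bool:
    """Return True when the first meaningful word of the transcript is a write verb."""
    words = re.findall(r"[a-z']+", transcript.lower())
    for word in words:
        if word in POLITE_PREFIXES:
            continue  # skip lead-ins such as "please" or "could you"
        return word in DESTRUCTIVE_VERBS
    return False  # empty transcript, or only polite words
```

Only the leading verb is checked, so a read-only question such as “Show accounts with no update in 90 days” is not flagged even though it contains the word "update".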
🔁 Voice + Text Option
You can also send a text override along with your audio via /audio-chat, in case speech recognition fails:
{
  "text": "Top 5 customers by purchase value last year"
}
🧠 Voice commands are powerful, and with the right phrasing and audio quality, your users can get spoken answers from live data in seconds.
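For the text override sent to /audio-chat, the JSON body can be prepared as in the sketch below. Only the `/audio-chat` path and the `text` field come from this page; the base URL and the function name `build_text_override_request` are placeholders. How the audio itself is attached is not specified here, so this sketch covers the text part only and does not send anything.

```python
import json
import urllib.request

# Placeholder host: replace with your own Billx-Agent deployment URL.
BASE_URL = "https://billx.example.com"


def build_text_override_request(text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the text override.

    The audio attachment is intentionally omitted because its format
    is not documented on this page.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/audio-chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_text_override_request("Top 5 customers by purchase value last year")
print(request.get_method(), request.full_url)
```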