Input Handler (Text / Voice)
The Input Handler is the first processing layer in Billx-Agent’s pipeline. It accepts user input in either text or audio form and prepares it for the AI Query Engine.
🎯 What It Does
Accepts input via:
POST /chat → Text prompt
POST /audio-chat → Voice or text
Automatically detects the input type:
If audio is provided, it uses ElevenLabs Speech-to-Text (STT) to transcribe it.
If text is provided directly, transcription is skipped.
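The detection step above can be sketched roughly as follows. This is a minimal illustration, not Billx-Agent's actual code: the `IncomingRequest` type, the `resolve_prompt` function, and the injected `transcribe` callable (a stand-in for the ElevenLabs STT call) are all hypothetical names. It also assumes that audio takes precedence when both fields are present, which the page does not state.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IncomingRequest:
    """Hypothetical request shape: either field may be set."""
    text: Optional[str] = None
    audio: Optional[bytes] = None


def resolve_prompt(
    request: IncomingRequest,
    transcribe: Callable[[bytes], str],
) -> str:
    """Return the text prompt, transcribing only when audio is present.

    `transcribe` is a placeholder for the speech-to-text call; it is
    injected so this sketch does not depend on any real SDK.
    """
    if request.audio:
        # Audio provided: run speech-to-text (assumed to win over text).
        return transcribe(request.audio)
    if request.text is not None and request.text.strip():
        # Text provided directly: transcription is skipped.
        return request.text
    raise ValueError("request must contain either audio or text")
```

Injecting the transcriber keeps the branching logic testable without network access; a real handler would pass in whatever function wraps the STT service.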
🧼 Pre-Processing
Before passing the prompt to the AI, the input handler performs:
Whitespace and formatting cleanup
Punctuation standardization
Validation to ensure it's a supported query format
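As a rough sketch of what these three steps might look like, assuming nothing about the real implementation: the function name `clean_prompt`, the punctuation mapping, the `MAX_PROMPT_CHARS` limit, and the validation rules are all illustrative assumptions, since the page does not define what counts as a supported query format.

```python
import re

# Assumed mapping: typographic punctuation to plain ASCII equivalents.
_PUNCT_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en and em dashes
})

MAX_PROMPT_CHARS = 500  # hypothetical limit, not taken from the docs


def clean_prompt(raw: str) -> str:
    """Normalize a raw prompt and reject obviously unusable input."""
    # Punctuation standardization: replace typographic characters.
    text = raw.translate(_PUNCT_MAP)
    # Whitespace cleanup: collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Remove stray spaces before punctuation, then collapse "??" or "!!".
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    text = re.sub(r"([!?])\1+", r"\1", text)
    # Validation (assumed rules): non-empty, has content, within limit.
    if not text:
        raise ValueError("prompt is empty")
    if not any(ch.isalnum() for ch in text):
        raise ValueError("prompt contains no letters or digits")
    if len(text) > MAX_PROMPT_CHARS:
        raise ValueError("prompt is too long")
    return text
```

For example, `clean_prompt("  Top 5   products\tby sales ??  ")` returns `"Top 5 products by sales?"`, while a whitespace-only string raises `ValueError`.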
🎤 Audio Input Flow
User uploads a .wav or .mp3 audio file
ElevenLabs STT converts speech → text
Transcribed prompt is treated like a regular /chat input
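The audio flow above could be wired together along these lines. This is only a sketch under stated assumptions: `handle_audio_upload`, `transcribe`, and `handle_chat` are hypothetical names, the extension check is one possible way to enforce the .wav/.mp3 restriction, and the transcriber is a stand-in rather than the real ElevenLabs client.

```python
from pathlib import Path
from typing import Any, Callable

# File types named in the flow above.
SUPPORTED_AUDIO_EXTENSIONS = {".wav", ".mp3"}


def handle_audio_upload(
    filename: str,
    audio_bytes: bytes,
    transcribe: Callable[[bytes], str],
    handle_chat: Callable[[str], Any],
) -> Any:
    """Validate an upload, transcribe it, and reuse the text path.

    `transcribe` stands in for the speech-to-text call and `handle_chat`
    for whatever processes a regular /chat prompt; both are injected.
    """
    # Step 1: accept only .wav or .mp3 uploads (case-insensitive).
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_AUDIO_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {extension or 'none'}")
    if not audio_bytes:
        raise ValueError("audio file is empty")
    # Step 2: speech to text.
    transcript = transcribe(audio_bytes)
    # Step 3: from here on the transcript is handled like /chat input.
    return handle_chat(transcript)
```

Routing the transcript through the same function that serves text prompts is what keeps the two paths behaviorally identical after transcription.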
✏️ Example Input Flow
Text
"Top 5 products by sales"
Sent directly to AI for SQL generation
Audio
Spoken query: same as above
Transcribed → cleaned → forwarded
✔ Because audio is transcribed into the same prompt format as text, clients can switch between text and voice input without any extra handling logic on their side.