Quick Start

Billix-Agent lets you query your database using natural language or voice input, extract structured data from invoices using AI, and interact with everything via secure, production-grade REST APIs.

This guide walks you through:

  • Authenticating

  • Connecting your database

  • Making your first query (text or voice)

  • Uploading and extracting invoice data


⚡ Quick Start

1. 🆕 Get API Access

To start using Billix-Agent:

  • Register via POST /sign-up

  • Sign in via POST /sign-in to receive your JWT token

  • Use the token for all secured requests:

Authorization: Bearer <your_token>
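
The steps above can be sketched in Python as follows. This is a minimal illustration, not the official client: the host in BASE_URL and the "email" and "password" field names are assumptions, and the requests are only built, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host; use your own deployment


def build_json_post(path, payload, token=None):
    """Build (but do not send) a JSON POST request, adding a Bearer token if given."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# "email" and "password" are assumed field names, not confirmed by this guide.
credentials = {"email": "user@example.com", "password": "change-me"}
sign_up = build_json_post("/sign-up", credentials)
sign_in = build_json_post("/sign-in", credentials)

# Once you have a token, pass it to every secured request.
secured = build_json_post("/chat", {}, token="<your_token>")
print(secured.get_header("Authorization"))
```

To actually send a request, pass it to urllib.request.urlopen. The name of the response field that carries the JWT is not specified in this guide, so check the sign-in response before reading it.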


2. 🔌 Connect Your Database

Billix-Agent supports external PostgreSQL databases. Provide the connection string in your API call.

Example:
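
A minimal sketch of assembling a PostgreSQL connection string in the standard URI form. The user, password, host, and database names are placeholders, and the helper postgres_url is hypothetical. Credentials are percent-encoded so that characters such as "@" or "/" do not break the URI.

```python
from urllib.parse import quote, urlsplit


def postgres_url(user, password, host, port, dbname):
    """Assemble a PostgreSQL connection URI, percent-encoding the credentials."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{dbname}"
    )


# Placeholder values; substitute your own database details.
db_url = postgres_url("billix_reader", "p@ss/word", "db.example.com", 5432, "sales")
print(db_url)

# Sanity check: the URI still parses into the expected host and port.
parts = urlsplit(db_url)
print(parts.hostname, parts.port)
```

Using a read-only database role for this connection is a reasonable precaution, since the service runs generated SQL against the database you supply.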


3. 💬 Make Your First Text Query

Endpoint: POST /chat

Use natural language to generate and run SQL queries.

Request:
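
A minimal sketch of a /chat request in Python. The JSON field names "prompt" and "db_url" and the host are assumptions, not confirmed by this guide. The request is built but not sent.

```python
import json
import urllib.request


def build_chat_request(base_url, token, prompt, db_url):
    """Build a POST /chat request carrying the prompt and the connection string."""
    body = {"prompt": prompt, "db_url": db_url}  # hypothetical field names
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_chat_request(
    "https://api.example.com",  # hypothetical host
    "<your_token>",
    "Show revenue this quarter",
    "postgresql://user:pass@db.example.com:5432/sales",  # placeholder
)

# To send it:
# with urllib.request.urlopen(req) as resp:
#     answer = json.load(resp)
```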

Response:


4. 🎙️ Make Your First Voice Query

Endpoint: POST /audio-chat

Upload a voice file (WAV/MP3) or send plain text for processing.

Request (form-data):

Field | Type   | Required | Description
audio | file   | optional | Voice input (WAV/MP3)
text  | string | optional | Plain text alternative
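
The form-data request above can be encoded with the Python standard library. This is a sketch: encode_multipart is a hypothetical helper, the audio bytes are a placeholder rather than a real recording, and only the field names "audio" and "text" come from the table.

```python
import uuid


def encode_multipart(fields, files):
    """Encode text fields and file parts as multipart/form-data.

    fields: {name: str}
    files:  {name: (filename, content_bytes, content_type)}
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, content, ctype) in files.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode(),
            f"Content-Type: {ctype}".encode(),
            b"",
            content,
        ]
    # Closing boundary, followed by a final CRLF.
    lines += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


# Voice variant: placeholder bytes stand in for a real WAV file.
body, content_type = encode_multipart(
    {}, {"audio": ("question.wav", b"placeholder-audio-bytes", "audio/wav")}
)

# Text variant: the same endpoint accepts plain text instead of audio.
text_body, text_content_type = encode_multipart(
    {"text": "Show revenue this quarter"}, {}
)
```

Send the body with a POST to /audio-chat, setting the Content-Type header to the returned value and adding the Authorization header from step 1.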

Response:


5. 📄 AI Invoice Data Extraction

Billix-Agent also lets you upload invoices and get structured data extracted using AI.

A. 🧾 Upload PDF or Image

Endpoint: POST /extract/pdf-image-text

Uploads your invoice and extracts plain text from the file.

B. 📊 Extract Structured Invoice Data

Endpoint: POST /extract/invoice

Send raw text and receive structured JSON output.

Request:
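
A minimal sketch of how steps A and B chain together. The payload keys "file" and "text" and the shape of the intermediate response are assumptions. The transport is injected as a function, and the fake_post stand-in below returns canned data so the flow can be run without a server.

```python
def extract_invoice(file_bytes, filename, post):
    """Run the two-step flow: file -> plain text -> structured JSON.

    `post(path, payload)` is a caller-supplied function that performs the HTTP
    call and returns the decoded response. The keys "file" and "text" are
    assumed names, not confirmed by this guide.
    """
    text_result = post("/extract/pdf-image-text", {"file": (filename, file_bytes)})
    return post("/extract/invoice", {"text": text_result["text"]})


calls = []


def fake_post(path, payload):
    """Stand-in transport: records each call and returns canned data."""
    calls.append(path)
    if path == "/extract/pdf-image-text":
        return {"text": "placeholder invoice text"}
    # Echo the text back so the hand-off between steps is visible.
    return {"received_text": payload["text"]}


result = extract_invoice(b"placeholder-pdf-bytes", "invoice.pdf", fake_post)
print(calls)
print(result)
```

In real use, replace fake_post with a function that sends authenticated requests to your Billix-Agent host.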

Response:


🧠 Concepts & Architecture

🔍 Natural Language to SQL

When you send a prompt like “Show revenue this quarter,” Billix-Agent:

  • Parses your database schema

  • Converts your prompt into SQL

  • Executes it on your provided DB

  • Refines the result into human-readable text

Powered by Google Gemini for natural language understanding.
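
The four steps above can be illustrated with a toy pipeline. This is a sketch under stated assumptions, not Billix-Agent's implementation: an in-memory SQLite database stands in for PostgreSQL, and prompt_to_sql is a hard-coded stub standing in for the language model.

```python
import sqlite3


def read_schema(conn):
    """Step 1: collect the CREATE statements that describe the schema."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def prompt_to_sql(prompt, schema):
    """Step 2: stub for the model call; always returns the same query."""
    return "SELECT SUM(amount) FROM invoices"


def answer(conn, prompt):
    schema = read_schema(conn)
    sql = prompt_to_sql(prompt, schema)
    (total,) = conn.execute(sql).fetchone()  # Step 3: run the generated SQL
    return f"Total revenue: {total:.2f}"      # Step 4: phrase the result as text


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO invoices (amount) VALUES (?)", [(120.0,), (80.5,)])

print(answer(conn, "Show revenue this quarter"))  # prints "Total revenue: 200.50"
```

The schema is passed to the translation step because a model can only produce valid SQL if it knows which tables and columns exist.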


🛠 Tool-Based Querying (Templates)

Billix-Agent uses parameterized SQL templates called tools.

Example:

The system automatically fills parameters like {start} and {end} based on your prompt.
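
A minimal sketch of what a tool might look like. The SQL text, table, and column names are hypothetical, and the guide does not say how Billix-Agent substitutes values. This example assumes the placeholders are converted to bound parameters rather than pasted into the SQL string, which avoids SQL injection. SQLite is used only so the example runs anywhere.

```python
import re
import sqlite3

# Hypothetical tool definition; only the {start}/{end} placeholders come from the guide.
REVENUE_TOOL = (
    "SELECT SUM(amount) FROM invoices "
    "WHERE issued_on BETWEEN {start} AND {end}"
)


def run_tool(conn, template, params):
    """Turn {name} placeholders into :name bound parameters, then execute."""
    sql = re.sub(r"\{(\w+)\}", r":\1", template)
    return conn.execute(sql, params).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (amount REAL, issued_on TEXT)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(100.0, "2024-01-15"), (50.0, "2024-02-10"), (75.0, "2024-05-01")],
)

# In the real system these values would be derived from the user's prompt.
total = run_tool(conn, REVENUE_TOOL, {"start": "2024-01-01", "end": "2024-03-31"})
print(total)  # prints 150.0
```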


🎙️ Voice Interface (via ElevenLabs)

  • STT (speech-to-text): Converts voice into a prompt

  • TTS (text-to-speech): Converts response into audio

Voice flow:

  1. You upload an audio file to /audio-chat

  2. It's transcribed and processed like text

  3. You receive both a text and an audio response
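
The voice flow above is a composition of three stages, sketched here with stand-in functions. The names transcribe, run_query, and synthesize are hypothetical; in the real service the speech stages are handled by ElevenLabs and the query stage by the same pipeline as /chat.

```python
def voice_flow(audio, transcribe, run_query, synthesize):
    """Transcribe the audio, process the prompt like text, return text and audio."""
    prompt = transcribe(audio)      # STT: voice -> prompt
    text = run_query(prompt)        # same processing as a text query
    return text, synthesize(text)   # TTS: response text -> audio


# Stand-ins so the flow can be run without any external service.
def fake_transcribe(audio):
    return "Show revenue this quarter"


def fake_run_query(prompt):
    return f"Answer to: {prompt}"


def fake_synthesize(text):
    return text.encode("utf-8")  # placeholder bytes, not real audio


text, audio_out = voice_flow(
    b"placeholder-audio-bytes", fake_transcribe, fake_run_query, fake_synthesize
)
print(text)
```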

