
Chat with your data #2: Agentic Application for querying databases using SQL


Introduction

As AI-powered conversational agents evolve, their integration with structured databases has become a crucial innovation. The ability to query databases using natural language simplifies data retrieval for non-technical users while enhancing productivity. This article explores a Python script that employs LangChain, OpenAI’s GPT models, Ollama, and SQLAlchemy to create a chatbot that interacts with an SQL database.

This guide breaks down the script, explaining its key components and showcasing its practical applications.

Overview of the Script

The script creates an interactive SQL agent that allows users to query a database of quality control results using natural language. It provides support for different LLM connectors (openai and ollama) and offers flexibility through command-line parameters.

Key features include:

  • Dynamic database setup using SQLite in-memory storage.
  • Integration with OpenAI’s ChatGPT or Ollama for conversational data retrieval.
  • SQL query generation through LangChain’s create_sql_agent.
  • Interactive chat interface to facilitate real-time querying.
  • Verbose logging and error handling for debugging and enhanced user experience.

Breaking Down the Code

1. Importing Required Libraries

from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit
from langchain.agents.agent_types import AgentType
from langchain.utilities import SQLDatabase
from langchain_openai import ChatOpenAI
from langchain_community.llms import Ollama
from langchain.callbacks.tracers import ConsoleCallbackHandler
from sqlalchemy import MetaData, Column, Integer, String, Table, Date
from sqlalchemy import create_engine, insert
import pandas as pd
import argparse
from datetime import datetime

This script makes use of:

  • OpenAI’s GPT models and Ollama to process natural language inputs.
  • LangChain’s SQL integration for creating query-driven agents.
  • SQLAlchemy for database schema creation and manipulation.
  • Argparse to handle user-defined execution parameters.
  • Pandas for loading structured data from a CSV file.
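
To reproduce the environment, the packages behind these imports must be installed. Exact versions are not specified in the original; based on the import statements above, something like pip install langchain langchain-openai langchain-community sqlalchemy pandas should cover them.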

2. ANSI Color Coding for Terminal Output

class bcolors:
    HEADER = '\033[95m'
    OKBLUE = '\033[94m'
    OKGREEN = '\033[92m'
    WARNING = '\033[93m'
    FAIL = '\033[91m'
    ENDC = '\033[0m'

This class defines colored outputs for terminal messages, improving readability.
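
For example, a message can be wrapped in a color code and closed with the reset sequence:

print(f"{bcolors.FAIL}Something went wrong{bcolors.ENDC}")  # red text, then reset to the terminal default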

3. Creating and Populating the SQL Database

def prepare_database(engine):
    metadata_obj = MetaData()

    table_qc = Table(
        "QualityChecks",
        metadata_obj,
        Column("ID", Integer, primary_key=True, autoincrement=True),
        Column("Article_Code", String(20), nullable=False),
        Column("Description", String(100), nullable=False),
        Column("Category", String(50), nullable=False),
        Column("Date_of_Check", Date, nullable=False),
        Column("Quantity_in_First_Class", Integer, nullable=False),
        Column("Quantity_in_Second_Class", Integer, nullable=False),
        Column("Quantity_Rejected", Integer, nullable=False),
    )

    metadata_obj.create_all(engine)

This function creates a table QualityChecks in an SQLite in-memory database, with fields for article codes, descriptions, categories, and inspection results.

    df = pd.read_csv("../datasets/quality_assurance.csv")
    for record in df.itertuples():
        date_of_check = datetime.strptime(record[4], '%Y-%m-%d')
        stmt = insert(table_qc).values(
            Article_Code=record[1],
            Description=record[2],
            Category=record[3],
            Date_of_Check=date_of_check,
            Quantity_in_First_Class=record[5],
            Quantity_in_Second_Class=record[6],
            Quantity_Rejected=record[7]
        )
        with engine.begin() as conn:
            conn.execute(stmt)

The script then loads quality assurance data from a CSV file and populates the database.
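
The CSV file itself is not shown, but its column layout can be inferred from the itertuples indices (record[1] through record[7]) used above. The header row should therefore look like the first line below; the data row under it is a purely hypothetical example:

Article_Code,Description,Category,Date_of_Check,Quantity_in_First_Class,Quantity_in_Second_Class,Quantity_Rejected
A123,Wooden puzzle,Educational,2024-05-02,120,15,5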

4. Command-Line Configuration

parser = argparse.ArgumentParser(description='Chat with an SQL agent')
parser.add_argument('-v', '--verbose', action='store_true', help='Enable verbose output')
parser.add_argument('-c', '--connector', type=str, required=False, default='openai', help='Connector to use: openai|ollama')
parser.add_argument('-m', '--model', type=str, required=False, default='', help='Model name to use')
parser.add_argument('-ou', '--ollamaurl', type=str, required=False, default='http://localhost:11434', help='Ollama url')
args = parser.parse_args()

These arguments let users control the script’s behavior:

  • -v enables verbose mode.
  • -c switches between OpenAI (openai) and locally hosted models (ollama).
  • -m specifies the model to use.
  • -ou specifies the Ollama server URL.

5. Setting Up the Agent

engine = create_engine("sqlite:///:memory:")
prepare_database(engine)
db = SQLDatabase(engine)

The SQLite in-memory database is initialized and populated.
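
The agent construction below refers to a model variable that does not appear in the snippets shown so far; the full script builds it from the command-line arguments. A minimal sketch, assuming fallback model names (gpt-4o-mini and llama3.2 here are placeholders, not defaults mandated by the script), could be:

if args.connector == 'openai':
    # Use the requested model, or fall back to a placeholder default
    model = ChatOpenAI(model=args.model or 'gpt-4o-mini', temperature=0)
elif args.connector == 'ollama':
    # base_url points at the Ollama server given by the -ou argument
    model = Ollama(model=args.model or 'llama3.2', base_url=args.ollamaurl)
else:
    raise ValueError(f"Unsupported connector: {args.connector}")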

agent = create_sql_agent(
    llm=model,
    toolkit=SQLDatabaseToolkit(db=db, llm=model),
    verbose=True,
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    max_iterations=50,
    handle_parsing_errors=True,
    return_intermediate_steps=True
)

An SQL agent is created using create_sql_agent, allowing natural language queries over the database. The ZERO_SHOT_REACT_DESCRIPTION agent type follows the ReAct pattern: the model reasons about which tool to invoke (listing tables, inspecting schemas, running SQL), observes the result, and iterates until it can answer, up to max_iterations attempts.

6. Interactive Querying Loop

print("Chat with me (ctrl+D to quit)!")

while True:
    try:
        question = input("human: ")
        answer = agent.invoke(
            "system: " + system_msg + '\n' + "user:" + question,
            config=config
        )
        print("agent: ", answer['output'])
    except EOFError:
        print("\nGoodbye!")
        break
    except Exception as e:
        print(f"{bcolors.FAIL}{type(e)}")
        print(f"{bcolors.FAIL}{e.args}")
        print(f"{bcolors.FAIL}{e}")
        print(f"{bcolors.ENDC}")

This loop continuously processes user queries and returns database-driven responses.
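
Note that the loop refers to system_msg and config, which the complete script defines before entering the loop. A plausible reconstruction, assuming a generic system prompt (its exact wording is an assumption) and verbose tracing through the imported ConsoleCallbackHandler, is:

# Assumed system prompt; the actual wording lives in the full script
system_msg = "You are an assistant that answers questions about quality control results stored in a SQL database."

# Attach a console tracer only when verbose mode is requested
config = {'callbacks': [ConsoleCallbackHandler()]} if args.verbose else {}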

Examples of usage

  • python qa_sql_agent.py
  • python qa_sql_agent.py -h
  • python qa_sql_agent.py -v
  • python qa_sql_agent.py -c openai
  • python qa_sql_agent.py -c openai -m gpt-4o-mini
  • python qa_sql_agent.py -c ollama
  • python qa_sql_agent.py -c ollama -m llama3.2 -v
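
Note that the openai connector requires a valid OPENAI_API_KEY environment variable, while the ollama connector requires a running Ollama server (reachable by default at http://localhost:11434).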

Examples of questions

After launching the script, when the “human:” prompt appears you can ask natural language questions about the data in the SQL database. Here are some examples of queries:

  • What is the percentage of rejected articles?
  • What is the percentage of rejected articles of Educational category?
  • What are the distinct categories?
  • What is the worst category?
  • What is the range of dates of checks?
  • Are there articles specialized for kids? If so, what are they?
  • What is the total quantity for articles of the Education category?
  • What is the ratio between first class and second class for the Education category?
  • What is the most recent check for Educational category?
  • How many articles of category Vehicles have defects?

Conclusion

This script showcases how SQL databases can be queried using natural language via an agentic application. By leveraging LangChain, OpenAI, and SQLAlchemy, the application bridges the gap between structured data and conversational AI, making data retrieval more accessible and efficient. Whether used for business analytics, customer support, or data-driven decision-making, this tool provides an intuitive way to interact with SQL databases.

Download the complete code

The complete code is available at GitHub.
These materials are distributed under the MIT license; feel free to use, share, fork, and adapt them as you see fit.
Also feel free to submit pull requests and bug reports to the GitHub repository, or contact me on the social media channels available on the contact page.
