Vanna AI/Ollama/DuckDB: Learn to Connect Database and Query in Natural Language with Vanna AI+Ollama and automated Visualization with Plotly, Other Important Tools/Database/LLM in this Article are ChromaDB/Jupyter Notebook/Mistral.

CA Amit Singh

Published in

Free or Open Source software’s

6 min readApr 24, 2024

Vanna AI

GitHub - vanna-ai/vanna: 🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via…

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄. - vanna-ai/vanna

github.com

Vanna.AI Documentation

At its core, Vanna is a Python package that uses retrieval augmentation to help you generate accurate SQL queries for…

vanna.ai

Chat with your SQL database. Accurate Text-to-SQL Generation via LLMs using RAG

Let Vanna.AI write your SQL for you

The fastest way to get actionable insights from your database just by asking questions

Why Vanna

Open-Source

The Vanna Python package and the various frontend integrations are all open-source. You can run Vanna on your own infrastructure.

High accuracy on complex datasets

Vanna’s capabilities are tied to the training data you give it. More training data means better accuracy for large and complex datasets.

Designed for security

Your database contents are never sent to the LLM unless you specifically enable features that require it. The metadata storage layer only sees schemas, documentation, and queries.

Self learning

As you use Vanna more, your model continuously improves as we augment your training data.

Supports many databases

Snowflake , BigQuery, Postgres, and many others. You can easily make a connector for any database.

Choose your front end

Start in a Jupyter Notebook. Expose to business users via Slackbot, web app, Streamlit app, any other frontend. Even integrate in your web app for customers.

Ollama: Large Language Model Runner.

GitHub - ollama/ollama: Get up and running with Llama 3, Mistral, Gemma, and other large language…

Get up and running with Llama 3, Mistral, Gemma, and other large language models. - ollama/ollama

github.com

Ollama

Get up and running with large language models.

ollama.com

Docker

Edit description

hub.docker.com

GitHub - ollama/ollama-python: Ollama Python library

Ollama Python library. Contribute to ollama/ollama-python development by creating an account on GitHub.

github.com

GitHub - ollama/ollama-js: Ollama JavaScript library

Ollama JavaScript library. Contribute to ollama/ollama-js development by creating an account on GitHub.

github.com

ChromaDB:

GitHub - chroma-core/chroma: the AI-native open-source embedding database

the AI-native open-source embedding database. Contribute to chroma-core/chroma development by creating an account on…

github.com

the AI-native open-source embedding database

the AI-native open-source embedding databasewww.trychroma.com

the AI-native open-source embedding database

Simple: Fully-typed, fully-tested, fully-documented == happiness

Integrations: LangChain (python and js), LlamaIndex and more soon

Dev, Test, Prod: the same API that runs in your python notebook, scales to your cluster

Feature-rich: Queries, filtering, density estimation and more

Free & Open Source: Apache 2.0 Licensed

Use case: Chat

For example, the “Chat your data” use case:

1. Add documents to your database. You can pass in your own embeddings, embedding function, or let Chroma embed them for you.

2. Query relevant documents with natural language.

3. Compose documents into the context window of an LLM like GPT3 for additional summarisation or analysis.

An in-process SQL OLAP database management system

DuckDB is an in-process SQL OLAP database management system. Simple, feature-rich, fast & open source.

duckdb.org

GitHub - duckdb/duckdb: DuckDB is an in-process SQL OLAP Database Management System

DuckDB is an in-process SQL OLAP Database Management System - duckdb/duckdb

github.com

DuckDB is a fast in-process analytical database

DuckDB supports a feature-rich SQL dialect complemented with deep integrations into client APIs

Simple

DuckDB is easy to install and deploy. It has zero external dependencies and runs in-process in its host application or as a single binary.

Portable

DuckDB runs on Linux, macOS, Windows, and all popular hardware architectures. It has idiomatic client APIs for major programming languages.

Feature-rich

DuckDB offers a rich SQL dialect. It can read and write file formats such as CSV, Parquet, and JSON, to and from the local file system and remote endpoints such as S3 buckets.

Fast

DuckDB runs analytical queries at blazing speed thanks to its columnar engine, which supports parallel execution and can process larger-than-memory workloads.

Extensible

DuckDB is extensible by third-party features such as new data types, functions, file formats and new SQL syntax.

Free

DuckDB and its core extensions are open-source under the permissive MIT License.

Project Jupyter

The Jupyter Notebook is a web-based interactive computing platform. The notebook combines live code, equations…

jupyter.org

JupyterLab is the latest web-based interactive development environment for notebooks, code, and data.

Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning. A modular design invites extensions to expand and enrich functionality.

Plotly

Plotly's

plotly.com

Plotly’s Python graphing library makes interactive, publication-quality graphs. Examples of how to make line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes, polar charts, and bubble charts. Plotly.py is free and open source and you can view the source, report issues or contribute on GitHub

Now here is How to get started.

Step 01: Open Jupyter Notebook and type below command

%pip install 'vanna[chromadb,duckdb]'

Step 02: Now after running first command type below command in Jupyter Notebook.

from vanna.ollama import Ollama
from vanna.chromadb import ChromaDB_VectorStore

Step 03: Now Connect Vanna/Chromadb/Ollama-Mistral Model by typing and running below command in Jupyter Notebook

class MyVanna(ChromaDB_VectorStore, Ollama):
    def __init__(self, config=None):
        ChromaDB_VectorStore.__init__(self, config=config)
        Ollama.__init__(self, config=config)

vn = MyVanna(config={'model': 'mistral'})

Step 04: Now Connect with DuckDB with below command,

vn.connect_to_duckdb(url='abc.duckdb')

Step 05: You can training you data once unless you have added other data with below command.

# The information schema query may need some tweaking depending on your database. This is a good starting point.
df_information_schema = vn.run_sql("SELECT * FROM INFORMATION_SCHEMA.COLUMNS")

# This will break up the information schema into bite-sized chunks that can be referenced by the LLM
plan = vn.get_training_plan_generic(df_information_schema)
plan

# If you like the plan, then uncomment this and run it to train
# vn.train(plan=plan)

Step 06: At any time you can inspect what training data the package is able to reference

training_data = vn.get_training_data()
training_data

Step 07: You can remove training data if there’s obsolete/incorrect information.

vn.remove_training_data(id='1-ddl')

Step 08: Now You can start flask app by typing and running below command in Jupyter Notebook, App will be available at localhost:8084 and start Asking your question in Natural Language.

from vanna.flask import VannaFlaskApp
app = VannaFlaskApp(vn)
app.run()

Step 09: Now Start asking question in Natural Language First it will show results and then generate automated Visualization based on Plotly like show below.

Results

Then Automated Visualization

Step 10: You can ask other question

Answer

Visualization

Other Articles for Vanna AI+Ollama are here

Vanna AI/Ollama/MySQL:Text to SQL & Automated Visualization (AI Powered Copilot for SQL…

Vanna AI/Ollama/MySQL: Learn to Connect Database and Query in Natural Language with Vanna AI+Ollama and automated…

medium.com

Vanna.ai & Ollama:Text to SQL & Automated Visualization (AI Powered Copilot for SQL…

Learn to Connect Databases and Query in Natural Language with Vanna AI+Ollama and automated Visualization with Plotly…

medium.com

Vanna AI & Ollama:Text to SQL & Automated Visualization (AI Powered Copilot for SQL…

Vanna AI/Ollama/PostgreSQL: Learn to Connect Database and Query in Natural Language with Vanna AI+Ollama and automated…

medium.com

Vanna AI/Ollama/DuckDB: Learn to Connect Database and Query in Natural Language with Vanna AI+Ollama and automated Visualization with Plotly, Other Important Tools/Database/LLM in this Article are ChromaDB/Jupyter Notebook/Mistral.

GitHub - vanna-ai/vanna: 🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via…

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄. - vanna-ai/vanna

Vanna.AI Documentation

At its core, Vanna is a Python package that uses retrieval augmentation to help you generate accurate SQL queries for…

GitHub - ollama/ollama: Get up and running with Llama 3, Mistral, Gemma, and other large language…

Get up and running with Llama 3, Mistral, Gemma, and other large language models. - ollama/ollama

Ollama

Get up and running with large language models.

Docker

Edit description

GitHub - ollama/ollama-python: Ollama Python library

Ollama Python library. Contribute to ollama/ollama-python development by creating an account on GitHub.

GitHub - ollama/ollama-js: Ollama JavaScript library

Ollama JavaScript library. Contribute to ollama/ollama-js development by creating an account on GitHub.

GitHub - chroma-core/chroma: the AI-native open-source embedding database

the AI-native open-source embedding database. Contribute to chroma-core/chroma development by creating an account on…

the AI-native open-source embedding database

the AI-native open-source embedding database

An in-process SQL OLAP database management system

DuckDB is an in-process SQL OLAP database management system. Simple, feature-rich, fast & open source.

GitHub - duckdb/duckdb: DuckDB is an in-process SQL OLAP Database Management System

DuckDB is an in-process SQL OLAP Database Management System - duckdb/duckdb

Project Jupyter

The Jupyter Notebook is a web-based interactive computing platform. The notebook combines live code, equations…

Plotly

Plotly's

Vanna AI/Ollama/MySQL:Text to SQL & Automated Visualization (AI Powered Copilot for SQL…

Vanna AI/Ollama/MySQL: Learn to Connect Database and Query in Natural Language with Vanna AI+Ollama and automated…

Vanna.ai & Ollama:Text to SQL & Automated Visualization (AI Powered Copilot for SQL…

Learn to Connect Databases and Query in Natural Language with Vanna AI+Ollama and automated Visualization with Plotly…

Vanna AI & Ollama:Text to SQL & Automated Visualization (AI Powered Copilot for SQL…

Vanna AI/Ollama/PostgreSQL: Learn to Connect Database and Query in Natural Language with Vanna AI+Ollama and automated…

Written by CA Amit Singh