This tutorial showcases how to use data stored in Airbyte's Vectara destination to perform Retrieval-Augmented Generation (RAG).
Prerequisites
1) OpenAI API Key:
- Create an OpenAI Account: Sign up for an account on OpenAI.
- Generate an API Key: Go to the API section and generate a new API key. For detailed instructions, refer to the OpenAI documentation.
2) Vectara Customer ID, Corpus ID, and API Key:
- Create a Vectara Account: Sign up for an account on Vectara.
- Customer ID: Click the profile icon at the top right of the Vectara Console and note your Customer ID.
- Corpus ID: The Vectara Corpora page lists the corpora you've created in your account. Note down the ID of the corpus you want to query.
- Generate an API Key: Go to the Vectara API Keys page and generate a new API key.
Install Dependencies
As in any other Python project, the first step is to install the required dependencies!
# Add virtual environment support in Google Colab
!apt-get install -qq python3.10-venv
# Install required packages
%pip install --quiet openai langchain-openai tiktoken pandas langchain_community
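If the install completes without errors, a quick optional import check confirms the key packages are available in the environment:
# Optional sanity check: confirm the main packages import correctly
import openai, pandas, langchain_openai, langchain_community
print("openai", openai.__version__, "| pandas", pandas.__version__)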
Set Up Environment Variables
We configure the required credentials here!
import os
os.environ["VECTARA_CUSTOMER_ID"] = "YOUR_VECTARA_CUSTOMER_ID"
os.environ["VECTARA_CORPUS_ID"] = "YOUR_VECTARA_CORPUS_ID"
os.environ["VECTARA_API_KEY"] = "YOUR_VECTARA_API_KEY"
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
Initialize Vectara Vector Store
We start by initializing the Vectara Vector Store and then see what the data in Vectara looks like.
import pandas as pd
from langchain_community.vectorstores import Vectara
# Initialize Vectara vector store
vectara = Vectara(
vectara_customer_id=os.getenv("VECTARA_CUSTOMER_ID"),
vectara_corpus_id=os.getenv("VECTARA_CORPUS_ID"),
vectara_api_key=os.getenv("VECTARA_API_KEY")
)
def fetch_vectara_data():
# Simulate fetching data from Vectara
data = {
"document_id": [1, 2, 3],
"document_content": ["Content of doc 1", "Content of doc 2", "Content of doc 3"],
"metadata": ["Metadata1", "Metadata2", "Metadata3"],
"embedding": ["[0.1, 0.2, ...]", "[0.3, 0.4, ...]", "[0.5, 0.6, ...]"]
}
df = pd.DataFrame(data)
return df
# show data
data_frame = fetch_vectara_data()
print(data_frame)
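The DataFrame above only simulates what the synced records look like. To inspect documents actually stored in your corpus (assuming your Airbyte connection has already synced data into it), you can run a broad similarity search and print the returned documents; the query string below is just an example:
# Peek at a few real documents from the corpus (example query string)
sample_docs = vectara.similarity_search(query="Airbyte", k=3)
for doc in sample_docs:
    print(doc.metadata)
    print(doc.page_content[:200])
    print("---")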
Embedding and similarity search with Vectara
Here we convert the user's query into an embedding using OpenAI, and retrieve similar chunks from Vectara based on the query. (Vectara's similarity_search embeds the query on the Vectara side, so the explicit OpenAI embedding function below mainly illustrates the embedding step of a RAG pipeline.)
from openai import OpenAI
from langchain_openai import OpenAIEmbeddings
from typing import List
from rich.console import Console
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
# Convert user's query into a vector array to prep for similarity search
def get_embedding_from_openai(query) -> List[float]:
print(f"Embedding user's query -> {query}...")
embeddings = OpenAIEmbeddings(openai_api_key=client.api_key)
return embeddings.embed_query(query)
# Use Vectara to find matching chunks
def get_similar_chunks_from_vectara(query: str) -> List[str]:
print("\nRetrieving similar chunks...")
try:
results = vectara.similarity_search(query=query)
chunks = [result.page_content for result in results]
print(f"Found {len(chunks)} matching chunks!")
return chunks
except Exception as e:
print(f"Error in retrieving chunks: {e}")
return []
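Before wiring these helpers into the full pipeline, you can try them on their own. The query below is only a placeholder, so substitute something relevant to your corpus:
# Standalone check of the two helpers defined above (placeholder query)
sample_query = "What is this data about?"
_ = get_embedding_from_openai(sample_query)
matching_chunks = get_similar_chunks_from_vectara(sample_query)
for i, chunk in enumerate(matching_chunks, start=1):
    print(f"Chunk {i}: {chunk[:120]}")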
Building the RAG Pipeline and asking a question
Finally, we use OpenAI to query our data!
The three main steps of a RAG pipeline are:
- Embedding the incoming query
- Performing a similarity search to find matching chunks
- Sending the matching chunks to the LLM for completion
# Use OpenAI to complete the response
def get_completion_from_openai(question, document_chunks: List[str], model_name="gpt-3.5-turbo"):
print(f"\nSending chunks to OpenAI (LLM: {model_name}) for completion...")
chunks = "\n\n".join(document_chunks)
response = client.chat.completions.create(
model=model_name,
messages=[
{"role": "system", "content": "You are an Airbyte product assistant. Answer the question based on the context. Do not use any other information. Be concise."},
{"role": "user", "content": f"Context:\n{chunks}\n\n{question}\n\nAnswer:"}
],
max_tokens=150
)
return response.choices[0].message.content.strip()
# Putting it all together
def get_response(query, model_name="gpt-3.5-turbo"):
chunks = get_similar_chunks_from_vectara(query)
if len(chunks) == 0:
return "I am sorry, I do not have the context to answer your question."
else:
return get_completion_from_openai(query, chunks, model_name)
# Ask a question
query = 'What data do you have?'
response = get_response(query)
Console().print(f"\n\nResponse from LLM:\n\n[blue]{response}[/blue]")
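The same helper can be reused for follow-up questions, and you can swap in a different chat model via model_name; the model name below is just an example:
# Ask a follow-up question, optionally with a different model
followup = get_response("Summarize the available documents in one sentence.", model_name="gpt-4o-mini")
Console().print(f"\n[green]{followup}[/green]")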