Build a Q&A app using an LLM
Learn how to build an end-to-end chatbot and custom question-answering app using OctoAI. The app uses OctoAI, LangChain, and LlamaIndex.
Prerequisites
This project was inspired by chatPDF. We have fully built end-to-end examples on GitHub that you can clone and edit.
Environment setup
To run our example app, complete the following steps:
1. Utilize the Llama 2 demo model.
2. Paste the endpoint URL into a file called .env in the root directory of the project.
3. Get an OctoAI API token.
4. Paste the OctoAI API token into the same .env file in the root directory of the project.
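Both values end up in the same .env file. A sketch of what it might look like — the variable names here are assumptions, so match them to whatever the example repository actually reads:

```
# .env (variable names are illustrative)
ENDPOINT_URL=<your Llama 2 endpoint URL>
OCTOAI_API_TOKEN=<your OctoAI API token>
```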
Python wrapper for LangChain
The following is a walkthrough of the code in OctoAI's endpoint wrapper for LangChain.
At a high level, we define a Python wrapper class to help developers easily use OctoAI’s LLM endpoints within a LangChain application. LangChain is a Python library commonly used to build LLM applications. Our class extends the LLM base class from the LangChain library.
First, our class defines some attributes: endpoint_url points to the OctoAI-hosted endpoint for your model; task refers to the model task/function to call; model_kwargs holds any arguments to pass to the model; and octoai_api_token is the API access token.
Next, the class defines a Config class and root_validator to validate the required environment variables:
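As a rough, stdlib-only sketch of that structure — the real class extends LangChain's LLM base class and validates via a pydantic root_validator, whereas this stand-in folds the validation into the constructor; the attribute names come from the walkthrough, the defaults are assumptions:

```python
import os
from typing import Any, Dict, Optional


class OctoAiCloudLLM:
    """Simplified stand-in for the LangChain wrapper described above."""

    def __init__(
        self,
        endpoint_url: str,
        task: str = "text-generation",
        model_kwargs: Optional[Dict[str, Any]] = None,
        octoai_api_token: Optional[str] = None,
    ) -> None:
        self.endpoint_url = endpoint_url        # OctoAI-hosted endpoint for your model
        self.task = task                        # model task/function to call
        self.model_kwargs = model_kwargs or {}  # extra arguments passed to the model
        # Mirror the root_validator: fall back to the environment, then fail loudly.
        self.octoai_api_token = octoai_api_token or os.environ.get("OCTOAI_API_TOKEN")
        if not self.octoai_api_token:
            raise ValueError("OCTOAI_API_TOKEN is not set")
```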
The _llm_type method returns the model type, which is octoai_cloud_llm:
The _identifying_params method returns the parameters that identify the model, such as the endpoint, task, and arguments:
Finally, the _call method makes a request to the inference endpoint to generate text:
The method constructs the request payload and headers, sends a POST request to the endpoint, and returns the generated text from the response.
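As an illustration of that flow using only the standard library — the payload and response field names here (inputs, parameters, generated_text) are assumptions for the sketch, not OctoAI's documented schema:

```python
import json
import urllib.request
from typing import Any, Dict


def build_request(endpoint_url: str, token: str, prompt: str,
                  model_kwargs: Dict[str, Any]) -> urllib.request.Request:
    # Construct the JSON payload and auth headers, as the _call method does.
    payload = {"inputs": prompt, "parameters": model_kwargs}  # assumed field names
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def call_endpoint(request: urllib.request.Request) -> str:
    # Send the POST request and pull the generated text out of the response.
    with urllib.request.urlopen(request) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["generated_text"]  # assumed response field
```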
Chat app that responds to a user
The following is a code walkthrough for the chatbot app.
First, we import the necessary libraries for logging, environment variables, the OctoAI-hosted LLM, the LangChain library, and LlamaIndex’s LLMPredictor:
Next, we set the current directory and load environment variables from a .env file to get credentials for the OctoAI endpoint:
Then we define a function to handle exiting the program:
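That handler can be as simple as the following sketch (the message wording is assumed):

```python
import sys


def handle_exit() -> None:
    # Print a goodbye message and terminate the interactive session cleanly.
    print("\nGoodbye!")
    sys.exit(0)
```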
Next, we define the main ask() function, which will interactively ask questions to the model:
We load the endpoint URL from the environment, instantiate the OctoAI LLM endpoint, and create an LLMPredictor.
We define a prompt template with a {question} placeholder, create a PromptTemplate, and construct an LLMChain to generate responses.
We provide an example prompt/response, then enter a loop to collect user prompts and generate responses until the user exits.
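Stripped of the LangChain objects, the interactive flow might look like this sketch, where generate stands in for the LLMChain's run method and the template wording is illustrative:

```python
# Illustrative template with the {question} placeholder.
PROMPT_TEMPLATE = (
    "You are a helpful assistant. Answer the following question.\n"
    "Question: {question}\nAnswer:"
)


def build_prompt(question: str) -> str:
    # Mirrors PromptTemplate.format(question=...).
    return PROMPT_TEMPLATE.format(question=question)


def ask(generate, prompt_fn=input) -> None:
    # generate: callable that sends a prompt to the LLM and returns text.
    # prompt_fn defaults to input() but is injectable for testing.
    print("Ask a question (type 'exit' to quit).")
    while True:
        question = prompt_fn("Prompt: ").strip()
        if question.lower() in {"exit", "quit"}:
            break
        print(generate(build_prompt(question)))
```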
Finally, we call the ask() function:
Q&A on a custom PDF app
Below is a code walkthrough for the app that indexes a PDF document and answers questions about that document.
First, we import the necessary libraries:
We import libraries for environment variables, the OctoAI LLM endpoint, LangChain, embeddings, and LlamaIndex.
Next, we set the current directory and logging level:
We need to load environment variables from a .env file to get credentials for the OctoAI model. We set the logging level to CRITICAL to reduce noise.
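The app uses python-dotenv for the loading step; as a stdlib-only stand-in showing the same two steps, where load_env is a hypothetical helper rather than python-dotenv's API:

```python
import logging
import os
from pathlib import Path


def load_env(path: str = ".env") -> None:
    # Minimal stand-in for python-dotenv's load_dotenv: read KEY=VALUE lines
    # into os.environ without overwriting values that are already set.
    env_file = Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())


# Reduce log noise from the libraries during the interactive session.
logging.getLogger().setLevel(logging.CRITICAL)
```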
Then we define a function to initialize the files directory:
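A minimal sketch of that initializer — the directory name files is an assumption taken from this walkthrough:

```python
import os


def init_files_dir(path: str = "files") -> str:
    # Make sure the directory that holds user PDFs exists before we list it.
    os.makedirs(path, exist_ok=True)
    return path
```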
Next, we define a function to handle exiting the program:
Next, we define functions to load a PDF file, create a query engine, and prompt the user to ask questions:
We load the selected PDF, instantiate the OctoAI-hosted LLM and a predictor, create embeddings and a ServiceContext, build an index of the document, and construct a query engine to answer questions.
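The real pipeline is built with LlamaIndex; as a toy, stdlib-only illustration of the underlying idea — chunk the document, retrieve the most relevant chunk, and hand it to the LLM alongside the question:

```python
from typing import Callable, List


def build_index(text: str, chunk_size: int = 500) -> List[str]:
    # Toy stand-in for an index: split the document into fixed-size chunks.
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def query(index: List[str], question: str, llm: Callable[[str], str]) -> str:
    # Naive retrieval: score chunks by word overlap with the question,
    # then ask the LLM to answer using the best-matching chunk as context.
    words = set(question.lower().split())
    best = max(index, key=lambda chunk: len(words & set(chunk.lower().split())))
    prompt = f"Context:\n{best}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)
```

Real embeddings-based retrieval ranks chunks by vector similarity rather than word overlap, but the shape of the pipeline — index, retrieve, prompt — is the same.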
The select_file() function prompts the user to select a PDF file to process:
Finally, we call the initialization function, prompt the user to select a file, and if a file is selected, start the interactive query session:
Share what you build
We are excited to see what you build. Feel free to showcase it in our Discord, and see what other community members are building.