Integrations

LlamaIndex Integration

Developers building AI apps can now access highly optimized LLM and embedding models on OctoAI.

Introduction

LlamaIndex helps manage the interactions between your language models and private data. If you are building your application with LlamaIndex, you benefit from its vast ecosystem of integrations as well as the top LLM and embedding models hosted by OctoAI.

Using OctoAI’s LLMs and LlamaIndex

Get started by learning more about LlamaIndex and signing up for a free OctoAI account.

LlamaIndex has both Python and TypeScript libraries; OctoAI is available in the Python SDK.
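Depending on your setup, you may also need to install the OctoAI integration packages, which are typically published as llama-index-llms-octoai and llama-index-embeddings-octoai.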

To use OctoAI LLM endpoints with LlamaIndex, start with the code below, which uses Llama 3 8B as the LLM.

from os import environ
from llama_index.llms.octoai import OctoAI

OCTOAI_API_KEY = environ.get("OCTOAI_TOKEN")

octoai = OctoAI(model="meta-llama-3-8b-instruct", token=OCTOAI_API_KEY)

# Using complete
response = octoai.complete("Octopi can not play chess because...")
print(response)

print("\n=====================\n")

# Using the chat interface
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(
        role="system",
        content="Below is an instruction that describes a task. Write a response that appropriately completes the request.",
    ),
    ChatMessage(role="user", content="Write a short blog about Seattle"),
]
response = octoai.chat(messages)
print(response)
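
The OctoAI LLM class follows the standard LlamaIndex LLM interface, so streaming variants should also be available. Below is a minimal sketch, assuming stream_complete is implemented for this integration, that continues from the snippet above and prints tokens as they arrive.

# Streaming completion (assumes stream_complete is available on this integration)
for chunk in octoai.stream_complete("Octopi can not play chess because..."):
    print(chunk.delta, end="", flush=True)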

To use OctoAI embedding endpoints with LlamaIndex, you can use the code below to get started. This example uses GTE Large, the default model.

from os import environ
from llama_index.embeddings.octoai import OctoAIEmbedding

OCTOAI_API_KEY = environ.get("OCTOAI_TOKEN")
embed_model = OctoAIEmbedding(api_key=OCTOAI_API_KEY)

# Single embedding request
embeddings = embed_model.get_text_embedding("Once upon a time in Seattle.")
assert len(embeddings) == 1024
print(embeddings[:10])


# Batch embedding request
texts = [
    "Once upon a time in Seattle.",
    "This is a test.",
    "Hello, world!",
]
embeddings = embed_model.get_text_embedding_batch(texts)
assert len(embeddings) == 3
print(embeddings[0][:10])

If you are using LlamaIndex, you can easily switch model providers and enjoy models hosted and optimized for scale on OctoAI.
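
For example, the sketch below wires both OctoAI models into LlamaIndex's global Settings and builds a simple query engine; the "data" directory and the query string are placeholders for illustration.

from os import environ

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.octoai import OctoAIEmbedding
from llama_index.llms.octoai import OctoAI

OCTOAI_API_KEY = environ.get("OCTOAI_TOKEN")

# Route all LLM and embedding calls through OctoAI
Settings.llm = OctoAI(model="meta-llama-3-8b-instruct", token=OCTOAI_API_KEY)
Settings.embed_model = OctoAIEmbedding(api_key=OCTOAI_API_KEY)

# Build an index over local documents ("data" is a placeholder directory) and query it
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()
print(query_engine.query("What do these documents say about Seattle?"))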