Text Gen TypeScript SDK

The OctoAI Text Gen TypeScript SDK supports both the Chat Completions API and the Completions API.

At a Glance

This guide walks you through using the TypeScript SDK to call our Text Gen API. The SDK supports streaming and non-streaming inference for both the Chat Completions API and the legacy Completions API. Additional parameters such as frequencyPenalty, maxTokens, and presencePenalty give you finer control over the generated output.
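
As a minimal sketch of how these tuning parameters might be grouped, the block below defines a local `SamplingOptions` type and a hypothetical `withDefaults` helper. Only the field names `frequencyPenalty`, `maxTokens`, and `presencePenalty` come from this guide; the type, the helper, and the default values are illustrative assumptions and not part of the SDK.

```typescript
// Local stand-in type: only the three field names come from this guide.
interface SamplingOptions {
  frequencyPenalty?: number;
  maxTokens?: number;
  presencePenalty?: number;
}

// Hypothetical defaults chosen for illustration; these are NOT SDK defaults.
const ILLUSTRATIVE_DEFAULTS: Required<SamplingOptions> = {
  frequencyPenalty: 0,
  maxTokens: 256,
  presencePenalty: 0,
};

// Fill in any option the caller left out with the illustrative default.
function withDefaults(overrides: SamplingOptions = {}): Required<SamplingOptions> {
  return {
    frequencyPenalty: overrides.frequencyPenalty ?? ILLUSTRATIVE_DEFAULTS.frequencyPenalty,
    maxTokens: overrides.maxTokens ?? ILLUSTRATIVE_DEFAULTS.maxTokens,
    presencePenalty: overrides.presencePenalty ?? ILLUSTRATIVE_DEFAULTS.presencePenalty,
  };
}

// Override only the token budget and keep the other defaults.
const options = withDefaults({ maxTokens: 128 });
```

The resulting object could then be spread into a request next to the model and the messages or prompt, assuming the SDK accepts these fields at the top level of the request.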

Requirements

  • Create an OctoAI API token if you don’t already have one.
  • Verify that you’ve completed TypeScript SDK Installation & Setup.
    • If your token is stored in the OCTOAI_TOKEN environment variable, you can instantiate the client with octoai = new OctoAIClient(). Otherwise, pass the token explicitly through the apiKey option: octoai = new OctoAIClient({ apiKey: process.env.OCTOAI_TOKEN })
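
Because a missing token typically only surfaces once a request fails, it can help to check for it up front. The following is a minimal sketch, assuming a hypothetical `resolveApiKey` helper that is not part of the SDK; it takes the environment as a plain object so it can be exercised without Node-specific types.

```typescript
// Minimal shape of an environment map, such as Node's process.env.
type Env = Record<string, string | undefined>;

// Hypothetical helper (not part of the SDK): return the token,
// or fail early with a clear message when it is missing or blank.
function resolveApiKey(env: Env, name: string = "OCTOAI_TOKEN"): string {
  const token = env[name]?.trim();
  if (!token) {
    throw new Error(`Missing API token: set the ${name} environment variable.`);
  }
  return token;
}
```

In a Node application, the result could be passed to the client as new OctoAIClient({ apiKey: resolveApiKey(process.env) }).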

Chat Completions API

Non-Streaming Example

To make a chat completions call, you will need to provide the model you wish to call and a list of chat messages.

TypeScript
import { OctoAIClient } from "@octoai/sdk";

const octoai = new OctoAIClient({
  apiKey: process.env.OCTOAI_TOKEN,
});

const result = await octoai.textGen.createChatCompletion({
  model: "meta-llama-3.1-8b-instruct",
  messages: [
    {
      role: "system",
      content:
        "You are a helpful assistant. Keep your responses limited to one short paragraph if possible.",
    },
    {
      role: "user",
      content: "Write a blog about Seattle",
    },
  ],
});

console.log(result.choices[0].message.content);
// "Seattle is a vibrant and eclectic city..."
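
The example above reads `result.choices[0].message.content` directly. As a defensive sketch, the hypothetical `firstMessageContent` helper below returns a fallback when the choice list is empty or the content is absent. The `ChatResultSketch` type is a local stand-in that mirrors only the fields read in the example; it is an assumption, not the SDK's actual response type.

```typescript
// Local stand-in types mirroring only the fields read in the example above.
interface ChatChoiceSketch {
  message: { content?: string | null };
}
interface ChatResultSketch {
  choices: ChatChoiceSketch[];
}

// Hypothetical helper: read the first choice's text, or a fallback if absent.
function firstMessageContent(result: ChatResultSketch, fallback: string = ""): string {
  return result.choices[0]?.message.content ?? fallback;
}

// Hand-written sample objects standing in for real responses.
const sampleResult: ChatResultSketch = {
  choices: [{ message: { content: "Seattle is a vibrant and eclectic city..." } }],
};
const emptyResult: ChatResultSketch = { choices: [] };
```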

Streaming Example

The example above works well in many scenarios, but if you’re handling larger requests or building a highly interactive user experience, the streaming interface may be a better choice. Non-streaming and streaming inference accept identical options, but two code changes are needed:

  • Use the createChatCompletionStream() method instead of createChatCompletion().
  • Instead of reading the final message text from the response, loop over the individual chunks and concatenate their tokens.

TypeScript
import { OctoAIClient } from "@octoai/sdk";

const octoai = new OctoAIClient({
  apiKey: process.env.OCTOAI_TOKEN,
});

const stream = await octoai.textGen.createChatCompletionStream({
  model: "meta-llama-3.1-8b-instruct",
  messages: [
    {
      role: "system",
      content:
        "You are a helpful assistant. Keep your responses limited to one short paragraph if possible.",
    },
    {
      role: "user",
      content: "Write a blog about Seattle",
    },
  ],
});

let result = "";

// Loops over the returned chunks whenever they're ready
for await (const chunk of stream) {
  // The content of the first chunk can be `undefined`
  result += chunk.choices[0].delta.content ?? "";
}

console.log(result);
// "Seattle is a vibrant and eclectic city..."
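
The chunk-concatenation loop above can be factored into a reusable helper. The sketch below assumes a local `ChatChunkSketch` type that mirrors only the `choices[0].delta.content` path used in the example, and it runs against a mock async generator in place of a real SDK stream. Both the type and the `collectChatStream` name are hypothetical.

```typescript
// Local stand-in for a streamed chunk; mirrors only the fields used above.
interface ChatChunkSketch {
  choices: { delta: { content?: string | null } }[];
}

// Hypothetical helper: concatenate the content of every chunk in a stream.
async function collectChatStream(stream: AsyncIterable<ChatChunkSketch>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    // Content may be missing (for example on the first chunk), so default to "".
    text += chunk.choices[0]?.delta.content ?? "";
  }
  return text;
}

// Mock stream standing in for the SDK's stream, for illustration only.
async function* mockChatStream(): AsyncGenerator<ChatChunkSketch> {
  yield { choices: [{ delta: {} }] };
  yield { choices: [{ delta: { content: "Seattle " } }] };
  yield { choices: [{ delta: { content: "is vibrant." } }] };
}
```

Accepting any `AsyncIterable` keeps the helper independent of the SDK, so the same function can be tested with a mock and, assuming the chunk shape matches, reused with the stream returned by createChatCompletionStream().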

Completions API

The TypeScript SDK also supports the legacy Completions API with the same customization options as the Chat Completions API. The key difference between the two is that you provide a prompt string instead of a list of chat message objects. Much like the Chat Completions API, you can choose between non-streaming and streaming inference.
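
To make the difference between the two request shapes concrete, here is a minimal sketch using local stand-in types. The interfaces, the `isChatRequest` guard, and the choice of `maxTokens` as the shared option are illustrative assumptions; the SDK's own request types may differ.

```typescript
// Options assumed to be common to both APIs (illustrative subset).
interface SharedOptionsSketch {
  model: string;
  maxTokens?: number;
}

// Chat Completions: a list of chat message objects.
interface ChatRequestSketch extends SharedOptionsSketch {
  messages: { role: string; content: string }[];
}

// Completions: a single prompt string.
interface CompletionRequestSketch extends SharedOptionsSketch {
  prompt: string;
}

type TextGenRequestSketch = ChatRequestSketch | CompletionRequestSketch;

// Hypothetical type guard that tells the two shapes apart.
function isChatRequest(request: TextGenRequestSketch): request is ChatRequestSketch {
  return "messages" in request;
}

const shared: SharedOptionsSketch = { model: "meta-llama-3.1-8b-instruct", maxTokens: 128 };

const chatRequest: ChatRequestSketch = {
  ...shared,
  messages: [{ role: "user", content: "Write a blog about Seattle" }],
};

const completionRequest: CompletionRequestSketch = {
  ...shared,
  prompt: "Write a blog about Seattle",
};
```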

Non-Streaming Example

TypeScript
import { OctoAIClient } from "@octoai/sdk";

const octoai = new OctoAIClient({
  apiKey: process.env.OCTOAI_TOKEN,
});

const response = await octoai.textGen.createCompletion({
  model: "meta-llama-3.1-8b-instruct",
  prompt: "Write a blog about Seattle",
});

console.log(response.choices[0].text);
// "Seattle is a vibrant and eclectic city..."

Streaming Example

TypeScript
import { OctoAIClient } from "@octoai/sdk";

const octoai = new OctoAIClient({
  apiKey: process.env.OCTOAI_TOKEN,
});

const stream = await octoai.textGen.createCompletionStream({
  model: "meta-llama-3.1-8b-instruct",
  prompt: "Write a blog about Seattle",
});

let result = "";

// Loops over the returned chunks whenever they're ready
for await (const chunk of stream) {
  result += chunk.choices[0].text;
}

console.log(result);
// "Seattle is a vibrant and eclectic city..."
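
For interactive interfaces, it is often more useful to surface each piece of text as it arrives than to wait for the full string. The sketch below assumes a hypothetical `forwardCompletionStream` helper and a local `CompletionChunkSketch` type that mirrors only the `choices[0].text` field used above; a mock generator stands in for the SDK stream.

```typescript
// Local stand-in for a streamed completion chunk; mirrors only `choices[0].text`.
interface CompletionChunkSketch {
  choices: { text?: string }[];
}

// Hypothetical helper: hand each non-empty piece of text to a callback as it
// arrives, and also return the full concatenated text at the end.
async function forwardCompletionStream(
  stream: AsyncIterable<CompletionChunkSketch>,
  onText: (text: string) => void,
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    const text = chunk.choices[0]?.text ?? "";
    if (text.length === 0) continue; // skip chunks that carry no text
    onText(text);
    full += text;
  }
  return full;
}

// Mock stream standing in for the SDK's stream, for illustration only.
async function* mockCompletionStream(): AsyncGenerator<CompletionChunkSketch> {
  yield { choices: [{ text: "Seattle " }] };
  yield { choices: [{}] };
  yield { choices: [{ text: "is vibrant." }] };
}
```

The callback could append to a UI element or write to standard output, while the returned string remains available for logging or storage.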