Text Gen REST API
All OctoAI text generation models are accessible via REST API. Learn how to call them with easy-to-follow code examples.
All of our text generation models are accessible via REST API, and we follow the “Chat Completions” standard popularized by OpenAI. Below you can see a simple cURL example and JSON response for our endpoint, along with explanations of all parameters.
Input Sample
cURL
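A minimal request sketch is shown below. The endpoint URL and model name are illustrative — check the supported-model list for valid values, and set `OCTOAI_TOKEN` to your own API token:

```shell
# Sketch of a chat completion request; endpoint URL, model name, and
# token variable are placeholders to substitute with your own values.
curl -X POST "https://text.octoai.run/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OCTOAI_TOKEN" \
  -d '{
    "model": "llama-2-13b-chat",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a haiku about clouds."}
    ],
    "max_tokens": 128,
    "temperature": 0.7,
    "stream": false
  }'
```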
Input Parameters
- model (string): The model to be used for chat completion. Here is the complete list of presently supported model arguments. For more information regarding these models, see this description.
- max_tokens (integer, optional): The maximum number of tokens to generate for the chat completion.
- messages (list of objects): A list of chat messages, where each message is an object with `role` and `content` properties. Supported roles are “system”, “assistant”, and “user”.
- temperature (float, optional): A value between 0.0 and 2.0 that controls the randomness of the model’s output.
- top_p (float, optional): A value between 0.0 and 1.0 that controls the probability of the model generating a particular token.
- stop (list of strings, optional): A list of strings; the model stops generating further text as soon as it encounters any of them.
- frequency_penalty (float, optional): A value between 0.0 and 1.0 that controls how strongly the model penalizes tokens in proportion to how often they have already appeared, reducing repetition.
- presence_penalty (float, optional): A value between 0.0 and 1.0 that controls how strongly the model penalizes tokens that have already appeared at all, encouraging it to introduce new topics.
- stream (boolean, optional): Indicates whether the response should be streamed.
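Putting these parameters together, the request body is a plain JSON object. Here is a small sketch in Python; the helper function and model name are illustrative, not part of any SDK:

```python
import json

def build_chat_request(model, messages, max_tokens=128,
                       temperature=0.7, stream=False):
    """Assemble a Chat Completions request body from the parameters above."""
    for m in messages:
        # Only the roles documented above are accepted.
        assert m["role"] in ("system", "assistant", "user"), "unsupported role"
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": stream,
    }

body = build_chat_request(
    "llama-2-13b-chat",  # illustrative model name
    [{"role": "system", "content": "You are a helpful assistant."},
     {"role": "user", "content": "Hello!"}],
)
print(json.dumps(body, indent=2))
```

The resulting dictionary is what gets serialized into the `-d` payload of the cURL request.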
Non-Streaming Response Sample:
JSON
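A non-streaming response has roughly the following shape, matching the fields documented under Response Parameters below; the identifier, timestamp, model name, and token counts here are illustrative:

```json
{
  "id": "chatcmpl-abc123",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?",
        "function_call": null
      },
      "delta": null,
      "finish_reason": "stop"
    }
  ],
  "created": 1700000000,
  "model": "llama-2-13b-chat",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 9,
    "total_tokens": 34
  }
}
```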
Streaming Response Sample:
Once parsed to JSON, you will see the content of the streaming response similar to below:
JSON
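A parsed streaming chunk looks roughly like the example below. Note that the incremental text arrives in `delta` rather than `message`, and the `object` type for chunks is typically `chat.completion.chunk`; all values shown are illustrative:

```json
{
  "id": "chatcmpl-abc123",
  "choices": [
    {
      "index": 0,
      "delta": {
        "role": "assistant",
        "content": "Hello"
      },
      "finish_reason": null
    }
  ],
  "created": 1700000000,
  "model": "llama-2-13b-chat",
  "object": "chat.completion.chunk"
}
```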
Without parsing, each chunk in the raw text stream is prefixed with `data:`. Below is an example. Please note that the final chunk is simply `data: [DONE]` as plain text, which can break JSON parsing if not accounted for.
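Consuming the raw stream therefore means stripping the `data:` prefix and skipping the final `[DONE]` sentinel before JSON-parsing each chunk. A minimal sketch over an iterable of raw lines (the simulated chunk shapes are illustrative):

```python
import json

def iter_stream_chunks(lines):
    """Yield parsed JSON chunks from a raw 'data: ...' event stream,
    skipping blank keep-alive lines and the final '[DONE]' sentinel."""
    for line in lines:
        line = line.strip()
        if not line or not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker, not JSON
        yield json.loads(payload)

# Simulated stream with illustrative chunk shapes.
raw = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    '',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    'data: [DONE]',
]
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in iter_stream_chunks(raw)
)
print(text)  # prints "Hello!"
```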
Response Parameters
- id (string): A unique identifier for the chat completion.
- choices (list of objects): A list of chat completion choices, each represented as an object with the following fields:
  - index (integer): The position of the choice in the list of generated completions.
  - message (object): An object representing the content of the chat completion, which includes:
    - role (string): The role associated with the message, typically “assistant” for the generated response.
    - content (string): The actual text content of the chat completion.
    - function_call (object or null): An optional field that may contain information about a function call made within the message. It is usually `null` in standard responses.
  - delta (object or null): An optional field that can contain additional metadata about the message, typically `null`.
  - finish_reason (string): The reason why the message generation was stopped, such as reaching the maximum length (`"length"`).
- created (integer): The Unix timestamp (in seconds) of when the chat completion was created.
- model (string): The model used for the chat completion.
- object (string): The object type, which is always `chat.completion`.
- system_fingerprint (object or null): An optional field that may contain system-specific metadata.
- usage (object): Usage statistics for the completion request, detailing token usage in the prompt and completion.
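Once the response is parsed, the generated text and token usage can be read straight from these fields. A short sketch against an illustrative response object of the shape described above:

```python
# Illustrative response shaped after the fields documented above.
response = {
    "id": "chatcmpl-abc123",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi there!"},
            "finish_reason": "stop",
        }
    ],
    "created": 1700000000,
    "model": "llama-2-13b-chat",
    "object": "chat.completion",
    "usage": {"prompt_tokens": 12, "completion_tokens": 4, "total_tokens": 16},
}

choice = response["choices"][0]
answer = choice["message"]["content"]
# A finish_reason of "length" means generation hit max_tokens.
truncated = choice["finish_reason"] == "length"
total_tokens = response["usage"]["total_tokens"]
print(answer, truncated, total_tokens)
```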