Using Structured Outputs (JSON mode) with Text Gen endpoints
Ensure Text Gen outputs fit into your desired JSON schema.
OctoAI’s Large Language Models (LLMs) can generate outputs that not only adhere to JSON format but also align with your unique schema specifications. This guide covers two approaches to JSON mode: OpenAI Compatible JSON mode for Llama-3.1-8B and 70B, and Legacy JSON mode.
**Supported models**
- Llama 3.1 8B
- Llama 3.1 70B
OpenAI Compatible JSON mode for Llama-3.1-8B and 70B
This section covers the new JSON mode, compatible with OpenAI’s response format standard, specifically for the Llama-3.1-8B and 70B models.
Setup
First, set up the OpenAI client and configure it to use the OctoAI base URL and your OctoAI API token.
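A minimal setup sketch is shown below. The base URL and the `OCTOAI_API_TOKEN` environment variable name are assumptions; substitute the endpoint and token from your OctoAI account.

```python
# A minimal client-setup sketch. The base URL and the environment variable
# name are assumptions; substitute the values from your OctoAI account.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://text.octoai.run/v1",   # assumed OctoAI Text Gen endpoint
    api_key=os.environ["OCTOAI_API_TOKEN"],  # your OctoAI API token
)
```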
Generate JSON without adhering to any schema (json_object)
If you want the response as a JSON object but without any specific schema:
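A minimal sketch, assuming the client configured above and an illustrative model identifier:

```python
# Sketch: request free-form JSON with no schema enforced (json_object mode).
# The model identifier is an assumption; use the one listed for your account.
completion = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {"role": "system", "content": "Respond in JSON."},
        {"role": "user", "content": "Give me the sunrise and sunset times for Seattle today."},
    ],
    response_format={"type": "json_object"},
)
print(completion.choices[0].message.content)
```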
Generate JSON adhering to a schema (without constrained decoding)
Use this mode to generate JSON that adheres to a simple schema, but without strict (guaranteed) schema enforcement (note `"strict": False` in the sketch below). This mode is faster and works on both Llama-3.1-8B-Instruct and Llama-3.1-70B-Instruct. For most use cases it is sufficient and recommended.
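A sketch of a non-strict schema request, assuming an illustrative `person` schema and model identifier:

```python
# Sketch: JSON following a simple schema without strict enforcement.
# The "person" schema and the model identifier are illustrative assumptions.
completion = client.chat.completions.create(
    model="meta-llama-3.1-70b-instruct",
    messages=[
        {"role": "user", "content": "Extract the name and age from: 'Ana is 29 years old.'"},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "strict": False,  # best-effort schema following, lower latency
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
            },
        },
    },
)
print(completion.choices[0].message.content)
```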
Generate JSON adhering to a schema (with constrained decoding)
When you need strict adherence to a JSON schema, you can activate this mode on Llama-3.1-8B-Instruct only. It is recommended for more complex schemas, though activating it can increase latency.
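A sketch of a strict (constrained-decoding) request, assuming an illustrative `cars` schema; per the note above, this mode targets Llama-3.1-8B-Instruct only.

```python
# Sketch: strict schema enforcement via constrained decoding; per the guide,
# supported on Llama-3.1-8B-Instruct only. Schema and model name are illustrative.
completion = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[
        {"role": "user", "content": "List two classic cars with their horsepower."},
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "cars",
            "strict": True,  # output is constrained to match the schema
            "schema": {
                "type": "object",
                "properties": {
                    "cars": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "model": {"type": "string"},
                                "horsepower": {"type": "integer"},
                            },
                            "required": ["model", "horsepower"],
                            "additionalProperties": False,
                        },
                    }
                },
                "required": ["cars"],
                "additionalProperties": False,
            },
        },
    },
)
print(completion.choices[0].message.content)
```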