ControlNet Stable Diffusion 1.5 API

Generate an image using a ControlNet Stable Diffusion 1.5 (SD1.5) model.

OctoAI’s SD1.5 ControlNet API supports both text-to-image and image-to-image use cases, and works with custom assets like LoRAs, checkpoints, VAEs, and textual inversions. We offer the following public OctoAI SD1.5 ControlNet checkpoints in the OctoAI Asset Library:

octoai:canny_sd15
octoai:depth_sd15
octoai:inpaint_sd15
octoai:ip2p_sd15
octoai:lineart_sd15
octoai:openpose_sd15
octoai:scribble_sd15
octoai:tile_sd15

In addition to using the default ControlNet checkpoints, you can also upload your own private ControlNet checkpoints to the OctoAI Asset Library. These custom checkpoints can then be utilized during generation by specifying the controlnet parameter. When using custom ControlNet checkpoints, please ensure you provide your own ControlNet mask using the controlnet_image parameter.

You need to create an OctoAI Authentication Token to access this API.

How to use

Invoke the https://image.octoai.run/generate/controlnet-sd15 endpoint with a POST request.

The request headers must include an Authentication Token in the Authorization field. The Accept header should be set to application/json to receive the image encoded as base64 in a JSON response.

Generating with a prompt: Commonly referred to as text-to-image, this mode generates an image from text alone. The relevant parameters are listed below, followed by a minimal request sketch.

  • prompt - text to generate the image from
  • controlnet - Required if using a ControlNet engine. Takes the name of the ControlNet to use during image generation.
  • controlnet_image - Required if using a ControlNet engine. The ControlNet image, encoded as a base64 string, used to guide image generation.
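For example, a minimal text-to-image request with the public canny checkpoint might look like the following sketch. The file name guide.jpg is a placeholder for your own input image; with the public checkpoints, automatic ControlNet preprocessing generates the canny map for you (see controlnet_preprocess below).

# Minimal text-to-image request: a prompt plus the two required ControlNet arguments.
# guide.jpg is a placeholder; with the public checkpoints, the matching
# ControlNet map is auto-generated from it by default.
BASE64_CONTROL=$(base64 < guide.jpg | tr -d '\n')

curl -X POST "https://image.octoai.run/generate/controlnet-sd15" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OCTOAI_TOKEN" \
  -d '{
    "prompt": "A photo of a cute tiger astronaut in space",
    "controlnet": "octoai:canny_sd15",
    "controlnet_image": "'"$BASE64_CONTROL"'"
  }' > response.json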

Generating with a prompt and an image: Commonly referred to as image-to-image, this mode also generates an image from text but uses an existing image as the starting point. The required parameters, illustrated in the sketch after this list, are:

  • prompt - text to generate the image from
  • init_image - the image to use as the starting point for the generation. Argument takes an image encoded as a string in base64 format.
  • strength - controls how much influence the image parameter has on the output image
  • controlnet - Required if using a ControlNet engine. Takes the name of the ControlNet to use during image generation.
  • controlnet_image - Required if using a ControlNet engine. The ControlNet image, encoded as a base64 string, used to guide image generation.
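A sketch of an image-to-image request is shown below. photo.jpg and guide.jpg are placeholders for your own starting image and ControlNet guide image, and the prompt and strength values are illustrative.

# Image-to-image request: init_image is the starting point and strength controls
# how much the output may deviate from it. File names are placeholders.
BASE64_INIT=$(base64 < photo.jpg | tr -d '\n')
BASE64_CONTROL=$(base64 < guide.jpg | tr -d '\n')

curl -X POST "https://image.octoai.run/generate/controlnet-sd15" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OCTOAI_TOKEN" \
  -d '{
    "prompt": "A watercolor painting of a tiger astronaut in space",
    "init_image": "'"$BASE64_INIT"'",
    "strength": 0.6,
    "controlnet": "octoai:openpose_sd15",
    "controlnet_image": "'"$BASE64_CONTROL"'"
  }' > response.json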

Generating with a prompt and a custom asset: This mode generates an image from text but uses a custom checkpoint, LoRA, textual inversion, or VAE. Note that using a custom asset increases generation time. The relevant parameters are listed below, with a request sketch after the list.

  • prompt - text to generate the image from
  • controlnet - Required if using a ControlNet engine. Takes the name of the ControlNet to use during image generation.
  • controlnet_image - Required if using a ControlNet engine. The ControlNet image, encoded as a base64 string, used to guide image generation.
  • checkpoint - Here you can specify a checkpoint either from the OctoAI asset library or your private asset library.
  • loras - Here you can specify LoRAs, in name-weight pairs, either from the OctoAI asset library or your private asset library.
  • textual_inversions - Here you can specify textual inversions and their corresponding trigger words.
  • vae - Here you can specify variational autoencoders.
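Assuming your custom assets have already been uploaded to the Asset Library, a request using them could look like the sketch below. The asset names my-checkpoint and my-style-lora are hypothetical, and expressing the loras name-weight pairs as a JSON object is an assumption; check the request schema below for the exact format.

# Custom-asset request: a private checkpoint plus a LoRA applied with weight 0.7.
# Asset names are placeholders for assets in your own asset library.
BASE64_CONTROL=$(base64 < guide.jpg | tr -d '\n')

curl -X POST "https://image.octoai.run/generate/controlnet-sd15" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OCTOAI_TOKEN" \
  -d '{
    "prompt": "A photo of a cute tiger astronaut in space",
    "controlnet": "octoai:canny_sd15",
    "controlnet_image": "'"$BASE64_CONTROL"'",
    "checkpoint": "my-checkpoint",
    "loras": {"my-style-lora": 0.7}
  }' > response.json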

For more details about all parameters, please see the request schema below.

Output

The generated image defaults to a resolution of 512x512 pixels; other supported resolutions can be requested with the width and height parameters (see below).

Pricing

  • SD1.5 ControlNet: $0.003 per image

Check the Pricing Page for more details.

Request Details

Headers:

Authorization (Required): Your OCTOAI_TOKEN
Content-Type (Required): Set to application/json

Parameters:

  • prompt (string [up to 77 tokens], Required): A string of text describing the image to generate. You can use prompt weighting, e.g. (A tall (beautiful:1.5) woman:1.0) (some other prompt with weight:0.8). A token's weight is the product of the weights of all brackets it is a member of; in the first example, beautiful receives a weight of 1.5 × 1.0 = 1.5. The brackets, colons, and weights do not count towards the number of tokens.
  • negative_prompt (string, Optional): Text describing image traits to avoid during generation.
  • sampler (string, Optional): A string specifying which scheduler to use when generating an image. Defaults to DDIM. Regular samplers include DDIM, DDPM, DPM_PLUS_PLUS_2M_KARRAS, DPM_SINGLE, DPM_SOLVER_MULTISTEP, K_EULER, K_EULER_ANCESTRAL, PNDM, and UNI_PC. Premium samplers (2x price) include DPM_2, DPM_2_ANCESTRAL, DPM_PLUS_PLUS_SDE_KARRAS, HEUN, and KLMS.
  • cfg_scale (double, Optional): Floating-point number representing how closely to adhere to the prompt description. Must be a positive number no greater than 50.0. Defaults to 12.
  • image_encoding (enum, Optional): Defines which encoding process should be applied before returning the generated image(s). Allowed values: jpeg, png.
  • num_images (integer, Optional): Integer representing how many output images to generate with a single prompt/configuration. Defaults to 1. Allowed values: 1-16.
  • seed (union, Optional): Integer number or list of integers representing the seeds of random generators. Fixing random seed is useful when attempting to generate a specific image. Must be greater than 0 and less than 2^32.
  • steps (integer, Optional): Integer representing how many steps of diffusion to run. Must be greater than 0 and less than or equal to 200. Defaults to 30.
  • init_image (string, Optional): The image (encoded in b64 string) to use as the starting point for the generation. This parameter is for Image-to-Image generation and Inpainting.
    Use .jpg format to ensure best latency
  • strength (double, Optional): Floating-point number indicating how creative the image-to-image generation should be. Must be greater than 0 and less than or equal to 1.0. Defaults to 0.8. This parameter is for Image-to-Image generation.
  • height (integer, Optional): Integer representing the height of the image to generate. Defaults to 512.
  • width (integer, Optional): Integer representing the width of the image to generate. Defaults to 512.

Supported Output Resolutions (Width x Height) are as follows:

SD1.5:

(512, 512), (640, 512), (768, 512), (512, 704),
(512, 768), (576, 768), (640, 768), (576, 1024),
(1024, 576)

  • use_refiner (boolean, Optional): Boolean determining whether or not to use the refiner.
  • high_noise_frac (double, Optional): A floating-point number determining how much noise should be applied using the base model vs. the refiner. A value of 0.8 will apply the base model at 80% and the refiner at 20%. Defaults to 0.8 when not set.

ControlNet parameters

  • controlnet (string, Required if using a ControlNet engine): Takes the name of the ControlNet to use during image generation.
  • controlnet_image (string, Required if using a ControlNet engine): The ControlNet image, encoded as a base64 string, used to guide image generation.
  • controlnet_conditioning_scale (double, Optional): Only applicable when using ControlNets. Determines how strong the effect of the ControlNet will be. Defaults to 1.
  • controlnet_early_stop (integer, Optional): Only applicable when using ControlNets. If provided, indicates the fraction of steps at which to stop applying the ControlNet. This can sometimes be used to generate better outputs.
  • controlnet_preprocess (boolean, Optional): Only applicable when using ControlNets. Determines whether or not to apply automatic ControlNet preprocessing. For the public ControlNet checkpoints listed above, we default to auto-generating the corresponding ControlNet map/mask that is fed into the ControlNet; you can override this by passing controlnet_preprocess: false, as in the sketch below.
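The sketch below shows how these tuning parameters might be combined when you supply your own pre-computed ControlNet map (my_canny_map.jpg is a placeholder file name):

# Supplying a pre-computed canny map, so automatic preprocessing is disabled,
# and weakening the ControlNet's influence with a lower conditioning scale.
BASE64_CONTROL=$(base64 < my_canny_map.jpg | tr -d '\n')

curl -X POST "https://image.octoai.run/generate/controlnet-sd15" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OCTOAI_TOKEN" \
  -d '{
    "prompt": "A photo of a cute tiger astronaut in space",
    "controlnet": "octoai:canny_sd15",
    "controlnet_image": "'"$BASE64_CONTROL"'",
    "controlnet_preprocess": false,
    "controlnet_conditioning_scale": 0.5
  }' > response.json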

Custom Assets

  • checkpoint (string, Optional): Here you can specify a checkpoint either from the OctoAI asset library or your private asset library. Note that using a custom asset increases generation time.

  • loras (string, Optional): Here you can specify LoRAs, in name-weight pairs, either from the OctoAI asset library or your private asset library. Note that using a custom asset increases generation time.

  • textual_inversions (string, Optional): Here you can specify textual inversions and their corresponding trigger words. Note that using a custom asset increases generation time.

  • vae (string, Optional): Here you can specify variational autoencoders. Note that using a custom asset increases generation time.

Request Examples

# BASE64_IMAGE holds your base64-encoded ControlNet input image, for example:
# BASE64_IMAGE=$(base64 < canny_map.jpg | tr -d '\n')

curl -X POST "https://image.octoai.run/generate/controlnet-sd15" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OCTOAI_TOKEN" \
  -d '{
    "controlnet_image": "'"$BASE64_IMAGE"'",
    "controlnet": "octoai:canny_sd15",
    "controlnet_preprocess": false,
    "prompt": "A photo of a cute tiger astronaut in space",
    "negative_prompt": "low quality, bad quality, sketches, unnatural",
    "steps": 20,
    "num_images": 1,
    "seed": 768072361,
    "height": 512,
    "width": 512
  }' > response.json
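The response written to response.json contains the generated image(s) as base64 strings. Assuming the payload follows the common OctoAI image-generation shape of an images array with an image_b64 field per image (verify against the response schema), the first image can be extracted like this:

# Decode the first generated image from the JSON response.
# The .images[0].image_b64 path is an assumption about the response shape;
# adjust it to match the actual schema if it differs.
jq -r '.images[0].image_b64' response.json | base64 --decode > result.jpg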