
Pricing & Plans

Get started today on OctoAI and receive $10 of free credit in your account.



OctoAI provides products that enable builders to create the next generation of AI applications.


Text Gen Solution

Build on your choice of LLMs like Llama 2, Code Llama, Mistral, and Mixtral against one unified API endpoint, or bring your own checkpoint.


Media Gen Solution

Easily customize (fine-tune) Stable Diffusion models and seamlessly scale usage with no impact to image generation or animation speed or quality.

OctoStack


OctoStack allows you to run your choice of models in your environment, including any cloud platform, VPC, or on-premise, ensuring full control over your data.

Only pay for what you use

OctoAI applies its AI systems expertise to accelerate foundation models. We pass the resulting gains in latency and speed back to you as reduced inference pricing.



Run your choice of models on our reliable and scalable compute


Better user experience

Lower latencies and higher speeds give your users a faster, more responsive app.


Cost Savings

We pass performance improvements on to you as some of the lowest inference costs in the market.


Get started at no cost

All new sign-ups get $10 of free usage on OctoAI.

Frequently asked questions

Don’t see the answer to your question here? Feel free to reach out so we can help.

What is OctoAI?

OctoAI is an efficient, customizable, and reliable platform for GenAI inference, so you can build and scale your production applications. The OctoAI compute service is an efficient serverless compute layer for running your choice of OSS, fine-tuned, or custom models. OctoAI solutions are built on the OctoAI compute service.

How does billing work?

At sign up, you get $10 of free credit, which doesn't expire. You can enter your credit card at any time and pay for your use at the end of each month. Free credits are always used before any credit card charges apply.
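
The order in which credit and card charges apply can be illustrated with a small sketch. This is not OctoAI's billing code: the function name `settle_month` and the usage amounts are hypothetical, and the only facts taken from the answer above are the $10 starting credit, that it does not expire, and that free credit is consumed before any card charge.

```python
from decimal import Decimal
from typing import Tuple

# From the FAQ: new accounts start with $10 of free credit that does not expire.
FREE_CREDIT = Decimal("10.00")


def settle_month(usage: Decimal, credit_balance: Decimal) -> Tuple[Decimal, Decimal]:
    """Hypothetical helper: return (card_charge, remaining_credit) for one month.

    Free credit is applied first; only usage beyond the remaining
    credit is charged to the card at the end of the month.
    """
    if usage < 0 or credit_balance < 0:
        raise ValueError("usage and credit_balance must be non-negative")
    credit_applied = min(usage, credit_balance)
    return usage - credit_applied, credit_balance - credit_applied


# Illustrative amounts only (not real prices).
charge_1, credit = settle_month(Decimal("6.50"), FREE_CREDIT)
charge_2, credit = settle_month(Decimal("8.00"), credit)
print(charge_1, charge_2, credit)
```

In this made-up scenario the first month is fully covered by credit, and the second month is charged only for the part of usage that exceeds the remaining credit.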

Can custom or automated workflows be created?

We do not have a generalized workflow builder, but our demos show examples of building pipelines from various models to create GenAI apps.

Do you offer enterprise pricing? What are the additional features?

Contact us for this tier of pricing. Additional features include inference in your private environment with OctoStack, access to our experts to accelerate your model, reserved hardware, and priority support.

What about privacy and security?

OctoAI is SOC 2 Type II certified. Keeping our customers’ data private and secure is a top priority, and we have internal systems to ensure appropriate handling of customer data. Your data is never used for training purposes.

The SOC 2 Type II certification provides independent validation of these processes and safeguards. We do not persist the inputs, outputs, or intermediate computations of your inferences, except for runtime logs that you choose to expose in your container. For encryption in transit, all connections from customers to the OctoAI compute service require TLS, without you having to manage TLS certificates yourself. We also use encryption at rest for any data that we write to disk.

Do you use customer data for training?

No. Your data is never used for training purposes. See more details about our SOC 2 compliance and data policies.

Start building with ease in minutes using OctoAI

We enable users to harness the value from AI innovations to build the next generation of intelligent applications. Sign up and enjoy the freedom to choose your model, infrastructure, and deployment templates.
