Pricing & Plans

Get started today on OctoAI and receive $10 of free credit in your account.


OctoAI provides products that enable builders to create the next generation of AI applications.

Text Gen Solution

Build on your choice of OSS LLMs like Llama 2, Code Llama, Mistral, and Mixtral against one unified API endpoint.

Image Gen Solution

Easily customize (fine-tune) Stable Diffusion and seamlessly scale usage with no impact on image generation speed or quality.

Compute Service

Start building your AI-powered app on OctoAI’s cost-efficient compute service. Run or tune one of our ready-to-deploy open-source models, or bring your own custom model.

Only pay for what you use

OctoAI applies deep AI systems expertise to accelerate foundation models. This lets us pass the performance gains from lower latency and higher speed on to you as reduced inference pricing.

Run your choice of models on our reliable and scalable compute

Better user experience

Lower latencies and higher speeds mean your users only experience the snappiest and best app performance

Cost Savings

We pass on the performance improvements as some of the lowest inference costs in the market

Get started at no cost

All new sign ups get $10 of free usage on OctoAI

Frequently asked questions

Don’t see the answer to your question here? Feel free to reach out so we can help.

What is the OctoAI compute service?

OctoAI is a platform to run, tune, and scale generative AI for your applications. The OctoAI compute service is an efficient serverless compute layer that lets you run your choice of OSS (e.g., Llama 2, WhisperX), fine-tuned, or custom models. OctoAI solutions are built on the OctoAI compute service.

How does billing work?

At sign-up, you get $10 of free credit, which can be used until the end of the first month. You can enter your credit card at any time, and your account will be automatically charged to keep your credit replenished. Each charge will be a minimum of $10 or a maximum amount that you set. We will auto-reload your account when the balance reaches 10% of your reload amount. Your account must have a positive balance for you to use the service.
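
To make the auto-reload arithmetic concrete, here is a minimal sketch of the rule described above. It is an illustration only, not OctoAI's billing code: the function names are hypothetical, and it assumes that "reaches 10%" means "at or below 10%", that usage is deducted before the check, and that a single reload adds exactly the reload amount.

```python
from decimal import Decimal

# Values taken from the billing answer above.
MIN_RELOAD = Decimal("10")        # smallest reload amount, in dollars
RELOAD_TRIGGER = Decimal("0.10")  # reload once balance hits 10% of the reload amount


def reload_threshold(reload_amount: Decimal) -> Decimal:
    """Return the balance at which an auto-reload would be triggered."""
    if reload_amount < MIN_RELOAD:
        raise ValueError("reload amount must be at least $10")
    return reload_amount * RELOAD_TRIGGER


def balance_after_usage(balance: Decimal, usage: Decimal, reload_amount: Decimal) -> Decimal:
    """Deduct usage, then apply one reload if the balance is at or below the threshold.

    Assumption (hypothetical): a reload adds exactly `reload_amount` once.
    """
    new_balance = balance - usage
    if new_balance <= reload_threshold(reload_amount):
        new_balance += reload_amount
    return new_balance


if __name__ == "__main__":
    # With a $10 reload amount, the trigger point is a $1 balance.
    print(reload_threshold(Decimal("10")))
    # Spending $9.50 of a $10 balance drops it to $0.50, which triggers a reload.
    print(balance_after_usage(Decimal("10"), Decimal("9.50"), Decimal("10")))
```

Under these assumptions, a larger reload amount raises the trigger point proportionally: a $50 reload amount would trigger at a $5 balance.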

Do you offer enterprise pricing? What are the additional features?

Contact us for this tier of pricing. Additional features include inference in your private environment, access to our experts to accelerate your model, reserved hardware, and priority support.

What about privacy and security?

OctoML is SOC 2 Type II certified. Keeping our customers’ data private and secure is a top priority, and we have internal systems to ensure appropriate handling of customer data. The SOC 2 Type II certification provides independent validation of these processes and safeguards. We do not persist the inputs, outputs, or intermediate computations of your inferences, except for runtime logs that you choose to expose in your container. For encryption in transit, all connections from customers to the OctoAI compute service require TLS, without you having to manage TLS certificates yourself. We also use encryption at rest for any data that we write to disk.

Start building with ease in minutes using OctoAI

Our mission at OctoML is to make AI sustainable and accessible so that developers are liberated to build the next generation of intelligent applications. Sign up and enjoy the freedom to choose your model, infrastructure, and deployment templates.
