OctoAI Provides Fastest Stable Diffusion XL Endpoint

Aug 16, 2023 · 5 minutes
2X Faster SDXL with Feature-Rich AI Customization Capabilities

OctoAI delivers generative AI infrastructure to run, tune, and scale models so that developers can focus on building apps and services that leverage powerful AI models. Our commitment to users is to provide easy and efficient access to the latest AI innovations. That is why, three weeks ago, we delivered a Stable Diffusion XL (SDXL) endpoint two days after its market release.

Stability AI did incredible work developing SDXL as a powerful model for generating detailed images with minimal prompting. The role of OctoAI is to deliver SDXL's unique capabilities to our customers in an efficient and developer-friendly way.

[Image: Stable Diffusion XL images generated on OctoAI with minimal prompting]

Since the release of SDXL, our expert ML engineers have been hard at work on an accelerated version of this powerful model. Today, we’re delighted to announce that we’ve delivered the fastest SDXL endpoint on the market. The best part? The 2X speedup does not impact image quality at all, and it comes with the ability to sustain high request loads with strong reliability. The new accelerated endpoint is available right now if you want to experience the blazing fast performance yourself.

Why performance matters

While the SDXL model can deliver higher quality images than its predecessors, it takes considerably longer to generate them because of its size and complexity. Using OctoAI, organizations that want to deliver the best experience to users of SDXL applications can realize a 2X performance boost over Stability AI, one of the open source sponsors of the model. The TL;DR is that OctoAI can generate the highest-quality AI-generated images on the market in under 10 seconds.

[Chart: OctoAI vs. Stability AI Stable Diffusion XL endpoint latency in seconds, with OctoAI being significantly faster]

To get these results, we ran k6 load tests to collect the end-to-end latency of image generations at scale. OctoAI consistently generated images in 9.6 seconds, whereas Stability AI showed more variability, with latencies reaching up to 40 seconds per image at the upper end of the range.
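As a rough illustration of how end-to-end latency samples from a load test can be summarized, here is a minimal Python sketch. It is not the script behind the numbers above: the function name `summarize_latencies` and the choice of median, nearest-rank 95th percentile, and maximum are assumptions made for this example, and the input is whatever list of per-request latencies (in seconds) your load-testing tool exports.

```python
from statistics import median


def summarize_latencies(samples_s):
    """Summarize per-request latencies (seconds) from a load test.

    Hypothetical helper for illustration; not part of k6 or OctoAI.
    Returns the median, the nearest-rank 95th percentile, and the maximum.
    """
    if not samples_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_s)
    n = len(ordered)
    # Nearest-rank p95: the smallest sample such that at least 95% of all
    # samples are less than or equal to it. Integer math avoids float rounding.
    rank = (95 * n + 99) // 100  # ceil(0.95 * n)
    return {
        "median": median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }
```

A consistent endpoint shows a p95 and maximum close to its median, while a variable one shows a long tail, which is why comparing only averages can hide the difference users actually feel.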

If you need even lower latency, say around 5 seconds or less, we can help. Get in touch with us here and we’ll schedule a time to chat with our engineers about the options available to you.

The end user experience and pricing advantages that OctoAI delivers give teams and companies building apps and services on OctoAI a competitive advantage over those trying to do so with Stability AI.

Speed + customization FTW

Model speed is just one tool in the arsenal of AI app developers who want to delight their end users. Another area to lean on OctoAI for differentiation is the freedom to choose the best foundation models on the market. That is why we are showcasing a free sample application we call “Octoshop.”

[Image: OctoShop, an OctoAI demo app, uses SDXL and Llama2 to transform images]

Octoshop combines the power of two of the most impactful foundation models to have come along this year: SDXL and Llama2, Meta’s large language model (LLM) that has a permissive license for commercial use. Octoshop is an example of what developers can achieve with an easy-to-use OctoAI endpoint that removes the hassle of creating and managing AI infrastructure. Developers get to just focus on the application innovation while OctoAI seamlessly manages the behind-the-scenes complexities of running, tuning and scaling both text-to-image and text-to-text models.

Define your own unique style with fine-tuning

We want to close out this post by highlighting the new fine-tuning innovations that we have released with support and guidance from our design partners Civit.ai and Extropolis. Both companies have been incredible to collaborate with since our early access release in May, and they have helped us understand that the foundation model is just a starting point for building GenAI-powered apps. There is a critical need to customize (aka tune) the model to align with the needs of the app or service you want to deliver.

To that end, we are excited to announce that we have just released the most comprehensive and flexible customization capabilities on the market for image generation based upon Stable Diffusion. Fine-tuning in OctoAI is available in private preview now.

Fine-tuning is a method for creating a new asset that represents a specific person, object, or style. With 10-15 training images and about 10 minutes of tuning, a base image generation model (like Stable Diffusion or SDXL) learns to generate images of the concept you want and stores that knowledge in an asset file.

[Image: OctoAI fine-tuning, a cupcake in different backgrounds]

Many such fine-tuned model assets are shared on image-gen community hubs such as CivitAI, Reddit, HuggingFace, and others. Many businesses, however, will decide to keep these fine-tuned assets private, as they are tuned to feature a specific product, brand identity, or unique style. Here’s an example: let’s say you’re developing a VR headset and need photorealistic images of humans wearing VR glasses to build a global advertising campaign. The photos need to feature diverse models but maintain a consistent style. Check out this video from Octonaut Jordan Janes to see how you can accomplish this task in just under 3 minutes with fine-tuning in OctoAI.

[Image: Two people, a woman and a man, generated by SDXL as a product photography example]

Putting it all together

Not only can you now fine-tune assets directly in OctoAI, the platform also enables users to discover and use existing tuned assets, whether from a community hub, your organization’s own private assets, or a blend of the two. This suite of tooling helps you generate images that are much more aligned to your brand identity and creative palette than the limited set of style options on Stability AI.

For example, let’s say I am a developer building a travel app. I want to generate images that help my users visualize themselves being in Tokyo.

[Image: Two scenes in Tokyo generated by SDXL as a travel photography example]

Using OctoAI, I can create a fine-tuning workflow to represent each user’s features as a private asset, then mix each private asset with a community asset representing the style of Tokyo. Using an OctoAI API, I can repeat the same for all of my users and all the cities I want to showcase in my app, storing terabytes of assets on OctoAI. OctoAI helps me cache and load these assets efficiently, so I can benefit from customization while still getting the best price/performance in image generation.
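To make the travel app workflow more concrete, here is a minimal Python sketch of the "repeat for all users and all cities" step. Everything in it is an assumption made for illustration: the function name `build_generation_jobs`, the asset names, the `style_weight` parameter, and the job fields (`prompt`, `assets`) are hypothetical and do not describe the actual OctoAI API. The sketch only builds job descriptions; submitting them to an endpoint is left out.

```python
from itertools import product


def build_generation_jobs(user_assets, city_style_assets, style_weight=0.8):
    """Pair every user's private asset with every city style asset.

    user_assets: dict of user id -> private fine-tuned asset name (hypothetical).
    city_style_assets: dict of city -> community style asset name (hypothetical).
    style_weight: illustrative blend weight for the style asset.
    """
    jobs = []
    for (user_id, user_asset), (city, style_asset) in product(
        user_assets.items(), city_style_assets.items()
    ):
        jobs.append(
            {
                "user_id": user_id,
                "city": city,
                "prompt": f"photo of {user_asset} in {city}",
                # Hypothetical field: asset name -> blend weight.
                "assets": {user_asset: 1.0, style_asset: style_weight},
            }
        )
    return jobs
```

For example, `build_generation_jobs({"u1": "u1_face"}, {"Tokyo": "tokyo_style"})` yields one job that mixes the user's private asset with the Tokyo style asset. Because the number of jobs grows as users times cities, efficient caching and loading of assets matters for price/performance.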

OctoAI’s industry-leading model speed, customization capabilities, and scalable infrastructure help developers build world-class generative AI applications. The images generated by your apps will not only be aligned to your unique brand identity but also personalized to each of your end customers. This in turn leads to sticky product experiences and customer loyalty.

Sign up to try OctoAI today and receive 25 free GPU hours to start building. Then, join our Fine-Tuning Private Preview program to experience the bleeding edge of our image generation feature set.
