OctoAI Image Gen Solution: Fine-tuned GenAI images in seconds, on the fastest and most cost effective Stable Diffusion XL 1.0 (SDXL) and Stable Diffusion 1.5 models in the market

Blog Author - Vanessa Yan
Blog Author - Deepak Mohan

Nov 8, 2023

6 minutes

We’re excited today to launch the OctoAI Image Gen Solution, the fastest and most customizable GenAI stack for production-grade image generation applications. You can select from thousands of fine-tuning assets, including your own custom created fine-tuning assets, and easily generate millions of custom images a day — all at industry-leading speeds. OctoAI Image Gen Solution lets you launch production-grade GenAI powered applications in days and not months. Building on OctoAI, you can now focus on your application innovation and on your differentiated GenAI powered experiences.

Launching and scaling successful image generation apps needs more than just a hosted SDXL model

We learned from early customers of OctoAI that launching and scaling an image generation powered application needs much more than just the model or a hosted SDXL endpoint. These discussions revealed a few common challenges builders deal with today:

  • Poor reliability and latency performance at scale: Maintaining customer experience through growth is critical for GenAI powered applications. Dropped inferences and poor latency create friction for repeat usage and growth. Reliability can make or break a new application, and we’ve heard repeatedly that high latency variability and dropped inferences are real barriers to growth today.
  • Lack of ability to customize (fine-tune) the model as needed: Differentiated application experiences need consistent and predictable images. This requires customization or fine-tuning: not just one or two fine-tuning assets, but often tens to hundreds to create rich experiences, or even thousands for creative and artistic use cases. Several of the most innovative and popular customizations are created by the broader community around Stable Diffusion, and customers want to use these community-created customizations easily for their image generation. Today, however, builders must either spend time and money creating or importing customizations, or settle for the base model.
  • Exponential cost growth as application usage increases: Builders continue to be concerned about GenAI costs ballooning as end user adoption of their application grows.

These challenges limit the ability to quickly launch or iterate on applications, and add friction to end user experience and growth. Working with early adopters and leaning into our AI systems expertise, we built the OctoAI Image Gen Solution specifically to empower builders to launch and scale GenAI image generation powered applications.

Proven performance at scale, infinite customization, and industry-leading price, on one unified API endpoint

The OctoAI Image Gen Solution is an integrated collection of models and tools to run, tune, and scale GenAI powered image generation for your applications and your business needs. Building on multiple open source technologies including SDXL, ControlNets, and Real-ESRGAN, the solution is designed from the ground up to deliver speed and customization at scale. Builders can keep their focus on their priorities, building differentiated end-user experiences using GenAI, while we handle the details of running and scaling the GenAI backend that powers these applications.

The OctoAI Image Gen Solution brings to application builders:

  • Reliability and speed at scale: OctoAI Image Gen Solution can deliver predictable end-to-end SDXL latencies of under 3 seconds, millions of images a day, and success rates above 99.99% of inference calls. Latency and errors directly affect customer experience and costs for an application, and our customers are already building on the benefits of this speed and reliability for their business.

Speed is key to the AI art experience we deliver. We’ve been able to increase our image generation speeds by 5x with OctoAI’s low latency inferences, and this has resulted in even more usage and growth for our platform!

Angus Russell, Founder @ NightCafe

  • Rich customization and fine-tuning: Customers can now easily create or import fine-tuning assets to customize the model, and choose from tens to thousands of fine-tuning assets at image generation time, both in the WebUI and programmatically through the API. This includes:
    • Importing from the thousands of fine-tuning assets (like LoRAs, checkpoints, and textual inversions) in popular repositories like Hugging Face and Civitai, or
    • Creating your own custom assets via fine-tuning, for use cases like product placement and object preservation

OctoAI’s integration has been instrumental in making it possible for CALA to power the ability for our customers to fine-tune their image generation. OctoAI has allowed us to accelerate our development and time to market with these new features while eliminating the typical costs that we would have faced by running multiple parallel model variants.

Dylan Pyle, CTO & Co-Founder @ CALA

Our customers have been especially happy that images from custom fine-tuned SDXL models on OctoAI compare favorably on generation speed, latency, reliability, and cost with vanilla or base SDXL images from other services.

  • Cost effectiveness and savings: The AI systems enhancements we have built allow us to run models faster, more reliably, and across broader hardware options than alternatives. We pass these efficiencies on to customers as savings through our competitive image generation pricing. With SDXL image generation, and fine-tuned Stable Diffusion 1.5 images at $0.0015 per image, as well as additional options to deliver higher latency as needed, we offer customers a broad range of options to meet the image generation cost needs best suited for their target use case.
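
To make the pricing concrete, here is a minimal sketch of the arithmetic, using the $0.0015 per image price quoted above for fine-tuned Stable Diffusion 1.5 images. The workload of one million images a day is a hypothetical figure chosen for illustration, and the function name is ours, not part of any OctoAI API.

```python
from decimal import Decimal

# Price quoted in this post for fine-tuned Stable Diffusion 1.5 images.
PRICE_PER_IMAGE_USD = Decimal("0.0015")


def generation_cost(images: int, price_per_image: Decimal = PRICE_PER_IMAGE_USD) -> Decimal:
    """Return the total cost in USD of generating `images` images."""
    if images < 0:
        raise ValueError("images must be non-negative")
    # Decimal avoids binary floating-point rounding on currency values.
    return price_per_image * images


# Hypothetical workload: one million images per day.
daily = generation_cost(1_000_000)
monthly = daily * 30
print(f"daily: ${daily:,.2f}, 30-day: ${monthly:,.2f}")
```

At that hypothetical volume, per-image pricing works out to $1,500 a day, and the cost scales linearly with the number of images generated.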

The OctoAI Image Gen Solution is designed to fast-track adoption of GenAI image generation and models like SDXL, and to simplify how builders test, launch, and scale applications built on these capabilities. Builders can focus on their application innovations and differentiated customer experiences, while we run, tune, and scale the generative AI backend for the application. Early customer experiences and feedback reflect this focus, as in the following from Brian Carlson, Founder & CEO @ Storytime AI.

Our top priority was to get the product to market quickly using an open source image solution once we had completed internal prototyping and validation. We also wanted to ensure that our imagery within each personalized book provided an outstanding experience to our users, while maintaining consistency with our guardrails and themes. OctoAI simplified how we achieved both of these goals while providing the highest level of speed and reliability.

OctoAI Asset Orchestrator, the engine powering OctoAI’s infinite customizations

Prior to the OctoAI Image Gen Solution, developers had one of two options for generating customized or fine-tuned images. Either you optimize for cost and only run your fine-tuned model at inference time (e.g., using a serverless GPU service), or you optimize for speed and overprovision resources, with a running endpoint for each fine-tuned model. The former adds latency to each inference call because of cold start, which can take tens of seconds when loading multi-GB models. The latter requires an always-on endpoint for each fine-tuned model, which quickly gets expensive as you add customizations and experiences to your application.

OctoAI Image Gen Solution does not require you to maintain an endpoint for each fine-tuned model, and still delivers lightning fast fine-tuned image generation, faster than non-fine-tuned images from other popular image generation services. This is enabled by a new architecture and a new approach to delivering the fine-tuned model endpoints, powered by the OctoAI Asset Orchestrator. The OctoAI Asset Orchestrator lets the solution efficiently load the required customizations and model weights at runtime and generate images using your desired fine-tuned model, without the cold start or overprovisioning inefficiencies. You can store hundreds to thousands of fine-tuning assets in the OctoAI Asset Library, parametrically select the desired fine-tuning asset at runtime, and still experience OctoAI’s industry-leading image generation latency. The OctoAI Asset Orchestrator simplifies running fine-tuned image generation models and reduces their operational cost, allowing for infinite customization possibilities within one unified API endpoint.
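
The idea of one endpoint that picks a fine-tuning asset per request can be illustrated with a small sketch. This is not OctoAI's implementation, which the post does not describe. It assumes one common technique, a least-recently-used (LRU) cache of assets held next to a resident base model, and every name in it (`AssetCache`, `fake_loader`, `generate`) is hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class AssetCache:
    """Keep the most recently used fine-tuning assets in memory (LRU eviction)."""

    def __init__(self, loader: Callable[[str], bytes], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._loader = loader
        self._capacity = capacity
        self._assets = OrderedDict()  # asset_id -> weights, oldest first
        self.loads = 0  # number of times an asset had to be loaded

    def get(self, asset_id: str) -> bytes:
        if asset_id in self._assets:
            # Cache hit: mark as most recently used, no load needed.
            self._assets.move_to_end(asset_id)
            return self._assets[asset_id]
        # Cache miss: load the asset, then evict the least recently used one.
        weights = self._loader(asset_id)
        self.loads += 1
        self._assets[asset_id] = weights
        if len(self._assets) > self._capacity:
            self._assets.popitem(last=False)
        return weights


def fake_loader(asset_id: str) -> bytes:
    """Stand-in for fetching asset weights from storage (hypothetical)."""
    return f"weights-for-{asset_id}".encode()


def generate(cache: AssetCache, prompt: str, asset_id: str) -> str:
    """One shared entry point: the asset is a request parameter, not an endpoint."""
    weights = cache.get(asset_id)
    return f"image(prompt={prompt!r}, asset={asset_id}, weight_bytes={len(weights)})"


cache = AssetCache(fake_loader, capacity=2)
for asset in ["watercolor", "pixel-art", "watercolor", "product-shot"]:
    print(generate(cache, "a lighthouse at dusk", asset))
print("asset loads:", cache.loads)
```

In this sketch, four requests across three assets trigger only three loads, because the repeated "watercolor" request is served from memory. The design point is that the expensive base model stays loaded once, while the comparatively small customizations are swapped in per request instead of each getting its own running endpoint.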

Sign up for the OctoAI Image Gen Solution today

GenAI image generation has grown beyond hobbyist use cases. Innovative applications are revolutionizing industries from creative agencies, to entertainment, to retail. The OctoAI Image Gen Solution is built to deliver the flexibility, cost efficiency, and production-grade reliability needed to power the growth of these applications.

Sign up and start building today at no cost, with the free tier of the OctoAI Image Gen Solution. You can go on to build and scale commercial applications with our Pro tier. For specific SLA, performance, or deployment needs, contact us for details about our Enterprise tier.

You can learn more about the OctoAI Image Gen Solution and its capabilities through the OctoStudio demo application walkthrough. You’re also welcome to join us on Discord to engage with the team and our community. We look forward to hearing about your experience and your feedback!