
Announcing Stable Diffusion XL 1.0 on OctoAI

Blog Author - Vanessa Yan

Jul 27, 2023

3 minutes
Stability AI released Stable Diffusion XL (SDXL) 1.0 this week, marking a major step forward in Stable Diffusion's image generation quality and experience. OctoAI has rapidly gained adoption as a preferred technology platform for GenAI image generation, and we’re excited to announce that SDXL 1.0 is now available on OctoAI. You can get started with SDXL today from your OctoAI console.

SDXL 1.0: Better quality, more functionality, and easier generation on “the largest open image model” to date

Stable Diffusion XL adds quality and functionality improvements over previous versions. Perhaps most important are (i) the model’s ability to generate richer, better-quality images with higher levels of photorealism, and (ii) the lower requirement on prompting, which lets users generate complex and creative images with simpler prompts. Both can be seen below, in images created from the prompt “child eating ice cream in park.”

The chart below, published by Stability AI, shows SDXL winning over 80% of votes in a human-evaluated test comparing its outputs against those of previous versions. This result is echoed in discussions across multiple forums over the past few weeks comparing SDXL to previous versions. The paper also reports user preference evaluations in which SDXL outperforms Midjourney v5.1 in four of six image generation categories.

Source: SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

The SDXL paper highlights the core architectural changes introduced with SDXL that enable these improvements: a larger neural network for the base model; a combination of OpenCLIP ViT-bigG and CLIP ViT-L as text encoders; fine-tuning to better support broader (non-square) aspect ratios; and a second refinement stage to further improve image fidelity.

With these enhancements, the SDXL 1.0 base model has 2.6B parameters, compared to 860M in Stable Diffusion 1.5 and 865M in 2.1. This makes SDXL roughly three times the size of its predecessors, and correspondingly more expensive to run because of its larger hardware footprint. The larger size and higher cost of the model are likely to make SDXL 1.0 initially appealing only to highly quality-sensitive use cases. At the same time, we believe that the quality and openness of SDXL, combined with the ecosystem of technologies forming around it, have the potential to unblock adoption of GenAI image generation for a broad range of new organizations and use cases, including gaming, marketing asset creation, and entertainment.
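To make the size comparison concrete, here is a minimal back-of-the-envelope sketch in Python. It uses only the parameter counts quoted above. The 2 bytes per parameter figure is an assumption (weights held in 16-bit floating point), and the helper names `size_ratio` and `weight_memory_gb` are illustrative, not part of any OctoAI or Stability AI API. The estimate covers weights only and ignores activations and any components not included in the quoted counts.

```python
# Parameter counts as quoted in the text above.
PARAMS = {
    "SD 1.5": 860e6,
    "SD 2.1": 865e6,
    "SDXL 1.0 base": 2.6e9,
}

# Assumption: weights are stored in 16-bit floating point (2 bytes each).
BYTES_PER_PARAM_FP16 = 2


def size_ratio(model: str, baseline: str) -> float:
    """Return how many times larger `model` is than `baseline`, by parameter count."""
    return PARAMS[model] / PARAMS[baseline]


def weight_memory_gb(model: str, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Approximate memory needed for the weights alone, in GB (10^9 bytes)."""
    return PARAMS[model] * bytes_per_param / 1e9


if __name__ == "__main__":
    for name in PARAMS:
        print(
            f"{name}: {size_ratio(name, 'SD 1.5'):.2f}x the size of SD 1.5, "
            f"about {weight_memory_gb(name):.2f} GB of fp16 weights"
        )
```

Under these assumptions, the ratio against either predecessor comes out at about 3, which is one rough way to see why the hardware footprint grows with the model.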

Image Generation on OctoAI

Starting with the announcement of our accelerated Stable Diffusion 2.1 model earlier this year, OctoAI has been actively expanding its toolkit of features and services for developers building image generation applications. Shortly after our launch, we added the ability to load different styles/checkpoints into Stable Diffusion 1.5, as well as the Automatic1111 Web UI for Stable Diffusion, so developers can easily experiment with different LoRAs, checkpoints, and extensions. We also added Stable Diffusion on AWS Inferentia2 (private preview) and Stable Diffusion fine-tuning on OctoAI (private preview). OctoAI’s earliest adopters, Civitai and Extropolis AI, both innovative trailblazers bringing image generation to a broad audience, are a testament to OctoAI’s image generation strengths. And we’re building on this today with the addition of SDXL 1.0 on OctoAI.

Get Started with SDXL on OctoAI today!

You can take SDXL 1.0 for a spin today with a free trial on OctoAI.

You’re also welcome to join us on our Discord server to engage with the team and community, and to share your creative images. We look forward to hearing from you on our channels!

