Generate SDXL

POST /generate/sdxl

Generate images in response to the given request.
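As a rough illustration of calling this endpoint, the sketch below builds (but does not send) a JSON POST request using only the Python standard library. The host name, the token, and the bearer-token header are placeholders and assumptions; only the `/generate/sdxl` path and the `prompt` field come from this page.

```python
import json
import urllib.request

# Hypothetical values: this page documents only the path "/generate/sdxl".
BASE_URL = "https://image.example.com"  # placeholder host, not from the docs
API_TOKEN = "YOUR_API_TOKEN"            # placeholder credential


def build_sdxl_request(payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /generate/sdxl."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=BASE_URL + "/generate/sdxl",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Bearer-token auth is an assumption; check your account's docs.
            "Authorization": f"Bearer {API_TOKEN}",
        },
    )


request = build_sdxl_request({"prompt": "a lighthouse at dusk, oil painting"})
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload before spending any generation time.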

Request

This endpoint expects an object.
prompt string Required

Text describing the image content to generate.

cfg_scale double Optional <=50 Defaults to 12

Floating-point number representing how closely to adhere to the prompt description. Must be a positive number no greater than 50.0.

checkpoint string Optional

[Not supported on SD3, FLUX] Custom checkpoint to be used during image generation.

clip_skip integer Optional >=1 <=12

[Not supported on SD3, FLUX] Optionally skip later layers of the text encoder. Higher values lead to more abstract interpretations of the prompt.

controlnet string Optional

[Not supported on SD3, FLUX] ControlNet to be used during image generation.

controlnet_conditioning_scale double Optional >=0 Defaults to 1

[Not supported on SD3, FLUX] How strong the effect of the ControlNet should be.

controlnet_early_stop double Optional >=0 <=1

[Not supported on SD3, FLUX] If provided, indicates the fraction of steps at which to stop applying ControlNet. This can sometimes produce better outputs.

controlnet_image string Optional

[Not supported on SD3, FLUX] ControlNet image, encoded as a base64 string, used to guide image generation. Required for ControlNet engines.

controlnet_preprocess boolean Optional

[Not supported on SD3, FLUX] Whether to apply automatic ControlNet preprocessing.

height integer Optional

Integer representing the height of the image to generate. If not set, defaults to 512 for SD 1.5 and 1024 for FLUX, SD3, SDXL, and SSD. Supported resolutions (w, h):

FLUX: (512, 512), (576, 1024), (640, 1536), (768, 1024), (768, 1344), (832, 1216), (896, 1152), (1024, 576), (1024, 768), (1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)
SD3: (640, 1536), (768, 1344), (832, 1216), (896, 1152), (1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)
SDXL: (640, 1536), (768, 1344), (832, 1216), (896, 1152), (1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)
SD1.5: (384, 704), (448, 576), (512, 512), (512, 704), (512, 768), (512, 832), (576, 448), (576, 768), (576, 1024), (640, 512), (640, 640), (640, 768), (704, 384), (704, 1216), (768, 512), (768, 576), (768, 1024), (832, 512), (896, 896), (1024, 576), (1024, 768), (1024, 1024), (1024, 1536), (1216, 704), (1536, 1024)
SSD: (640, 1536), (768, 1344), (832, 1216), (896, 1152), (1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)

high_noise_frac double Optional >=0 <=1 Defaults to 0.8

[Not supported on SD3, FLUX] Floating-point number that defines the fraction of steps to perform with the base model. Used only by SDXL. Must be greater than or equal to 0.0 and less than or equal to 1.0.

image_encoding "jpeg" or "png" Optional
Allowed values: jpeg, png

Encoding to apply to the generated image(s) before they are returned.

init_image string Optional

Starting-point image, encoded as a base64 string, for image-to-image generation mode.
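The image parameters on this page (`init_image`, `mask_image`, `controlnet_image`) are described as base64 strings. A minimal sketch of producing one with the standard library follows; it assumes the API expects plain base64 text with no data-URI prefix, which this page does not state. The helper names are illustrative.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Return the base64 text form of raw image bytes."""
    return base64.b64encode(raw).decode("ascii")


def encode_image_file(path: str) -> str:
    """Read an image file from disk and return it as a base64 string."""
    with open(path, "rb") as handle:
        return encode_image_bytes(handle.read())


# Example with in-memory bytes (the 8-byte PNG signature, not a full image).
png_signature = b"\x89PNG\r\n\x1a\n"
payload = {
    "prompt": "the same scene in winter",
    "init_image": encode_image_bytes(png_signature),
}
```

In practice you would call `encode_image_file("input.png")` on a real image; the in-memory example only demonstrates the encoding step.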

inpaint_full_res boolean Optional

When true, upsample ‘init_image’ and ‘mask_image’ such that a bounding box drawn around the non-black content of ‘mask_image’ becomes the full inpainting frame. The bounding box has the same aspect ratio as that specified by (‘height’, ‘width’) parameters.

inpaint_full_res_padding integer Optional Defaults to 32

When ‘inpaint_full_res’ is true, include a frame around the non-black mask area that is this many pixels wide. This serves as a form of context for the inpainting and may improve blending.
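To make the bounding-box idea concrete, here is a small sketch that finds the box around the non-black pixels of a mask and grows it by a padding margin. It illustrates the description above and is not the service's implementation; in particular, it omits the step that adjusts the box to the requested aspect ratio, and the function name and mask representation (a list of rows of grayscale values) are assumptions.

```python
def padded_mask_bbox(mask, padding=32):
    """Return (left, top, right, bottom) around non-black mask pixels.

    `mask` is a list of rows of grayscale values (0 = black). The box is
    grown by `padding` pixels on each side and clamped to the mask bounds.
    `right` and `bottom` are exclusive. Returns None for an all-black mask.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    if not rows or not cols:
        return None
    left = max(min(cols) - padding, 0)
    top = max(min(rows) - padding, 0)
    right = min(max(cols) + 1 + padding, width)
    bottom = min(max(rows) + 1 + padding, height)
    return left, top, right, bottom


# 6x6 mask with a 2x2 white square at columns 2-3, rows 3-4.
mask = [[0] * 6 for _ in range(6)]
for y in (3, 4):
    for x in (2, 3):
        mask[y][x] = 255

print(padded_mask_bbox(mask, padding=1))  # (1, 2, 5, 6)
```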

log_prompt boolean Optional

If true, OctoAI stores the request prompt.

loras map from strings to optional doubles Optional

[Not supported on SD3] A dictionary of LoRAs to apply, with LoRAs as keys and their weights (floats) as values.

mask_image string Optional

[Not supported on SD3] Base64-encoded mask image for inpainting. White areas indicate where to paint.

negative_prompt string Optional

[Not supported on FLUX] Text describing image traits to avoid during generation.

negative_prompt_2 string Optional

[Not supported on SD3, FLUX] Text with a high-level description of things to avoid during generation. Used only by SDXL.

num_images integer Optional <=16 Defaults to 1

Integer representing how many output images to generate with a single prompt/configuration.

outpainting boolean Optional

[Not supported on SD3, FLUX] Whether the request requires outpainting or not. If so, special preprocessing is applied for better results.

prompt_2 string Optional

[Not supported on SD3, FLUX] Text with a high-level description of the image to generate. Used only by SDXL.

sampler enum Optional

[Not supported on SD3, FLUX] Sampler name (also known as ‘scheduler’) to use during image generation.

seed integer or list of optional integers Optional

Integer or list of integers used to seed the random generators. Fixing the random seed is useful when attempting to reproduce a specific image. Each seed must be greater than 0 and less than 2^32.
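The seed constraint above can be checked on the client before sending a request. This is a minimal sketch with a hypothetical helper name; it treats `None` entries inside a list as allowed, based on the "list of optional integers" type, which is an interpretation rather than documented behavior.

```python
MAX_SEED_EXCLUSIVE = 2 ** 32


def validate_seed(seed):
    """Check that a seed, or each seed in a list, satisfies 0 < seed < 2^32."""
    values = seed if isinstance(seed, list) else [seed]
    for value in values:
        if value is None:
            continue  # the list form allows optional (null) entries
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"seed must be an integer, got {value!r}")
        if not 0 < value < MAX_SEED_EXCLUSIVE:
            raise ValueError(f"seed {value} is outside (0, 2^32)")
    return seed


validate_seed(42)
validate_seed([7, None, 2 ** 32 - 1])
```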

steps integer Optional <=200 Defaults to 30

Integer representing how many steps of diffusion to run. Must be greater than 0 and less than or equal to 200.

strength double Optional <=1 Defaults to 0.8

Floating-point number indicating how creative the image-to-image generation mode should be. Must be greater than 0 and less than or equal to 1.0.
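The numeric limits documented for `cfg_scale`, `steps`, `strength`, and `num_images` can be collected into one client-side check. The sketch below is illustrative only: the function name is hypothetical, and the lower bound of 1 for `num_images` is an assumption, since this page states only the upper bound of 16.

```python
def check_generation_params(cfg_scale=12.0, steps=30, strength=0.8, num_images=1):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not 0 < cfg_scale <= 50:
        problems.append("cfg_scale must be positive and no greater than 50")
    if not 0 < steps <= 200:
        problems.append("steps must be greater than 0 and at most 200")
    if not 0 < strength <= 1:
        problems.append("strength must be greater than 0 and at most 1.0")
    # Lower bound of 1 is an assumption; the docs give only the <=16 limit.
    if not 1 <= num_images <= 16:
        problems.append("num_images must be between 1 and 16")
    return problems


print(check_generation_params())                        # []
print(len(check_generation_params(cfg_scale=51, steps=0)))  # 2
```

Returning a list rather than raising on the first failure lets a caller report every invalid field at once.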

style_preset enum Optional

[Not supported on SD3, FLUX] Pre-defined styles used to guide the output image towards a particular style. Pre-defined styles are only supported by SDXL.

textual_inversions map from strings to optional strings Optional

[Not supported on SD3, FLUX] A dictionary of textual inversions to be used during image generation, with textual inversions as keys and trigger words as values.

transfer_images map from strings to optional lists of strings Optional

[Not supported on SD3, FLUX] A dictionary containing a mapping of trigger words to a list of sample images which demonstrate the desired object or style to transfer.

use_refiner boolean Optional

[Not supported on SD3, FLUX] Whether to enable and apply the SDXL refiner model to the image generation.

vae string Optional

[Not supported on SD3, FLUX] Custom VAE to be used during image generation.

width integer Optional

Integer representing the width of the image to generate. If not set, defaults to 512 for SD 1.5 and 1024 for FLUX, SD3, SDXL, and SSD. Supported resolutions (w, h):

FLUX: (512, 512), (576, 1024), (640, 1536), (768, 1024), (768, 1344), (832, 1216), (896, 1152), (1024, 576), (1024, 768), (1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)
SD3: (640, 1536), (768, 1344), (832, 1216), (896, 1152), (1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)
SDXL: (640, 1536), (768, 1344), (832, 1216), (896, 1152), (1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)
SD1.5: (384, 704), (448, 576), (512, 512), (512, 704), (512, 768), (512, 832), (576, 448), (576, 768), (576, 1024), (640, 512), (640, 640), (640, 768), (704, 384), (704, 1216), (768, 512), (768, 576), (768, 1024), (832, 512), (896, 896), (1024, 576), (1024, 768), (1024, 1024), (1024, 1536), (1216, 704), (1536, 1024)
SSD: (640, 1536), (768, 1344), (832, 1216), (896, 1152), (1024, 1024), (1152, 896), (1216, 832), (1344, 768), (1536, 640)
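Because `width` and `height` must together form one of the supported pairs, it can help to validate them as a pair before sending a request. The sketch below uses the SDXL list from this page; the helper names and the nearest-aspect-ratio fallback are illustrative choices, not part of the API.

```python
# Supported SDXL resolutions as (width, height), copied from the list above.
SDXL_RESOLUTIONS = [
    (640, 1536), (768, 1344), (832, 1216), (896, 1152), (1024, 1024),
    (1152, 896), (1216, 832), (1344, 768), (1536, 640),
]


def is_supported_sdxl_resolution(width: int, height: int) -> bool:
    """True when (width, height) is one of the documented SDXL pairs."""
    return (width, height) in SDXL_RESOLUTIONS


def closest_sdxl_resolution(width: int, height: int) -> tuple:
    """Pick the supported pair whose aspect ratio is nearest the request."""
    target = width / height
    return min(SDXL_RESOLUTIONS, key=lambda wh: abs(wh[0] / wh[1] - target))


print(is_supported_sdxl_resolution(1024, 1024))  # True
print(is_supported_sdxl_resolution(1920, 1080))  # False
print(closest_sdxl_resolution(1920, 1080))       # (1344, 768)
```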

Response

This endpoint returns an object.
images list of objects

List of ImageGeneration(s) generated by the request.

prediction_time_ms double

Total runtime of the image generation(s), in milliseconds.
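A response with this shape could be unpacked roughly as follows. This page does not list the fields of each image object, so the `image_b64` field name is an assumption made for illustration and is exposed as a parameter; only `images` and `prediction_time_ms` come from the documentation above.

```python
import base64
import json


def decode_images(response_text: str, image_field: str = "image_b64"):
    """Return (list of decoded image bytes, prediction time in ms).

    `image_field` is a hypothetical name for the base64 payload inside
    each entry of `images`; adjust it to match the actual response.
    """
    response = json.loads(response_text)
    images = [base64.b64decode(item[image_field]) for item in response["images"]]
    return images, response["prediction_time_ms"]


# Hand-built sample response, not real API output.
sample = json.dumps({
    "images": [{"image_b64": base64.b64encode(b"fake-bytes").decode("ascii")}],
    "prediction_time_ms": 1234.5,
})
images, elapsed_ms = decode_images(sample)
```

Each decoded entry can then be written to disk in binary mode, using the extension that matches the requested `image_encoding`.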