Media Generation API

Generate SDXL

POST
Generate images in response to the given request.

Request

This endpoint expects an object.
prompt
string, Required
Text describing the image content to generate.
cfg_scale
double, Optional, Defaults to 12
Floating-point number representing how closely to adhere to the prompt description. Must be a positive number no greater than 50.0.
checkpoint
string, Optional
[Not supported on SD3] Custom checkpoint to be used during image generation.
clip_skip
integer, Optional
[Not supported on SD3] Optionally skip later layers of the text encoder. Higher values lead to more abstract interpretations of the prompt.
controlnet
string, Optional
[Not supported on SD3 or IPAdapter] ControlNet to be used during image generation.
controlnet_conditioning_scale
double, Optional, Defaults to 1
[Not supported on SD3 or IPAdapter] How strong the effect of the ControlNet should be.
controlnet_early_stop
double, Optional
[Not supported on SD3 or IPAdapter] If provided, indicates the fraction of steps at which to stop applying ControlNet. This can sometimes produce better outputs.
controlnet_image
string, Optional
[Not supported on SD3 or IPAdapter] ControlNet image encoded as a base64 string for guiding image generation. Required for ControlNet engines.
controlnet_preprocess
boolean, Optional
[Not supported on SD3 or IPAdapter] Whether to apply automatic ControlNet preprocessing.
height
integer, Optional
Integer representing the height of the image to generate. If omitted, defaults to 512 for SD 1.5 and 1024 for SD3, SDXL, and SSD. Supported resolutions (w,h):
SD3={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}
SDXL={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}
SD1.5={(768, 576), (1024, 576), (640, 512), (384, 704), (640, 768), (640, 640), (1024, 768), (1536, 1024), (768, 1024), (576, 448), (1024, 1024), (896, 896), (704, 1216), (512, 512), (448, 576), (832, 512), (512, 704), (576, 768), (1216, 704), (512, 768), (512, 832), (1024, 1536), (576, 1024), (704, 384), (768, 512)}
SSD={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}
high_noise_frac
double, Optional, Defaults to 0.8
[Not supported on SD3] Floating-point number that defines the fraction of steps to perform with the base model. Used only by SDXL. Must be between 0.0 and 1.0, inclusive.
image_encoding
enum, Optional
Defines which encoding should be applied before returning the generated image(s).
Allowed values: jpeg, png
init_image
string, Optional
Starting-point image encoded as a base64 string for Image to Image generation mode.
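
The image inputs on this endpoint (`init_image`, `controlnet_image`, `ip_adapter_image`, and the mask fields) are all passed as base64 strings. The following is a minimal sketch of producing such a string with the Python standard library. The helper names are hypothetical, and it assumes the API expects plain standard base64 without a data-URI prefix, which this reference does not state.

```python
import base64


def encode_image_bytes(data: bytes) -> str:
    """Return raw image bytes as an ASCII base64 string.

    Assumption: plain standard base64, with no
    "data:image/...;base64," prefix.
    """
    return base64.b64encode(data).decode("ascii")


def encode_image_file(path: str) -> str:
    """Read an image file from disk and base64-encode its contents."""
    with open(path, "rb") as fh:
        return encode_image_bytes(fh.read())


# Illustrative request fragment for Image to Image mode. The bytes below
# are only the PNG signature, standing in for a real image file.
fake_png_header = b"\x89PNG\r\n\x1a\n"
payload = {
    "prompt": "a watercolor fox",
    "init_image": encode_image_bytes(fake_png_header),
}
```

In practice `encode_image_file("input.png")` would replace the stand-in bytes.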
ip_adapter_image
string, Optional
IP Adapter image encoded as a base64 string for guiding image generation. Required for IP Adapter engines.
ip_adapter_mask_image
string, Optional
Base64-encoded mask image for IP Adapter. The white area should indicate where to paint.
ip_adapter_method
enum, Optional
Defines which IP Adapter method should be applied during image generation. Supported methods: full, style, and composition.
Allowed values: full, style, composition
ip_adapter_mode
string literal, Optional
Defines which IP Adapter mode should be applied during image generation. Supported modes: base.
Allowed value: base
ip_adapter_scale
double, Optional, Defaults to 1
How strong the effect of the IP Adapter should be.
loras
map from strings to optional doubles, Optional
[Not supported on SD3] A dictionary of LoRAs to apply, with LoRAs as keys and their weights (floats) as values.
mask_image
string, Optional
[Not supported on SD3] Base64-encoded mask image for inpainting. The white area should indicate where to paint.
negative_prompt
string, Optional
Text describing image traits to avoid during generation.
negative_prompt_2
string, Optional
[Not supported on SD3] Text with a high-level description of things to avoid during generation. Used only by SDXL.
num_images
integer, Optional, Defaults to 1
Integer representing how many output images to generate with a single prompt/configuration.
outpainting
boolean, Optional
[Not supported on SD3] Whether the request requires outpainting or not. If so, special preprocessing is applied for better results.
prompt_2
string, Optional
[Not supported on SD3] Text with a high-level description of the image to generate. Used only by SDXL.
sampler
enum, Optional
[Not supported on SD3] Sampler name (also known as 'scheduler') to use during image generation.
seed
union, Optional
Integer or list of integers representing the seeds of the random generators. Fixing the random seed is useful when attempting to reproduce a specific image. Must be greater than 0 and less than 2^32.
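
Because `seed` accepts either a single integer or a list of integers, client code may want to normalize and range-check it before sending. A minimal sketch follows, assuming the bounds stated above (greater than 0, less than 2^32). `validate_seed` is a hypothetical helper, not part of the API.

```python
from typing import List, Union

MAX_SEED_EXCLUSIVE = 2**32


def validate_seed(seed: Union[int, List[int]]) -> List[int]:
    """Normalize a seed or list of seeds and check 0 < seed < 2**32."""
    seeds = [seed] if isinstance(seed, int) else list(seed)
    for s in seeds:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(s, bool) or not isinstance(s, int):
            raise TypeError(f"seed must be an integer, got {s!r}")
        if not 0 < s < MAX_SEED_EXCLUSIVE:
            raise ValueError(f"seed {s} is outside (0, 2^32)")
    return seeds
```

For example, `validate_seed(42)` returns `[42]`, while `validate_seed(0)` raises `ValueError`.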
steps
integer, Optional, Defaults to 30
Integer representing how many steps of diffusion to run. Must be greater than 0 and less than or equal to 200.
strength
double, Optional, Defaults to 0.8
Floating-point number indicating how creative the Image to Image generation mode should be. Must be greater than 0 and less than or equal to 1.0.
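
Several numeric parameters carry explicit bounds: `cfg_scale`, `high_noise_frac`, `steps`, and `strength`. The sketch below checks them on the client side before a request is sent. The bounds are copied from the descriptions in this reference, while `RANGES` and `range_errors` are hypothetical names introduced for illustration. The server remains the authority on what it accepts.

```python
from typing import Any, Dict, List

# (lower bound, is the lower bound inclusive?, upper bound), taken from the
# parameter descriptions. The upper bound is inclusive for all four.
RANGES = {
    "cfg_scale": (0.0, False, 50.0),
    "high_noise_frac": (0.0, True, 1.0),
    "steps": (0, False, 200),
    "strength": (0.0, False, 1.0),
}


def range_errors(params: Dict[str, Any]) -> List[str]:
    """Return one message per out-of-range parameter (empty list if none)."""
    errors = []
    for name, (low, low_inclusive, high) in RANGES.items():
        if name not in params:
            continue  # optional parameter; nothing to check
        value = params[name]
        above_low = value >= low if low_inclusive else value > low
        if not (above_low and value <= high):
            bracket = "[" if low_inclusive else "("
            errors.append(f"{name}={value} is outside {bracket}{low}, {high}]")
    return errors
```

Returning a list rather than raising on the first failure lets a caller report every problem in one pass.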
style_preset
enum, Optional
[Not supported on SD3] Pre-defined styles used to guide the output image towards a particular style. Pre-defined styles are only supported by SDXL.
textual_inversions
map from strings to optional strings, Optional
[Not supported on SD3] A dictionary of textual inversions to be used during image generation, with textual inversions as keys and trigger words as values.
transfer_images
map from strings to optional lists of strings, Optional
[Not supported on SD3] A dictionary containing a mapping of trigger words to a list of sample images which demonstrate the desired object or style to transfer.
use_refiner
boolean, Optional
[Not supported on SD3] Whether to enable and apply the SDXL refiner model to the image generation.
vae
string, Optional
[Not supported on SD3] Custom VAE to be used during image generation.
width
integer, Optional
Integer representing the width of the image to generate. If omitted, defaults to 512 for SD 1.5 and 1024 for SD3, SDXL, and SSD. Supported resolutions (w,h):
SD3={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}
SDXL={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}
SD1.5={(768, 576), (1024, 576), (640, 512), (384, 704), (640, 768), (640, 640), (1024, 768), (1536, 1024), (768, 1024), (576, 448), (1024, 1024), (896, 896), (704, 1216), (512, 512), (448, 576), (832, 512), (512, 704), (576, 768), (1216, 704), (512, 768), (512, 832), (1024, 1536), (576, 1024), (704, 384), (768, 512)}
SSD={(1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896), (640, 1536), (1216, 832), (896, 1152), (1024, 1024)}
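
Since `width` and `height` are only accepted in specific combinations, a client can check a pair locally before sending a request. The sketch below copies the resolution sets from the descriptions above. The model labels used as dictionary keys and the `resolve_size` helper are assumptions made for illustration, and applying the defaults independently to each missing dimension is one literal reading of the text, not confirmed server behavior.

```python
from typing import Optional, Tuple

# (width, height) pairs copied from the width/height descriptions.
# SD3, SDXL, and SSD list the same nine resolutions.
_LARGE = frozenset({
    (1536, 640), (768, 1344), (832, 1216), (1344, 768), (1152, 896),
    (640, 1536), (1216, 832), (896, 1152), (1024, 1024),
})
SUPPORTED = {
    "SD3": _LARGE,
    "SDXL": _LARGE,
    "SSD": _LARGE,
    "SD1.5": frozenset({
        (768, 576), (1024, 576), (640, 512), (384, 704), (640, 768),
        (640, 640), (1024, 768), (1536, 1024), (768, 1024), (576, 448),
        (1024, 1024), (896, 896), (704, 1216), (512, 512), (448, 576),
        (832, 512), (512, 704), (576, 768), (1216, 704), (512, 768),
        (512, 832), (1024, 1536), (576, 1024), (704, 384), (768, 512),
    }),
}
DEFAULT_SIDE = {"SD1.5": 512, "SD3": 1024, "SDXL": 1024, "SSD": 1024}


def resolve_size(
    model: str, width: Optional[int] = None, height: Optional[int] = None
) -> Tuple[int, int]:
    """Fill in documented defaults and reject unsupported (w, h) pairs."""
    w = DEFAULT_SIDE[model] if width is None else width
    h = DEFAULT_SIDE[model] if height is None else height
    if (w, h) not in SUPPORTED[model]:
        raise ValueError(f"{model} does not list ({w}, {h}) as supported")
    return (w, h)
```

Note that supplying only one dimension can fail under this reading: `resolve_size("SDXL", width=1536)` pairs 1536 with the default height of 1024, which is not in the SDXL set.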

Response

This endpoint returns an object.
images
list of objects
List of ImageGeneration(s) generated by the request.
prediction_time_ms
double
Total runtime of the image generation(s), in milliseconds.
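
A response can be unpacked along the following lines. This is a sketch under a stated assumption: the reference describes `images` only as a list of ImageGeneration objects, so the per-image field name `image_b64` used here is hypothetical and should be checked against the actual schema. The sample body is a stand-in for illustration, not real API output.

```python
import base64
import json
from typing import List


def decode_images(response_body: str, image_key: str = "image_b64") -> List[bytes]:
    """Decode every image in a JSON response body into raw bytes.

    Assumption: each entry of "images" carries its base64 payload under
    `image_key`. That field name is not given in this reference.
    """
    body = json.loads(response_body)
    return [base64.b64decode(item[image_key]) for item in body["images"]]


# Stand-in response body, built locally for illustration only.
sample = json.dumps({
    "images": [{"image_b64": base64.b64encode(b"fake-bytes").decode("ascii")}],
    "prediction_time_ms": 1234.5,
})
decoded = decode_images(sample)

# prediction_time_ms is in milliseconds; convert to seconds for display.
seconds = json.loads(sample)["prediction_time_ms"] / 1000.0
```

Each element of `decoded` can then be written to disk in binary mode, with a file extension matching the requested `image_encoding`.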