
TypeScript SDK Fine-tuning

How to create a fine-tuned LoRA using OctoAI's TypeScript SDK

This guide walks you through creating a fine-tuned LoRA with our TypeScript SDK: uploading image file assets, creating a tune, and then using that LoRA to run an inference with our Image Generation service once it’s ready.

Please see Fine-tuning Stable Diffusion for more specifics about each parameter in the fine-tuning API, as well as usage with curl or the Python SDK. Asset Library in the TypeScript SDK documentation covers the specifics of the different asset methods, and our TypeScript SDK Reference covers each parameter and method in more detail.

Requirements

  • Please create an OctoAI API token if you don’t have one already.
  • Please also verify you’ve completed TypeScript SDK Installation & Setup. The SDK must be version >= 0.4.0.
  • If you use the OCTOAI_TOKEN environment variable for your token, you can instantiate the client with const client = new OctoAIClient(), or pass the token to the constructor as shown below.
  • An account and API token are required for all of the following steps.
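
For reference, a minimal client setup might look like the following (a sketch assuming the OCTOAI_TOKEN environment variable is set):

TypeScript
import { OctoAIClient } from "@octoai/sdk";

// Reads the OCTOAI_TOKEN environment variable by default
const client = new OctoAIClient();

// Or pass the token to the constructor explicitly
const clientWithToken = new OctoAIClient({ apiKey: process.env.OCTOAI_TOKEN });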

High-level steps to creating a fine-tuned LoRA

In order to run a LoRA fine-tuning job and then use the resulting LoRA for image generation, you need to complete a few steps:

  1. Create image file assets using AssetLibrary, then wait for those assets’ status to be “ready”
  2. Either create a checkpoint asset you would like to use or get one from OctoAI’s public checkpoints.
  3. Create a tune job, then wait for the status to be “succeeded”.
  4. Run an inference with the new LoRA.
  5. Clean up

The client.fineTuning API also provides methods for get, cancel, list, and delete. With the exception of list, these all take a tune ID and execute the related action on that tune. This guide also includes a way to combine list with delete to clean up your test tune and related assets at the end of the guide.
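
As a quick illustration (a minimal sketch; the tune ID shown is hypothetical):

TypeScript
// Fetch, cancel, or delete a specific tune by its ID
const existing = await client.fineTuning.get("tune_01234");
await client.fineTuning.cancel("tune_01234");
await client.fineTuning.delete("tune_01234");

// List tunes, optionally filtered by name
const myTunes = await client.fineTuning.list({ name: "my-tune" }).then((r) => r.data);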

A complete script with all the code put together is included at the bottom of this document, but each step below covers additional information.

1) Creating an image file asset

Asset Library in the TypeScript SDK covers the asset methods in more detail, so this example focuses on a code snippet for uploading multiple files in a folder at once. There are different approaches, such as making the asset name match the name of the file (see the variant after the snippet below); however, for our purposes we will keep everything named with our NAME constant to make it easier to search for and delete the created assets later.

This is just a snippet; the full example is at the end of this guide.

In this example, we use multiple photos of a toy poodle named Mitchi.

TypeScript
import { OctoAIClient } from "@octoai/sdk";

// These constants will be the same in the rest of this doc
const NAME = "test-sks3-poodle-sd15"; // To be used for loras in infer method
const FILE_PATH = "test_assets/mitchi";
const FILE_SUFFIX = "jpg";

const client = new OctoAIClient({
  apiKey: "<OCTOAI_TOKEN>",
});

// First, we will upload and create a number of image assets to use for fine-tuning
const assets = [];
for (let i = 0; i < 5; i++) {
  const asset = await client.assetLibrary.upload(`${FILE_PATH}${i}.${FILE_SUFFIX}`, {
    name: `${NAME}-image-${i}`,
    data: { assetType: "file", fileFormat: FILE_SUFFIX },
    assetType: "file",
    description: `${NAME}`,
  });

  assets.push(asset);
}
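
As mentioned above, an alternative is to name each asset after its source file instead of the NAME constant. A minimal sketch of that variant (using Node's path module; otherwise identical to the loop above):

TypeScript
import { basename } from "path";

for (let i = 0; i < 5; i++) {
  const filePath = `${FILE_PATH}${i}.${FILE_SUFFIX}`;
  const asset = await client.assetLibrary.upload(filePath, {
    name: basename(filePath, `.${FILE_SUFFIX}`), // e.g. "mitchi0"
    data: { assetType: "file", fileFormat: FILE_SUFFIX },
    assetType: "file",
    description: `${NAME}`,
  });
  assets.push(asset);
}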

We can then poll the Asset Library to verify that the assets are ready to be used, for example:

TypeScript
// Poll each asset until it reports "ready"
let pos = 0;
while (pos < assets.length) {
  await new Promise((resolve) => setTimeout(resolve, 1000));
  const retrieved = await client.assetLibrary.get(assets[pos].id);
  assets[pos] = retrieved;
  if (retrieved.status === "ready") {
    pos += 1;
  }
}

After this completes, all assets should be in the "ready" state; in real code you may also want to stop after a timeout or handle assets that end up in an error state. Mitchi is now on OctoAI!
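
If you want to guard against assets that never become ready, here is a minimal sketch of the same loop with an attempt cap (the maxAttempts value is arbitrary):

TypeScript
const maxAttempts = 120; // roughly two minutes at a one-second interval
let attempts = 0;
let pos = 0;
while (pos < assets.length) {
  if (attempts >= maxAttempts) {
    throw new Error(`Timed out waiting for asset ${assets[pos].id} to become ready`);
  }
  await new Promise((resolve) => setTimeout(resolve, 1000));
  const retrieved = await client.assetLibrary.get(assets[pos].id);
  assets[pos] = retrieved;
  if (retrieved.status === "ready") {
    pos += 1;
  }
  attempts += 1;
}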


2) Get a checkpoint asset to use for tuning our LoRA

Next, you’ll need a checkpoint to tune your LoRA against. In this example, we will use the default Stable Diffusion 1.5 checkpoint, but you can also use other public OctoAI checkpoints or create your own using AssetLibrary.

TypeScript
// Let's use an OctoAI public checkpoint for tuning our LoRA
const checkpoint = await client.assetLibrary
  .list({
    isPublic: true,
    owner: "octoai",
    name: "default-sd15",
  })
  .then((r) => r.data[0]);

3) Creating a tune job

The simplest way to create a tune job is to pass in the checkpoint and the assets directly. This comes at a minor cost to quality, because the captions will be set to just the trigger word, whereas for better results you’ll want to set your own captions.

TypeScript
// Declared with let so we can refresh the tune while polling below
let tune = await client.fineTuning.create({
  name: NAME,
  description: "sks3 poodle",
  details: {
    baseCheckpoint: checkpoint,
    files: assets,
    steps: 500,
    tuneType: "lora_tune",
    triggerWords: ["sks3 poodle"],
  },
});

For better results, you can set your own captions for your assets as follows (or use the asset ID strings directly, as shown after this snippet), then pass the result to the files field of the request above.

TypeScript
const tune = await client.fineTuning.create({
  name: NAME,
  description: "sks3 poodle",
  details: {
    baseCheckpoint: checkpoint,
    files: assets.map((asset) => ({
      fileId: asset.id,
      caption: "your detailed caption with sks3 poodle the trigger word in it here",
    })),
    steps: 500,
    tuneType: "lora_tune",
    triggerWords: ["sks3 poodle"],
  },
});
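
Per the note above, you can also pass the asset IDs as plain strings for the files field; a minimal sketch:

TypeScript
const tune = await client.fineTuning.create({
  name: NAME,
  description: "sks3 poodle",
  details: {
    baseCheckpoint: checkpoint,
    // Asset IDs passed directly; captions default to the trigger word
    files: assets.map((asset) => asset.id),
    steps: 500,
    tuneType: "lora_tune",
    triggerWords: ["sks3 poodle"],
  },
});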

If you already know the checkpoint details, you can also pass them for baseCheckpoint directly instead of looking up the asset.

TypeScript
const tune = await client.fineTuning.create({
  name: NAME,
  description: "sks3 poodle",
  details: {
    baseCheckpoint: {
      checkpointId: "asset_01hev42y7ffc58b3aqc8wa04p4",
      engine: "image/stable-diffusion-v1-5",
      name: "default-sd15",
    },
    files: assets,
    steps: 500,
    tuneType: "lora_tune",
    triggerWords: ["sks3 poodle"],
  },
});

Similar to creating assets, we can also wait for the tune job to succeed (or fail) before we move on to running an inference.

TypeScript
let { status } = tune;
while (status !== "failed" && status !== "succeeded") {
  await new Promise((resolve) => setTimeout(resolve, 1000));
  tune = await client.fineTuning.get(tune.id);
  status = tune.status;
}
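
If the tune ends in the "failed" state, you'll likely want to stop before running an inference; a minimal sketch:

TypeScript
if (status === "failed") {
  throw new Error(`Fine-tuning job ${tune.id} failed`);
}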

4) Run an inference with the tuned LoRA

Next, you can run an inference with the tuned LoRA:

TypeScript
// Requires: import { writeFileSync } from "fs";
const { images } = await client.imageGen.generateSd({
  prompt: "A photo of sks3 poodle as a puppy",
  negativePrompt: "Blurry photo, distortion, low-res, poor quality, extra limbs, extra tails",
  loras: {
    "test-sks3-poodle-sd15": 0.8, // Replace this with whatever you set your NAME constant to
  },
  numImages: 1,
});

images.forEach((imageOutputs: any, i: number) => {
  const buffer = Buffer.from(imageOutputs.image_b64, "base64");
  writeFileSync(`result${i}.jpg`, buffer);
});

The end result is an image of the poodle saved to your local folder.

result0.jpg: a toy poodle puppy generated with the tuned Stable Diffusion LoRA

5) Clean up

This example will delete everything associated with the NAME constant used earlier, including the LoRA.

If you wish to merely delete the file assets, you can filter by assetType: "file" (see the sketch after the snippet below). Please refer to the ListAssetsRequest reference docs for more information on parameters that can help you filter for how you’d like to use the service.

TypeScript
// Warning: This will delete all associated assets, including the created example LoRA.
// Please see above docs if you'd rather keep it, or you can also add additional file assets
// to tune your LoRA differently.
const tunes = await client.fineTuning
  .list({ name: NAME })
  .then((r) => r.data);

for (let i = 0; i < tunes.length; i++) {
  await client.fineTuning.delete(tunes[i].id);
}

const assets = await client.assetLibrary
  .list({
    name: NAME,
  })
  .then((r) => r.data);

for (let i = 0; i < assets.length; i++) {
  await client.assetLibrary.delete(assets[i].id);
}
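
As noted above, if you only want to remove the uploaded image files and keep the LoRA, you can filter the listing by asset type; a minimal sketch (assuming ListAssetsRequest accepts an assetType filter as described in the reference docs):

TypeScript
// Delete only the file assets that share the NAME prefix, keeping the LoRA
const fileAssets = await client.assetLibrary
  .list({ name: NAME, assetType: "file" })
  .then((r) => r.data);

for (const asset of fileAssets) {
  await client.assetLibrary.delete(asset.id);
}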

Putting it all together: From Asset Creation to Running an Inference with Tuned LoRA

This does not include the clean up script above, but it’s recommended you run it afterwards if you’d like to remove all related assets and tunes.

TypeScript
import { OctoAIClient } from "@octoai/sdk";
import { writeFileSync } from "fs";

const client = new OctoAIClient();

// Magic strings for the example
const NAME = "test-sks3-poodle-sd15"; // To be used for loras in infer method
const FILE_PATH = "test_assets/mitchi";
const FILE_SUFFIX = "jpg";

// Some TS configurations don't allow top-level await, so we wrap everything in an async function
async function fineTuneExample() {
  // First, we will upload and create a number of image assets to use for fine-tuning
  const assets = [];
  for (let i = 0; i < 5; i++) {
    const asset = await client.assetLibrary.upload(`${FILE_PATH}${i}.${FILE_SUFFIX}`, {
      name: `${NAME}-image-${i}`,
      data: { assetType: "file", fileFormat: FILE_SUFFIX },
      assetType: "file",
      description: `${NAME}`,
    });

    assets.push(asset);
  }

  // Verify the assets are ready to be used for fine-tuning
  let pos = 0;
  while (pos < assets.length) {
    await new Promise((resolve) => setTimeout(resolve, 1000));
    const retrieved = await client.assetLibrary.get(assets[pos].id);
    assets[pos] = retrieved;
    if (retrieved.status === "ready") {
      pos += 1;
    }
  }

  // Then let's use an OctoAI public checkpoint for tuning our LoRA
  // You can also use your own checkpoints as well
  const checkpoint = await client.assetLibrary
    .list({
      isPublic: true,
      owner: "octoai",
      name: "default-sd15",
    })
    .then((r) => r.data[0]);

  // And finally create a fine-tuning job now that the assets are ready.
  // This will set the captions to the trigger word, but setting descriptive captions will
  // yield better results.
  let tune = await client.fineTuning.create({
    name: NAME,
    description: "sks3 poodle",
    details: {
      baseCheckpoint: checkpoint,
      files: assets,
      steps: 500,
      tuneType: "lora_tune",
      triggerWords: ["sks3 poodle"],
    },
  });

  // Wait for the tune job to finish
  let { status } = tune;
  while (status !== "failed" && status !== "succeeded") {
    await new Promise((resolve) => setTimeout(resolve, 1000));
    tune = await client.fineTuning.get(tune.id);
    status = tune.status;
  }

  // And once the job is finished, use the tuned LoRA in an image generation request
  const { images } = await client.imageGen.generateSd({
    prompt: "A photo of sks3 poodle as a puppy",
    negativePrompt: "Blurry photo, distortion, low-res, poor quality, extra limbs, extra tails",
    loras: {
      "test-sks3-poodle-sd15": 0.8, // Replace this with whatever you set your NAME constant to
    },
    numImages: 1,
  });

  images.forEach((imageOutputs: any, i: number) => {
    const buffer = Buffer.from(imageOutputs.image_b64, "base64");
    writeFileSync(`result${i}.jpg`, buffer);
  });
}

fineTuneExample();