Easy Guide to Making a Style LoRA for Flux


1. Introduction

Dear friends,

For my second article about LoRA creation, I want to share the story of making a Style LoRA~. I'm not a professional at LoRA making at all, so please take my writing as a reference only.

If you are new to LoRA making, please read my first article about character LoRA making before this one. It shows the details of using the TA trainer. Here I'll focus on the factors that matter for Style LoRA making.

2. Image Preparation

For a style LoRA, you need 30 to 100 source images in the style you want. In this tutorial, I'm going to make an oil painting style LoRA, so I prepared 32 images in the style of Edward Hopper. I generated them myself using another user's LoRA. Preparing source images is easy if you generate them yourself, since you can generate them at 1024x1024 and no further editing is needed. The images must be consistent with the style you want to create. For a retro style, for example, you can pick a good selection of your favorite retro images. That's it!
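If you want to double-check a dataset before uploading, here is a minimal sketch in Python. It assumes you want every image at 1024x1024 as I did, and that you already know each image's width and height. The helper name `check_dataset` and the file names are made up for illustration; they are not part of the TA trainer.

```python
# Hypothetical helper: checks a style dataset against the guidelines in this article.
# The (width, height) values would come from your own files,
# for example via Pillow's Image.open(path).size.

MIN_IMAGES = 30            # lower bound suggested in this article
MAX_IMAGES = 100           # upper bound suggested in this article
TARGET_SIZE = (1024, 1024)  # size I used for this LoRA

def check_dataset(sizes):
    """Return a list of warnings for a {filename: (width, height)} dict."""
    warnings = []
    if not MIN_IMAGES <= len(sizes) <= MAX_IMAGES:
        warnings.append(
            f"{len(sizes)} images; this guide suggests {MIN_IMAGES} to {MAX_IMAGES}"
        )
    for name, size in sorted(sizes.items()):
        if size != TARGET_SIZE:
            warnings.append(f"{name} is {size[0]}x{size[1]}, expected 1024x1024")
    return warnings

# Made-up example: 32 images, one of them with the wrong size.
example = {f"hopper_{i:02d}.png": (1024, 1024) for i in range(32)}
example["hopper_05.png"] = (1216, 832)
for warning in check_dataset(example):
    print(warning)
```

An empty result means the count and sizes match the guidelines above. It says nothing about style consistency, which you still have to judge by eye.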

Now upload the source images into TA's trainer as I explained before. The captions are automatically generated again.

3. Parameter Setting

I chose to create a LoKr instead of a LoRA here again. LoKr is a more parameter-efficient model than LoRA, and I recommend it. However, for your model project setting, please choose LyCORIS instead of LoKr for now. For some reason, selecting LoKr leads to a bug when you run image generation later. LyCORIS is the superset that includes efficient models like LoKr, LoHa, LoCon, etc. (I have asked the TA staff about this bug.)

Let's look at the summary of the parameter settings shown in the training progress window. Check the trigger words. I added extra captions to teach the Hopper style, such as "flat color" and "strong contrast of light and dark". I'll use some of them as trigger words for image generation later. For style LoRAs, it's especially important to put proper style captions on each image so that the LoRA learns the right style from the images. When you don't use many source images (32 in this case), the style captions on each image may help the LoRA pick up the right style during training.
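Adding the same style captions to many images by hand is tedious, so here is a small sketch of the idea in Python. It assumes the captions are comma-separated text, which may not match what the TA trainer produces. The function name `add_style_tags` and the example caption are made up for illustration.

```python
# Style captions I added for the Hopper style (taken from this article).
STYLE_TAGS = ["flat color", "strong contrast of light and dark"]

def add_style_tags(caption, style_tags=STYLE_TAGS):
    """Append style tags that are not already in a comma-separated caption."""
    parts = [p.strip() for p in caption.split(",") if p.strip()]
    existing = {p.lower() for p in parts}
    for tag in style_tags:
        if tag.lower() not in existing:
            parts.append(tag)
            existing.add(tag.lower())
    return ", ".join(parts)

# Made-up caption that already contains one of the tags.
print(add_style_tags("a woman sitting alone in a diner at night, flat color"))
```

The check for existing tags keeps a caption from ending up with the same style words twice.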

The repeat value is not as high as for a character LoRA because we are not trying to learn each image deeply. A "repeat" of 5 to 10 is enough here. You can try different numbers depending on the situation. A style LoRA doesn't require many epochs either. Sometimes the style is learned quickly in the early epochs, and later epochs bring negligible progress or even make the images worse through overfitting, with the loss going up. I used repeat 10 and epoch 3 here.
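To get a feel for how repeat and epoch add up, here is a rough calculation in Python. It follows the common convention that one epoch shows every image "repeat" times. The batch size of 1 is my assumption, and the TA trainer may count steps differently.

```python
import math

def training_steps(num_images, repeats, epochs, batch_size=1):
    """Return (steps_per_epoch, total_steps) for a repeat/epoch schedule."""
    # Each epoch shows every image `repeats` times, grouped into batches.
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

# Settings from this article: 32 images, repeat 10, epoch 3.
per_epoch, total = training_steps(num_images=32, repeats=10, epochs=3)
print(per_epoch, total)  # 320 960
```

Lowering repeat to 5 would halve both numbers, which is one way to try a cheaper run first.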

I used 0.0002 for the Unet learning rate here. You can use the default 0.0001 too. I wanted faster learning, so I doubled it.

As I explained in the previous article, conv_dim and conv_alpha are 8 and 2 here. Write the prompt for the sample images generated at each epoch again. My prompt was slightly complicated this time. I used 1024 x 1024 images, and this training used around 112 credits. Let's look at the training progress now.
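For reference, here is how conv_dim and conv_alpha usually relate to each other. In the usual LoRA convention, the learned update is multiplied by alpha / dim. Whether the TA trainer applies exactly this to a LoKr is an assumption on my side, and the dictionary keys below are only labels I chose for this sketch.

```python
def update_scale(alpha, dim):
    """Scale factor applied to the learned update in the usual LoRA convention."""
    return alpha / dim

# Values from this article; the key names are hypothetical labels.
settings = {"conv_dim": 8, "conv_alpha": 2, "unet_lr": 0.0002}

scale = update_scale(settings["conv_alpha"], settings["conv_dim"])
print(scale)  # 0.25
```

Under that convention, a smaller alpha relative to dim weakens the update, which is why alpha and the learning rate are often tuned together.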

4. Training Progress

The training took about 1 hour. The loss at each epoch was 0.305, 0.302 and 0.301. You can see the sample images improve at each epoch. At epoch 3, the result looks pretty good, so I'll use epoch 3 for publication.

It's nice to see the loss decrease as training progresses. However, the loss value is not an absolute measure of the quality of an epoch. Sometimes an epoch with a higher loss gives better sample images; in that case, choose the epoch with the better samples. That's why it's so important to write a sample prompt that shows the effect of the LoRA without overwhelming it. If you are not sure, keep the prompt very simple and include the trigger words of your choice.
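To see how small the loss changes in this training really are, here is a quick calculation in Python using the three loss values above. The function name is made up for illustration.

```python
losses = [0.305, 0.302, 0.301]  # loss at epochs 1, 2 and 3 in this training

def relative_improvements(losses):
    """Percent drop in loss from each epoch to the next."""
    return [(prev - cur) / prev * 100 for prev, cur in zip(losses, losses[1:])]

for epoch, pct in enumerate(relative_improvements(losses), start=2):
    print(f"epoch {epoch}: loss down {pct:.2f}% vs previous epoch")
```

Both drops are under 1%, so the numbers alone hardly separate the epochs. The sample images are what really tell them apart.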

5. Choosing the Best Epoch

If you train for many epochs and keep watching the progress, check whether the sample images begin to show overfitting artifacts, such as badly drawn hands, heads or faces, or anything else getting worse and breaking down. If so, you probably don't need to continue, so stop the training there and choose the best epoch by checking the sample images. Even if you don't see artifacts, you may notice that the sample images stop improving at some point and look almost the same from epoch to epoch. This is also a sign to stop the training because it has reached a saturation point. Save your credits by terminating early whenever possible.

Sometimes image #1 looks pretty good at epoch 5, but image #3 is much better at epoch 7. Which should you choose, epoch 5 or 7? Sound familiar? Don't worry. You can publish any epoch you want, generate images from those epochs, and compare the results. Still hard to tell which one is really better? Then keep them all as different versions in your project! Your users will choose whichever they want~ Alternatively, you can set one version as pro and the other as non-pro so that all users can enjoy your LoRA, and you can possibly earn more from it~.
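If you like to keep notes while comparing epochs, one simple approach is to rate each sample image yourself and compare the averages. The sketch below assumes a 1 to 5 rating that you assign by eye. The function `rank_epochs`, the `tie_margin` value and the ratings are all made up for illustration.

```python
from statistics import mean

def rank_epochs(ratings, tie_margin=0.25):
    """ratings: {epoch: [score per sample image]}.

    Returns (best_epoch, close_epochs), where close_epochs are the epochs
    whose average is within tie_margin of the best one.
    """
    averages = {epoch: mean(scores) for epoch, scores in ratings.items()}
    # Highest average wins; on an exact tie the earlier epoch is preferred.
    best = max(averages, key=lambda e: (averages[e], -e))
    close = sorted(
        e for e, avg in averages.items()
        if e != best and averages[best] - avg <= tie_margin
    )
    return best, close

# Made-up ratings: image #1 is best at epoch 5, image #3 is best at epoch 7.
ratings = {5: [5, 3, 3], 6: [3, 3, 3], 7: [3, 3, 5]}
best, close = rank_epochs(ratings)
print(best, close)  # 5 [7]
```

When the second list is not empty, the epochs are too close to call, and those are good candidates to publish as separate versions.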

6. LoRA Testing

I tested the LoRA at strengths from 0.8 to 1.0. The results are satisfactory. 🤗😉 Well, that's it for making an oil painting Style LoRA. Thanks for reading. Please let me know if there are any mistakes here.
