So you want to make a LORA? (Part 3)



Okay, you’ve read this far (thanks for sticking with me) and have decided that you’re going to give this LORA thing serious consideration. Great! But what now? Well, it’s time to…

Step 2: Curate quality source images for model training.

Once you have a solid idea, be it a character or style, you need a good reference dataset. That means images. To make a LORA you need thematically relevant reference images. These images are used to train the LORA.

The standard that seems to work best is to collect or create between 10 and 80 high-quality images that fit the theme of the LORA you want to create.

What? 80 fracking images…to create from scratch!?

That may sound like a lot, and it is, but the good news is you can source your sample images from anywhere. You don’t actually have to generate your own training images; it’s just often quicker to get the specific type of artwork to fit a theme this way. Perchance, if you have an account on another platform, there’s no reason you couldn’t generate images there and use them. In fact, it might be better to curate images from multiple sources as a quality control. Your trained LORA will not suffer if the reference images weren’t all generated on the same platform, or even with the same base model. What’s important is image quality and resolution, not quantity, along with consistency.

Where to start?

The highest bestest most awesome source images won’t be worth a can of refried beans if you choose the wrong type of base model. So far I have only tried Lightning FLUX and SD 3.5L, not because of cost but because those seem to be good with most Character LORA styles. AND those were the base models I’d used to make art with when I started. There are many other options, like ones geared toward Anime, but I picked a familiar base model that generated a style of AI art similar to what I wanted to create.

Honestly, I’d just stick with FLUX to start with. It seems decent, even when only using 10-12 images. Best of all, the learning curve wasn’t that steep. However, to generate a consistent look, style, or character pose, you do need some repetition.

How much repetition?

I’ve found 2-4 similar images seem to work okay. You can do 30. But that’s just silly overkill. No, seriously. I had no clue and used 30 similar images of Bacchae dancing at a Dionysian revel or mountain glade initiation. The result, even at a LORA weight of 1, is bleed-through, mostly of grapes. Lots of grapes, even when not prompted for. No need to shame the LORA, not her fault. But, because of that mild snafu, I’d suggest that in larger LORAs you keep repetition to 10 or fewer similar images, and then only to cover a full range of front (2), back (2), and side views (2 left, 2 right) of the subject. For Character LORAs, also include at least 1 to 4 good-quality head or head-and-shoulder shots.
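If it helps to keep that shot list straight, you can tally your images before uploading. What follows is only a rough Python sketch of my own, not anything the training tool provides: the view labels ("front", "head", and so on), the constant names, and the check_coverage helper are all hypothetical, and the cap of 10 on body views is just my reading of the suggestion above.

```python
from collections import Counter

# Targets taken from the suggestion above; the label names are hypothetical.
BODY_VIEW_TARGETS = {"front": 2, "back": 2, "left": 2, "right": 2}
MIN_HEAD_SHOTS = 1   # "at least 1 to 4" head or head-and-shoulder shots
MAX_SIMILAR = 10     # assumption: cap on similar body views, per the text


def check_coverage(views):
    """Return a list of plain-text warnings for a list of view labels."""
    counts = Counter(views)
    warnings = []
    # Flag any body view that has fewer images than the suggested target.
    for view, target in BODY_VIEW_TARGETS.items():
        if counts[view] < target:
            warnings.append(f"need {target - counts[view]} more {view} view(s)")
    # Character LORAs want at least one head shot.
    if counts["head"] < MIN_HEAD_SHOTS:
        warnings.append("need at least 1 head shot")
    # Too many near-identical body views invites bleed-through.
    body_total = sum(counts[v] for v in BODY_VIEW_TARGETS)
    if body_total > MAX_SIMILAR:
        warnings.append(
            f"{body_total} similar body views; consider trimming to {MAX_SIMILAR}"
        )
    return warnings


if __name__ == "__main__":
    # One label per image in a hypothetical character dataset.
    my_views = ["front", "front", "back", "left", "left", "right", "head"]
    for line in check_coverage(my_views):
        print(line)
```

You label each image by hand (one label per image), so this is a checklist helper, nothing smarter. An empty result means the set matches the suggested spread.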

To recap, whether generating AI images or searching for art/pictures, consider the character poses and subject positioning. Of course, how many images you decide to use will depend upon what tier you are at and what you want to create. Free tier community members can make LORAs, and relatively decent ones, even if we are limited to 100 images. So far I’ve not required that many for a dataset. So don’t worry about it. Start simple with a test LORA of 10-20 reference images. See how that goes.

Remember that the type of LORA you can create will depend on how many credits you have banked, since the more source images you employ, the higher the cost to train will be. That’s an important consideration to keep in mind. The good news is, once in the online training menu, you can toggle through the various options. After all your images have uploaded and the tags have been applied, you will be able to see how much each particular LORA would cost to create. So, whether free tier or Pro, it doesn’t hurt to toggle through each option you are considering and compare the cost.

For the best result, your images should be good quality with decent resolution. If you have image editing tools, clean up the images where necessary. That means removing extraneous logos, screen bugs, and other unwanted text or image elements. Remember, these are TRAINING images. That means if you use screencaps with a station logo, or an image with a screen bug (like a URL), that will become part of the trained model.

In Practical Terms: The more images, the more time it will take to train. AND the more it will cost. Cleaning up images with logos or other visible watermarks is a must. If you can upscale lower-quality images, do so, but if the result after upscaling isn’t crisp and clear, trash it. (Pay close attention to the eyes when working with character images. Not all upscale tools work the same.) Use only clean, in-focus, high-quality images when possible, unless you are doing a Style LORA that requires film grain or blurriness.
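That keep, fix, or trash decision can be written down as a tiny checklist. The Python sketch below is just my way of illustrating it: the triage function, its arguments, and especially the MIN_SIDE number are hypothetical (I haven’t named a minimum resolution anywhere, so pick your own), and whether an image counts as crisp is still your own eyeball verdict.

```python
MIN_SIDE = 1024  # hypothetical threshold in pixels; choose your own


def triage(width, height, has_logo=False, crisp=True, style_wants_blur=False):
    """Suggest what to do with one candidate training image.

    has_logo: True if there is a logo, watermark, URL, or screen bug.
    crisp: your own judgement of focus and clarity (check the eyes).
    style_wants_blur: True for a Style LORA that needs grain or blur.
    """
    if has_logo:
        return "clean up"   # logos and watermarks would get trained in
    if min(width, height) < MIN_SIDE:
        return "upscale"    # then judge the upscaled result again
    if not crisp and not style_wants_blur:
        return "trash"      # still not crisp and clear, so drop it
    return "keep"


if __name__ == "__main__":
    print(triage(2048, 2048))                 # a clean, sharp image
    print(triage(512, 768))                   # too small, try upscaling
    print(triage(2048, 2048, has_logo=True))  # needs editing first
```

After cleaning up or upscaling an image, run it through the same checklist again with its new size and your new verdict on how crisp it looks.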

Thanks for reading this far. If my terse writing style hasn’t turned you off to creating a LORA then you are ready for…

Step 3: Dive in head first or check out one of the tensor.art specific step-by-step guides for LORA creation.

(PART 4: https://tensor.art/articles/877929254090760909)
