https://tensor.art/articles/868883505357024765 ( 1 )
https://tensor.art/articles/868883998204559176 ( 2 )
https://tensor.art/articles/868884792773445944 ( 3 )
https://tensor.art/articles/868890182957418586 ( 5 )
When it comes to training LoRAs, trying to fix every bug at the source is seriously exhausting. Unless you're doing LoRA training full-time, who really has the time and energy to spend that much of their free time on a single LoRA? Even if you are full-time, chances are you'd still prioritize efficiency over perfection. And even after going through all the trouble of eliminating those bugs, the result might only raise the “purity” from 60% to 80% (just a guess). After all, AI is still a game of randomness: the final training parameters, repeats, epochs, learning rate, optimizer, and so on will all influence the outcome. You'll never “purify” it to 100%, and honestly, even 60% can already be impressive. So is it worth it? My personal take: absolutely. If a certain character, or your OC, has been your favorite since childhood, is part of your emotional support, or represents a small dream in your life, then why not? They'll always be worth it.
I’ve only made a handful of LoRAs so far, each with a bit of thought and some controlled variables. I’ve never repeated the same workflow, and each result more or less met the expectations I had at the beginning. Still, the sample size is way too small. I don’t think my experiences are close to being truly reliable yet. If you notice anything wrong, please don’t hesitate to point it out—thank you so much. And if you think there’s value in these thoughts, why not give it a try yourself?
Oh, right—another disclaimer: due to the limitations of my PC setup, I have no idea what effect larger parameter values would have. All of this is based on training character LoRAs using the Illustrious model.
Also, a very important note: this is not a LoRA training tutorial or a definitive guide. If you’ve never made a LoRA yourself but are interested in doing so, try searching around online and go ahead and make your first one. The quality doesn’t matter; just get familiar with the process and experience firsthand the mix of joy and frustration it brings. That said, I’ll still try to lay out the logic clearly and help you get a sense of the steps involved.
0. Prepare your training set. This usually comes from anime screenshots or other material of the character you love. A lot of tutorials treat this as the most crucial step, but I won’t go into it here—you’ll understand why after reading the rest.
1. Get the tools ready. You’ll need a computer, and you’ll need to download a local LoRA trainer or a tagging tool of some kind. Tools like Tensor can sometimes have unstable network connections, but they’re very convenient. If your internet is reliable, feel free to use Tensor; otherwise, I recommend doing everything on your PC.
2. If you’ve never written prompts using Danbooru-style tags before, go read the tag wiki on Danbooru. Get familiar with the categories, what each one means, and look at the images they link to. This is super important—you’ll need to use those tags accurately on your training images.
3. Do the auto-tagging. These tagging tools will detect the elements in your image and generate tags for them. On Tensor, just use the default model wd-v1-4-vit-tagger-v2—it’s fine, since Tensor doesn’t support many models anyway, and you can’t adjust the threshold. On PC, you can experiment with different tagger models. Try setting the threshold to 0.10 to make the tags as detailed as possible. You can adjust it based on your own needs.
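To make the threshold idea concrete, here is a minimal sketch of what the filtering step boils down to: keep every tag whose confidence clears the cutoff and write them into the .txt caption file that sits next to the image. The tag_probs dict, the scores, and the file layout are made up for illustration and are not the output format of any particular tagger.

```python
from pathlib import Path

# Hypothetical tagger output for one image:
# each tag comes with a confidence score between 0 and 1.
tag_probs = {
    "1boy": 0.99,
    "white hoodie": 0.72,
    "blue eyes": 0.65,
    "abs": 0.34,
    "white background": 0.12,
    "hat": 0.04,
}

THRESHOLD = 0.10  # lower threshold = more (and noisier) tags

def write_caption(image_path: Path, probs: dict, threshold: float) -> None:
    """Keep tags above the threshold and save them as a comma-separated caption."""
    kept = [tag for tag, p in sorted(probs.items(), key=lambda kv: -kv[1]) if p >= threshold]
    caption_path = image_path.with_suffix(".txt")
    caption_path.write_text(", ".join(kept), encoding="utf-8")

write_caption(Path("dataset/img_001.png"), tag_probs, THRESHOLD)
```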
4. Now comes the most critical step—the one that takes up 99% of the entire training workload.
After tagging is complete, take a hard look at the first image in your dataset. Just how many different elements are in this image? Just as prompt order affects the output during image generation, tag order matters during training too. So don't enable the “shuffle tokens” parameter. Put the most important tokens first, such as the character's name and “1boy.”
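If you're editing captions locally, a tiny script can enforce that ordering across the whole dataset. This is only a sketch, assuming comma-separated .txt captions stored next to the images; “kaito” is a placeholder trigger name, and whatever trainer you use, leave its shuffle option off so this order is actually preserved.

```python
from pathlib import Path

# Tags that should always lead the caption, in this exact order.
# "kaito" is a stand-in trigger name; replace it with your own.
PRIORITY = ["kaito", "1boy"]

def reorder_caption(caption_path: Path) -> None:
    """Move the trigger tags to the front, keep everything else in its original order."""
    tags = [t.strip() for t in caption_path.read_text(encoding="utf-8").split(",") if t.strip()]
    front = [t for t in PRIORITY if t in tags]
    rest = [t for t in tags if t not in PRIORITY]
    caption_path.write_text(", ".join(front + rest), encoding="utf-8")

for txt in Path("dataset").glob("*.txt"):
    reorder_caption(txt)
```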
For the character's traits, I suggest keeping only two, and eye color is one of them. Avoid obscure color names; simple ones like “red” or “blue” are more than enough. You don't need to describe the hairstyle or hair color in detail, so delete all of the automatically generated hair-related tags. Double-check the eye color as well: the tagger sometimes outputs multiple colors like “red” and “orange” together, so make sure to delete the extras.
When it comes to hair (the second trait), my experience is: if the color is complex, just write the hairstyle (e.g., “short hair”); if the hairstyle is complex, just write the color. Actually, if the training is done properly, you don't even need to include those; the character name alone is enough. But including them is a safety measure in case you combine this LoRA with others that may be overfitted.
Any tags about things like teeth, tattoos, etc., should be completely removed. If they show up in the auto-tags, delete them. The same goes for tags describing age or body type, such as “muscular,” “toned,” “young,” “child male,” “dark-skinned male,” etc. And if there are nude images in your dataset, and you think the body type looks good and you want future generations to match that body type, do not include tags like “abs” or “pectorals.”
You may have realized the logic by now: because those tags were left in, the traits they describe stay explicitly labeled, so the AI treats them as interchangeable attributes rather than part of the character. That's why the body shape, age, or proportions can vary wildly in your outputs. Sometimes the figure looks as flat as a sheet of paper, and it's because “abs” and “pectorals” were sitting in your tags and quietly became part of the trigger prompts without you realizing it.
If you don't take the initiative to remove or add certain tags, you won't know which ones have enough weight to act as triggers; they all blend into the noise. If you don't call them, they won't appear. But if you do, even unintentionally, they will show up, and that can mean total chaos.
Once you’re done with all that, your character’s description should include only eye color and hair.
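Doing that cleanup by hand across dozens of captions gets old fast, so here is a small sketch that strips a blocklist of body-type and age tags from every caption file in a folder. The blocklist only contains the examples mentioned above; extend it with whatever your tagger keeps sneaking in.

```python
from pathlib import Path

# Tags we never want the model to learn as separate, prompt-able attributes.
BLOCKLIST = {
    "abs", "pectorals", "muscular", "toned", "teeth",
    "young", "child male", "dark-skinned male", "tattoo",
}

def clean_caption(caption_path: Path) -> None:
    """Drop any blocklisted tag from a comma-separated caption file."""
    tags = [t.strip() for t in caption_path.read_text(encoding="utf-8").split(",") if t.strip()]
    kept = [t for t in tags if t not in BLOCKLIST]
    caption_path.write_text(", ".join(kept), encoding="utf-8")

for txt in Path("dataset").glob("*.txt"):
    clean_caption(txt)
```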
For the character name used as a trigger word, don’t format it like Danbooru or e621. That’s because Illustrious and Noobai models already recognize a lot of characters. If your base model already knows your character, a repeated or overly formal name will only confuse it. What nickname do you usually use when referring to the character? Just go with that.
See how tedious this process is, even just for tag setup? It's far more complex than simply auto-tagging everything, batch-adding names, and picking out the high-frequency tags.
Remember the task at the start of this section? To identify all the elements in the first image. You’ve now covered the character features. Now let’s talk about the clothing.
Let’s say the boy in the image is wearing a white hoodie with blue sleeves, a tiger graphic on the front, and a chest pocket. Now you face a decision: do you want him to always wear this exact outfit, or do you want him to have a new outfit every day?
Auto-tagging tools don’t always fully tag the clothing. If you want him to wear different clothes all the time, then break down this outfit and tag each part accordingly using Danbooru-style tags. But if you want him to always wear the same thing, just use a single tag like “white hoodie,” or even give the outfit a custom name.
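As a made-up example, the same hoodie could be captioned either way. “kaito” and “kaito hoodie” are placeholder names, and the individual clothing tags are only illustrative, so check Danbooru for the canonical ones:

```
# Variant A: he can wear anything later, so break the outfit down
kaito, 1boy, blue eyes, white hoodie, blue sleeves, tiger print, chest pocket, ...

# Variant B: he always wears this exact outfit, so use one custom tag
kaito, 1boy, blue eyes, kaito hoodie, ...
```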
There’s more to say about clothing, but I’ll save that for the section about OCs. I already feel like this part is too long-winded, but it’s so tightly connected and info-heavy that I don’t know how to express it all clearly without rambling a bit.
Next, observe the character’s expression and pose. Use Danbooru-style tags to describe them clearly. I won’t repeat this later. Just remember—tags should align with Danbooru as closely as possible. Eye direction, facial expression, hand position, arm movement, leaning forward or backward, the angle of knees and legs—is the character running, fighting, lying down, sitting, etc.? Describe every detail you can.
Now, observe the background. Sky, interiors, buildings, trees—there’s a lot. Even a single wall, or objects on the wall, or the floor material indoors, or items on the floor—or what the character is holding. As mentioned earlier, if you don’t tag these things explicitly, they’re likely to show up alongside any chaotic high-weight tags you forgot to remove, suddenly appearing out of the ether.
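Putting all of this together, a finished caption for one screenshot might look something like the line below. The character name and outfit tag are placeholders carried over from the earlier example; every other tag should be checked against Danbooru:

```
kaito, 1boy, blue eyes, kaito hoodie, smile, looking at viewer, hand in pocket, sitting, indoors, classroom, desk, window, holding pencil
```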
Are there other characters in the scene? If so, describe them clearly using the same process. But I recommend avoiding such images altogether. Many LoRA datasets include them: for example, a girl standing next to the boy, or a mecha, or a robot. You need to “disassemble” these extra elements with tags. Otherwise, they'll linger like ghosts, randomly interfering with your generations.
Also, when tagging anime screenshots, the tool often adds “white background” by default—so this becomes one of the most common carriers of chaos.
At this point, you might already be feeling frustrated. The good news is that there are plenty of tools now that support automatic background removal—like the latest versions of Photoshop, some ComfyUI workflows, and various online services. These can even isolate just the clothes or other specific objects.
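As one concrete option, the open-source rembg library can batch-strip backgrounds locally. This is a minimal sketch assuming your images live in a dataset/ folder; it needs pip install rembg first, and it only removes the background as a whole, so isolating just the clothes or other objects still calls for the other tools mentioned above.

```python
from pathlib import Path
from PIL import Image
from rembg import remove  # pip install rembg

src = Path("dataset")
dst = Path("dataset_nobg")
dst.mkdir(exist_ok=True)

for img_path in src.glob("*.png"):
    image = Image.open(img_path)
    cutout = remove(image)            # returns the image with the background made transparent
    cutout.save(dst / img_path.name)  # PNG keeps the alpha channel
```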