How I LoRA: A beginner's guide to LoRA training | Part 2: Training Basics


A step-by-step guide on how to train a LoRA; part 2!

Warning: This guide is based on Kohya_SS

This guide REQUIRES that you read "How I LoRA: A beginner's guide to LoRA training | Part 1: Dataset Prep."

This guide CAN be ported to Tensor.art's trainer, if you know what you are doing.

This guide is an (almost) 1:1 of the following guide: https://civitai.com/articles/3522/valstrixs-crash-course-guide-to-lora-and-lycoris-training

Edits were made to keep it short and dive only into the crucial details. It also removes a lot of recommendations I DO NOT follow; for more advanced information, please support the original guide. If you want to do things MY way, keep reading.

THE SETTINGS USED ARE BASED ON SDXL, DO NOT FOLLOW IF YOU ARE TRAINING ON V-PRED OR SD 1.5

Training: Basics

Now that you have your dataset, you need to actually train on it, which requires a training script. The most commonly used scripts, which I also use, are the Kohya Scripts. I personally use the Kohya-SS GUI, a fork of the SD-Scripts command-line trainer.

Once you have it installed and open, make sure you navigate to the LoRA tab at the top (it defaults to Dreambooth, an older method).

There are a lot of things that can be tweaked and changed in Kohya, so we'll take it slow. Assume that anything I don't mention here can be left alone.

We'll go down vertically, tab by tab.

Accelerate Launch 

This tab is where your multi-GPU settings are, if you have them. Otherwise, skip this tab entirely, as the defaults are perfectly fine. Training precision is also here, and should match your Save precision in the following tab, but you won't touch it otherwise.

Model

This tab, as you've likely guessed, is where you set your model for training, select your dataset, etc.

  • Pretrained model name or path:

    • Input the full file path to the model you'll use to train.

  • Trained Model output name:

    • Will be the name of your output file. Name it however you like.

  • Image folder (containing training images subfolders):

    • Should be the full file path to your training folder, but not the one with the X_ prefix. Set the path to the folder that folder is inside of. Ex: "C:/Training Folders/Concept Folder/".

  • Underneath that, there are 3 checkboxes:

    • v2: Check if you're using an SD 2.X model.

    • v_parameterization: Check if your model supports V-Prediction (VPred).

    • SDXL Model: Check if you're using some form of SDXL, obviously.

  • Save trained model as:

    • Can stay as "safetensors". "ckpt" is an older, less secure format. Unless you're purposefully using an ancient pre-safetensor version of something, ckpt should never be used.

  • Save precision:

    • "fp16" stores more precise values but has a much smaller maximum value. "bf16" holds less precise data but covers a far larger range, and seems faster to train on non-consumer cards (if you happen to have one). Choose based on your needs, but I stick with fp16, as the higher precision is generally better for more complex designs. "float" saves your LoRA in fp32 format, which gives it an overkill file size. Niche usage.
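If you're curious where that trade-off comes from, here's a quick illustrative sketch (not part of the trainer) deriving each format's maximum representable value from its bit layout:

```python
# Max value of a binary float format: (2 - 2**-mantissa_bits) * 2**max_exponent,
# where max_exponent comes from the IEEE-style exponent bias.
def float_max(mantissa_bits: int, exponent_bits: int) -> float:
    max_exp = 2 ** (exponent_bits - 1) - 1
    return (2 - 2 ** -mantissa_bits) * 2.0 ** max_exp

fp16_max = float_max(mantissa_bits=10, exponent_bits=5)  # IEEE half precision
bf16_max = float_max(mantissa_bits=7, exponent_bits=8)   # bfloat16

print(fp16_max)          # 65504.0 -- small range, more mantissa precision
print(f"{bf16_max:.3e}") # ~3.390e+38 -- fp32-like range, less precision
```

In short: fp16 spends its bits on precision, bf16 spends them on range, which is why fp16 can overflow where bf16 won't.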

Metadata

A section for meta information. This is entirely optional, but could help people figure out how to use the LoRA (or who made it) if they find it off-site. I recommend putting your username in the author slot, at least.

Folders

As simple as it gets: Set your output/reg folders here, and logging directory if you want to.

  • Output folder:

    • Where your models will end up when they are saved during/after training. Set this to wherever you like.

  • Regularization directory:

    • Should be left empty unless you plan to use a Prior Preservation dataset from section 3.5, following a similar path to the image folder. Ex: "C:/Training Folders/Regularization/RegConceptA/".
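To make the two path settings concrete, here's a small sketch with hypothetical folder names (substitute your own). The key point is that both boxes point at the parent of the repeat-prefixed ("X_") folders, and I'm assuming the regularization set uses the same repeat-prefix layout:

```python
from pathlib import Path

# Hypothetical layout -- substitute your own names and drive/root.
root = Path("Training Folders")

image_folder = root / "Concept Folder"                # -> "Image folder" box
(image_folder / "10_myconcept").mkdir(parents=True, exist_ok=True)

# Optional: only if you're using a prior-preservation set (see part 1).
reg_folder = root / "Regularization" / "RegConceptA"  # -> "Regularization directory" box
(reg_folder / "1_myclass").mkdir(parents=True, exist_ok=True)
```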

Parameters

The bread and butter of training. Nearly everything we'll set is in this section. Don't bother with the presets, most of the time.

  • Lora Type: Standard

  • Train Batch Size:

    • How many images will be trained simultaneously. The larger this number, the more VRAM you will use; don't go over 1 if you have low VRAM.

  • Max train steps:

    • RECOMMENDED. Forces training to stop at the exact step count provided, overriding epochs. Useful if you want to stop at a flat 2000 steps or similar. 3000 steps is my recommended cap.

  • Save every n epochs:

    • RECOMMENDED. Saves a LoRA every X epochs as training progresses. This can be useful to go back through and see where your sweet spot might be. I usually keep this at 1, saving every epoch.

    • Your final epoch will always be saved, so a save interval that doesn't divide your epoch count evenly can still prove useful: saving every 3 epochs of a 10-epoch training will give you epochs 3, 6, 9, & 10, leaving a fallback right at the end if it started to overbake.

  • Cache latents & Cache latents to disk:

    • These affect where your data is cached during training. If you have a recent graphics card, "cache latents" is the better and faster choice, which keeps your data loaded on the card while it trains. If you're lacking VRAM, the "to disk" version is slower but doesn't eat your VRAM to do so.

    • Caching to disk, however, prevents the need to re-cache the data if you run it multiple times, so long as there weren't any changes to it. Useful for tweaking trainer settings. (Recommended.)
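To see how batch size, repeat count, epochs, and the save interval interact, here's some back-of-the-envelope math with hypothetical numbers (50 images under a 10_ repeat prefix, batch size 2, 10 epochs):

```python
# Hypothetical dataset/settings -- substitute your own numbers.
images, repeats, batch_size, epochs = 50, 10, 2, 10

steps_per_epoch = images * repeats // batch_size  # 250
total_steps = steps_per_epoch * epochs
print(total_steps)  # 2500 -- under the 3000-step cap recommended above

# "Save every n epochs" = 3: intermediate saves, plus the final epoch always.
save_every = 3
saved = sorted(set(range(save_every, epochs + 1, save_every)) | {epochs})
print(saved)  # [3, 6, 9, 10]
```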

The next settings are calibrated for low VRAM usage; read the original guide if you have VRAM to spare. Anything highlighted was changed for maximum VRAM optimization.

  • LR Scheduler: Cosine With Restarts

  • Optimizer: AdamW8bit.

  • Optimizer extra arguments: weight_decay=0.01 betas=0.9,0.99

  • "Learning Rate": 0.0001

As a general note, the specific "text encoder" and "unet" learning rate boxes lower down will override the main box, if values are set in them.

  • LR warmup (% of total steps): 20

  • "LR # cycles": 3

  • "Max resolution": 1024,1024

  • Enable buckets: True

  • Min/Max bucket resolution: 256; 2048

  • "Text Encoder & Unet learning rate": 0.001; 0.003

  • No half VAE: Should always be True, imo, just to save you the headache.

  • Network Rank & Network Alpha: 8 / 8

    • Your Alpha should be kept to the same number as your Rank, in most scenarios.

  • Network Dropout:

    • Recommended, but optional. A value of 0.1 is a good, universal value. Helps with overfitting in most scenarios.
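For intuition, here's a sketch of the learning-rate curve the scheduler settings above describe: a linear warmup over the first 20% of steps, then cosine decay with 3 hard restarts. This is modeled on the common hard-restarts scheduler implementation; the trainer's exact curve may differ slightly.

```python
import math

def lr_at(step, total_steps=3000, base_lr=1e-4, warmup_frac=0.20, num_cycles=3):
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # linear warmup up to base_lr
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    # each cycle decays from base_lr toward 0, then restarts at base_lr
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * ((progress * num_cycles) % 1.0)))

print(lr_at(0))     # 0.0
print(lr_at(600))   # 0.0001 -- warmup just finished, first cycle starts at full LR
print(lr_at(3000))  # 0.0 -- end of the final cycle
```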

Advanced (Subtab)

We won't touch much here, as most values have niche purposes.

  • Gradient accumulate steps: 1

  • Prior loss weight: 1

  • Keep n tokens:

    • For use with caption shuffling, to prevent the first X number of tags from being shuffled.

    • If using shuffling, this should always be 1 at minimum, which will prevent your instance token from being thrown around.

  • Clip skip:

    • Should be set to the clip skip value of your model. Most anime & SDXL models use 2; most others use 1. If you're unsure, most Civitai model pages note the value used.

  • Full bf16 training: False

  • Gradient Checkpointing: True

  • Shuffle Caption: True

  • Persistent Data Loader: False

  • Memory Efficient Attention:

    • Use only if you're not on an Nvidia card (e.g. AMD). This replaces the xformers CrossAttention.

  • CrossAttention:

    • xformers, always. (As long as you're on an Nvidia card, which you really should be.)

      • If for whatever reason you can't use xformers, SDPA is your next best option. It eats more RAM and is a bit slower, but it's better than nothing.

  • Color augmentation:

    • Do not.

  • Flip Augmentation: False

  • Min SNR Gamma: 1

  • Debiased Estimation Loss: False

  • Bucket resolution steps: 64

  • Random crop instead of center crop: False

  • Noise offset type: Multires

  • Multires noise iterations: 6

  • Multires noise discount: 0.3

  • IP noise gamma: 0.1
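Two of the settings above, Shuffle Caption and Keep n tokens, are easy to picture with a toy sketch (the tag names are made up):

```python
import random

# Sketch of what caption shuffling does at train time: the first keep_n
# comma-separated tags stay pinned; only the rest get shuffled.
def shuffle_caption(caption: str, keep_n: int, rng: random.Random) -> str:
    tags = [t.strip() for t in caption.split(",")]
    head, tail = tags[:keep_n], tags[keep_n:]
    rng.shuffle(tail)
    return ", ".join(head + tail)

caption = "mychar, 1girl, red hair, smile, outdoors"
shuffled = shuffle_caption(caption, keep_n=1, rng=random.Random(0))
print(shuffled)  # "mychar" is always first; the other tags land in a random order
```

With keep_n=1, your instance token stays at the front of every caption while the descriptive tags vary in order, which is exactly why the guide says to set it to at least 1 when shuffling.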

And that's everything! Scroll to the top, open the "configuration" dropdown, and save your settings with whatever name you'd like. Once you've done that, hit "start training" at the bottom and wait! Depending on your card, settings, and image count, this can take quite some time.


THIS IS IT FOR PART 2: Training Basics!
