ArturoWolff


The one and only linktr.ee/ArturoWolff

Articles

How I LoRA: A beginner's guide to LoRA training | Part 3: Testing your LoRA

A step-by-step guide on how to train a LoRA; part 3!

Warning: This guide is based on Kohya_SS.
This guide REQUIRES that you read "How I LoRA: A beginner's guide to LoRA training | Part 1: Dataset Prep." and "How I LoRA: A beginner's guide to LoRA training | Part 2: Training Basics".
This guide CAN be ported to Tensor.art's trainer, if you know what you are doing.
This guide is an (almost) 1:1 of the following guide: https://civitai.com/articles/3522/valstrixs-crash-course-guide-to-lora-and-lycoris-training
Edits were made to keep it short and only dive into the crucial details. It also removes a lot of recommendations I DO NOT follow; for more advanced information, please support the original guide. If you want to do things MY way, keep reading.

THE SETTINGS USED ARE BASED ON SDXL, DO NOT FOLLOW IF YOU ARE TRAINING ON V-PRED OR 1.5

Testing your LoRA
There are two ways to test a LoRA: during training and after training.

During:
While in Kohya_ss, there is a section for a "test" prompt. Use it. If you followed the guide, you should have set "save every N epochs" to 1, meaning that every epoch it will save a model and, by proxy, test it with the given prompt. Look at each image and judge its quality.

After (the right way):
After training is done, move all your safetensors files to the lora folder of your WebUI installation. I will assume you have A1111, A1111 Forge or A1111 re-Forge (the best one).

For a character LoRA:
1. On your WebUI, set yourself up with all the settings you would normally use: checkpoint, scheduler, etc.
2. Copy/paste one of your dataset prompts into the prompt area (this will test overfitting).
3. Navigate to the LoRA subtab and add the first file, e.g. Shondo_Noob-000001.safetensors. This adds the LoRA to the prompt as <lora:Shondo_Noob-000001:1>; change the :1 to :0.1.
4. Set a fixed seed, e.g. 1234567890.
5. Scroll down to the "script" area of your WebUI and select X/Y/Z plot.
6. Set your X, Y and Z as "Prompt S/R" (see the sketch after these lists for what Prompt S/R actually does).
7. On X, write all of your LoRA's filenames, e.g. Shondo_Noob-000001, Shondo_Noob-000002, Shondo_Noob-000003, Shondo_Noob-000004, etc., depending on how many files you saved, their names, etc. ALWAYS SEPARATE WITH A COMMA.
8. On Y, write all the strength values from 0.1 to 1, i.e. 0.1, 0.2, 0.3, etc. ALWAYS SEPARATE WITH A COMMA.
9. On Z, write an alternate tag to test flexibility. So if your prompt is "fallenshadow, standing, dress, smile", write something like: dress, nude, swimwear, underwear, etc. This will create a grid where, instead of wearing a dress, she will be nude, wear a swimsuit, etc. ALWAYS SEPARATE WITH A COMMA.

If you did a concept LoRA or a style LoRA:
1. On your WebUI, set yourself up with all the settings you would normally use: checkpoint, scheduler, etc.
2. Copy/paste one of your dataset prompts into the prompt area (this will test overfitting).
3. Navigate to the LoRA subtab and add the first file, e.g. doggystyle-000001.safetensors. This adds the LoRA to the prompt as <lora:doggystyle-000001:1>; change the :1 to :0.1.
4. Set a fixed seed, e.g. 1234567890.
5. Scroll down to the "script" area of your WebUI and select X/Y/Z plot.
6. Set your X and Y as "Prompt S/R".
7. On X, write all of your LoRA's filenames, e.g. doggystyle-000001, doggystyle-000002, doggystyle-000003, doggystyle-000004, etc., depending on how many files you saved, their names, etc. ALWAYS SEPARATE WITH A COMMA.
8. On Y, write all the strength values from 0.1 to 1, i.e. 0.1, 0.2, 0.3, etc. ALWAYS SEPARATE WITH A COMMA.
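If you're curious what the X/Y/Z "Prompt S/R" script is actually doing under the hood, here is a minimal sketch of the search-and-replace expansion in plain Python (no WebUI required): the first value of each axis is the string that gets searched for in your prompt, and every other value is swapped in for it, one cell of the grid at a time. The filenames, strengths and tags are just the examples from above.

```python
from itertools import product

# Prompt containing the FIRST value of each axis (that is what S/R searches for).
base_prompt = "fallenshadow, standing, dress, smile, <lora:Shondo_Noob-000001:0.1>"

# Axis values, comma-separated exactly like in the WebUI script.
x_values = ["Shondo_Noob-000001", "Shondo_Noob-000002", "Shondo_Noob-000003"]  # saved epochs
y_values = ["0.1", "0.2", "0.3"]                                               # LoRA strengths
z_values = ["dress", "nude", "swimwear"]                                       # flexibility tags

# Prompt S/R: replace the first value of each axis with the current cell's value.
for x, y, z in product(x_values, y_values, z_values):
    prompt = base_prompt.replace(x_values[0], x)
    prompt = prompt.replace(y_values[0], y)
    prompt = prompt.replace(z_values[0], z)
    print(prompt)  # each of these is generated with the same fixed seed
```

Every printed prompt corresponds to one cell of the grid the script renders, which is why the fixed seed matters: the only thing changing between cells is the file, the strength, or the swapped tag.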
Selecting the right file
Once the process finishes, you should have at least two grids: one X/Y grid with dress and another with nude (for example), or just one if you didn't set up a Z axis. Up to you.

Now look at the grid and look for the "best" result. Look at art style bias, pose bias, look bias, etc. The more flexible, the better.
If on fallenshadow-000005 Shondo's pose is always unique, but after 000006 she's always standing the same way, ignore 000006+.
If at some point the art style gets ignored, or changes and the LoRA fixates on it, ignore it.
If at some point ANYTHING you don't want starts repeating, ignore it.
The only thing that should repeat at all times is whatever corresponds to the trained concept. If you trained a wolf with a hat, but it should always be a different hat, avoid a file that gives him the same hat in the same pose with the same style.
If the result image is identical to the training data, avoid it! You are not here to reproduce the images in your data, you are here to make new ones, remember?
If colors are weird: bad.
If shapes are mushy: bad.
If the angle is always the same: bad (unless you prompted for it).
Anything that goes against the concept or its flexibility: BAD.
Any file that has to be used lower than 1 or 0.9: BAD. If your LoRA "works best" at 0.6 strength, it's shit.

THIS IS IT FOR PART 3. Now go make some good, cool LoRAs.
How I LoRA: A beginner's guide to LoRA training | Part 2: Training Basics

A step-by-step guide on how to train a LoRA; part 2!

Warning: This guide is based on Kohya_SS.
This guide REQUIRES that you read "How I LoRA: A beginner's guide to LoRA training | Part 1: Dataset Prep."
This guide CAN be ported to Tensor.art's trainer, if you know what you are doing.
This guide is an (almost) 1:1 of the following guide: https://civitai.com/articles/3522/valstrixs-crash-course-guide-to-lora-and-lycoris-training
Edits were made to keep it short and only dive into the crucial details. It also removes a lot of recommendations I DO NOT follow; for more advanced information, please support the original guide. If you want to do things MY way, keep reading.

THE SETTINGS USED ARE BASED ON SDXL, DO NOT FOLLOW IF YOU ARE TRAINING ON V-PRED OR 1.5

Training: Basics
Now that you have your dataset, you need to actually train it, which requires a training script. The most commonly used scripts, which I also use, are the Kohya scripts. I personally use the Kohya-SS GUI, a fork of the SD-Scripts command-line trainer.
Once you have it installed and open, make sure you navigate to the LoRA tab at the top (it defaults to Dreambooth, an older method). There are a lot of things that can be tweaked and changed in Kohya, so we'll take it slow. Assume that anything I don't mention here can be left alone. We'll go down vertically, tab by tab.

Accelerate Launch
This tab is where your multi-GPU settings are, if you have them. Otherwise, skip this tab entirely, as the defaults are perfectly fine. Training precision is also here, and should match your save precision in the following tab, but you won't touch it otherwise.

Model
This tab, as you've likely guessed, is where you set your model for training, select your dataset, etc.
Pretrained model name or path: Input the full file path to the model you'll use to train.
Trained Model output name: Will be the name of your output file. Name it however you like.
Image folder (containing training images subfolders): Should be the full file path to your training folder, but not the one with the X_. You should set the path to the folder that folder is inside of. Ex: "C:/Training Folders/Concept Folder/".
Underneath that, there are 3 checkboxes:
v2: Check if you're using an SD 2.X model.
v_parameterization: Check if your model supports V-Prediction (VPred).
SDXL Model: Check if you're using some form of SDXL, obviously.
Save trained model as: Can stay as "safetensors". "ckpt" is an older, less secure format. Unless you're purposefully using an ancient pre-safetensors version of something, ckpt should never be used.
Save precision: "fp16" has higher precision data, but internally has smaller max values. "bf16" holds less precise data, but can use larger values, and seems faster to train on non-consumer cards (if you happen to have one). Choose based on your needs, but I stick with fp16, as the higher precision is generally better for more complex designs. "float" saves your LoRA in fp32 format, which gives it an overkill file size. Niche usage.

Metadata
A section for meta information. This is entirely optional, but could help people figure out how to use the LoRA (or who made it) if they find it off-site. I recommend putting your username in the author slot, at least.

Folders
As simple as it gets: set your output/reg folders here, and a logging directory if you want to.
Output folder: Where your models will end up when they are saved during/after training. Set this to wherever you like.
Regularization directory: Should be left empty unless you plan to use a Prior Preservation dataset from section 3.5, following a similar path to the image folder. Ex: "C:/Training Folders/Regularization/RegConceptA/".

Parameters
The bread-and-butter of training. Mostly everything we'll set is in this section. Don't bother with the presets, most of the time.
LoRA Type: Standard
Train Batch Size: How many images will be trained simultaneously. The larger this number, the more VRAM you will use; don't go over 1 if you have low VRAM.
Max train steps: RECOMMENDED. Forces training to stop at the exact step count provided, overriding epochs. Useful if you want to stop at a flat 2000 steps or similar. 3000 steps is my recommended cap (see the step-count sketch after this section).
Save every N epochs: RECOMMENDED. Saves a LoRA every X epochs before training finishes. This can be useful to go back to and see where your sweet spot might be. I usually keep this at 1, saving every epoch.
Your final epoch will always be saved, so setting this to an odd number can prove useful: saving every 3 epochs on a 10-epoch training will give you epochs 3, 6, 9 & 10, giving you a fallback right at the end if it started to overbake.
Cache latents & Cache latents to disk: These affect where your data is loaded during training. If you have a recent graphics card, "cache latents" is the better and faster choice, which keeps your data loaded on the card while it trains. If you're lacking VRAM, the "to disk" version is slower but doesn't eat your VRAM to do so.
Caching to disk, however, prevents the need to re-cache the data if you run it multiple times, so long as there weren't any changes to it. Useful for tweaking trainer settings. (Recommended.)
The next settings are calibrated for low VRAM usage; read the original guide if you have VRAM to spare. Anything highlighted was changed for maximum VRAM optimization.
LR Scheduler: Cosine With Restarts
Optimizer: AdamW8bit
Optimizer extra arguments: weight_decay=0.01 betas=0.9,0.99
Learning Rate: 0.0001
As a general note, the specific "text encoder" and "unet" learning rate boxes lower down will override the main box if values are set in them.
LR warmup (% of total steps): 20
LR # cycles: 3
Max resolution: 1024,1024
Enable buckets: True
Min/Max bucket resolution: 256; 2048
Text Encoder & Unet learning rate: 0.001; 0.003
No half VAE: Should always be True, imo, just to save you the headache.
Network Rank & Network Alpha: 8 / 8
Your Alpha should be kept to the same number as your Rank in most scenarios.
Network Dropout: Recommended, but optional. A value of 0.1 is a good, universal value. Helps with overfitting in most scenarios.
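As a side note on how batch size, epochs and max train steps interact: Kohya counts one step per batch, so steps per epoch is roughly (number of images × folder repeats) ÷ batch size, and regularization images, if you use them, add on top of that. A quick sketch of the arithmetic with hypothetical numbers (the folder repeat of 1 matches the "1_Concept" naming from Part 1):

```python
import math

# Hypothetical numbers for illustration; adjust to your own dataset.
num_images = 60       # images inside your 1_Concept folder
folder_repeats = 1    # the number prefix on the folder, e.g. "1_Concept" -> 1
batch_size = 1        # Train Batch Size
epochs = 10

steps_per_epoch = math.ceil(num_images * folder_repeats / batch_size)
total_steps = steps_per_epoch * epochs

print(f"{steps_per_epoch} steps per epoch, {total_steps} total steps")
# With the numbers above: 60 steps per epoch, 600 total steps.
# If this lands above the ~3000-step cap recommended here, lower the epochs
# or repeats, or set "Max train steps" to clamp the run.
```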
Advanced (Subtab)
We won't touch much here, as most values have niche purposes.
Gradient accumulate steps: 1
Prior loss weight: 1
Keep n tokens: For use with caption shuffling, to prevent the first X number of tags from being shuffled. If using shuffling, this should always be at least 1, which will prevent your instance token from being thrown around.
Clip skip: Should be set to the clip skip value of your model. Most anime & SDXL models use 2, most others use 1. If you're unsure, most Civitai models note the used value on their page.
Full bf16 training: False
Gradient Checkpointing: True
Shuffle Caption: True
Persistent Data Loader: False
Memory Efficient Attention: Use only if you're not on an Nvidia card, e.g. AMD. This replaces xformers CrossAttention.
CrossAttention: xformers, always (as long as you're on an Nvidia card, which you really should be). If for whatever reason you can't use xformers, SDPA is your next best option. It eats more RAM and is a bit slower, but it's better than nothing.
Color augmentation: Do not.
Flip Augmentation: False
Min SNR Gamma: 1
Debiased Estimation Loss: False
Bucket resolution steps: 64
Random crop instead of center crop: False
Noise offset type: Multires
Multires noise iterations: 6
Multires noise discount: 0.3
IP noise gamma: 0.1
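For reference, the GUI ultimately hands these values to Kohya's sd-scripts. Below is a minimal sketch of what a roughly equivalent launch looks like if you ever run sdxl_train_network.py yourself: the paths are placeholders, this is not the exact command the GUI builds, and you should double-check the flag names against your sd-scripts version before relying on them.

```python
import subprocess

# Placeholder paths; run this from your sd-scripts folder.
cmd = [
    "accelerate", "launch", "sdxl_train_network.py",
    "--pretrained_model_name_or_path", "C:/Models/base_model.safetensors",
    "--train_data_dir", "C:/Training Folders/Concept Folder",   # parent of 1_Concept
    "--output_dir", "C:/Training Folders/Output",
    "--output_name", "Shondo_Noob",
    "--network_module", "networks.lora",
    "--network_dim", "8", "--network_alpha", "8", "--network_dropout", "0.1",
    "--learning_rate", "0.0001",
    "--unet_lr", "0.003", "--text_encoder_lr", "0.001",
    "--lr_scheduler", "cosine_with_restarts", "--lr_scheduler_num_cycles", "3",
    "--optimizer_type", "AdamW8bit",
    "--optimizer_args", "weight_decay=0.01", "betas=0.9,0.99",
    "--train_batch_size", "1", "--max_train_steps", "3000",
    "--save_every_n_epochs", "1", "--save_model_as", "safetensors",
    "--save_precision", "fp16", "--mixed_precision", "fp16",
    "--resolution", "1024,1024", "--enable_bucket",
    "--min_bucket_reso", "256", "--max_bucket_reso", "2048",
    "--clip_skip", "2", "--keep_tokens", "1", "--shuffle_caption",
    "--cache_latents", "--gradient_checkpointing", "--xformers",
    "--no_half_vae", "--min_snr_gamma", "1",
    "--multires_noise_iterations", "6", "--multires_noise_discount", "0.3",
]
subprocess.run(cmd, check=True)
```

Knowing the underlying flags also makes it easier to read the config file the GUI saves, since the names map almost one-to-one onto the boxes described above.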
And that's everything! Scroll to the top, open the "Configuration" dropdown, and save your settings with whatever name you'd like. Once you've done that, hit "Start training" at the bottom and wait! Depending on your card, settings, and image count, this can take quite some time.
Here are some visuals:
Once training begins:
THIS IS IT FOR PART 2: Training Basics!
How I LoRA: A beginner's guide to LoRA training | Part 1: Dataset Prep.

How I LoRA: A beginner's guide to LoRA training
A step-by-step guide on how to train a LoRA.

Warning: This guide is based on Kohya_SS.
This guide REQUIRES a basic understanding of image generation; read my guide "How I art: A beginner's guide" for a basic understanding of image generation.
This guide REQUIRES a basic understanding of image editing, tagging, and WebUI navigation.
This guide CAN be ported to Tensor.art's trainer, if you know what you are doing.
This guide is an (almost) 1:1 of the following guide: https://civitai.com/articles/3522/valstrixs-crash-course-guide-to-lora-and-lycoris-training
Edits were made to keep it short and only dive into the crucial details. It also removes a lot of recommendations I DO NOT follow; for more advanced information, please support the original guide. If you want to do things MY way, keep reading.

Part 1 | Datasets: Gathering & Basics
Your dataset is THE MOST IMPORTANT aspect of your LoRA, hands down. A bad dataset will produce a bad LoRA every time, regardless of your settings. Garbage data in gives garbage data out!

Image Count: Personally, I recommend a dataset size of anywhere from 50 to 100 images as an ideal, though you can absolutely use more or less.

Image Quality: When assembling your images, ensure you go for quality over quantity. A well-curated set of 30 images can easily outperform a set of 100 poor and mediocre images.
Additionally, I recommend you keep your dataset stylistically varied, unless you're training a style LoRA. If a style is too prominent in your data, the style itself may be learned alongside your intended concept. When you get to tagging your data, I highly recommend you tag such cases to minimize their effect.

Image Sourcing: Personally, I gather my data from e621. Again, make sure you try to avoid pulling too much from the same artist and similar styles.

Identifying Your Needs:
Concepts: For concepts, you should primarily look for solo images of the subject in question. Duos/trios also work, but you should only grab them if your primary subject is largely unobscured. Alternatively, extra individuals can easily be removed or cropped out.
If you do include multi-character images, make sure they are properly and thoroughly tagged. Including duo/trio/group images can be very beneficial to using your LoRA in multi-character generations, but is not required by any means.
Styles: For styles, gathering data is generally a lot less selective, so long as the images are styled consistently.

Folder Structure: To keep yourself organized and formatted correctly for Kohya, structure your training folder as follows:
- Root
  - LoRA Dataset Folders
    - Concept Folder (What you're training, and where you point Kohya.)
      - Raw (Not required, but this is where I put all my images before sorting.)
      - 1_Concept (This is what's actually trained. You can replace "Concept" with anything.)
For example, here my data is in kohya_ss > sd-scripts > .dataset > 1_Shondo.

Part 2 | Datasets: Preparation
Once you have your raw images from Part 1, you can begin to preprocess them to get them ready for training.

First Pass: Personally, I separate my images into two groups: images that are OK on their own, and images that require some form of editing before use. Those that meet the criteria below are moved to another folder and then edited accordingly.
Important info: WebP is incompatible with current trainers and should be converted. You are best off using only .jpeg/.png at all times.
If you use Photoshop, you can resize your entire set at once with Image Processor. Set the fixed resolution to 2048 on both sides and max quality (7). This will resize all images so they won't go under 2048 (or 1024, 512, etc.) on either side.
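If you don't have Photoshop, the same bulk pass can be scripted. Here is a minimal sketch with Pillow; the folder paths are placeholders, and it assumes your goal is .png output with nothing larger than 2048 px on a side, so adjust the target size to whatever your own pass needs.

```python
from pathlib import Path
from PIL import Image  # pip install pillow

RAW_DIR = Path("C:/Training Folders/Concept Folder/Raw")        # hypothetical path
OUT_DIR = Path("C:/Training Folders/Concept Folder/1_Concept")  # hypothetical path
OUT_DIR.mkdir(parents=True, exist_ok=True)

for src in RAW_DIR.iterdir():
    if src.suffix.lower() not in {".webp", ".png", ".jpg", ".jpeg"}:
        continue
    img = Image.open(src).convert("RGB")
    # Scale anything larger than 2048 px down so it fits within 2048x2048.
    img.thumbnail((2048, 2048), Image.LANCZOS)
    # Save as .png so no WebP ever reaches the trainer.
    img.save(OUT_DIR / f"{src.stem}.png")
    print(f"processed {src.name}")
```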
Second Pass: On the second pass, I manually edit any images that needed extra work from the first: this is where I do any cropping or, sometimes, redrawing of image sections.
Once you've done that, place all of your images, edited and unedited, in your training folder:
- Root
  - LoRA Dataset Folders
    - Concept Folder
      - Raw
      - 1_Concept <- (This one!)

Part 2.5 | Datasets: Curing Poison
Image poisoning techniques have been found to only work in such niche situations that practically every form is DOA. Generally speaking, the following set of conditions needs to align for poisoned data to have any tangible impact on your training:
The poisoning technique was based on the exact same text encoder as your model.
The poisoning technique also used the exact same or a similar VAE as you're training with.
The amount of poisoned data is proportionally higher than the unpoisoned data by a sizable margin.
It's still a good idea to clean or discard obviously poisoned images, but it's less to combat the poison and more to have a clean image without artifacts. The poison is actually snake oil (Nightshade and Glaze are effectively useless).

Part 3 | Datasets: Tagging
Almost done with the dataset! We're in the final step now: tagging. This is what sets your instance token (the activator tag), and it will determine how your LoRA is used. In my personal opinion, you should always do this manually.
Personally, I use the Booru Dataset Tag Manager (BDTM) and tag all of my images by hand. You COULD tag without a program, but just... don't. Manually creating, naming, and filling out a .txt for every image is not what you want to do with your time. Thankfully, BDTM has a nice option to add a tag to every image in your dataset at once, which makes the beginning of the process much easier.

Picking a model: Before you tag, you need to choose a model to train on! For the sake of compatibility, I suggest you train on a base model, which is any finetune that is NOT a mix of other models. For example, ChromaMix is, well, a mix BASED on Noob, which is BASED on Illustrious. SO, train on Noob or Illustrious if you are planning to use a Noob-mixed model like Chroma or Kiwi.

Tagging Types: Now, for the tagging itself. Before you do anything, figure out what type of tags you'll be using; this comes from the model you will use. Chroma and, by proxy, Noob are trained on e621/Danbooru, therefore using tags from those websites will yield the best results.

The Tagging Process: Once you know what model and tags you're using, you can start tagging. What you are training determines how you tag. For example, for a style LoRA I recommend tagging EVERYTHING, while for any other LoRA, tag anything you wish to keep as a variable.
Or in layman's terms: anything NOT related to the training subject should be tagged. If your character always has red eyes, don't tag "red eyes". But if your character can be male or female, then tag those depending on the image.
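For context, Kohya-style trainers read each image's tags from a .txt file with the same name sitting next to it (image01.png goes with image01.txt), with the tags comma-separated. If you ever want to script BDTM's "add a tag to every image" trick yourself, here is a minimal sketch; the folder path and token are placeholders.

```python
from pathlib import Path

DATASET_DIR = Path("C:/Training Folders/Concept Folder/1_Concept")  # hypothetical path
INSTANCE_TOKEN = "fallenshadow"  # your activator tag

for img in DATASET_DIR.iterdir():
    if img.suffix.lower() not in {".png", ".jpg", ".jpeg"}:
        continue
    caption = img.with_suffix(".txt")
    # Read whatever tags already exist (empty list if the file isn't there yet).
    existing = caption.read_text(encoding="utf-8") if caption.exists() else ""
    tags = [t.strip() for t in existing.split(",") if t.strip() and t.strip() != INSTANCE_TOKEN]
    # Put the instance token first so "Keep n tokens = 1" (Part 2) protects it from shuffling.
    caption.write_text(", ".join([INSTANCE_TOKEN] + tags), encoding="utf-8")
    print(f"{caption.name}: {INSTANCE_TOKEN} + {len(tags)} other tags")
```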
Tagging Tips:
Don't overtag; keep it simple. The fewer tags, the better.
Don't use "implied" tags: these are tags that imply other tags just by their presence. If you have an implied tag, you shouldn't use the tag(s) it implies alongside it; for example, "German shepherd" implies both "canid" and "dog".

Part 3.5 | Datasets: Prior Preservation (Regularization)
While completely optional, another method of combating style bias and improper tag attribution is the use of a Prior Preservation dataset. This acts as a separate but generalized dataset used alongside your training dataset, and it can usually be reused between multiple training sessions. I would recommend creating a new folder for them like so:
- Root
  - LoRA Dataset Folders
    - Concept Folders
  - Regularization Folder
    - 1_Concept
    - 1_Concept (You can have more than one!)
"But how exactly do I make and use these?"
You can start by naming your folder after a token; your class token is often a good choice.
Creating a dataset for these is actually incredibly easy: no tagging is required. Within the folder you created for the tag, you simply need to put a number of random, unique, and varied images that fall within that tag's domain. Do not include images of anything you'll be training. From my own testing, I personally recommend a number roughly equal to the number of images in your main training dataset.
During training, the trainer will alternate between your primary and regularization datasets. This requires longer training to achieve the same amount of learning, but will very potently reduce biasing.

Part 3.6 | Tagging: Examples
Since examples are usually quite helpful:
fallenshadow, :p, skirt, bow, capelet, cat ears, cat tail, closed mouth, fang, heart-shaped pupils, holding knife, looking at viewer, smile, solo, tail bow, teddy bear, tongue out, white background
fallenshadow will be the token tag, while everything else remains a variable. This means that unless I tag "cat ears" alongside fallenshadow when using the LoRA, the image should NOT come out with cat ears.

THIS IS IT FOR PART 1: DATASET PREP. GOOD LUCK!
How I inpaint: A beginner's guide to img2img

How I inpaint: A beginner's guide to img2img
A step-by-step guide on how to use img2img.

Warning: This guide is based on SDXL; results on other models will vary.
This guide REQUIRES a basic understanding of image generation. Read my guide "How I art: A beginner's guide" for a basic understanding of image generation (to further improve your results, reading "How I ControlNet" is also recommended, but not mandatory).

Step 1. Creating a starting point.
Img2img is a process in which the AI takes a provided image and re-imagines it with the help of the provided prompt. This can be used for many purposes, and each purpose requires its own workflow; therefore this is only ONE of many ways to use img2img.
For this guide I will be using this image:
This may seem like an already good image, but not for me!
If you are using tensor.art for your generations, you will use the txt2img tab (if you are generating images) and the img2img tab. You should already understand all the settings if you read the "How I art: A beginner's guide" guide. Anything NOT mentioned there I will explain here.
In my case I will be using Re-forge; but don't worry, the logic is the same. For the sake of context, this is how my screen looks.

Step 2. Editing tools
Once you have your base image, it's time to edit in the changes you want. Maybe you generated something, or saw a cool YCH you want to recolor/edit. No matter the origin, you need to add in any missing details yourself. This is the easiest and most reliable way.
I use Photoshop, so here are my desired edits:
Not much, just fixed my fursona's colours and removed the paints in the background; most of the other things will be managed by the AI anyway (colour correction also goes a long way).

Step 3. The prompt and you
Now that your starting image is as close as possible to the way you want it to look, it's time to prompt it. But instead of prompting what we want, we prompt what we see.
The way img2img works is that when you show it an image, it will try to recreate it using your prompt as a guide. So if the image is a dog with an apple and you prompt for a cat in a megazord, the image will come out looking like mush slop. Just use txt2img in that case.
Depending on what platform you are using, take the starting image and upload it to the img2img tab.
"But Arturo!" I hear you say. "Isn't this an inpainting guide? Shouldn't we use the inpaint tab?"
And you are correct: to inpaint, you use the inpaint tab, but to know how to inpaint you must first learn how to img2img. Wax on, wax off, my grasshopper.
Don't forget the fitting prompt!
Now that you have your prompt and your image, we must learn the settings:
Regardless of your platform of choice, you will have a Sampling method, a Schedule type, Sampling steps, etc. So go ahead and set them up according to your model, image, and prompt.
As for the img2img-exclusive settings:

"Just resize", "Crop and resize", "Resize and fill" and "Just resize (latent upscale)"
Be a darling and don't touch these for now. This is a beginner's guide after all.
Basically, these change how the result image will interact with the original image. If your starting image is portrait but you want landscape, "Just resize" will stretch the image to meet the "Resize to" resolution. "Crop and resize" will go to the center of the image, crop to the aspect ratio and resize only that area; "Resize and fill" will keep the aspect ratio and try to fill in the voids; and "Just resize (latent upscale)" does the same as the first option, but with latent upscale.

"Refiner"
This triggers a checkpoint change during the img2img process. For example, if you use ChromaXL and refine with KiwiMix v3, activate it, select said checkpoint, set up the %, and you are done. The % means that, if set at 80% for example, the checkpoint change will trigger after 80% of the progress is done. If you use 100 steps, the checkpoint will change at step 80, for example.

"Resize to" and "Resize by"
Resize to means your result image size will adapt to whatever resolution you set there. So if the image is 800*800 and you set 80*1000, then depending on the resize mode, the resulting image will aim for that ratio.
Resize by goes by ratios, so if your image is 1024*1024 and you set it to 2, the result will be 2048*2048!
Note: If your image is too small, opt for Resize by and raise the ratio as far as your platform or computer lets you. In my case the max resolution I can work with is 2048*, so anything bigger and I run out of memory. In my case the image is already 2048*, so I don't need to change anything there.

"Batch count/size"
Same as with any generation mode: count is how many images you want to generate in total, and size is how many images you want to generate at the same time. For img2img I recommend a count of 4 or more depending on complexity; this means I will get 4 images total to pick the best from.

"CFG Scale"
Same as with any generation mode: the guidance strength. Change it according to your model's settings.

"Denoising strength"
This one is your crème de la crème. The higher the number, the more creative freedom the model will take. It's also the hardest setting to get right at ANY given moment.
For example, lightning models work fast as hell, so even a low denoise can dramatically alter the results, while average models need a higher denoise to do anything. ChromaXL is a lightning model, so if I go over 0.3 or 0.4, I risk getting abnormal results.
Unfortunately there is no "definitive" guide for denoise strength, but as you test and play with this tab you will eventually learn the "feel" and no longer struggle. Trust me.

FOR TENSOR.AI USERS: Tensor.AI offers fewer options in terms of settings, so for an abridged version: add your starting image, select the model you want, set the desired denoise strength, set the desired resolution px*px, and the rest you should already understand. If you have any questions, please leave a comment.
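If you drive a local A1111/Forge install from scripts instead of the UI, the same settings map onto the /sdapi/v1/img2img endpoint (the WebUI has to be launched with --api). A minimal sketch follows; the field names are the standard A1111 API names as far as I know (check your install's /docs page), and the prompt, steps and denoise values are just this guide's examples.

```python
import base64
import requests  # pip install requests

# Read the edited starting image and encode it for the API.
with open("starting_image.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "init_images": [init_image],
    "prompt": "masterpiece, best quality, newest, absurdres, highres, solo, awff, male, wolf, anthro",
    "steps": 10,
    "cfg_scale": 2,
    "denoising_strength": 0.3,   # low denoise: keep the composition, add detail
    "resize_mode": 0,            # 0 should correspond to "Just resize"
    "width": 2048,
    "height": 2048,
    "seed": -1,                  # keep it random while you test
    "n_iter": 4,                 # batch count: 4 images to pick from
}

resp = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=600)
resp.raise_for_status()
for i, img_b64 in enumerate(resp.json()["images"]):
    with open(f"img2img_result_{i}.png", "wb") as f:
        f.write(base64.b64decode(img_b64))
```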
Step 4. Go gambling!
Keep your seed random and click generate! Test different denoise levels, and change the prompt if something's odd. As you test and tweak, you will notice the new image making more and more sense.

And that's it!
Congratulations, now you know the basics of inpainting and img2img. Next time you will learn how to use the sketch tab, the inpaint tab, and more!
How I ControlNet: A beginner's guide

How I ControlNet: A beginner's guide.
A step-by-step guide on how to use ControlNet, and why Canny is the best model. Pun intended.

Warning: This guide is based on SDXL; results on other models will vary.
This guide REQUIRES a basic understanding of image generation. Read my guide "How I art: A beginner's guide" for a basic understanding of image generation.

Step 1. Setting up the workflow.
For simplicity's sake I will use the same settings, prompt and seed from the guide "How I art: A beginner's guide"; any change made to these settings will be mentioned as part of the guide. Further rules will also be mentioned as the guide continues.
Now that we have our setup ready, let's dive into each ControlNet available.
If you are using tensor.art for your generations, click on "Add ControlNet" under the "Add LoRA" and "Add Embedding" options. If you are using your own WebUI for local generation, make sure you have ControlNet installed and the ControlNet models downloaded.
A1111 will have them installed by default, but if that's not the case, the "Extensions" tab will have all you need. If you need a guide on how to install ControlNet on your WebUI, let me know in the comments.

Step 2. Select a model.
The most popular ones are "Openpose", "Canny", "Depth", "Scribble" and "Reference Only".
For this guide I will only demonstrate "Openpose", "Canny" and "Reference", since Depth and Scribble are practically image-to-image in my personal opinion, and I never use them.
IF YOU WANT A GUIDE FOR ANY OF THESE MODELS LET ME KNOW IN THE COMMENTS AND I WILL DO AN INDEPENDENT GUIDE AROUND IT.

Step 2.1. (For Openpose).
In the Openpose window, you will either upload your reference image or use one from the reference list. For this guide I will use the 5th image, because it looks like a JoJo pose. For this reason I will also add to my prompt:
menacing \(meme\)
Once your image is uploaded or the reference image is confirmed, you will be moved to the "control image" tab. This shows the pose that the Openpose model detected. It goes without saying, but the more complex the pose is, the harder it is for Openpose to dissect it, and as for your base model, the prompt work will decide how it gets interpreted after image control.
Click on Confirm and then Generate.
Much like the original pose, I would say, but we aren't done yet.

Step 2.1.2. Settings
Openpose comes with two settings: Weight and Control Steps.
Much like a LoRA, the first one dictates how much FORCE the Openpose will have on the image.
Meanwhile, Control Steps dictates for HOW LONG the Openpose will be in effect, where 1 is 100% of the generation process. For reference: 0.5 means 50%, etc.
With the settings shown here (for example, weight 0.5 and control steps 0.2 to 0.7), Openpose will have HALF the impact, and will only take effect after 20% of the generation process has passed, stopping at 70%. This means that if your image was 100 steps, Openpose starts at step 20 and ends at step 70. Every other step is controlled by the base model and prompt alone (or LoRA, if applicable).
Let's see what happens:
As you can see, the pose was completely ignored, so be careful what and how you change these settings. Only change them if you have to.
My personal recommendations are:
Never change the starting step.
Only lower the strength of the model if the effect is too strong.
Only lower the finishing step if you want some deviancy.
Setting strength at 0.5 and steps at 0 - 0.5 (meaning it stops at 50%) shows that the pose is still used (see the hand posture and head tilt), but the model ignores the "full body" representation.
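Since the weight/steps interaction trips people up, here is the arithmetic from above as a tiny sketch: the start and end values are simply fractions of your total sampling steps.

```python
def control_step_range(total_steps: int, start: float, end: float) -> tuple[int, int]:
    """Convert ControlNet's start/end fractions into actual sampling steps."""
    return round(total_steps * start), round(total_steps * end)

# The example from above: 100 steps with control steps 0.2 - 0.7.
print(control_step_range(100, 0.2, 0.7))   # (20, 70): ControlNet active from step 20 to 70
# The 0 - 0.5 test: active from the first step until halfway through.
print(control_step_range(100, 0.0, 0.5))   # (0, 50)
```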
Step 2.2.1. (For Canny).
Canny (Cunny) is what I like to call a "recoloring" tool, but it is also the best option for pose reference, angle, composition, and all-around what "Reference" should actually do. I will show multiple uses for this model, as it is the most flexible one in my opinion.

Step 2.2.2. Canny to recolour.
As seen above, we will first learn Canny for recolouring purposes. Remember that YCH you saw for 5k USD? I do. So, load the image and wait for the model to process it. For this step I will use this YCH I found on Twitter.
The settings in Canny are the same as in Openpose, and the effects are going to be the same as well. You will need to edit them as you go, for there is no such thing as "one configuration to rule them all".
I will also remove the "menacing" and "solo" tags and instead add:
solo focus, duo, flexing, looking away, muscles
The image shows two characters but focuses on one; the character is not looking at the camera and is flexing his arm. When using Canny, treat your prompt as an img2img prompt: describe not only what you want, but also what you see in the reference image.
Without changing anything, let's generate now.
Here the image took some aspects of Arturo, but most are missing. This is when you start testing different settings. For example, let's lower the strength to 80%, or 0.8:
As you can see, the colours are better here, but the character now has a weird shirt on. I will change my prompt to say the character is nude and try again.
Almost perfect, right? Just keep playing with the settings until you have something close to what you are looking for, and if you have image editing software, recolour the missing details yourself, then finish the image through img2img.
Here are the final settings:
masterpiece, best quality, newest, absurdres, highres, solo focus, awff, male, wolf, anthro, bust portrait, smile, nude, nipples, looking away, flexing, muscles, duo, masterpiece, best quality, newest, absurdres, highres
I apologize for any imperfections, I am not that invested in these guides, to be honest. But I do hope they help.

Step 2.2.3. Canny to add details.
Canny can also be great for adding details to an already generated image. Let's say you used NovelAI for its great poses, but want to add details that Novel simply can't do; Canny is the way to go.
For this process, switch all of your settings to img2img: that means prompt, model, LoRAs, etc. And yes, ControlNet too, selecting Canny and loading the same image as the reference.
But before you get on your high horses, notice that there are some new wacky options!
These are "Denoising strength" and "Seed". Now, Seed isn't exactly NEW, but it is different, because most of the time you won't have the original seed, or you are using a different base model, etc. This, again, is all for testing purposes to get the idea across; therefore, I won't re-use the seed.
Denoising strength, though, is the important setting here. An img2img generation requires you to set how different the image will be from its starting point.
Canny makes sure that the image itself doesn't change, but it will still change the colours, background and other things.
Since we are just adding details, the denoise shouldn't go above 0.4 or 40%, but each model is unique, so you will have to learn how your base model works to find the sweet spot.
Here's our control image:
And here is the result with a 0.4 denoise and default settings on Canny:
As expected (by me), the image is almost the same; this is because I used the same model for the original image as for the img2img. Let's change the model to something else.
I will use KiwiMix-XL - V3 for this; now we run it again, everything else untouched:
As expected from Kiwi (by me), the colours changed dramatically to fit the model's art style. Kiwi has a softer approach and an overall "pastel" vibe, so the image took the same route.
This, paired with a basic understanding of ControlNet settings, allows you to pick and choose how the image gets interpreted by the AI.
For those with a local WebUI, you are in luck, since ControlNet offers A LOT more settings, such as pixel perfect, the intensity of the Canny control image, etc. But for now, these are the basics.

Step 2.2.4. Canny for references.
Here we show the "Reference" ControlNet model who's daddy.
For this I will use the following image:
ControlNet allows you to use an image as a reference, and Canny does it best from what I can tell. So let's return to txt2img and learn how it works!
Settings are as always, Denoise is no longer here, and we load the reference image. To test it out, we will now generate as is:
Already solid, but you can see the AI trying to keep the hair in place while also struggling with the ears. It's a mess! So let's tone it down. First I will lower the strength to 0.7 and try again:
Now this one I like! But the image still feels a little... weird...
If you have paid attention, Canny has a little quirk that makes all resulting images a little too... pixelated... all over the edges. This is where the control steps finally come into play.
Since the image is just a reference, I will test again with the following configs: 0 - 0.5 and 0 - 0.7.
This is with ControlNet stopping at 0.5:
And 0.7:
As you can see, each has some pros and cons. 0.5 has fewer artifacts, but 0.7 has a better-looking face. In the end you will run multiple gens and pick your favorite, finishing with an img2img pass (WITHOUT CONTROLNET) at a low denoise such as 0.2, for example, or whatever suits the model you are using.
Always test!

Step 2.3.1. (For Reference).
My least favorite: Reference. The idea is that your control image "suggests" to the model what you want. So if you generate an image of a girl and your reference image is Ariana Grande, the result should be a girl that looks like Ariana Grande. And of course you still need to describe it.
Let's test it with this:
So you know the drill.
Prompt:
masterpiece, best quality, newest, absurdres, highres, solo, awff, male, wolf, anthro, smile, clothed, safe, black clothing, raincoat, red gloves, holding mask, holding gun, full-length portrait, Persona 5, cosplay, joker \(persona 5\), masterpiece, best quality, newest, absurdres, highres
And here's the result:
Not bad, huh? But I'd rather use cunny.
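If you script a local A1111/Forge install, the sd-webui-controlnet extension exposes these same knobs through the txt2img API. The sketch below is written under the assumption that your install accepts the commonly documented payload shape; the exact unit keys can differ between extension versions, so check your WebUI's /docs page before using it. "weight" and "guidance_start"/"guidance_end" are the strength and control-steps settings discussed above.

```python
import base64
import requests  # pip install requests

# Encode the reference image for the API.
with open("reference.png", "rb") as f:
    ref_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "prompt": "masterpiece, best quality, newest, absurdres, highres, solo, awff, male, wolf, anthro, smile",
    "steps": 10,
    "cfg_scale": 2,
    "seed": -1,
    # The ControlNet extension hooks in through "alwayson_scripts".
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "image": ref_image,
                "module": "canny",
                "model": "YOUR_CANNY_MODEL_NAME",  # placeholder: the SDXL Canny model you installed
                "weight": 0.7,                     # the "strength" knob from the guide
                "guidance_start": 0.0,             # control steps start (fraction of total steps)
                "guidance_end": 0.5,               # control steps end
            }]
        }
    },
}

resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload, timeout=600)
resp.raise_for_status()
with open("controlnet_result.png", "wb") as f:
    f.write(base64.b64decode(resp.json()["images"][0]))
```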
And that's it!
Congratulations, now you know the basics of ControlNet. Remember to always test different settings as you trial and error!
How I art: A beginner's guide.

How I art: A beginner's guide
A step-by-step guide on how to not generate like the average CivitAI user.
Pretty please; never ever do this.

Warning: This guide is based on SDXL. If you use 1.5, FLUX, etc., this won't be a 1:1; results will vary.

Step 1. Don't use Pony.
For real tho, don't do it.

Step 2. The settings.
Each model comes with its "best" settings, usually attached to the description if the uploader has over 2 braincells or is not the anti-christ. ChromaXL, for example, is nice and based. It tells you the average settings. If this is your first time, go with the middle values:
10 steps
2 CFG
Euler A Beta
But if you want to test, down the line I will explain how to test these configurations.
A rule of thumb is:
The more steps, the less noise. Too many steps and the image changes completely.
The more CFG, the more it follows the prompt; the less CFG, the more creative it gets. But also: higher CFG ups the saturation, while lower CFG washes out the colours.
If you use the in-site generator, it will look something like this:
As you can see, I loaded the model, my LoRA, set a resolution, set the recommended settings and added an upscaler.
Always, and I mean ALWAYS, remove the LoRAs you are not going to use. If you upload that image, it will appear in that LoRA's gallery and will negatively impact the vetting process of other users when looking for working models. It also makes you look like a clown.
NOTE: Only upscale images you are happy with; upscaling every image will result in a waste of time and credits. More information later. In this guide I will turn it off, then on when needed.

Step 2: The resolution.
The resolution here will be 1024*, but depending on what you are generating, a different resolution is recommended.
Portrait works great for full-length shots, landscape works best for lying down, wide shots, etc., and square is for icons and headshot/bust portraits. So on and so forth; your resolution will affect the resulting image. We will do a profile icon for the sake of visual context. If you generate locally on your own equipment like I do, you will be limited by the amount of VRAM you have available.
Further information will be given in a different article.

Step 3: The positive prompt.
Quality tags: These change depending on the model; model providers usually give you examples of these tags in the description. 99% of the time these are snake oil, but I use them anyway.
For Noob-based models such as Chroma, these tags are:
masterpiece, best quality, newest, absurdres, highres
I recommend using them like this:
masterpiece, best quality, newest, absurdres, highres, {prompt}, masterpiece, best quality, newest, absurdres, highres
As you can imagine, {prompt} is then replaced by your prompt per se. An example would be:
masterpiece, best quality, newest, absurdres, highres, solo, awff, male, wolf, anthro, bust portrait, smile, clothed, safe, masterpiece, best quality, newest, absurdres, highres
This is, by the way, your POSITIVE prompt. This is what you want in your image.
Now let's break down this simple prompt:
Quality tags: These make sure the model gives out the best-looking results possible. Mostly useless, but worth using for newbies.
'awff': This is an activation tag. "awff" does not mean anything, but I used it as the keyword for my fursona, meaning that if I load my fursona's LoRA, my character will only appear if I use this tag.
'solo, male, wolf, anthro, bust portrait, smile, clothed, safe': Context tags; these will guide the model to our desired result. Which tags you must use depends on the model.
ChromaXL, KiwiMix, Noob, etc. are trained on Danbooru and e621 tags, meaning that even though their CLIP (won't explain it here) still knows natural language such as "a dog jumping around", it will understand "dog, solo, jumping, outside" a little better.
For example, if your model is trained only on e621 tags, using things like "shiny skin" won't do anything but give the character some glitter around him. Instead use "glistening body". Shiny skin CAN work, but due to the model's bias the results will be random, while glistening body will land on your desired result more often.
Things to avoid:
1. Repetitive tags; if you tagged "excessive cum, vaginal penetration", you don't need "cum inside". Just by having the penis inside, the model knows the excessive cum goes there. Same with "cum, ejaculation, cum in pussy".
2. Unrelated tags; as mentioned earlier, the more you stay true to the training data, the better results you will get. "A very sexy female wolf adult with huge tits" wastes a lot of tokens; just say "solo, female, wolf, huge breasts".
3. Token overflow; Tensor doesn't show this, but prompts work on a 75-token basis. Staying under this limit gives the best results. You can still overflow this and jump to 150, 225, 300, etc., but if one of your tags straddles the boundary, that tag is effectively useless.
Let's say the tag "huge breasts" splits as "huge" and "breasts", one ending the first 75 tokens and the other starting the next 75. The model will then barely listen to "huge" and try to re-focus on "breasts". This is fixed with BREAK, a syntax that forces the text encoder (not explaining it here) to refresh its attention, giving you a fresh set of tokens. This is also crucial for working with Regional Prompter for those using a WebUI. (See the sketch below this list for a quick way to count your tokens.)
4. Syntax clutter; soon you will learn more about syntax, for now just don't [[((((huge breasts:1.5))))]] or I will hunt you down.
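Since the 75-token window is invisible in most UIs, here is a minimal sketch of counting tokens the way the SD text encoder's tokenizer does, using the openai/clip-vit-large-patch14 tokenizer from Hugging Face. Treat the count as approximate: WebUIs handle attention weights, BREAK and the special begin/end tokens slightly differently, and SDXL also runs a second text encoder.

```python
from transformers import CLIPTokenizer  # pip install transformers

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

prompt = ("masterpiece, best quality, newest, absurdres, highres, solo, awff, male, wolf, "
          "anthro, bust portrait, smile, clothed, safe, masterpiece, best quality, newest, "
          "absurdres, highres")

# Count tokens without the begin/end special tokens; this is what eats the 75 budget.
token_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
print(f"{len(token_ids)} tokens used out of the first 75")
if len(token_ids) > 75:
    print("Over budget: the tag crossing the boundary will barely be heard; add BREAK or trim tags.")
```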
Furthermore, "Denoising" has to be low as well; If you only want details added go for 0.10 or 0.15; 0.2 to 0.3 will start to alter the image and +0.4 can potentially destroy it.Lastly, the Upscaler itself; I recommend R-ESRGAN 4x+ Anime6B, but if you use a local generator I then suggest getting a custom one such as 4xUltrasharp_4xUltrasharpV10Now that we have the seed set and the upscaler configurated, generate again!As you can see, the difference is minimal, but the added resolution and low denoise make sure the small details pop and the artifacts get solved. In this case it went from 1024* to 1200* so it also makes the image less pixelated if seen at higher-res monitors. It also makes for a more detailed image if compressed; aka you posted it on Twitter.And that's it!Congratulations, now you know the basics of AI generation on XL models. Remember to always do research on the model and LoRAs you are going to use, chances are the creator added some tasty instructions under the Description and your lazy ass didn't read them. HAVE FUN!
