How I ControlNet: A beginner's guide
A step-by-step guide on how to use ControlNet, and why Canny is the best model. Pun intended.

Warning: This guide is based on SDXL; results on other models will vary.

This guide REQUIRES a basic understanding of image generation. Read my guide "How I art: A beginners guide" first if you need one.

Step 1. Setting up the workflow.

For simplicity's sake I will use the same settings, prompt and seed from the guide "How I art: A beginners guide"; any changes made to these settings will be mentioned as part of the guide. Further rules will also be mentioned as the guide continues.

Now that we have our setup ready, let's dive into each ControlNet available.

If you are using tensor.art for your generations, click on "Add ControlNet" under the "Add LoRA" and "Add Embedding" options. If you are using your own WebUI for local generation, make sure you have ControlNet installed and the ControlNet models downloaded. A1111 may already have it installed, but if that's not the case, the "Extensions" tab will have all you need. If you need a guide on how to install ControlNet on your WebUI, let me know in the comments.

Step 2. Select a model.

The most popular ones are "Openpose", "Canny", "Depth", "Scribble" and "Reference Only". For this guide I will only demonstrate "Openpose", "Canny" and "Reference", since Depth and Scribble are practically image-to-image in my personal opinion, and I never use them.

IF YOU WANT A GUIDE FOR ANY OF THESE MODELS LET ME KNOW IN THE COMMENTS AND I WILL DO AN INDEPENDENT GUIDE AROUND IT.

Step 2.1. (For Openpose).

On the Openpose window, you will either upload your own reference image or use one from the Reference list. For this guide I will use the 5th image, because it looks like a JoJo pose. For this reason I will also add to my prompt:

menacing \(meme\)

Once your image is uploaded or the reference image is confirmed, you will be moved to the "control image" tab. This shows the pose that the Openpose model detected. It goes without saying, but the more complex the pose is, the harder it is for Openpose to dissect it, and on the base model's side, your prompt work will decide how the pose gets interpreted once the control is applied.

Click on Confirm and then Generate.

Much like the original pose, I would say, but we aren't done yet.

Step 2.1.2. Settings.

Openpose comes with two settings: Weight and Control Steps. Much like a LoRA, the first one dictates how much FORCE Openpose will have on the image. Control Steps, meanwhile, dictates for HOW LONG Openpose stays in effect, where 1 is 100% of the generation process. For reference: 0.5 means 50%, and so on.

With these settings, for example, Openpose will have HALF the impact, and will only take effect after 20% of the generation process has passed, stopping at 70%. This means that if your image was 100 steps, Openpose starts at step 20 and ends at step 70. Every other step is controlled by the base model and prompt alone (or LoRA, if applicable).

Let's see what happens:

As you can see, the pose was completely ignored, so be careful what and how you change these settings. Only change them if you have to. My personal recommendations are:

Never change the starting step.
Only lower the strength of the model if the effect is too strong.
Only lower the finishing step if you want some deviation from the pose.

Setting the strength at 0.5 and the steps at 0 - 0.5 (meaning it stops at 50%) shows that the pose is still used (see the hand posture and head tilt), but the model ignores the "full body" representation.
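If you generate locally with Python instead of a WebUI, the same two knobs exist in the diffusers library: Weight maps to controlnet_conditioning_scale, and Control Steps map to control_guidance_start / control_guidance_end. Below is a minimal sketch, assuming the diffusers and controlnet_aux libraries and an SDXL Openpose checkpoint; the model names, prompt and file names are examples I picked for illustration, not what tensor.art uses under the hood.

```python
# A minimal sketch, assuming diffusers + controlnet_aux and an SDXL
# Openpose ControlNet checkpoint. Names and paths are examples only.
import torch
from controlnet_aux import OpenposeDetector
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline
from diffusers.utils import load_image

# Detect the pose skeleton, i.e. what the "control image" tab shows you.
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_image = openpose(load_image("reference_pose.png"))

controlnet = ControlNetModel.from_pretrained(
    "thibaud/controlnet-openpose-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="masterpiece, best quality, solo, menacing (meme)",
    image=pose_image,
    num_inference_steps=30,
    controlnet_conditioning_scale=0.5,  # "Weight": half the impact
    control_guidance_start=0.2,         # start after 20% of the steps
    control_guidance_end=0.7,           # stop at 70% of the steps
).images[0]
image.save("openpose_test.png")
```

Setting control_guidance_start back to 0.0 and control_guidance_end to 0.5 would reproduce the "0 - 0.5" experiment above.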
Step 2.2.1. (For Canny).

Canny is what I like to call a "recolouring" tool, but it is also the best option for pose reference, angle, composition, and all-around what "Reference" should actually do. I will show multiple uses for this model, as it is the most flexible one in my opinion.

Step 2.2.2. Canny to recolour.

As mentioned above, we will first learn Canny for recolouring purposes. Remember that YCH you saw going for 5k USD? I do. So, load the image and wait for the model to process it. For this step I will use this YCH I found on Twitter.

The settings in Canny are the same as in Openpose, and the effects are going to be the same as well. You will need to adjust them as you go, for there is no "one configuration to rule them all".

I will also remove the "menacing" and "solo" tags and instead add:

solo focus, duo, flexing, looking away, muscles

The image has two characters but focuses on one; the character is not looking at the camera and is flexing his arm. When using Canny, treat your prompt as an img2img prompt: describe not only what you want, but also what you see in the reference image.

Without changing anything, let's generate now.

Here the image took some aspects of Arturo, but most are missing. This is when you start testing different settings. For example, let's lower the strength to 80%, or 0.8:

As you can see, the colours are better here, but the character now has a weird shirt on. I will change my prompt to say the character is nude and try again.

Almost perfect, right? Just continue playing with the settings until you have something close to what you are looking for, and if you have image editing software, recolour the missing details yourself, then finish the image through img2img.

Here are the final settings:

masterpiece, best quality, newest, absurdres, highres, solo focus, awff, male, wolf, anthro, bust portrait, smile, nude, nipples, looking away, flexing, muscles, duo, masterpiece, best quality, newest, absurdres, highres

I apologize for any imperfections; I am not that invested in these guides, to be honest, but I do hope they help.

Step 2.2.3. Canny to add details.

Canny can also be great for adding details to an already generated image. Let's say you used NovelAI for its great poses, but you want to add details that Novel simply can't do: Canny is the way to go.

For this process, switch all of your settings to img2img; that means prompt, model, LoRAs, and so on. And yes, ControlNet too, selecting Canny and loading the same image as reference.

But before you get on your high horse, notice that there are some new wacky options! These are "Denoising strength" and "Seed". Now, Seed isn't exactly NEW, but it is different here, because most of the time you won't have the original seed, or you will be using a different base model. This, again, is all for testing purposes to get the idea across, so I won't re-use the seed.

Denoising strength, though, is the important setting here. An img2img generation requires you to set how different the image will be from its starting point. Canny makes sure that the composition itself doesn't change, but the denoise will still change the colours, background and other things. Since we are just adding details, the denoise shouldn't go above 0.4 (40%), but each model is unique, so you will have to learn how your base model works to find the sweet spot.

Here's our control image:

And here is the result with a 0.4 denoise and default settings on Canny:

As expected (for me), the image is almost the same. This is because I used the same model for the original image as for the img2img pass.
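For local/Python users, this detail pass looks roughly like the sketch below, assuming the diffusers and OpenCV libraries: build the Canny edge map from the source image, then run a ControlNet img2img pass at a low denoising strength. The checkpoint names and file names are placeholders I chose for illustration, not what tensor.art or NovelAI run.

```python
# A minimal sketch of the "Canny to add details" pass, assuming diffusers
# and OpenCV. Checkpoints and file names are placeholders, not a recipe.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetImg2ImgPipeline
from diffusers.utils import load_image

source = load_image("original_render.png")

# Build the Canny edge map: the control image that locks the composition
# in place while the denoise repaints the details.
edges = cv2.Canny(np.array(source), 100, 200)
canny_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="masterpiece, best quality, solo, detailed eyes, detailed fur",
    image=source,               # img2img starting point
    control_image=canny_image,  # Canny edge map
    strength=0.4,               # denoising strength: keep it low to only add details
    controlnet_conditioning_scale=1.0,
).images[0]
image.save("detail_pass.png")
```

Swapping the base checkpoint for a different SDXL model is exactly what the next experiment does to change the colours and style while keeping the composition.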
Now let's change the base model to something else. I will use KiwiMix-XL - V3 for this, and run it again with everything else untouched:

As expected from Kiwi (by me), the colours changed dramatically to fit the model's art style. Kiwi has a softer approach and an overall "pastel" vibe, so the image took the same route. This, paired with a basic understanding of ControlNet settings, allows you to pick and choose how the image gets interpreted by the AI.

For those with a local WebUI, you are in luck, since ControlNet offers A LOT more settings there, such as Pixel Perfect, the intensity of the Canny control image, and so on. But for now, these are the basics.

Step 2.2.4. Canny for references.

Here we show the "Reference" ControlNet model who's the daddy. For this I will use the following image:

ControlNet allows you to use an image as a reference, and from what I can tell, Canny does it best. So let's return to txt2img and learn how it works!

The settings are as always, Denoise is no longer here, and we load the reference image. To test it out, we will now generate as is:

Already solid, but you can see the AI trying to keep the hair in place while also struggling with the ears; it's a mess! So let's tone it down. First I will lower the strength to 0.7 and try again:

Now this one I like! But the image still feels a little... weird...

If you have paid attention, Canny has a little quirk that makes all resulting images a little too... pixelated... all over the edges. This is where the control steps finally come into play. Since the image is just a reference, I will test again with the following configs: 0 - 0.5 and 0 - 0.7.

This is with ControlNet stopping at 0.5:

And at 0.7:

As you can see, each has some pros and cons. 0.5 has fewer artifacts, but 0.7 has a better-looking face. In the end you will run multiple gens and pick your favourite, finishing with an img2img pass (WITHOUT CONTROLNET) at a low denoise, such as 0.2, depending on the model you are using (see the sketch at the end of this guide).

Always test!

Step 2.3.1. (For Reference).

My least favourite: Reference. The idea is that your control image "suggests" to the model what you want. So if you generate an image of a girl and your reference image is Ariana Grande, the result should be a girl that looks like Ariana Grande. And of course, you still need to describe it.

Let's test it with this:

So you know the drill. Prompt:

masterpiece, best quality, newest, absurdres, highres, solo, awff, male, wolf, anthro, smile, clothed, safe, black clothing, raincoat, red gloves, holding mask, holding gun, full-length portrait, Persona 5, cosplay, joker \(persona 5\), masterpiece, best quality, newest, absurdres, highres

And here's the result:

Not bad, huh? But I'd rather use Canny.

And that's it! Congratulations, now you know the basics of ControlNet. Remember to always test different settings as you trial and error!
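One last sketch for local users: the "finish with an img2img pass WITHOUT ControlNet at a low denoise" step recommended above, written as plain diffusers img2img. The checkpoint name, file names and the 0.2 value are just examples; tune them per model.

```python
# A minimal sketch of the final polish pass: plain img2img, no ControlNet,
# at a low denoise. Checkpoint and file names are placeholders.
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="masterpiece, best quality, solo, smooth shading",
    image=load_image("controlnet_favorite.png"),  # your favourite ControlNet gen
    strength=0.2,  # low denoise: smooth out edge artifacts, keep the composition
).images[0]
image.save("final.png")
```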