Multi-character Prompting & Inpainting Guide (Step by step) v2
Foxy Fluffs wants to hug Hatsune Miku! But how do you gen two different characters together?

Version 2 of my guide to snuggling Vocaloids, er, Multi-character Prompting & Inpainting! This is a long overdue update to correct some prompting technique (e.g. getting rid of BREAK, which was confusing people), plus some new best practices for inpainting.

We will follow the same basic steps:

- Engineer an SD model prompt for our characters
- Run our prompt and refine until we get a near-match
- Inpaint our chosen image section by section to correct errors
- Upscale the final product

1) Engineer an SD model prompt for our characters

When prompting in SD (with either the TAMS 2.0 or A1111 parsing method), the best practice I've found for images of multiple characters is to use the following prompt structure:

general scene prompts, (character1 description and lora keywords:1), (character2 description and lora keywords:1), quality tags

general scene prompts refers to all the prompts that apply to the scene and to both characters (e.g. indoors, 2girls, hugging). Putting these first helps to enforce the overall scene on the model, even while we are manipulating the character prompts for inpainting.

(prompt:1) assigns a weight to your individual prompts and can include one or more prompts. We will manipulate this value when inpainting in order to emphasise one character's traits and suppress the other's.*

character descriptions refers to the unique prompt sets that apply to each character in the scene (e.g. "Foxy Fluffs, fox girl, green top," and "Hatsune Miku, human, aqua hair"). Expect bleedthrough no matter how carefully you structure this; we will choose the image with the fewest mistakes and edit out any remaining errors with inpainting.
Recent models can handle this better and may occasionally give a "perfect" gen, but being able to inpaint a near-perfect one is still a valuable skill.

quality tags (like "best quality", "score_9", or "Newest") differ depending on the base model or checkpoint used (SD1.5, Pony, Illustrious, etc). Many people swear by placing quality tags at the beginning of the prompt, but I favour placing them at the end to give more emphasis to the scene prompts at the top. YMMV.

*A key trick we will employ here is (character1 description and lora keywords:0). This tells the model to ignore this set of prompts on this pass, which is incredibly useful for inpainting, as you don't need to delete anything in order to emphasise the other character's prompts.

Here's the prompt I re-engineered for my image of Foxy and Miku, with Illustrious model quality tags:

Positive Prompt: 2girls, hugging, looking at another, indoors, concert. 1girl (Foxy Fluffs, anthro, furry, foxgirl, orange fur, long brown hair, brown eyes, slit pupils, black choker with a silver heart-shaped pendant, green top, black bottoms, blushing, nervous, FoxyFluffs:1). 1girl (hatsune miku, human, absurdly long hair, aqua hair, twintails, hair ornament, sidelocks, hair between eyes, parted bangs, aqua eyes, (happy), smiling, white shirt, collared shirt, bare shoulders, sleeveless shirt, aqua necktie, detached sleeves, black sleeves, shoulder tattoo, fringe, black thighhighs, miniskirt, pleated skirt, zettai ryouiki, thigh boots:1).masterpiece, best quality, amazing quality,very aesthetic,high resolution,ultra-detailed, absurdres, newest, scenery, depth of field, volumetric lighting,

Negative Prompt: [blank]**

**I don't use negative prompts most of the time. Maybe it's a thing with images involving fox girls (because there's nothing negative about fox girls!), but I find I get better results without negatives, so I keep the negative blank unless obvious errors creep in that need to be prompted out.
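
The prompt structure described above can be sketched as a small helper. This is only a minimal sketch and not part of any SD tool: the names CharacterBlock and build_prompt and the shortened tag lists are hypothetical, and the code only assembles the text you would paste into the prompt box. It follows the order from this guide (general scene prompts, one weighted block per character, quality tags last), and setting a block's weight to 0 reproduces the (…:0) trick.

```python
from dataclasses import dataclass


@dataclass
class CharacterBlock:
    """One character's tags plus the weight applied to the whole block."""
    tags: list[str]
    weight: float = 1.0

    def render(self) -> str:
        # Produces "(tag1, tag2:weight)"; :g drops trailing zeros (1.0 becomes 1).
        return f"({', '.join(self.tags)}:{self.weight:g})"


def build_prompt(scene: list[str], characters: list[CharacterBlock], quality: list[str]) -> str:
    """Assemble: general scene prompts, (character blocks), quality tags."""
    parts = [", ".join(scene)]
    parts += [block.render() for block in characters]
    parts.append(", ".join(quality))
    return ", ".join(parts)


# Shortened, illustrative tag lists (not the full prompt from the guide).
scene = ["2girls", "hugging", "indoors"]
foxy = CharacterBlock(["Foxy Fluffs", "fox girl", "green top"])
miku = CharacterBlock(["hatsune miku", "human", "aqua hair"])
quality = ["masterpiece", "best quality"]

base_prompt = build_prompt(scene, [foxy, miku], quality)

# Inpainting pass focused on Foxy: mute Miku's block instead of deleting it.
miku.weight = 0
foxy_pass_prompt = build_prompt(scene, [foxy, miku], quality)

print(base_prompt)
print(foxy_pass_prompt)
```

Keeping each character's tags in one block means switching focus between characters is a one-number change, which is the point of the weight trick.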
As always, YMMV!

2) Run our prompt and refine until we get a near-match

I ran the prompt with portrait aspect, dpmpp_2m/sgm_uniform, 15 steps, CFG 5, using the following models:

- WAI-SHUFFLE-NOOB - V-Pred-02 (Illustrious checkpoint)
- Foxy Fluffs OC Character Lora (0.6 weight)
- Hatsune Miku -Vocaloid (0.8 weight)

After about 20 gens, I got the following set of images, with one near-perfect pose that doesn't need inpainting, and one pose that I like that will need some inpainting:

Near Perfect Base Gen (circled in red):

This is a jackpot generation, where almost everything in the image is as prompted. Even the hands have the right number of fingers/knuckles.

Relying on RNG to get a good image is one valid strategy, especially if you use DMD2 to keep your costs low. Here we can skip the inpainting step entirely and move on to upscaling directly. However, if you want to nitpick, you can inpaint to fix a few things: Foxy should be wearing shorts, Foxy doesn't wear leggings, and Foxy's hair should be long and not in pigtails (that's Miku's look). Or we can handwave that and say she styled her hair and outfit that day to match Miku (laziness FTW!).

Cutest pose that can use some inpainting (circled orange):

Miku seems to have come out almost perfectly, while Foxy's outfit and hair got corrupted, and she inherited Miku's tattoo, which will need to go.

So now that we have a usable image, we will move on to inpainting!

3) Inpaint our chosen image section by section to correct errors

Since Foxy's entire outfit and hair are wrong, we're gonna mask out her entire quadrant of the canvas. This gives the inpainting tool space to redraw things like her hair and tail, since we won't know exactly where these will be moved to.
Obviously, if there are elements in the background you don't want to lose, don't mask over them.

To reconfigure the prompt to focus on Foxy, all we need to do is set the weight of Miku's prompts to 0, by changing the :1 at the end of her block to :0.

Using the default inpaint weight (denoise) of 0.75, inpaint one or more times (just ensure the seed is set to random):

The result is several versions of Foxy in various states of dress. The second image is almost perfect, and just needs one more pass to get her hair right and remove the little black band on her arm.

If the image shows too little change, turn the denoise setting up; if it shows too much, turn it down.

Result of inpaint No 1:

Foxy's outfit is fixed and her arms look good, however a weird black hand has popped in on Miku's belly. No worries. We'll leave that for now and fix it when we focus on Miku's prompts. We still want to fix Foxy's hair and get rid of the armband.

Notice that we are being a bit more selective now with the masked area. Only the band itself and the area her hair should be in are masked. We'll run the inpaint tool 2-3 times on this, but first we'll tweak the prompt. Since the hair didn't appear last time, that suggests our prompt didn't have enough emphasis, so we add the prompts we want to see at the beginning, since this is where the most weight is given.

However, this still doesn't give a strong enough effect, so now we will INCREASE the inpaint weight. Shown here is inpaint weight at 0.95, 0.90, 0.85, and 0.75:

0.85 was enough to redraw the hair, however I like how the background came out on the 0.95, so...

Result of inpaint No 2:

So now Foxy is perfect! It's time to move on to Miku, and get rid of that awful hallucinated hand over her tie.
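
The inpaint weight comparison used for Foxy's hair above can be written down as a small plan of settings. This is only a sketch under assumptions: plan_denoise_sweep, the dictionary keys and the 32-bit seed range are hypothetical choices, not the API of any SD front end, and nothing here runs a model. It simply pairs each denoise value with a fresh random seed so the passes can be compared side by side.

```python
import random


def plan_denoise_sweep(prompt, denoise_values, rng=None):
    """Return one settings dict per denoise value, each with its own random seed."""
    if rng is None:
        rng = random.Random()
    plan = []
    for denoise in denoise_values:
        if not 0.0 <= denoise <= 1.0:
            raise ValueError(f"denoise must be between 0 and 1, got {denoise}")
        plan.append({
            "prompt": prompt,
            "denoise": denoise,
            # The guide keeps the seed random for every inpaint pass.
            # The 32-bit range is an assumption for illustration.
            "seed": rng.randrange(2**32),
        })
    return plan


# The four values compared in the guide; the prompt is a shortened placeholder.
sweep = plan_denoise_sweep(
    "2girls, hugging, (Foxy Fluffs, long brown hair:1), (hatsune miku:0)",
    [0.95, 0.90, 0.85, 0.75],
)
for settings in sweep:
    print(settings["denoise"], settings["seed"])
```

In the run described in this guide, 0.85 was the lowest of these values that redrew the hair.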
I do quite like the slightly uncomfortable expression on Miku's face, so I'll tweak her base prompt to match and reinforce it:

I ran this a few times at a high denoise (0.85-1.00) until I got a result that overwrote the hand and drew in Miku's uniform in its place:

It's important to note that when you inpaint or use img2img, the model is effectively drawing a new version of the image and merging it with the original to a degree based on the denoise value.

Result of inpaint No 3:

Miku looks good, so we can move on to upscaling.

4) Upscale the final product

Upscaling can be quite tricky. Depending on the image, higher resolution and denoise can introduce a lot of hallucinations. To avoid this, I usually prefer to upscale by 1.5x with a low denoise (not more than 0.4) to preserve the underlying image.

The following image was our "perfect" base gen, upscaled with Nearest at 2x, 0.4 denoise, and 20 steps.

A big advantage of going straight from base gen to upscale is that the upscaler seems to try to stick to the original generation as much as possible, so the less you inpaint or tweak the prompt, the better.

Upscaling the inpainted image:

So, here's an interesting problem: Foxy's eyes went wrong! This is because her prompt is still set to (character prompts:0) from the last inpaint. So we actually need to go back a step and do one last little inpaint to set it back to 1. I also don't like how the background screen turned out, so...

Normally I'd set the mask size to 1 and just put a dot in the corner, however that screen is all kinds of derp, so I'll inpaint it...

And then upscale number 3:

Hmm! Miku with slit pupils! Altering the denoise or upscaling steps won't fix this, so we'll have to remove it from the prompt, which means inpainting the previous step with a dot and reducing the weight of "slit pupils" in Foxy's prompts (or deleting it entirely).

Then upscale again:

And, voila! A very uncomfortable Vocaloid, getting snuggled by our favourite fox girl. Again.
SECURITY!!!!

I hope this updated version of the guide will be useful to people. Please do give me your feedback on anything that is unclear!

This technique will work for any number of subjects, as long as you're patient enough to inpaint each of them. Just bear in mind that after 5 or 6 successive inpaints the whole image starts to lose resolution, so there does seem to be a hard limit to how many times an image can be inpainted.

For example, here is an image where I inpainted 5 unique characters! (Don't ask how many credits it cost.)

Please do leave your comments below if this guide was helpful, or if you have suggestions on how I can improve too!

Lastly, a quick shoutout to Rexo on the TA Discord for constantly plugging my original article, and for inspiring me to get off my lazy butt and write an updated version. Additional shoutout to Superpat50, who inspired me to write the original article in the first place!