Leveraging img2img with LoRAs and prompting



There are numerous guides on Tensor and across the internet that detail proper prompting for the different image generation models. This guide covers the Flux model, using image-to-image (img2img) generation from a source image with adjustable denoise strength, combined with text prompting and added LoRAs. This combination gives you multiple control points for drastic or subtle variation in the image output.

Your choice of source image affects many aesthetic qualities of the final output. The medium of the source, whether a realistic photo, painting, crayon drawing, line drawing, etc., carries over to the final output in its own distinct way. The img2img model picks up color, shapes, depth, objects, people, animals, etc. from the source, and how much of that reaches the final output depends on settings such as denoise strength, prompt CFG scale, and LoRA strengths. Below are some examples of those adjustments.

The same prompt was used for each image; the strength of a cinematic horror LoRA was increased to get image variance.



Using the Schnell version of Flux allows LoRA strengths beyond 2, which can be used to oversaturate your images with a particular LoRA's aesthetic. The first remixed image of Xena above shows the heavy line work that the Jeanne LoRA was trained on. The second image stays within the standard scale of up to 2, for changes such as a more detailed face.

Using all of these elements in tandem allows precise adjustment. A denoise strength close to 0 reproduces the original image, similar to a photocopy. The closer the value gets to 1, the more the Flux model draws on its own training data to reinterpret the image. Adding one or more LoRAs gives you further aesthetic control over the medium and art style of the final output. For example, if you are writing origami prompts and the art style isn't translating, adding an origami LoRA will push origami into every image. The trade-off is that the LoRA's style of origami may differ from the type you are looking for, such as modular origami or kirigami. A LoRA's training data is much narrower than Flux's, so its influence on the image output is more pronounced. Factor this in when choosing your LoRA strength.
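To make the denoise scale more concrete, here is a minimal Python sketch of how many img2img implementations turn denoise strength into the number of sampling steps that actually run on the source image. It assumes the convention used by the open-source diffusers img2img pipelines (roughly total steps multiplied by strength); how Tensor handles this internally is not documented here and may differ. The function name is made up for illustration.

```python
def steps_actually_run(total_steps: int, denoise_strength: float) -> int:
    """Return how many sampling steps are applied to the source image.

    Assumption: the common img2img convention where the source image is
    noised part of the way and only about `total_steps * strength` steps
    are executed. Tensor's internals may differ.
    """
    if not 0.0 <= denoise_strength <= 1.0:
        raise ValueError("denoise strength is expected to be between 0 and 1")
    return min(int(total_steps * denoise_strength), total_steps)


if __name__ == "__main__":
    # Fewer steps means less room for the model to move away from the source.
    for strength in (0.0, 0.5, 1.0):
        print(strength, steps_actually_run(20, strength))
```

Under this assumption, a strength of 0 runs no steps at all, which matches the photocopy behavior, while a strength of 1 runs the full schedule and leaves the model the most freedom.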

For regular results, set LoRA strength between 0.7 and 1. For a lighter touch you can go as low as 0.01 to 0.1. Depending on the model (Schnell, Dev), you can go as high as 2 to 10, but this typically adds an excessive amount of LoRA influence and produces incoherent, noisy images. Finding the right balance across all settings is key to consistently making images in the style you want.
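As a quick reference, the ranges quoted above can be written down as a small Python helper. This is only a sketch: the labels ("subtle", "regular", "extreme") and the function name are illustrative, not Tensor terminology, and values that fall between the quoted ranges are reported as not covered rather than guessed at.

```python
def describe_lora_strength(strength: float) -> str:
    """Label a LoRA strength using the ranges quoted in this guide.

    The labels are illustrative assumptions. Values outside the quoted
    ranges are reported as "not covered" instead of being interpreted.
    """
    if 0.01 <= strength <= 0.1:
        return "subtle"
    if 0.7 <= strength <= 1.0:
        return "regular"
    if 2.0 <= strength <= 10.0:
        return "extreme"
    return "not covered"


if __name__ == "__main__":
    for value in (0.05, 0.85, 4.0, 0.4):
        print(value, describe_lora_strength(value))
```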

Tensor offers numerous LoRAs to choose from. Your LoRA selection, source image, and prompt all coalesce into the image you intend to make. The above screengrab from Ferngully was remixed through The Iron Giant backgrounds LoRA to move the original style into another distinct yet similar setting.

Changing denoise strength, adjusting CFG strength scale for the prompt, or the LoRA strength scale will all affect the image output.

This method can be used for daily challenge posts, events, etc. It gives you a foundation of visual information from the source image, which you remix with your crafted prompt and LoRA selection. A low denoise strength yields results similar to the source. A higher denoise strength keeps some elements of the original image (such as colors and mood) while drastically changing everything else. With time and practice, this method becomes a shortcut that gets you to your final image in fewer generations.

One way to use this method is to generate on a lower-cost model first to get a source image closer to what you want, then use that new source image at a lower denoise strength on a higher-cost model.

You can use one LoRA at a time, switching from one LoRA to another as you generate better images. You can also add multiple LoRAs at once, each with its own strength setting, so that all of them shape the image in a single generation.
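For readers curious why several LoRAs can act on one generation, the following Python sketch shows the standard LoRA formulation in miniature: each LoRA contributes a low-rank weight change (an "up" matrix times a "down" matrix), scaled by its strength and added to the base model weight. The tiny 2x2 matrices and the names `apply_loras`, `style_a`, and `style_b` are hypothetical, any alpha/rank scaling is assumed to be folded into the matrices, and this is not a description of Tensor's actual implementation.

```python
Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two small matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_loras(base: Matrix, loras: list[tuple[Matrix, Matrix, float]]) -> Matrix:
    """Return base + sum(strength * (up @ down)) over all LoRAs."""
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for up, down, strength in loras:
        delta = matmul(up, down)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += strength * value
    return merged


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    # Two made-up rank-1 LoRAs: (up 2x1, down 1x2, strength)
    style_a = ([[1.0], [0.0]], [[0.0, 1.0]], 0.5)
    style_b = ([[0.0], [1.0]], [[1.0, 0.0]], 2.0)
    print(apply_loras(base, [style_a, style_b]))
```

Because each strength multiplies that LoRA's contribution directly, a LoRA set to 2.0 shifts the weights four times as far as the same LoRA set to 0.5, which is one way to picture why very high strengths overwhelm the base model.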

There is a lot of trial and error in image generation, and it may take some time and credits to find exactly what works for you. Sticking to standard scale settings at first is recommended, to avoid the unintended outputs that come from the far ends of the strength scale. Decide early on whether you want a low denoise strength for similar images or a high denoise strength for creative variance.
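One way to keep the trial and error organized is to write out a small grid of settings before spending credits, then work through it one generation at a time. The Python sketch below only builds the list of combinations; it does not call any Tensor API. The dictionary keys and the specific values are illustrative assumptions chosen from the ranges mentioned in this guide.

```python
from itertools import product


def settings_grid(denoise_values, lora_values):
    """Build every combination of denoise strength and LoRA strength.

    The keys are hypothetical labels for note-taking, not fields of any
    real API.
    """
    return [
        {"denoise_strength": denoise, "lora_strength": lora}
        for denoise, lora in product(denoise_values, lora_values)
    ]


if __name__ == "__main__":
    # Illustrative values within the ranges discussed in this guide.
    grid = settings_grid(denoise_values=(0.5, 0.7), lora_values=(0.7, 0.85, 1.0))
    for number, settings in enumerate(grid, start=1):
        print(number, settings)
```

Changing one setting at a time between generations makes it easier to tell which adjustment caused a given change in the output.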

You can add a low-strength LoRA to a low-denoise image (0.5 to 0.7) to add subtle changes in a specific style.
