
Tools I use for LoRA training
Hi everyone!

Training LoRAs can be very time-consuming, especially the dataset preparation. After many training sessions, I've found some tools that save me hours of manual work, and I have compiled them in this article.

The title of each tool contains a direct link to it.

Notice: I use these applications on a system with a Ryzen 5 9600X and an RX 570 4GB.

1. Grabber (Image Collection)

This application is wonderful for mass image collection, especially if you work with boorus (anime-style image sites).

Main function: It lets you search by tags on multiple sites simultaneously and quickly download the images in their highest quality.
Additional function: It can automatically create a .txt file for each downloaded image, containing the tags from the site itself.

How to set up Grabber's auto-tagging:

Go to Tools > Options.
In the Options menu, expand the Save tab.
Go to Separate Log Files and create a new one.
Configure it with the values shown in this image:
%character:spaces,separator=^, %, %general:spaces,separator=^, %
Then, in the main application window, go to the Destination panel (on the left).
In the Name field, use the naming pattern shown in this image:
%md5%.%ext%

And that's it! Now each downloaded image comes with its corresponding tags file.

My advice: Personally, I don't trust these tags 100%, as they are sometimes incorrect or incomplete. However, they are an excellent base if you supplement them using the Append tags option in the automatic tagging tool.

2. DupeGuru and Krokiet (Duplicate Cleaning)

Having duplicate or very similar images in a dataset is fatal for training, and cleaning them by hand is a nightmare. These two tools make it much easier.

Both do the same thing: they scan a folder and detect duplicate or visually similar images.

Why use both? I've noticed that DupeGuru sometimes detects duplicates that Krokiet misses, and vice versa.
Using both gives me almost total certainty that the dataset is clean.

In Krokiet: Simply select the Similar Images section on the left, set your folder, and let it scan.
Adjustment: If it doesn't detect duplicates well, click the gear icon (⚙️) and adjust the similarity threshold.

3. Regional MultiCrop (Image Extraction)

This is a simple but incredibly useful tool I created to speed up extracting multiple images from a single one.

It's perfect for images that contain multiple angles of a character or several facial expressions, or for cropping individual panels from a manga. It saves a lot of manual cropping time.

4. Upscayl (Image Scaling)

Although some tools (like Dataset Processor) have scaling functions, they often depend on modern hardware. Upscayl is my preferred solution.

It allows for batch upscaling.
It works wonderfully even with old hardware or limited VRAM; it can also use the CPU.

5. Dataset Processor Desktop

This tool is the Swiss Army knife for processing datasets. It has many functions, but to keep the workflow fast, I focus on the following:

Gallery Page: Gives you a quick view of all images. You can click to select them and then delete them with a single button. It's ideal for spotting images that don't add value, are duplicates, or simply clash with the rest of the LoRA.
Inpaint Images: Lets you quickly erase text, logos, or unwanted elements. You just navigate between images, paint over what you want to remove, and move on to the next one.
Resize Images: Although this isn't necessary, I usually rescale them to 1024px on their longest side.
Tip: The main reason I do this is speed.
Inpainting at high resolutions takes a very long time and delays the process unnecessarily.
Generate Tags: Lets you automatically tag the entire dataset with the tagger you choose.
My Threshold settings:
Few images: I use WDv3Large with a low threshold, 0.25.
Many images (100+): I raise the threshold to 0.4 or 0.5 to capture only the most relevant tags and avoid noise.
Process Tags: Once tagged, this section gives you a quick overview of the most common tags. It has great options for cleaning duplicates and removing redundancies, plus checkboxes to add or remove tags in bulk.

6. taggui

Although I generally prefer the integrated tagger in Dataset Processor for convenience, taggui is a fantastic and more specialized alternative.

Greater variety: It has many more tagger models than Dataset Processor.
Use cases: It's especially useful if you need something more specific or if you are training a LoRA with natural-language captions (as with Qwen).
Technical: It offers more technical options that might interest you if you want more granular control.

7. chaiNNer

chaiNNer is an advanced node-based image processor. Its capabilities are enormous and go far beyond this guide, but I use it for two specific tasks:

Dataset Augmentation: You can create workflows (chains) to rotate, flip, or make small changes to your images to artificially increase the dataset size. (Use chaiNNer to increase dataset quickly | Civitai)
Quick Batch Editing: If you notice that all your images need an adjustment in contrast, color, saturation, or brightness, you can apply that correction to the entire dataset at once.

8. Booru Prompt Gallery

Once the LoRA is trained, you have to test it! I made this simple webpage to get quick and varied prompts directly from sites like danbooru.

Web App: Booru Prompt Gallery V5.1 | Civitai

That’s it!

Stay hydrated and don’t forget to blink.
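
Bonus: if you ever want the "most common tags" overview from Process Tags (section 5) outside of Dataset Processor, it can be roughly approximated with a few lines of Python. This is only a minimal sketch, not how Dataset Processor actually works. It assumes each tag file is a .txt containing comma-separated tags (which is what the Grabber setup in section 1 should produce); the function names and the folder path are hypothetical.

```python
from collections import Counter
from pathlib import Path


def split_tags(caption: str) -> list[str]:
    """Split one comma-separated caption into clean, non-empty tags."""
    parts = caption.replace("\n", ",").split(",")
    return [tag.strip() for tag in parts if tag.strip()]


def count_tags(captions: list[str]) -> Counter:
    """Count how many captions contain each tag.

    A tag repeated inside one caption is counted only once for that caption.
    """
    counts = Counter()
    for caption in captions:
        counts.update(set(split_tags(caption)))
    return counts


def load_captions(dataset_dir: str) -> list[str]:
    """Read every .txt file in the folder (assumed to be tag files)."""
    paths = sorted(Path(dataset_dir).glob("*.txt"))
    return [path.read_text(encoding="utf-8") for path in paths]


# Inline demo with made-up captions instead of real files.
demo = ["1girl, solo, smile", "1girl, solo, hat, solo", "1girl, outdoors"]
for tag, n in count_tags(demo).most_common(2):
    print(f"{tag}: {n}")
```

For a real dataset you would call something like count_tags(load_captions("path/to/dataset")). Tags that show up in nearly every file, or in only one, are usually the first candidates to review.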
