REYA_012

[REYApping] Model Training Experience Part1: Illustrious

Hello and welcome to the fourth edition of REYApping, a space where I write a bunch of nonsense. Without further ado, let's begin.

Fourth edition? Dang, I didn't expect I'd make a fourth edition of me yapping, but here I am, trying to yap about my experience with model training, the model this time being "Illustrious". Fair warning: I had never used this model in my life before. Even though it's based on XL, trying it felt very different from Animagine or other anime XL models in general. Also, a quick disclaimer: every setting I use here is an uneducated guess, so I encourage you NOT to straight up copy these settings. Unless they're coincidentally good.

Step 1: Getting Ideas

I'm not really a creative person, so I admit this part is one of, if not the, hardest. My current task is to train with Illustrious as the base, fit it into visual, game, or space design, and get it into the Tenstar Fund. The required channel reminds me of the past Hunyuan event, which had basically the same task minus the Tenstar Fund part, and for which I made: Fumo Dolls.

I decided to revisit the images I had gathered and saw a really good opportunity. Most of my dataset contains anime and/or manga characters, with 90% of them being Touhou characters, and since we're dealing with Illustrious, everything just aligned well, so I could continue to the next step.

Step 2: Dataset Gathering and Captioning

Since I chose to reuse my old images, I could skip the gathering process. If you're starting from scratch, I recommend you either generate images of at least 1MP, or, if you're searching on Google, use the advanced search and pick "larger than 2MP". If you have an interesting concept but don't know where to look and Google is a no-go, then try generating with DALL-E, which you can access through Microsoft Edge's Bing. That gets you 1024x1024 images, which is good enough for training.
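If you want to sanity-check candidate images against the 1MP rule of thumb above before training, a throwaway helper works. This is just my own minimal sketch (`is_trainable` is a hypothetical name, not part of any training tool):

```python
def is_trainable(width, height, min_megapixels=1.0):
    """Rough check from the guidance above: keep images of at least
    ~1MP. 1024x1024 is about 1.05MP, so it passes."""
    return (width * height) / 1_000_000 >= min_megapixels

print(is_trainable(1024, 1024))  # True  (~1.05 MP)
print(is_trainable(800, 600))    # False (0.48 MP)
```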
For this Fumo model, I used 15 images of real Fumo dolls with different backgrounds, angles, and subjects. After getting the images you want, it's time for the most annoying part: captioning.

Captioning is basically telling the base model what each image is all about. If you're training a style, you need to caption pretty much everything in the image. For a character, you assign its key characteristics to a unique word (called a trigger word) and just caption its clothes and what it is doing. There are other methods, but I'll be captioning everything this time. You need to create a txt file whose name matches each image's name. To make it easier, I renamed my images to "fumo" and let Windows assign a number, then created txt files with the same names. The result looks pretty much like this:

Since the dataset mostly contains characters that Illustrious recognizes, I caption using the trigger word first, then the character name, then clothing (if different from the original), pose, and background. Once that's done, I zipped everything up, ready to upload.

Step 3: Training Parameters

I'll just post a picture of my parameters here. Also, since the training images are real photos, I add a "realistic" tag at the end of every caption by doing this:

Now it's time to train.

Step 4: Test and Publish

After training finished, I published two models (Epoch 3 and 6) for testing, since Tensor can't use an unpublished model for testing. The results are quite okay. It also creates cute randomness like this:

Final Thoughts

This was quite interesting. Illustrious seems like a very good model: it makes nice images and understands a lot of characters, which makes it easier for people to create character images without LoRAs. This is the first of many tests I'll do with this base model. I'll also post an update later with more base models (SD3.5, perhaps?).

Thank you for reading this part of REYApping. See you in the next one.
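As an appendix to the captioning step above (one txt per image with a matching name, plus the "realistic" tag appended to every caption): the whole chore can be scripted instead of done by hand. A hedged sketch, assuming nothing beyond the standard library; `make_caption_files` is my own hypothetical helper, not a Tensor feature:

```python
import os

IMAGE_EXTS = (".png", ".jpg", ".jpeg", ".webp")

def make_caption_files(folder, trigger_word, suffix_tag="realistic"):
    """For every image in `folder`, ensure a matching .txt caption file
    exists (seeded with the trigger word) and that `suffix_tag` sits at
    the end of the caption, as done for the real-photo Fumo dataset."""
    for name in sorted(os.listdir(folder)):
        stem, ext = os.path.splitext(name)
        if ext.lower() not in IMAGE_EXTS:
            continue  # skip the .txt files themselves and anything else
        caption_path = os.path.join(folder, stem + ".txt")
        if os.path.exists(caption_path):
            with open(caption_path, encoding="utf-8") as f:
                caption = f.read().strip()
        else:
            caption = trigger_word  # new caption starts with the trigger word
        tags = [t.strip() for t in caption.split(",") if t.strip()]
        if suffix_tag not in tags:
            tags.append(suffix_tag)  # e.g. append "realistic" at the end
        with open(caption_path, "w", encoding="utf-8") as f:
            f.write(", ".join(tags))
```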
[REYApping] Simple and Brief Explanation of AI Tool

Hello and welcome to the third edition of REYApping, a space where I write a bunch of nonsense. Without further ado, let's begin.

Never in my entire Tensor life did I think I'd actually try to explain something. But here we are: an article about AI Tools. What is an AI Tool? Why make one? How is it different from "create mode"? I'll try to explain.

What is an AI Tool?

Now, I might be wrong here (roast me in the comments), but here's my answer: an AI Tool is a simplified, more straightforward interface over a ComfyUI workflow. It saves you from seeing a bunch of tangled spaghetti that can potentially break your eyes and mind. Instead of customizing the workflow nodes directly, you get an interface similar to "create mode". The downside is that it can have limited parameters, since those are set by the tool's creator, and you won't know how the workflow works. Also, it sucks your credits and soul (Riiwa, 2024), but sadly doesn't suck your coc- *cough* Nevermind that last part.

Here's an image of a ComfyUI workflow:

Here's that same workflow made into an AI Tool:

Why Make an AI Tool?

Simplicity and straightforwardness in the palm(?) of your hand. That's it. It's especially useful if your flow only has a few modifiable variables, such as prompts, steps, etc. If your flow has a lot of modifiable variables and/or you want more control over your workflow, then I suggest working directly in ComfyUI.

How is It Different from Creation Mode?

Creation mode lets you control basic functions such as samplers, which T5 to use, and other things like ADetailer, img2img, ControlNet, etc. An AI Tool, while it can do all that if the author sets it up, is generally limited to basics such as prompts, steps, resolution, batch size, and maybe seeds. You can't really use things like ADetailer or img2img and other fancy stuff yourself; you depend on what the tool itself provides.
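To make the "limited parameters" point concrete, here's a purely hypothetical sketch (none of these names or values are Tensor's actual API) of what an AI Tool conceptually does: the creator pins most of the workflow's knobs and exposes only a chosen few to the user.

```python
# Hypothetical illustration only -- not Tensor.Art's real API.
# A ComfyUI workflow has many knobs; an AI Tool pins most of them
# and exposes just the handful its creator chose.
WORKFLOW_DEFAULTS = {
    "sampler": "euler",
    "scheduler": "normal",
    "cfg": 3.5,
    "steps": 25,
    "width": 768,
    "height": 1152,
    "prompt": "",
    "seed": 0,
}

EXPOSED_BY_TOOL = {"prompt", "steps", "seed"}  # chosen by the tool's creator

def run_tool(**user_params):
    """Merge user input over the pinned defaults, rejecting anything
    the creator did not expose."""
    hidden = set(user_params) - EXPOSED_BY_TOOL
    if hidden:
        raise ValueError(f"not exposed by this tool: {sorted(hidden)}")
    return {**WORKFLOW_DEFAULTS, **user_params}
```

So a user can change the prompt or steps, but trying to touch, say, `cfg` simply isn't possible from the tool's interface.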
In short: Creation Mode allows a broader range of functions but only with basic abilities, while an AI Tool mostly allows specific functions but can produce better results thanks to the dark magic trickery inside its Comfy flow.

Thank you for reading this part of REYApping. See you in the next one (if there's any).
[REYApping] How Going A Bit "GungHo" On the Network Ranks Could Help in Flux Training #Halloween2024

Hello and welcome to the second edition of REYApping, a space where I write a bunch of nonsense. Without further ado, let's begin.

It's been a while since I first wrote REYApping, and that one stayed true to its "nonsense ahh writing". This one, though, may be a bit different. I'm going to share some of my experience with one of my Flux LoRAs: Anitrait. Now, I'm not a good creator; in fact, I'm just a mere user who hates Flux's original anime style. That's why I decided to make Anitrait. With that in mind, you might think the way I train LoRAs sucks and is bad practice, but whatever, you can roast me later.

Backstory

My first attempt was kind of hellish, since I ran into a lot of deformity issues (especially bad anatomy and hands). I lost quite an amount of credits because I'm the "do first, think later" type. And that wasn't the only issue: the LoRA also wouldn't trigger when set to 1, among other things.

After going broke with numerous training attempts, I sought help, and thankfully a great creator, Riiwa (you must know this person, and you might have even used their AI Tools), reached out and sent me a link to someone's write-up on training Flux. The article is on CivitAI, written by "mnemic" (it's very easy to find on Google). That person did something I'd never done: set the network alpha higher than the dim. Very interesting. When I asked another great creator, NukeAI, about its effect, they said "It makes your image fry, a.k.a. overfit quickly," or something along those lines. But being the person I am, I decided to just go with it, and the result is actually interesting.

Testing

I want to compare three versions of Anitrait. All of them were trained on 50 portrait images generated with AnimagineXL 3.1, cropped to 1024x1024 manually in Photoshop, and captioned with the simple category-word method (also covered in mnemic's article; check that part out, it's interesting).
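Before the settings, a quick mechanical note on why alpha matters at all: in kohya-ss style trainers (which, as far as I know, is what most LoRA training UIs wrap), network alpha is not an independent strength knob; the learned LoRA update is multiplied by alpha / dim. So setting alpha above dim literally amplifies the update, which lines up with the "fries quickly" warning. A tiny sketch of that relationship:

```python
def lora_scale(network_dim, network_alpha):
    """Effective multiplier applied to the LoRA weight update in
    kohya-style trainers: scale = alpha / rank (dim)."""
    return network_alpha / network_dim

# All three Anitrait configurations compared below use dim 64:
for label, alpha in [("Beta 2_E5", 32), ("Beta 3_E6", 64), ("B3", 128)]:
    print(f"{label}: scale = {lora_scale(64, alpha)}")
# Beta 2_E5: scale = 0.5
# Beta 3_E6: scale = 1.0
# B3: scale = 2.0
```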
The versions I'll be comparing are Beta 2_E5, Beta 3_E6, and the real Beta 3 (I'll just call it B3). Here are the settings:

Beta 2_E5:
Network Dim: 64
Network Alpha: 32
LR Scheduler: Cosine
Optimizer: AdamW

Beta 3_E6:
Network Dim: 64
Network Alpha: 64
LR Scheduler: Cosine
Optimizer: AdamW

B3:
Network Dim: 64
Network Alpha: 128
LR Scheduler: Cosine
Optimizer: AdamW8bit (this one wasn't trained on Tensor, but I tried to match every setting except the optimizer, since regular AdamW isn't supported)

All of the models above will be tested using a herbalist prompt I found by clicking the dice button in Tensor's prompt box, all with the Euler-normal sampler, 25 steps, seed 1000001, LoRA weight 0.6, model Flux.1 Dev fp8, T5 fp8, no negative prompt, guidance 3.5, and clip skip 1 at 768x1152 resolution.

The prompt:

Within the whimsical realm of an anime style, fantasy themed herbalist shop, a serene herbalist stands amidst shelves stacked with ancient tomes and peculiar botanicals. The camera captures a stunning close-up of her gentle features: piercing light-colored eyes, a caring smile, and long hair framing her arcanist attire adorned with herbalist details. Her hands cradle a delicate glass vial filled with shimmering essence as she gazes directly at the viewer, inviting them into her mystical world. The soft, warm lighting emphasizes the intricate textures of her clothes and the lush greenery surrounding her. In the background, an immersive scenery unfolds, replete with symmetrical details and sharpened clarity, transporting the viewer to a realm of wonder and discovery.

Result

Beta 2_E5
Beta 3_E6
B3

Discussion

Not gonna lie, all three versions seem good at interpreting the long af prompt: the herbalist character, the object she's holding, her surroundings, etc. But when it comes to details, there are quite a few differences.

Beta 2_E5 has problems generating good fingers (by Flux standards).
The face is also a little bit off, but the overall detail is richer than in the other two versions: from the more intricate clothing pattern to her more natural, plant-looking hair accessories.

Beta 3_E6 still has problems generating good fingers, but the overall face detail is better. Another problem worth pointing out is that the glass bottle she's holding is deformed. Overall detail is good, though, and I actually like the eye color here being yellow rather than green.

B3 has the best result: not only does it generate very good fingers and hands, it's also more vibrant, with not a deformity in sight. It's sharp, the face looks good, the proportions are good, and her chest somehow got a teeny bit bigger. The only problem is that the yellow thing on her shoulder is only generated on one side, but meh, I'll take it.

Conclusion

Setting the network alpha higher than the network dim may help improve image quality, and it may help fix certain anatomy issues. But this result needs more testing from other people, since it's only one case that saw better results.

Thank you for reading this edition of REYApping. Any feedback is appreciated. See you in the next edition!
[REYApping] Memes, The DNA of Soul

Hello and welcome to the first edition of REYApping, a space where I write a bunch of nonsense. Without further ado, let's begin.

Memes. You probably know the term as a manifestation of comedy (or in some cases, dark comedy) through visual media. They have existed for a long time, with some saying they originated roughly a decade ago. They often reflect current events, societal trends, and shared experiences. For example, when the anime "Spy x Family" was trending, you'd see a bunch of memes like Anya's 'heh' face or, more recently, the Rizz guy. By the way, you can find their respective LoRAs on Tensor, and here's an example of each, generated with a model called "REYAMix", which somehow sounds very similar to my username... sus...

Image 1. Anya 'heh' face by FallenIncursio and Waos Chad Face by AlvaroOo

Now, when you see a meme, you'll probably just laugh it off and sometimes get the urge to share it with your friends, colleagues, or even your parents and grandparents, thinking of it as just a way of expressing humour. That is, until this guy shows up:

Image 2. "A guy who enjoys memes"

This guy is Monsoon, a character from a game called "Metal Gear Rising: Revengeance". He's famous for the quote "Memes, the DNA of the soul". Now you might be thinking: meh, it's just a guy who enjoys memes, nothing serious. Well, you're also right, but then he also says this:

"They shape our will. They are the culture, they are everything we pass on. Expose someone to anger long enough, they will learn to hate. They become a carrier. Envy, greed, despair... All memes. All passed along." ―Monsoon

And that's when things get quite interesting. This guy isn't talking about the memes we usually know; he's talking about beliefs, "doctrines", that shape human will and culture. Memes are the basic units of human culture and consciousness, passed on from person to person, shaping beliefs, emotions, and behaviors.
They also carry cultural information that influences human thought and action, just like DNA carries the genetic information that determines the traits and characteristics of living organisms (hence the "DNA of the soul" thing). And this explains the next part: if you're exposed to an environment filled with malice, anger, and hate for a long time, you'll most likely develop something like anger issues and be filled with hatred.

If you're exposed to an AI tool long enough, you'll also learn how to make imaginary waifus exist on your screen, and make a POV where you (not) accidentally launch a mega pint of milk on them and make them feel good(?), enough to make the "government" apply a filter that somehow wasn't internally tested yet, which results in even non-milky waifus "not feeling very well" and disappearing with a snap of a finger. And that, ladies and gentlemen, is why you should keep your image generation "clean" and safe for now, so you can enjoy your waifus long term.

Well, enough yapping. In the end it doesn't even matter: memes can be seen as the building blocks of human culture and identity, and as such, understanding them can help us comprehend the nature of human evolution, social dynamics, and culture (including... comedy, yes, good old comedy).

Thank you for reading this part of REYApping. Oh, and please try REYAMix if you have credits to spare, and roast me on Discord if it's crap (spoiler: it is). Okay, this is the end, for real. Thank you and see you next time(?).