creator: Minthybasis
You beloved tail~ Ready for a full NAI3 experience? (Actually even better)
Full-scale finetune of Pony Diffusion 6 on a dataset of 1.8M anime pictures:
Unmatched (in open source) knowledge that is missing in the original Pony and other models
8k+ artist styles (wildcard), a few general styles
Thousands of characters simply by prompt
Full color palette, full brightness range (example 1, example 2), great base aesthetics
No annoying watermarks like everywhere else
Unique angles, foreshortening, full-body wide shots or extreme closeups without any issues, pretty backgrounds as an added bonus
From the cutest and loveliest things to the deepest and darkest fantasies
Best performance with tail concepts for your fox/cat/dog/dragon/... waifus/husbandos
Well, this finetune has had enough training to make a base anime model. Despite that, existing knowledge (for anime) is not gone but has only become better. A careful approach, especially to TE training, and a lot of high-quality natural text captions (about 600k, mainly made with Claude 3 Opus/Claude 3.5 Sonnet) significantly improve prompt control and understanding. "Feels like a new base, not pony (c)".
And yes, unlike the majority of PD derivatives, which are just a reskin or a lobotomized merge, not a single LoRA was harmed (merged). You can add your tweakers if needed, merge in the difference from another favourite checkpoint, or whatever; it works just like a good Pony-compatible base.
v0.5.0 Changelog
A new training run from the PD base with a large dataset, using some new approaches: pretraining, main training, refining
Lots of new data
After some black magic in training, you can now get completely black or completely white pictures without breaking compatibility with existing tools, LoRAs, etc. Actually a very interesting experience (example)
Better and more stable base styles, less "burning" for artists
Fixes, improvements, ...
(Dataset cutoff: beginning of July. Requests made after it are pending and not forgotten)
v0.4.5 changelog:
* A lot of work on improving stability and anatomy. Works better, fewer flaws in complex poses, better fingers
* New data, characters, artists, concepts
* Pyramid Noise has been applied in the final pass of training. Brightness can now be controlled by the prompt; it also fixes closeups/wide shots and maintains compatibility with existing LoRAs, unlike standard noise offset.
* Abandoned the original quality classification tags and switched to custom ones
* General fixes and improvements
(Dataset cutoff: 20th May, all requests after this date will be in next version)
Features and prompting:
Well, first of all - the TE knows a lot. It will try to make whatever you prompt without ignoring parts of it, as you may be used to. No guide rails, no safeguards, no lobotomy. Shit in - shit out.
Schizo-prompts from mixes, where you have to boost tag weights and add extra ones to get at least some response (something like (sunny day, rainbow, ethereal hair, transparent skin, huge breasts:1.9)), will not work. You will get something insane, creepy or unexpected.
At the same time, if you just copy tags from a booru picture without the manipulations mentioned above, or describe it normally with a combination of tags and natural text, most likely it will be great across a very wide range. Stick to original booru tags to get the best results. The deepest and darkest fantasies may require some rolling; popular things are very stable.
Basics of v0.5.0:
Same as for all SDXL: ~1 megapixel for txt2img, any AR with a resolution that is a multiple of 64 (1024x1024, 1152x, 1216x832,...). Euler_a and CFG 4..9 (6-7 is best). Highres fix: anyGAN/DAT, x1.5-1.6, denoise 0.5; upscale works best with a single tile resolution of no more than 3mpx. Highres fix and further upscale will significantly improve quality, details, eyes, hands, feet, etc.
Set Emphasis: No norm in the settings of your generation tool if you are getting strange blobs or distortion.
Clip Skip 1, unless you are using LoRAs that have problems with it.
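The resolution rules above (about 1 megapixel, both sides a multiple of 64, highres fix at x1.5-1.6 with no more than 3mpx) can be sketched in a few lines. This is only an illustration: the function names are made up here, and treating "3mpx" as 3,000,000 pixels is an assumption.

```python
def sdxl_resolution(aspect_ratio, target_pixels=1024 * 1024, step=64):
    """Return (width, height), both multiples of `step`, close to `target_pixels`."""
    # Ideal (unrounded) width for the requested aspect ratio and pixel budget.
    ideal_width = (target_pixels * aspect_ratio) ** 0.5
    width = max(step, round(ideal_width / step) * step)
    height = max(step, round(width / aspect_ratio / step) * step)
    return width, height


def hires_size(width, height, scale=1.5, max_pixels=3_000_000):
    """Return the upscaled size and whether it fits the assumed 3 mpx budget."""
    new_w, new_h = int(width * scale), int(height * scale)
    return new_w, new_h, new_w * new_h <= max_pixels


if __name__ == "__main__":
    w, h = sdxl_resolution(1216 / 832)
    print(w, h, hires_size(w, h, scale=1.5))
```

Square and 1216x832-style ratios come out as the sizes listed above; other ratios get snapped to the nearest multiple of 64.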
Only 4 quality tags:
masterpiece, best quality,
for positive
low quality, worst quality
for negative.
Avoid using score_x, source_x, ... etc like in original pony.
In most cases they just make things worse: add noise and mess, break bodies and fingers, change styles and bring back the urine yellow-green filter.
The original was definitely not the best implementation of quality tagging: it included some training flaws and required tons of tokens. It became clear that it's better to introduce new tags instead of fixing the original ones. At this point they only bring back old triggers without serious improvements.
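If you have old Pony prompts lying around, a small helper can drop the score_x/source_x tags and put the new quality tags in instead. A minimal sketch: the helper name and the regular expression are made up for illustration, and putting the quality tags at the very front is just one possible placement.

```python
import re

QUALITY_POSITIVE = ["masterpiece", "best quality"]

# Matches original Pony tags such as score_9, score_8_up, source_anime.
_PONY_TAG = re.compile(r"^(score|source)_\w+$")


def clean_prompt(prompt):
    """Drop score_x/source_x tags and prepend the two positive quality tags."""
    tags = [t.strip() for t in prompt.split(",")]
    kept = [
        t for t in tags
        if t and not _PONY_TAG.match(t) and t not in QUALITY_POSITIVE
    ]
    return ", ".join(QUALITY_POSITIVE + kept)


if __name__ == "__main__":
    print(clean_prompt("score_9, score_8_up, source_anime, 1girl, fox tail"))
```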
Negative prompt:
(worst quality, low quality:1.1), error, bad hands, watermark, distorted
Adjust according to your preferences.
Do not put tags like greyscale, monochrome, yellow background in the negative. You will just get burned images; there is no need to fix washed-out colors or a "yellow filter" here, as you may be used to. 3d in the negative is also a bad choice in most cases.
To improve backgrounds, add to negative
simple background, blurry background, abstract background
but do not forget to remove it if you are prompting something with a simple background.
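The negative prompt advice above is easy to forget, so here is a rough sketch that adds the background tags only when the positive prompt does not ask for something simple. The function name is hypothetical, and checking for the word "simple" is a crude assumption; adjust it to your own prompts.

```python
BASE_NEGATIVE = "(worst quality, low quality:1.1), error, bad hands, watermark, distorted"
BACKGROUND_NEGATIVE = "simple background, blurry background, abstract background"


def build_negative(positive, improve_background=True):
    """Return the negative prompt, skipping background tags for 'simple' prompts."""
    wants_simple = "simple" in positive.lower()
    if improve_background and not wants_simple:
        return f"{BASE_NEGATIVE}, {BACKGROUND_NEGATIVE}"
    return BASE_NEGATIVE


if __name__ == "__main__":
    print(build_negative("1girl, fox tail, forest"))
    print(build_negative("1girl, fox tail, simple background"))
```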
Artist styles:
Use them with the "by " prefix. Multiple artists give very interesting results and can be controlled with prompt weights.
by ARTISTNAME1, [by ARTISTNAME2, (by ARTISTNAME3:0.8),...]
or/and
[by ARTISTNAME1|by ARTISTNAME2|by ARTISTNAME3|...]
Works best at the very beginning of the prompt. Can be used as a wildcard (beware, there is a flaw in the sd-dynamic-prompts extension that sometimes wrecks results when used with a batch size of more than 1). For the majority of styles, highres fix/upscale improves quality a lot.
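The two artist formats shown above can be generated from a list, which is handy when rolling combinations. A small sketch with made-up helper names; ARTISTNAME placeholders are the same as in the examples above, not real tags.

```python
def artist_mix(artists):
    """Comma-separated 'by NAME' tags; weights other than 1.0 use (tag:weight)."""
    parts = []
    for name, weight in artists.items():
        tag = f"by {name}"
        parts.append(tag if weight == 1.0 else f"({tag}:{weight:g})")
    return ", ".join(parts)


def artist_alternation(names):
    """Bracket-and-pipe form: [by A|by B|...]."""
    return "[" + "|".join(f"by {n}" for n in names) + "]"


if __name__ == "__main__":
    styles = artist_mix({"ARTISTNAME1": 1.0, "ARTISTNAME3": 0.8})
    # Artist tags go first, as recommended above.
    print(f"{styles}, masterpiece, best quality, 1girl, fox tail")
    print(artist_alternation(["ARTISTNAME1", "ARTISTNAME2"]))
```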
General styles:
2.5d, bold line, smooth shading, flat colors, minimalistic, cgi, digital painting, ink style, oil style, pastel style
can be used in combinations (with artists too), with weights, both in positive and negative prompts.
Characters:
Use the full name tag, same as on boorus, with proper formatting, like "karin_(blue_archive)" -> "karin \(blue_archive\)". Use skin tags for better reproduction, like "karin \(bunny\) \(blue_archive\)". This extension might be very useful.
Most characters are known by name, but it will be better if you also prompt their main features, like:
karin \(blue_archive\), karin \(bunny\) \(blue_archive\), dark-skinned female, purple halo, ponytail, yellow eyes, playboy bunny, fishnet pantyhose, gloves
Natural text:
Use it in combination with booru tags, works great. Or use only natural text after typing the style and quality tags. Or use just booru tags and forget about natural text. It's all up to you.
And yes, it's still based on Pony, so it will be worse at IRL concepts, references or some complex expressions compared to other checkpoints based on vanilla SDXL. Check out Tofu, my new model that can manage such things.
Lots of Tail/Ears-related concepts:
tail censor, holding own tail, hugging own tail, holding another's tail, tail grab, tail raised, tail down, ears down, hand on own ear, tail around own leg, tail around penis, tail through clothes, tail under clothes, lifted by tail, tail biting,...
(booru meaning, not e621) and many others with natural text. Some reproduce perfectly, some require rolling. Unfortunately, in 0.5.0 some may work worse, while others look better. Also, it now performs better with all kinds of tails, not only fluffy kemonomimi ones.
Brightness/contrast:
You can just prompt what you want with tags or natural text and it should work, like dark night, dusk, bright sun, etc. Black/white background works, but often it gives not exactly 0,0,0 or 255,255,255 as it should. Part of this is related to the tags: just check what pictures are tagged with them. Using phrases like (cute girl in front of completely black background) fixes it. Anyway, you shouldn't meet any issues in general use; it works just like NAI3, often even better.
Known issues
Well, unfortunately there are some:
Some artist styles don't work as they should.
(The reason for this is not entirely clear, because in another model with the same dataset they work fine. Probably it is related to conflicts with PD 1-token hashes or problems with the original TE. It can be fixed in the future anyway; please report if you find artists that don't have a decent effect.)
Some concepts require more training (a few tail-related ones, some rare ones like "dogeza" or memes)
Watermarks can sometimes be found. Mostly this is related to the Pony base, but some may be from the dataset
To be discovered, still WIP
Requests for artists/characters in future models are open. If you find an artist/character/concept that performs weakly or inaccurately, or has a strong watermark - please report it, I will add them explicitly. Follow for new versions.
License:
Pony viral, check the original. Feel free to use it in your merges, finetunes, etc., just please leave a link.
Thanks:
Artists who wish to remain anonymous - for sharing private works; Soviet Cat - GPU sponsoring; Sv1. - llm access, captioning, code; K. - training code; Bakariso - datasets, testing, advice, insights; NeuroSenko - donations, testing, code; T.,[] - datasets, testing, advice; dga, Fi., ello - donations; other fellow brothers that helped. Love you so much ❤️.
And of course everyone who gave feedback and made requests, it's really valuable.
Donations
AI is my hobby, I'm wasting money on it and not begging for donations. If you want to support me - share my models, leave feedback, make a cute picture with a kemonomimi girl. And of course, support the original artists.
However, your money will accelerate further training and research
(Just keep in mind that I can waste it on alcohol or cosplay girls)
BTC: bc1qwv83ggq8rvv07uk6dv4njs0j3yygj3aax4wg6c
ETH/USDT(e): 0x04C8a749F49aE8a56CB84cF0C99CD9E92eDB17db
If you can offer GPU time (A100+) - PM.