Photanima is an experimental finetune of Anima Base v1.0 to see whether it is a viable architecture for photography. Spoiler alert: it totally is.
Turbo LoRA baked in.
Trained on ~1500 images for 27,500 steps. This is my dataset with around 100 new images and some caption cleanup. Training took approximately 24 hours on a GeForce RTX 3090.
Pros:
Extremely fast.
Extremely good prompt adherence.
Anatomy is pretty stable. If it screws something up, changing your steps by +1/-1 usually fixes it.
Supports up to nearly 2MP with little-to-no distortions.
At first, I noticed that Photanima's style was inconsistent - it had a tendency to regress toward a cartoony/CGI look as my prompts became more complex. I was able to mostly overcome this by splitting Photanima into constituent content and style blocks, then boosting the style strength to around 4.2 in ComfyUI.
Style is pretty consistent now, but there are some notable drawbacks.
Cons:
There are significant biases from my limited dataset. For example, you have to push your prompts pretty hard to steer the model away from its default facial features/racial biases. Yes, I have a type. I suspect this won't be a big issue for LoRA training.
It struggles with certain artistic terms like silhouette.
Microdetail quality is somewhere between SDXL and ZIT. Honestly, it's really good for a 2B model. Two-step upscaling with Anima doesn't help much, but I'm sure the results would be amazing if you sent a Photanima image to a different model for refinement. Or if that's too much work: just add a little film grain. It does wonders and requires no extra VRAM.
Model is a little too horny for its own good.
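On the film grain tip above: here is a minimal sketch of what "add a little film grain" can look like in code. It is not the exact method I use, just plain Gaussian noise applied to raw 8-bit pixel data. The function name add_film_grain and the default strength of 8 are made up for illustration.

```python
import random
from typing import Optional


def add_film_grain(pixels: bytes, strength: float = 8.0,
                   seed: Optional[int] = None) -> bytes:
    """Add Gaussian noise to raw 8-bit pixel data (illustrative sketch).

    `strength` is the noise standard deviation in 0..255 units; the default
    is an assumption, not a tuned value. Noise is applied per byte, so on RGB
    data each channel gets independent (slightly colored) grain.
    """
    rng = random.Random(seed)
    out = bytearray(len(pixels))
    for i, value in enumerate(pixels):
        noisy = value + rng.gauss(0.0, strength)
        # Clamp back into the valid 8-bit range.
        out[i] = max(0, min(255, round(noisy)))
    return bytes(out)
```

If you have Pillow installed, something like Image.frombytes(img.mode, img.size, add_film_grain(img.tobytes())) should work for 8-bit RGB images. A pure-Python loop is slow on large images, so treat this as a demonstration of the idea rather than a production filter.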
🛠️ Recommended Settings
8-10 steps with v1.1 Turbo, or ~12 steps with v1.0 Turbo.
Euler or er_sde sampler. Euler is a safe pick, but er_sde might produce better details.
Simple or Beta scheduler.
CFG 1.
Preferred resolution: 832x1216 or 1040x1520.
For maximum realism, begin your prompt with real life photo of...
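If you script your generations, the Turbo settings above can be written down roughly like this. The dictionary keys and helper names are hypothetical and do not correspond to any particular ComfyUI API; the sketch just records the values and checks that both preferred resolutions stay under the ~2MP limit mentioned earlier.

```python
# Illustrative only: key names are assumptions, values come from the list above.
TURBO_SETTINGS = {
    "steps": 8,             # 8-10 with v1.1 Turbo, ~12 with v1.0 Turbo
    "sampler": "euler",     # or "er_sde" for possibly better details
    "scheduler": "simple",  # or "beta"
    "cfg": 1.0,
}

PREFERRED_RESOLUTIONS = [(832, 1216), (1040, 1520)]


def megapixels(width: int, height: int) -> float:
    """Return the pixel count of an image in megapixels."""
    return width * height / 1_000_000


def realism_prompt(subject: str) -> str:
    """Prefix a subject with the recommended realism phrase."""
    return f"real life photo of {subject}"


for w, h in PREFERRED_RESOLUTIONS:
    print(f"{w}x{h}: {megapixels(w, h):.2f} MP")
print(realism_prompt("a woman reading in a cafe"))
```

Both resolutions come out well under 2MP (about 1.01 and 1.58), so there is some headroom if you want to go a bit larger.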
Base model settings:
30-50 steps.
Euler sampler.
Simple scheduler.
CFG 4-6.
Use a bunch of fluff tags like masterpiece, score_9, absurdres, best quality, highres, photo \(medium\), real life. Note: do not do this with Turbo.
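To make the tag rule harder to forget, here is a small sketch of a prompt builder that only adds the fluff tags for the base model. The function name build_prompt is hypothetical, and putting the tags before the description is my assumption; the tag list itself is the one given above.

```python
# Fluff tags from the base model settings above. The raw string keeps the
# escaped parentheses in "photo \(medium\)" intact.
QUALITY_TAGS = [
    "masterpiece", "score_9", "absurdres", "best quality",
    "highres", r"photo \(medium\)", "real life",
]


def build_prompt(description: str, turbo: bool) -> str:
    """Prepend the fluff tags for the base model, skip them for Turbo."""
    if turbo:
        # Turbo: leave the prompt alone, no fluff tags.
        return description
    return ", ".join(QUALITY_TAGS + [description])


print(build_prompt("a hiker on a ridge at dawn", turbo=False))
print(build_prompt("a hiker on a ridge at dawn", turbo=True))
```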
