Photanima [by liftweights] - v1.1 Turbo


Photanima [by liftweights] by MacrossManiac on Tensor.Art

Photanima is an experimental finetune of Anima Base v1.0, made to test whether Anima is a viable architecture for photography. Spoiler alert: it totally is.

Turbo LoRA baked in. If you're on a 30-series GPU, I recommend using this with my INT8 Toolkit + INT8 Lazy Torch Compile node for wicked fast gen times. All demo images generated with that combo.

❤️ If you enjoy Photanima, you can help offset the cost of training:

Buy liftweights a Coffee

🤓 Technical details

Trained on ~1500 images for 27,500 steps. This is my Snakebite 2.3 dataset with around 100 new images and some caption cleanup. Training took approximately 24 hours on a GeForce RTX 3090.
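
For a rough sense of scale, the figures above can be turned into a per-step time and a number of passes over the dataset. This is only back-of-the-envelope arithmetic: the batch size is not stated anywhere in this card, so the value of 1 below is an assumption, and the image count and training time are themselves approximate.

```python
# Figures quoted in the text (both approximate).
num_images = 1500
total_steps = 27_500
training_hours = 24

# Assumption: batch size 1 with no gradient accumulation (not stated in the card).
assumed_batch_size = 1

seconds_per_step = training_hours * 3600 / total_steps
passes_over_dataset = total_steps * assumed_batch_size / num_images

print(f"{seconds_per_step:.2f} s/step")
print(f"{passes_over_dataset:.1f} passes over the dataset")
```

With a larger batch size, the number of passes over the dataset would scale up proportionally.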

Pros:

  • Extremely fast.

  • Extremely good prompt adherence.

  • Anatomy is pretty stable. If it screws something up, changing your steps by +1/-1 usually fixes it.

  • Supports up to nearly 2MP with little to no distortion.

At first, I noticed that Photanima's style was inconsistent - it had a tendency to regress toward a cartoony/CGI look as my prompts became more complex. I was able to mostly overcome this by splitting Photanima into constituent content and style blocks, then boosting the style strength to around 4.2 in ComfyUI.
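
One way to picture that kind of boost is as scaling the finetune's weight changes for some blocks more than others. The sketch below is not the actual ComfyUI workflow, which isn't described here. It assumes the split is applied to weight deltas relative to the base model, uses plain floats in place of tensors, and the block names and the `style_prefixes` grouping are entirely hypothetical.

```python
def boost_style(base, tuned, style_prefixes, style_strength=4.2):
    """Scale the finetune's delta for 'style' blocks and keep other blocks at 1.0.

    base / tuned map block names to weights (floats here, tensors in practice).
    Which blocks count as style is an assumption supplied by the caller.
    """
    merged = {}
    for name, base_weight in base.items():
        delta = tuned[name] - base_weight
        is_style = name.startswith(tuple(style_prefixes))
        scale = style_strength if is_style else 1.0
        merged[name] = base_weight + scale * delta
    return merged


# Toy example with made-up block names.
base = {"blocks.0.attn": 1.0, "blocks.20.attn": 1.0}
tuned = {"blocks.0.attn": 1.5, "blocks.20.attn": 1.5}
merged = boost_style(base, tuned, style_prefixes=["blocks.20."])
print(merged)
```

In this toy case the first block keeps the finetuned value unchanged, while the block tagged as style moves 4.2 times as far from the base weight.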

Style is pretty consistent now, but there are some notable drawbacks.

Cons:

  • There are significant biases from my limited dataset. For example, you have to push your prompts pretty hard to steer the model away from its default facial features/racial biases. Yes, I have a type. I suspect this won't be a big issue for LoRA training.

  • It struggles with certain artistic terms like silhouette.

  • Microdetail quality is somewhere between SDXL and ZIT. Honestly, it's really good for a 2B model. Two-step upscaling with Anima doesn't help much, but I'm sure the results would be amazing if you sent a Photanima image to a different model for refinement. Or if that's too much work: just add a little film grain. It does wonders and requires no extra VRAM.
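
The film grain suggestion can be sketched in a few lines. This is a minimal, dependency-free illustration rather than any particular tool's implementation: it works on a plain list of RGB tuples, and the `strength` default is a hypothetical value to tune by eye. In practice you would more likely use an image editor or a grain node in your workflow.

```python
import random
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]


def add_film_grain(pixels: List[Pixel], strength: float = 10.0,
                   seed: Optional[int] = None) -> List[Pixel]:
    """Return a copy of `pixels` with monochrome Gaussian grain added.

    `strength` is the noise standard deviation on the 0-255 scale.
    """
    rng = random.Random(seed)
    grained = []
    for r, g, b in pixels:
        # One offset per pixel, shared by all channels, so the grain shifts
        # brightness rather than colour.
        offset = rng.gauss(0.0, strength)
        grained.append(tuple(max(0, min(255, round(c + offset))) for c in (r, g, b)))
    return grained


# Toy example: a flat grey 4x4 patch stored as 16 pixels.
flat = [(128, 128, 128)] * 16
grainy = add_film_grain(flat, seed=0)
```

Because the noise is applied after generation, it costs no VRAM, which matches the point made above.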

🛠️ Recommended Settings

  • 8-10 steps with v1.1 Turbo, or ~12 steps with v1.0 Turbo.

  • Euler sampler.

  • Simple scheduler.

  • CFG 1.

  • Preferred resolution: 832x1216 or 1040x1520.

  • For maximum realism, begin your prompt with real life photo of...
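
As a quick reference, the Turbo settings above can be written down as a small config, along with a check of how the preferred resolutions compare to the ~2MP ceiling mentioned earlier. The dictionary keys are illustrative names only, not fields of ComfyUI or any other API.

```python
# Turbo settings as listed above; key names are illustrative, not a real API.
TURBO_SETTINGS = {
    "steps": 8,  # 8-10 for v1.1 Turbo, ~12 for v1.0 Turbo
    "sampler": "euler",
    "scheduler": "simple",
    "cfg": 1.0,
}

PREFERRED_RESOLUTIONS = [(832, 1216), (1040, 1520)]


def megapixels(width: int, height: int) -> float:
    """Pixel count in millions."""
    return width * height / 1_000_000


for width, height in PREFERRED_RESOLUTIONS:
    print(f"{width}x{height}: {megapixels(width, height):.2f} MP")
# prints:
# 832x1216: 1.01 MP
# 1040x1520: 1.58 MP
```

Both preferred resolutions sit comfortably under 2MP.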

Base model settings:

  • 30-50 steps.

  • Euler sampler.

  • Simple scheduler.

  • CFG 4-6.

  • Use a bunch of fluff tags like masterpiece, score_9, absurdres, best quality, highres, photo \(medium\), real life. Note: do not do this with Turbo.
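
The difference in prompting between the two modes can be captured in a small helper. This is just a sketch: the function name is made up, and placing the quality tags before the description is an assumption, since the list above doesn't say where they should go.

```python
# Quality tags recommended above for the base model only.
BASE_QUALITY_TAGS = [
    "masterpiece", "score_9", "absurdres", "best quality",
    "highres", r"photo \(medium\)", "real life",
]


def build_prompt(description: str, turbo: bool) -> str:
    """Build a prompt following the recommendations for Turbo or the base model."""
    if turbo:
        # Turbo: no fluff tags, just the realism prefix.
        return f"real life photo of {description}"
    # Base model: prepend the quality tags (tag placement is an assumption).
    return ", ".join(BASE_QUALITY_TAGS + [description])


print(build_prompt("a woman hiking at sunrise", turbo=True))
print(build_prompt("a woman hiking at sunrise", turbo=False))
```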

🗺️ Roadmap

I'm pretty excited about the potential of Anima, but let's be clear: I'm not claiming that this checkpoint is a "ZIT killer." The correct model to compare this against is SDXL/IL - and I'm confident that Anima can dethrone it with enough community attention.

Directions I'd like to explore next:

  • There are a handful of Anima "detailer" LoRAs on Civitai. These are not intended for photography, but with enough block pruning, you never know. The right mix could go a long way.

  • I suspect doubling my dataset to ~3k images would make a big difference, especially if I can collect a wider range of faces, body types, and textures.

  • I'm eagerly awaiting the release of Anima Turbo 1.0. The current Turbo solution is based on Preview3 and I think it's holding back this model's potential a little.

  • I'm also looking forward to Anima support in OneTrainer. It will make trying experimental configs a lot less of a hassle compared to kohya-ss. For this v1 run, I stuck with safe values (Prodigy, 1.0 LR, no fancy flags).

Thank you. As always, I look forward to your feedback. Please share the model and upload some images to help it gain traction.

Version Detail

Base model: Anima
Training steps: 27,500
8-10 steps with v1.1 Turbo. Euler sampler. Simple scheduler. CFG 1. Preferred resolution: 832x1216 or 1040x1520. For maximum realism, begin your prompt with "Real life photo of..." Trained on ~1500 images for 27,500 steps. This is my Snakebite 2.3 dataset with around 100 new images and some caption cleanup. Training took approximately 24 hours on a Geforce 3090.

Project Permissions

Model reprinted from: https://civitai.com/models/2645333/photanima

Reprinted models are for communication and learning purposes only, not for commercial use. Original authors can contact us to transfer the models through our Discord channel, #claim-models.
