Kirazuri (Anima) - v2.0

Author: motimalu

CHECKPOINT

Updated: Jul 10, 2026 12:09 AM

Version 4.0 (Latest)

For in-depth details of training and tooling, see:

Training Details Summary

Trainer: diffusion-pipe

Training device: NVIDIA RTX PRO 6000 Blackwell Max-Q Workstation Edition

Total training time: ~10 days

Total samples seen (unbatched steps): ~3,000,000

Training resolutions:

512^2
768^2
1024^2
1536^2

Stage 1

Samples seen (unbatched steps): ~2,000,000
Training time: ~125 hrs
Learning Rate: 6e-6
Learning Rate Scheduler: Cosine
LLM Adaptor Learning Rate: 8e-7
Precision: Mixed BF16
Optimizer: AdamW8bit with Kahan Summation
Weight Decay: 0.01
Timestep Sampling Strategy: Logit-Normal
Training Resolutions: 512^2, 768^2, 1024^2
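
As a rough illustration of the Stage 1 schedule, the sketch below computes a cosine-decayed learning rate starting from 6e-6 over ~2,000,000 samples. The decay floor of zero, the absence of warmup, and the helper name cosine_lr are assumptions made for illustration; the exact scheduler settings used in diffusion-pipe are not stated here.

```python
import math

def cosine_lr(samples_seen, total_samples, peak_lr, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr (hypothetical helper, no warmup)."""
    # Clamp progress to [0, 1] so values outside the run stay well-defined.
    progress = min(max(samples_seen / total_samples, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

STAGE1_SAMPLES = 2_000_000  # from the Stage 1 summary
STAGE1_PEAK_LR = 6e-6       # from the Stage 1 summary

for seen in (0, 500_000, 1_000_000, 1_500_000, 2_000_000):
    lr = cosine_lr(seen, STAGE1_SAMPLES, STAGE1_PEAK_LR)
    print(f"{seen:>9,} samples -> lr {lr:.2e}")
```

Under these assumptions the rate is halved at the midpoint of the stage and approaches zero at the end.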

Stage 2

Samples seen (unbatched steps): ~1,000,000
Training time: ~84 hrs
Learning Rate: 2e-6
Learning Rate Scheduler: Cosine
LLM Adaptor Learning Rate: 2e-7
Precision: Mixed BF16
Optimizer: AdamW8bit with Kahan Summation
Weight Decay: 0.01
Timestep Sampling Strategy: Logit-Normal
Training Resolutions: 512^2, 1024^2, 1536^2
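
Both stages list Logit-Normal timestep sampling. A minimal sketch of the idea: draw a normal sample and pass it through a sigmoid, which concentrates timesteps around the middle of the (0, 1) range. The location and scale values (0.0 and 1.0) and the function name are assumptions, since the parameters used for this model are not given.

```python
import math
import random

def sample_logit_normal_timestep(rng, mean=0.0, std=1.0):
    """Return t in (0, 1) with logit(t) ~ Normal(mean, std). Parameters are assumed."""
    z = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
timesteps = [sample_logit_normal_timestep(rng) for _ in range(10_000)]

# Share of samples falling in the middle half of the range.
middle_share = sum(0.25 <= t <= 0.75 for t in timesteps) / len(timesteps)
print(f"mean t = {sum(timesteps) / len(timesteps):.3f}, share in [0.25, 0.75] = {middle_share:.2f}")
```

A uniform sampler would put about half of the timesteps in the middle half of the range; the logit-normal sampler puts noticeably more there.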

Additional Features

Masked Training
Tag Dropout: 30% with protected first 8 tags
Tag Shuffle: Applied to the trailing unprotected tags
Natural Language: Short and Long Caption variants

Changes from Kirazuri (Anima) v3.0

Dataset includes 2,450 recently curated images, increasing the total size from 42,608 to 45,058 images
Dataset cutoff is now 29/06/2026
Introduced Masked Training for images with simple backgrounds
Updated tags+caption variants structure
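
Masked training generally means weighting the loss so that some regions contribute less. A minimal sketch, assuming a binary mask where 1 marks the subject and 0 marks the simple background, and a mean squared error averaged over unmasked positions only. How the masks were produced and how diffusion-pipe applies them is not described here, so masked_mse is hypothetical.

```python
def masked_mse(pred, target, mask):
    """Mean squared error over positions where mask is non-zero (assumed scheme)."""
    weighted_error = 0.0
    mask_total = 0.0
    for p, t, m in zip(pred, target, mask):
        weighted_error += m * (p - t) ** 2
        mask_total += m
    # Return 0.0 for a fully masked sample to avoid dividing by zero.
    return weighted_error / mask_total if mask_total > 0 else 0.0

pred = [1.0, 2.0, 3.0, 4.0]
target = [0.0, 0.0, 0.0, 0.0]
subject_mask = [0, 1, 1, 0]  # only the middle two positions count

loss = masked_mse(pred, target, subject_mask)
print(loss)
```

With this mask, errors at the first and last positions are ignored, so a flat background cannot dominate the loss.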

Recognitions

Thanks to Circlestone Labs for the Anima Preview base model.
Thanks to tdrussell of Circlestone Labs for the diffusion-pipe trainer.
Thanks to bluvoll for support using their fork of diffusion-pipe.
Thanks to narugo1992 and the deepghs team for open-sourcing various training sets, image processing tools, and models.

License

This model is released under the same license as the base model.

See the base model for details of the CircleStone Labs Non-Commercial License.

Version Detail

Uploaded

May 28, 2026 4:33 AM

Base Model

Anima

Description

Version 2

A full finetune of the Anima preview3-base, predominantly trained on high-resolution 1536x1536 AR buckets. The dataset was expanded with more recent data and now includes the full dataset used for my previous model, Kirazuri Lazuli (Noobai V-Pred). The total training dataset is 35,537 non-synthetic, manually curated images with quality and aesthetic ratings, and a dataset cutoff of 2026/04/15.

Training Details

Main training with diffusion-pipe commit: d5b78a2c49a07db8f7d9a4c795e4cfe7ba1c3dfe
Final stage for high-res used fix in commit: b0aa4f1e03169f3280c8518d37570a448420f8be
Samples seen (unbatched steps): ~680,000
Training time: ~220 hrs
Learning Rate: 4e-6 (General Training) and 2e-6 (Aesthetic)
LLM Adaptor Learning Rate: 8e-7 (General Training) and 2e-7 (Aesthetic)
Per-resolution Effective Batch size: 128 (512p), 96 (1024p), and 48 (1536p)
Precision: Mixed BF16
Optimizer: AdamW8bit with Kahan Summation
Weight Decay: 0.01
Timestep Sampling Strategy: Logit-Normal (General Training)
Tag Dropout: 30% with protected first 8 tags

Additional Features used

Structured dataset by resolutions and manual ratings for staged training
multiscale_loss_weight=0.5 and flux_shift=true for high-resolution training
Mixed Natural Language captions with diffusion-pipe captions.json format:

"image_1.jpg": [
  "{tags}",
  "{first_n_tags}.\n{nl_caption}",
  "{dropout_tags1}.\n{nl_caption}",
  "{nl_caption}\n{dropout_tags2}"
]
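
To make the captions.json layout concrete, the sketch below assembles the four caption variants for one image and prints them as JSON. It assumes tags are joined with ", ", that first_n_tags means the first 8 tags, and that dropout_tags1 and dropout_tags2 are two independent 30% dropout passes over the unprotected tags. build_caption_variants, the placeholder tags, and the sample caption are hypothetical.

```python
import json
import random

def build_caption_variants(tags, nl_caption, rng, first_n=8, dropout=0.3):
    """Return the four caption variants in the order of the captions.json template."""
    def dropout_tags():
        # Keep the first `first_n` tags; drop each remaining tag with probability `dropout`.
        kept = [t for t in tags[first_n:] if rng.random() >= dropout]
        return ", ".join(tags[:first_n] + kept)

    return [
        ", ".join(tags),                                  # "{tags}"
        ", ".join(tags[:first_n]) + ".\n" + nl_caption,   # "{first_n_tags}.\n{nl_caption}"
        dropout_tags() + ".\n" + nl_caption,              # "{dropout_tags1}.\n{nl_caption}"
        nl_caption + "\n" + dropout_tags(),               # "{nl_caption}\n{dropout_tags2}"
    ]

example_tags = [f"tag_{i:02d}" for i in range(12)]  # placeholder tag names
example_caption = "A short natural language caption."
captions = {
    "image_1.jpg": build_caption_variants(example_tags, example_caption, random.Random(0)),
}
print(json.dumps(captions, indent=2))
```

Listing several variants per image lets the trainer see tag-only, mixed, and caption-first forms of the same sample.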

Project Permissions

Use Permissions

Use in TENSOR Online
As a online training base model on TENSOR
Use without crediting me
Share merges of this model
Use different permissions on merges

Commercial Use

Sell generated contents
Use on generation services
Sell this model or merges