ZootVision - Eta

What is this?

I would describe it like so: an abnormally versatile SD 1.5 model with extensive custom training done exclusively at 1024px and higher (thanks to "bucketing"). Built up in a clean, additive, iterative fashion on an ongoing basis thanks to CivitAI's handy online Lora trainer. It can do everything from pretty landscapes to ******** booru-tag based NSFW in pretty much any style. It is not specifically an anime, realistic, or semirealistic checkpoint; rather, it is whichever of those you want it to be at any given time. All showcase images are direct generations made without any use of detailing or upscaling, and include full metadata.

How do I use it?

You can use either natural language or booru tags (with spaces, not underscores). I tend to use both simultaneously: mostly coherent sentences, but with many of the words and phrases being specific tags that actually exist. See the showcase gallery for a variety of examples. In terms of resolution, in my opinion it is completely pointless to ever go lower than 768x768 with this model (100% of my training is done at 1024px, without downscaling or cropping anything).

Personally, I never generate lower than 1024x768 or 768x1024 with this, and for non-square-format images I more often use 1216x832 and 832x1216. For square format I stick to 1024x1024. Again, you can download my showcase images at their original resolution with full metadata to get a better idea of what this thing can do, as it is also trained on some less common "exotic" aspect ratios / resolutions.
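
If you're running the checkpoint locally through the diffusers library rather than a web UI, a minimal generation script at one of the resolutions recommended above might look like the sketch below. The checkpoint filename, prompt, and sampler settings are placeholders of my own, not official values; only the resolution choices and the mixed natural-language / booru-tag prompting style come from the notes above.

import torch
from diffusers import StableDiffusionPipeline

# Load the downloaded checkpoint straight from its .safetensors file
# (filename is a placeholder; the VAE is already baked into the model).
pipe = StableDiffusionPipeline.from_single_file(
    "zootvision.safetensors",
    torch_dtype=torch.float16,
).to("cuda")

# Mixed natural-language + booru-tag prompt, as described above.
prompt = (
    "a quiet mountain lake at sunrise, scenery, detailed background, "
    "no humans, soft morning light"
)
negative_prompt = "worst quality, low quality, simple background"

# Recommended resolutions: 1024x1024 for square images, 1024x768 / 768x1024,
# or 1216x832 / 832x1216 for non-square images.
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=1216,
    height=832,
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]

image.save("zootvision_sample.png")

The same width/height values apply if you generate in a web UI instead; only the way you enter them differs.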

Also note that if you're prompting for 2D-style images, this model DOES recognize a large selection of "by whoever" artist tags (some stronger than others), so if there's one you have in mind just try it.

Do masterpiece, best quality, high quality, worst quality, and so on exist in this model?

Yes, but their impact on the image is much smaller if your overall prompt is for realism or semirealism; they have the most noticeable impact on 2D-style images specifically. However, detailed background and simple background DO both have the impact you'd expect on all types of images, generally speaking.
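
As an illustration (a prompt of my own construction, not one of the showcase examples), a 2D-style prompt might look like:

masterpiece, best quality, 1girl, long hair, school uniform, sitting on a park bench under falling autumn leaves, detailed background
Negative prompt: worst quality, low quality, simple background

whereas a realism-oriented prompt can lean on descriptive language instead, since (as noted above) the quality tags matter much less there.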

V6.5 Zeta Plus Details:

It's not quite what ZootVision V7 Eta is intended to be yet, but it makes some nice, perhaps subtle, improvements. This time I tried to stress the actual depth of the model a bit more in the showcase gallery images. VAE is baked in as always.

V5.0 Epsilon Details:

Trained for an additional 10,000 steps on a variety of subjects (all of photorealism, NSFW, and anime have been at least somewhat refined) against v4.0 Delta. This version also introduces an Ideogram style dataset, which can be triggered by using 'by ideogram' in any prompt. See the showcase gallery for some examples. I think this is a pretty solid improvement over Delta, hope you enjoy it! VAE is baked in as always.
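
As an illustration of the trigger (a hypothetical prompt of my own, not one from the showcase gallery): by ideogram, a minimalist flat-color poster of a lighthouse at night, bold shapes, limited palette.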

V4.0 Delta Details:

Two additional datasets merged in (one for further enhancement of photographic images of people and places, one for some experimental "tricky prompt" rich captioning stuff), both trained on V3.0 Gamma for a combined total of 9040 steps. VAE is baked in as always. All data in the new photographic dataset was tagged with photo \(medium\) in order to build on top of the model's existing understanding of that tag. This is definitely the best version yet, hope you enjoy it!

V3.0 Gamma Details:

1000-image "aesthetic" dataset (trained for 10,000 steps on V2.0 Beta) merged in. This dataset can be optionally strengthened by using the phrase very aesthetic anywhere in your prompt. This version has a VAE already baked in, as always.

V2.0 Beta Details:

Merged with 1000-image "NSFW Enhancer" dataset (trained for 10,000 steps on V1.0 Alpha). All images were at least 1024px on at least one side, up to a maximum of 1216 (for XL-style 832x1216 portrait / 1216x832 landscape images, of which there were a fair number).

V1.0 Alpha details:

My (incomplete) attempt at a truly general-purpose high-resolution-focused SD 1.5 model, in the sense of anything from pretty landscapes to ******** booru-tag based NSFW ****.

Uploading to CivitAI in the current state basically for the **** purpose of using their Lora trainer for a few more 1000-image datasets I need to get trained and merged into this thing. Feel free to try it out regardless if you like (it knows many characters; see e.g. Jinx in the showcase), but expect somewhat different results from later / final versions.

General (always relevant) details:

DO NOT blindly assume that Clip Skip 2 is always "correct" with this model; it is not really traditionally NAI-derived at all. Really, I'd recommend trying both Clip Skip 1 and 2 if you've found a particular seed that you mostly like but that isn't quite "there" for a given prompt, as in my testing both give good results under different circumstances.
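
In A1111-style web UIs, Clip Skip is a settings slider; if you're using diffusers instead, recent versions expose it as a clip_skip argument on the pipeline call. A quick way to compare the two settings on a fixed seed might look like the sketch below (pipeline setup as in the earlier snippet; the seed and prompt are placeholders). Note the convention difference as I understand it: diffusers' clip_skip=1 roughly corresponds to "Clip Skip 2" in A1111, while leaving it unset corresponds to "Clip Skip 1".

import torch

# Same fixed seed for both runs so only the clip-skip setting changes.
SEED = 1234  # placeholder seed

for label, skip in (("clip_skip_1", None), ("clip_skip_2", 1)):
    image = pipe(
        "1girl, silver hair, night city street, rain, detailed background",
        width=832,
        height=1216,
        clip_skip=skip,  # None = default (last layer), 1 = skip one layer
        generator=torch.Generator("cuda").manual_seed(SEED),
    ).images[0]
    image.save(f"{label}.png")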

Version Detail

Base model: SD 1.5

Project Permissions

    Use Permissions

  • Use in TENSOR Online

  • As an online training base model on TENSOR

  • Use without crediting me

  • Share merges of this model

  • Use different permissions on merges

    Commercial Use

  • Sell generated contents

  • Use on generation services

  • Sell this model or merges
