Clarity XL

CHECKPOINT

Note: Clarity XL is currently in BETA.

Fine-Tuning is still in progress.

v2 is planned for release in mid-June, once the current tuning process wraps up!


Questions/Feedback/Updates

Visit my thread on the Unstable Diffusion Discord

Buy me a coffee ❤

https://ko-fi.com/ndimensional

All donations will be used to fund the creation of new Stable Diffusion fine-tunes and open-source AI tools.


About

Continuing from the original Clarity model for SD1.5, Clarity XL is an attempt to recreate and expand the original model's capabilities within the more complex architecture of SDXL.

Differences between Clarity SD1.5 and Clarity XL

Currently, Clarity XL focuses purely on photorealism. This is intentional: it builds the foundation that future releases will expand upon. That's not to say Clarity XL will ever be a general-purpose model. It will always have a bias towards photorealism. Future releases will add more complex photorealistic/cinematic scene capabilities.

Improvements

  • Emphasis on authentic (not touched-up) photorealism.

  • Higher image fidelity.

  • Prompt adherence: the model follows your prompt more closely.

    • This excludes concepts that the model was not trained on.

  • Improved skin textures

  • Overall improvements to aesthetics.

  • Video Game / Movie character recognition.

    • Including world spaces, landscapes, settings, etc.

  • Prompt how you want: Accepts natural language prompts, comma-delimited lists, or a hybrid of the two. In addition, prompts can be as short or as long as you'd like.
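
As a rough illustration of the three prompting styles mentioned above, here is a small Python sketch. The helper names and the example prompt text are made up for this illustration; they are not taken from the training captions or from any tool shipped with the model.

```python
def comma_prompt(tags):
    """Join non-empty tags into a comma-delimited prompt."""
    return ", ".join(t.strip() for t in tags if t.strip())


def hybrid_prompt(sentence, tags):
    """Natural-language sentence followed by comma-delimited tags."""
    head = sentence.strip().rstrip(".")
    tail = comma_prompt(tags)
    return f"{head}, {tail}" if tail else head


# Hypothetical example prompts, one per style.
natural = "A candid photo of an elderly fisherman mending a net at dawn."
listed = comma_prompt(["elderly fisherman", "mending a net", "dawn", "candid photo"])
hybrid = hybrid_prompt(natural, ["35mm film", "soft morning light"])
```

All three strings describe the same scene; which style works best for a given image is something to test for yourself.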

Limitations

  • Complex scenes, such as firing lightning bolts from a hand that erupt in a cacophony of bright blue sparkling arcs.

  • Multi-medium generation: The model is currently grounded in photorealism and cinematography.

Model Details

  • Base Model: Stable Diffusion XL v1.0

    • Because Clarity XL v1 is a mid-training epoch, I merged the epoch with an unreleased fine-tune update for LomoXL. I used a modified version of the DARE merging method to preserve the original weight matrix of the base epoch. This will not be needed in later releases.

  • Data: Quality was a priority when creating the dataset. All image-caption pairs were cleaned over multiple iterations to ensure only high-quality data was used for tuning.

    • Captions: Captions were written by my MLLM captioning system, verified via GroundingDINO + a Reasoning Engine + NLP

      • Captions were written in a natural language format, though SDXL's text encoders make it possible to write prompts in multiple styles.

  • VAE: sdxl-vae-fp16-fix

  • Aspect Ratio: Based on the training data, any of the typical aspect ratios for SDXL will work.

    • 1344x768 (16:9) — Cinematic Film Stills

    • 1536x640 (21:9) — Ultrawide Cinematic Film Stills

    • 1152x896 (4:3) — Fullscreen

    • 1216x832 (3:2) — Mobile landscape

    • 1024x1024 (1:1) — Square

    • 1024x704 (16:11)

    • 768x1344 (9:16) — Tall (Instagram stories / Snapchat)

    • 896x1152 (3:4)

    • 832x1216 (2:3) — Mobile Portrait

    • 704x1024 (11:16)
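
If you start from an arbitrary target size, one way to pick from the list above is to choose the listed resolution with the nearest aspect ratio. The following is a minimal Python sketch of that idea; the function name `closest_resolution` and the log-ratio distance are illustrative choices, not part of the model or any official tooling.

```python
import math

# Resolutions listed on this model card, as (width, height).
RESOLUTIONS = [
    (1344, 768), (1536, 640), (1152, 896), (1216, 832), (1024, 1024),
    (1024, 704), (768, 1344), (896, 1152), (832, 1216), (704, 1024),
]


def closest_resolution(width, height):
    """Return the listed (width, height) whose aspect ratio is nearest the request.

    Ratios are compared in log space so that landscape and portrait
    deviations are treated symmetrically.
    """
    target = math.log(width / height)
    return min(RESOLUTIONS, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


# Example: map a 1920x1080 target onto a listed size before generating.
w, h = closest_resolution(1920, 1080)
```

The returned width and height can then be passed to whatever generation front end you use.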

🤗Huggingface Repo


Changelog

5/23/24 Clarity XL v1.0:

  • Initial Release


Check out my other models

SDXL

LoRA SDXL

SD1.5

LoRA SD1.5

https://tensor.art/u/738915824668190781

Version Detail

SDXL 1.0
Initial Release

Project Permissions

    Use Permissions

  • Use in TENSOR Online

  • As an online training base model on TENSOR

  • Use without crediting me

  • Share merges of this model

  • Use different permissions on merges

    Commercial Use

  • Sell generated contents

  • Use on generation services

  • Sell this model or merges
