Clarity XL

Name: Clarity XL - v1.0
Rating: 0 (0 reviews)
Author: nDimensional

CHECKPOINT

Original

Updated: Sep 8, 2024 11:41 AM

Run39

Note: Clarity XL is currently in BETA.

Fine-Tuning is still in progress.

v2 is planned to release mid June once the current tuning process wraps up!

Questions/Feedback/Updates

Visit my thread on the Unstable Diffusion Discord

Buy me a coffee ❤

https://ko-fi.com/ndimensional

All donations will be used to fund the creation of new Stable Diffusion fine-tunes and open-source AI tools.

About

Continuing from the original Clarity model for SD1.5, Clarity XL is an attempt to recreate and expand the original model's capabilities within the more complex architecture of SDXL.

Differences between Clarity SD1.5 and Clarity XL

Currently, Clarity XL focuses purely on photorealism. This is intentional: it builds the foundation that future releases will expand upon. That's not to say Clarity XL will ever be a general-purpose model. It will always have a bias towards photorealism. Future releases will add more complex photorealistic/cinematic scene capabilities.

Improvements

Emphasis on authentic (non touched-up) photorealism.
Higher Image Fidelity.
Prompt Adherence: How well the model follows your prompt.
- Excluding concepts that the model was not trained on
Improved skin textures
Overall improvements to aesthetics.
Video Game / Movie character recognition.
- Including world spaces, landscapes, settings, etc.
Prompt how you want: Accepts natural language prompts, comma-delimited lists, or a hybrid of the two. In addition, prompts can be as short or as long as you'd like.

Limitations

Complex scenes, such as firing lightning bolts from a hand that erupt in a cacophony of bright blue sparkling arcs.
Multi-medium generation: The model is currently grounded in photorealism and cinematography.

Model Details

Base Model: Stable Diffusion XL v1.0
- Since Clarity XL v1 is a mid-training epoch, I merged the epoch with an unreleased fine-tune update for LomoXL. I used a modified version of the DARE merging method to preserve the original weight matrix of the base epoch. This will not be needed in later releases.
Data: Quality was a priority when creating the dataset. All image-caption pairs were cleansed through multiple iterations to ensure only high quality data was used for tuning.
- Captions: Captions were written by my MLLM captioning system and verified via GroundingDINO, a reasoning engine, and NLP.
  - Captions were written in a natural language format, though SDXL's text encoders make it possible to write prompts in multiple prompting styles.
VAE: sdxl-vae-fp16-fix
Aspect Ratio: Based on the training data, any of the typical aspect ratios for SDXL will work.
- 1344x768 (16:9) — Cinematic Film Stills
- 1536x640 (21:9) — Ultrawide Cinematic Film Stills
- 1152x896 (4:3) — Fullscreen
- 1216x832 (3:2) — Mobile landscape
- 1024x1024 (1:1) — Square
- 1024x704 (16:11)
- 768x1344 (9:16) — Tall (Instagram stories / snapchat)
- 896x1152 (3:4)
- 832x1216 (2:3) — Mobile Portrait
- 704x1024 (11:16)
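
If your target size is not one of the resolutions listed above, one option is to snap it to the nearest listed one before generating. The snippet below is a minimal sketch of that idea, not part of Clarity XL or any particular UI. The helper name closest_resolution and the choice to compare aspect ratios on a log scale (so that landscape and portrait are treated symmetrically) are my own assumptions for illustration.

```python
import math

# Resolutions from the list above, as (width, height) in pixels.
SDXL_RESOLUTIONS = [
    (1344, 768), (1536, 640), (1152, 896), (1216, 832), (1024, 1024),
    (1024, 704), (768, 1344), (896, 1152), (832, 1216), (704, 1024),
]


def closest_resolution(target_width, target_height, resolutions=SDXL_RESOLUTIONS):
    """Return the listed (width, height) whose aspect ratio is nearest the target.

    Hypothetical helper: ratios are compared in log space so that, for example,
    2:1 and 1:2 are equally far from 1:1.
    """
    if target_width <= 0 or target_height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(target_width / target_height)
    return min(
        resolutions,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )


if __name__ == "__main__":
    # A 1920x1080 target snaps to the 16:9 entry in the list.
    print(closest_resolution(1920, 1080))
```

Generate at the returned size, then crop or upscale to the exact target afterwards if needed.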