Are score_tags necessary in PDXL/SDXL Pony Models? | Halloween2024


Updated:

The consensus is that the latest generation of Pony SDXL models no longer requires "score_9 score_8 score_7" written in the prompt to "look good".

//----//

It is possible to visualize our actual CLIP_L input to the SD model (a 1x768 tensor) as a 16x16 grid where each cell holds an RGB value, since 16 x 16 x 3 = 768.
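A minimal sketch of that reshaping, in plain Python. The function name `encoding_to_rgb_grid`, the min-max scaling to 0..255, and the choice to treat every three consecutive values as one cell are assumptions made for illustration; the notebook may lay the values out differently.

```python
def encoding_to_rgb_grid(encoding, side=16):
    """Map a flat encoding of side*side*3 values to a grid of (R, G, B) tuples."""
    expected = side * side * 3
    if len(encoding) != expected:
        raise ValueError(f"expected {expected} values, got {len(encoding)}")
    lo, hi = min(encoding), max(encoding)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant vector
    # Assumed normalization: min-max scale every value into 0..255.
    scaled = [round(255 * (v - lo) / span) for v in encoding]
    grid = []
    for row in range(side):
        cells = []
        for col in range(side):
            start = (row * side + col) * 3  # three consecutive values per cell
            cells.append(tuple(scaled[start:start + 3]))
        grid.append(cells)
    return grid


# Stand-in data: a real run would pass the 768 floats from the text encoder.
fake_encoding = [i / 767 for i in range(768)]
grid = encoding_to_rgb_grid(fake_encoding)
print(len(grid), "rows x", len(grid[0]), "columns")
```

The resulting grid can be handed to any image library to render the colored squares.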

I'll assume CLIP_G in the SDXL model can be ignored here. It's assumed that CLIP_G is functionally the same, just with 1280 dimensions instead of 768.

So here we have the prompt: "score_9 score_8_up score_8_up"

Then I can do the same for the prompt: "score_9 score_8_up score_8_up" + X

Where X is some random, extremely sus prompt I fetch from my gallery. Assume it fills the full 77 tokens (I set truncate=True on the tokenizer, so anything past the 77-token limit just gets cut off).
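To make the 77-token cap concrete, here is a rough sketch of what truncation does to a list of token ids. It assumes the usual CLIP layout: a start token (id 49406), the content tokens, an end token (id 49407), and padding with the end token up to 77 slots, which leaves room for 75 content tokens. The helper `fit_to_context` is hypothetical and the content ids below are placeholders, not real tokenizer output.

```python
BOS_ID = 49406  # <|startoftext|> in the CLIP vocabulary
EOS_ID = 49407  # <|endoftext|>, also assumed here as the padding token
MAX_LENGTH = 77


def fit_to_context(content_ids, max_length=MAX_LENGTH):
    """Wrap content token ids in BOS/EOS, truncating or padding to max_length."""
    room = max_length - 2  # two slots are taken by BOS and EOS
    kept = list(content_ids[:room])  # everything past the limit is dropped
    ids = [BOS_ID] + kept + [EOS_ID]
    ids += [EOS_ID] * (max_length - len(ids))  # pad short prompts
    return ids


long_prompt = fit_to_context(range(100))  # placeholder ids 0..99
short_prompt = fit_to_context([1, 2, 3])
print(len(long_prompt), len(short_prompt))
```

Under these assumptions, a long X pushes nothing out of the first slots, so the score tags at the start of the prompt always survive truncation.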

Examples:

etc. etc.

Granted, the first three tokens in the prompt greatly influence the "theme" of the output for the 768 encoding.

But from the images above one can see that the "appearance" of the text encoding can vary a lot.

Thus, the "best" way to write a prompt is rarely universal.

Here I'm running some random text I wrote myself to check its similarity to our "score prompt" (the top result should be 100%, so I might have some rounding error):

score_6 score_7_up score_8_up : 98.03%

score 8578 : 85.42%

highscore : 82.87%

beautiful : 77.09%

score boobs score : 73.16%

SCORE : 80.1%

score score score : 83.87%

score 1 score  2  score 3 : 87.64%

score : 80.1%

score up score : 88.45%

score  123 score down : 84.62%

So even though the model is trained for "score_6 score_7_up score_8_up", we can be kinda loose in how we phrase it, if we want to include it at all.
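For reference, percentages like the ones above can be produced by a cosine similarity between two encoding vectors. The sketch below assumes that this is the metric in use; the notebook's exact formula and rounding are not shown here. The vectors are toy values, not real CLIP encodings, and `similarity_percent` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarity_percent(a, b, digits=2):
    """Cosine similarity expressed as a rounded percentage."""
    return round(100 * cosine_similarity(a, b), digits)


reference = [1.0, 2.0, 3.0]   # stand-in for the "score prompt" encoding
candidate = [3.0, 2.0, 1.0]   # stand-in for some other prompt's encoding
print(similarity_percent(reference, reference))
print(similarity_percent(reference, candidate))
```

Comparing an encoding with itself should land on 100% up to floating-point and rounding error, which is the sanity check mentioned earlier.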

The same principle applies to all LoRAs and their activation keywords.

Negatives are special. The text we write in the negatives is split by whitespace, and the chunks are encoded individually.

Link to the notebook if you want to run your own tests:

https://huggingface.co/datasets/codeShare/fusion-t2i-generator-data/blob/main/Google%20Colab%20Jupyter%20Notebooks/fusion_t2i_CLIP_interrogator.ipynb

I use this thing to look up prompt words using the CLIP_L model.

//---//

These are the most similar items to the Pony model "score prompt" within my text corpus.

Items of zero similarity (perpendicular) or negative similarity (vector pointing in the opposite direction) to the encoding are omitted from these results.

Note that these are encodings similar to the "score prompt" trigger encoding, not an analysis of what the Pony model considers good quality.

Prompt phrases in my text corpus most similar to "score_9 score_8_up score_8_up" according to CLIP (the peak of the graph above):

    Community: sfa_polyfic  - 68.3 %
    holding blood ephemeral dream  - 68.3 %
    Excell  - 68.3 %
    supacrikeydave  - 68.3 %
    Score | Matthew Caruso  - 67.8 %
    freckles on face and body HeadpatPOV  - 67.8 %
    Kazuno Sarah/Kunikida Hanamaru  - 67.8 %
    iers-kraken lun   - 67.8 %
    blob whichever blanchett   - 67.6 %
    Gideon Royal  - 67.6 %
    Antok/Lotor/Regris (Voltron)  - 67.6 %
    Pauldron  - 66.7 %
    nsfw blush Raven   - 66.7 %
    Episode: s08e09 Enemies Domestic  - 66.7 %
    John Steinbeck/Tanizaki Junichirou (Bungou Stray Dogs)  - 66.7 %
    populism probiotics airspace shifter   - 65.4 %
    Sole Survivor & X6-88  - 65.4 %
    Corgi BB-8 (Star Wars)  - 65.4 %
    Quatre Raberba Winner/Undisclosed  - 65.2 %
    resembling a miniature fireworks display with a green haze.  Precision Shoot  - 65.2 %
    bracelet grey skin  - 65.2 %
    Reborn/Doctor Shamal (Katekyou Hitman Reborn!)/Original Male Character(s)  - 65.2 %
    James/Madison Li  - 65.1 %
    Feral Mumintrollet | Moomintroll  - 65.1 %
    wafc ccu linkin   - 65.1 %
    Christopher Mills  - 65.0 %
    at Overcast  - 65.0 %
    Kairi & Naminé (Kingdom Hearts)  - 65.0 %
    with magical symbols glowing in the air around her. The atmosphere is charged with magic Ghost white short kimono  - 65.0 %
    The ice age is coming  - 65.0 %
    Jonathan Reid & Bigby Wolf  - 65.0 %
    blue doe eyes cortical column  - 65.0 %
    Leshawna/Harold Norbert Cheever Doris McGrady V  - 65.0 %
    foxtv matchups panna   - 65.0 %
    Din Djarin & Migs Mayfeld & Grogu | Baby Yoda  - 65.0 %
    Epilogue jumps ahead  - 65.0 %
    nico sensopi  - 64.8 %
    秦风 - Character  - 64.8 %
    Caradoc Dearborn  - 64.8 %
    caribbean island processing highly detailed by wlop  - 64.8 %
    Tim Drake's Parents  - 64.7 %
    probiotics hardworkpaysoff onstorm allez   - 64.7 %
    Corpul | Coirpre  - 64.7 %
    Cantar de Flor y Espinas (Web Series)  - 64.7 %
    populist dialog biographical   - 64.7 %
    uf!papyrus/reader  - 64.7 %
    Imrah of Legann & Roald II of Conte  - 64.6 %
    d brown legwear  - 64.6 %
    Urey Rockbell  - 64.6 %
    bass_clef   - 64.6 %
    Royal Links AU  - 64.6 %
    sunlight glinting off metal ghost town  - 64.6 %
    Cross Marian/Undisclosed  - 64.6 %
    ccu monoxide thcentury   - 64.5 %
    Dimitri Alexandre Blaiddyd & Summoner | Eclat | Kiran  - 64.5 %
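The ranking and filtering behind a list like this can be sketched as follows, assuming cosine similarity as the metric and a corpus stored as a mapping from text to encoding vector. The function `rank_by_similarity`, the toy corpus, and the one-decimal rounding are illustrative assumptions, not the notebook's actual code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(reference, corpus, top_k=5):
    """Return (text, percent) pairs, best match first.

    Items with zero or negative similarity are dropped, as in the list above.
    """
    scored = []
    for text, vector in corpus.items():
        sim = cosine_similarity(reference, vector)
        if sim > 0:
            scored.append((text, round(100 * sim, 1)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 2-D vectors standing in for 768-dimensional CLIP_L encodings.
reference = [1.0, 0.0]
corpus = {
    "partial": [1.0, 1.0],
    "perpendicular": [0.0, 3.0],
    "aligned": [2.0, 0.0],
    "opposite": [-1.0, 0.0],
}
results = rank_by_similarity(reference, corpus)
for text, percent in results:
    print(f"{text}  - {percent} %")
```

Only "aligned" and "partial" survive the filter here, since the other two entries are perpendicular or opposite to the reference.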