PictureT

🇬🇧 🇨🇵 🇪🇸 Mostly in 𝑷𝒊𝒙𝒊𝒗 🌐 https://picture-t.com/
729 Followers · 175 Following · 128.1K Runs · 144 Downloads · 13.2K Likes · 3.2K Stars

Models

[Cosmic Horrors] Eris Etolia-Illustrious v2.0 🇦 | LORA Illustrious (Updated, Exclusive) | 21K · 202 | 880654075832609508
[Cosmic Horrors] Nightmare-Illustrious v2.0 | LORA Illustrious (Updated, Exclusive) | 2.1K · 71
Nightmare - LyCORIS-10 | LYCORIS SDXL 1.0 (Updated) | 98 · 6 | 878189597828319769
sblycor-v1.0 | LYCORIS FLUX.1 | 65 · 4 | 875808559227480849
Stellar Blade - 스텔라 블레이드 [ILLUSTRIOUS v2.0]-Project: EVE | LORA Illustrious (Exclusive) | 888 · 16
Stellar Blade - 스텔라 블레이드 [FLUX.1 DEV]-Project: Eve | LORA FLUX.1 (Exclusive) | 169 · 9
SEENA - 𝗖𝗢𝗟𝗢𝗥𝗙𝗨𝗟 iGame-Illustrious v2.0 | LORA Illustrious (Exclusive) | 1.3K · 28
[T2V] Glitch-futurist Nostalgia-v1.0 | VIDEO WAN_2_1_14B | 67 · 45
rrr-2025-05-14 18:43:05 | VIDEO HunyuanVideo | 9 · 7 | 863526087706062843
[Gumroad] Helena Douglas x Illustrious v2.0-852132855877214476 | LORA Illustrious (Exclusive) | 2.6K · 12
LyCORIS 🛠️ Hotfix for [FLUX.1 DEV]-☣️ ALT alpha v1.1 | LYCORIS FLUX.1 (Exclusive) | 2.4K · 44 | 843529833456773245
HED☽NICA™ — Money Shot-𝗥𝗔𝗡𝗞𝟰 v0.70 | LORA FLUX.1 (Exclusive) | 526 · 21 | 844945699490674794
HED☽NICA™ — Expressive-v1.20 | LORA FLUX.1 (Exclusive) | 428 · 20 | 833908080216549157
DoA • デッド オア アライブ [SDXL]-Helena 𝗥𝗔𝗪 | LORA SDXL 1.0 (Exclusive) | 2.4K · 10 | 838130342306130276
DoA • デッド オア アライブ [FLUX.1 DEV]-Helena Douglas | LORA FLUX.1 (Exclusive) | 4.5K · 20 | 831104755062334350
Final Fantasy VII Remake [Illustrious]-Tifa 𝗥𝗔𝗪 | LORA Illustrious (Exclusive) | 2K · 51 | 830107236023088858
Final Fantasy VII Remake [FLUX.1 DEV]-Tifa Lockhart | LORA FLUX.1 (Exclusive) | 539 · 22 | 828782187072493896
Western Fine Art [Illustrious v0.1]-v10 | LORA Illustrious (Exclusive) | 110 · 6 | 808053558210506634
Fine Art Photography-Space Design | LORA FLUX.1 (Exclusive) | 252 · 23 | 807500529635739355
Western Fine Art [Flux.1 D]-🎅 Merry Christmas | LORA FLUX.1 | 935 · 27 | 807156949427910276

Articles

What is the 75 prompt limit in Stable Diffusion?

This is a re-posted article, archived here so it doesn't get lost (via the Wayback Machine). The following information applies to CLIP L and CLIP G too.

In the vibrant landscape of artificial intelligence, the Stable Diffusion model stands as a beacon of ingenuity, translating text into images with remarkable finesse. Yet within its creative realm lies a defining constraint: the 75-token prompt limit. While seemingly restrictive, this limit serves as a safeguard to keep the model responsive and manageable. As users learn to craft effective prompts, precision becomes paramount. By embracing clarity, relevance, and careful keyword selection, creators can harness Stable Diffusion to conjure breathtaking scenes and characters from the canvas of imagination.

What is the 75 prompt limit in Stable Diffusion?
Among recent AI innovations, the Stable Diffusion model has emerged as a notable breakthrough, captivating attention for its ability to transform textual inputs into lifelike images; learning prompt engineering is a must for writing prompts that produce high-quality images. However, the model comes with a peculiar constraint known as the "Stable Diffusion prompt token limit of 75." Below we look at what this limitation entails and how it shapes the creative potential of the model.

What is the Stable Diffusion limit of 75?
The Stable Diffusion model permits prompts containing up to 75 tokens. A token can be thought of as a discrete unit of text that the model comprehends, analogous to how words form the building blocks of human language. For instance, in the sentence "The cat sat on the mat," each word is a token. The 75-token limit means a prompt can contain at most 75 tokens, which in practice is roughly 75 words at most (often fewer, since some words split into several tokens).

What are tokens?
Tokens are the fundamental elements of text that a machine-learning model comprehends. While tokens are typically words, they can also be punctuation marks, numbers, or other symbols. For instance, in the sentence "Hello, 123," the tokens are "Hello", ",", and "123". Stable Diffusion uses a process called tokenization to break input text down into these tokens; this is what allows the model to extract meaning from the provided text and guide its image generation.

How to check the number of tokens in a prompt
Determining the number of tokens in a prompt is crucial to ensure it stays within the 75-token limit. One approach is to use an online tokenizer tool designed to count tokens in a given text. Alternatively, some text editors or platforms display the token count directly, simplifying the process.

What is the limitation of a Stable Diffusion prompt?
Prompt length limitation: the 75-token limit is a defining constraint of the model. If you input a prompt longer than this, the model truncates it down to exactly 75 tokens. This is a necessary measure to keep the model's responses within manageable parameters.
Other limitations: the model's comprehension is not omniscient; it does not understand every single word or phrase thrown at it. If a prompt contains words or terms the model struggles to comprehend, the resulting image may deviate from the intended outcome. This underscores the importance of crafting precise and comprehensible prompts.

What is a quality prompt for Stable Diffusion?
Crafting effective prompts for Stable Diffusion is both an art and a science. While the model's capabilities are awe-inspiring, the quality of the prompt significantly influences the generated images. You can also make use of negative prompts and prompt weights to get better results. Some tips to ensure your prompts yield optimal results:
Clarity is key: express your ideas clearly and concisely. Ambiguous or convoluted prompts can lead to unintended outcomes.
Relevance matters: ensure your prompt is pertinent to the desired image. Irrelevant prompts cause confusion and misalignment between the generated image and your intent.
Avoid esoteric terminology: steer clear of highly specialized jargon or niche terms the model might not interpret accurately.
Precise keywords: choose keywords that accurately represent your desired image; keywords play a pivotal role in guiding the model's creative process.
If all of this seems overwhelming, a prompt engineering cheat sheet is a good and simpler place to start.

Examples of good Stable Diffusion prompts
Scene description: "Generate a serene sunset over a tranquil lake, with vibrant hues of orange and pink reflecting on the water's surface."
Fantasy landscape: "Create a mystical forest where towering trees house luminescent fireflies and iridescent mushrooms."
Character portrayal: "Illustrate a courageous knight adorned in shimmering armor, gazing confidently at a daunting dragon in a moonlit cave."

How many words is a Stable Diffusion prompt?
Since tokens can be words, punctuation marks, or symbols, the length in words varies, but as a general rule the prompt should be concise yet detailed enough to convey your vision effectively. If you are grappling with the 75-token constraint, condense your prompt by removing unnecessary words while retaining its essence, or split it into smaller segments, each within the token limit; this is particularly effective for prompts that require multiple facets or details.
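To check the count programmatically, here is a minimal sketch assuming the Hugging Face transformers package and the openai/clip-vit-large-patch14 tokenizer commonly paired with Stable Diffusion 1.x; it is an illustration added here, not part of the original article.

```python
from transformers import CLIPTokenizer

# Load the CLIP tokenizer used by SD 1.x (assumption: this is the tokenizer
# your frontend uses; SDXL adds a second CLIP-G tokenizer on top).
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

def count_tokens(prompt: str) -> int:
    # Exclude the start/end special tokens so only the prompt content counts.
    return len(tokenizer(prompt, add_special_tokens=False)["input_ids"])

prompt = "serene sunset over a tranquil lake, vibrant hues of orange and pink"
n = count_tokens(prompt)
print(f"{n} tokens, {'within' if n <= 75 else 'over'} the 75-token limit")
```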
Animatensor - Prompting Guide

AnimaTensor is the ultimate anime-themed finetuned SDXL model. It was trained from Animagine XL 4.0-Zero, converting the model to support v-prediction and zero-terminal SNR. It was trained on anime-style images from Danbooru with a knowledge cut-off of January 7th, 2025. Like the base model, it was trained using the tag-ordering method for identity and style training.

AnimaTensor PRO: https://tensor.art/models/875968258996450530/AnimaTensor-Pro-Pro
AnimaTensor REGULAR: https://tensor.art/models/875952002545262935/AnimaTensor-Regular
To generate on this new model, you can check this AI Tool: https://tensor.art/template/878775865159195620

User Guide
First things first: the order of tags is crucial in this model. We strongly recommend following our suggested prompt guidelines and placing tags in this order:
[Gender], [Character], [From What Series], [Rating Tag], [Artist Tags], [General Tags], [Quality Tags]

Example
1girl, mita \(miside\), cool mita \(miside\), miside, animal ear headwear, blue gloves, blue hat, blue skirt, cabbie hat, choker, collarbone, gloves, hat, looking at viewer, low ponytail, ponytail, purple eyes, purple hair, red choker, red sweater, skirt, smile, sweater, teardrop facial mark, v, sensitive, field, sensitive, masterpiece, high score, great score, absurdres

From What Series
It is mandatory to include the series tag in your prompt list. Want to generate your own original character? You can skip this step and go straight to the general tags right after the gender tag.

Rating Tags
By putting a rating tag (safe, sensitive, nsfw, or explicit) in your prompt list, you have a better chance of getting images suited to the rating you're aiming for. The safe tag makes the model more likely to generate a safe-for-work image, sensitive yields more revealing clothing, the nsfw tag ("not safe for work") can yield even more revealing clothes or nudity, and the explicit tag can yield sexual acts. (For obvious reasons, we cannot provide examples for the nsfw and explicit tags.)

Artist Tags
We recommend placing [Artist Tags] before [General Tags]. Using artist tags for a character without including the corresponding series tags is not advised; it significantly weakens their effect.

Quality Tags: masterpiece, high score, great score, absurdres
Put quality tags at the end of your prompt instead of the front, so that the artist tags can be triggered.

Negative Prompts: lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, cropped, worst quality, low quality, bad score, signature, watermark, blurry

References
https://cagliostrolab.net/
https://github.com/cagliostrolab
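As a quick illustration of the recommended tag order, here is a small sketch that assembles a prompt from grouped tags; the helper and group names are made up for this example and are not part of AnimaTensor itself.

```python
# Hypothetical helper: join tag groups in the order recommended above.
TAG_ORDER = ["gender", "character", "series", "rating", "artist", "general", "quality"]

def build_prompt(groups: dict) -> str:
    parts = []
    for key in TAG_ORDER:
        parts.extend(groups.get(key, []))
    return ", ".join(parts)

print(build_prompt({
    "gender": ["1girl"],
    "character": ["mita \\(miside\\)", "cool mita \\(miside\\)"],
    "series": ["miside"],
    "rating": ["sensitive"],
    "general": ["animal ear headwear", "blue gloves", "looking at viewer"],
    "quality": ["masterpiece", "high score", "great score", "absurdres"],
}))
```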
14
My 'Model Training' Parameters for the 'Christmas Model' task of the 'Christmas Walkthrough' event

Hello! I've made a LoRA on TensorArt and I want to share the process.
https://tensor.art/models/806550646066228285/Christmas-Walkthrough-2024-Merry-Christmas

Tutorial for Flux
1. Have an idea (Christmas).
2. Get a dataset [min 1024x1024]. Dataset: https://mega.nz/file/DAdHiLwA#-swbYhXvbH2-zMav4JflUDZuERoY-lt_y4RMq7HpzM8
3. Upload your dataset.
4. Select Batch Cutting: configure it for vertical, horizontal, or squared depending on your dataset; mine is vertical.
5. Select Auto Labeling: configure it for Florence-2 natural language, or enter your captions manually. I'll do both: first auto labeling, then Add Batch Labels to caption my keywords. It's important to match 'Keep n Tokens' with the number of keywords desired.
6. Keywords: traditional media, christmas parody, fine art, 1940 \(style\)
7. Configure the rest of the parameters (see below).
8. You're done, start training!
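For context on the 'Keep n Tokens' setting, here is a rough sketch of the idea, assuming it works like kohya-style caption shuffling (an assumption; TensorArt does not document the internals): the first N comma-separated tags stay fixed as trigger keywords while the rest are shuffled each epoch.

```python
import random

def shuffle_caption(caption: str, keep_n_tokens: int) -> str:
    # The first `keep_n_tokens` tags are preserved in place; the rest are
    # shuffled, which is why the value should match your number of keywords.
    tags = [t.strip() for t in caption.split(",")]
    kept, rest = tags[:keep_n_tokens], tags[keep_n_tokens:]
    random.shuffle(rest)
    return ", ".join(kept + rest)

caption = "traditional media, christmas parody, fine art, 1940 \\(style\\), snow, gifts, fireplace"
print(shuffle_caption(caption, keep_n_tokens=4))
```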
7
Christmas Walkthrough | Add Radio Buttons to an old Ai Tool.

What are Radio Buttons?
They allow you to use a name syntax in your prompt to pull lines of prompt from a file. In TensorArt we will use them as a substitute for personalized wildcards, so Radio Buttons are pseudo-wildcards. Check this article to learn how to manipulate and personalize them. Radio Buttons require a <CLIP Text Encode> node to be stored within.

What do we need?
Any working AI Tool. In my current exploration, only certain <CLIP Text Encoder> nodes allow you to use them as Radio Button containers. For this example I'll use my AI Tool: 📸 Shutterbug | SD3.5L Turbo.

1. Duplicate/download your AI Tool workflow (to have a backup).
2. Add a <CLIP Text Encode> node.
3. Add a <Conditioning Combine> node.
4. Assemble the nodes as the illustration shows; be careful with the combine method. Use concat if you're not experienced at combining CLIPs; this instructs your prompting to ADD the Radio Button's prompt (see the sketch below).
5. 💾 Save your AI Tool workflow.
6. Go to Edit mode in your AI Tool.
7. Export your current User-configurable Settings (JSON).
8. ↺ Update your AI Tool.
9. Import your old User-configurable Settings (JSON).
10. Look for the new <CLIP Text Encode> node and load it.
11. Hover over the new <CLIP Text Encode> tab and select Edit.
12. Configure your Radio Buttons.
13. Publish your AI Tool.

Done! Enjoy the Radio Button feature in your AI Tools. In my case, my new AI Tool looks like this: 📹 Shutterbug | SVD & SD3.5L Turbo.
Note: I also included SVD video to meet the requirements of the Christmas Walkthrough event.
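Conceptually, a Radio Button behaves like a named pseudo-wildcard whose selected line is concatenated onto the main prompt. The sketch below only illustrates that idea; the option names and prompt lines are invented, and the real selection happens inside the <CLIP Text Encode> / <Conditioning Combine> nodes.

```python
# Invented option table standing in for the prompt lines stored in the node.
RADIO_OPTIONS = {
    "christmas": "christmas lights, snow, cozy fireplace, warm tones",
    "halloween": "jack-o'-lanterns, fog, moonlit night, orange and purple palette",
}

def apply_radio_button(base_prompt: str, choice: str) -> str:
    # concat behaviour: the selected line is appended to the base prompt,
    # mirroring the <Conditioning Combine> set to concat in the workflow.
    return f"{base_prompt}, {RADIO_OPTIONS[choice]}"

print(apply_radio_button("a corgi wearing a scarf, studio photo", "christmas"))
```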
15
1
🎃 Halloween2024 | Optimizing Sampling Schedules in Diffusion Models

You might have seen this kind of image on Pinterest if you have girly tastes. Well, guess what? I'll teach you about some parameters to enhance your future Pony SDXL generations. It's been a while since my last post; today I'll cover a cool feature released by NVIDIA on July 22, 2024. For this task I'll provide an alternative workflow (Diffusion Workflow) for SDXL. Now let's go with the content.

Models
For my research (AI Tool) I decided to use the following models:
Checkpoint model: https://tensor.art/models/757869889005411012/Anime-Confetti-Comrade-Mix-v3
0.60 LoRA: https://tensor.art/models/702515663299835604
0.80 LoRA: https://tensor.art/models/757240925404735859/Sailor-Moon-Vixon's-Anime-Style-Freckledvixon-1.0
0.75 LoRA: https://tensor.art/models/685518158427095353

Nodes
The Diffusion Workflow has many nodes that I've merged into single nodes; I'll explain them below. Remember you can group nodes and edit their values to enhance your experience.

👑 Super Prompt Styler // Advanced Manager
(CLIP G) text_positive_g: positive prompt, the subject of the scene (all the elements the scene is meant for, LoRA keyword activators).
(CLIP L) text_positive_l: positive prompt, the scene itself (composition, lighting, style, scores, ratings).
text_negative: negative prompt.
◀Style▶: artistic styler; selects the direction for your prompt. Select 'misc Gothic' for a Halloween direction.
◀Negative Prompt▶: prepares the negative prompt by splitting it in two (CLIP G and CLIP L) for the encoder.
◀Log Prompt▶: adds information to metadata; produces error 1406 when enabled, so turn it off.
◀Resolution▶: selects the resolution of your generation.

👑 Super KSampler // NVIDIA Aligned Steps
base_seed: similar to ENSD (know more here).
similarity: influences the base_seed noise to be similar to the noise_seed value.
noise_seed: the exact same noise seed you know.
control after generate: dictates the behavior of noise_seed.
cfg: guidance for the prompt; read about <DynamicThresholdingFull> to find the correct value. I recommend 12.
sampler_name: sampling method.
model_type: NVIDIA sampler for SDXL and SD models.
steps: the exact same steps you know; dictates how much the sampling denoises the injected noise.
denoise: the exact same denoise you know; dictates how strongly the sampling denoises the injected noise.
latent_offset: select between {-1.00 darker, 1.00 brighter} to modify the input latent; any value other than 0 adds information to enhance the final result.
factor_positive: upscale factor for the conditioning.
factor_negative: upscale factor for the conditioning.
vae_name: the exact same VAE you know; dictates how the injected noise is denoised by the sampler.

👑 Super Iterative Upscale // Latent/on Pixel Space
model_type: NVIDIA sampler for SDXL and SD models.
steps: number of steps the UPSCALER (Pixel KSampler) uses to correct the latent in pixel space while upscaling it.
denoise: strength of the correction on the latent in pixel space.
cfg: guidance for the prompt; read about <DynamicThresholdingFull> to find the correct value. I recommend 12.
upscale_factor: number of times the upscaler will upscale the latent (must match factor_positive and factor_negative).
upscale_steps: number of steps the UPSCALER (Pixel KSampler) uses to upscale the latent.

Miscellaneous
DynamicThresholdingFull
mimic_scale: 4.5 (important value, go to Learn more)
threshold_percentile: 0.98
mimic_mode: half cosine down
mimic_scale_min: 3.00
cfg_mode: half cosine down
cfg_scale_min: 0.00
sched_val: 3.00
separate_feature_channels: enable
scaling_startpoint: mean
variability_measure: AD
interpolate_phi: 0.85
Learn more: https://www.youtube.com/watch?v=_l0WHqKEKk8

Latent Offset
Learn more: https://github.com/spacepxl/ComfyUI-Image-Filters?tab=readme-ov-file#offset-latent-image

Align Your Steps
Learn more: https://research.nvidia.com/labs/toronto-ai/AlignYourSteps/

LayerColor: Levels
Set black_point = 0 (base level of black).
Set white_point = 255 (base level of white).
Set output_black_point = 20 (makes blacks less black).
Set output_white_point = 220 (makes whites less white).
Learn more: https://docs.getsalt.ai/md/ComfyUI_LayerStyle/Nodes/LayerColor%3A%20Levels/
A sketch of this remap appears at the end of this article.

LayerFilter: Film
center_x: 0.50
center_y: 0.50
saturation: 1.75
vignete_intensity: 0.20
grain_power: 0.50
grain_scale: 1.00
grain_sat: 0.00
grain_shadows: 0.05
grain_highs: 0.00
blur_strenght: 0.00
blur_focus_spread: 0.1
focal_depth: 1.00
Learn more: https://docs.getsalt.ai/md/ComfyUI_LayerStyle/Nodes/LayerFilter%3A%20Film/?h=film

Result
AI Tool: https://tensor.art/template/785834262153721417

Downloads
Pony Diffusion Workflow: https://tensor.art/workflows/785821634949973948
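As a side note on the LayerColor: Levels values above, this is a minimal sketch of a standard levels remap with those numbers, assuming the node follows the usual formula (numpy is used only for the illustration):

```python
import numpy as np

def levels(img, black_point=0, white_point=255,
           output_black_point=20, output_white_point=220):
    # Normalize against the input range, then rescale into the output range:
    # blacks are lifted to 20 and whites are pulled down to 220.
    x = (img.astype(np.float32) - black_point) / (white_point - black_point)
    x = np.clip(x, 0.0, 1.0)
    return (output_black_point + x * (output_white_point - output_black_point)).astype(np.uint8)

print(levels(np.array([0, 128, 255], dtype=np.uint8)))  # -> [ 20 120 220]
```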
33
11
Hunyuan-DiT: Recommendations

Review
Hello everyone; I want to share some of my impressions about the Chinese model Hunyuan-DiT from Tencent. First, some mandatory background so we (Westerners) can figure out what it is meant for.

Hunyuan-DiT works well in multi-modal dialogue with users (mainly in Chinese and English); the better explained your prompt, the better your generation will be. It is not necessary to use only keywords, although it understands them quite well. In terms of rating, HYDiT 1.2 sits between SDXL and SD3: it is not as powerful as SD3, but it beats SDXL in almost everything; for me it is what SDXL should have been in the first place. One of the best parts is that Hunyuan-DiT is compatible with almost the entire SDXL node suite.

Hunyuan-DiT-v1.2 was trained with 1.5B parameters.
mT5 was trained with 1.6B parameters.
Recommended VAE: sdxl-vae-fp16-fix
Recommended samplers: ddpm, ddim, or dpmms

Prompt as you would in SD1.5, but don't be shy and go further in terms of length. Hunyuan-DiT combines two text encoders, a bilingual CLIP and a multilingual T5 encoder, to improve language understanding and increase the context length; it divides your prompt into meaningful IDs and then processes the entire prompt. The limit is 100 IDs, or up to 256 tokens. T5 works well on a variety of tasks out of the box by prepending a different prefix to the input corresponding to each task.

To improve your prompt, place your summarized prompt in the CLIP:TextEncoder node box (if you disabled T5), or place your extended prompt in the T5:TextEncoder node box (if you enabled T5). You can use the "simple" text encode node to use only one prompt, or the regular one to pass different text to CLIP/T5.

The worst part is that the model only benefits from moderate (high for TensorArt) step values: 40 steps is the baseline in most cases.

ComfyUI (Comfyflow) (Example)
TensorArt added all the elements to build a good flow for us; you should try it too.

Additional
What can we do in the Open-Source plan? (link)
Official info for LoRA training (link)

References
Analysis of HunYuan-DiT | https://arxiv.org/html/2405.08748v1
Learn more about T5 | https://huggingface.co/docs/transformers/en/model_doc/t5
How CLIP and T5 work together | https://arxiv.org/pdf/2205.11487
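If you prefer to test Hunyuan-DiT outside TensorArt, here is a minimal sketch assuming a recent diffusers release with HunyuanDiTPipeline and the official Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers checkpoint; the settings are examples, not official recommendations.

```python
import torch
from diffusers import HunyuanDiTPipeline  # assumes a recent diffusers release

pipe = HunyuanDiTPipeline.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers",  # assumed model id
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="一只戴着圣诞帽的柯基犬，雪夜街道，电影感灯光",  # bilingual prompting: Chinese works too
    negative_prompt="lowres, blurry, watermark",
    num_inference_steps=40,   # ~40 steps, as noted above
    guidance_scale=5.0,       # example value
).images[0]
image.save("hunyuan_test.png")
```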
58
13
🆘 ERROR | Exception

Exception (routeId: 7544339967855538950230)
Suspect nodes: <String Function>, <LayerStyle>, <LayerUtility>, <FaceDetailer>, many <TextBox>, <Bumpmap>

After some research (on my own) I've found:
The <FaceDetailer> node is completely broken.
The <TextBox> and <MultiLine:Textbox> nodes will cause this error if you introduce more than ~250 characters. I'm not very sure about this number, but you won't be able to introduce a decent amount of text anymore.
More than 40 nodes, regardless of their function, will cause this error.

How do I know this? Well, I made a functional comfyflow following those rules: https://tensor.art/template/754955251181895419
The next functional comfyflow suddenly stopped generating; it's almost the same flow as the previous one, but with <FaceDetailer> and large text strings to polish the prompt. It works again, yay! https://tensor.art/template/752678510492967987 (proof it really worked, here)

I feel bad for you if this error suddenly disrupts your day; feel bad for me because I bought the yearly membership of this broken product and can't refund it. I'll be happy to delete this bad review if you fix this error.

News
08/11/24 | <String Function> has been taken down. Comfyflow works slowly (but works).
08/10/24 | Everything is broken again lmao, we can't generate outside TAMS.
08/06/24 | A <reroute> output node could trigger this error when linked to many inputs.
07/28/24 | The <FaceDetailer> node seems to work again.
17
15
Understanding the Impact of Negative Prompts: When and How Do They Take Effect?

📝 - Synthical
The Dynamics of Negative Prompts in AI: A Comprehensive Study, by Yuanhao Ban (UCLA), Ruochen Wang (UCLA), Tianyi Zhou (UMD), Minhao Cheng (PSU), Boqing Gong, and Cho-Jui Hsieh (UCLA).

This study addresses the gap in understanding the impact of negative prompts in AI diffusion models. By focusing on the dynamics of diffusion steps, the research aims to answer the question: "When and how do negative prompts take effect?" The investigation categorizes the mechanism of negative prompts into two primary tasks: noun-based removal and adjective-based alteration.

The role of prompts in AI diffusion models is crucial for guiding the generation process. Negative prompts, which instruct the model to avoid generating certain features, have been studied far less than their positive counterparts. This study provides a detailed analysis of negative prompts, identifying the critical steps at which they begin to influence the image generation process.

Findings

Critical Steps for Negative Prompts
Noun-based removal: the influence of noun-based negative prompts peaks at the 5th diffusion step. At this critical step, the negative prompt initially generates the target object at a specific location within the image and then neutralizes the positive noise through a subtractive process, effectively erasing the object. However, introducing a negative prompt in the early stages paradoxically results in the generation of the specified object. Therefore, the optimal timing for introducing these prompts is after the critical step.
Adjective-based alteration: the influence of adjective-based negative prompts peaks around the 10th diffusion step. During the initial stages, the absence of the object leads to a subdued response. Between the 5th and 10th steps, as the object becomes clearer, the negative prompt accurately focuses on the intended area and maintains its influence.

Cross-Attention Dynamics
At the peak around the 5th step for noun-based prompts, the negative prompt attempts to generate objects in the middle of the image, regardless of the positive prompt's context. As this process approaches its peak, the negative prompt begins to assimilate layout cues from its positive counterpart, trying to remove the object. This represents the zenith of its influence. For adjective-based prompts, during the peak around the 10th step, the negative prompt maintains its influence on the intended area, accurately targeting the object as it becomes clear.

The study highlights the paradoxical effect of introducing negative prompts in the early stages of diffusion, leading to the unintended generation of the specified object. This finding suggests that the timing of negative prompt introduction is crucial for achieving the desired outcome.

Reverse Activation Phenomenon
A significant phenomenon observed in the study is Reverse Activation. This occurs when a negative prompt, introduced early in the diffusion process, unexpectedly leads to the generation of the specified object. To explain this, the researchers borrowed the concept of the energy function from Energy-Based Models to represent the data distribution. Real-world distributions often feature elements like clear blue skies or uniform backgrounds, alongside distinct objects such as the Eiffel Tower. These elements typically have low energy scores, making the model inclined to generate them. The energy function assigns lower energy to more 'likely' or 'natural' images according to the model's training data, and higher energy to less likely ones.

A positive difference indicates that the presence of the negative prompt effectively induces the inclusion of this component in the positive noise: the presence of a negative prompt promotes the formation of the object within the positive noise. Without the negative prompt, implicit guidance is insufficient to generate the intended object; the application of a negative prompt intensifies the distribution guidance towards the object, preventing it from materializing. As a result, negative prompts typically do not attend to the correct place until step 5, well after the application of positive prompts, and using negative prompts in the initial steps can significantly skew the diffusion process, potentially altering the background.

Conclusions
Do not use fewer than 10 steps; going beyond 25 steps makes no difference for negative prompting.
Negative prompts can enhance your positive prompts, depending on how well the model and LoRA have learned their keywords, so they can be understood as an extension of their counterparts.
Weighting up negative keywords may cause reverse activation and break your image; try to keep the influence ratio of all your LoRAs and models equal.

Reference
https://synthical.com/article/Understanding-the-Impact-of-Negative-Prompts%3A-When-and-How-Do-They-Take-Effect%3F-171ebba1-5ca7-410e-8cf9-c8b8c98d37b6?
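A practical way to act on these findings is to delay the negative prompt until after the critical step. The sketch below is a generic classifier-free-guidance loop in the diffusers style; unet, scheduler and the *_emb embeddings are hypothetical stand-ins for your own pipeline objects, so treat it as an outline rather than a drop-in implementation.

```python
def denoise(latents, unet, scheduler, pos_emb, neg_emb, empty_emb,
            guidance_scale=7.0, start_negative_at=5):
    # Before the critical step (~5 for noun removal, ~10 for adjectives),
    # guide against an empty prompt to avoid "reverse activation";
    # afterwards switch to the real negative prompt.
    for i, t in enumerate(scheduler.timesteps):
        uncond = neg_emb if i >= start_negative_at else empty_emb
        noise_pos = unet(latents, t, encoder_hidden_states=pos_emb).sample
        noise_unc = unet(latents, t, encoder_hidden_states=uncond).sample
        noise = noise_unc + guidance_scale * (noise_pos - noise_unc)
        latents = scheduler.step(noise, t, latents).prev_sample
    return latents
```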
59
10
Stable Diffusion [Floating Point, Performance in the Cloud]

Overview of Data Formats used in AI
fp32 is the default data format used for training, along with mixed-precision training that uses both fp32 and fp16. fp32 has more than adequate scale and definition to effectively train the most complex neural networks, but it also results in large models, both in parameter size and complexity. fp32 can represent numbers between 10⁻⁴⁵ and 10³⁸; in most cases such a wide range is wasteful and does not bring additional precision.

fp16 is well supported both in hardware and software, with good performance. For AI inference workloads, adopting fp16 instead of the mainstream fp32 offers a tremendous speed-up while reducing power consumption and memory footprint, with virtually no accuracy loss. The switch to fp16 is completely seamless and does not require any major code changes or fine-tuning; CPUs improve their AI inference performance instantly. Using fp16 reduces the representable range to roughly 10⁻⁸ to 65,504 and cuts memory requirements in half while also accelerating training and inference speeds. Make sure to avoid underflow and overflow situations.

Once training is completed, one of the most popular ways to improve performance is to quantize the network. A popular data format for this, mainly in edge applications, is int8, which gives at most a 4x reduction in size with a notable performance improvement. However, quantization to int8 frequently leads to some accuracy loss. Sometimes the loss is limited to a fraction of a percent, but it often results in a few percent of degradation, and in many applications this degradation is unacceptable.

There are ways to limit accuracy loss by doing quantization-aware training. This consists of introducing the int8 data format selectively and/or progressively during training. It is also possible to apply quantization to the weights while keeping activation functions at fp32 resolution. Though these methods help limit the accuracy loss, they will not eliminate it altogether. fp16 is a data format that can be the right solution for preventing accuracy loss while requiring minimal or no conversion effort: it has been observed in many benchmarks that the transition from fp32 to fp16 results in no noticeable accuracy loss without any re-training.

Conclusion
For NVIDIA GPUs and AI, deploy in fp16 to double inference speeds while reducing the memory footprint and power consumption. Note: if the original model was not trained using fp16, converting it to fp16 is extremely easy and does not require re-training or code changes. The switch to fp16 has also been shown to cause no visible accuracy loss in most cases.

Source: https://amperecomputing.com/
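In practice, switching a Stable Diffusion pipeline to fp16 is a one-line change. A minimal sketch, assuming the diffusers library and the common SD 1.5 checkpoint as an example model id:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example checkpoint
    torch_dtype=torch.float16,          # fp16 weights: ~half the memory of fp32
).to("cuda")

image = pipe("fine art photography of a snowy cabin, golden hour").images[0]
image.save("fp16_test.png")
```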
Blender to Stable Diffusion, animation workflow.

Source: https://www.youtube.com/watch?v=8afb3luBvD8

Mickmumpitz guides us on how to use Stable Diffusion to generate masks and prompts for rendering 3D animations. The process involves setting up render passes, such as depth and normal passes, in Blender to extract information from the 3D scene for AI image generation, creating a File Output node, and then using Stable Diffusion's node-based interface for the image workflow. Users can also create mask passes to communicate which prompts to use for individual objects in the scene.

Mickmumpitz explains the differences between using Stable Diffusion 1.5 and SDXL for image generation and video rendering, highlighting the advantages and disadvantages of each, and demonstrates how to use Stable Diffusion 1.5 with Blender to generate specific styles and control the level of detail in the AI-generated scenes.

He also shows an updated workflow for rendering 3D animations using AI with Blender and Stable Diffusion, creating simple test scenes, including a futuristic cityscape and a rope-balancing scene, to try out the updated version. Overall, the workflow aims to make rendering more efficient and versatile.
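On the Blender side, the render passes can also be enabled from a script. This is only a sketch of the idea (not Mickmumpitz's exact setup); pass and socket names can vary slightly between Blender versions.

```python
import bpy

scene = bpy.context.scene
view_layer = bpy.context.view_layer
view_layer.use_pass_z = True        # depth pass
view_layer.use_pass_normal = True   # normal pass

# Route the passes to a File Output node in the compositor.
scene.use_nodes = True
tree = scene.node_tree
render_layers = tree.nodes.new("CompositorNodeRLayers")
file_out = tree.nodes.new("CompositorNodeOutputFile")
file_out.base_path = "//render_passes/"
file_out.file_slots.new("Depth")
file_out.file_slots.new("Normal")
tree.links.new(render_layers.outputs["Depth"], file_out.inputs["Depth"])
tree.links.new(render_layers.outputs["Normal"], file_out.inputs["Normal"])
```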
2
Stable Diffusion [ADetailer]

After Detailer (ADetailer)
After Detailer (ADetailer) is a game-changing extension designed to simplify the process of image enhancement, particularly inpainting. This tool saves you time and proves invaluable in fixing common issues, such as distorted faces in your generated images. Historically, we would send the image to an inpainting tool and manually draw a mask around the problematic face area. After Detailer streamlines this process by automating it with the help of a face recognition model: it detects faces, automatically generates the inpaint mask, then proceeds with the inpainting by itself.

Exploring ADetailer Parameters
Now that you've grasped the basics, let's delve into additional parameters that allow fine-tuning of ADetailer's functionality.

Detection model: ADetailer offers various detection models, such as face_xxxx, hand_xxxx, and person_xxxx, catering to specific needs. Notably, the face_yolo and person_yolo models, based on YOLO (You Only Look Once), excel at detecting faces and objects, yielding excellent inpainting results.
Model selection: the "8n" and "8s" models vary in speed and power, with "8n" being faster and smaller. Choose the model that suits your detection needs, switching to "8s" if detection proves challenging.
ADetailer prompting: input your prompts and negatives in the ADetailer section to achieve the desired results.
Detection model confidence threshold: the minimum confidence score needed for model detections. Lower values (e.g., 0.3) are advisable for detecting faces; adjust as necessary to improve or reduce detections.
Mask min/max area ratio: these parameters control the allowed size range for detected masks. Raising the minimum area ratio can help filter out undesired small objects.

The most crucial setting in the Inpainting section is the "Inpaint denoising strength," which determines the level of denoising applied during automatic inpainting. Adjust it to achieve your desired degree of change. In most cases, selecting "Inpaint only masked" is recommended when inpainting faces.

Reference
ThinkDiffusion
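If you run ADetailer through the AUTOMATIC1111 web UI API, the parameters above map onto an "alwayson_scripts" entry roughly like the sketch below; the field names follow the extension's documented API but may differ between versions, so verify them against your install.

```python
import requests

payload = {
    "prompt": "portrait photo of a woman in a winter coat, snowfall",
    "steps": 28,
    "alwayson_scripts": {
        "ADetailer": {
            "args": [{
                "ad_model": "face_yolov8n.pt",       # detection model ("8n" variant)
                "ad_prompt": "detailed face, sharp eyes",
                "ad_confidence": 0.3,                # detection confidence threshold
                "ad_denoising_strength": 0.4,        # inpaint denoising strength
                "ad_inpaint_only_masked": True,
            }]
        }
    },
}
r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
r.raise_for_status()
```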
23
2
TagGUI - captioning tool for model creators

📥 Download | https://github.com/jhc13/taggui

TagGUI is a cross-platform desktop application for quickly adding and editing image tags and captions, aimed at creators of image datasets for generative AI models like Stable Diffusion.

Features
Keyboard-friendly interface for fast tagging
Tag autocomplete based on your own most-used tags
Integrated Stable Diffusion token counter
Automatic caption and tag generation with models including CogVLM, LLaVA, WD Tagger, and many more
Batch tag operations for renaming, deleting, and sorting tags
Advanced image list filtering

Captioning parameters
Prompt: instructions given to the captioning model. Prompt formats are handled automatically based on the selected model. You can use the following template variables to dynamically insert information about each image into the prompt:
{tags}: the tags of the image, separated by commas.
{name}: the file name of the image without the extension.
{directory} or {folder}: the name of the directory containing the image.
An example prompt using a template variable could be "Describe the image using the following tags as context: {tags}". With this prompt, {tags} would be replaced with the existing tags of each image before the prompt is sent to the model.
Start caption with: generated captions will start with this text.
Remove tag separators in caption: if checked, tag separators (commas by default) will be removed from the generated captions.
Discourage from caption: words or phrases that should not be present in the generated captions. You can separate multiple words or phrases with commas (,). For example, you can put appears,seems,possibly to prevent the model from using an uncertain tone in the captions. The words may still be generated due to limitations related to tokenization.
Include in caption: words or phrases that should be present somewhere in the generated captions. You can separate multiple words or phrases with commas (,). You can also allow the captioning model to choose from a group of words or phrases by separating them with |. For example, if you put cat,orange|white|black, the model will attempt to generate captions that contain the word cat and either orange, white, or black. It is not guaranteed that all of your specifications will be met.
Tags to exclude (WD Tagger models): tags that should not be generated, separated by commas.

Many of the other generation parameters are described in the Hugging Face documentation.
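To make the template variables concrete, here is a small sketch of how a prompt could expand for one image; the substitution code is illustrative, not TagGUI's actual implementation.

```python
from pathlib import Path

def expand_prompt(template: str, image_path: Path, tags: list) -> str:
    folder = image_path.parent.name
    return (template
            .replace("{tags}", ", ".join(tags))
            .replace("{name}", image_path.stem)
            .replace("{directory}", folder)
            .replace("{folder}", folder))

template = "Describe the image using the following tags as context: {tags}."
print(expand_prompt(template, Path("dataset/christmas/corgi_01.png"),
                    ["corgi", "santa hat", "snow"]))
```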
1
1
Stable Diffusion [Parameters]

Stable Diffusion Intro
Stable Diffusion is an open-source text-to-image AI model that can generate amazing images from given text in seconds. The model was trained on images in the LAION-5B dataset (Large-scale Artificial Intelligence Open Network). It was developed by CompVis, Stability AI and RunwayML. All research artifacts from Stability AI are intended to be open sourced.

Prompt Engineering
Prompt engineering is the process of structuring words that can be interpreted and understood by a text-to-image model. It is the language you need to speak in order to tell an AI model what to draw: a well-written prompt consisting of keywords and good sentence structure. Once you have something in mind, ask yourself a list of questions:
Do you want a photo, a painting, digital art?
What's the subject: a person, an animal, the painting itself?
What details are part of your idea?
Special lighting: soft, ambient, etc.
Environment: indoor, outdoor, etc.
Color scheme: vibrant, muted, etc.
Shot: front, from behind, etc.
Background: solid color, forest, etc.
What style: illustration, 3D render, movie poster?

The order of words is important
The order and presentation of your desired output is almost as important as the vocabulary itself. It is recommended to list your concepts explicitly and separately rather than trying to cram them into one simple sentence.

Keywords and sub-keywords
Keywords are words that can change the style, format, or perspective of the image. There are certain magic words or phrases that are proven to boost the quality of the image. Sub-keywords are those that belong to the semantic group of a keyword; hierarchy is important for prompting as well as for LoRA or model design.

Classifier Free Guidance (CFG, default is 7)
You can understand this parameter as "AI creativity vs {{user}} prompt". Lower numbers give the AI more freedom to be creative, while higher numbers force it to stick to the prompt.
CFG {2, 6}: if you're discovering, testing, or researching for heavy AI influence.
CFG {7, 10}: if you have a solid prompt but you still want some creativity.
CFG {10, 15}: if your prompt is solid enough and you do not want the AI to disturb your idea.
CFG {16, 20}: not recommended, incoherence.

Step count
Stable Diffusion creates an image by starting with a canvas full of noise and denoising it gradually to reach the final output; this parameter controls the number of these denoising steps. Usually higher is better, but only to a certain degree; beginners are recommended to stick with the default.

Seed
The seed is a number that controls the initial noise. The seed is the reason you get a different image each time you generate even when all the parameters are fixed. By default, in most implementations of Stable Diffusion, the seed automatically changes every time you generate an image. You can get the same result back if you keep the prompt, the seed and all other parameters the same.
⚠️ Seeding is important for your creations, so try to save a good seed and slightly tweak the prompt to get what you're looking for while keeping the same composition.

Sampler
Diffusion samplers are the methods used to denoise the image during generation; they take different durations and different numbers of steps to reach a usable image. This parameter affects the step count significantly; a refined one can reduce or increase the step count, giving more or less subjective detail.

CLIP Skip
First of all we need to know what CLIP is. CLIP, which stands for Contrastive Language Image Pretraining, is a multi-modal model trained on 400 million (image, text) pairs. During training, a text encoder and an image encoder are jointly trained to predict which caption goes with which image. Think of CLIP skip as the size of the funnel SD uses to comb through the information obtained from its dataset: higher numbers leave more information to process, so the final image is less precise; lower numbers narrow down the captions in the dataset, so you get more accurate results.
Clip Skip {1}: strong coincidences and less liberty.
Clip Skip {2}: nicer coincidences and a little liberty.
Clip Skip {3-5}: many coincidences and high liberty.
Clip Skip {6}: unexpected results.

ENSD (Eta Noise Seed Delta)
It's like a slider for the seed parameter: you can get different image results for a fixed seed number. So... what is the optimal number? There isn't one. Just use your lucky number; you're pointing the seeding to this number. If you are using a random seed every time, ENSD is irrelevant. So why do people commonly use 31337? Known as "eleet" or leetspeak, it is a system of modified spellings used primarily on the Internet. It's a cabalistic number; it's safe to use any other number.

References
Automatic1111
OpenArt Prompt Book
LAION
LAION-5B Paper
1337
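For readers generating locally, the parameters above map onto diffusers arguments roughly as in this sketch; the checkpoint id and values are examples (clip_skip needs a reasonably recent diffusers release), not recommendations from the article.

```python
import torch
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # example checkpoint
).to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)  # sampler

generator = torch.Generator("cuda").manual_seed(31337)  # fixed seed for reproducibility
image = pipe(
    "movie poster illustration of a knight at dawn, soft ambient lighting",
    negative_prompt="lowres, blurry, watermark",
    guidance_scale=7.0,        # CFG
    num_inference_steps=30,    # step count
    clip_skip=1,               # CLIP skip
    generator=generator,
).images[0]
image.save("parameters_demo.png")
```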
24
Stable Diffusion [Weight Syntax]

Weight (individual CFG for keywords): a colon establishes a weight on a keyword, changing its default value (1.00 = default = x). Example: red:1.10.

( ) Round brackets increase a keyword's weight by a factor of 1.1 per nesting level; for example, (red) means red:1.10.
(keyword) means 1.1x; if x = 1 ⇒ 1.10
((keyword)) means 1.1²x; if x = 1 ⇒ 1.21
(((keyword))) means 1.1³x; if x = 1 ⇒ ≈1.33
((((keyword)))) means 1.1⁴x; if x = 1 ⇒ ≈1.46
... etc.

+ Plus works the same way; for example, red+ means red:1.10.
keyword+ ⇒ 1.10
keyword++ ⇒ 1.21
keyword+++ ⇒ ≈1.33
keyword++++ ⇒ ≈1.46
... etc.

[ ] Square brackets decrease a keyword's weight by a factor of 0.9 per nesting level; for example, [red] means red:0.90.
[keyword] means 0.9x; if x = 1 ⇒ 0.90
[[keyword]] means 0.9²x; if x = 1 ⇒ 0.81
[[[keyword]]] means 0.9³x; if x = 1 ⇒ ≈0.73
[[[[keyword]]]] means 0.9⁴x; if x = 1 ⇒ ≈0.66
... etc.

- Minus works the same way; for example, red- means red:0.90.
keyword- ⇒ 0.90
keyword-- ⇒ 0.81
keyword--- ⇒ ≈0.73
keyword---- ⇒ ≈0.66
... etc.

In theory you can combine these, or even bypass the limit values (0.00 to 2.00), with the correct script or modification in your dashboard.
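The multipliers above can be computed directly; here is a tiny sketch that follows the 1.1-per-level and 0.9-per-level convention described in this article (other UIs divide by 1.1 instead).

```python
def effective_weight(levels: int, emphasis: bool = True, x: float = 1.0) -> float:
    # Round brackets / '+' multiply by 1.1 per level,
    # square brackets / '-' multiply by 0.9 per level.
    factor = 1.1 if emphasis else 0.9
    return round(x * factor ** levels, 2)

print(effective_weight(2))                  # ((keyword))   -> 1.21
print(effective_weight(3, emphasis=False))  # [[[keyword]]] -> 0.73
```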

Posts