Bokeh 3.5 Medium uses Stable Diffusion 3.5 Medium as its foundation model. It was post-trained on a high-resolution open-source dataset of 5M images that underwent rigorous quality and aesthetic screening. The post-training targets excellent image quality, high fidelity on natural images, preservation of fine details, and enhanced controllability.
The model was trained with a mix of short and long natural-language captions:
Short Captions: Focus on the core subject content of the image.
Long Captions: Provide broader descriptions of the scene environment and atmosphere.
Recommended Resolutions:
1920x1024, 1728x1152, 1152x1728, 1280x1664, 1440x1440
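As a rough illustration of how the recommended resolutions might be used, the sketch below picks the listed resolution whose aspect ratio is closest to a requested one and reports its pixel count. The helper names `closest_resolution` and `megapixels` are hypothetical and not part of any Bokeh or Stable Diffusion tooling; only the resolution list comes from this card.

```python
import math

# Recommended (width, height) pairs, copied from the list in this card.
RECOMMENDED_RESOLUTIONS = [
    (1920, 1024),
    (1728, 1152),
    (1152, 1728),
    (1280, 1664),
    (1440, 1440),
]


def closest_resolution(target_width, target_height):
    """Return the recommended (width, height) with the nearest aspect ratio.

    Hypothetical helper: ratios are compared in log space so that a
    landscape ratio and its portrait inverse are treated symmetrically.
    """
    target = math.log(target_width / target_height)
    return min(
        RECOMMENDED_RESOLUTIONS,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )


def megapixels(width, height):
    """Total pixel count in millions."""
    return width * height / 1_000_000


if __name__ == "__main__":
    # Print the best match for a few common aspect ratios.
    for ratio in [(16, 9), (3, 2), (1, 1), (3, 4)]:
        w, h = closest_resolution(*ratio)
        print(f"{ratio[0]}:{ratio[1]} -> {w}x{h} ({megapixels(w, h):.2f} MP)")
```

Every listed resolution comes out at roughly 2 million pixels, so snapping a request to the nearest entry keeps generation close to the pixel budget the model is recommended for.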
The model also fine-tunes well, which makes it broadly usable for downstream production tasks.
Advantages
🖼️ High-Quality Image Generation
State-of-the-art visual fidelity with improved rendering of fine detail and aesthetic consistency.
Enhanced resolution support up to about 2 million pixels (2 MP), ensuring highly detailed image outputs.
Carefully curated dataset ensures better composition, lighting, and overall artistic appeal.
🎯 Powerful Custom Fine-Tuning
Exceptional LoRA training support, making it highly effective for:
Photography
3D Rendering
Illustration
Concept Art
⚡ Efficient Inference & Training
Low hardware requirements for inference and fine-tuning:
Inference without the T5 text encoder: 9 GB VRAM
Full-weights inference: 16 GB VRAM (suitable for local deployment)
LoRA fine-tuning: 12 GB to 32 GB VRAM
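The VRAM figures above can be read as simple minimum thresholds. The following sketch, with a hypothetical `feasible_workloads` helper and made-up workload labels, lists which of the workloads a GPU of a given size would nominally cover. It assumes the quoted numbers are lower bounds; actual memory use will vary with resolution, batch size, precision, and LoRA configuration.

```python
# Minimum VRAM in GB, taken from the figures quoted in this card.
# The keys are hypothetical labels chosen for this example.
MIN_VRAM_GB = {
    "inference_without_t5": 9,
    "lora_fine_tuning_minimum": 12,  # low end of the quoted 12 GB to 32 GB range
    "full_weights_inference": 16,
}


def feasible_workloads(vram_gb):
    """Return the workload labels whose quoted minimum fits in vram_gb."""
    return [name for name, need in MIN_VRAM_GB.items() if vram_gb >= need]


if __name__ == "__main__":
    # Show what a few common GPU sizes would nominally support.
    for vram in (8, 12, 16, 24):
        print(f"{vram} GB: {feasible_workloads(vram) or 'below all quoted minimums'}")
```

Since LoRA fine-tuning is quoted as a range, a GPU that only meets the 12 GB lower bound may still need smaller ranks, lower resolution, or memory-saving options to train successfully.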
Known Issues
Potential human anatomy inconsistencies.
Limited ability to generate photorealistic images.
Some concepts may suffer from aesthetic quality issues.