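Before getting started, one caveat: SD3.5-Large falls short of Flux1.0-dev in image quality and aesthetics. Its aesthetic appeal, compositional soundness, and use of color are all fairly average. The highlight of SD3.5-Large is that it is fully open source and can be used commercially with almost no conditions (especially since the recent license change). Most importantly, it has not been distilled, which means the open-source community has a chance to raise its ceiling further, and models such as ControlNet can support it well, just as they did with the open-source SDXL. In one sentence: it is highly extensible.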
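On October 22, 2024, StabilityAI released its latest Stable Diffusion 3.5 model family (SD3.5 for short), which includes two models, SD3.5-Large and SD3.5-Large Turbo; SD3.5 Medium is expected on October 29. When SD3 was released, the change to its license drew strong pushback from the open-source community, and StabilityAI eventually reverted the license to its original form, but it is still unclear whether that move will win back the community's support.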
Stable Diffusion 3.5 – Large
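This open release of SD3.5 inherits the earlier SD3 architecture as a whole, without any changes.
Model size: 8B (8 billion parameters)
Model type: MMDiT text-to-image generation model
Model description: a model that generates images from text prompts. It is built on the Multimodal Diffusion Transformer (MMDiT) architecture and uses three fixed, pretrained text encoders (OpenCLIP-ViT/G, CLIP-ViT/L, and T5-xxl).
Application support: ComfyUI
VRAM requirement: 24 GB or more
Third-party support: shakker.ai, civitai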
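The models can currently be downloaded inside China from the ModelScope community:
- Download the SD3.5-Large or Turbo model on its own;
- Then download the encoders; if you have plenty of VRAM, you can download all four encoder files.
Tips: With less than 16 GB of VRAM, t5xxl_fp8 plus SD3.5-Large-Turbo runs without trouble and is very fast.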
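The VRAM guidance in this post can be written down as a small lookup. This is only a sketch of the rule of thumb given here (24 GB and up for SD3.5-Large, the Turbo variant below that, t5xxl_fp8 in both cases); the function name `pick_setup` and the returned labels are made up for illustration and are not real file names.

```python
def pick_setup(vram_gb: float) -> dict:
    """Map available VRAM (in GB) to the model/encoder combo suggested in this post.

    Hypothetical helper: the labels are descriptive names, not download file names.
    """
    if vram_gb <= 0:
        raise ValueError("vram_gb must be positive")
    # 24 GB or more: the full SD3.5-Large checkpoint; otherwise the Turbo variant.
    model = "SD3.5-Large" if vram_gb >= 24 else "SD3.5-Large-Turbo"
    # The post pairs both bundles with the fp8 T5-XXL encoder.
    return {"model": model, "t5_encoder": "t5xxl_fp8"}


if __name__ == "__main__":
    for vram in (12, 16, 24):
        print(vram, "GB ->", pick_setup(vram))
```

The 24 GB cutoff is taken from the VRAM requirement listed above; cards between 16 GB and 24 GB may still manage the Large model with offloading, which this sketch does not try to model.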
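Running in ComfyUI
Prerequisites:
- NVIDIA GPU, 16 GB VRAM bundle: SD3.5-Large-Turbo model + t5xxl_fp8 encoder
- NVIDIA GPU, 24 GB VRAM bundle: SD3.5-Large model + t5xxl_fp8 encoder
- After downloading the SD3.5 model, place it in the ComfyUI/models/checkpoints/ directory;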
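- After downloading the CLIP and T5XXL encoder models, place them in the ComfyUI/models/clip/ directory (the folder name is lowercase, which matters on case-sensitive file systems);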
- Update ComfyUI to the latest version.
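Before launching, it can help to confirm that the downloaded files actually landed in the two folders named above. The sketch below only lists what it finds; the helper names, the set of file suffixes, and the `"ComfyUI"` root path in the demo are assumptions for illustration, so adjust them to your install. The folder lookup is case-insensitive so it tolerates either `clip` or `CLIP`.

```python
from pathlib import Path

# Assumed set of model file suffixes; extend it if your downloads use others.
MODEL_SUFFIXES = {".safetensors", ".ckpt"}


def find_subdir(parent: Path, name: str):
    """Return the child directory matching `name` case-insensitively, or None."""
    if not parent.is_dir():
        return None
    for child in parent.iterdir():
        if child.is_dir() and child.name.lower() == name.lower():
            return child
    return None


def list_model_files(comfy_root, subdir: str) -> list:
    """List model files found under <comfy_root>/models/<subdir>."""
    folder = find_subdir(Path(comfy_root) / "models", subdir)
    if folder is None:
        return []
    return sorted(
        p.name
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in MODEL_SUFFIXES
    )


def check_layout(comfy_root) -> dict:
    """Report which model files sit in the checkpoints and clip folders."""
    return {name: list_model_files(comfy_root, name) for name in ("checkpoints", "clip")}


if __name__ == "__main__":
    # "ComfyUI" is a placeholder path to the install directory.
    for folder, files in check_layout("ComfyUI").items():
        print(folder, "->", files if files else "no model files found")
```

An empty list for either folder usually means the file went to the wrong directory or is still downloading.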
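SD3.5-Large workflow example
Step 1: download this image;
Step 2: load the image, and ComfyUI loads the workflow automatically;
Step 3: select the SD3.5-Large model, select the CLIP encoders, enter a prompt, and adjust the KSampler settings.
😼 Recommended KSampler settings: steps: 26-30; CFG: 3.6-4.0; sampler: DPM++ 2M; scheduler: SGM
Tips: The prompt has a big effect on output quality. Thanks to the powerful CLIP encoders, you can freely enter natural-language sentences; the more detailed the description, the higher the quality of the generated image. Prompt syntax and weights are supported as well.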
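For anyone driving ComfyUI through its API-format workflow JSON instead of the graph editor, the recommended settings above map roughly onto a KSampler node like the one sketched here. The defaults are picked from the middle of the recommended ranges. Two things are assumptions: the node ids in the links ("1" to "4") are placeholders for whatever your workflow uses, and "SGM" is taken to mean ComfyUI's `sgm_uniform` scheduler.

```python
import json


def large_ksampler_node(seed: int = 0, steps: int = 28, cfg: float = 3.8) -> dict:
    """Build an API-format KSampler node using the SD3.5-Large settings from this post."""
    return {
        "class_type": "KSampler",
        "inputs": {
            "seed": seed,
            "steps": steps,                # recommended: 26-30
            "cfg": cfg,                    # recommended: 3.6-4.0
            "sampler_name": "dpmpp_2m",    # DPM++ 2M
            "scheduler": "sgm_uniform",    # assumed to be what "SGM" refers to
            "denoise": 1.0,
            # Links are [source_node_id, output_index]; the ids are placeholders.
            "model": ["1", 0],
            "positive": ["2", 0],
            "negative": ["3", 0],
            "latent_image": ["4", 0],
        },
    }


if __name__ == "__main__":
    print(json.dumps(large_ksampler_node(seed=42), indent=2))
```

The printed JSON is a single node, not a complete workflow; it still needs the loader, text-encode, latent, and decode nodes that the downloaded workflow image already contains.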
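SD3.5-Large-Turbo workflow example
Download this image. I won't walk through it in detail, since the process is about the same as for Large.
😺 Recommended KSampler settings: steps: 4-8; CFG: 1-1.2; sampler: DPM++ 2M or DDIM; scheduler: SGM
Tips: There is no need to enter a negative prompt anymore.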
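Because the Turbo settings differ so much from the Large ones, it is easy to carry the wrong values over when switching checkpoints. The following is a minimal sketch of a sanity check against the Turbo ranges recommended above. The function name is hypothetical, and `dpmpp_2m` and `ddim` are the ComfyUI sampler identifiers assumed to correspond to DPM++ 2M and DDIM.

```python
# Turbo ranges as recommended in this post.
TURBO_STEPS = (4, 8)
TURBO_CFG = (1.0, 1.2)
TURBO_SAMPLERS = {"dpmpp_2m", "ddim"}


def check_turbo_settings(steps: int, cfg: float, sampler_name: str) -> list:
    """Return a list of warnings for settings outside the recommended Turbo ranges."""
    warnings = []
    if not TURBO_STEPS[0] <= steps <= TURBO_STEPS[1]:
        warnings.append(f"steps={steps} is outside {TURBO_STEPS[0]}-{TURBO_STEPS[1]}")
    if not TURBO_CFG[0] <= cfg <= TURBO_CFG[1]:
        warnings.append(f"cfg={cfg} is outside {TURBO_CFG[0]}-{TURBO_CFG[1]}")
    if sampler_name not in TURBO_SAMPLERS:
        warnings.append(f"sampler '{sampler_name}' is not one of {sorted(TURBO_SAMPLERS)}")
    return warnings


if __name__ == "__main__":
    # Settings left over from the Large workflow trigger step and CFG warnings.
    for message in check_turbo_settings(steps=28, cfg=3.8, sampler_name="dpmpp_2m"):
        print("warning:", message)
```

These are recommendations, not hard limits, so the helper only reports warnings and never raises.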
SD3.5-Large example image:
It is a digital painting depicting a stately, winged female figure standing in a peaceful and mysterious forest. The woman had pale complexion, long flowing hair and a blonde headband. She wore a revealing outfit that included a gold strapless top with intricate designs, a red skirt that draped over her legs, and thigh-high boots. Her dress was decorated with gold accessories, including bracelets and a belt. She holds a large, feathered scepter in her left hand and a bird in her right, suggesting her connection to nature and its elements. The background is a bright sky, soft pastel hues of blue, white and pink, and a golden glow cast by a brilliant sun. In the foreground, there are ripples and reflections of a small pond surrounded by tall, leafy plants and trees. The water surface is calm and some leaves float on it, adding to the atmosphere of peace. Women's wings are large and detailed, with a mixture of white, brown, and black plumage, giving an intimidating and ethereal appearance. The overall style is highly detailed and realistic, with themes of fantasy and mythology, combining elements of nature, magic and mythology.
SD3.5-Large-Turbo example image:
~*~aesthetic~*~ #vaporwave neon 3D render, a fancy car in a clubhouse garage, neon sign on the wall reads "Drive On".