Hi there, I’ve seen a few videos on YouTube showing it off, and it looks incredibly powerful for fine-tuning the outputs of SD. It also looks dauntingly complicated to learn to use effectively.
For those of you who have played around with it: do you think it gives better results than A1111? Is it really better for fine-tuning? How steep was the learning curve for you?
I’m trying to figure out if I’d want to put in the hours to learn how to use it. If it improves my ability to get exactly the images I want, I’ll go for it. If it does what A1111 does, just dressed up differently, I’ll sit it out :)
I am no expert but have been experimenting with ComfyUI: https://lemmy.zip/post/510712
ComfyUI seems incredibly powerful and efficient, much faster than Automatic 1111. But I have yet to figure out how to get good results using ControlNet: I can make it work, but quality seems to get lost with ComfyUI and I haven’t figured out why, though I expect it is ‘operator error’.
If the node-based interface of ComfyUI is intimidating, it is easy to install ComfyBox, which also lets you easily toggle between a GUI and the node interface for ComfyUI: https://github.com/space-nuko/ComfyBox/blob/master/static/screenshot.png
Once I figure out the kinks, I expect to make ComfyUI my Stable Diffusion daily-driver interface, as it is so much faster, more resource-efficient, and more configurable.
I’m no stranger to node-based workflows, but I have struggled to see how nodes are beneficial for Stable Diffusion. It just seems like a ton of extra steps to lay down something like 10 nodes just to make a simple image, when other interfaces let me do the same thing much more easily.
When you say it’s faster, are you referring to the workflow, or the actual generation? Do you see any other benefits from comfy?
The usefulness of ComfyUI is not just making one simple image. It is the ability to completely customize how that image is created.
For example, I have a workflow that generates a half-resolution preview image, then upscales the latent and puts it through two more sampling nodes. All three of the nodes have a different prompt input, with the focus slowly shifting to style instead of content.
I have also created a custom upscaling workflow, where the image is upscaled with a normal upscaler, re-encoded and put through just a few sampling steps, then decoded with a tiled VAE decoder (to save my VRAM). It creates much better results (more detail and control) than a direct ESRGAN upscale, and the output can even be put through ESRGAN afterward to get a super large image.
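To make that a bit more concrete, here is a rough sketch of such a staged-upscale graph in ComfyUI’s API-format JSON: a dict mapping node ids to a class_type and its inputs, where a list like ["1", 0] references output 0 of node "1". The node class names and inputs here are from memory and simplified, so treat this as illustrative rather than copy-paste ready:

```python
import json

# Hypothetical staged-upscale graph: pixel upscale -> re-encode to latent
# -> a few low-denoise sampling steps -> tiled VAE decode to spare VRAM.
workflow = {
    "1": {"class_type": "LoadImage",
          "inputs": {"image": "base_render.png"}},
    "2": {"class_type": "ImageScale",      # plain pixel-space upscale
          "inputs": {"image": ["1", 0], "width": 1536, "height": 1536,
                     "upscale_method": "bicubic", "crop": "disabled"}},
    "3": {"class_type": "VAEEncode",       # re-encode to latent space
          "inputs": {"pixels": ["2", 0], "vae": ["7", 2]}},
    "4": {"class_type": "KSampler",        # just a few low-denoise steps
          "inputs": {"model": ["7", 0], "positive": ["5", 0],
                     "negative": ["6", 0], "latent_image": ["3", 0],
                     "steps": 8, "denoise": 0.35, "cfg": 7.0, "seed": 42,
                     "sampler_name": "euler", "scheduler": "normal"}},
    "5": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "detailed, sharp", "clip": ["7", 1]}},
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "blurry", "clip": ["7", 1]}},
    "7": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd15.safetensors"}},
    "8": {"class_type": "VAEDecodeTiled",  # tiled decode to save VRAM
          "inputs": {"samples": ["4", 0], "vae": ["7", 2]}},
}

# Sanity check: every node reference [id, slot] must point at a real node.
for node in workflow.values():
    for value in node["inputs"].values():
        if isinstance(value, list):
            assert value[0] in workflow

print(len(workflow), "nodes")  # → 8 nodes
```

The low `denoise` on the sampler is the important knob: it keeps the upscaled composition intact while still letting the model re-add detail.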
Do you happen to have a tutorial for ComfyUI at hand that you can link and that goes into some detail? These custom workflows sound intriguing, but I’m not really sure where to start.
The repository links to a list of examples, but the best way to learn is just to mess around with it. It is fairly intuitive to work with (especially if you have used another node-based UI before, like Blender).
The UI also has the ability to import/export your current setup (what I call a workflow) as a json file. If I get some time, I might share some of mine with a pastebin link or something.
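For what it’s worth, those exports are plain JSON, so they can be diffed, versioned, or shared as text. A tiny sketch of what save/load amounts to (the workflow dict here is a made-up stand-in, not a real ComfyUI export):

```python
import json

# Hypothetical stand-in for an exported node graph; real exports map
# node ids to their class types, inputs, and link information.
workflow = {"3": {"class_type": "KSampler", "inputs": {"steps": 20}}}

exported = json.dumps(workflow, indent=2)  # what "Save" writes to disk
restored = json.loads(exported)            # what loading the file reads back
assert restored == workflow                # lossless round trip
```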
I just figured out that I can drag any of my images made with A1111 into the UI, and it sets up the corresponding workflow automatically. I was under the impression this would only work for images created with ComfyUI in the first place. This gives great starting points to work with. I will play around with it tonight and see if I can extract upscaling and ControlNet workflows from existing images as starting points.
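That drag-and-drop trick works because both UIs stash their settings inside the PNG file itself: A1111 writes its generation parameters into a tEXt chunk keyed "parameters", and ComfyUI embeds its graph JSON in the same kind of metadata. A minimal stdlib-only sketch of the mechanism (not ComfyUI’s actual code, and the sample parameter string is made up):

```python
import struct
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def chunk(ctype: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC."""
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data) & 0xFFFFFFFF))

def make_png_with_text(keyword: str, text: str) -> bytes:
    """Build a tiny 1x1 grayscale PNG carrying a tEXt metadata chunk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 1x1, 8-bit gray
    idat = zlib.compress(b"\x00\x00")  # one filter byte + one pixel
    text_data = keyword.encode("latin-1") + b"\x00" + text.encode("latin-1")
    return (PNG_SIG + chunk(b"IHDR", ihdr) + chunk(b"tEXt", text_data)
            + chunk(b"IDAT", idat) + chunk(b"IEND", b""))

def read_text_chunks(png: bytes) -> dict:
    """Walk the chunk stream and collect all tEXt keyword/value pairs."""
    out, pos = {}, len(PNG_SIG)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = data.partition(b"\x00")
            out[key.decode("latin-1")] = value.decode("latin-1")
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return out

# Made-up A1111-style parameter string, just to show the round trip.
params = "masterpiece, 1girl\nSteps: 20, Sampler: Euler a"
png = make_png_with_text("parameters", params)
print(read_text_chunks(png)["parameters"].splitlines()[0])  # → masterpiece, 1girl
```

So when you drop an image onto the canvas, the UI only has to read these text chunks back out; the pixels themselves are never needed.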
I didn’t even know that, that’s pretty cool. I’ll have to try it later
By faster I mean rendering speed, and it is also less demanding on hardware, so overall ComfyUI seems way more efficient.
As for the node-based approach, what it allows is fine-grained control and even custom nodes; but as I said, if that is not your thing, then ComfyBox provides a ‘best of both worlds’ option.
My advice is to do what I am currently doing: try ComfyUI, see how it compares for your particular use case or workflow, then decide from there. Personally I am keeping both Automatic 1111 and ComfyUI, sharing the key resources between both, while I make up my mind: https://weirdwonderfulai.art/resources/sd-automatic1111-webui-customisation-tips/
You can take a json file or a ComfyUI-generated image and just drag it onto the interface to set up a particular node structure, so in that respect ComfyUI is even quicker than Automatic 1111, though that said, both pretty much have that functionality.
I’ve been fiddling around with it for a few days. At first it seemed like A1111 gives better results, but after more tries they both look pretty similar. One big plus of ComfyUI is that it’s much faster than A1111 (3 times faster on my PC) and it’s also lighter: I can do a 1024x1024 upscale without getting an OOM error.
Here is a decent tutorial: https://www.youtube.com/watch?v=KTPLOqAMR0s&ab_channel=SebastianKamph
I first tried it a few days ago, and I’m still a bit lost. Inpainting, which is the major part of my workflow, doesn’t feel as swift as in Automatic1111, and I’m still searching for the equivalent of only-masked-area inpainting in ComfyUI.
But I can confirm it is much faster and uses less VRAM. And I somehow love the ability to save the entire workflow into a json. I’m missing my prompt-autocomplete plugin the most.