Tiefling sorceress
Elf monk
Half-elf druid/rogue
Awesome work! What is your method?? Please fill me in on your process, I’m still pretty new to using AI tools for this
I have a local installation of Vladmandic's fork of the Automatic1111 web UI, so these steps are specific to Stable Diffusion. They'll also work fine if you're running a good Colab notebook.
1. Find a good base image
- Pick the right model. I use A-Zovya RPG Artist Tools mostly, but there are many other models that are great for more specific styles.
- Start with a simple prompt that includes general details about your subject. Don’t try to go down the Midjourney-style rabbit hole of crafting the perfect prompt; quantity is far more important than quality with current AI models.
- Use txt2img to generate an image that will serve as a good base. You’re looking for something that has the right colors, silhouette, and style. Don’t worry about the fine details like fingers and faces: you’ll clean those up later.
- Generate just a few images with your initial prompt to see what sort of results you get: batches of 5-10 should be enough to tell you if you’re using the right tokens. If you see something frequently popping up that you don’t want, add it to the negative prompt. Change your prompts around until your results start consistently including the details you’re looking for.
- If you find an image that you like the style of but not the details, you can use that image as an input for ControlNet’s Reference model in txt2img.
Here’s the image I chose as my base for the tiefling. Generation parameters:
Female tiefling, sorcerer, librarian, goat horns, watercolor, masterpiece, best quality
Negative prompt: bad_prompt_version2, nude, nsfw, explicit, penis, nipples, sex, suggestive, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
Steps: 10, Sampler: UniPC, CFG scale: 7, Seed: 3178845724, Size: 512x512, Model hash: da5224a242, Model: aZovyaRPGArtistTools_v2, Version: 1dffd11
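If you'd rather script the batch generation than click through the UI, the parameters above map directly onto a request to A1111's built-in API (start the web UI with the `--api` flag). This is just a sketch: the endpoint and field names are the standard A1111 API ones, and the URL assumes the default local port.

```python
# Sketch: the generation parameters above as an A1111 API request.
# Requires the web UI running locally with --api enabled.
import json
import urllib.request

payload = {
    "prompt": ("Female tiefling, sorcerer, librarian, goat horns, "
               "watercolor, masterpiece, best quality"),
    "negative_prompt": (
        "bad_prompt_version2, nude, nsfw, explicit, penis, nipples, sex, "
        "suggestive, lowres, bad anatomy, bad hands, text, error, "
        "missing fingers, extra digit, fewer digits, cropped, worst quality, "
        "low quality, normal quality, jpeg artifacts, signature, watermark, "
        "username, blurry"
    ),
    "steps": 10,
    "sampler_name": "UniPC",
    "cfg_scale": 7,
    "seed": 3178845724,
    "width": 512,
    "height": 512,
    "batch_size": 5,  # small batches to test the prompt, as suggested above
}

def txt2img(payload, url="http://127.0.0.1:7860/sdapi/v1/txt2img"):
    """POST the payload to a local A1111 instance; returns the JSON response,
    whose "images" key holds base64-encoded PNGs."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Set the seed to -1 (or drop it) when you're exploring; a fixed seed like the one above is only useful for reproducing a specific image.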
2. Inpaint
- Inpaint any details you don’t like. I use the openOutpaint extension since it’s much easier to pick out details.
- Make sure you use an inpainting model, and don't forget to load a VAE! Some checkpoints have a VAE baked into their base model but not their inpainting model, which will make your inpainting attempts look grey and generally awful. Which VAE you choose honestly doesn't matter much: I use Grapefruit's since it gives nice vibrant colors, but if faces start looking too much like anime I'll switch to the base 1.5 VAE.
- Keep inpainting until you have all the features you want (right number of fingers, right clothes, etc.). Don’t worry about the really fine details like eyes and fingernails yet. This will be the longest step; take your time!
- OpenOutpaint also lets you outpaint, so do that now if you want.
- Once you’ve got everything right, upscale it 2-4x. Try out all the different upscaling models to see which one works best for this specific image.
- Start inpainting again. Your image is now much larger, but you should still inpaint in 512x512 sections, so tailor your prompt to whatever you're inpainting at the time.
- Just keep inpainting and upscaling as needed. There’s no one way to do things from here; just tinker until you’re happy with the results.
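If scripting appeals to you, the inpaint/upscale loop can also be driven through the same A1111 API instead of openOutpaint. A rough sketch only: the payload fields are the standard A1111 img2img/extras ones, and values like the denoising strength and upscaler name are placeholder starting points, not exact settings.

```python
# Rough sketch: the inpaint-then-upscale loop via A1111's API (--api flag).
import json
import urllib.request

BASE_URL = "http://127.0.0.1:7860"  # default local A1111 address

def inpaint_payload(image_b64, mask_b64, prompt, denoise=0.5):
    """Build an img2img inpainting request for one 512x512 region."""
    return {
        "init_images": [image_b64],  # base64-encoded PNG
        "mask": mask_b64,            # white = repaint, black = keep
        "prompt": prompt,            # re-prompt for just this region
        "denoising_strength": denoise,
        "inpainting_fill": 1,        # 1 = fill with original content
        "inpaint_full_res": True,    # render the masked area at full res
        "mask_blur": 4,
        "width": 512,
        "height": 512,
    }

def upscale_payload(image_b64, factor=2, upscaler="R-ESRGAN 4x+"):
    """Build an extras request to upscale the whole image."""
    return {
        "image": image_b64,
        "upscaling_resize": factor,
        "upscaler_1": upscaler,      # try several upscalers per image
    }

def post(endpoint, payload):
    """Send a request to the running A1111 instance."""
    req = urllib.request.Request(
        BASE_URL + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (needs a running A1111 instance):
# fixed = post("/sdapi/v1/img2img", inpaint_payload(img, mask, "detailed hand"))
# big = post("/sdapi/v1/extra-single-image", upscale_payload(img, factor=2))
```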
Here are most of the images that I generated for the tiefling. This took me about 12 hours and almost 1000 images.
Wonderful, you should make this a whole dedicated post at some point just to share your process with others. I’m following the guide on the A-Zovya RPG page right now and getting great results! Thank you for sharing!
These look phenomenal. I tried something similar just using the free image generation through Bing. I think it's DALL-E powered, but it's limited; for some reason it rejected the word "bard". What service/program did you use to make these?
Thank you! I used a local installation of Automatic1111, a popular web interface for Stable Diffusion image generation models. You can technically run it on older hardware, but I personally wouldn’t recommend it unless you have a GPU with at least 8GB of VRAM. If you don’t have one then there are lots of other options to use someone else’s hardware.
- StableHorde
- There are a bunch of Google Colab instances running Stable Diffusion. It should be pretty easy to find a good one of those.
- Rent a cloud server.
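To make that VRAM threshold concrete, here's a trivial helper; the 8 GB figure is just my rule of thumb above, and `--medvram`/`--lowvram` are real A1111 launch flags that trade speed for memory on smaller cards.

```python
# Map a card's VRAM to a rough local-vs-remote recommendation.
# 8 GB threshold is a rule of thumb, not a hard requirement.
def recommend(vram_gb: float) -> str:
    if vram_gb >= 8:
        return "run locally"
    if vram_gb >= 4:
        return "run locally with the --medvram or --lowvram launch flags"
    return "use remote hardware (StableHorde, a Colab, or a cloud server)"

print(recommend(8))  # run locally
```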
DALL-E was neat last year, but it’s incredibly outdated now. AI image generation has absolutely exploded in the past few months and is about to have another burst of advances with the imminent release of SDXL.
I’ll have to look into it, I’ve got a 1070 in one PC and a 1080ti in the other so I’m probably on the border of being able to do it on both. I just used Dall-E because it popped up through the Bing app on my phone really and I was just playing around with it, but after seeing what you’ve made I’ll have to look into doing it all properly because they turned out great.
What model/checkpoint are you using for these? I’ve been struggling with getting a good render of a Tiefling Warlock with horns that curl over the head
These were all made with A-Zovya RPG Artist Tools v3. Tieflings are really tricky; I've found the most success by specifying what I wanted the horns to look like. For the tiefling in this post I used "tiefling" and "goat horns" in the prompt. I've also used the inpaint sketch tool to help define the rough shape, but it usually wasn't too effective.