AI-generated images are letting many people who might otherwise never have had the opportunity experience the thrill of being creative. While the commercial and professional aspects of these images are open to debate in terms of legality, copyright, and numerous other issues, the hobbyist — your average casual, non-commercial user — can dig into whatever aspect of AI generation suits them without being a part of that debate.
Among the most discussed and anticipated improvements are features and tools that give more control over the final output and remove some of the randomness. At the moment it can be difficult to get Midjourney to adhere to the Rule of Thirds or the Golden Ratio, but AI generation is still in its early stages, and there are far more important aspects to spend research and development time on.
The use of reference images combined with a good prompt can conjure up some great results. These reference images can be anything from fine art to photographs. With tools like Midjourney, they can even be scribbles. That's right. Some call them doodles and others aren't sure what to call them, but combined with the right prompts and styling, scribbles you'd expect from a 5-year-old can now start a generational chain of renders that continues until you get a result far removed from the original.
In this first example, I used a drawing tablet to spend all of thirty seconds or so scribbling out a "snail and rider" as the first-generation reference image. I admit I use the tablet for 3D sculpting and layout, not so much for drawing, as you can probably tell. This is the image I uploaded as the reference for the first generational render from Midjourney. From there I chose the best second-generation render to use as the reference for the final third generation.
I admit it doesn't look much like a snail, but I was intrigued by the shape and used it to generate the final four images. After just two generational iterations, the final four were far removed from the original scribble. Just imagine what a few more iterations would look like.
Next up was a toadstool/mushroom example. As with the first attempt, I wasn't looking for realism but for something offbeat and catchy — something that would draw the eye if posted. Again, the entire process required only two generational renders from the scribble used as a reference.
In terms of composition, the generated images differ from the reference image, but that is one of the things that will improve as the AI improves. For now, the reference image simply keeps the whole generational process from going off the rails. Tools based on Stable Diffusion seem to stick a little closer to the actual layout of the reference image, so if you need to match the sketched layout more closely, it might be possible to use a Stable Diffusion program to control the early renders and then switch to Midjourney for the later ones if you prefer its process.
Last is the flying unicorn, a simple trace on the drawing tablet. I couldn't make up my mind whether I wanted a unicorn or a flying horse, so why not both?
As you can see, I ended up choosing a more stylized render than the previous attempts. I could’ve just as easily invoked anything from a porcelain version to a vividly multi-colored 1960s blacklight poster look. The text prompt guides that aspect.
To recap, the process for getting to this point is simple:
1. Draw (or scribble, depending on your artistic level) a simple reference image.
2. Pick the best of the first renders created from that reference image.
3. Use that render as the reference for your next generational iteration.

By then you should be getting some nicer, more artistic renders.
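For readers who think in code, the loop above can be sketched in a few lines of Python. This is only an illustration of the control flow: `generate_renders` and `pick_best` are hypothetical stand-ins for the manual Midjourney steps (upload a reference, get a grid of renders, choose a favorite), stubbed here with strings so the sketch actually runs.

```python
# A minimal sketch of the scribble-to-render iteration loop.
# `generate_renders` and `pick_best` are hypothetical stand-ins for
# the manual Midjourney steps; they are stubbed with strings so the
# control flow itself is runnable.

def generate_renders(reference, prompt, count=4):
    # Stand-in for "render `count` images from this reference + prompt".
    return [f"{reference} -> render {i + 1} ({prompt})" for i in range(count)]

def pick_best(renders):
    # Stand-in for eyeballing the grid and choosing a favorite.
    return renders[0]

def iterate_from_scribble(scribble, prompt, generations=2):
    """Chain renders: each generation's best pick becomes the next reference."""
    reference = scribble
    for _ in range(generations):
        candidates = generate_renders(reference, prompt)
        reference = pick_best(candidates)
    return reference

final = iterate_from_scribble("snail-scribble.png", "snail and rider, whimsical")
print(final)
```

The point the sketch makes is the feedback chain: the output of one generation becomes the input of the next, which is why even two iterations drift so far from the original scribble.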
With a few practice runs you can gain a lot of first-hand information that will guide you in shaping your own “style” of AI-generated images.