Ever since I saw the capabilities of Dall-E 2, I had this great experiment in mind: feed a comic book script to it and compare the result with the real comic book. Great idea, right?! So, the first thing I did was go to the OpenAI Lab website, ready to try it out. There I filled out the form and waited to get access to Dall-E 2. And waited, and waited. What seemed like eons passed. During that period, Dall-E generated a cover for Cosmopolitan, Dall-E Mini was published on Hugging Face, and I felt like I would never get to do my thing.
When I finally got access, all the projects that I worked on blew up and I couldn’t find the time to play around with Dall-E 2. Finally, after what seemed like forever, I found some time and used it to try to recreate the comic book Killing Joke using the script written by Alan Moore. And yes, in case you were wondering, the cover image for this article was created with Dall-E 2 and the prompt “Alan Moore being angry at his code while programming”.
Why do I want to do this experiment, you ask? Well, I grew up reading a bunch of comic books and even tried to write some of my own. Drawing comic books based on some scripts is a hard and long process. So, I wanted to see what kind of impact recent AI developments will have on this industry.
Picking a Comic Book
Now, I consider myself somewhat of a graphic novel/comic book fan. I am not a super fan, and I will not get wrapped up in the lengthy discussion about what is the difference between a comic book and a graphic novel (so I will use them interchangeably in this article), but let’s say that I have a collection of comic books that are very dear to me.
What a lot of people don’t know is that comic books are usually created by two people: the first one writes a script and the second one draws based on that script. Nowadays, more people are involved in the process, and there are artists who only do the coloring, the lettering, and so on, but these two are the main roles. That is why, when people name their favorite comic book authors, they are usually referring to the writer, the guy who wrote the comic book script.
For example, the Sandman Overture that I hold in my hands in the image above was written by Neil Gaiman, but illustrated by J. H. Williams III. So yes, Neil Gaiman is one of my favorite authors, but so are Grant Morrison (dare I put his name in the Alan Moore article?), Warren Ellis (even though I guess he is canceled now, Transmetropolitan is still one of my favorites) and self-proclaimed wizard Alan Moore. So, why did I pick Alan Moore? There are a couple of reasons.
Writing Style
Well, when writing a comic book script, authors usually follow one of two styles. The first one is more “free”, meaning the authors generally explain what is happening on each page or panel without…well, over-obsessing about details. This style is too ambiguous for Dall-E 2 to produce a good output. So, I picked the other style: an excruciatingly detailed explanation of what is happening in each panel. Alan Moore writes in this style. For example, here is the description of one of the panels from Youngblood:
As you can see, it is a pretty detailed description.
The other reason, of course, is that I love so many of Alan Moore’s novels, and you are probably familiar with some of them too. For example, he is the author of V for Vendetta (he hates this movie adaptation), Watchmen (he hates this movie adaptation even more) and From Hell (I will not repeat myself). Finally, I know that there is a real danger in messing with Mr. Moore’s work, so hopefully he will not summon Astaroth (yes, this is a Jimmy’s End reference) or something to mess with me.
For this experiment, I have chosen Killing Joke. It is a cult Batman story drawn by Brian Bolland, with a script that can be easily found online. Because recreating the whole comic book would be a hell of a lot of work, I decided to recreate only the first page. Here is the original page, which we want to recreate with Dall-E 2. Ok, let’s dive into it!
Generating Panel 1. with Dall-E
Alan Moore writes long descriptions. While this is really good if you are an illustrator, it is a problem for my experiment, because Dall-E 2 limits how long a prompt can be. Even the first panel, which is a description of a puddle, is lengthy. Here it is:
Here is how Mr. Bolland illustrated this first panel:
Yes, it is a perfect representation of what mighty Alan described. However, as I predicted, the text is too long for Dall-E’s prompt. So, I had to cut it down to something that Dall-E can consume. This is the best result that I got:
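The cutting itself was done by hand, but the mechanical part of it can be sketched in a few lines of Python. This is only a rough illustration, not the process used for this article: the function name `trim_description`, the 400-character budget and the sample text are all assumptions made up for the example, so check the actual limit of the tool you are using.

```python
import re

# Assumed character budget for a prompt; this is NOT a documented value,
# adjust it to whatever limit your image generator enforces.
ASSUMED_MAX_CHARS = 400


def trim_description(description, max_chars=ASSUMED_MAX_CHARS):
    """Keep whole leading sentences of a panel description within max_chars."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", description.strip())
    kept, length = [], 0
    for sentence in sentences:
        # Count one extra character for the joining space after the first sentence.
        extra = len(sentence) + (1 if kept else 0)
        if length + extra > max_chars:
            break
        kept.append(sentence)
        length += extra
    if not kept:
        # Even the first sentence is too long: cut it and drop the broken last word.
        return sentences[0][:max_chars].rsplit(" ", 1)[0]
    return " ".join(kept)


# Made-up description, not a quote from the actual script.
sample = "A puddle in the rain. Large droplets fall in diagonal slashes. A leaf floats by."
print(trim_description(sample, max_chars=50))
```

A sketch like this only keeps the opening sentences. It cannot decide which details matter for the image, and that was the hard part of the experiment.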
Even with this simple test, I immediately got the feeling that this would not go as smoothly as I thought it would. Apart from the problem with the length of the text, I realized that it is very hard to specify details in the image. For example, I failed to fulfill the requirement of “large droplets of rain that fall through the foreground in diagonal slashes”. Also, I failed to get the comic book quality of the image. So, yeah, I was sure I was in for a bumpy ride.
Generating Panel 2. with Dall-E
And here is what I managed to get with Dall-E:
The leaf, O MY GOD, the leaf. It caused me so much frustration. In general, getting Dall-E to focus on a detail, add a detail, or position that detail relative to the main elements of the image is hell. It is nice for experiments and playing around, but working from a concrete description is so hard.
This was the moment when I realized that if these tools take over the industry, scripts are gonna get much, much simpler. In general, I stripped away a lot of the descriptions that Alan Moore put in and focused on the basics. Basically, everything that the writer does to pull the illustrator into the world he is creating would be gone. I have mixed feelings about that, to be honest. Ok, let’s go to the next panel.
Generating Panel 3. with Dall-E
By the third panel, I got my rhythm. I was aware of the problems and got ideas about the main pitfalls. Also, by this panel, I had to buy credits. Just a minor note to plan your budget if you want to make comic books this way. Anyhow, here is the description for the third panel:
This panel was illustrated like this:
However, I realized there are more problems than I expected. The first one is that style is not easily transferred between panels. What I managed in the first two panels was to get that art-school feel sort of unintentionally. I added the words comic book at the end of each prompt, but by now that seemed kind of useless.
I think the problem here arises from the words that are used and the way that the model is trained. If you turn up Alan Moore and use words like “mournfully” and “sinister”, you will end up with images that look like they are from the 1800s. On the other hand, if you even mention “Arkham Asylum”, you will end up with more of a video game feel. For example, here is the result when you turn up the Moore:
What is the text underneath? I have no idea. Where is the batmobile? I couldn’t get it in. However, after playing around I got something that resembles the illustration in the comic book:
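For reference, adding the same style words to every prompt is plain string handling. Here is a minimal sketch, assuming a hypothetical helper called `with_style` and made-up panel prompts (these are not the prompts used for this page):

```python
# The style words appended to every prompt in this experiment.
STYLE_SUFFIX = "comic book"


def with_style(prompts, style=STYLE_SUFFIX):
    """Append the same style words to each prompt so all panels share them."""
    # Strip trailing spaces, periods and commas so the suffix reads cleanly.
    return [f"{prompt.rstrip(' .,')}, {style}" for prompt in prompts]


# Hypothetical prompts, for illustration only.
panel_prompts = ["A rainy street at night.", "A car pulls up"]
for prompt in with_style(panel_prompts):
    print(prompt)
```

Keeping the suffix in one place at least makes the prompts consistent, even though the wording of the rest of the prompt can still pull the style in another direction.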
Taking a break (part 1)
I got so happy that I got something substantial, so I decided to do something fun. Here is the prompt:
Alan Moore summoning Shaquille O’neal for a game of basketball
And here is the result:
Ah yes, that felt good. Back to work!
Generating Panel 4. with Dall-E
Ok, let’s go to panel 4:
At this point, let me just say that the script is fantastic. All my frustrations aside, I can imagine that if I were illustrating this, I would be super into it. The atmosphere is built in such a great way. Here is how it looks in the comic book:
This is the moment when I felt accomplished (well, at least happier with what I got than with the previous images). The feel was a bit like a comic book, and closer to the illustration, even though it deviated in style from the previous ones and had a huge bat in the sky. Hell, the bat in the sky is a plus from where I am standing. Here it is:
Generating Panel 5. with Dall-E
Here we go, panel number five:
And the illustration:
This panel came straight from hell (pun intended)! I got so many failed attempts. I mean, what are these:
Here is what I ended up with:
Not great, not terrible! Considering other options, this was gold. I am not happy with its “realistic” feel but attempts to make it more comic book-y failed miserably.
Generating Panel 6. with Dall-E
Ok, let’s go to panel 6:
And the illustration is here:
This one was not too hard to get, but I was not very happy with it. It goes well with the previous image, but I was still not able to reproduce the success of panel 4 when it comes to style.
Taking a break (part 2)
Alright, I finished two-thirds of the page, so I decided to spend some of the credits on pure fun. So here is the image of Alan Moore anchoring news with Glycon:
Generating Panels 7, 8 and 9. with Dall-E
Let’s bring this home! I am putting panels 7, 8 and 9 together because they were more of the same: picking up the text from the script and removing long descriptions so that it fits the prompt and Dall-E can “focus” properly. I wanted to leave panel 9 out of this chapter because…well because it…Let’s put it like this: if you want to see some really weird photos of Commissioner Gordon giving coffee to something that looks like the weirdest Batman ever, shoot me a mail. Here are the descriptions of panels 7, 8 and 9, respectively:
And here are their illustrations:
Aaaaaand my results:
What I couldn’t do, no matter how hard I tried, was get Batman or his cape into the final two panels.
Final Result and Conclusion
So, here is the comparison.
You can see that the one I really liked (panel 4) really messes up the style here. So, I will probably work some more on it.
In general, I really liked the experiment. There are two major thoughts I have here. The first one is that there will definitely be a niche of AI-generated comic books, both scripts and illustrations. My feeling is that their style will be similar, at least in the beginning. Transferring a specific style from one panel to another is going to be tricky, but that might be solved sooner rather than later. Maybe there will be a way to “lock” the style, something like with StyleGAN. For now, I can see this as a tool that might help authors visualize their scripts better, and help illustrators shorten their work by generating some details for each panel.
The second one is this: I don’t want to be gloomy, but all the things we love about scripts might be put aside for short descriptions that can be generated with AI. I can see a future where a comic book author uses a tool to generate a script based on a few keywords (take any GPT-3, T5 or whatnot) and then uses another tool to generate the illustrations, like Dall-E for example. Looking even further into the future, the authors themselves might be made obsolete, and an infinite amount of content might be created in this automatic loop.
Thanks for reading!