AI: Your Own, Personal Art Department
If you pay any attention to what the media has been saying, generative AI is poised and ready to replace every single creative profession. Any minute now, the rapid advance of this technology means that every illustrator working today — from those creating children’s books, to those cranking out conceptual, clever editorial images for The Economist — will be replaced by someone typing a few words into a text field. The designers, copywriters and musicians are next.
I’ve been working in digital media, personally and professionally, for nearly three decades: creating, designing, animating and coding. I also hire and manage digital creatives, and I’m fascinated with creative technology, so I’m emotionally invested in where this is all headed.
I’ve also been keenly watching, and speaking on, the development of AI-driven creativity for almost 10 years now. Over the past year, I’ve created artwork in Snowpixel, Wombo Dream, Disco Diffusion and other AI platforms, and I’ve now spent a couple of weeks immersed in the beta release of DALL-E 2, a state-of-the-art platform from OpenAI.
In this article, I’ll share my impressions of the state of things, the creative process I use, and how I see AI empowering artists and designers.
So, is this the Artpocalypse?
Not yet. It’s clear that this technology is fast approaching the point of “if you can dream it, you can probably generate a convincing image of it” — whatever it is, and for better or worse (the implications of “worse” here are a whole other topic).
And, you can have it rendered as any kind of photograph you like. Or any kind of drawing, or painting, or as a 3D render; or as a vintage product catalogue illustration; or as scrimshaw carved into whale bone; even written in the stars...
But that doesn’t mean you’re out of a job. After spending a couple of weeks with DALL-E 2, I’ve come to a more nuanced understanding of where we’re at. And for now, producing quality imagery using AI is not as simple as typing a few words into an input field. Well, not yet. (Or at least not every time.)
As it happens, there is a process here, and the steps require the kind of skills that make good creatives, well, good.
In the case of AI, I’ve found that success comes down to our separate but related abilities to (1) envision the work, (2) communicate it, and (3) curate the output.
Step 1: Envision
Before you can generate an image using AI, you need a subject. But this “simple” step can turn out to be quite complex, because first you need to determine whether your idea is literal or conceptual.
Literal subjects are relatively easy for DALL-E 2, even when they’re absurd
Literal subjects, like “a man riding a bike”, tend to provide rather predictable results. This is true even when the scenario is more absurd, like “a fish riding a bike”.
But if you ask for an image that communicates a concept, like “Less is more”, you may find it impossible to predict or control the results.
For your image to convey a metaphoric meaning, you will likely have to do the hard work of developing a good concept in your mind first, and then describe to the AI, as a literal image, the results you want to see.
The concept of “wealth inequality”, for example, may conjure up images for you personally, but as a text prompt typed into an AI’s input field, those two words alone are unlikely to generate many successful results.
Instead, to communicate our concept, we might imagine two different families, in two different states of economic security: one wealthy and one destitute. The prompt, “A wealthy family next to a poor family”, will generate a literal image that, hopefully, conveys the concept you developed. (You’ll need to be far more descriptive than this to get good results, however, and we’ll look at that next.)
As always, developing visuals that are compelling, that deliver a specific message, and that make an emotional impact still takes a special kind of thinking that our machines aren’t very good at yet, so creative humans have a big role to play.
Step 2: Articulate
Okay, so you have an idea for your literal image: “A wealthy family next to a poor family.”
It’s a start, but it’s short and open-ended, which can lead to weak results. They say a picture is worth a thousand words, so maybe eight isn’t quite enough.
In this stage, the creative challenge is in determining how to describe the image in just the “right” way, and that often means getting very specific about certain details.
For example, if we’re choosing to render our two families as a “photograph”, then what is the setting for the scene? A restaurant? A mall? Maybe isolated on black in a photo studio? Or perhaps a “candid street scene”?
If we’re on the street, then where is the camera positioned? Are we looking down on a seated homeless family, or up at the wealthy walkers from down on the sidewalk? In short, who are we meant to identify with?
Going further, are you embedding additional meaning into the image, intentionally or otherwise? Might your final image appear to demonize the wealthy family? The homeless family?
Interrogating further, how old is each person? What races are they? What genders? Do your choices reinforce unfair biases, or counter them?
(Time will tell, but maybe grappling with precisely what and how to render, and the implications of those choices, is where the real “art” in AI art will be found.)
Keep in mind, too, that AIs have been trained on millions of images from our own popular culture, which often fails to represent the real diversity of our society. You might need to push against these biases in your prompts. (OpenAI seems to be going to great lengths in its quest to make an ethical AI, but it has issues too.)
Of course, you’ll need to keep your own biases in check, too. You’re not photographing an actual family on the street; perhaps worse, you’re simulating one, depicting a “reality”, and that comes with responsibilities. Or it should, if you’re trying to be ethical about it.
Once you’re past all those pesky content and composition questions, there’s still the desired media to consider.
Is this piece best suited to illustration? Photography? A screen-printed, two-colour, revolutionary propaganda poster? Or maybe a glossy 3D-rendered scene from a Pixar movie? How about a painting made by both Vincent van Gogh and Georgia O’Keeffe at the same time?
With the entire history of Western media (kind of) at our disposal, we can render almost anything we can think of, “in the style” of nearly any artist, art movement, and media.
As an artist working in traditional media, or even digital media, you may have learned to limit the scope of your creative vision to things you’re fairly confident you can actually execute. Previously, I might never have dreamed up something I couldn’t create myself. Now I do.
With so many variables to (potentially) consider and address in your prompt writing, editing a text prompt again and again to arrive at just the “right” image can be one of the biggest challenges. But it’s also like solving a fun puzzle with a special reward at the end. For best results, be open to what the AI suggests too, even when it’s not what you had intended. Good ideas can be spotted anywhere.
The prompt for the above image was:
Looking down the center of a narrow street. Broken fences and old-fashioned houses, run down and boarded up, line both sides of the road. Crooked telephone poles reach into the sky, their many wires criss-crossing the sky. Trash litters the sidewalk. Black and white photograph. Film noir.
(Too bad I forgot to add “at night”, which was what I had envisioned.)
The DALL-E 2 Prompt Book by Dallery Gallery is an excellent starting point and desk reference for refining prompts with art styles, camera angles, descriptive adjectives, and more.
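Because subject, setting, camera angle, and medium multiply into so many possible prompts, it can help to explore them systematically rather than retyping one string over and over. Here is a minimal sketch of that idea in Python; the categories and their example phrases are my own illustrations, not any official prompt vocabulary:

```python
from itertools import product

# Illustrative prompt ingredients: a literal subject plus a few
# settings, camera angles, and media treatments to vary.
subject = "a wealthy family next to a poor family"
settings = ["on a city sidewalk", "isolated on black in a photo studio"]
angles = ["shot from street level, looking up", "overhead view"]
media = ["candid street photograph", "two-colour screen-printed poster"]

def build_prompts(subject, settings, angles, media):
    """Return one comma-separated prompt per combination of ingredients."""
    return [
        f"{subject}, {setting}, {angle}, {medium}"
        for setting, angle, medium in product(settings, angles, media)
    ]

prompts = build_prompts(subject, settings, angles, media)
print(len(prompts))   # 2 x 2 x 2 = 8 variants to try, compare, and refine
print(prompts[0])
```

Each variant can then be pasted (or sent programmatically) into your image generator of choice, and the ones that come closest to your vision tell you which ingredients to keep refining.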
Step 3: Curate
Even with a clear idea in mind and well-crafted prompts, you can still struggle to arrive at just the right image. Curating the resulting images (and refining or editing them) is the third step here, and in some ways it may be the most important.
Knowing what makes an image “successful” means having a command of art, design and communication skills; a messy combination of taste, creative strategy, aesthetic prowess and more.
For any image, you may need to ask: Is it appealing? Does it match the vision (or the “brief”)? Does it communicate the idea clearly? Is it holistically consistent with the look of the project (or “brand”) overall? What would make it even better?
Rather than removing me from the equation, my recent experiences creating AI art have made it clear that my role may be shifting to a more heavily conceptual and imaginative one. We seem to be continuing along a course that will see many creative professionals become a new kind of Art Director: one that steers an AI creative partner.
Our abilities to envision, communicate, and judge good content will become increasingly important, whether we’re talking about AI-generated images, videos, music, or copy. In exercising them, we’ll be bringing our professional experience, taste, and humanity to bear.
New Creative Powers
Ironically, while many of us are growing worried about being kicked out of our own industry by The Robots, I’m finding the power of AI to be a liberating creative force for me, personally.
Because while AI may be able to replace the entire art department, the flip side is that, as an individual creator, I now have my own personal art department at the ready, 24/7.
With DALL-E 2, I am able to crank out literally hundreds of custom, ready-to-use production assets for a short animated film — in just a few short hours. And I didn’t spend one lousy second searching for stock photography.
Most of the developers I know would quite happily welcome an AI that takes on the more tedious parts of their job (like hand-coding CSS).
Like them, creative and design professionals may benefit from learning to look at AI as a creative partner; one that can augment and/or automate select portions of our process.
Generating patterns, whipping up colour palettes, providing rough character ideas, pumping out inspiration for set designs… the list of production tasks that may benefit from an AI collaborator is practically endless.
Smartphones didn’t entirely replace the professional photography industry, but they did give everyone access to a camera, at all times — and that’s come to have a massive impact on our culture and media. (Plus, everyone’s becoming a really decent photographer, now, thanks to the AI in our phones.)
Right now, it may be too soon to clearly see how AI-driven creativity will be integrated into society, culture and commerce. But like many technologies before it, generative AI seems poised to “democratize” parts of the creative process, opening it up to anyone and everyone. And that’s not necessarily a Bad Thing, in the long run.
At this stage, maybe the question of whether AI has doomed all creative professionals to a career change isn’t necessarily the most interesting one to ask.
Instead, I’m deeply curious how AI can augment, alter, and empower the processes we engage in right now, today.
Today, digital creatives are at a unique point in time where we can learn to leverage these systems for our own benefit.
Rather than looking at AI as your foe, see it as your partner, and collaborator. Harness what it’s good at, starting today, so that you can start to gain fluency with AI as a tool in your personal kit.
I’m pretty sure we humans will continue to corner the market on hard-to-code-for qualities like empathy, a holistic perspective, and that elusive thing called “taste”, at least for a while longer.
So for now, you keep using that big, mushy, human-centric brain to make the work Actually Good™, and put the robots to work for you.