An Eye on AI

by Paulo Andreas Lorca | Jul 24, 2024

“Vampire (Corpus: Monsters of Capitalism) Adversarially Evolved Hallucination” (2017), Trevor Paglen, Dye Sublimation on Aluminum Print, 60 × 48 in. Courtesy of the Artist, Altman Siegel, San Francisco and Pace Gallery.

What makes AI images a strange invention is their innate inscrutability. What we see in these pictures cannot be said to have existed. Unlike photographs, AI images do not result from light falling on material surfaces; they are functions of algorithms and databases, operating entirely at the digital level. AI generators draw on images whose reference to reality has already been translated into series of zeros and ones, the fast-paced, pixelated stuff of computers. By and large, these models are fed millions of pictures in order to generate new ones through a process of synthesis. The term “AI art” is often thrown around to encompass all the images that result from these complex processes. The term is misleading, however, because it aims solely at how the images appear, based on their mimicking of various art styles. It also obfuscates core differences in how these generators operate, differences that offer a key to better assessing whether or not such images can be considered art. Suppose one understands art as a human capacity to bring new things into being (or old things into new existence) to afford us a kind of freedom—of thought, of experience. In that case, given their propensity to minimize the role of human agency in their generation, AI images should be subject to greater scrutiny when we gauge their potential for art, lest we find ourselves in a state of arrest under another of artificial intelligence’s false promises.

For some time now, art institutions have sold and exhibited these images, legitimizing their status within the art world. In the early 2020s, the public release of text-to-image generators such as OpenAI’s DALL-E, Midjourney, and Google’s Imagen opened the door to both egregious and provocative examples. Take Théâtre D’opéra Spatial, a game designer’s Midjourney-generated piece that was awarded an art prize at the Colorado State Fair. It is a flashy, derivative image, indistinguishable from the kind of concept art one can easily find on websites like DeviantArt; it is a perfect example of the potential of such images to become generic. In contrast, Boris Eldagsen’s The Electrician, a black-and-white photorealistic image produced with DALL-E 2, presents an eerie vision. It depicts an older, impassive-looking woman grasping the shoulders of a younger one, who stares off into the distance. The enigmatic title adds an air of mystery to the image, as it bears no connection to what we see. Eldagsen’s AI-generated image won a 2023 Sony World Photography Award. Whereas the creator of Théâtre D’opéra Spatial accepted the art award, Eldagsen rejected his prize, claiming that his work was not photography but “promptography,” that is, images created by writing prompts (playing on photography’s literal definition as “writing with light”). These “promptographs” are the product of text-to-image generators that combine natural language processing and computer vision to illustrate words. However, how and why these AI-generated images can (or should) be considered works of art in the first place has remained underdiscussed.

AI images generated through stable diffusion models can more appropriately be called “synthography.” Media artist and scholar Elke Reinhuber coined the term to better encapsulate the nature of images produced by the synthetic methodologies of AI algorithms. Here, the prefix syn– underscores the synthesis that stable diffusion programs execute on command, while the suffix –graphy retains the visible remnants of photography and drawing. Synthographic examples, such as the AI images discussed above, conceal the processes of their mimicry under the masquerade of “style.” What enables unrestricted models like DALL-E 3 to carry out “style transfer”—the capability of a user to produce, for example, an “impressionistic painting” with the proper prompt—is the indiscriminate harvesting of large online image datasets with prefigured nested categories, such as ImageNet and LAION-5B. These vast pictorial collections, which carry with them preset categories and labels, are used to train AI models for tasks such as “machine vision”: they pair text and image so that a model can discriminate between a pear and an apple, or a Manet and a Seurat. An uncritical use of these models, then, surrenders a vital part of the creative process, since what is at stake is not the intrinsic limitations of a medium but the adoption of a preset way of seeing the world in pictures, always already labeled and cataloged by programmers. Ultimately, even Eldagsen’s promptographic works, surrealistic as they may seem, can be viewed as “failures” in this sense, inasmuch as they fail to liberate themselves from the opaque tenets of the machine’s eye.
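
To make the mechanics concrete, here is a minimal sketch of prompt-driven “style transfer” using the open-source diffusers library with a publicly released Stable Diffusion checkpoint. The model ID and prompt are illustrative, and the snippet assumes a CUDA-capable GPU; the point is that the “impressionistic painting” arrives prepackaged in the model’s learned categories, not in anything the user draws.

```python
# A minimal sketch of prompt-driven synthography with the open-source
# diffusers library. Model ID and prompt are illustrative assumptions;
# requires a CUDA-capable GPU.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # a public checkpoint trained on LAION data
    torch_dtype=torch.float16,
).to("cuda")

# The "style" is invoked by name: the model maps the word "impressionistic"
# to whatever its training labels attached to that category.
image = pipe("an impressionistic painting of a pear and an apple").images[0]
image.save("synthograph.png")
```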

Ironically, an alternative artificial intelligence technology that could stake out a more honest claim to art may already be obsolete. In an op-ed last year, AI researcher and image creator Ahmed Elgammal proclaimed the death of AI art and attributed its downfall to the widespread use of text-to-image generators. He is precise about the short window in which AI art was possible: 2017 to 2020. This was the era of generative adversarial networks (GANs), AI systems in which two neural networks are pitted against each other to discover underlying structures in a dataset and produce new data from them. Unlike modern text-to-image generators, GANs are trained on discrete, controlled, and sometimes unlabeled sets. This model allows users—such as artists—to create an “uncanny aesthetics” of error and serendipity. What results does not simply follow the command of a text prompt; it can also diverge from the artist’s expectations. Conversely, prompt-based models are bereft of this uncanny quality, Elgammal claims, because their visual output relies too heavily on linguistic confinement. Errors do remain in text-to-image AI models—their generation of odd-looking hands is emblematic of this—but error optimization turns the uncanny into a kind of predictive surrealism. Or, as Elgammal eloquently puts it, in prompt-based models, AI becomes “more like us—no longer able to see the world with an eye that complements or challenges us.”
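
For readers curious about the mechanism, a deliberately toy sketch of this adversarial game appears below, written in PyTorch. A two-dimensional Gaussian stands in for an artist’s curated image set, and every name and hyperparameter is an illustrative assumption rather than a reconstruction of any artist’s pipeline; what matters is the structure: two networks, one generating and one judging, locked in a contest.

```python
# A toy adversarial setup: a generator and a discriminator trained against
# each other. A 2-D Gaussian stands in for a curated image set; every name
# and hyperparameter here is illustrative.
import torch
import torch.nn as nn

latent_dim, batch = 8, 64

generator = nn.Sequential(
    nn.Linear(latent_dim, 32), nn.ReLU(),
    nn.Linear(32, 2),                      # emits fake "data points"
)
discriminator = nn.Sequential(
    nn.Linear(2, 32), nn.ReLU(),
    nn.Linear(32, 1),                      # emits a real-vs-fake logit
)

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(batch, 2) * 0.5 + 2.0      # the curated "training set"
    fake = generator(torch.randn(batch, latent_dim))

    # Discriminator: learn to tell the curated set from the generator's output.
    d_loss = loss_fn(discriminator(real), torch.ones(batch, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: learn to fool the discriminator. The unstable push and pull
    # of this game is where the "uncanny aesthetics" of error comes from.
    g_loss = loss_fn(discriminator(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```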

While Elgammal’s diagnosis seems correct, and his concept of AI as a “challenging eye” is promising, his argument deserves further commentary. His critique recalls the perennial question of novelty and creativity. In the common imagination, creativity is still equated with an inscrutable, innate talent or inspiration, closer to a Romantic ideal force—or spirit—waiting to descend on a select few. It is associated with madness and inspiration rather than acumen and restraint. It is, therefore, to be expected that today such a “spirit” would look for its sheltered domain in AI, as a ghost in the machine. This is a mystification—or a “bewitchment,” as Ludwig Wittgenstein would put it—enacted by the language we use to speak about generative AI nowadays, with its “neural networks,” “hallucinations,” “machine learning techniques,” and supposed collaboration in creativity. And it is partly a scientific reprisal, not by computer scientists in particular, but by a more ostensibly scientific era, whose attitude is perhaps a response to artists who have always asserted a privileged seat on creativity over science. Elgammal is right to speak of novelty and death concerning this issue. The Czech poet and immunologist Miroslav Holub, who would rather “be caught dead” than admit to feeling anything like creativity, noted, “The first artifact of its kind can be the result of creativity, but not the second.” It follows from this elegant premise that neither art nor science has an exclusive claim to the creative and, to our point, that the irreversible loss of innovation Elgammal locates in the passing of GANs stands true. But this loss comes not from the relative eeriness and unintentional deformity of their images; rather, it arises from the “first artifact” that afforded artists the capability to create in their own right, without yielding to the presets and culturally embedded categories of text-to-image models that merely deal in predictable imitation—that blindest of blind alleys. Art cannot be found in that estrangement.

How and under what conditions, then, can AI produce art?

A paradigmatic case of conceptually rigorous AI-generated art is Trevor Paglen’s series collectively titled Adversarially Evolved Hallucinations (2017–ongoing). For this work, Paglen and his studio combined GAN-based image-recognition and image-generation networks, training them on a series of smaller sets built from subjectively constructed taxonomies and concept clusters taken from psychoanalysis (Interpretation of Dreams), fiction (Omens and Portents), economics (Monsters of Capitalism), and other fields. A sort of free association is at play here, with categories encompassing a range of possible subjects. For example, in the training set for Corpus: American Predators, Paglen included images of carnivorous plants and animals, drones, stealth bombers, and Mark Zuckerberg. The resulting series is a strange collection of warped visions representing how these networks “see” through the lens of subjectively arranged sets. For these machines, seeing becomes a “hallucination”—a controversial term designating inaccurate or erroneous output that strays from the user’s expectations. In Paglen’s series, hallucination is foregrounded as the default activity of these models, highlighting the inherent arbitrariness of such programs.
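
To see how literal this curation can be, the sketch below shows one plausible way to turn a subjectively constructed taxonomy into a machine-readable training set with standard PyTorch tooling; the folder names are invented for illustration and are not the studio’s actual data. The taxonomy is simply a directory tree.

```python
# Hypothetical corpus layout echoing Paglen's curated sets; the folder
# names are invented for illustration, not the studio's actual data.
from torchvision import datasets, transforms

corpus = datasets.ImageFolder(
    root="corpora/american_predators",  # drones/, stealth_bombers/, pitcher_plants/, ...
    transform=transforms.Compose([
        transforms.Resize(64),
        transforms.CenterCrop(64),
        transforms.ToTensor(),
    ]),
)

# Each subfolder becomes a class index: whatever the curator decides belongs
# together is what the network will learn to recognize, and to hallucinate.
print(corpus.classes)
```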

One particularly sinister image in this work is Vampire (Corpus: Monsters of Capitalism). For this series, Paglen curated the training set to include “images of monsters that have historically been allegories for capitalism.” The central allegory here is Karl Marx’s analogy between vampirism and capital, which he characterized as “dead labour” that, “vampire-like,” lives “only by sucking living labour.” Yet the set was not limited to pictures of Bela Lugosi. It also included images of octopuses, squids, zombies, and other creatures, expanding the visual possibilities of the resulting images. That the analogy would need unpacking, that the monsters chosen by Paglen may or may not be traditionally accepted allegories of capitalism, is precisely the point the series makes: to expose the vexed decisions commanding AI from within. Furthermore, the series seems to ask whether the GAN can really recognize (and “hallucinate”) an allegorical monster instead of simply depicting one, as text-to-image generators do.

Vampire depicts an incapacity to represent, at least cogently. On the screen, its sole “eye” looks curiously real. What appear as brushstrokes of vivid orange hues and shades of black and blue nestle the “eye” under an eyelid. In the upper-right quadrant, where one would expect a second eye to complete a face, there is a cut-out, an irregular circle, like one a child would cut out of a white blanket to dress up as a ghost. The face looks like a papier-mâché mask with three holes: two eye sockets and a nasal aperture (not unlike a skull’s). From that point, the face inexplicably flattens and seamlessly gives way to a lurid patch of red, tissued by veiny threads of pink. A tongue? Its deviation from the anatomical standard would dissuade us from calling it that. One particular detail renders the image even more uncanny: the eye’s pupil is strikingly dilated. Ophthalmologists would tell us that such mydriasis is often caused by being and seeing in the dark (like a vampire), when it is not the result of hallucinogens or trauma.

Vampire is undoubtedly an odd output of the AI’s “hallucination.” It does not show what we would recognize as a vampire per se but depicts the eye of a nocturnal being. With its “uncanny aesthetics,” the image aligns with a longstanding theme of Paglen’s work: visualizing invisibility. This particular piece not only deconstructs the process that goes unseen and unknown in most AI image generators, but it also dramatizes the symbolic potential of AI models. A “challenging eye” is present in the eye of Vampire, in the imaginative convergence of serendipity and creative control that GANs allow. In that realm of possibilities, AI should not be understood as a mechanical device and nothing more. With the camera in mind, the French poet Paul Valéry reminded us that technology exerts an effect on the destiny of art “by creating unheard-of new methods of employing the sensibility.” AI technology can present these methods and new solutions to artists exploring art’s raison d’être, namely, to try to answer the perennial question of what art is. Yet, artists can only offer us true deliverance by liberating their tools from the underhanded tyranny of programmed ideologies. Only then can the challenging eye of the machine truly enter into a conversation with art.

Paulo Andreas Lorca

PubLab Fellow 2024

Paulo Andreas Lorca (https://aikphrasis.com/) is a translator, editor, academic, and writer. He earned his PhD in Romance Studies from Cornell University and has made significant contributions to the field of Latin American art and culture through his scholarly publications. As a co-director of the Iberoamerican section of Revista Otra Parte, he frequently writes on contemporary art and literature. His current project as an editor is The Presence of This Breath, a collection of poetry by the writer and artist Clive Barker, spanning his entire career.

Trevor Paglen

Artist

Trevor Paglen (b. 1974, Maryland) (https://paglen.studio/) is an artist whose work spans image-making, sculpture, investigative journalism, writing, engineering, and numerous other disciplines. Paglen is the author of several books and numerous articles on subjects including experimental geography, artificial intelligence, state secrecy, military symbology, photography, and visuality. Paglen’s work has been profiled in The New York Times, The New Yorker, The Wall Street Journal, Wired, The Financial Times, Artforum, and Aperture. In 2014, he received the Electronic Frontier Foundation’s Pioneer Award, and in 2016, he won the Deutsche Börse Photography Prize. Paglen was named a MacArthur Fellow in 2017.