
AI image creation: redefining original?

Stable Diffusion, the latest of many new AI image generation applications, is causing a stir through its ease of use and the size and sources of the dataset it uses. Is there utility for publishers in such tech?

by Rob Corbidge

Published: 14:43, 08 September 2022

Liz Truss as generated by Stable Diffusion

AI image creation, most obviously in the form of OpenAI's DALL-E and DALL-E 2 and its spin-off Craiyon, and more recently Midjourney, has opened a new area of creative exploration in recent years, exploration which has gone well beyond the technical.

Such systems are here to stay, and the creative industry will have to make some interesting decisions about their use and formulate ways to make them work without depriving those who produce original content of a living - or indeed stifling the very creation of original content.

At least the publishing industry has already been down similar roads with other forms of content creation - even if that journey is not complete, and AI-enabled image creation adds another complexity.

The public still seems to be at the "wow" stage with such technology, although it's interesting to note how quickly people move from "wow" to "yawn" over such technological marvels. Such tools are already making their way down to consumer phone app level, albeit in simplified form.

Stable Diffusion is the latest such application to generate digital images from natural language prompts. It differs from the others in some substantive ways, yet the primary one is that the whole thing is currently accessible to all and sundry. Try it.

The image accompanying this article was itself generated by Stable Diffusion. It shows the new UK PM, Liz Truss, as a Grand Theft Auto V cover art star. With a little digging, I was able to discover the type of prompts I should be feeding the system in order for it to match a particular art style, in this case the distinctive (and derivative) theme used by Rockstar Games. An entire prompt search engine lives here.

As it is obligatory for any piece about AI image creation to show something personal by the writer, I don't wish to disappoint. A few simple inputs - film noir, orson welles, fritz lang, shadow - attached to an unprepossessing image of this author produced the image below. It's a fascinating process, both startlingly simple and as complex as you wish.

AI image created by Stable Diffusion

Trained on the LAION-5B dataset - a dataset that contains 5.85 billion image-text pairs - Stable Diffusion is an impressive tool. However, an analysis of a sample of the LAION-5B dataset reveals some interesting things, such as that around 47% of the images were sourced from only 100 domains, with Pinterest the largest single source. A bit like Google image search was for an unfortunate while. If you're an artist, you can protect your work on the net with watermarks etc. If someone posts your work on Pinterest though, it's open season for scraping.
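For the curious, the kind of calculation behind a figure like "47% from only 100 domains" is easy to sketch. The Python below is a minimal illustration, not the method used in the analysis mentioned above: the function name top_domain_share and the sample URLs are made up for the example, and it assumes you already have a plain list of image URLs taken from a dataset sample.

```python
from collections import Counter
from urllib.parse import urlparse


def top_domain_share(urls, top_n=100):
    """Return the fraction of URLs hosted on the top_n most common domains."""
    # Count how many sampled images come from each host name.
    counts = Counter(urlparse(u).netloc.lower() for u in urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Add up the image counts for the top_n most frequent hosts.
    top = sum(n for _, n in counts.most_common(top_n))
    return top / total


# Hypothetical sample: placeholder URLs, not real LAION-5B entries.
sample_urls = [
    "https://images.example.com/a.jpg",
    "https://images.example.com/b.jpg",
    "https://example.org/c.jpg",
    "https://example.net/d.jpg",
]

share = top_domain_share(sample_urls, top_n=1)
print(f"Share of images from the single largest domain: {share:.0%}")
```

Run against a real sample with top_n=100, the same function would give the concentration figure for the hundred largest sources. Note that it groups by full host name, so subdomains of one site are counted separately unless you normalise them first.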

Stable Diffusion also has no restraints on generating images of people named in the dataset. That makes it more fun, with greater utility of course. There are 11,000 images tagged with Donald Trump for example. Yet such a lack of restraints could lead to creation of some seriously offensive images.

The uses and limitations of AI-generated images are not fully appreciated yet. That much is a certainty. Yet it's not difficult to imagine uses for the publishing industry, some of which are already in use - albeit experimentally - by those on the cutting edge. Stock imagery is a cost drain on most publishers, and many pieces of content require such imagery; alternatively, it is the preserve of art editors and artists commissioned to provide suitable work for the purpose.

AI image generation gives the ability to produce the kind of abstract images that can fit this purpose to order. A mastery of natural language prompts is required to get the best from such applications, yet it is already simply a matter of trial and error and knowledge accumulation. The ethical and legal boundaries around doing so are untested.

Such image creation leads to all kinds of places. Recently this particular horror show was threaded on Twitter, with all the madness of negatively weighted prompts leading to some very strange results, as was this combination piece (be warned - creepy). An art competition was recently won by an AI-created piece, prompting some fascinating discussions around originality, inspiration and skill.

It's clear we're only in the foothills of AI image generation. It's also likely that we'll see more video and audio equivalents released too. The world is getting mashed up.

Rob Corbidge • Head of Content Intelligence

Rob Corbidge is Head of Content Intelligence at Glide Publishing Platform, applying the latest knowledge about advances and ideas in the publishing industry to our own product and helping clients get the most from their content.