Are we in danger of eating our own words?

By: Rob Corbidge, 23 March 2023

As publishers on the one hand look to generative AI content as a route to reducing production costs, on the other hand they are finding their own content being utilised by such systems without any deal in place.

In 1983, the price of safety matches in West Germany fell by around one third. In a deal that an impoverished Germany struck in 1930, its entire safety match market was handed as a monopoly - known as the Zündwaren monopoly - to the Swedish "Match King" Ivar Kreuger, and he in return arranged a loan of $125 million to Berlin. He didn't actually have the money at the time he cut the deal, but that's the way you become Match King.

The deal lasted over 50 years, until the last bond repayment was made, and in 1983 West Germans could once again light matches with joyous abandon, knowing the household budget was safe. The East Germans wisely didn't honour the deal. I suspect their matches still weren't as good.

This historical nugget is offered by way of illustrating bad deals. As deals go, it's a total and unmitigated stinker and a cast iron example of how the consequences of a bad deal can last a long time.

Publishers are currently on the nursery slopes of yet another challenge in the shape of generative AI. It is supremely important not to be distracted by bells, whistles, and buzzers, and to remember that publishers produce the raw material without which such systems cannot operate. To state it more simply, publishers control supply.

One recent, crystal-clear example shows us how this actually works. The reputable Tom's Hardware tech site revealed this week how Google's Bard used data that could only have come from testing done by the Tom's Hardware team to answer a question about the comparative performance of computer chips. It even tried to take credit for conducting the actual back-to-back testing process.

Bard did at least apologise when questioned on its sources. That's one thing about these AIs... if you interrogate them further, they often spill the beans in a way that leaves their own actions looking a bit second-rate.

But to bang an old drum, that's a company with a market cap of Jupiter stealing the in-depth research and testing work that a site such as Tom's Hardware (owned by Future) needs to do to keep its place in a hotly-contested part of the publishing market. To rub salt in the wounds, Bard didn't just pinch it, it tried to say it was its own research. Imagine if they try it with, say, a Taylor Swift song, or Coke's mysterious 7X, or a rival search engine's algorithm.

The tech wizards in line for the billions say it's called "training" and we shouldn't be upset about it. In that case, anyone want to let me train moving money out of their wallet?

We know that a number of larger publishers and content producers are now addressing the use of their content for such systems and training.

"We have valuable content that's being used constantly to generate revenue for others off the backs of investments that we make, that requires real human work, and that has to be compensated," Danielle Coffey, executive vice president and general counsel of the News Media Alliance in the US, told the Wall Street Journal this week.

The same report indicated that Reddit has discussed the use of its content in AI training with Microsoft. Lucky Reddit.

At the same time as this is happening, we're all aware of various publishers looking at generative AI to assist them in reducing costs. That is entirely understandable, and it is supremely likely that such systems will find a place in the production of some content very quickly.

It does, however, offer the prospect of Content Cannibalism, does it not? Eating our own words. Or images. Or video. Or in-depth CPU test results. Even in the very broadest sense of content going into a giant training vat for the AI to sieve out and give back to us in the form of enhanced predictive text, the finer the content grade is, the more obvious the derivation will be.

Even as deals for content use are thrashed out with tech giants by various national governments anxious to keep some kind of free and functional media, publishers need to see the value of their content to this newly accessible technology and push for a good deal with the generative AI big dogs before they have to fight the same battle again.

Let's not be left holding a box of soggy matches when it was us who helped light the touchpaper.