What is the value of tech without content?

By: Rob Corbidge, 14 December 2023

[Image: a haptic-gloved hand resting on a book, Aztec art]

Thinking about the language used in the content landscape makes it clear that tech has the upper hand over creators. But without content, what is the tech for?

Is content more valuable than tech? We are currently in an epoch where the power to transmit has gained more importance than the transmission itself. It's not that content is entirely devalued; rather, technology, by making content creation accessible to all and consumable by all, enjoys an upper hand that is supported largely by perception rather than reality.

There is no such thing as a TikTok video, or a Facebook one, and so on. They are just videos posted on the respective platform. They could be posted elsewhere. 

The art of tech is in keeping users on a particular platform, hence the recommendation alchemy of TikTok being a source of much intrigue. Yet the tech is nothing without the content. You could have filmed your super-viral funny on an 8mm cine film camera, and it would still exist on that film even if it was never distributed.

Outside of the Machiavellian T&Cs associated with all content hosting platforms, and goodness knows they'd make Machiavelli himself blush, the actual content itself remains the property of the creator. Yet, as publishers know all too well, once you're neck deep with the platforms, they decide which way your head revolves.

Such thoughts about the value of content, the record of the activities of the planet's most exotic fauna (humans), not being recognised properly were provoked by a post on X. (Note that X is just another platform, its value residing almost entirely in its user base.)

[Embedded X post: Twitter LLMs]

If we follow the logic of the misunderstanding around AI now so prevalent, and accept the current generation of AI as a kind of sentience, then, in reverse, our mothers' first words to us are just training data, right? Shouldn't all such sets of first-heard words be coalesced into a training dataset, so that all humans could be trained on a global standard dataset of first-heard words?

An absurd example, of course, yet every piece of content, of whatever format, used in such training carries its first moment of creation within it. And that first moment involved a human.

"Training data" is itself a misnomer, although once again the power of tech is in evidence: it is the technological process that chooses the words used to describe things, with all the merciless brevity a good engineer must apply when building a system.

Yet it is a diminution useful to those crawling all our content for their profit, as it reduces the collective endeavour of thousands, or more, into two words; the merciless brevity of engineering is then used as blink-of-an-eye cover for wholesale looting.

This, then, is evidence of how tech strengthens its hand, even unwittingly, by dictating the language used to describe fundamental elements of any discussion. It's not necessarily wrong or right; it's just how it is, and that consideration must form part of any thinking about AI systems.

Even the modest modification from "training data" to "training media" brings that data a step closer to being something someone actually spent part of their limited lifespan creating.

LLM systems are amazing, and they have utility, yet ultimately they are the sum of their data, and still limited by the way they can use that data. In effect, the "training data" they consume has already been filtered by one of the most complex systems in existence, one that assigns value with a capacity the Transformer at the heart of an LLM cannot remotely approach.

Happily, this pre-filtered content is also arranged in a hierarchy, a hierarchy somewhat in flux and constantly curated by humans, with a great deal of related content. It's called culture.