
The publisher's AI dilemma: my enemy's enemy is my enemy

While battling AIs are stealing content wherever they find it, would suing some into oblivion just make the survivors more powerful?

by Rob Corbidge
Published: 12:40, 21 June 2024

Last updated: 12:52, 21 June 2024
[Image: A group of angry people fighting each other]

We currently exist in a tech epoch in which it's quite possible over a single morning coffee to absorb the news that the Pope has given his qualified blessing to AI, a US bank was revealed to have an AI system which calmed stressed workers by showing pictures of their family - I am sure Pavlov showed us where this might end up going - and McDonald's paused an AI drive-thru project after some "issues". If that morning coffee was from McDonald's, perhaps take a sieve.

Developments in the AI arena are so frequent and varied, and reported on with such simulated insight, that it's tricky to separate the wheat from the chaff. Wheat there is for sure, but likely outweighed at stupendous ratios by hype-powered chaff.

Back in publishing's buffeted world, we find "AI answers" company Perplexity surfacing barely concealed content from Forbes as its own work, even citing aggregated versions of the same original story as its sources to show it had done its homework. It seems to be saying it is using AI to summarise other AIs, so that's OK then.

Having been involved in my formative years in what I can only describe as an informal homework copying cartel, I can see an obvious flaw in presenting copies of copies to prove you didn't copy something. You can't parse cunningness into training data, yet.

We're happy to see that Forbes aren't messing around and are demanding Perplexity "remove the misleading source articles, reimburse Forbes for all advertising revenues Perplexity earned via the infringement, and provide 'satisfactory evidence and written assurances' that it has removed the infringing articles".

Randall Lane, Chief Content Officer at Forbes Media, asserted to the AP that the dispute was an “inflection point” in the conversation about AI.

“It’s a case study in where we’re heading,” Lane told the AP. “If the people who are leading the [AI] charge don’t have a fundamental respect for the hard work of doing proprietary reporting, and keeping people informed with value-added content, we’ve got a big problem.”

Perplexity's boss Aravind Srinivas has been on the defensive, saying they're "trying to build positive relationships with news publishers... we can definitely coexist and help each other." 

He's right to be agile in his dealings: although Perplexity has already earned a valuation of $1bn, a suit this early in its life could prove highly damaging to its trajectory.

The scenario presents me with a bit of a conundrum.

As a tech optimist I generally welcome promising entrants to markets dominated by super-funded incumbents. Conversely, I'm not shy about airing my views on AI piranhas hoovering up content and the consequences they should face. 

But while Perplexity may be small fry in a world of sharks such as OpenAI and Google, the one goal they all share is to be both the start and end point of any given query - at the expense of the very publishers which have helped create their AIs.

To complicate things yet more, I fear that cases of this type risk entrenching the most egregious content ripping culprits as market leaders, making the whole problem that much worse. 

Is it possible we will ultimately see the big players get away with content-harvesting murder, while less powerful start-ups are hobbled by vigorous publishers rightfully defending their IP?

Backing from Microsoft or Google is like an unlimited supply of steroids for legal teams, the AI lawsuit equivalent of the joke about two tourists running from a lion: "I don't need to outrun the lion, I just need to outrun you." 

Are publishers doing the work of the AI giants for them?

Before we are all moved to tears at plucky Perplexity being picked off from the AI herd so more dangerous pack members can feast more voraciously, an excellent analysis by Wired, gloriously headlined "Perplexity Is a Bullshit Machine", offers some unpleasant insights into Perplexity's practices.

It would seem that Perplexity's site crawler ignores the fundamental protocol of adhering to instructions in website robots.txt files, and helps itself to website content by scraping what it has been told not to.

This is a known risk, of course, because robots.txt files are about as legally binding as Christmas wishes to Santa, but that's still some underhand stuff. Another aspect of the internet that relied on trust is nudged towards regulation, I'd wager.

But therein lies the rub: some of those with their eye on AI hegemony think it's better to steal first and defend later than to move slowly and be honest.

Early players were able to harvest petabytes of content before those that produced it had even woken up to the threat. To gain some competitive traction, basic fair use rules were trampled without a moment of worry about the consequences: witness Google's lack of action over the harvesting of YouTube transcripts by OpenAI, because they were doing it themselves, on the quiet, with anything else they could lay their hands on.

This is what publishers are up against, like shepherds tending a flock surrounded by wolves. 

It's an AI gold rush where content is the gold, and constant vigilance is the name of the game.