If you have started to wonder whether the answers from AI language model ChatGPT are getting progressively worse, you are not alone.
Several researchers who have been tracking AI products that rely on huge amounts of data scraped from the internet are sounding the alarm over an issue dubbed ‘model collapse’.
Data Gravity
Large language models such as OpenAI’s ChatGPT and Google’s Gemini are built with machine learning, a set of techniques for processing and analyzing the data they are trained on.
One of the major downsides of this approach is that it requires enormous amounts of high-quality input data to work as intended.
Data Sources
All that data has to come from somewhere. To find it, the companies building these models scrape terabytes of content from public sources across the internet, such as news sites and social media.
AI Web
However, Google, Meta, and others are running into a problem that had been predicted but was sidelined as the hype around AI grew.
Much of the text and imagery now uploaded to the internet is itself fully or partly generated by AI.
Blind Leading the Blind
The result is that newer AI models are increasingly being trained on content generated by other AI models, or even by earlier versions of themselves.
Jathan Sadowski, a researcher at Monash University, recently described a hypothetical self-trained AI as “an inbred mutant, likely with exaggerated, grotesque features.”
Evidence Mounts
More recently, a study published in Nature tested what happens when an AI model is trained repeatedly on its own output, and the results were concerning.
By the fifth cycle of retraining, the degradation in output quality was stark; by the ninth, the model’s output was almost completely nonsensical.
Rapid Decline
Dr. Ilia Shumailov of the University of Oxford, who researched the phenomenon with his team, was shocked at the rate of decline.
He says: “It is surprising how fast model collapse kicks in and how elusive it can be.” According to Shumailov, it initially “affects [badly represented] minority data” before spreading to the rest of the model’s outputs and eroding their diversity.
Model Collapse
‘Model collapse’ is the term for this phenomenon, in which AI models trained on their own output degrade over successive iterations until that output becomes meaningless.
AI has the potential to cause its own downfall. Shumailov warns that “model collapse can have serious consequences”.
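The core dynamic can be illustrated with a toy simulation. This is only a sketch, not anything from the Nature study: the 50-token vocabulary, the sample size, and the function names are all invented for illustration. A tiny “model” here is just a probability distribution over tokens, and each “generation” is retrained purely on samples drawn from the previous one:

```python
import random
from collections import Counter

def retrain_on_own_output(probs, n_samples, rng):
    """Toy 'training' step: sample tokens from the current model,
    then re-estimate the token distribution from that sample alone."""
    tokens = rng.choices(range(len(probs)), weights=probs, k=n_samples)
    counts = Counter(tokens)
    return [counts[t] / n_samples for t in range(len(probs))]

rng = random.Random(0)

# Start with a Zipf-like "language": a few common tokens
# and a long tail of rare ones.
weights = [1.0 / rank for rank in range(1, 51)]
total = sum(weights)
probs = [w / total for w in weights]

# Track how many tokens still have nonzero probability each generation.
support = [sum(p > 0 for p in probs)]
for generation in range(9):
    probs = retrain_on_own_output(probs, n_samples=200, rng=rng)
    support.append(sum(p > 0 for p in probs))

# A token that misses one sample gets probability 0 and can never
# reappear, so the model's vocabulary can only shrink over generations.
print(support)
```

This mirrors the dynamic Shumailov describes: poorly represented minority data is the first to vanish, and the diversity of the model’s output only shrinks with each iteration.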
Scale of Issue
One lesser-known aspect of the issue is the sheer scale of AI-generated content already on the web, as revealed by separate researchers at Amazon Web Services.
They found that as much as 57% of text on the web has been at least partly generated by AI.
Real Content
All of this means that it will not only become harder to find genuine, human-made content on the internet, but that the content we do find may become markedly worse.
So, are you an AI evangelist or a tech skeptic? What do you think of the growing proportion of AI-generated content on the internet, quite apart from its potential to scupper AI-driven language models?