This seems inevitable
But what fraction of AI-generated content is the threshold for collapse? And how is that fraction measured? How much new input is necessary to make sure the model does not overlook it?
Best term I’ve heard for this is “Hapsburg AI”
It would be the ultimate irony if it goes this way: the singularity turns out to be inherently impossible, and instead of an exponential explosion of machine intelligence, there is an exponential implosion.
I’ve posted a few GPT-generated poems on Reddit. If some next-generation LLM uses those as its training data, it’s not going to get much better than whatever LLM I used back then.
Actually, it might just improve a little bit, but not much. I was the editor of those poems, which means I didn’t accept just any random garbage GPT gave me. It involved a few iterations until the poem was good enough for me. Still probably nowhere near what an actual poet would have written.
My poems were mainly intended to discuss very concrete topics in an entertaining manner, whereas real poems by real poets tend to use complex symbolism and reflect on the very core of the human soul.
It would be ironic if companies had to pay people to generate clean content to train the AIs.