The Curse of Recursion: Training on Generated Data Makes Models Forget

Deliverator@kbin.social · 1 year ago

nsa@kbin.social · 1 year ago

If the effect is strong enough, it could seriously hurt LLM training in the near future, considering that more and more of the internet contains ChatGPT- and GPT-4-generated content and automatic detectors are currently quite poor.

Deliverator@kbin.social · 1 year ago

Yeah, it does not bode well for the future, especially combined with the current explosion of low-quality, profit-driven content. I fear that, left unchecked, we could approach some kind of Kessler Syndrome-style scenario, where the desire for rapid growth and profit poisons the well in the long term. “Garbage in, garbage out.”