• SaucySnake@lemmy.world
    5 months ago

    https://arxiv.org/abs/2306.07899 Here’s a paper that found that one of the biggest sources of LLM training data, human crowd work, is being corrupted by workers using AI to complete their tasks. Plenty of other papers show what happens when models are trained on AI-generated output like this, an effect called “model collapse”.