• Rentlar
    3 months ago

    …studies have found that this process can amplify biases in the data and is more likely to erase data pertaining to minorities.

    Later on…

    new research suggests that when humans curate synthetic data (for example, by ranking A.I. answers and choosing the best one), it can alleviate some of the problems of collapse.

    That definitely won’t solve the bias problem, though, unless the people curating deliberately select against it.

    • Septimaeus@infosec.pub
      3 months ago

      Yeah, I read that as a caveat to the larger point, i.e. just an acknowledgment that there are limited cases where synthetic training data has been shown to be useful.