Incompetent half-assing is rarely this morally righteous an act, either, since your one act of barely-competent-enough incompetence is transmuted into endless incompetence by becoming training data / QC feedback.

    • folkrav · 9 hours ago

      That kind of data sanitization is just standard practice. You need some level of confidence in your data’s accuracy, and for anything normally distributed, throwing out obvious outliers is a safe bet.
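      For reference, a minimal sketch of the kind of trimming described here, in Python; the z-score cutoff of 3 is a common convention, an assumption rather than anything specified in this thread:

      ```python
      import numpy as np

      def trim_outliers(samples: np.ndarray, z_cutoff: float = 3.0) -> np.ndarray:
          """Drop points more than z_cutoff standard deviations from the mean."""
          mu, sigma = samples.mean(), samples.std()
          return samples[np.abs(samples - mu) <= z_cutoff * sigma]
      ```

      On genuinely normal data a 3-sigma cutoff discards roughly 0.3% of points, which is why it is considered safe there.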

      • supersquirrel@sopuli.xyz (OP) · edited · 8 hours ago

        If you cut the outliers out of a dataset where 30% of the contributors are bullshitters who are skilled and motivated to bullshit, that doesn’t magically make the system more accurate; it only makes it more precise, since bullshitters have been training their whole lives to bullshit in a convincing way (some went to school for it starting at a very young age) and can often present as more authentic than non-bullshitters (the sketch at the end of this comment makes the precision-vs-accuracy point concrete). Honestly, it makes me happy to know big tech thinks the same way you do on this. It is glorious how poorly positioned it leaves these much more dangerous bullshitters to respond to, or anticipate, how these systems will naturally decay.

        At a certain point (and it doesn’t surprise me in the least that people who think rigidly along the lines of statistics and automation don’t get this at all), when misinformation is rampant in a system, it is often the outliers that are the critical voices of truth.

        If you discard outliers simply because they are outliers, and keep doing it, you will get a more “refined” system precisely because it has gotten better at bullshitting: everybody jumps on the bandwagon, and meaning collapses into byzantine conformism.

        I take my schadenfreude where I can get it : )
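
        A quick simulation of the precision-vs-accuracy claim above. It assumes an illustrative mixture (70% honest reports around a true value of 0, 30% convincing bullshit tightly clustered around a wrong value); all numbers are made up for the sketch, not taken from the thread:

        ```python
        import numpy as np

        rng = np.random.default_rng(0)

        # Illustrative mixture: 70% honest reports scattered around the true
        # value 0, plus 30% "bullshit" reports confidently and tightly
        # clustered around the wrong value 2.
        honest = rng.normal(loc=0.0, scale=1.0, size=700)
        bullshit = rng.normal(loc=2.0, scale=0.3, size=300)
        data = np.concatenate([honest, bullshit])

        for step in range(5):
            mu, sigma = data.mean(), data.std()
            data = data[np.abs(data - mu) <= 2.0 * sigma]  # discard "outliers"
            print(f"pass {step}: n={len(data)} "
                  f"mean={data.mean():.2f} std={data.std():.2f}")
        ```

        Each trimming pass shrinks the spread (precision improves), but the mean stays pulled toward 2 (accuracy does not improve): the tight, convincing cluster survives every cut, while the honest-but-extreme points are the ones discarded.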