• RejZoR@lemmy.ml · 61 upvotes · 3 days ago

    This is AI poisoning. Blocking crawlers just stops the model from learning; feeding them bullshit poisons its knowledge and makes it hallucinate.

    I also wonder how AI crawlers can tell what wasn’t already generated by AI, potentially “inbreeding” knowledge (as I call it) with AI hallucinations of the past.

    When the whole AI craze began, basically everything online was human-made. Not anymore. If you ask me, it’ll only get worse.

    • CheeseNoodle@lemmy.world · 29 upvotes · 3 days ago

      The scary part is that even humans don’t really have a proper escape mechanism for this kind of misinformation. Sure, we can spot AI a lot of the time, but there are also situations where we can’t, and that leaves us only trusting people we already knew before AI and growing more and more distrustful of information in general.

      • theangryseal@lemmy.world · 11 upvotes · 2 days ago

        Holy shit, this.

        I’m constantly worried that what I’m seeing/hearing is fake. It’s going to get harder and harder to find older information on the internet too.

        Shit, it’s crept outside of the internet, actually. Family buys my kids books for Christmas and birthdays, and I check to make sure they aren’t AI garbage before I ever let the kids look at them, because someone already bought them an AI book without realizing it.

        I don’t really understand what we hope to get from all of this. I mean, not really. Maybe it makes sense if it gets to a point where it can truly be trusted, but I just don’t see how.

        • Flagstaff@programming.dev · 2 upvotes · 2 days ago

          I don’t really understand what we hope to get from all of this.

          Well, even among the most moral devs, the garbage output wasn’t intended, and no one could have predicted the pace at which it’s been developing. So all this is driving a real need for in-person communities and regular contact, which is at least one great result, I think.

    • JustARegularNerd@lemmy.dbzer0.com · 18 upvotes · 3 days ago

      Kind of. They’re actually trying to avoid this, according to the article:

      “The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven).”

      • Muad'dib@sopuli.xyz · 5 upvotes · 2 days ago

        That sucks! What’s the point of putting an AI in a maze if you’re not going to poison it?

    • floofloof@lemmy.ca (OP) · 2 upvotes · 2 days ago

      Some of these LLMs introduce very subtle statistical patterns into their output so that it can be recognized as machine-generated. So it is possible in principle (though I’m not sure how computationally feasible it is when crawling) to avoid ingesting whatever carries these patterns. But there will also be plenty of AI content that is not deliberately marked this way, which would be harder to filter out.
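
      To illustrate (my own sketch, not anything from the article): one published scheme along these lines is the “green-list” watermark of Kirchenbauer et al. (2023). The generator softly biases sampling toward a pseudorandom half of the vocabulary keyed on the previous token; a detector re-derives those halves and runs a z-test on how often “green” tokens appear. Roughly:

      ```python
      # Toy green-list watermark detector (a sketch of the Kirchenbauer
      # et al. idea, not any specific vendor's scheme). Token IDs are
      # ints; a hash of the previous token deterministically splits the
      # vocabulary into "green" and "red" halves. Watermarked generators
      # over-sample green tokens, so a z-test on the green fraction
      # flags watermarked text.
      import hashlib
      import math

      def is_green(prev_token: int, token: int) -> bool:
          """True if `token` falls in the green half for this context."""
          digest = hashlib.sha256(f"{prev_token}:{token}".encode()).digest()
          return digest[0] % 2 == 0  # pseudorandom 50/50 split

      def watermark_z_score(tokens: list[int]) -> float:
          """z-score of the green-token fraction vs. the 50% expected
          from unwatermarked text; large positive values suggest the
          text was generated with the green-list bias."""
          hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
          n = len(tokens) - 1
          return (hits - 0.5 * n) / math.sqrt(0.25 * n)
      ```

      A crawler could in principle drop pages scoring above some threshold (say z > 4), but it has to score every candidate document, and it only catches models that actually watermark, which is the second half of the problem above.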

    • Flic@mstdn.social · 2 upvotes · 3 days ago

      @RejZoR @floofloof Yeah, AI will get worse and worse the more it trains on its own output. I can only see “walled-garden” AIs, trained on specific datasets for specific industries, being useful in future. These enormous “we can do everything (we can’t do anything)” LLMs will die a death.
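
      A toy demonstration of that feedback loop (my own sketch, not from the thread): fit a simple model to data, sample from the fit, refit on the samples, and repeat. Each generation compounds the previous one’s estimation error, the statistical analogue of the “inbreeding” described above.

      ```python
      # Model-collapse toy: a Gaussian repeatedly refit to its own
      # samples. Over many generations the fitted distribution drifts
      # away from the original data and tends toward collapse,
      # forgetting the tails first.
      import random
      import statistics

      def fit_and_resample(data: list[float], n: int) -> list[float]:
          """Fit a Gaussian to `data`, then draw `n` samples from the fit."""
          mu = statistics.fmean(data)
          sigma = statistics.stdev(data)
          return [random.gauss(mu, sigma) for _ in range(n)]

      random.seed(0)
      generation = [random.gauss(0.0, 1.0) for _ in range(200)]  # "human" data
      for g in range(10):
          generation = fit_and_resample(generation, 200)
          print(f"gen {g + 1}: sigma ~ {statistics.stdev(generation):.3f}")
      ```

      The printed sigma wanders away from the true value of 1.0; with enough generations it shrinks toward zero, which is the degradation the walled-garden, curated-dataset approach is meant to avoid.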