This is AI poisoning. If you block it, you just stop it from learning. Feeding it bullshit poisons its knowledge and makes it hallucinate.
I also wonder how AI crawlers know what wasn’t already generated by AI, potentially “inbreeding” knowledge, as I call it, on AI hallucinations of the past.
When the whole AI craze began, basically everything online was human-made. Not anymore. It’ll only get worse if you ask me.
The scary part is that even humans don’t really have a proper escape mechanism for this kind of misinformation. Sure, we can spot AI a lot of the time, but there are also situations where we can’t, and that leaves us only trusting people we already knew before AI and growing more and more distrustful of information in general.
Holy shit, this.
I’m constantly worried that what I’m seeing/hearing is fake. It’s going to get harder and harder to find older information on the internet too.
Shit, it’s actually crept outside the internet. Family buys my kids books for Christmas and birthdays, and now I check to make sure they aren’t AI garbage before I ever let the kids look at them, because someone already bought them an AI book without realizing it.
I don’t really understand what we hope to get from all of this. I mean, not really. Maybe if it gets to a point where it can truly be trusted, but I just don’t see how it gets there.
Well, even among the most moral devs, the garbage output wasn’t intended, and no one could have predicted the pace at which it’s been developing. So all this is driving a real need for in-person communities and regular contact—which is at least one great result, I think.
Kind of. They’re actually trying to avoid this according to the article:
“The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven).”
That sucks! What’s the point of putting an AI in a maze if you’re not going to poison it?
Whoa, I never considered AI inbreeding as a death for AI 🤔
Some of these LLMs introduce very subtle statistical patterns into their output so it can be recognized as such. So it is possible in principle (not sure how computationally feasible when crawling) to avoid ingesting whatever has these patterns. But there will also be plenty of AI content that is not deliberately marked in this way, which would be harder to filter out.
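To make the “subtle statistical patterns” idea concrete: one published scheme (Kirchenbauer et al.’s “green list” watermark) nudges the model toward a pseudorandom subset of tokens at each step, and a detector that knows the scheme replays those pseudorandom choices and checks whether the favored tokens are overrepresented. Here’s a minimal sketch of the detection side, where the hash-based partition, gamma, and the z-score threshold are all illustrative assumptions, not any vendor’s actual implementation:

```python
import hashlib
import math

# Illustrative parameters (not from any real deployment): gamma is the
# fraction of the vocabulary that counts as "green" at each step; the
# generator softly biases sampling toward green tokens, leaving a
# statistical fingerprint a detector can test for.
GAMMA = 0.5
Z_THRESHOLD = 4.0  # high bar so ordinary human text rarely trips it

def is_green(prev_token: str, token: str, gamma: float = GAMMA) -> bool:
    """Replay the pseudorandom green/red split for `token` given its
    predecessor. Hashing the pair means each token is green with
    probability ~gamma, and anyone who knows the scheme can recompute
    it without access to the model."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < gamma

def watermark_z_score(tokens: list[str], gamma: float = GAMMA) -> float:
    """How many standard deviations the green-token count sits above
    what unwatermarked text would produce by chance."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    greens = sum(is_green(p, t, gamma) for p, t in pairs)
    n = len(pairs)
    # Under the null hypothesis (no watermark), greens ~ Binomial(n, gamma).
    return (greens - gamma * n) / math.sqrt(n * gamma * (1 - gamma))

def looks_watermarked(text: str) -> bool:
    return watermark_z_score(text.split()) > Z_THRESHOLD
```

The catch is the one the comment already points out: this only works when the crawler knows the scheme (and any key it uses), and it says nothing about AI text that was never watermarked in the first place.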
@RejZoR @floofloof yeah AI will get worse and worse the more it trains on its own output. I can only see “walled-garden” AIs trained on specific datasets for specific industries being useful in future. These enormous “we can do everything (we can’t do anything)” LLMs will die a death.