Some of these LLMs deliberately embed very subtle statistical patterns (watermarks) in their output so that it can be recognized as machine-generated. So in principle a crawler could avoid ingesting anything that carries these patterns, though I'm not sure how computationally feasible that is at crawl scale. But there will also be plenty of AI content that isn't deliberately marked in this way, and that will be harder to filter out.
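To make the "statistical pattern" idea concrete, here is a toy sketch of how a detector for one kind of watermark might work. It is loosely modeled on the published "green list" idea, where the generator quietly favors a pseudo-random subset of tokens at each step and the detector counts how often the text lands in that subset. This is an assumption for illustration only, not the scheme any particular LLM vendor uses: the hash-based `is_green` rule, the whitespace tokenization, the 0.5 green fraction and the threshold of 4.0 are all made up here.

```python
import hashlib
import math
from typing import List

GREEN_FRACTION = 0.5  # assumed share of tokens marked "green" at each step


def is_green(prev_token: str, token: str, green_fraction: float = GREEN_FRACTION) -> bool:
    """Pseudo-randomly assign `token` to the green list, seeded by the previous token."""
    digest = hashlib.sha256(f"{prev_token}\x00{token}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2**64
    return value < green_fraction


def watermark_z_score(tokens: List[str], green_fraction: float = GREEN_FRACTION) -> float:
    """z-score of the green-token count against what unmarked text would give."""
    n = len(tokens) - 1  # number of (previous, current) token pairs
    if n < 1:
        return 0.0
    green = sum(is_green(prev, cur, green_fraction) for prev, cur in zip(tokens, tokens[1:]))
    expected = green_fraction * n
    std = math.sqrt(n * green_fraction * (1.0 - green_fraction))
    return (green - expected) / std


def looks_watermarked(text: str, threshold: float = 4.0) -> bool:
    """Flag text whose green-token count is far above chance (hypothetical threshold)."""
    return watermark_z_score(text.split()) > threshold
```

The per-document cost is one hash per token, so the check itself is cheap. The catch is that a detector like this needs to know the seeding rule for each watermarking scheme it wants to catch, and it says nothing about text that was never marked.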