Lee Duna@lemmy.nz to Reddit@lemmy.world · English · 2 months ago
Reddit sues Perplexity for allegedly ripping its content to feed AI (www.theverge.com)
Damage@feddit.it · 2 months ago
Since multiple Lemmy instances serve the same federated content, I wonder whether our posts will carry more weight in the model: to an ordinary scraper it would look as if many people repeated the same thing over and over on different sites.
Tangent5280@lemmy.world · 2 months ago
It would be trivial to remove the duplicates. With a little foresight they could just as easily avoid collecting them in the first place.
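For illustration, removing mirrored copies really is a small job: a sketch of content-hash deduplication, assuming a hypothetical `dedup` helper over plain post bodies (normalizing whitespace so trivially reformatted mirrors still match).

```python
import hashlib

def dedup(posts):
    """Keep only the first copy of each post body, ignoring whitespace differences."""
    seen = set()
    unique = []
    for post in posts:
        # Normalize whitespace so the same comment mirrored by different
        # instances hashes to the same key.
        key = hashlib.sha256(" ".join(post.split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(post)
    return unique

# The same federated comment served by two instances collapses to one copy.
mirrored = ["Same comment text.", "Same  comment text.", "A different comment."]
print(dedup(mirrored))  # → ['Same comment text.', 'A different comment.']
```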
Damage@feddit.it · 2 months ago
They could also avoid re-crawling the whole internet every day, but here we are, so who knows.