This is something that keeps me up at night. Unlike physical historical artefacts such as pottery, vellum manuscripts, or stone tablets, information on the Internet can simply blink out of existence when the server hosting it goes offline. That makes things difficult for future anthropologists who want to study our history and document the different Internet epochs. For my part, I always try to send any news article I see to an archival site (like archive.ph) to help collectively preserve our present so others can still see it in the future.

  • Gork@beehaw.orgOP
    2 years ago

    Gave this some thought. I agree with you that no such archiving effort should include personally identifiable information, as that would be a doxxing vector. Can we safely alter an archiving process to remove PII? In principle, yeah. But it would need either humans or advanced GPT-4+ AIs to identify the person and the context of the website, and to alter the graphics or the text as part of its update path. Even then, there are moral questions about allowing an AI to make these kinds of decisions. Would it know that your old websites contained information you did not want on the Internet? The AI could help you if you asked, and if it does, that might change someone’s mind about whether a safe Internet archive can be created.

    A Steward ‘Gork’ AI might actually be of great benefit to the Internet if used in this manner. Imagine an Internet bot that takes in websites, safely removes offensive content and personally identifiable information, archives the entirety of the Internet, and logically categorizes the contents, building and linking indexes constantly. It understands its goal and uses its finite resources responsibly, so that it can interface with every site it comes across and update its behavior after completing each archiving pass. It automatically publishes its latest findings to all web encyclopedias and provides a ChatGPT-4+ interface through which those encyclopedias can give feedback. But this AI has more potential than that. It sees the benefit in having everyone talk to it, because talking to everyone maximizes its chance of indexing more sites. So it sets up a public-facing ChatGPT interface of its own. Now everyone can help preserve the Internet, since we all have a buddy who can help us catalog and archive all the things. At this point, if it isn’t sentient, it might as well be.