Why is it so hard to request these AI companies to provide the sources used for their dataset?
Then if most are illegally acquired, seized the models. The AI company could retrain a model from scratch but with legally acquired data.
You must log in or # to comment.
Most will not be criminal liability but more likely civil liability. By admitting to using datasets containing intellectual property or other information that infringes on others’ rights, those parties can sue them. But by keeping it all under a mysterious black box and blending everything together, any specific rights holder would have a difficult time putting together a case.
You’ve answered your own question - transparency is potentially self-incriminating.