AI models scraping copyrighted work off the internet is a very real problem. Some researchers may have found a solution.
Per MIT Technology Review, researchers at Imperial College London have released research on “copyright traps,” a method they’ve devised that could help creators figure out if AI has stolen their work. The code for these traps, which is available on GitHub, can scatter fragments of hidden text throughout copyrighted works that would, theoretically, later show up as smoking guns if AI models were trained on that content.
The idea of a copyright trap isn’t new to the world, having previously been used for other types of media — but it is new to AI. The nitty-gritty technical details are kind of a lot to parse, but the idea is that strings of gibberish text would be hidden somewhere on a page — like in the source code, for instance — and would be detectable if used to train large language models.
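In spirit, the trap works something like the toy sketch below. All function names here are hypothetical, invented for illustration, and the detection step is deliberately simplified: the actual research infers training-set membership statistically (e.g., by checking whether a model assigns the trap sequence suspiciously low perplexity), not by literal string search.

```python
import random
import string

def make_trap(length=64, seed=None):
    # Generate a gibberish string extremely unlikely to occur naturally.
    rng = random.Random(seed)
    return "".join(rng.choices(string.ascii_lowercase + " ", k=length))

def embed_trap(html, trap):
    # Hide the trap in the page source, invisible to human readers
    # but visible to a scraper that ingests the raw HTML.
    hidden = f'<span style="display:none">{trap}</span>'
    return html.replace("</body>", hidden + "</body>")

def trap_present(corpus_text, trap):
    # Naive stand-in for detection: did the trap survive into
    # a scraped corpus? (Real detection queries the trained model.)
    return trap in corpus_text

trap = make_trap(seed=42)
page = embed_trap("<html><body><p>My article.</p></body></html>", trap)
print(trap_present(page, trap))
```

Because the trap is pure gibberish, its presence in a model’s training data (or its unusually high likelihood under the model) is hard to explain by coincidence, which is what makes it a usable smoking gun.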
The researchers admit the method is imperfect. Someone who knows to look for the traps could find and remove them, for instance. But with copyright disputes swirling around generative AI, it only makes sense that people would work on ways for creators to fight back.