At the tail end of 2022, ChatGPT took the world by storm. It has since inspired many innovations, such as simplifying medical notes, but it has also tempted many to use it as a shortcut to avoid putting effort into writing. AI Plagiarism, as it came to be known, gradually became a serious problem, especially in education. It wasn't unforeseen, though: OpenAI researchers have discussed ChatGPT watermarking for a long time. So why isn't it out yet?
OpenAI won’t watermark ChatGPT text because its users could get caught
Some ChatGPT users said they'd use ChatGPT less if it included watermarks. https://t.co/G4IuR7DizQ — Acorn Protocol (@AcornProtocol) August 13, 2024
Stylistic vs. cryptographic approach
For the past few years, users have been left to tackle the problem on their own with AI detectors. However, these were never fully reliable: not only do they sometimes produce false positives, but users can tailor their prompts so the AI avoids its typical stylistic patterns. Simply asking the AI to write in a human style is often enough to fool them.
The ChatGPT watermarking described by OpenAI researcher Scott Aaronson works on a cryptographic level. Instead of sampling each token with true randomness, the model uses a pseudorandom function keyed with a secret only OpenAI holds. The end user can't notice any difference, yet OpenAI could, at any time, check whether a given text was generated. Moving around a few words or sentences doesn't break the signal; one would have to paraphrase the whole thing. Considering this was announced in a lecture before the public launch of ChatGPT, it was promising.
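To make the idea concrete, here is a toy sketch of the scheme Aaronson has described publicly (sometimes called the "Gumbel trick"). Everything here — the key, the vocabulary, the window size, the threshold — is an illustrative assumption, not OpenAI's actual implementation: the sampler picks the token maximizing r ** (1 / p), which is distributed exactly like ordinary sampling, but anyone holding the key can later test whether the choices line up with the pseudorandom values.

```python
import hashlib
import hmac
import math
import random

# Hypothetical parameters for illustration only.
SECRET_KEY = b"hypothetical-provider-key"   # held only by the provider
VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran"]
WINDOW = 3  # how many previous tokens seed the pseudorandom function

def prf(context, token):
    """Keyed pseudorandom value in (0, 1) for a candidate token after `context`."""
    msg = ("|".join(context) + "#" + token).encode()
    digest = hmac.new(SECRET_KEY, msg, hashlib.sha256).digest()
    return (int.from_bytes(digest[:8], "big") + 1) / (2 ** 64 + 2)

def sample_token(context, probs):
    """Watermarked sampling: pick the token maximizing r ** (1 / p).

    Over the randomness of the key this matches ordinary sampling from
    `probs`, so the reader sees no difference, yet the choice is
    reproducible by anyone who knows the key."""
    return max(probs, key=lambda t: prf(context, t) ** (1.0 / probs[t]))

def detection_score(tokens):
    """Average -ln(1 - r) over the text.

    Unwatermarked text averages about 1.0; watermarked text scores higher,
    because the sampler preferentially picked tokens with large r."""
    scores = [-math.log(1.0 - prf(tokens[i - WINDOW:i], tokens[i]))
              for i in range(WINDOW, len(tokens))]
    return sum(scores) / len(scores)

# Demo: a stand-in "language model" that emits a fresh distribution each step.
rng = random.Random(42)
text = ["the", "cat", "sat"]
for _ in range(200):
    weights = [rng.random() + 0.1 for _ in VOCAB]
    probs = {t: w / sum(weights) for t, w in zip(VOCAB, weights)}
    text.append(sample_token(text[-WINDOW:], probs))

human_text = [rng.choice(VOCAB) for _ in range(200)]
print(round(detection_score(text), 2), round(detection_score(human_text), 2))
```

The watermarked sequence scores well above 1.0 while text the model never touched hovers around it, which is also why shuffling a few words leaves most of the token windows, and hence the signal, intact.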
How to fool AI Content Detectors! https://t.co/GYLmym1qD5 pic.twitter.com/sePmIzOhaz — Gonçalo Ferraz (@goncaloferrax) June 2, 2023
Mission impossible
Finally, on August 4, 2024, there was an update on the ChatGPT watermark. In a blog post update, OpenAI announced that it is still under consideration. While the watermarking itself works, it remains trivial for bad actors to circumvent by tricking the model, for example by telling it to put a special character between every word and then removing it afterwards. OpenAI is now considering whether metadata could serve the same purpose, but that work is still in its early stages.
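The laundering trick OpenAI describes is easy to picture. The user asks the model to emit a marker between every word and strips it afterwards; since the watermark statistics are keyed to the exact token sequence the model produced, which contained the marker, the cleaned text no longer carries the signal. A minimal sketch, with "@" as a hypothetical marker:

```python
def launder(generated: str, marker: str = "@") -> str:
    """Strip the marker the model was instructed to insert between words.

    The watermark was computed over the token stream *including* the
    marker, so the cleaned text no longer matches the provider's keyed
    pseudorandom choices."""
    return generated.replace(marker, " ")

print(launder("The@quick@brown@fox"))  # prints "The quick brown fox"
```

One string replacement undoes the whole scheme, which is why OpenAI calls circumvention trivial.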
One question that arises is: would this change anything? Even if OpenAI shipped the technique, users could simply move on to the next AI. Aaronson addressed this too. He explained that anyone who wanted to be a responsible player would adopt such measures, noting that others like Google and Facebook were already on pace. What he didn't anticipate was how many smaller companies would enter the game, many of which won't be responsible. At this point, it's unlikely AI Plagiarism will ever be fully rooted out. However, I believe we'll keep finding ways to combat it, just as with traditional plagiarism.
YouTube: Scott Aaronson Talks AI Safety
Photo credit: The feature image is symbolic and has been taken by Solen Feyissa.
Source: Scott Aaronson
