IBM patenting watermark technology to protect ownership of AI models

Digital assets such as photos and videos are stolen frequently, which is why many creators watermark their content to make ownership easier to prove in case of theft. A lesser-known kind of digital property that can also be stolen is the artificial intelligence (AI) model, which researchers may spend months, and sometimes years, developing.

IBM is now developing a technique to allow AI researchers to "watermark" these models. The technology is currently "patent-pending".

IBM says that it presented its research on watermarking deep neural network (DNN) models at the AsiaCCS ’18 conference, where the approach was shown to be highly robust. As a result, it is now patenting the concept, which describes a remote verification mechanism that determines the ownership of DNN models using simple API calls.

The company explains that it has developed three watermark generation algorithms, described below:

Embedding meaningful content together with the original training data as watermarks into the protected DNNs,

Embedding irrelevant data samples as watermarks into the protected DNNs, and

Embedding noise as watermarks into the protected DNNs.
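
To make the three strategies concrete, here is a minimal Python sketch of how such watermark samples might be generated. It is an illustration only, not IBM's implementation: the function names, the list-of-lists image format, the noise strength, and the idea that every watermark sample is assigned a single owner-chosen label (target_label) are all assumptions made for this example.

```python
import random
from typing import List, Tuple

Image = List[List[float]]  # grayscale pixels in [0, 1], e.g. 28x28 for MNIST


def embed_content(image: Image, mark: Image, top: int = 0, left: int = 0) -> Image:
    """Strategy 1: stamp a meaningful pattern (e.g. a logo) onto a copy of a
    training image. Assumes the mark fits inside the image at (top, left)."""
    out = [row[:] for row in image]
    for r, mark_row in enumerate(mark):
        for c, value in enumerate(mark_row):
            if value > 0:  # only non-zero mark pixels overwrite the image
                out[top + r][left + c] = value
    return out


def embed_noise(image: Image, rng: random.Random, strength: float = 0.5) -> Image:
    """Strategy 3: add Gaussian noise drawn from a seeded generator, so the
    owner can reproduce the pattern. Values are clipped back into [0, 1]."""
    return [
        [min(1.0, max(0.0, p + strength * rng.gauss(0.0, 1.0))) for p in row]
        for row in image
    ]


def make_watermark_set(
    images: List[Image],
    unrelated: List[Image],
    mark: Image,
    target_label: int,
    seed: int = 1234,
) -> List[Tuple[Image, int]]:
    """Build (sample, label) pairs for all three strategies.

    Every sample gets the same owner-chosen label (an assumption of this
    sketch); the pairs would be mixed into the normal training data.
    """
    rng = random.Random(seed)
    content = [(embed_content(img, mark), target_label) for img in images]
    # Strategy 2: samples from an unrelated domain are used unchanged.
    irrelevant = [(img, target_label) for img in unrelated]
    noise = [(embed_noise(img, rng), target_label) for img in images]
    return content + irrelevant + noise
```

In this sketch the seed and the mark pattern play the role of the owner's secret: anyone who does not know them cannot easily reproduce the watermark samples.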

IBM says that in its internal testing on several datasets, including MNIST, a watermarked DNN model triggers an "unexpected but controlled response".
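
A remote ownership check built on that controlled response could look roughly like the Python sketch below. It assumes the owner holds a set of watermark samples paired with the label a watermarked model is expected to return, and that the suspect model is reachable only through a prediction call. The verify_ownership function, the 0.9 threshold, and the two stand-in "APIs" are hypothetical and exist only for this example.

```python
from typing import Any, Callable, Sequence, Tuple


def verify_ownership(
    query: Callable[[Any], int],
    watermark_set: Sequence[Tuple[Any, int]],
    threshold: float = 0.9,  # assumed cut-off, not a value from IBM
) -> bool:
    """Return True if enough watermark samples receive the expected label.

    `query` stands in for one call to the suspect model's prediction API;
    no access to the model's parameters is needed.
    """
    if not watermark_set:
        raise ValueError("watermark set is empty")
    hits = sum(1 for sample, expected in watermark_set if query(sample) == expected)
    return hits / len(watermark_set) >= threshold


# Hypothetical stand-ins for remote prediction APIs.
def watermarked_api(sample: str) -> int:
    # Responds with the owner's label (7) on watermark inputs.
    return 7 if sample.startswith("wm") else 0


def clean_api(sample: str) -> int:
    # A model that never saw the watermark answers normally.
    return 0


watermarks = [("wm-1", 7), ("wm-2", 7), ("wm-3", 7)]
print(verify_ownership(watermarked_api, watermarks))
print(verify_ownership(clean_api, watermarks))
```

Using a match-rate threshold rather than requiring every sample to match leaves some room for a model that has been modified after it was copied, though the right value would depend on the model and the watermark set.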

The company clarifies that this isn't the first time the idea has been floated, but says previous methods were limited because they needed access to the model's parameters to determine ownership.

By contrast, IBM says its patent-pending technology verifies ownership remotely and withstands watermark removal techniques such as parameter tuning and pruning. That said, the method does have some limitations, including the inability to detect ownership if the model is deployed internally instead of online. The company goes on to say:

In addition, our current watermarking framework cannot protect the DNN models from being stolen through prediction APIs, whereby attackers can exploit the tension between query access and confidentiality in the results to learn the parameters of machine learning models. However, such attacks have only been demonstrated to work well in practice for conventional machine learning algorithms with fewer model parameters such as decision trees and logistic regressions.

IBM has stated that it is currently considering using the watermarking framework internally, and will "explore" how to make it available to clients as well. You can check out the relevant research paper here.