Robust Model Watermarking Based on Adversarial Simulation
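School of Computer Science and Engineering, Sun Yat-sen University; Key Laboratory of Information Technology, Ministry of Education; Guangdong Provincial Key Laboratory of Information Security Technology; Institute of Applied Physics, Henan Academy of Sciences; Henan Key Laboratory of Internet of Things Sensing Technology and Systems; State Key Laboratory of Mathematical Engineering and Advanced Computing; Beijing Guorui Shuzhi Technology Co., Ltd. (北京国瑞数智技术有限公司)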
2024-01-01
Keywords: model watermarking; watermark removal attack; robustness; backdoor watermark
The rapid development of the Internet of Things (IoT) has greatly improved efficiency in daily life and production. Deep neural network models play a crucial role in IoT data processing and intelligence. To prevent unauthorized use of models, model watermarking has emerged as an effective means of copyright protection: the model owner embeds specific watermark behaviors into the model before release, then identifies potentially pirated models by checking whether those behaviors are present. However, a model thief can use low-cost methods to remove the watermark with almost no impact on model performance, thereby evading copyright verification. To address this problem, a robust model watermarking method based on adversarial simulation is proposed. Its core idea is to optimize a set of watermark samples so that they still trigger the watermark behaviors even after the model has undergone a watermark removal attack. Specifically, by analyzing the common characteristics of watermark removal attacks, a watermark removal simulator is constructed to mimic these attacks, and a clean model simulator is constructed to emulate the behavior of a model without a watermark; the two simulators jointly guide the optimization of the watermark samples. Experiments on the CIFAR-10 and CIFAR-100 datasets show that the proposed watermark resists a variety of watermark removal attacks, demonstrating the method's effectiveness and practicality.
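The idea of letting two simulators guide the optimization of a watermark sample can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not specify the loss, the simulators, or the models, so everything below is an assumption made for illustration. The models are tiny logistic classifiers, the hypothetical `simulate_removal` helper uses Gaussian weight noise as a stand-in for a removal attack such as fine-tuning, and the objective is a plain cross-entropy that pushes attacked models toward the watermark label while pushing the clean model away from it.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that a toy logistic model outputs the watermark target label."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def simulate_removal(weights, n_models, sigma, seed):
    """Hypothetical removal simulator: Gaussian weight noise stands in for an attack."""
    rng = random.Random(seed)
    return [[w + rng.gauss(0.0, sigma) for w in weights] for _ in range(n_models)]


def optimize_watermark_sample(x, attacked, bias, clean_w, clean_b,
                              lam=1.0, lr=0.5, steps=200):
    """Gradient descent on the sample itself; model weights stay fixed."""
    x = list(x)
    for _ in range(steps):
        grad = [0.0] * len(x)
        # Attacked models should still output the target label (label 1):
        # d(-log p)/dx = (p - 1) * w, averaged over the simulated attacks.
        for w in attacked:
            p = predict(w, bias, x)
            for i, wi in enumerate(w):
                grad[i] += (p - 1.0) * wi / len(attacked)
        # The clean model should not respond to the sample (label 0):
        # d(-log(1 - p))/dx = p * w.
        p_clean = predict(clean_w, clean_b, x)
        for i, wi in enumerate(clean_w):
            grad[i] += lam * p_clean * wi
        # Gradient step, then clip to the valid input range [0, 1].
        x = [min(1.0, max(0.0, xi - lr * gi)) for xi, gi in zip(x, grad)]
    return x


# Toy weights, chosen only for illustration (assumptions, not from the paper).
WM_W, WM_B = [2.0, -1.0, 0.5, 1.5], -1.0          # watermarked model
CLEAN_W, CLEAN_B = [-1.0, 1.0, 1.0, -0.5], 0.0    # clean model simulator

attacked_models = simulate_removal(WM_W, n_models=8, sigma=0.3, seed=0)
sample = optimize_watermark_sample([0.5] * 4, attacked_models, WM_B, CLEAN_W, CLEAN_B)
attacked_probs = [predict(w, WM_B, sample) for w in attacked_models]
clean_prob = predict(CLEAN_W, CLEAN_B, sample)
```

In this sketch the optimized `sample` is one that every simulated attacked model still assigns to the watermark label while the clean model does not, which mirrors the requirement that a watermark survive removal without producing false positives on unwatermarked models. A real implementation would presumably operate on deep networks and image inputs such as CIFAR-10, with simulators derived from the attack analysis the abstract mentions.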