Privacy-preserving multi-classification LR scheme for data quality

CAO Laicheng; WU Wentao; FENG Tao; GUO Xian

doi:10.19665/j.issn1001-2400.20230601


Cyberspace Security | Updated: 2023-11-30

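- Journal of Xidian University, 2023, Vol. 50, No. 5, pp. 188-198
- Author affiliation:
  
  School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, Gansu, China
- About the authors:
  
  CAO Laicheng (born 1965), male, professor, E-mail: [email protected]
  WU Wentao (born 1996), male, master's student at Lanzhou University of Technology, E-mail: [email protected]
  FENG Tao (born 1970), male, professor, E-mail: [email protected]
  GUO Xian (born 1971), male, professor, E-mail: [email protected]
- Funding:
  
  National Natural Science Foundation of China (61562059, 61461027); Natural Science Foundation of Gansu Province (20JR5RA467)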
CAO Laicheng, WU Wentao, FENG Tao, et al. Privacy-preserving multi-classification LR scheme for data quality[J]. Journal of Xidian University, 2023, 50(5): 188-198. DOI: 10.19665/j.issn1001-2400.20230601.


Abstract

To protect the privacy of multi-classification logistic regression models in machine learning, ensure the quality of the training data, and reduce computation and communication costs, a privacy-preserving multi-classification logistic regression scheme oriented to data quality is proposed. First, based on homomorphic encryption for arithmetic of approximate numbers, batching and the single-instruction multiple-data (SIMD) mechanism are used to pack multiple messages into one ciphertext, and an encrypted vector is securely shifted into the ciphertext that corresponds to the shifted plaintext vector. Second, the binary logistic regression model is extended to multiple classes by training several classifiers under the "One vs Rest" decomposition strategy. Finally, the training data set is divided into several matrices of a fixed size, which still retain the complete data structure of the sample information, and the fixed Hessian method is used to optimize the model parameters so that the method applies in any case and keeps the parameters private. During model training, the scheme can reduce data sparsity and ensure data quality. The security analysis shows that neither the trained model nor the users' data are leaked at any point in the process. The experiments show that the training accuracy of the scheme is considerably higher than that of existing schemes and almost the same as that obtained by training on unencrypted data, and that the scheme has a lower computation cost.

Keywords

homomorphic encryption; cloud computing; logistic regression; privacy-preserving; data quality
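
The "One vs Rest" decomposition and the fixed Hessian update named in the abstract can be illustrated on unencrypted data. The following is a minimal plaintext sketch in NumPy, not the scheme itself: it performs no encryption, packing, or matrix partitioning, and the function names, the toy data, and the small `ridge` term are assumptions made for illustration. It relies on the standard bound that the logistic weights p(1 - p) never exceed 1/4, so the matrix XᵀX/4 can replace the true Hessian and be inverted once before the loop. An encrypted version would presumably also need a polynomial stand-in for the sigmoid, which is not shown here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_binary_fixed_hessian(X, y, iters=50, ridge=1e-6):
    """Fit one binary logistic regression with a fixed-Hessian Newton step."""
    d = X.shape[1]
    # Constant curvature bound: X^T X / 4 dominates X^T W X for any
    # logistic weights W, so it is inverted once, outside the loop.
    # The ridge term is an assumption that only guards against singularity.
    H_inv = np.linalg.inv(X.T @ X / 4.0 + ridge * np.eye(d))
    beta = np.zeros(d)
    for _ in range(iters):
        grad = X.T @ (y - sigmoid(X @ beta))  # gradient of the log-likelihood
        beta = beta + H_inv @ grad            # ascent step with the fixed bound
    return beta

def train_one_vs_rest(X, labels, num_classes, iters=50):
    """Train one binary classifier per class: class k against all others."""
    return np.stack([
        train_binary_fixed_hessian(X, (labels == k).astype(float), iters)
        for k in range(num_classes)
    ])

def predict(B, X):
    """Pick the class whose binary classifier gives the largest score."""
    return np.argmax(X @ B.T, axis=1)

def make_toy_data(seed=0, per_class=60):
    """Three Gaussian blobs in the plane, with a leading bias column."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 3.0], [3.0, -2.0], [-3.0, -2.0]])
    points = np.vstack([
        c + rng.normal(scale=0.7, size=(per_class, 2)) for c in centers
    ])
    labels = np.repeat(np.arange(3), per_class)
    X = np.hstack([np.ones((points.shape[0], 1)), points])
    return X, labels

if __name__ == "__main__":
    X, labels = make_toy_data()
    B = train_one_vs_rest(X, labels, num_classes=3)
    print("training accuracy:", (predict(B, X) == labels).mean())
```

Because the curvature matrix does not depend on the current parameters, the loop contains only matrix-vector products and one sigmoid evaluation per iteration, which is the property that makes this style of update attractive when every operation must be carried out on ciphertexts.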

