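1. School of Cyber Security and Computer, Hebei University, Baoding, Hebei 071002
2. Key Laboratory on High Trusted Information System in Hebei Province, Baoding, Hebei 071002
HE Xinfeng (born 1976), male, associate professor. E-mail: [email protected]
YANG Qinqin (born 1995), female, master's student at Hebei University. E-mail: [email protected]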
Print publication date: 2024-01-20
Online publication date: 2023-08-30
Received: 2022-10-25
Xinfeng HE, Qinqin YANG. Deduplication scheme with data popularity for cloud storage[J]. Journal of Xidian University, 2024,51(1):187-200. DOI: 10.19665/j.issn1001-2400.20230205.
With the development of cloud computing, enterprises and individuals increasingly outsource their data to cloud storage servers to relieve local storage pressure, and storage pressure on the cloud side has become an increasingly prominent problem. To improve cloud storage efficiency and reduce communication cost, data deduplication has been widely adopted. Existing techniques mainly comprise identical-data deduplication based on hash tables and similar-data deduplication based on Bloom filters, but both rarely consider the effect of data popularity. In practice, the data that users outsource to cloud servers are unevenly distributed and can be divided by access frequency into popular and unpopular data. Popular data are accessed frequently and have numerous duplicate copies and similar data in the cloud, so they require high-accuracy deduplication. Unpopular data are rarely accessed and have few duplicate copies and similar data in the cloud, so low-accuracy deduplication is sufficient. To address this problem, data popularity is combined with the Bloom filter to obtain a Bloom filter variant named PDBF (popularity dynamic Bloom filter), and a PDBF-based deduplication scheme is constructed that adjusts the deduplication accuracy dynamically according to data popularity. Simulation results show that the scheme strikes a good balance among time consumption, memory consumption, and false positive rate.
Keywords: cloud computing; cloud storage; data deduplication; data popularity; Bloom filter
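The idea in the abstract, spending more filter accuracy on popular data than on unpopular data, can be illustrated with a minimal Python sketch. This is not the authors' PDBF construction, which the abstract does not specify. It assumes two ordinary fixed-size Bloom filters sized with the textbook formulas, and it assumes the caller already knows whether a chunk is popular. The names `bloom_params`, `BloomFilter`, `PopularityDedupIndex`, `check_and_add`, and the default false positive targets are all hypothetical.

```python
import hashlib
import math


def bloom_params(n_items: int, fp_rate: float) -> tuple[int, int]:
    """Textbook sizing: bit count m and hash count k for n items at a target false positive rate."""
    m = math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2)
    k = max(1, round(m / n_items * math.log(2)))
    return m, k


class BloomFilter:
    """Plain fixed-size Bloom filter (no dynamic growth, unlike the scheme in the paper)."""

    def __init__(self, n_items: int, fp_rate: float) -> None:
        self.m, self.k = bloom_params(n_items, fp_rate)
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: bytes) -> list[int]:
        # Double hashing: derive k bit indexes from two 64-bit slices of one SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] >> (pos % 8) & 1 for pos in self._positions(item))


class PopularityDedupIndex:
    """Hypothetical index: a strict filter for popular chunks, a loose one for unpopular chunks."""

    def __init__(self, n_items: int, popular_fp: float = 1e-4, unpopular_fp: float = 1e-1) -> None:
        self.popular = BloomFilter(n_items, popular_fp)      # more bits, more hashes
        self.unpopular = BloomFilter(n_items, unpopular_fp)  # fewer bits, fewer hashes

    def check_and_add(self, chunk: bytes, is_popular: bool) -> bool:
        """Return True if the chunk is probably a duplicate; otherwise record it and return False."""
        target = self.popular if is_popular else self.unpopular
        if chunk in target:
            return True  # may be a false positive; a real system would confirm before discarding
        target.add(chunk)
        return False


index = PopularityDedupIndex(n_items=1000)
first = index.check_and_add(b"chunk-A", is_popular=True)   # filter is empty, so not a duplicate
second = index.check_and_add(b"chunk-A", is_popular=True)  # same chunk again, reported as duplicate
```

The strict filter costs more memory and more hash evaluations per lookup, while the loose filter is cheaper but misjudges more often, which is the time, space, and false positive rate tradeoff the abstract refers to.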