Coupling Secret Sharing with Decentralized Server-Aided Encryption in Encrypted Deduplication
Abstract
1. Introduction
- We are the first to leverage secret sharing to realize decentralized server-aided encryption in encrypted deduplication storage systems: we use the coding matrix, instead of the key, to couple the encryption process with the encoding process, so as to achieve the secret sharing of both keys and data chunks in a single encoding process (Section 2).
- We propose ECDedup, which realizes efficient decentralized encrypted deduplication by sharing the coding matrix among multiple key servers, and improves the matrix generation performance via two optimization methods (Section 3).
2. Background and Motivation
2.1. Encrypted Deduplication
- Basics: Deduplication is a data reduction technique that eliminates duplicate data in storage systems. In this paper, we mainly focus on chunk-based deduplication, which splits data files into variable-size chunks (with an average size of 8 KB [9,10,11] for deduplication efficiency) at boundary points detected by the Rabin rolling-hash algorithm [12] (called chunking). Each chunk is uniquely identified by a hash fingerprint computed using a cryptographic hash function (e.g., SHA-1, SHA-256) (called fingerprinting). Then, the fingerprints are indexed and only one physical copy of each duplicate data chunk is stored (called indexing), so as to save storage space.
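To make the fingerprinting and indexing stages concrete, here is a minimal Go sketch of a fingerprint index; the names `Index` and `Put` are ours for illustration, not the paper's implementation:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Index maps each chunk's fingerprint to the address of its single stored copy.
type Index struct {
	addrByFP map[string]int // fingerprint -> physical address (here: slice offset)
	store    [][]byte       // unique chunks only
}

func NewIndex() *Index {
	return &Index{addrByFP: make(map[string]int)}
}

// Put stores a chunk only if its fingerprint is unseen and reports whether
// the chunk was a duplicate (the "indexing" step in the text).
func (ix *Index) Put(chunk []byte) (addr int, dup bool) {
	sum := sha256.Sum256(chunk) // fingerprinting
	fp := hex.EncodeToString(sum[:])
	if a, ok := ix.addrByFP[fp]; ok {
		return a, true // duplicate: index lookup only, no second copy stored
	}
	addr = len(ix.store)
	ix.store = append(ix.store, chunk)
	ix.addrByFP[fp] = addr
	return addr, false
}

func main() {
	ix := NewIndex()
	ix.Put([]byte("chunk-A"))
	ix.Put([]byte("chunk-B"))
	_, dup := ix.Put([]byte("chunk-A")) // same content uploaded again
	fmt.Println("duplicate detected:", dup, "unique chunks stored:", len(ix.store))
}
```

Running the sketch stores two unique chunks and detects the re-uploaded one as a duplicate.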
- Server-aided MLE: However, convergent encryption (CE) is vulnerable to offline brute-force attacks, since its MLE key can be publicly derived [14]: an adversary can infer the input plaintext chunk from a target ciphertext chunk without knowing the MLE key. Specifically, the adversary can enumerate all possible plaintext chunks and check whether the ciphertext of any of them matches the target ciphertext chunk. Server-aided MLE [1,2,4,5], which deploys a dedicated key server for MLE key generation, hardens encrypted deduplication against offline brute-force attacks. When the client encrypts a plaintext chunk, it first sends the fingerprint of the plaintext chunk to the key server. The key server then generates and returns the corresponding MLE key based on this fingerprint and a global key that it maintains. As long as the global key remains secret, the adversary cannot feasibly launch offline brute-force attacks; otherwise, the security degrades to that of ordinary MLE.
2.2. Secret Sharing
2.3. Limitations
2.4. Motivation and Challenges
2.5. Threat Model
- Adversarial capabilities: We consider the cloud servers and the key servers honest-but-curious: they honestly follow the storage protocol and execute their assigned tasks, but they would like to learn as much as possible about the encrypted outsourced data by inferring the plaintext from the ciphertext stored in the cloud. Moreover, some clients are also curious and may try to access data beyond the scope of their privileges. Specifically, we consider two types of adversaries in the system:
- An inside adversary refers to a cloud storage service provider (owns the cloud servers) or any of the key servers. The inside adversary is assumed to be honest-but-curious and its goal is to infer the plaintext data from the outsourced ciphertext data it possesses to obtain useful information.
- An outside adversary refers to any external entity that attempts to gain unauthorized access to the data stored in the cloud. The outside adversary plays the role of a client, interacting with cloud servers and key servers.
- Assumptions: Our threat model makes the following assumptions: (i) all communications among the clients, the key servers, and the cloud servers are protected via SSL/TLS against tampering; (ii) the key servers rate-limit the clients’ key generation requests to defend against online brute-force attacks [1] (see Section 2.1); and (iii) to ensure data availability, we deploy remote auditing [19] for data integrity and deduplication-aware secret sharing [14] across multiple storage sites for fault tolerance.
3. Design
- Enabling decentralized server-aided encryption: ECDedup first realizes coding-based secret sharing, which replaces the encryption key with a coding matrix: data chunks are encoded into coded chunks to achieve data confidentiality, while each key server obtains only one vector of the coding matrix and thus cannot restore the full matrix (Section 3.2).
- Preserving deduplication efficiency: ECDedup designs a content-based coding matrix to generate an identical coding matrix for the identical data chunk, so as to generate an identical coded chunk and preserve the deduplication capability (Section 3.3).
- Efficient and scalable matrix generation: We design an element-based matrix generation pipeline to reduce the computation and bandwidth overhead of the matrix generation to improve generation efficiency. We further design a rolling-based matrix generation scheme to enhance the scalability of the matrix (Section 3.4).
3.1. Architecture
3.2. Coding-Based Secret Sharing
- Design idea: Our idea is that the coding matrix of the non-systematic codes can be leveraged to encrypt the plaintext chunks, such that we can achieve both encryption and encoding in a single encoding process (see Section 2.4). We also find that the coding matrix can naturally be decomposed into multiple vectors, allowing the encryption key (in this case the key is the coding matrix) to be shared, such that the server-aided encryption can be divided into multiple small encryption tasks which can be assigned to multiple key servers for parallel computation. This ensures that each key server only accesses the part of the matrix necessary for its computation.
- Sharing the coding matrix among key servers: Based on the idea illustrated above, we design our coding-based secret-sharing scheme. We first decompose the coding matrix into multiple vectors and then send one vector to each key server. The number of vectors generated from the coding matrix equals the number of key servers.
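The decomposition step can be sketched as follows; the round-robin assignment of rows to servers is one possible policy, assumed here for illustration:

```go
package main

import "fmt"

// splitMatrix hands out the rows of a coding matrix to key servers in
// round-robin order, so no single server ever holds the full matrix.
func splitMatrix(m [][]byte, numServers int) [][][]byte {
	shares := make([][][]byte, numServers)
	for i, row := range m {
		s := i % numServers // round-robin row-to-server assignment
		shares[s] = append(shares[s], row)
	}
	return shares
}

func main() {
	m := [][]byte{ // a toy 4x4 coding matrix
		{1, 2, 3, 4},
		{5, 6, 7, 8},
		{9, 10, 11, 12},
		{13, 14, 15, 16},
	}
	for s, rows := range splitMatrix(m, 4) {
		fmt.Printf("key server %d holds %d row vector(s)\n", s, len(rows))
	}
}
```

With four key servers and a 4x4 matrix, each server holds exactly one row vector.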
3.3. Content-Based Coding Matrix
- Insight: We observe that the Cauchy matrix, which is used in RS codes to encode data chunks, can be generated in a customized manner. Thus, we can generate the coding matrix based on the Cauchy matrix to obtain the identical coding matrix for the identical data chunk.
- Design idea: Based on the above observation, we propose a content-based coding matrix generation scheme that generates an identical coding matrix for an identical chunk, so as to preserve deduplication capability while encoding the original data chunks in ECDedup. Our idea is that the Cauchy matrix leveraged in the encoding can be deterministically generated from the chunk’s hash (which is derived from the chunk’s content). For each chunk, ECDedup first generates a hash h using an (optionally salted) hash function H (e.g., SHA-256); then, we generate the elements x_i and y_j of the Cauchy matrix based on the hash h.
- Fixed-size sampling: To efficiently generate the coding matrix from the hash value, we directly sample every w bits from the beginning to the end of the hash value h to generate each element of the two sets, X and Y, that define the Cauchy matrix.
- Server-aided encryption: We then share the original content-based coding matrix with the key servers via the coding-based secret-sharing scheme (see Section 3.2). Each key server receives a vector from the client and calculates the corresponding encrypted vector based on its own secret.
- Invertibility of generated matrices: We need to ensure that the generated Cauchy matrix is invertible, which is a necessary condition for the RS code to decode the coded chunks back to the original data chunks. We can ensure the invertibility of the generated matrix by checking its determinant: if the determinant is nonzero, the matrix is invertible. If the current matrix is not invertible, we perform an elementary transformation and then add 1 to the elements in the pivot positions (diagonal positions) that are 0 to make the matrix invertible. Algorithm 1 details the steps of making the matrix full rank. Note that only square matrices have determinants, which means that this generation scheme can only produce square matrices. This enables sharing the matrix to achieve decentralized server-aided encryption, yet it cannot generate additional parity chunks to achieve fault tolerance. We discuss how to optimize it and generate n parity chunks in Section 3.4.
Algorithm 1 Handle Full Rank
Input: original matrix M
Output: full-rank matrix M′
1: procedure HandleFullRank(M)
2:   M′ ← M; isFullRank ← CheckFullRank(M′)
3:   while isFullRank = false do
4:     for each pivot (diagonal) position i of M′ do
5:       if M′[i][i] = 0 then M′[i][i] ← M′[i][i] + 1
6:     end for
7:     isFullRank ← CheckFullRank(M′)
8:   end while
9:   return M′
10: end procedure
11: function CheckFullRank(M)
12:   r ← the rank of M
13:   if M is not full rank then
14:     return false
15:   else
16:     return true
17:   end if
18: end function
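The CheckFullRank routine of Algorithm 1 amounts to a rank computation over GF(2^8). A sketch via Gaussian elimination follows; note that in GF(2^8) both addition and subtraction are XOR, so no sign bookkeeping is needed:

```go
package main

import "fmt"

// gfMul multiplies in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d).
func gfMul(a, b byte) byte {
	var p byte
	for b > 0 {
		if b&1 == 1 {
			p ^= a
		}
		carry := a & 0x80
		a <<= 1
		if carry != 0 {
			a ^= 0x1d
		}
		b >>= 1
	}
	return p
}

// gfInv finds a multiplicative inverse by exhaustive search.
func gfInv(a byte) byte {
	for b := 1; b < 256; b++ {
		if gfMul(a, byte(b)) == 1 {
			return byte(b)
		}
	}
	panic("0 has no inverse")
}

// checkFullRank eliminates column by column. A square matrix over a field is
// invertible iff elimination finds a nonzero pivot in every column.
func checkFullRank(m [][]byte) bool {
	n := len(m)
	a := make([][]byte, n) // work on a copy
	for i := range m {
		a[i] = append([]byte(nil), m[i]...)
	}
	for col := 0; col < n; col++ {
		piv := -1
		for r := col; r < n; r++ {
			if a[r][col] != 0 {
				piv = r
				break
			}
		}
		if piv == -1 {
			return false // no nonzero pivot: rank deficient
		}
		a[col], a[piv] = a[piv], a[col]
		inv := gfInv(a[col][col])
		for r := col + 1; r < n; r++ {
			f := gfMul(a[r][col], inv)
			for c := col; c < n; c++ {
				a[r][c] ^= gfMul(f, a[col][c]) // subtraction is XOR in GF(2^8)
			}
		}
	}
	return true
}

func main() {
	singular := [][]byte{{1, 2}, {1, 2}} // repeated row: rank 1
	identity := [][]byte{{1, 0}, {0, 1}}
	fmt.Println(checkFullRank(singular), checkFullRank(identity)) // false true
}
```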
3.4. Optimizing Matrix Generation
- Method 1: Element-based matrix generation: As stated in Section 3.3, the elements x_i and y_j of the coding matrix can be generated from the content-based hash of the data chunk. However, the matrix after encryption is not guaranteed to be invertible, so additional calculations are required to ensure invertibility (see Section 3.3). Thus, we propose element-based matrix generation, which couples the server-aided encryption with the generation of the elements, and we further accelerate the matrix generation process by pipelining the generation of the elements.
- Method 2: Rolling-based matrix generation: To support large coding parameters, which are often deployed in storage systems to save storage costs [22], we design a rolling-based matrix generation scheme. We leverage a fixed-size sliding window to generate the elements x_i and y_j from the hash value of the data chunk. The sliding window moves 1 bit at a time, and we record the w bits inside the window as one element (as in Section 3.3).
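A sketch of the rolling window, assuming w-bit elements are read MSB-first from the hash bits; the function and parameter names are ours:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// rollingElements slides a w-bit window over the bits of h (MSB-first),
// advancing 1 bit per step, and records each window as one matrix element.
// A 256-bit hash thus yields 256-w+1 overlapping elements, which is what
// lets a fixed-size hash feed matrices with large coding parameters.
func rollingElements(h []byte, w int) []uint16 {
	nbits := len(h) * 8
	var out []uint16
	for start := 0; start+w <= nbits; start++ {
		var v uint16
		for b := start; b < start+w; b++ {
			v = v<<1 | uint16(h[b/8]>>(7-b%8))&1 // append bit b of h
		}
		out = append(out, v)
	}
	return out
}

func main() {
	h := sha256.Sum256([]byte("chunk content"))
	elems := rollingElements(h[:], 8)
	fmt.Println("elements extracted:", len(elems)) // 256 - 8 + 1 = 249
}
```

Compared with fixed-size sampling (32 disjoint bytes from a SHA-256 hash), the overlap increases the number of available elements by almost 8x for w = 8.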
3.5. Discussion on Security
- Security of encryption key: Similar to the centralized encrypted deduplication schemes, ECDedup calculates the cryptographic hash of the chunk’s content and sends it to the key server to derive the encryption key. With the help of the key server, ECDedup can defend against offline brute-force attacks. The difference between ECDedup and the centralized schemes lies on the client side: ECDedup generates the corresponding coding matrix from the cryptographic hash and shares the coding matrix among multiple key servers. In this way, each key server receives only part of the coding information (i.e., vectors or elements) of the coding matrix, so a single key server cannot infer the complete encryption key. Meanwhile, the information sent to each key server is effectively a short hash, which is likely to incur many hash collisions (i.e., multiple chunks map to the same short hash). Thus, the key server cannot readily guess a chunk from the short hashes [1,4].
- Data confidentiality: Li [23] analyzed the confidentiality of erasure-coding-based (e.g., RS codes) secret-sharing algorithms, which holds under the following condition: if every sub-square matrix of the coding matrix is nonsingular, then the scheme provides confidentiality. For example, if an erasure-coding-based secret-sharing algorithm satisfies this condition, an adversary cannot reconstruct any part of the plaintext data (which is divided into k chunks) with fewer than k ciphertext chunks.
4. Implementation
- Chunking and fingerprinting: We first implement content-defined chunking based on Rabin fingerprinting [12] on the client, which takes the minimum, average, and maximum chunk sizes as input (following [2,3,4,5,6,14], we set them to 4 KB, 8 KB, and 16 KB by default) and computes rolling hashes over a fixed-size sliding window to identify chunk boundaries. To implement coding-based encryption, the client further generates a content-based key for each data chunk via SHA-256.
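A simplified content-defined chunker illustrating the min/avg/max control flow; note that the rolling value here is a toy multiplicative hash, not an actual Rabin fingerprint over a sliding window, and all names are ours:

```go
package main

import "fmt"

// cdcChunk declares a chunk boundary when the low bits of a rolling value hit
// a fixed pattern, subject to minimum and maximum chunk sizes. The mask width
// controls the average chunk size (a 13-bit mask targets ~8 KB on average).
func cdcChunk(data []byte, minSize, maxSize int, mask uint32) [][]byte {
	var chunks [][]byte
	start := 0
	var h uint32
	for i := 0; i < len(data); i++ {
		h = h*31 + uint32(data[i]) // toy rolling value, stands in for Rabin
		size := i - start + 1
		if (size >= minSize && h&mask == 0) || size >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
			h = 0
		}
	}
	if start < len(data) { // flush the trailing partial chunk
		chunks = append(chunks, data[start:])
	}
	return chunks
}

func main() {
	data := make([]byte, 1<<16)
	for i := range data {
		data[i] = byte(i * 7)
	}
	chunks := cdcChunk(data, 2048, 8192, 0x1FFF)
	total := 0
	for _, c := range chunks {
		total += len(c)
	}
	fmt.Println("chunks produced:", len(chunks) > 0, "bytes preserved:", total == len(data))
}
```

Because boundaries depend on content rather than offsets, an insertion early in a file shifts only the chunks around the edit, which is what keeps deduplication effective.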
- Encoding and decoding: We implement the encoding and decoding operation based on Klauspost Reed–Solomon [27], an open-source Reed–Solomon library for Golang which supports SIMD instruction sets [21] (including AVX2, SSE2, and NEON) that can accelerate the encoding and decoding speed. We use the Cauchy matrix as the coding matrix, such that the matrix can be generated based on the chunk’s content.
- Server-aided matrix generation: Recall that a client generates a content-based coding matrix for an original data chunk by generating the elements of the Cauchy matrix from the chunk’s content-based key, as in Section 3.3. For each key server, we implement a matrix generation module, which receives the vectors from clients and generates the server-aided encrypted vectors that form the content-based coding matrix.
- Deduplication: Each server implements a fingerprint index that maps the fingerprint of each ciphertext chunk to the physical address where the ciphertext chunk is stored, using a key-value store based on LevelDB [28]. We also pack the unique ciphertext chunks (each averaging a few KBs) into fixed-size containers (e.g., 4 MB [9]) according to the locality among those chunks, so that chunks with locality can be prefetched together to mitigate disk I/O overhead.
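The container packing can be sketched as follows, assuming chunks arrive in write order so their locality is preserved inside each container:

```go
package main

import "fmt"

const containerSize = 4 << 20 // 4 MB, as in the text

// packContainers appends unique ciphertext chunks into fixed-size containers
// in arrival order; a container is sealed once the next chunk would not fit,
// so neighboring chunks land in the same container and prefetch together.
func packContainers(chunks [][]byte) [][][]byte {
	var containers [][][]byte
	var cur [][]byte
	used := 0
	for _, c := range chunks {
		if used+len(c) > containerSize && len(cur) > 0 {
			containers = append(containers, cur) // seal the full container
			cur, used = nil, 0
		}
		cur = append(cur, c)
		used += len(c)
	}
	if len(cur) > 0 {
		containers = append(containers, cur)
	}
	return containers
}

func main() {
	// With ~8 KB chunks, a 4 MB container holds 512 chunks.
	chunks := make([][]byte, 1024)
	for i := range chunks {
		chunks[i] = make([]byte, 8<<10)
	}
	fmt.Println("containers:", len(packContainers(chunks)))
}
```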
- Blockchain integration: We integrate blockchain into ECDedup to improve the security of the system. We manage the decentralized key servers using a private blockchain and employ zero-knowledge proofs [29] to ensure that the secret key in the key servers cannot be tampered with. We also implement hierarchical classification access control via the smart contracts of blockchain to manage the access control of the data chunks.
5. Evaluation
- ECDedup achieves decentralized server-aided encryption with better encryption performance (including key/matrix generation and encoding) than traditional decentralized server-aided encryption schemes.
- ECDedup can directly index the ciphertext chunks during deduplication on the cloud server, which traditional decentralized server-aided encryption schemes cannot do, and thus reduces the indexing time.
- ECDedup improves the upload and download performance and speeds up the whole encrypted deduplication storage system.
5.1. Setup
- Testbed: We conduct all experiments on Alibaba Cloud with an ecs.g7.xlarge instance as the client and four ecs.g7.xlarge instances as the servers. Each instance is equipped with four vCPUs and 16 GB RAM. All instances are connected via a 10 Gbps network.
- Datasets: We use four well-known deduplication datasets for evaluation, as shown in Table 1. These datasets represent various typical workloads, including website snapshots, tarred source-code files, and database snapshots.
- Configurations: In our evaluation, we compare four encryption schemes: (i) centralized server-aided encryption based on MetaDedup [3], which has a single centralized key server handling all key generation requests from clients; (ii) traditional decentralized server-aided encryption based on GDH-Dedup [8], which has multiple key servers, each handling a portion of the key generation requests; and (iii)–(iv) our ECDedup with the two matrix generation schemes of Section 3. Since ECDedup only changes the encryption scheme on the client side, we use the same setting of 4 servers and 4 key servers in all experiments. For the erasure coding parameters, we follow the settings of the state-of-the-art centralized server-aided encryption schemes [2,3,4,14].
5.2. Experiments
- Experiment #1 (MLE key/matrix generation time): We first evaluate the key/matrix generation time of different encryption schemes on five fixed-size (1 GB, 2 GB, 4 GB, 8 GB, and 16 GB) datasets sampled from the GCC dataset in Table 1. Figure 7a compares the key/matrix generation time for ECDedup, centralized server-aided encryption, and GDH-Dedup. ECDedup (pipeline) demonstrates the best key/matrix generation performance. Compared to centralized server-aided encryption, ECDedup (pipeline) reduces the key/matrix generation time by 14.4%, 26.9%, 33.7%, 32.1%, and 32.8% for the 1 GB, 2 GB, 4 GB, 8 GB, and 16 GB datasets, respectively, with a maximum reduction of 33.7%.
- Experiment #2 (Encryption and encoding time): We then evaluate the encryption and encoding time of different encryption schemes under the same setup as in Experiment #1. Figure 7b shows the encryption and encoding time of ECDedup (vanilla) and ECDedup (pipeline). We can see that the encoding time of ECDedup (pipeline) is shorter than that of all other encryption schemes. Compared to centralized server-aided encryption, ECDedup (pipeline) reduces the encryption and encoding time by up to 69.27% for the 2 GB dataset. ECDedup (pipeline) can also reduce the encoding time by up to 51.19% compared with ECDedup (vanilla). This is because the pipeline scheme can leverage the built-in acceleration strategy of the encoding library by reusing the element sequences X and Y during encoding.
- Experiment #3 (Impact of k for ECDedup): We evaluate the impact of the coding parameter k on both the matrix generation time and the encoding time of ECDedup (vanilla) and ECDedup (pipeline), setting k to the commonly used parameters of RS codes. Figure 8a shows the matrix generation time of ECDedup (vanilla) and ECDedup (pipeline) under different settings of k. We can see that ECDedup (pipeline) is always faster than ECDedup (vanilla). Both the matrix generation time and the encoding time of the two schemes grow with k, and the time of ECDedup (vanilla) grows much faster than that of ECDedup (pipeline). For example, as k increases across the tested settings, the matrix generation time of ECDedup (pipeline) rises from 8.06 s to 14.45 s, while that of ECDedup (vanilla) rises from 17.96 s to 205.42 s, making ECDedup (pipeline) about 14 times faster than ECDedup (vanilla) at the largest k.
- Experiment #4 (Client throughput): We also evaluate the client throughput of different encryption schemes under the same setup as in Experiment #1. Figure 9 shows the client throughput of different encryption schemes. We can see that ECDedup (pipeline) achieves the highest client throughput among all encryption schemes. The throughput of all encryption schemes increases with the size of the dataset and eventually stabilizes. This is because the system resources can be fully utilized by the pipeline of ECDedup (pipeline) to continuously process data when processing large amounts of data (e.g., 16 GB). Thus, ECDedup (pipeline) can effectively improve the client throughput by 43.3% to 51.9% compared to the centralized scheme.
- Experiment #5 (Average indexing time per chunk): We evaluate the average indexing time per chunk (the average chunk size is set to 8 KB, as in Section 2.1) of GDH-Dedup and our ECDedup. Figure 10 shows the average indexing time per chunk of GDH-Dedup and ECDedup. We can see that the indexing time of ECDedup is much lower than that of GDH-Dedup. This is because different key servers in GDH-Dedup encrypt the same plaintext chunk into different ciphertext chunks, so indexing a ciphertext chunk requires complex cryptographic computation.
- Experiment #6 (Uploads and downloads): We evaluate the upload and download speed of the centralized scheme and our ECDedup on the four real-world datasets in Table 1. Figure 11 shows the upload and download performance of different encryption schemes. We can see that ECDedup (pipeline) outperforms the centralized scheme in all cases. Compared to the centralized scheme, ECDedup (pipeline) can improve the upload and download speed by up to 15.2% and 34.6%, respectively.
- Experiment #7 (Blockchain integration): We evaluate the integration of blockchain into ECDedup. Recall that we only use the blockchain to verify the secret key possessed by the key servers (see Section 4). In this experiment, we only focus on the performance overhead incurred from the blockchain integration. Table 3 shows the upload and download performance of ECDedup with and without blockchain. We can see that the blockchain integration only introduces a small overhead to the system, which is less than 0.1% of the total system overhead.
6. Related Work
- Decentralized server-aided encryption in encrypted deduplication: MLE-based encrypted deduplication is widely deployed in outsourced storage. Traditional encrypted deduplication storage systems realize the MLE construction via convergent encryption (CE), which is vulnerable to offline brute-force attacks. DupLESS proposes server-aided MLE, which performs MLE key generation on a dedicated key server to defend against offline brute-force attacks. Follow-up studies enhance server-aided MLE with decentralized key generation. For example, Duan [33] leverages RSA-based threshold signatures to generate convergent keys among multiple users, eliminating centralized key servers to address single-node failures and key leakage. Miao et al. [7] allow each key server to independently generate keys and share them with all other key servers via blind signatures. GDH-Dedup builds on a Gap Diffie–Hellman (GDH) group, in which the decisional Diffie–Hellman problem can be solved efficiently while the computational Diffie–Hellman problem remains hard; exploiting the easy decisional problem reduces the computational overhead.
- Secret sharing: Secret sharing was first formalized by Shamir [34] to provide security and reliability for shared data. Shamir’s secret sharing scheme (SSSS) can be viewed as a Reed–Solomon (RS) code that encodes k shares (one data share and k−1 random shares) into n shares to ensure the confidentiality of the shared data. The ramp secret-sharing scheme (RSSS) [35] improves the storage efficiency of SSSS by dividing a secret into k−r pieces and generating r additional random pieces for fault tolerance and security. Dekey [36] uses RSSS to disperse the convergent keys of data blocks across multiple key storage servers, ensuring the reliability of the keys while reducing their storage overhead. AONT-RS [15] combines Rivest’s all-or-nothing transform (AONT) [16] for confidentiality with RS codes for fault tolerance, further improving storage efficiency while maintaining confidentiality comparable to SSSS. CAONT-RS [14] replaces the random key used in AONT-RS with a deterministic, content-derived key to enable deduplication on shared data.
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Bellare, M.; Keelveedhi, S.; Ristenpart, T. DupLESS: Server-Aided Encryption for Deduplicated Storage. In Proceedings of the USENIX Security, Washington, DC, USA, 14–16 August 2013.
- Qin, C.; Li, J.; Lee, P.P. The design and implementation of a rekeying-aware encrypted deduplication storage system. ACM Trans. Storage 2017, 13, 1–30.
- Li, J.; Lee, P.P.; Ren, Y.; Zhang, X. Metadedup: Deduplicating metadata in encrypted deduplication via indirection. In Proceedings of the IEEE MSST, Santa Clara, CA, USA, 20–24 May 2019.
- Li, J.; Yang, Z.; Ren, Y.; Lee, P.P.; Zhang, X. Balancing storage efficiency and data confidentiality with tunable encrypted deduplication. In Proceedings of the EuroSys, Heraklion, Greece, 27–30 April 2020.
- Ren, Y.; Li, J.; Yang, Z.; Lee, P.P.; Zhang, X. Accelerating Encrypted Deduplication via SGX. In Proceedings of the USENIX ATC, Online, 14–16 July 2021.
- Yang, Z.; Li, J.; Lee, P.P. Secure and lightweight deduplicated storage via shielded deduplication-before-encryption. In Proceedings of the USENIX ATC, Carlsbad, CA, USA, 11–13 July 2022.
- Miao, M.; Wang, J.; Li, H.; Chen, X. Secure multi-server-aided data deduplication in cloud computing. Pervasive Mob. Comput. 2015, 24, 129–137.
- Shin, Y.; Koo, D.; Yun, J.; Hur, J. Decentralized server-aided encryption for secure deduplication in cloud storage. IEEE Trans. Serv. Comput. 2020, 13, 1021–1033.
- Xia, W.; Jiang, H.; Feng, D.; Douglis, F.; Shilane, P.; Hua, Y.; Fu, M.; Zhang, Y.; Zhou, Y. A comprehensive study of the past, present, and future of data deduplication. Proc. IEEE 2016, 104, 1681–1710.
- Zhu, B.; Li, K.; Patterson, R.H. Avoiding the disk bottleneck in the data domain deduplication file system. In Proceedings of the USENIX FAST, San Jose, CA, USA, 26–29 February 2008.
- Wallace, G.; Douglis, F.; Qian, H.; Shilane, P.; Smaldone, S.; Chamness, M.; Hsu, W. Characteristics of backup workloads in production systems. In Proceedings of the USENIX FAST, San Jose, CA, USA, 14–17 February 2012.
- Rabin, M.O. Fingerprinting by Random Polynomials; Technical Report; Harvard University: Cambridge, MA, USA, 1981.
- Douceur, J.R.; Adya, A.; Bolosky, W.J.; Simon, P.; Theimer, M. Reclaiming space from duplicate files in a serverless distributed file system. In Proceedings of the IEEE ICDCS, Vienna, Austria, 2–5 July 2002.
- Li, M.; Qin, C.; Lee, P.P. CDStore: Toward Reliable, Secure, and Cost-Efficient Cloud Storage via Convergent Dispersal. In Proceedings of the USENIX ATC, San Jose, CA, USA, 8–10 July 2015.
- Resch, J.K.; Plank, J.S. AONT-RS: Blending Security and Performance in Dispersed Storage Systems. In Proceedings of the USENIX FAST, San Jose, CA, USA, 15–17 February 2011.
- Rivest, R.L. All-or-nothing encryption and the package transform. In Proceedings of the Springer FSE, Haifa, Israel, 20–22 January 1997.
- Reed, I.; Solomon, G. Polynomial Codes over Certain Finite Fields. J. Soc. Ind. Appl. Math. 1960, 8, 300–304.
- Dimakis, A.G.; Ramchandran, K.; Wu, Y.; Suh, C. A Survey on Network Codes for Distributed Storage. Proc. IEEE 2011, 99, 476–489.
- Ateniese, G.; Burns, R.; Curtmola, R.; Herring, J.; Kissner, L.; Peterson, Z.; Song, D. Provable data possession at untrusted stores. In Proceedings of the ACM CCS, Alexandria, VA, USA, 28–31 October 2007.
- Plank, J.S.; Luo, J.; Schuman, C.D.; Xu, L.; Wilcox-O’Hearn, Z. A Performance Evaluation and Examination of Open-Source Erasure Coding Libraries for Storage. In Proceedings of the USENIX FAST, San Jose, CA, USA, 24–27 February 2009.
- Plank, J.S.; Greenan, K.M.; Miller, E.L. Screaming fast Galois field arithmetic using Intel SIMD instructions. In Proceedings of the USENIX FAST, San Jose, CA, USA, 12–15 February 2013.
- Hu, Y.; Cheng, L.; Yao, Q.; Lee, P.P.C.; Wang, W.; Chen, W. Exploiting combined locality for Wide-Stripe erasure coding in distributed storage. In Proceedings of the USENIX FAST, San Jose, CA, USA, 23–25 February 2021.
- Li, M. On the confidentiality of information dispersal algorithms and their erasure codes. arXiv 2012, arXiv:1206.4123.
- Roth, R.M.; Lempel, A. On MDS codes via Cauchy matrices. IEEE Trans. Inf. Theory 1989, 35, 1314–1319.
- Blomer, J. An XOR-Based Erasure-Resilient Coding Scheme; Technical Report; International Computer Science Institute (ICSI): Berkeley, CA, USA, 1995.
- OpenSSL. Cryptography and SSL/TLS Toolkit. Available online: https://www.openssl.org/ (accessed on 1 November 2024).
- Reed-Solomon Erasure Coding in Go. Available online: https://github.com/klauspost/reedsolomon (accessed on 1 November 2024).
- LevelDB. Available online: https://github.com/google/leveldb (accessed on 1 November 2024).
- Fiege, U.; Fiat, A.; Shamir, A. Zero knowledge proofs of identity. In Proceedings of the STOC, New York, NY, USA, 25–27 May 1987.
- Zhang, Y.; Xia, W.; Feng, D.; Jiang, H.; Hua, Y.; Wang, Q. Finesse: Fine-Grained Feature Locality based Fast Resemblance Detection for Post-Deduplication Delta Compression. In Proceedings of the USENIX FAST, Boston, MA, USA, 25–28 February 2019.
- Linux Archives. Available online: https://www.kernel.org (accessed on 1 November 2024).
- Redis. Available online: http://redis.io/ (accessed on 1 November 2024).
- Duan, Y. Distributed key generation for encrypted deduplication: Achieving the strongest privacy. In Proceedings of the ACM CCSW, Scottsdale, AZ, USA, 3–7 November 2014; pp. 57–68.
- Shamir, A. How to share a secret. Commun. ACM 1979, 22, 612–613.
- Blakley, G.R.; Meadows, C. Security of ramp schemes. In Proceedings of the CRYPTO, Santa Barbara, CA, USA, 18–22 August 1985.
- Li, J.; Chen, X.; Li, M.; Li, J.; Lee, P.P.; Lou, W. Secure deduplication with efficient and reliable convergent key management. IEEE Trans. Parallel Distrib. Syst. 2014, 25, 1615–1625.
| Name | Size | Workload Description |
| --- | --- | --- |
| Webs | 347 GB | 129 days’ snapshots of the website news.sina.com.cn [30] |
| Linux | 109 GB | 253 versions of the Linux kernel source code [31], each packaged as a tar file |
| RDB | 11 GB | 10 backups of the Redis key-value store database [32] |
| GCC | 28 GB | 89 backups of the GNU Compiler Collection source code files [32] |
| Average chunk size | 8 KB | 16 KB | 32 KB | 64 KB |
| --- | --- | --- | --- | --- |
| Indexing time (s) | 1726.10 | 813.92 | 312.96 | 137.15 |
| | Upload Time (s) | Download Time (s) | Verification Time (ms) |
| --- | --- | --- | --- |
| ECDedup | 79.05 | 67.86 | 9 |
Share and Cite
Gan, C.; Wang, W.; Hu, Y.; Zhao, X.; Dun, S.; Xiao, Q.; Wang, W.; Huang, H. Coupling Secret Sharing with Decentralized Server-Aided Encryption in Encrypted Deduplication. Appl. Sci. 2025, 15, 1245. https://doi.org/10.3390/app15031245