Efficient Implementation of NIST LWC ESTATE Algorithm Using OpenCL and Web Assembly for Secure Communication in Edge Computing Environment
Abstract
1. Introduction
Contribution
- 1
- Web-based application edge computing method using Web Assembly. As the number of users of cloud computing services increases, so does the amount of data that must be processed, which places a load on the system providing the service. The edge computing approach was created to address this: it offloads data transmission and processing to hardware such as ARM, RISC-V, and AVR devices to reduce the system load. However, this approach has the disadvantage that the service must be implemented separately for each hardware environment and programming language. We therefore propose a web-based application edge computing method using Web Assembly, which was created to deliver performance close to that of a low-level language. The web-based method can be used without additional modification on PCs, smartphones, and IoT devices that can run web-based applications. In addition, edge computing exchanges data between the server and the web-based application, between the web-based application and the user, and between web-based applications. To protect this traffic, the ESTATE algorithm, which produces both the ciphertext and the authentication tag, is implemented in Web Assembly to provide edge computing services. We also measure how closely Web Assembly approaches the performance of a low-level language. Web Assembly was run on Chrome, FireFox, and Microsoft Edge. On these browsers, respectively, the Web Assembly implementation of TweAES-128 is approximately 11%, 10%, and 9% slower than the reference C code, TweAES-128-6 is about 5%, 2%, and 6% slower, and TweGIFT-128 is about 22%, 54%, and 17% slower.
- 2
- LWC ESTATE parallel processing using OpenCL. ESTATE (Energy efficient and Single-state Tweakable block cipher based MAC-Then-Encrypt) is designed for constrained environments, but the data are ultimately stored on the server, so ESTATE must also be optimized on the server side. ESTATE divides the AD (Associated Data) and the message into 128-bit blocks and encrypts them one block at a time, with each block's result feeding into the next step, so a single large input cannot be processed in parallel. A server has to send data to multiple platforms, and processing these transmissions sequentially slows communication. We therefore propose a method that uses OpenCL parallel processing to generate simultaneously the multiple ciphertexts and tags to be sent to multiple web-based applications for edge computing. The OpenCL implementations of TweAES-128, TweAES-128-6, and TweGIFT-128 showed performance improvements of 6.69×, 7.31×, and 1.47×, respectively, compared to the reference C code.
- 3
- Optimization method for safe and efficient operation of the ESTATE algorithm. ESTATE uses TweAES-128, TweAES-128-6, and TweGIFT-128. We propose several methods for safe and efficient operation and also apply previously studied methods. Each of the three ciphers expands a 4-bit tweak value and XORs the result into the state: TweAES-128 and TweAES-128-6 expand the tweak to 8 bits, and TweGIFT-128 expands it to 32 bits. However, only the values 0∼7 and 15 are used as 4-bit tweaks. We therefore pre-compute and store the 8-bit and 32-bit expanded tweak values for these nine 4-bit tweak values. In the OpenCL implementations of TweAES-128, TweAES-128-6, and TweGIFT-128, we apply loop unrolling to remove the performance penalty of conditional statements and keep working data in local memory, which is relatively fast. The round operations of TweAES-128 and TweAES-128-6 are the same as those of AES, so the previously studied T-table method was applied. AES is vulnerable to cache-timing attacks, and TweAES-128 and TweAES-128-6, which share its structure, are expected to be vulnerable as well. We therefore apply the previously studied T-table shuffling method so that TweAES-128 and TweAES-128-6 operate safely. With T-table shuffling applied, TweAES-128 and TweAES-128-6 show about 7% and 51% performance overhead, respectively. The method only shuffles the T-tables, which keeps the overhead for TweAES-128 small.
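
The tweak pre-computation described in the third contribution can be sketched as a small lookup table. This is a minimal illustration only: the expansion functions `expand_tweak_8` and `expand_tweak_32` are hypothetical placeholders chosen for the example, not the expansion defined by the ESTATE specification, and `add_tweak_word` simplifies AddTweak to a single 32-bit XOR.

```python
# The only 4-bit tweak values the text says are used: 0..7 and 15 (nine values).
USED_TWEAKS = (0, 1, 2, 3, 4, 5, 6, 7, 15)


def expand_tweak_8(tweak):
    """Hypothetical 4-bit -> 8-bit expansion (placeholder, not the ESTATE one).

    Low nibble: the tweak itself. High nibble: the tweak with all four bits
    flipped when the tweak has odd parity, unchanged otherwise.
    """
    if not 0 <= tweak <= 0xF:
        raise ValueError("tweak must fit in 4 bits")
    parity = bin(tweak).count("1") & 1
    high = tweak ^ (0xF if parity else 0x0)
    return (high << 4) | tweak


def expand_tweak_32(tweak):
    """Hypothetical 4-bit -> 32-bit expansion: the 8-bit value repeated 4 times."""
    return expand_tweak_8(tweak) * 0x01010101


# Pre-computation: performed once, before any block is encrypted.
TWEAK8 = {t: expand_tweak_8(t) for t in USED_TWEAKS}
TWEAK32 = {t: expand_tweak_32(t) for t in USED_TWEAKS}


def add_tweak_word(state_word, tweak):
    """XOR the stored 32-bit expansion into one 32-bit state word (simplified)."""
    return state_word ^ TWEAK32[tweak]
```

With the tables in place, the per-round work reduces to one table read and one XOR, and a tweak outside the nine supported values fails fast with a `KeyError` instead of being silently expanded.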
2. Background
2.1. Edge Computing
2.2. Web Environment
2.3. Web Assembly
2.4. OpenCL
Memory | Characteristics |
---|---|
Global Memory | (1) Read and write from all work items (2) Placed in device’s main memory |
Constant Memory | (1) Read only from all work items (2) Placed in device’s main memory |
Local Memory | (1) Can be shared among the work items in a work group (2) In many cases, implemented as shared memory located on each compute unit |
Private Memory | (1) Dedicated memory area for each work item (2) Often mapped to the registers of the processing elements |
2.5. Lightweight Cryptography (LWC) ESTATE
Algorithm 1 ESTATE Encryption, Tag Creation, Authentication, and Decryption Algorithm [16]. | |
1: function ESTATE.ENC[]() | 18: function ESTATE.DEC[]() |
2: | 19: |
3: | 20: |
4: return | 21: return ? M : ⊥ |
5: function MAC[]() | 22: function FCBC |
6: if and then | 23: |
7: return | 24: for to do |
8: | 25: |
9: if then | 26: |
10: | 27: return T |
11: ; ? 2 : 3 : 6 : 7 | |
12: FCBC | 28: function OFB[]() |
13: if then | 29: |
14: | 30: for to do |
15: ? 4 : 5 | 31: |
16: FCBC[]() | 32: chop() ⨁ |
17: return T | 33: return () |
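
The MAC-then-encrypt shape of Algorithm 1 (a chained MAC over the nonce, AD, and message, followed by OFB encryption seeded by the tag) can be sketched as follows. This is a simplified illustration under stated assumptions, not an ESTATE-conformant implementation: `tbc` is a stand-in keyed function built from BLAKE2b rather than TweAES-128 or TweGIFT-128, the tweak numbers 0 to 3 are arbitrary domain separators, and the always-pad rule in `pad_blocks` replaces the tweak-based handling of partial blocks in the real scheme.

```python
import hashlib
import hmac

BLOCK = 16  # 128-bit block size, as in the paper


def tbc(key, tweak, block):
    """Stand-in tweakable block function (keyed BLAKE2b), NOT TweAES/TweGIFT."""
    return hashlib.blake2b(block, key=key, person=bytes([tweak]),
                           digest_size=BLOCK).digest()


def xor(a, b):
    """Bytewise XOR, truncated to the shorter input."""
    return bytes(x ^ y for x, y in zip(a, b))


def pad_blocks(data):
    """Append 0x80 then zeros up to a block boundary, and split into blocks."""
    padded = data + b"\x80" + b"\x00" * (-(len(data) + 1) % BLOCK)
    return [padded[i:i + BLOCK] for i in range(0, len(padded), BLOCK)]


def fcbc_mac(key, nonce, ad, msg):
    """CBC-style chain: each block is XORed into the state, then re-encrypted."""
    state = tbc(key, 1, nonce)
    for blk in pad_blocks(ad):
        state = tbc(key, 2, xor(state, blk))
    for blk in pad_blocks(msg):
        state = tbc(key, 3, xor(state, blk))
    return state


def ofb_xor(key, tag, data):
    """OFB keystream seeded by the tag; the last block is chopped to length."""
    out, state = bytearray(), tag
    for i in range(0, len(data), BLOCK):
        state = tbc(key, 0, state)
        out += xor(data[i:i + BLOCK], state)
    return bytes(out)


def encrypt(key, nonce, ad, msg):
    """MAC first, then encrypt the message under a keystream derived from the tag."""
    tag = fcbc_mac(key, nonce, ad, msg)
    return ofb_xor(key, tag, msg), tag


def decrypt(key, nonce, ad, ct, tag):
    """Decrypt, recompute the tag, and return None (the ⊥ case) on mismatch."""
    msg = ofb_xor(key, tag, ct)
    expected = fcbc_mac(key, nonce, ad, msg)
    return msg if hmac.compare_digest(expected, tag) else None
```

Because both the MAC chain and the OFB keystream feed each block's output into the next call, the blocks of one message cannot be processed independently, which is why the parallelism proposed in this paper operates across messages rather than within one.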
Algorithm 2 sESTATE Encryption, Tag Creation, Authentication, and Decryption Algorithm [16]. | |
1: function ESTATE.ENC[]() | 18: function ESTATE.DEC[]() |
2: | 19: |
3: | 20: |
4: return | 21: return ? M : ⊥ |
5: function MAC[]() | 22: function FCBC |
6: if and then | 23: |
7: return | 24: for to do |
8: | 25: |
9: if then | 26: |
10: | 27: return T |
11: ; ? 2 : 3 : 6 : 7 | |
12: FCBC | 28: function OFB[]() |
13: if then | 29: |
14: | 30: for to do |
15: ? 4 : 5 | 31: |
16: FCBC[]() | 32: chop() ⨁ |
17: return T | 33: return () |
2.5.1. TweAES-128, TweAES-128-6
Algorithm 3 TweAES-128 Algorithm [16]. | |
1: function TweAES() | 15: function TweAES-6() |
2: KeyGen(K) | 16: KeyGen() |
3: | 17: |
4: for to 9 do | 18: for to 6 do |
5: SubBytes(X) | 19: SubBytes(X) |
6: ShiftRows(X) | 20: ShiftRows(X) |
7: MixColumns(X) | 21: MixColumns(X) |
8: | 22: |
9: if then | 23: if and then |
10: AddTweak() | 24: AddTweak() |
11: SubBytes(X) | 25: return X |
12: ShiftRows(X) | |
13: | 26: function AddTweak() |
14: return X | 27: |
28: | |
29: | |
30: for to 3 do | |
31: | |
32: for to 7 do | |
33: | |
34: return X |
2.5.2. TweGIFT-128
Algorithm 4 TweGIFT-128 Algorithm [16]. | |
1: function TweGIFT() | 11: function AddTweak() |
2: | 12: |
3: for to 39 do | 13: |
4: SubCells(X) | 14: |
5: PermBits(X) | 15: for to 3 do |
6: AddRoundKey() | 16: |
7: AddRoundConstant() | 17: |
8: if () and then | 18: |
9: AddTweak() | 19: |
10: return X | 20: for to 31 do |
21: | |
22: return X |
3. Related Work
3.1. Existing Crypto Implementation Using OpenCL
Paper | GPU | Language | Mode | Throughput (Gbps) |
---|---|---|---|---|
Yuan et al. [23] | ATI HD 7670M | OpenCL | CTR | 5.04 Gbps |
Wang et al. [24] | NVIDIA GTX 285 | OpenCL | XTS | 8.59 Gbps |
Wang et al. [24] | NVIDIA GTX 285 | CUDA | XTS | 9.74 Gbps |
Conti et al. [25] | NVIDIA GT 555M | OpenCL | CTR | 10.00 Gbps |
Biagio et al. [26] | NVIDIA GT 8800 | CUDA | CTR | 12.50 Gbps |
Sanida et al. [22] | NVIDIA GTX 1060 | OpenCL | XTS | 12.53 Gbps |
Sanida et al. [22] | NVIDIA GTX 1060 | OpenCL | CTR | 14.71 Gbps |
Cryptographic Algorithm | Key Size | Constant Space | Compilation Time |
---|---|---|---|
AES [20] | 128-bit | 844 KB | 2.7 ms |
DES [28] | 192-bit | 1294 KB | 5.3 ms |
BlowFish [29] | 256-bit | 252 B | 3.5 ms |
RSA [30] | 128-bit | 6 KB | 1031 ms |
3.2. Web Assembly
Algorithm | Chrome: Web Assembly | Chrome: JavaScript | FireFox: Web Assembly | FireFox: JavaScript | Microsoft Edge: Web Assembly | Microsoft Edge: JavaScript |
---|---|---|---|---|---|---|
revised CHAM-64/128 [33] | 120 cpb (2.1 times) | 260 cpb | 120 cpb (2.1 times) | 260 cpb | 120 cpb (2 times) | 240 cpb |
revised CHAM-128/128 [33] | 60 cpb (3 times) | 180 cpb | 60 cpb (1.6 times) | 100 cpb | 70 cpb (1.8 times) | 130 cpb |
revised CHAM-128/256 [33] | 70 cpb (3 times) | 210 cpb | 70 cpb (2.1 times) | 150 cpb | 70 cpb (2.8 times) | 200 cpb |
wNAF | 27 cpb (11 times) | 300 cpb | 30 cpb (12 times) | 365 cpb | 27 cpb (11 times) | 322 cpb |
wNAF [34] (Atomic block [35]) | 42 cpb (10 times) | 447 cpb | 37 cpb (10 times) | 405 cpb | 37 cpb (14 times) | 522 cpb |
wNAF (Improved Atomic block [6]) | 32 cpb (11 times) | 365 cpb | 32 cpb (12 times) | 387 cpb | 30 cpb (14 times) | 437 cpb |
SHA-256 [36] | 27 cpb (7.5 times) | 203 cpb | 20 cpb (10.8 times) | 216 cpb | 20 cpb (11 times) | 221 cpb |
HMAC [37] | 92 cpb (7.5 times) | 697 cpb | 93 cpb (24.8 times) | 2315 cpb | 97 cpb (7.1 times) | 693 cpb |
3.3. Cache Timing Attack
4. Proposed Implementation for Secure Communication in Edge Computing Services
4.1. Overall Architecture of Proposed Software
4.2. Edge Computing and ESTATE Implementation Using Web Assembly
4.3. Parallel Implementation of ESTATE Using OpenCL
Algorithm 5 TweAES-128, TweAES-128-6, TweGIFT-128 proposed by applying loop unrolling method. | |
1: function loop unrolling TweAES-128() | 41: function loop unrolling TweGIFT-128() |
2: KeyGen() | 42: |
3: | 43: for i = 0 to 6 do |
4: for i = 1 to 4 do | 44: for j = 0 to 3 do |
5: SubBytes(X) | 45: SubCells(X) |
6: ShiftRows(X) | 46: PermBits(X) |
7: MixColumns(X) | 47: AddRoundKey() |
8: | 48: AddRoundConstant() |
9: SubBytes(X) | 49: SubCells(X) |
10: ShiftRows(X) | 50: PermBits(X) |
11: MixColumns(X) | 51: AddRoundKey() |
12: | 52: AddRoundConstant() |
13: AddTweak(X, T) | 53: AddTweak(X, T) |
14: SubBytes(X) | 54: for i = 35 to 39 do |
15: ShiftRows(X) | 55: SubCells(X) |
16: MixColumns(X) | 56: PermBits(X) |
17: | 57: AddRoundKey() |
18: SubBytes(X) | 58: AddRoundConstant() |
19: ShiftRows(X) | |
20: | |
21: function loop unrolling TweAES-6() | |
22: KeyGen() | |
23: | |
24: for i = 1 to 2 do | |
25: SubBytes(X) | |
26: ShiftRows(X) | |
27: MixColumns(X) | |
28: | |
29: SubBytes(X) | |
30: ShiftRows(X) | |
31: MixColumns(X) | |
32: | |
33: AddTweak(X, T) | |
34: SubBytes(X) | |
35: ShiftRows(X) | |
36: MixColumns(X) | |
37: | |
38: SubBytes(X) | |
39: ShiftRows(X) | |
40: |
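
The effect of the unrolling in Algorithm 5 can be checked by comparing the operation sequence of a branching loop with that of the unrolled loop. The sketch below does this for TweAES-128 only and rests on two assumptions: the blank steps in Algorithms 3 and 5 are taken to be AddRoundKey, and the lost tweak condition in Algorithm 3 is taken to be "every second round", as the unrolled form suggests.

```python
FULL_ROUND = ["SubBytes", "ShiftRows", "MixColumns", "AddRoundKey"]
FINAL_ROUND = ["SubBytes", "ShiftRows", "AddRoundKey"]  # no MixColumns


def tweaes128_with_branch():
    """Round schedule that tests a condition in every round (Algorithm 3 style)."""
    ops = ["AddRoundKey"]                 # assumed initial key addition
    for i in range(1, 10):                # rounds 1..9
        ops += FULL_ROUND
        if i % 2 == 0:                    # assumed condition: every second round
            ops.append("AddTweak")
    ops += FINAL_ROUND
    return ops


def tweaes128_unrolled():
    """Same schedule with the per-round test removed (Algorithm 5 style)."""
    ops = ["AddRoundKey"]
    for _ in range(4):                    # "for i = 1 to 4" in Algorithm 5
        ops += FULL_ROUND + FULL_ROUND + ["AddTweak"]
    ops += FULL_ROUND                     # round 9
    ops += FINAL_ROUND                    # round 10
    return ops
```

Both functions produce the same sequence with four AddTweak steps, so the unrolled form changes only the control flow: the loop body no longer contains a branch that could cause work items in the same group to diverge.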
4.4. Safe and Efficient Implementation of TweAES-128, TweAES-128-6, TweGIFT-128 of the ESTATE Algorithm
Algorithm 6 ESTATE TweAES-128, TweAES-128-6 Proposal Method Applying T-table Shuffling |
1: Te0-sf : Te0[shuffle-array] |
2: Te1-sf : Te1[shuffle-array] |
3: Te2-sf : Te2[shuffle-array] |
4: Te3-sf : Te3[shuffle-array] |
5: function 1-round(S0∼S3, RK) |
6: S0 = Te0-sf[S0 ≫ 24] ⊕ Te1-sf[S1 ≫ 16 & 0xff] ⊕ Te2-sf[S2 ≫ 8 & 0xff] ⊕ Te3-sf[S3 & 0xff] ⊕ RK0 |
7: S1 = Te0-sf[S1 ≫ 24] ⊕ Te1-sf[S2 ≫ 16 & 0xff] ⊕ Te2-sf[S3 ≫ 8 & 0xff] ⊕ Te3-sf[S0 & 0xff] ⊕ RK1 |
8: S2 = Te0-sf[S2 ≫ 24] ⊕ Te1-sf[S3 ≫ 16 & 0xff] ⊕ Te2-sf[S0 ≫ 8 & 0xff] ⊕ Te3-sf[S1 & 0xff] ⊕ RK2 |
9: S3 = Te0-sf[S3 ≫ 24] ⊕ Te1-sf[S0 ≫ 16 & 0xff] ⊕ Te2-sf[S1 ≫ 8 & 0xff] ⊕ Te3-sf[S2 & 0xff] ⊕ RK3 |
10: function 1-round with AddTweak(S0∼S3, RK, tweak) |
11: S0 = Te0-sf[S0 ≫ 24] ⊕ Te1-sf[S1 ≫ 16 & 0xff] ⊕ Te2-sf[S2 ≫ 8 & 0xff] ⊕ Te3-sf[S3 & 0xff] ⊕ RK0 |
12: S1 = Te0-sf[S1 ≫ 24] ⊕ Te1-sf[S2 ≫ 16 & 0xff] ⊕ Te2-sf[S3 ≫ 8 & 0xff] ⊕ Te3-sf[S0 & 0xff] ⊕ RK1 |
13: S2 = Te0-sf[S2 ≫ 24] ⊕ Te1-sf[S3 ≫ 16 & 0xff] ⊕ Te2-sf[S0 ≫ 8 & 0xff] ⊕ Te3-sf[S1 & 0xff] ⊕ RK2 |
14: S3 = Te0-sf[S3 ≫ 24] ⊕ Te1-sf[S0 ≫ 16 & 0xff] ⊕ Te2-sf[S1 ≫ 8 & 0xff] ⊕ Te3-sf[S2 & 0xff] ⊕ RK3 |
15: AddTweak(S0∼S3, tweak) |
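
The table layout behind `Te0-sf : Te0[shuffle-array]` can be illustrated with a Fisher-Yates shuffle. This sketch shows only how a shuffled table and its index map stay consistent; it is not the exact construction of the cited T-table shuffling method and has not been evaluated as a side-channel countermeasure (the index map here is itself an ordinary lookup). The names `make_shuffled_table` and `lookup`, and the toy table standing in for Te0, are assumptions made for the example.

```python
import random


def make_shuffled_table(table, rng):
    """Return (shuffled, index_map) such that shuffled[index_map[i]] == table[i]."""
    n = len(table)
    index_map = list(range(n))
    # Fisher-Yates shuffle of the index map, written out explicitly.
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i + 1)
        index_map[i], index_map[j] = index_map[j], index_map[i]
    # Move every entry to its new, shuffled position.
    shuffled = [0] * n
    for original_index, new_index in enumerate(index_map):
        shuffled[new_index] = table[original_index]
    return shuffled, index_map


def lookup(shuffled, index_map, x):
    """Read the entry for logical index x from the shuffled table."""
    return shuffled[index_map[x]]


# Toy 256-entry table of 32-bit words standing in for Te0 (not real AES values).
TOY_TE0 = [(i * 0x01010101) & 0xFFFFFFFF for i in range(256)]

# A seeded generator keeps the example reproducible; a real deployment would
# need a cryptographically secure source such as random.SystemRandom().
TE0_SF, SHUFFLE_ARRAY = make_shuffled_table(TOY_TE0, random.Random(2021))
```

Every logical lookup still returns the original table value, so the round function is unchanged apart from the extra indirection, which is the added cost reflected in the overhead figures reported for T-table shuffling.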
5. Results
6. Conclusions
- 1
- Implementation of ESTATE algorithm using Web Assembly. Web Assembly was created to deliver performance close to that of a low-level language in a web environment. Cryptographic algorithms implemented in web-based applications can be used without additional modification on PCs, smartphones, and IoT devices used as edge devices. This is also cost-effective, because a single implementation can be used across different platforms. In addition, web-based application edge computing communicates with various platforms, so to send data securely we implement the ESTATE algorithm, which provides both encryption and authentication, in Web Assembly. This also shows how closely Web Assembly approaches the performance of low-level languages. The ESTATE Web Assembly implementation is compared with the reference C/C++ code, with the Web Assembly performance measured in the Chrome, FireFox, and Microsoft Edge web browsers. TweAES-128, TweAES-128-6, and TweGIFT-128 implemented in Web Assembly show 11%, 5%, and 22% performance overhead in Chrome; 10%, 2%, and 54% in FireFox; and 9%, 6%, and 17% in Microsoft Edge. Web Assembly is therefore slower than C/C++, a low-level language, but it can be used efficiently because it runs without special modification on any device that can run web-based applications.
- 2
- ESTATE algorithm using OpenCL parallel processing. Data processed by the web-based application edge computing method are eventually stored on the main server. To use the ESTATE algorithm efficiently, it must therefore also be implemented for the server environment. We propose a method that uses OpenCL parallel processing to generate simultaneously the ciphertexts and tags to be sent to multiple platforms. Through OpenCL parallel processing, the byte values of the 16-byte input used in one encryption are processed simultaneously instead of sequentially. Conditional statements incur a performance penalty in OpenCL, and the ESTATE algorithm uses a conditional statement to XOR the expanded tweak value in specific rounds. Loop unrolling was therefore used to remove the conditional statements and the associated penalty. In addition, data are stored and encrypted in local memory, which is fast, for efficient operation. For the performance comparison, we compare the OpenCL parallel implementation with the sequential reference C/C++ implementation. The OpenCL implementation shows about 6.69 times, 7.31 times, and 1.47 times better performance for ESTATE TweAES-128, TweAES-128-6, and TweGIFT-128, respectively, than the reference C/C++ implementation.
- 3
- Method for efficient and safe operation of ESTATE algorithm. Additional methods are applied to operate the ESTATE algorithm itself safely and efficiently. The ESTATE algorithm uses conditional statements to check the type of input being encrypted, to check whether a block is the last block, to check the tweak value, and to compute the expanded tweak value in specific rounds. The 8-bit and 32-bit expanded tweak values used in TweAES-128, TweAES-128-6, and TweGIFT-128 are pre-computed and stored in advance, which reduces the performance load by removing unnecessary conditional statements. In addition, TweAES-128 and TweAES-128-6 have the same operation process as the AES algorithm, so they may be vulnerable to cache-timing attacks. We therefore apply the previously studied T-table shuffling method to operate them safely, and the proposed optimizations reduce the added cost when T-table shuffling is applied. With the T-table shuffling method applied, TweAES-128 and TweAES-128-6 show about 7% and 51% performance overhead, respectively, compared with the versions without it.
- 4
- Future work. Web-based applications built on Web Assembly can be used on various devices without additional modification, which reduces the system load on the server and helps in responding to failures. Web Assembly is still under active development, and web technology is advancing along with devices such as PCs, smartphones, and smart devices. Technologies that exploit high-end hardware, such as Web Assembly SIMD and WebGPU, are currently being developed, as is support for using Web Assembly and WebGPU together. Once these technologies become stable, many web developers are expected to build web services with them. They can then be applied in various ways to cryptographic security, and further studies in this field are likely to build on them. The web-based application edge computing method can be developed further in the same way, with improved performance. There are currently various NIST LWC (National Institute of Standards and Technology Lightweight Cryptography) Round 2 candidate algorithms, and the OpenCL parallel processing method we used is applicable to other candidates as well. Even if an LWC algorithm other than ESTATE is used to send data to multiple devices, the service can be provided more efficiently by processing multiple ciphertexts and tags simultaneously with OpenCL.
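
The idea of generating many ciphertexts and tags at once, summarized in the second conclusion, relies on the jobs being independent of one another. The sketch below is a CPU-side analogy only: `seal` is a hypothetical stand-in for one ESTATE call (built from BLAKE2b and SHAKE-128, not from the real cipher), and a thread pool plays the role that OpenCL work items play in the paper. It illustrates the independence of the jobs and makes no claim about speed.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def seal(job):
    """Stand-in for one ESTATE call: returns (ciphertext, tag) for one job."""
    key, nonce, ad, msg = job
    tag = hashlib.blake2b(nonce + ad + msg, key=key, digest_size=16).digest()
    stream = hashlib.shake_128(key + tag).digest(len(msg))
    ciphertext = bytes(m ^ s for m, s in zip(msg, stream))
    return ciphertext, tag


def seal_batch(jobs, workers=4):
    """Process independent jobs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(seal, jobs))


def seal_sequential(jobs):
    """Reference path: one job after another."""
    return [seal(job) for job in jobs]
```

Since no job reads another job's state, the batch and sequential paths return identical results, which is the property that lets a server assign each outgoing message to its own work item.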
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Hayes, B. Cloud Computing; Communications of the ACM: New York, NY, USA, July 2008.
2. Yu, W.; Liang, F.; He, X.; Hatcher, W.G.; Lu, C.; Lin, J.; Yang, X. A Survey on the Edge Computing for the Internet of Things. IEEE Access 2018, 6, 6900–6919.
3. Ai, Y.; Peng, M.; Zhang, K. Edge computing technologies for Internet of Things: A primer. Digit. Commun. Netw. 2018, 4, 77–86.
4. Wang, X.; Han, Y.; Leung, V.C.M.; Niyato, D.; Yan, X.; Chen, X. Convergence of Edge Computing and Deep Learning: A Comprehensive Survey. IEEE Commun. Surv. Tutor. 2020, 22, 869–904.
5. Tilkov, S.; Vinoski, S. Node.js: Using JavaScript to build high-performance network programs. IEEE Internet Comput. 2010, 14, 80–83.
6. Park, B.; Song, J.; Seo, S.C. Efficient Implementation of a Crypto Library Using Web Assembly. Electronics 2020, 9, 1839.
7. Rossberg, A.; Titzer, B.L.; Haas, A.; Schuff, D.L.; Gohman, D.; Wagner, L.; Zakai, A.; Bastien, J.F.; Holman, M. Bringing the web up to speed with WebAssembly. Commun. ACM 2018, 61, 107–115.
8. Rossberg, A. WebAssembly Specification Release 1.1. 2020. Available online: https://webassembly.github.io/spec/core/ (accessed on 25 February 2021).
9. Ritchie, D.M. The development of the C language. ACM Sigplan Not. 1993, 28, 201–208.
10. Smith, E. The C++ Language. In Introduction to the Tools of Scientific Computing; Springer: Berlin, Germany, 2020; pp. 133–148.
11. Doglio, F. An Introduction to TypeScript. In Introducing Deno; Springer: Berlin, Germany, 2020; pp. 27–62.
12. Bhattacharjee, J. Basics of Rust. In Practical Machine Learning with Rust; Springer: Berlin, Germany, 2020; pp. 1–30.
13. Sjölander, E. Krypteringsalgoritmer i OpenCL: AES-256 och ECC ElGamal. 2012. Available online: https://www.diva-portal.org/smash/get/diva2:555565/FULLTEXT01.pdf (accessed on 25 February 2021).
14. Munshi, A.; Gaster, B.; Mattson, T.G.; Ginsburg, D. OpenCL Programming Guide; Pearson Education: London, UK, 2011.
15. Munshi, A. The OpenCL specification. In Proceedings of the 2009 IEEE Hot Chips 21 Symposium (HCS), Stanford, CA, USA, 26 May 2009; pp. 1–314.
16. Chakraborti, A.; Datta, N.; Jha, A.; Mancillas-López, C.; Nandi, M.; Sasaki, Y. ESTATE: A Lightweight and Low Energy Authenticated Encryption Mode. IACR Trans. Symmetric Cryptol. 2020, 2020, 350–389.
17. Black, J.; Rogaway, P. CBC MACs for Arbitrary-Length Messages: The Three-Key Constructions; Springer: Berlin, Germany, 2000; pp. 197–215.
18. Dworkin, M. Recommendation for Block Cipher Modes of Operation. Methods and Techniques; Technical Report; National Inst of Standards and Technology: Gaithersburg, MD, USA, 2001.
19. Yao, J.; Zimmer, V. Cryptography. In Building Secure Firmware; Springer: Berlin, Germany, 2020; pp. 767–823.
20. Heron, S. Advanced encryption standard (AES). Netw. Secur. 2009, 2009, 8–12.
21. Banik, S.; Pandey, S.K.; Peyrin, T.; Sasaki, Y.; Sim, S.M.; Todo, Y. GIFT: A Small Present—Towards Reaching the Limit of Lightweight Encryption. In Proceedings of the Cryptographic Hardware and Embedded Systems—CHES 2017—19th International Conference, Taipei, Taiwan, 25–28 September 2017; pp. 321–345.
22. Sanida, T.; Sideris, A.; Dasygenis, M. Accelerating the AES Algorithm using OpenCL. In Proceedings of the 2020 9th International Conference on Modern Circuits and Systems Technologies (MOCAST), Bremen, Germany, 18 September 2020; pp. 1–4.
23. Yuan, Y.; He, Z.; Gong, Z.; Qiu, W. Acceleration of AES encryption with OpenCL. In Proceedings of the 2014 Ninth Asia Joint Conference on Information Security, Wuhan, China, 29 January 2015; pp. 64–70.
24. Wang, X.; Li, X.; Zou, M.; Zhou, J. AES finalists implementation for GPU and multi-core CPU based on OpenCL. In Proceedings of the 2011 IEEE International Conference on Anti-Counterfeiting, Security and Identification, Xiamen, China, 29 July 2011; pp. 38–42.
25. Conti, V.; Vitabile, S. Design exploration of AES accelerators on FPGAs and GPUs. J. Telecommun. Inf. Technol. 2017, 1, 28–38.
26. Di Biagio, A.; Barenghi, A.; Agosta, G.; Pelosi, G. Design of a parallel AES for graphics hardware using the CUDA framework. In Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, Rome, Italy, 10 July 2009; pp. 1–8.
27. D Amato, J.P.; Vénere, M.J. Encrypting video and image streams using OpenCL code on-demand. CLEI Electron. J. 2014, 17, 6.
28. Matsui, M. Linear cryptanalysis method for DES cipher. In Workshop on the Theory and Application of Cryptographic Techniques; Springer: Berlin, Germany, 1993; pp. 386–397.
29. Schneier, B. Description of a New Variable-Length Key, 64-Bit Block Cipher (Blowfish); Springer: Berlin, Germany, 1993; pp. 191–204.
30. Rivest, R.L.; Shamir, A.; Adleman, L. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 1978, 21, 120–126.
31. Inampudi, G.R.; Shyamala, K.; Ramachandram, S. Parallel implementation of cryptographic algorithm: AES using OpenCL on GPUs. In Proceedings of the 2018 2nd International Conference on Inventive Systems and Control (ICISC), Coimbatore, India, 28 June 2018; pp. 984–988.
32. Standaert, F.X. Introduction to side-channel attacks. In Secure Integrated Circuits and Systems; Springer: Berlin, Germany, 2010; pp. 27–42.
33. Roh, D.; Koo, B.; Jung, Y.; Jeong, I.; Lee, D.; Kwon, D.; Kim, W. Revised Version of Block Cipher CHAM. In Proceedings of the Information Security and Cryptology—ICISC 2019—22nd International Conference, Seoul, Korea, 4–6 December 2019; pp. 1–19.
34. King, B. wNAF*, an Efficient Left-To-Right Signed Digit Recoding Algorithm; Springer: Berlin, Germany, 2008; pp. 429–445.
35. Chevallier-Mames, B.; Ciet, M.; Joye, M. Low-cost solutions for preventing simple side-channel analysis: Side-channel atomicity. IEEE Trans. Comput. 2004, 53, 760–768.
36. Fips Pub. Secure Hash Standard (SHS); Fips Pub: Edmonds, WA, USA, March 2012; Volume 180.
37. Turner, J.M. The keyed-hash message authentication code (HMAC). Fed. Inf. Process. Stand. Publ. 2008, 198, 1.
38. Rösch, J. Efficient Implementation of Picnic. 23 September 2020. Available online: https://is.muni.cz/th/pbn05/Efficient_implementation_of_Picnic.pdf (accessed on 25 February 2021).
39. Chase, M.; Derler, D.; Goldfeder, S.; Orlandi, C.; Ramacher, S.; Rechberger, C.; Slamanig, D.; Zaverucha, G. The Picnic Signature Scheme Design Document (Version 1.0); NIST Post-Quantum Cryptogr. Stand. Round; NIST: Gaithersburg, MD, USA, 2017; Volume 3.
40. Daehyeon Bae, J.H.; Ha, J. Implementation of AES Resistant to Cache Side-Channel Attack Using T-Table Shuffling Method. Conf. Inf. Secur. Cryptogr. Winter 2020, 30, 579–583.
41. Young, E.A.; Hudson, T.J.; Engelschall, R.S. OpenSSL. 9 November 2001. Available online: http://www.openssl.org/ (accessed on 25 February 2021).
42. Bernstein, D.J. Cache-Timing Attacks on AES. 14 April 2005. Available online: https://cr.yp.to/antiforgery/cachetiming-20050414.pdf (accessed on 25 February 2021).
43. Fisher, R.A.; Yates, F. Statistical Tables: For Biological, Agricultural and Medical Research; Oliver and Boyd: Edinburgh, UK, 1938.
Notation | Denote | Notation | Denote |
---|---|---|---|
length(bit) of A | K | ||
n-bit block parsing of X | T | ||
i | M | ||
TweAES-128 or TweGIFT-128 | |||
TweAES-128-6 | |||
the bitwise XOR of A and B | i-bit left | ||
the concatenation of A and B | i-bit right | ||
n | 128-bit block size | k | 128-bit key size |
t | 128-bit tag size | 4-bit tweak size |
Device | AES | DES | BlowFish | RSA |
---|---|---|---|---|
AMD FX 6100 3.0 GHz (CPU 6 Cores) | 240 Mbps | 144 Mbps | 736 Mbps | 4 Mbps |
NVIDIA GTX 550 (GPU) | 1920 Mbps | 368 Mbps | 8192 Mbps | 20 Mbps |
Ratio of Performance Improvement | 8 times | 2.5 times | 11.13 times | 5 times |
Operating System | CPU | RAM | SW | Languages and API | Used Input Value | ESTATE Operation Count |
---|---|---|---|---|---|---|
Windows 10 Education | Intel i5-8250U 1.6 GHz | 8 GB | (1) Chrome (2) FireFox (3) Microsoft Edge | (1) C/C++ (2) Web Assembly (3) OpenCL | Nonce: 25,600-byte AD: 51,200-byte Message: 512,000-byte | 6400 |
Algorithm | OpenCL | Reference C/C++ | Performance Improvement |
---|---|---|---|
ESTATE TweAES-128 | 19,088,500 ns | 127,842,877 ns | 6.69 times |
ESTATE TweAES-128-6 | 15,966,333 ns | 116,813,270 ns | 7.31 times |
ESTATE TweGIFT-128 | 1,958,343,000 ns | 2,897,251,400 ns | 1.47 times |
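
The improvement factors in the table above follow from dividing the reference time by the OpenCL time. The short check below uses the tabulated timings, treats every entry as nanoseconds (including the TweGIFT-128 OpenCL cell), and truncates to two decimals, since truncation rather than rounding is what reproduces the reported figures. The helper name `speedup_truncated` is an assumption made for this example.

```python
# (OpenCL time, reference C/C++ time) in nanoseconds, copied from the table.
TIMINGS_NS = {
    "ESTATE TweAES-128": (19_088_500, 127_842_877),
    "ESTATE TweAES-128-6": (15_966_333, 116_813_270),
    "ESTATE TweGIFT-128": (1_958_343_000, 2_897_251_400),
}


def speedup_truncated(opencl_ns, reference_ns):
    """Reference time divided by OpenCL time, truncated to two decimals."""
    return (reference_ns * 100 // opencl_ns) / 100


SPEEDUPS = {name: speedup_truncated(*pair) for name, pair in TIMINGS_NS.items()}
```

Rounding to the nearest hundredth would instead give 6.70, 7.32, and 1.48, so the tabulated values are best read as truncated ratios.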
Algorithm | Applied T-Table Shuffling Method | Normal Method | Performance Overhead |
---|---|---|---|
ESTATE TweAES-128 | 20,589,394 ns | 19,088,500 ns | 7% |
ESTATE TweAES-128-6 | 24,192,899 ns | 15,966,333 ns | 51% |
Algorithm | Reference C/C++ Code | Web Assembly on Chrome (Performance Overhead) | Web Assembly on FireFox (Performance Overhead) | Web Assembly on Microsoft Edge (Performance Overhead) |
---|---|---|---|---|
ESTATE TweAES-128 | 127,842,877 ns | 142,775,000 ns (11%) | 141,000,000 ns (10%) | 140,374,999 ns (9%) |
ESTATE TweAES-128-6 | 116,813,270 ns | 123,155,001 ns (5%) | 120,000,000 ns (2%) | 124,045,001 ns (6%) |
ESTATE TweGIFT-128 | 2,897,251,400 ns | 3,560,440,001 ns (22%) | 4,490,000,000 ns (54%) | 3,401,205,000 ns (17%) |
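
The overhead percentages in the table above can be reproduced as the relative slowdown of Web Assembly against the reference C/C++ time. The sketch below copies the tabulated timings and truncates each percentage to a whole number, which matches the reported figures; the helper name `overhead_percent` is an assumption made for this example.

```python
# Reference C/C++ timings in nanoseconds, copied from the table.
REFERENCE_NS = {
    "TweAES-128": 127_842_877,
    "TweAES-128-6": 116_813_270,
    "TweGIFT-128": 2_897_251_400,
}

# Web Assembly timings in nanoseconds per browser, copied from the table.
WASM_NS = {
    "TweAES-128": {"Chrome": 142_775_000, "FireFox": 141_000_000,
                   "Microsoft Edge": 140_374_999},
    "TweAES-128-6": {"Chrome": 123_155_001, "FireFox": 120_000_000,
                     "Microsoft Edge": 124_045_001},
    "TweGIFT-128": {"Chrome": 3_560_440_001, "FireFox": 4_490_000_000,
                    "Microsoft Edge": 3_401_205_000},
}


def overhead_percent(measured_ns, reference_ns):
    """Relative slowdown in percent, truncated to a whole number."""
    return (measured_ns - reference_ns) * 100 // reference_ns


OVERHEAD = {
    cipher: {browser: overhead_percent(t, REFERENCE_NS[cipher])
             for browser, t in per_browser.items()}
    for cipher, per_browser in WASM_NS.items()
}
```

For example, `OVERHEAD["TweGIFT-128"]["FireFox"]` evaluates to 54, the largest gap in the table.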
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Park, B.; Seo, S.C. Efficient Implementation of NIST LWC ESTATE Algorithm Using OpenCL and Web Assembly for Secure Communication in Edge Computing Environment. Sensors 2021, 21, 1987. https://doi.org/10.3390/s21061987