A Low-Cost High-Performance Montgomery Modular Multiplier Based on Pipeline Interleaving for IoT Devices
Abstract
:1. Introduction
1.1. Research Background
1.2. Previous Work
1.3. Paper Contributions
2. Preliminaries
2.1. Radix-2 Montgomery Modular Multiplication Algorithm
Algorithm 1 Classic radix-2 Montgomery multiplication [15]. |
|
2.2. Multiple-Word Radix- Montgomery Modular Multiplication Algorithm
Algorithm 2 Classic multiple-word radix- Montgomery multiplication [16] |
|
2.3. Pipeline of the Classic Algorithm
3. Proposed Interleaved Pipeline Design
3.1. Proposed IP-MWRMM Algorithm
Algorithm 3 Proposed interleaved pipeline . |
|
3.2. Parallel Computation of the IP-MWRMM Algorithm
3.3. Proposed Hardware Architecture
3.3.1. Processing Elements
3.3.2. Overall Architecture
4. Implementation Results and Comparison
4.1. Performance and Area Analysis
4.1.1. Timing Analysis
4.1.2. Area Analysis
4.2. Results Comparison and Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
IoT | Internet of Things |
PCS | public-key cryptography system |
TLS | transport layer security |
RSA | Rivest–-Shamir–-Adleman |
ECC | elliptic curve cryptography |
MMM | Montgomery modular multiplication |
LUT | lookup table |
FFT | fast Fourier transform |
RNS | residue number system |
NLP | non-least positive |
- | interleaved pipeline multiple-word radix- Montgomery multiplication |
PE | processing element |
CPD | critical path delay |
radix-2 Montgomery multiplication | |
multiple-word radix- Montgomery multiplication | |
MSB | most significant bit |
multiple-word radix-2 Montgomery multiplication | |
CHRP | classic high-radix pipeline |
PS | pipeline stage |
CC | clock cycle |
FSM | finite state machine |
FA | full adder |
CSA | carry–save adder |
IP | interleaved pipeline |
SEC | SLICE equivalent cost |
ATP | area–time–product |
References
- Madakam, S.; Lake, V.; Lake, V.; Lake, V. Internet of Things (IoT): A literature review. J. Comput. Commun. 2015, 3, 56616. [Google Scholar] [CrossRef] [Green Version]
- Lee, I.; Lee, K. The Internet of Things (IoT): Applications, investments, and challenges for enterprises. Bus. Horizons 2015, 58, 431–440. [Google Scholar] [CrossRef]
- Hassan, W.H. Current research on Internet of Things (IoT) security: A survey. Comput. Netw. 2019, 148, 283–294. [Google Scholar]
- Koblitz, N.; Menezes, A.J. A survey of public-key cryptosystems. SIAM Rev. 2004, 46, 599–634. [Google Scholar] [CrossRef] [Green Version]
- Rescorla, E. SSL and TLS: Designing and Building Secure Systems; Addison-Wesley Reading: Boston, MA, USA, 2001; Volume 1. [Google Scholar]
- Huh, S.; Cho, S.; Kim, S. Managing IoT devices using blockchain platform. In Proceedings of the 2017 19th International Conference on Advanced Communication Technology (ICACT), Pyeongchang, Republic of Korea, 19–22 February 2017; pp. 464–467. [Google Scholar] [CrossRef]
- Imteaj, A.; Thakker, U.; Wang, S.; Li, J.; Amini, M.H. A Survey on Federated Learning for Resource-Constrained IoT Devices. IEEE Internet Things J. 2022, 9, 1–24. [Google Scholar] [CrossRef]
- Thakor, V.A.; Razzaque, M.A.; Khandaker, M.R.A. Lightweight Cryptography Algorithms for Resource-Constrained IoT Devices: A Review, Comparison and Research Opportunities. IEEE Access 2021, 9, 28177–28193. [Google Scholar] [CrossRef]
- Wu, R.; Xu, M.; Yang, Y.; Tian, G.; Yu, P.; Zhao, Y.; Lian, B.; Ma, L. Efficient High-Radix GF (p) Montgomery Modular Multiplication Via Deep Use Of Multipliers. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 5099–5103. [Google Scholar] [CrossRef]
- Pan, J.S.; Song, P.; Yang, C.S. Efficient digit-serial modular multiplication algorithm on FPGA. IET Circuits Devices Syst. 2018, 12, 662–668. [Google Scholar] [CrossRef]
- Ding, J.; Li, S. A low-latency and low-cost Montgomery modular multiplier based on NLP multiplication. IEEE Trans. Circuits Syst. II Express Briefs 2019, 67, 1319–1323. [Google Scholar] [CrossRef]
- Rivest, R.L.; Shamir, A.; Adleman, L. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 1978, 21, 120–126. [Google Scholar] [CrossRef] [Green Version]
- Miller, V.S. Use of Elliptic Curves in Cryptography; Springer: Berlin/Heidelberg, Germany, 1986. [Google Scholar]
- Koblitz, N. Elliptic curve cryptosystems. Math. Comput. 1987, 48, 203–209. [Google Scholar] [CrossRef]
- Montgomery, P.L. Modular multiplication without trial division. Math. Comput. 1985, 44, 519–521. [Google Scholar] [CrossRef]
- Tenca, A.F.; Todorov, G.; Koç, C.K. High-radix design of a scalable modular multiplier. In Proceedings of the Cryptographic Hardware and Embedded Systems—CHES 2001: Third International Workshop, Paris, France, 14–16 May 2001; Proceedings 3. pp. 185–201. [Google Scholar]
- Zhang, B.; Cheng, Z.; Pedram, M. High-radix design of a scalable montgomery modular multiplier with low latency. IEEE Trans. Comput. 2021, 71, 436–449. [Google Scholar] [CrossRef]
- Erdem, S.S.; Yanık, T.; Çelebi, A. A general digit-serial architecture for Montgomery modular multiplication. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2017, 25, 1658–1668. [Google Scholar] [CrossRef]
- Zhang, B.; Cheng, Z.; Pedram, M. An Iterative Montgomery Modular Multiplication Algorithm With Low Area-Time Product. IEEE Trans. Comput. 2022, 72, 236–249. [Google Scholar] [CrossRef]
- Fatemi, S.; Zare, M.; Khavari, A.F.; Maymandi-Nejad, M. Efficient implementation of digit-serial Montgomery modular multiplier architecture. IET Circuits Devices Syst. 2019, 13, 942–949. [Google Scholar] [CrossRef]
- Zhang, Z.; Zhang, P. A Scalable Montgomery Modular Multiplication Architecture with Low Area-Time Product Based on Redundant Binary Representation. Electronics 2022, 11, 3712. [Google Scholar] [CrossRef]
- Kolagatla, V.R.; Desalphine, V.; Selvakumar, D. Area-time scalable high radix Montgomery modular multiplier for large modulus. In Proceedings of the 2021 25th International Symposium on VLSI Design and Test (VDAT), Surat, India, 16–18 September 2021; pp. 1–4. [Google Scholar]
- Abd-Elkader, A.A.; Rashdan, M.; Hasaneen, E.S.A.; Hamed, H.F. Efficient implementation of Montgomery modular multiplier on FPGA. Comput. Electr. Eng. 2022, 97, 107585. [Google Scholar] [CrossRef]
- Gu, Z.; Li, S. A division-free Toom–Cook multiplication-based Montgomery modular multiplication. IEEE Trans. Circuits Syst. II Express Briefs 2018, 66, 1401–1405. [Google Scholar] [CrossRef]
- Dai, W.; Chen, D.D.; Cheung, R.C.; Koc, C.K. Area-time efficient architecture of FFT-based montgomery multiplication. IEEE Trans. Comput. 2016, 66, 375–388. [Google Scholar] [CrossRef]
- Mo, Y.; Li, S. Design of an 8192-bit RNS montgomery multiplier. In Proceedings of the 2017 International Conference on Electron Devices and Solid-State Circuits (EDSSC), Hsinchu, Taiwan, 18–20 October 2017; pp. 1–2. [Google Scholar]
- Ahsan, J.; Esmaeildoust, M.; Kaabi, A.; Zarei, V. Efficient FPGA implementation of RNS Montgomery multiplication using balanced RNS bases. Integration 2022, 84, 72–83. [Google Scholar] [CrossRef]
- Tenca, A.F.; Koç, C.K. A scalable architecture for Montgomery multiplication. In Proceedings of the CHES, Worcester, MA, USA, 12–13 August 1999; Volume 99, pp. 94–108. [Google Scholar]
- Ibrahim, A.; Gebali, F.; Elsimary, H. New and improved word-based unified and scalable architecture for radix 2 Montgomery modular multiplication algorithm. In Proceedings of the 2013 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM), Victoria, BC, Canada, 27–29 August 2013; pp. 153–158. [Google Scholar]
- Shieh, M.D.; Lin, W.C. Word-based Montgomery modular multiplication algorithm for low-latency scalable architectures. IEEE Trans. Comput. 2010, 59, 1145–1151. [Google Scholar] [CrossRef]
- Huang, M.; Gaj, K.; El-Ghazawi, T. New hardware architectures for Montgomery modular multiplication algorithm. IEEE Trans. Comput. 2010, 60, 923–936. [Google Scholar] [CrossRef] [Green Version]
- Orup, H. Simplifying quotient determination in high-radix modular multiplication. In Proceedings of the 12th Symposium on Computer Arithmetic, Bath, UK, 19–21 July 1995; pp. 193–199. [Google Scholar]
- Farzam, M.H.; Bayat-Sarmadi, S.; Mosanaei-Boorani, H.; Alivand, A. Hardware architecture for supersingular isogeny Diffie-Hellman and key encapsulation using a fast montgomery multiplier. IEEE Trans. Circuits Syst. I Regul. Pap. 2021, 68, 2042–2050. [Google Scholar] [CrossRef]
Works | Platform | Time | Area | ATP (Area × ms) | |||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Cycles | Period (ns) | Latency (s) | LUT | FF | DSP | BRAM36 | SLICE | Total ** | |||
256-bit modulus | |||||||||||
[27] | Virtex-7 | - | - | 0.120 | 9210 | - | 248 | 0 | 2631 * | 27,431 | 3.291 |
[9] | Virtex-7 | - | 3.448 | 0.214 | 5500 | - | 0 | 0 | 1571 * | 1571 | 0.336 |
ours | Virtex-7 | 69 | 3.97 | 0.274 | 812 | 909 | 8 | 0 | 347 | 1147 | 0.314 |
512-bit modulus | |||||||||||
[9] | Virtex-7 | - | 3.448 | 0.448 | 9500 | - | 0 | 0 | 2714 * | 2714 | 1.215 |
ours | Virtex-7 | 133 | 4.07 | 0.541 | 1367 | 1951 | 13 | 0 | 576 | 1876 | 1.014 |
1024-bit modulus | |||||||||||
[31] | Virtex-2 | 1088 | 9.39 | 10.22 | 5356 | - | 0 | 0 | 1530* | 1530 | 15.636 |
[30] | Virtex-6 | 1287 | 3.92 | 5.05 | - | - | 0 | 0 | 5158 | 5158 | 26.047 |
[25] | Virtex-6 | 1052 | 3.80 | 4.00 | 6047 | - | 9 | 16.5 | 1757 | 5891 | 23.564 |
[18] | Virtex-7 | 530 | 2.23 | 1.18 | 9304 | 7492 | 0 | 0 | 2504 | 2504 | 2.954 |
[17] | Virtex-7 | 290 | 3.90 | 1.13 | 19,124 | 4638 | 0 | 0 | 5464 * | 5464 | 6.174 |
[19] | Virtex-7 | 264 | 3.00 | 0.79 | 17,661 | 3120 | 0 | 0 | 5046 * | 5046 | 3.986 |
[10] | Virtex-7 | 262 | 3.96 | 1.04 | 16,832 | 5165 | 0 | 0 | 4809 * | 4809 | 5.001 |
[22] | Virtex-7 | 257 | 17.54 | 4.50 | 16,531 | 3098 | 0 | 0 | 4723* | 4723 | 21.253 |
ours | Virtex-7 | 261 | 3.95 | 1.03 | 2845 | 3165 | 24 | 0 | 1100 | 3500 | 3.605 |
2048-bit modulus | |||||||||||
[31] | Virtex-2 | 2176 | 9.90 | 21.553 | 10,698 | - | 0 | 0 | 3056 * | 3056 | 65.865 |
[25] | Virtex-6 | 2036 | 3.88 | 7.90 | 7337 | - | 9 | 17.5 | 2083 | 6413 | 50.662 |
[18] | Virtex-7 | 512 | 4.57 | 2.39 | 36,238 | 15,066 | 0 | 0 | 10,104 | 10,104 | 24.148 |
[17] | Virtex-7 | 562 | 3.90 | 2.19 | 39,744 | 8942 | 0 | 0 | 11,355 * | 11,355 | 24.867 |
[19] | Virtex-7 | 520 | 3.40 | 1.77 | 32,170 | 6177 | 0 | 0 | 9191 * | 9191 | 16.268 |
[10] | Virtex-7 | 518 | 4.81 | 2.49 | 33,734 | 10,315 | 0 | 0 | 9638 * | 9638 | 23.999 |
ours | Virtex-7 | 517 | 4.13 | 2.13 | 5328 | 7551 | 45 | 0 | 2286 | 6786 | 14.454 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, H.; Ren, S.; Wang, W.; Zhang, J.; Wang, X. A Low-Cost High-Performance Montgomery Modular Multiplier Based on Pipeline Interleaving for IoT Devices. Electronics 2023, 12, 3241. https://doi.org/10.3390/electronics12153241
Li H, Ren S, Wang W, Zhang J, Wang X. A Low-Cost High-Performance Montgomery Modular Multiplier Based on Pipeline Interleaving for IoT Devices. Electronics. 2023; 12(15):3241. https://doi.org/10.3390/electronics12153241
Chicago/Turabian StyleLi, Hongshuo, Shiwei Ren, Weijiang Wang, Jingqi Zhang, and Xiaohua Wang. 2023. "A Low-Cost High-Performance Montgomery Modular Multiplier Based on Pipeline Interleaving for IoT Devices" Electronics 12, no. 15: 3241. https://doi.org/10.3390/electronics12153241
APA StyleLi, H., Ren, S., Wang, W., Zhang, J., & Wang, X. (2023). A Low-Cost High-Performance Montgomery Modular Multiplier Based on Pipeline Interleaving for IoT Devices. Electronics, 12(15), 3241. https://doi.org/10.3390/electronics12153241