An Area-Efficient Unified VLSI Architecture for Type IV DCT/DST Having an Efficient Hardware Security with Low Overheads
Abstract
1. Introduction
Related Works
- A new VLSI algorithm for type IV DST that uses only six quasi-cycle convolutions, special computation structures that allow an efficient VLSI implementation, compared with the eight such structures used in [28].
- The new VLSI algorithm for type IV DST yields a significantly lower hardware complexity than existing ones because it uses multiplications in which one operand is a constant, rather than general multipliers, whose hardware complexity is significantly higher.
- A new unified VLSI algorithm for type IV DCT and DST that leads to an efficient unified VLSI architecture in which most of the chip area is shared by the two transforms.
- The resulting unified VLSI architecture allows hardware security to be included with very low overheads.
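As background for the unification claim in the list above, the following minimal sketch uses the textbook, unnormalized definitions of the type IV DCT and DST and the standard identity that a DST-IV equals a DCT-IV of the reversed input with alternating output signs. It is not the quasi-cycle-convolution algorithm proposed in this paper; the function names `dct4`, `dst4`, and `dst4_via_dct4` are hypothetical and chosen only for illustration.

```python
import math


def dct4(x):
    """Direct O(N^2) DCT-IV: X[k] = sum_n x[n] * cos(pi*(2n+1)*(2k+1)/(4N))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * (2 * k + 1) / (4 * n_pts))
            for n in range(n_pts))
        for k in range(n_pts)
    ]


def dst4(x):
    """Direct O(N^2) DST-IV: Y[k] = sum_n x[n] * sin(pi*(2n+1)*(2k+1)/(4N))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.sin(math.pi * (2 * n + 1) * (2 * k + 1) / (4 * n_pts))
            for n in range(n_pts))
        for k in range(n_pts)
    ]


def dst4_via_dct4(x):
    """DST-IV computed with the DCT-IV kernel only.

    Standard identity (not specific to this paper):
    Y[k] = (-1)^k * DCT-IV(reversed x)[k].
    """
    y = dct4(list(reversed(x)))
    return [(-1) ** k * y[k] for k in range(len(y))]


# Example input of prime length 13 (the length is an arbitrary choice here).
sample = [0.5, -1.0, 2.0, 3.5, -0.25, 1.0, 4.0, -2.0, 0.75, 1.5, -3.0, 2.5, 0.1]
max_diff = max(abs(a - b) for a, b in zip(dst4(sample), dst4_via_dct4(sample)))
print(f"max |DST-IV - DST-IV via DCT-IV| = {max_diff:.2e}")
```

Because the two transforms differ only by an input permutation and output sign changes, a single computation core can in principle serve both, which is the general motivation behind unified architectures of this kind.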
2. Methods
2.1. A New VLSI Algorithm for DST IV
2.2. A Unified VLSI Algorithm for DCT/DST IV
3. The Proposed Unified VLSI Architecture for DCT/DST IV
3.1. Designing the VLSI Architecture
3.2. The Obfuscation Technique Used in the Proposed Design
4. Results
- the input sequences are replaced from where as in [29], to , where , as in Equation (9);
5. Discussion
5.1. Discussion about the Main Features of the Proposed Solution
5.2. Comparison with Similar Solutions
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A
Proof of the Equation (3)
References
- Jain, A.K. A Sinusoidal Family of Unitary Transforms. IEEE Trans. Pattern Anal. Mach. Intell. 1979, PAMI-1, 356–365.
- Malvar, H.S. Lapped Transforms for Efficient Transform/Subband Coding. IEEE Trans. Acoust. Speech Signal Process. 1990, 38, 969–978.
- Malvar, H. A Modulated Complex Lapped Transform and Its Applications to Audio Processing. In Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, Proceedings. ICASSP99 (Cat. No.99CH36258), Phoenix, AZ, USA, 15–19 March 1999; Volume 3, pp. 1421–1424.
- Jing, C.; Tai, H.-M. Fast Algorithm for Computing Modulated Lapped Transform. Electron. Lett. 2001, 37, 796–797.
- Davidson, G.A.; Isnardi, M.A.; Fielder, L.D.; Goldman, M.S.; Todd, C.C. ATSC Video and Audio Coding. Proc. IEEE 2006, 94, 60–76.
- Chan, Y.-H.; Siu, W.-C. On the Realization of Discrete Cosine Transform Using the Distributed Arithmetic. IEEE Trans. Circuits Syst. Fundam. Theory Appl. 1992, 39, 705–712.
- Guo, J.-I.; Liu, C.-M.; Jen, C.-W. A New Array Architecture for Prime-Length Discrete Cosine Transform. IEEE Trans. Signal Process. 1993, 41, 436.
- Cheng, C.; Parhi, K.K. A Novel Systolic Array Structure for DCT. IEEE Trans. Circuits Syst. II Express Briefs 2005, 52, 366–369.
- Meher, P.K. Systolic Designs for DCT Using a Low-Complexity Concurrent Convolutional Formulation. IEEE Trans. Circuits Syst. Video Technol. 2006, 16, 1041–1050.
- Meher, P.K.; Swamy, M.N.S. New Systolic Algorithm and Array Architecture for Prime-Length Discrete Sine Transform. IEEE Trans. Circuits Syst. II Express Briefs 2007, 54, 262–266.
- Xie, J.; Meher, P.K.; He, J. Hardware-Efficient Realization of Prime-Length DCT Based on Distributed Arithmetic. IEEE Trans. Comput. 2013, 62, 1170–1178.
- Kung, H.T. Why Systolic Architectures? Computer 1982, 15, 37–46.
- White, S.A. Applications of Distributed Arithmetic to Digital Signal Processing: A Tutorial Review. IEEE ASSP Mag. 1989, 6, 4–19.
- Pilato, C.; Garg, S.; Wu, K.; Karri, R.; Regazzoni, F. Securing Hardware Accelerators: A New Challenge for High-Level Synthesis. IEEE Embed. Syst. Lett. 2018, 10, 77–80.
- Knechtel, J.; Patnaik, S.; Sinanoglu, O. Protect Your Chip Design Intellectual Property: An Overview. In Proceedings of the International Conference on Omni-Layer Intelligent Systems, Crete, Greece, 5–7 May 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 211–216.
- Zhang, J. A Practical Logic Obfuscation Technique for Hardware Security. IEEE Trans. Very Large Scale Integr. VLSI Syst. 2016, 24, 1193–1197.
- Koteshwara, S.; Kim, C.H.; Parhi, K.K. Hierarchical Functional Obfuscation of Integrated Circuits Using a Mode-Based Approach. In Proceedings of the 2017 IEEE International Symposium on Circuits and Systems (ISCAS), Baltimore, MD, USA, 28–31 May 2017; pp. 1–4.
- Koteshwara, S.; Kim, C.H.; Parhi, K.K. Key-Based Dynamic Functional Obfuscation of Integrated Circuits Using Sequentially Triggered Mode-Based Design. IEEE Trans. Inf. Forensics Secur. 2018, 13, 79–93.
- Parhi, K.K.; Koteshwara, S. Dynamic Functional Obfuscation. U.S. Patent US11061997B2, 3 August 2017.
- Murthy, N.R.; Swamy, M.N.S. On the On-Line Computation of DCT-IV and DST-IV Transforms. IEEE Trans. Signal Process. 1995, 43, 1249–1251.
- Kidambi, S.S. Recursive Implementation of the DCT-IV and DST-IV. In Proceedings of the 1998 IEEE Symposium on Advances in Digital Filtering and Signal Processing, Symposium Proceedings (Cat. No.98EX185), Victoria, BC, Canada, 5–6 June 1998; pp. 106–110.
- Chiper, D.F.; Ahmad, M.O.; Swamy, M.N.S. A Unified VLSI Algorithm for a High Performance Systolic Array Implementation of Type IV DCT/DST. In Proceedings of the International Symposium on Signals, Circuits and Systems ISSCS2013, Iasi, Romania, 11–12 July 2013; pp. 1–4.
- Lai, S.-C.; Chien, W.-C.; Lan, C.-S.; Lee, M.-K.; Luo, C.-H.; Lei, S.-F. An Efficient DCT-IV-Based ECG Compression Algorithm and Its Hardware Accelerator Design. In Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS), Beijing, China, 19–23 May 2013; pp. 1296–1299.
- Chiper, D.F. A New VLSI Algorithm for a High-Throughput Implementation of Type IV DCT. In Proceedings of the 2016 International Conference on Communications (COMM), Bremen, Germany, 13–15 June 2016; pp. 17–20.
- Perera, S.M. Signal Flow Graph Approach to Efficient and Forward Stable DST Algorithms. Linear Algebra Its Appl. 2018, 542, 360–390.
- Chiper, D.F.; Cotorobai, L.T. A New VLSI Algorithm for an Efficient VLSI Implementation of Type IV DST Based on Short Band-Correlation Structures. In Proceedings of the 2020 13th International Conference on Communications (COMM), Bucharest, Romania, 18–20 June 2020; pp. 69–72.
- Perera, S.M.; Liu, J. Complexity Reduction, Self/Completely Recursive, Radix-2 DCT I/IV Algorithms. J. Comput. Appl. Math. 2020, 379, 112936.
- Chiper, D.F.; Cotorobai, L.-T. A New Approach for a Unified Architecture for Type IV DCT/DST with an Efficient Incorporation of Obfuscation Technique. Electronics 2021, 10, 1656.
- Chiper, D.F. An Improved VLSI Algorithm for an Efficient VLSI Implementation of a Type IV DCT That Allows an Efficient Incorporation of Hardware Security with a Low Overhead. Electronics 2023, 12, 243.
- FreePDK15|NC State EDA. Available online: https://eda.ncsu.edu/freepdk15/ (accessed on 3 October 2023).
- Open-Cell Library. Available online: https://si2.org/open-cell-library/ (accessed on 3 October 2023).
- Luo, C.-H.; Ma, W.-J.; Juang, W.-H.; Kuo, S.-H.; Chen, C.-Y.; Tai, P.-C.; Lai, S.-C. An ECG Acquisition System Prototype Design With Flexible PDMS Dry Electrodes and Variable Transform Length DCT-IV Based Compression Algorithm. IEEE Sens. J. 2016, 16, 8244–8254.
- Perera, S.M.; Lingsch, L.E. Sparse Matrix Based Low-Complexity, Recursive, and Radix-2 Algorithms for Discrete Sine Transforms. IEEE Access 2021, 9, 141181–141198.
| Representation | Approximate Fixed-Point Value | | Number of Adders/Subtractors |
|---|---|---|---|
| | 0.54296875000 | −10.3 | 3 |
| | 2.46484375000 | −10.3 | 3 |
| | 2.57617187500 | −10.4 | 4 |
| | 1.87011718750 | −13.5 | 3 |
| | 1.98535156250 | −13.9 | 2 |
| | 0.92968750000 | −12.0 | 2 |
| | 0.65917968750 | −17.0 | 4 |
| | 0.45117187500 | −11.5 | 3 |
| | 3.51611328125 | −13.1 | 3 |
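To illustrate how a multiplication by a fixed constant can be reduced to shifts and additions/subtractions, the sketch below recodes a fixed-point constant into canonical signed-digit (non-adjacent) form and counts the adders/subtractors as the number of nonzero digits minus one. That the constants above were encoded this way is an assumption, since their representation is not shown here; the helper names `csd_digits` and `adder_count` and the choice of 11 fractional bits are hypothetical. Under this assumption, the computed counts agree with the adder/subtractor counts listed above.

```python
def csd_digits(value, frac_bits=11):
    """Return signed digits in {-1, 0, +1}, least significant first.

    Digit i has weight 2**(i - frac_bits). Only positive constants that are
    exactly representable with `frac_bits` fractional bits are accepted.
    """
    scaled = value * (1 << frac_bits)
    n = int(round(scaled))
    if n <= 0 or n != scaled:
        raise ValueError("constant must be positive and exactly representable")
    digits = []
    while n > 0:
        if n & 1:
            d = 2 - (n % 4)   # +1 if n % 4 == 1, -1 if n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def adder_count(value, frac_bits=11):
    """Adders/subtractors needed to sum the shifted copies of the input."""
    nonzero = sum(1 for d in csd_digits(value, frac_bits) if d != 0)
    return max(nonzero - 1, 0)


# Fixed-point values taken from the table above.
for constant in (0.54296875, 2.46484375, 2.576171875, 1.8701171875,
                 1.9853515625, 0.9296875, 0.6591796875, 0.451171875,
                 3.51611328125):
    print(f"{constant:<14} -> {adder_count(constant)} adders/subtractors")
```

For example, 0.54296875 = 139/256 recodes to 128 + 16 − 4 − 1 (scaled by 1/256), which has four nonzero digits and therefore needs three adders/subtractors.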
Constrained Clock Period/Frequency | Critical Path Delay + Setup Time [ps] | Interconnect Area [µm²] | Combinational Area [µm²] | Flop Area [µm²] | Total Area [µm²] | Equivalent Gate Count | Static Power [µW] | Dynamic Power at Constrained Clock Frequency [mW] |
---|---|---|---|---|---|---|---|---|
50 ns/20 MHz | 218 | 277.8 | 443.1 | 505.4 | 1226.2 | 60,831 | 34.9 | 0.04 at 20 MHz |
10 ns/100 MHz | 218 | 277.8 | 443.1 | 505.4 | 1226.2 | 60,831 | 34.9 | 0.2 at 100 MHz |
1 ns/1 GHz | 218 | 280.1 | 445.2 | 505.4 | 1230.7 | 61,053 | 35.1 | 1.8 at 1 GHz |
300 ps/3.33 GHz | 219 | 281.5 | 446.1 | 505.2 | 1232.7 | 61,154 | 35.2 | 6.0 at 3.33 GHz |
250 ps/4 GHz | 219 | 281.9 | 446.4 | 505.2 | 1233.4 | 61,190 | 35.2 | 7.2 at 4 GHz |
200 ps/5 GHz | 196 | 304.1 | 461.2 | 505.2 | 1270.5 | 63,029 | 36.2 | 9.5 at 5 GHz |
175 ps/5.71 GHz | 175 | 326.6 | 484.3 | 505.2 | 1316.1 | 65,291 | 37.7 | 11.3 at 5.71 GHz |
150 ps/6.67 GHz | 150 | 351.9 | 533.4 | 505.2 | 1390.5 | 68,984 | 41.1 | 14.2 at 6.67 GHz |
137 ps/7.29 GHz | 137 | 379.0 | 597.0 | 505.2 | 1481.1 | 73,479 | 45.3 | 16.7 at 7.29 GHz |
Constrained Clock Period [ps] | Slack [ps] | Critical Path Delay + Setup Time [ps] | Operating Frequency [GHz] | Interconnect Area [µm²] | Cell Area [µm²] | Total Area [µm²] | Static Power [µW] | Dynamic Power at Operating Frequency [mW] |
---|---|---|---|---|---|---|---|---|
250 | 30.12 | 219.88 | 4.55 | 191.26 | 1083.80 | 1275.1 | 41.9 | 12.1 |
200 | 0.29 | 199.71 | 5.01 | 194.72 | 1103.41 | 1298.1 | 43.2 | 14.0 |
175 | −4.51 | 179.51 | 5.57 | 199.53 | 1130.64 | 1330.2 | 45.1 | 16.0 |
150 | −9.19 | 159.19 | 6.28 | 212.87 | 1206.29 | 1419.2 | 49.9 | 19.4 |
137 | −8.14 | 145.14 | 6.89 | 225.88 | 1279.97 | 1505.8 | 55.5 | 22.9 |
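The slack and operating-frequency columns above are consistent with the usual static-timing relations: slack equals the constrained clock period minus the critical path delay plus setup time, and the operating frequency is the reciprocal of that delay. The sketch below recomputes both columns from the first and third columns; reading the table this way is an assumption, and the function names are hypothetical.

```python
def slack_ps(clock_period_ps, path_delay_ps):
    """Timing slack: positive means the clock constraint is met."""
    return clock_period_ps - path_delay_ps


def operating_frequency_ghz(path_delay_ps):
    """Highest clock frequency the critical path supports (1 / delay)."""
    return 1000.0 / path_delay_ps  # 1000 ps per ns, so 1/ns gives GHz


# (constrained clock period [ps], critical path delay + setup time [ps])
rows = [(250, 219.88), (200, 199.71), (175, 179.51), (150, 159.19), (137, 145.14)]

for period, delay in rows:
    print(f"period {period} ps: slack {slack_ps(period, delay):+.2f} ps, "
          f"f_max {operating_frequency_ghz(delay):.2f} GHz")
```

A negative slack, as in the last three rows, means the requested clock period was not met, so the reported operating frequency is lower than the constrained one.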
| | This Work (Hardware Core) | [22] | [28] | [32] | [33] * |
---|---|---|---|---|---|
Type of transform | Unified DCT/DST IV | Unified DCT/DST IV | Unified DCT/DST IV | DCT IV | DST IV |
No. of adders | |||||
No. of multipliers | |||||
No. of registers | N/A | - |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chiper, D.F.; Cracan, A. An Area-Efficient Unified VLSI Architecture for Type IV DCT/DST Having an Efficient Hardware Security with Low Overheads. Electronics 2023, 12, 4471. https://doi.org/10.3390/electronics12214471