1. Introduction
The advancement of technology and interconnected communications has significantly improved our lives but has also introduced numerous potential privacy and security risks. Cryptography plays a crucial role in safeguarding user privacy and security by employing primitives that prevent unauthorized access. The Advanced Encryption Standard (AES), established by the National Institute of Standards and Technology (NIST) in 2001, defines a widely accepted encryption algorithm with a block size of 128 bits and key sizes of 128, 192, or 256 bits [1]. In AES, the strength of encryption heavily relies on the design and implementation of the substitution box (S-Box) [2], a critical component that distorts the data to enhance security. The need for secure, efficient cryptographic solutions is particularly critical in several application areas. Firstly, the Internet of Things (IoT) encompasses a vast array of devices, from smart home appliances to industrial sensors, that require lightweight encryption to ensure data security while maintaining low power consumption and a minimal hardware footprint [3]. Secondly, System-on-Chip (SoC) architectures integrate multiple components into a single chip, necessitating optimized cryptographic modules that balance performance, area, and power consumption. Thirdly, wearable technology, such as fitness trackers and smartwatches, demands compact, power-efficient cryptographic solutions to protect sensitive personal data. Furthermore, medical devices, including pacemakers and insulin pumps, rely on secure communication to safeguard patient data and ensure device integrity. Lastly, modern automotive systems incorporate interconnected subsystems requiring secure data exchange to prevent unauthorized access and ensure passenger safety.
Despite significant advancements in cryptographic accelerators, several research gaps remain unaddressed. A primary challenge is the comprehensive optimization of power, performance, and area (PPA) parameters. Existing solutions often focus on improving one or two of these metrics at the expense of the third, resulting in suboptimal designs. Additionally, high-security algorithms like AES are computationally intensive and often not optimized for compactness and power efficiency, making them unsuitable for highly optimized SoC designs and IoT applications. Researchers have proposed various methods to create more resource-efficient cryptographic implementations. For instance, N. Ahmad et al. (2013) [4] introduced an XOR gate approach for AES S-Box and inverse S-Box designs in a 65 nm CMOS standard library, focusing on low power and area efficiency. A. R. Masoleh et al. (2018) [5] developed a logic-minimization heuristic for the AES S-Box in the same technology, aiming for efficient implementation. F. Artuger et al. (2020) [6] proposed a chaos-based technique for S-Box design to improve performance, while B. Rashidi (2020) [7] designed an S-Box with low-cost transformation, minimal area resources, and a short critical path delay in a 65 nm CMOS standard library. Despite these advancements, prior work has typically concentrated on improving either area, power, or delay using single 8-bit or 4-bit signals. None have comprehensively addressed all three parameters simultaneously, nor have they incorporated a dual quad-bit implementation with enhanced security.
In this paper, we propose an area-efficient S-Box architecture for ASIC implementations that employs a novel dual quad-bit structure. This approach maintains critical cryptographic properties, such as non-linearity, low differential uniformity, and bijectiveness. Utilizing Algebraic Normal Forms (ANFs) and the Walsh–Hadamard Transform, the design achieves high non-linearity and robust security against cryptographic attacks. The 8-bit S-Box design leverages dual quad-bit forward and backward transformations, optimizing encryption and decryption processes. Simulation results using Cadence RTL synthesis tools confirm that the proposed implementation improves PPA metrics while providing enhanced security compared with the previously proposed methods considered in our comparison. This approach addresses the existing research gaps by simultaneously optimizing power, performance, and area while incorporating a novel dual quad-bit design, which makes the proposed S-Box better suited to the stringent requirements of modern IoT devices, SoCs, wearable technology, medical devices, and automotive systems. Our findings highlight the potential of this new architecture for lightweight cryptographic implementations and for more secure and efficient digital communication systems. The paper’s significant contributions are summarized as follows:
(1) This work introduces a novel approach to designing substitution boxes (S-boxes) for AES encryption, leveraging dual quad-bit structures to enhance cryptographic security and hardware efficiency. Utilizing Algebraic Normal Forms (ANFs) and Walsh–Hadamard Transforms, the proposed RTL circuitry ensures optimal non-linearity, low differential uniformity, and bijectiveness, providing a robust and efficient solution for ASIC implementations.
(2) The security analysis of the proposed S-Box architecture using comprehensive statistical tests demonstrates enhanced security levels comparable to the AES S-Box and other existing works, ensuring robust protection against cryptographic attacks.
(3) The dual quad-bit forward and backward tracing circuitry is designed at the register transfer level (RTL) and is functionally verified using stringent measurement criteria, confirming the correctness and reliability of the proposed architecture.
(4) The proposed S-Box design is implemented on a ZedBoard Zynq 7000 SoC Board for functional verification, confirming its practical applicability and effectiveness in real-world environments. Additionally, the ASIC implementation using a standard 65 nm CMOS library demonstrates a low transistor count, small die size, and low delay path, achieving optimal power, performance, and area (PPA) metrics.
The remainder of this paper is organized as follows.
Section 2 reviews related work.
Section 3 covers the methodology, including the Proposed Architecture using Dual Quad-Bit S-Box Pair, the Walsh to Hadamard Transformation for the Dual Quad-Bit Forward S-Box, and the Hadamard to Walsh Transformation for the Dual Quad-Bit Backward S-Box.
Section 4 details the implementation and evaluation, which includes Security Tests using Statistical Analysis. It also discusses the Hardware Design and Implementation, including Verification and Security Measurement Criteria, RTL Synthesis using ZedBoard Zynq 7000 SoC, Front-End Design, and Back-End (Physical) ASIC Design, followed by a Comparative Discussion.
Section 5 concludes the research and discusses future work.
2. Related Work
Cryptography, derived from the Greek term meaning “secret writing”, is a technique that ensures message confidentiality. Historically, cryptography has been used to protect information, with roots tracing back to ancient civilizations as early as 2000 BC. For instance, the Egyptians utilized secret hieroglyphs, while the Ancient Greeks and Romans employed cryptographic methods such as the renowned Caesar cipher [8]. In contemporary times, cryptography is critical for securing data, ensuring that only authorized recipients can access transmitted information. Despite its pervasive use in modern informatics, many individuals are unaware of cryptography’s role in their daily interactions with technology. However, the robustness of cryptographic systems can be compromised by a single programming error or improper implementation, highlighting their inherent fragility. The foundation of modern cryptographic standards builds on the principles established by Claude Shannon. The current standard for encryption, the Advanced Encryption Standard (AES), utilizes the S-Box and inverse S-Box of the Rijndael cipher, proposed by Daemen and Rijmen. These algorithms, depicted in Figure 1, were adopted as the AES standard by the National Institute of Standards and Technology (NIST) in 2001 [1]. Implementing these algorithms in hardware, particularly in Application-Specific Integrated Circuits (ASICs), is crucial for achieving high efficiency and security in lightweight cryptographic applications. This paper focuses on the efficient ASIC implementation of a novel dual quad-bit S-Box pair architecture using 65 nm CMOS technology, addressing the need for secure and efficient cryptographic solutions in the modern digital landscape.
The most critical step in symmetric cryptography is the introduction of distortion to the data by substituting elements from a lookup table known as the S-Box. The S-Box maintains information security by incorporating the Shannon property of confusion [9]. This non-linear property is essential in modern cryptography, providing a robust defense against linear and differential attacks [10]. A prime example of this non-linear transformation is the S-Box in the NIST-approved Advanced Encryption Standard (AES) algorithm, as illustrated by the AES S-Box and inverse S-Box [1].
However, the AES S-Box is responsible for a significant portion of the delay in the entire encryption process. Therefore, research efforts are directed towards optimizing the algorithm, particularly designing new S-Boxes suitable for efficient implementation on various resource-constrained devices [11]. Numerous researchers have contributed to developing various S-Box designs for hardware implementations targeting the 65 nm CMOS standard library.
D. Canright et al. (2005) [12] were among the first to examine S-Box design choices based on polynomial and normal bases, providing 432 cases for each. They optimized bit matrices using a “greedy algorithm” and included NOR gates to save area, enabling compact hardware implementations for AES parallelism. This approach led to a structural code formulation that matched the hardware complexity reported, resulting in a reduced hardware cost of 200 GEs.
J. Boyar et al. (2012) introduced circuit optimization using three techniques: greedy heuristics for linear components, automatic theorem proving for resynthesizing nonlinear elements into shallow-deep tower blocks, and simple local replacements along critical paths [13]. N. Ahmad et al. (2013) proposed using composite-field arithmetic and a low-power Galois field GF(2^8) polynomial basis for an inverse S-Box CMOS model [4].
R. Ueno et al. (2015) [14] developed a GF(2^8) architecture based on an efficient and compact inversion circuit design, combining GF arithmetic in both nonredundant and redundant ways. This design established a basic standard basis for efficiently mapping input power components into logical gates within a 65 nm CMOS standard cell library, showing improved performance compared to conventional circuits.
J. Boyar et al. (2017) [15] advanced techniques for building small linear circuits with limited depth, utilizing a new heuristic for linear depth optimization. These techniques were used to build traditional encryption functions defined over GF(2) from gates such as “XNOR”, “XOR”, and “AND”. The method was applied repeatedly to optimize the linear top and bottom components in the See-Saw process, resulting in a smaller 16-bit S-Box based on inversion in GF(2^16).
A. R. Masoleh et al. (2018) [5,16] proposed two versions of the S-Box design: an all-structural lightweight design with a delay of 1.0808 ns and a slightly higher implementation area of 391.04 μm², and an all-structural fast design, the smallest, fastest, and most efficient S-Box design with the lowest power consumption, an area-time product of 162.177, and a low delay of 0.779697 ns.
B. Rashidi (2020) [7] suggested a hardware-efficient inversion-based S-Box, an alternative to the AES S-Box, with similar cryptographic features for lightweight block ciphers. This S-Box calculation involved field inversion and a refined transformation, primarily through two processes, resulting in an integrated S-Box with a low area cost and a low critical path delay (CPD).
Y. Teng et al. (2022) [17] introduced an advanced VLSI architecture for the AES S-box and inverse S-box, utilizing composite field arithmetic to achieve high area efficiency. Key optimizations include reducing the area of multipliers in the Galois composite field and combining squaring and multiplication operations with constants. The methodology also features manual optimization of the multiplicative inversion through simplified Boolean equations. The design improved efficiency by using pre-processing and post-processing modules to share resources between the S-box and inverse S-box, validated by FPGA and ASIC implementations showing a 10% area efficiency increase on Virtex-6 and a 30% improvement with the TSMC 90 nm process.
Despite significant advancements in designing S-Box architectures for AES, a notable research gap persists in concurrently optimizing area efficiency, processing speed, and security in hardware implementations. Previous works have primarily focused on either compacting the design or enhancing throughput individually. There remains a need for a holistic approach that integrates area, computational optimizations, and security measures. This research aims to develop a VLSI architecture that achieves superior area efficiency, high throughput, and robust security, validated through rigorous FPGA and ASIC implementations.
3. Methodology
Substitution boxes (S-boxes) are pivotal in providing non-linearity in block ciphers, which is crucial for resisting linear and differential cryptanalysis. The Advanced Encryption Standard (AES) utilizes an S-box based on finite field inversion, which, while secure, poses significant challenges in terms of hardware efficiency, particularly in ASIC implementations where power, performance, and area (PPA) are key constraints. Therefore, we propose RTL circuitry for the S-Box using the novel approach of a dual quad-bit structure while preserving the cryptographic properties required for robust security. Three properties of S-boxes in symmetric-key algorithms are central. Firstly, non-linearity is fundamental: a highly non-linear S-box keeps a large minimum Hamming distance from every affine function, thereby providing robust defense against linear cryptanalysis. Secondly, maintaining low differential uniformity is essential; this ensures that, for any nonzero input difference, even the most likely output difference occurs with low probability, thus protecting against differential cryptanalysis. Lastly, bijectiveness guarantees that each input maps to a unique output, ensuring the S-box is invertible and facilitating the decryption process in symmetric-key algorithms.
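As a minimal sketch of how two of these properties can be checked in software, the following Python snippet tests bijectiveness and computes the differential uniformity of a 4-bit S-box. The table used here is the S-box of the PRESENT block cipher, chosen only as a well-known placeholder; it is an assumption for illustration and is not the S-box proposed in this paper. The function names are likewise hypothetical.

```python
# Placeholder 4-bit S-box for illustration only: the PRESENT cipher S-box,
# NOT the S-box proposed in this paper.
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def is_bijective(sbox):
    """Return True if every output value occurs exactly once."""
    return sorted(sbox) == list(range(len(sbox)))


def differential_uniformity(sbox):
    """Largest difference-distribution-table entry over nonzero input differences."""
    size = len(sbox)
    worst = 0
    for dx in range(1, size):
        counts = [0] * size
        for x in range(size):
            # Output difference produced by the input pair (x, x ^ dx).
            counts[sbox[x] ^ sbox[x ^ dx]] += 1
        worst = max(worst, max(counts))
    return worst


if __name__ == "__main__":
    print("bijective:", is_bijective(PRESENT_SBOX))
    print("differential uniformity:", differential_uniformity(PRESENT_SBOX))
```

A lower differential uniformity is better; for comparison, the identity mapping reaches the worst possible value of 16 on 4-bit inputs.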
3.1. Algebraic Normal Forms (ANFs)
Algebraic Normal Form (ANF) is a polynomial representation of a Boolean function over the binary field $\mathbb{F}_2$. It expresses the function as a sum of products, which can be directly implemented in hardware:

$$f(x_1, \ldots, x_n) = \bigoplus_{I \subseteq \{1, \ldots, n\}} a_I \prod_{i \in I} x_i,$$

where $a_I$ are coefficients in $\mathbb{F}_2$ and ⊕ denotes addition modulo 2. The ANF of a Boolean function is critical for two reasons. Firstly, ANFs map directly onto digital circuits, because they use only XOR and AND gates, which are fundamental to such circuits. Secondly, ANFs are instrumental in analyzing the non-linearity of Boolean functions. Higher-degree terms in the ANF indicate that a function deviates further from linear functions, which enhances its resistance against cryptographic attacks.
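To make the ANF concrete, the sketch below computes ANF coefficients from a truth table with the binary Möbius transform. The function names and the variable naming `x0, x1, ...` (with `x0` as the least significant input bit) are assumptions made for this illustration; the example function is a toy one, not an output bit of the proposed S-Box.

```python
def anf_coefficients(truth_table):
    """Binary Moebius transform: truth table -> ANF coefficients.

    truth_table[x] is f(x) for x = 0 .. 2**n - 1. In the result, index m is
    a bit mask: coefficient m belongs to the product of all x_i whose bit i
    is set in m (m = 0 is the constant term).
    """
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
    return coeffs


def anf_to_string(coeffs):
    """Render the ANF with '^' for XOR and '*' for AND."""
    n = len(coeffs).bit_length() - 1
    terms = []
    for mask, c in enumerate(coeffs):
        if not c:
            continue
        if mask == 0:
            terms.append("1")
        else:
            terms.append("*".join(f"x{i}" for i in range(n) if (mask >> i) & 1))
    return " ^ ".join(terms) if terms else "0"


def algebraic_degree(coeffs):
    """Number of variables in the largest monomial with a nonzero coefficient."""
    return max((bin(m).count("1") for m, c in enumerate(coeffs) if c), default=0)


if __name__ == "__main__":
    # Toy function f = (x0 AND x1) XOR x2, listed for x = 0 .. 7.
    table = [((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]
    coeffs = anf_coefficients(table)
    print(anf_to_string(coeffs), "| degree", algebraic_degree(coeffs))
```

Because the transform is its own inverse over $\mathbb{F}_2$, applying `anf_coefficients` twice returns the original truth table, which is a convenient sanity check.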
3.2. Walsh–Hadamard Transform
The Walsh–Hadamard Transform (WHT) is employed to compute the Walsh spectrum of Boolean functions, which measures their deviation from affine functions. For a Boolean function $f$ of $n$ variables, the transform is defined as follows:

$$W_f(a) = \sum_{x \in \mathbb{F}_2^n} (-1)^{f(x) \oplus a \cdot x},$$

where $a \cdot x$ represents the dot product modulo 2.
In cryptographic contexts, the non-linearity $NL(f)$ of a Boolean function $f$ is crucial. It is calculated using the following formula:

$$NL(f) = 2^{n-1} - \frac{1}{2} \max_{a \in \mathbb{F}_2^n} \left| W_f(a) \right|.$$

This measure of non-linearity is vital because it quantifies the function’s distance from any linear or affine function, thus indicating the function’s robustness against linear cryptanalysis. High non-linearity is therefore desirable in cryptographic Boolean functions. The use of ANFs and the WHT in cryptographic S-box design is crucial for ensuring robust security features. These mathematical tools provide a clear pathway for designing and evaluating the non-linearity and differential uniformity of Boolean functions in cryptographic applications.
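The Walsh spectrum and the non-linearity measure can be sketched in a few lines of Python using the standard fast Walsh–Hadamard butterfly. The function names are hypothetical, and the 4-bit table is again the PRESENT S-box used purely as a placeholder, not the proposed design.

```python
# Placeholder 4-bit S-box for illustration only (PRESENT cipher S-box).
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def walsh_spectrum(truth_table):
    """Fast Walsh-Hadamard transform of the sign sequence (-1)**f(x)."""
    spec = [1 - 2 * bit for bit in truth_table]
    h = 1
    while h < len(spec):
        for start in range(0, len(spec), 2 * h):
            for j in range(start, start + h):
                a, b = spec[j], spec[j + h]
                spec[j], spec[j + h] = a + b, a - b
        h *= 2
    return spec


def nonlinearity(truth_table):
    """2**(n-1) minus half of the largest absolute Walsh coefficient."""
    n = len(truth_table).bit_length() - 1
    return 2 ** (n - 1) - max(abs(w) for w in walsh_spectrum(truth_table)) // 2


def sbox_nonlinearity(sbox):
    """Minimum non-linearity over all nonzero XOR combinations of output bits."""
    size = len(sbox)
    values = []
    for mask in range(1, size):
        component = [bin(sbox[x] & mask).count("1") & 1 for x in range(size)]
        values.append(nonlinearity(component))
    return min(values)


if __name__ == "__main__":
    print("NL of x0 AND x1:", nonlinearity([0, 0, 0, 1]))
    print("NL of placeholder S-box:", sbox_nonlinearity(PRESENT_SBOX))
```

Taking the minimum over all output-bit combinations, rather than over single output bits only, is the usual way to extend the Boolean-function measure to a full S-box.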
3.3. Proposed Architecture Using Dual Quad-Bit S-Box Pair
In this section, we propose an 8-bit S-Box design utilizing dual quad-bit forward (alpha) and backward (beta) transformations.
Figure 2 illustrates the RTL design for the 8-bit S-Box using dual quad-bit transformations. The design employs multiplexers and demultiplexers to select between forward and backward operations, ensuring efficient encryption and decryption processes. The design consists of several key components. Registers store the input and output values, providing synchronized data flow. The demultiplexer splits the 8-bit input into two 4-bit values for processing. A multiplexer combines the two 4-bit processed values into an 8-bit output. Forward transformations ($\alpha$) implement the quad-bit Walsh to Hadamard transformation, while backward transformations ($\beta$) implement the quad-bit Hadamard to Walsh transformation. The control logic determines whether the forward or backward transformation is applied based on the select signal.
The 8-bit input is initially stored in a clocked register to ensure synchronized data flow. The input is then split into two 4-bit values, the high and low segments, using a demultiplexer. This separation allows the two segments to be processed in parallel. The separated 4-bit values are fed into both the forward ($\alpha$) and backward ($\beta$) transformation units. The forward transformation converts Walsh functions to Hadamard functions, while the backward transformation converts Hadamard functions to Walsh functions. After processing, a multiplexer selects between the outputs of the forward and backward transformations based on the control logic: when the select signal is 0, the forward transformation (Walsh to Hadamard) is applied, and when the select signal is 1, the backward transformation (Hadamard to Walsh) is applied. The selected values are combined into an 8-bit output, which is stored in an output register to ensure synchronized data output. The design ensures efficient and secure cryptographic operations by leveraging the orthogonal properties of the Walsh and Hadamard transformations.
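The data flow described above can be mirrored by a short behavioral model in Python. This is only a sketch under stated assumptions: the two 4-bit tables are arbitrary placeholder permutations (the first is the PRESENT S-box), not the values of the proposed design, and the assignment of one table to the high segment and the other to the low segment is assumed for illustration. The names `ALPHA_HI`, `ALPHA_LO`, `BETA_HI`, `BETA_LO`, and `dual_quad_sbox` are hypothetical.

```python
# Placeholder forward tables (assumptions, NOT the paper's S-box values).
ALPHA_HI = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
ALPHA_LO = [0x3, 0x8, 0xF, 0x1, 0xA, 0x6, 0x5, 0xB,
            0xE, 0xD, 0x4, 0x2, 0x7, 0x0, 0x9, 0xC]


def invert(table):
    """Build the backward table of a bijective forward table."""
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return inverse


BETA_HI = invert(ALPHA_HI)
BETA_LO = invert(ALPHA_LO)


def dual_quad_sbox(byte, select):
    """Behavioral model: select = 0 is forward, select = 1 is backward."""
    hi, lo = (byte >> 4) & 0xF, byte & 0xF      # split into two 4-bit values
    if select == 0:
        hi_out, lo_out = ALPHA_HI[hi], ALPHA_LO[lo]
    else:
        hi_out, lo_out = BETA_HI[hi], BETA_LO[lo]
    return (hi_out << 4) | lo_out               # recombine into one byte


if __name__ == "__main__":
    for value in (0x00, 0x3A, 0xFF):
        forward = dual_quad_sbox(value, 0)
        restored = dual_quad_sbox(forward, 1)
        print(f"{value:02X} -> {forward:02X} -> {restored:02X}")
```

The model omits the input and output registers, since clocking does not change the combinational mapping; applying the backward path to a forward output restores the original byte.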
3.4. Walsh to Hadamard Transformation for Dual Quad-Bit Forward S-Box
The transformation from Walsh to Hadamard functions is pivotal in generating the S-Boxes. The Walsh functions are defined as a sequence of binary values (0 and 1). In contrast, the Hadamard functions are derived from the Hadamard matrix, which is a square matrix whose entries are binary (0 and 1) and whose rows are mutually orthogonal. This transformation is critical because it leverages the orthogonality properties of the Hadamard matrix to ensure the desired cryptographic strength and non-linearity in the S-Boxes. The two forward S-Boxes are defined by logical expressions that map the Walsh inputs $w$ to the Hadamard outputs $h$.
These logical expressions are derived based on the transformation rules from Walsh to Hadamard functions, allowing for efficient computation of the S-Box outputs, which are critical in the AES encryption process.
Figure 3 depicts two logic circuit diagrams, labeled (a) and (b), which illustrate the transformation from Walsh functions to Hadamard functions. Figure 3a corresponds to forward S-Box 1, where the logic gates and connections are arranged to transform the Walsh inputs ($w$) into the Hadamard outputs ($h$). The key components in this transformation are NOT gates (inverters), which negate the inputs; AND gates, which perform logical conjunctions of the inputs and their negations; and OR gates, which perform logical disjunctions to combine the results of the AND gates. Figure 3b corresponds to forward S-Box 2, which also transforms Walsh inputs into Hadamard outputs but with a slightly different configuration and connection of logic gates. In both circuits, the inputs $w$ are fed into the circuit, NOT gates negate the inputs where necessary, AND gates combine these inputs (and their negations) to form intermediate results, and OR gates combine these intermediate results to produce the final outputs $h$. The diagrams thus implement the Boolean expressions of the two forward S-Boxes, achieving the transformation from the Walsh functions $w$ to the Hadamard functions $h$.
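The NOT/AND/OR structure described above corresponds to a sum-of-products form, which can be derived mechanically from a truth table. The sketch below builds the canonical (unminimized) sum-of-products expression for each output bit of a 4-bit S-box and evaluates it gate by gate. The table is a placeholder (the PRESENT S-box), the signal names `w0..w3` with `w0` as the least significant bit are assumptions, and the resulting expressions are therefore not the expressions of the proposed S-Boxes.

```python
# Placeholder 4-bit S-box for illustration only (PRESENT cipher S-box).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def sop_terms(sbox, out_bit, n_bits=4):
    """Inputs (minterms) for which the chosen output bit equals 1."""
    return [x for x in range(1 << n_bits) if (sbox[x] >> out_bit) & 1]


def minterm_to_string(x, n_bits=4, name="w"):
    """One AND term: plain literal for a 1 bit, negated literal for a 0 bit."""
    literals = [f"{name}{i}" if (x >> i) & 1 else f"~{name}{i}"
                for i in reversed(range(n_bits))]
    return "(" + " & ".join(literals) + ")"


def sop_expression(sbox, out_bit, n_bits=4):
    """Canonical sum-of-products expression for one output bit."""
    return " | ".join(minterm_to_string(x, n_bits)
                      for x in sop_terms(sbox, out_bit, n_bits))


def eval_sop(minterms, x, n_bits=4):
    """Evaluate the expression with explicit NOT, AND, and OR operations."""
    result = 0
    for m in minterms:
        term = 1
        for i in range(n_bits):
            bit = (x >> i) & 1
            literal = bit if (m >> i) & 1 else 1 - bit  # NOT on 0-literals
            term &= literal                              # AND gate
        result |= term                                   # OR gate
    return result


if __name__ == "__main__":
    print("h0 =", sop_expression(SBOX, 0))
    rebuilt = [sum(eval_sop(sop_terms(SBOX, b), x) << b for b in range(4))
               for x in range(16)]
    print("matches table:", rebuilt == SBOX)
```

A synthesis tool would normally minimize such expressions before mapping them to gates, so the canonical form shown here is an upper bound on the logic actually required.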
3.5. Hadamard to Walsh Transformation for Dual Quad-Bit Backward S-Box
The transformation from Hadamard to Walsh functions is essential for generating the backward S-Boxes. This process is crucial for reversing the encryption process, ensuring that the S-Boxes can be used effectively in both encryption and decryption. The two backward S-Boxes are defined by logical expressions that map the Hadamard inputs $h$ to the Walsh outputs $w$.
These logical expressions are derived based on the transformation rules from Hadamard to Walsh functions, allowing for efficient computation of the S-Box outputs, which are critical in the AES decryption process.
Figure 4 depicts two logic circuit diagrams, labeled (a) and (b), which illustrate the transformation from Hadamard functions to Walsh functions. Figure 4a corresponds to backward S-Box 1, where the logic gates and connections are arranged to transform the Hadamard inputs ($h$) into the Walsh outputs ($w$). The key components in this transformation are NOT gates (inverters), which negate the inputs; AND gates, which perform logical conjunctions of the inputs and their negations; and OR gates, which perform logical disjunctions to combine the results of the AND gates. Figure 4b corresponds to backward S-Box 2, which also transforms Hadamard inputs into Walsh outputs but with a slightly different configuration and connection of logic gates. In both circuits, the inputs $h$ are fed into the circuit, NOT gates negate the inputs where necessary, AND gates combine these inputs (and their negations) to form intermediate results, and OR gates combine these intermediate results to produce the final outputs $w$. The diagrams thus implement the Boolean expressions of the two backward S-Boxes, achieving the transformation from the Hadamard functions $h$ to the Walsh functions $w$.
The Walsh–Hadamard Transform provides a robust framework for the design of S-Boxes in AES encryption. By leveraging the orthogonal properties of Hadamard matrices, we can derive efficient and secure logical expressions for both forward and backward S-Boxes. The dual S-box approach involves splitting the traditional 8-bit input into two 4-bit blocks, processed by distinct S-boxes. This design enhances the cryptographic strength and hardware efficiency by allowing more tailored and optimized transformations. This methodology enhances the security and performance of AES encryption, making it a valuable tool in modern cryptographic implementations.
Figure 5 shows the 8-bit S-Box values generated from the dual quad-bit S-Box pairs shown in Figure 6.
In Figure 5, each cell in the matrices represents a hexadecimal value that is substituted during the encryption and decryption processes. The left matrix shows the forward S-Box used for substitution in the encryption process, while the right matrix shows the backward S-Box used in the decryption process. These S-Boxes are designed to ensure high non-linearity and resistance against linear and differential cryptanalysis, enhancing the encryption’s security.
Figure 6 illustrates the values of dual quad-bit S-Box pairs for both forward and backward transformations. The dual quad-bit S-Box design is an innovative approach that splits the traditional 8-bit S-Box into two 4-bit S-Boxes, providing additional flexibility and complexity in the substitution process. The top row displays the S-Box 1 forward and backward values, while the bottom row shows the S-Box 2 forward and backward values. This dual S-Box structure aims to enhance the diffusion and confusion properties of the cipher, thereby increasing its resistance to various cryptographic attacks. Using dual S-Boxes allows for more intricate substitution patterns, which contribute to the overall strength and security of the encryption algorithm.
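The way a 16 × 16 table of 8-bit values arises from a pair of 4-bit S-Boxes can be sketched as follows. The two forward tables are placeholder permutations assumed for illustration, not the values in Figure 6, and the layout convention (row index taken from the high 4 bits, column index from the low 4 bits) is likewise an assumption. The function names are hypothetical.

```python
# Placeholder forward 4-bit tables (assumptions, NOT the values in Figure 6).
S1_FWD = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
          0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
S2_FWD = [0x3, 0x8, 0xF, 0x1, 0xA, 0x6, 0x5, 0xB,
          0xE, 0xD, 0x4, 0x2, 0x7, 0x0, 0x9, 0xC]


def compose_table(s_hi, s_lo):
    """256-entry table: s_hi acts on the high 4 bits, s_lo on the low 4 bits."""
    return [(s_hi[b >> 4] << 4) | s_lo[b & 0xF] for b in range(256)]


def invert_table(table):
    """Backward table such that backward[forward[b]] == b."""
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    return inverse


def format_grid(table):
    """16 rows of 16 two-digit hexadecimal values."""
    return "\n".join(
        " ".join(f"{table[16 * row + col]:02X}" for col in range(16))
        for row in range(16)
    )


if __name__ == "__main__":
    forward = compose_table(S1_FWD, S2_FWD)
    backward = invert_table(forward)
    print("Forward table:")
    print(format_grid(forward))
    print("Backward table:")
    print(format_grid(backward))
```

Inverting the composed 8-bit table gives the same result as composing the two inverted 4-bit tables, so either route can be used to produce the backward matrix.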
5. Conclusions
The S-Box is the core component in a cipher block that underpins the security of the AES. Designing an efficient architecture for the S-Box should therefore be a primary focus in achieving good cryptographic performance. This research proposed an energy-efficient and area-delay optimized forward and backward S-Box for use in lightweight cryptography. We demonstrated the software implementation of the S-Box in Python to perform statistical analysis of the security measures and reviewed the properties of the proposed design. Furthermore, we integrated our proposed S-Box into the AES core for efficient hardware implementation and compared the gate area, power, and delay with other methods. The dual quad-bit forward and backward S-Boxes were designed and implemented using efficient VLSI circuits. The ASIC implementation of the AES core with the proposed S-Box was carried out in a 65 nm CMOS standard cell library. The results show that our proposed S-Box uses fewer hardware resources and achieves a lower critical path delay (CPD) and lower power consumption than the other S-Box architectures compared, making it a strong choice for lightweight block ciphers. The dual quad-bit S-Box structure enhances security levels, outperforming the traditional AES S-Box and other existing methods. Therefore, the proposed design is highly suitable for applications requiring efficient and secure cryptographic solutions. To enhance privacy in blockchain systems like Blockshare, our method can integrate homomorphic encryption for secure computations on encrypted data, Zero-Knowledge Proofs (ZKPs) for data integrity verification without disclosure, and differential privacy for protecting individual data points. Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs) can improve identity management, reducing reliance on centralized authorities.
These methods can also be applied to systems like VQL and VChain+ for secure cloud queries and improved data privacy. Our S-Box design, with its robust cryptographic properties and hardware efficiency, supports these advancements, enhancing security and resource efficiency in privacy-preserving protocols.