Article

Enhancing Data Privacy Protection and Feature Extraction in Secure Computing Using a Hash Tree and Skip Attention Mechanism

China Agricultural University, Beijing 100083, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Appl. Sci. 2024, 14(22), 10687; https://doi.org/10.3390/app142210687
Submission received: 23 October 2024 / Revised: 8 November 2024 / Accepted: 12 November 2024 / Published: 19 November 2024
(This article belongs to the Special Issue Cloud Computing: Privacy Protection and Data Security)

Abstract

This paper addresses the critical challenge of secure computing in the context of deep learning, focusing on the pressing need for effective data privacy protection during transmission and storage, particularly in sensitive fields such as finance and healthcare. To tackle this issue, we propose a novel deep learning model that integrates a hash tree structure with a skip attention mechanism. The hash tree is employed to ensure data integrity and security, enabling the rapid verification of data changes, while the skip attention mechanism enhances computational efficiency by allowing the model to selectively focus on important features, thus minimizing unnecessary processing. The primary objective of our research is to develop a secure computing model that not only safeguards data privacy but also optimizes feature extraction capabilities. Our experimental results on the CIFAR-10 dataset demonstrate significant improvements over traditional models, achieving a precision of 0.94, a recall of 0.89, an accuracy of 0.92, and an F1-score of 0.91, notably outperforming standard self-attention and CBAM. Additionally, the visualization of results confirms that our approach effectively balances efficient feature extraction with robust data privacy protection. This research contributes a new framework for secure computing, addressing both the security and efficiency concerns prevalent in current methodologies.

1. Introduction

In the digital era, with the widespread application of big data and artificial intelligence technologies, issues of data security and privacy protection have become increasingly prominent [1,2]. Particularly in fields that deal with sensitive information, such as finance, banking, healthcare, and cross-border data computing, the importance of secure computing is self-evident [3,4,5,6]. Secure computing technologies aim to ensure the security and privacy of data during storage, transmission, and processing, preventing unauthorized access or disclosure. However, traditional data protection methods often struggle to balance data privacy with the utilization of data value.
Traditional data security methods, such as encryption and access control, do provide a certain level of data protection but typically perform poorly when handling complex data analysis tasks [7]. For instance, data in encrypted states are difficult to analyze effectively, and once decrypted, their security can no longer be guaranteed [8]. Additionally, traditional methods often require substantial computational resources and time when dealing with large-scale data, which is unacceptable in practical applications, especially those requiring real-time data processing [9]. Thus, understanding how to effectively compute and analyze data without exposing raw data presents a significant challenge in the field of secure computing.
In recent years, deep learning technologies have made significant advances in the field of secure computing [10]. For example, Sahu Santosh Kumar et al. [11] discussed and summarized the latest developments in deep learning methods for predicting the stock market, identifying data scarcity as a major barrier to creating profitable Deep Reinforcement Learning (DRL)-based Quantitative Trading (QT) methods. In the financial sector, as online transactions and digital payments become more common, a large amount of sensitive personal and business financial information requires stringent protection; for large spatio-temporal data stored on the cloud, data owners lose physical control over the data and face risks of tampering and deletion. Ren et al. [12] designed a blockchain-based secure storage mechanism, BSMD (blockchain-based secure storage mechanism for big spatio-temporal data), which uses an on-chain and off-chain cooperative storage model to mitigate the problem of insufficient blockchain storage capacity. This mechanism not only enhances the storage security of large spatio-temporal data but also provides traceability. Subvector commitments are constructed from the CDH assumption and bilinear pairings, and a specific scheme for secure spatio-temporal big data storage is then built from these subvector commitments to ensure data integrity and the consistency of on-chain and off-chain data. Their protocol has batch processing capabilities, allowing data validators to query and verify in batches and data owners to update data in storage servers in batches, thus reducing computational costs.
To implement security architectures that detect cloud intrusions and protect customer data from hackers and attackers, Suganya M et al. [13] combined Stochastic Gradient Descent Long Short-Term Memory (SGD-LSTM) and Blowfish encryption to detect and prevent unauthorized cloud access. The proposed system involves three phases: user registration, intrusion detection, and intrusion defense. The SGD-LSTM classifier predicts cloud data access and prevents unauthorized cloud access. During the data access phase, cloud data access is managed by authenticating authorized users with the Blowfish encryption algorithm. Experimental results show that, compared to algorithms frequently used in cloud computing, their algorithm provides a high level of protection and achieves considerable improvements in security and execution speed. The healthcare sector also faces challenges in data security, and Puneeth RP et al. [14] proposed a new model with an appropriate hash-key blockchain security algorithm to achieve high data transmission rates. Their analysis shows that the proposed method yields lower latency in milliseconds, and the entire process was validated in a hospital environment on a patient data management system. However, such security systems could be further strengthened by machine learning models that incorporate data blocks into their security functions.
In response to the aforementioned issues, this paper proposes a deep learning security computing model based on a hash tree and skip attention mechanism. This model combines the structural advantages of the hash tree with the efficient computational capabilities of the skip attention mechanism, significantly enhancing data security while greatly improving computational efficiency. Our research focuses on effectively protecting data privacy during large-scale data processing while maintaining efficient feature extraction.
The main contributions of this paper are reflected in several aspects: First, we introduce the combination of the hash tree and the skip attention mechanism to address the inefficiencies of traditional secure computing methods when handling large-scale data. The hash tree structure ensures data integrity and security, with each node storing the hash value of a data block, allowing for the rapid verification of any data changes. Additionally, the skip attention mechanism effectively filters out irrelevant data processing paths, significantly reducing unnecessary computations and improving processing speed.
Secondly, we have designed a novel skip loss function aimed at further optimizing model performance. This loss function complements the skip attention mechanism, helping the model accurately identify and focus on critical features related to secure computing during training, thereby enhancing the model’s security performance and adaptability in complex data environments.
In summary, the proposed model combines the hash tree and skip attention mechanism, addressing the inefficiencies of traditional secure computing methods in processing large-scale data while enhancing the model’s performance and adaptability in the field of secure computing. This makes it highly practical and broadly applicable in real-world applications. Finally, the organization of this paper is as follows: Section 2 reviews related work, Section 3 presents the materials and methods used, Section 4 showcases the experimental results and discussions, and Section 5 concludes with a summary of the research findings.

2. Related Work

2.1. Hash Algorithms

In our study, we utilize hash algorithms, essential tools in data security, for generating unique, fixed-length representations of input data of any length. These algorithms are widely applied in secure computing to ensure the integrity of data during storage and transmission, offering a reliable means of verifying data authenticity. In practice, hash functions convert input data into a concise “digest” or “hash value”, effectively serving as a digital fingerprint, as shown in Figure 1.
The process begins by dividing the input data into blocks of fixed length. For example, data can be broken down into multiple blocks, each of 512 bits. Each of these 512-bit blocks is further divided into smaller 32-bit segments, referred to as “words”. Initially, the block consists of 16 such words. To enhance security and provide a more comprehensive hash, additional words are generated based on the initial 16 words, extending the set to 64 words. This transformation relies on sequential calculations that derive each new word from the preceding ones, forming an expanded data set.
Ideal hash functions, like those in the SHA series, are designed to be computationally efficient, challenging to reverse-engineer, collision-resistant, and sensitive to even minor changes in input data. In secure computing applications, hash trees, or Merkle trees, utilize these hash algorithms to arrange hash values of data blocks hierarchically. This structure enables efficient data verification, particularly valuable in fields where data integrity is paramount, such as finance, healthcare, and blockchain technology.
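To make the digest and Merkle-tree construction above concrete, the following Python sketch uses the standard hashlib library to hash a list of data blocks and combine the digests pairwise into a single root. The helper names, the pairwise construction, and the duplication of the last digest on odd-sized levels are conventions assumed for this sketch rather than details of our implementation.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    """Return the SHA-256 digest of a byte string."""
    return hashlib.sha256(data).digest()

def merkle_root(blocks: list[bytes]) -> bytes:
    """Hash each block, then combine digests pairwise until a single root remains."""
    level = [sha256(b) for b in blocks]                 # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:                         # duplicate last digest on odd levels
            level.append(level[-1])
        level = [sha256(left + right)                   # parent = H(left || right)
                 for left, right in zip(level[0::2], level[1::2])]
    return level[0]

# Changing any single block changes the root, which is what enables fast integrity checks.
blocks = [b"block-1", b"block-2", b"block-3", b"block-4"]
print(merkle_root(blocks).hex())
print(merkle_root([b"block-1", b"block-2", b"block-3", b"tampered"]).hex())
```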

2.2. Attention Mechanisms

The attention mechanism has been a significant innovation in the field of deep learning in recent years, designed to mimic the focusing characteristics of human attention, allowing models to prioritize the most critical information when dealing with large and complex data [15,16]. This mechanism has significantly enhanced model performance across various tasks, such as natural language processing, image recognition [17], and voice processing [18,19,20]. Especially in processing serialized data, the attention mechanism can be seen as a form of weighted sum that dynamically focuses on different parts of the input data. The basic attention function can be expressed as a mapping of queries (Query), keys (Key), and values (Value), as shown in Figure 2.
The softmax function is used to normalize the weight values so that their sum equals one. The attention mechanism not only optimizes data processing but also achieves higher precision and efficiency in complex secure computing tasks [21,22]. Overall, the attention mechanism lets the model concentrate on key features when processing information, significantly improving model performance in specific tasks, particularly in the critical area of secure computing.
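As a minimal illustration of the Query/Key/Value mapping and softmax normalization described above, the following NumPy sketch implements standard scaled dot-product attention; the array shapes and the scaling by the key dimension are illustrative assumptions of this sketch.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax so that the weights along `axis` sum to one."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention: weight each value by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # (n_queries, n_keys) similarity matrix
    weights = softmax(scores, axis=-1)    # each row sums to one
    return weights @ V                    # weighted sum of the values

# Toy example: 4 queries, keys, and values of dimension 8.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)           # (4, 8)
```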

2.3. Large Models

Large models, especially those based on deep learning, such as BERT [23], GPT [24], and Transformer [25], have revolutionized the field of artificial intelligence. These models utilize vast amounts of data, complex network structures, and powerful computational resources to exhibit unprecedented performance across various tasks [26,27]. The core of these models lies in their ability to capture deep patterns and relationships within data through large-scale parameters and complex computations. Large models typically contain hundreds of millions to billions of parameters and are trained on massive datasets [28,29]. The infrastructure of these models is often deep neural networks, particularly the Transformer architecture, which processes dependencies in sequence data through self-attention mechanisms [30]. Training large models typically relies on extensive computational resources, such as GPU or TPU clusters. A crucial step in the model training process is backpropagation, used to optimize the following loss function:
L(\theta) = -\sum_{i=1}^{N} \log p(y_i \mid x_i; \theta)
Here, $N$ is the number of training samples; $x_i$ and $y_i$ are the input data and labels, respectively; $\theta$ represents the model parameters; and $p(y_i \mid x_i; \theta)$ is the probability that the model predicts label $y_i$ given input $x_i$. The application of large models is crucial in secure computing. These models can process massive amounts of data and identify complex data patterns, which is particularly important because security threats often hide in minor differences within large-scale data [31,32]. Large models, through their powerful data processing capabilities, can identify and analyze potential threats within these data.
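The objective above is the usual negative log-likelihood; a small PyTorch sketch with toy logits and labels shows how it relates to the library's cross-entropy loss.

```python
import torch
import torch.nn.functional as F

# Toy logits for N = 4 samples over 3 classes, with their true labels.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 1.5,  0.3],
                       [-0.2, 0.0, 2.2],
                       [1.0, 1.0,  1.0]])
labels = torch.tensor([0, 1, 2, 0])

# Manual negative log-likelihood: L(theta) = -sum_i log p(y_i | x_i; theta).
log_probs = F.log_softmax(logits, dim=-1)
nll_sum = -log_probs[torch.arange(len(labels)), labels].sum()

# Cross-entropy with sum reduction computes the same quantity.
assert torch.isclose(nll_sum, F.cross_entropy(logits, labels, reduction="sum"))
print(nll_sum.item())
```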
Despite their significant advantages in handling complex tasks, large models also face numerous challenges, such as high computational costs, lack of model transparency, and data privacy issues [33,34]. Especially in the security field, understanding how to utilize these large models without exposing sensitive data remains a significant problem [35,36]. In summary, the development of large model technology provides new opportunities and challenges for the field of secure computing [37]. By effectively integrating these advanced technologies, we can provide more effective solutions to complex problems in the field of secure computing.

3. Materials and Method

3.1. Materials

3.1.1. Dataset Collection

In our research, we collected stock price and medical imaging data to support our secure computing model using hash trees and skip attention mechanisms, especially in finance and healthcare. For stock price data, we deployed advanced web scraping to gather real-time information from reputable financial sources, such as global financial news sites and stock exchange announcements, capturing key price metrics (opening, closing, high, and low prices). Our multi-threaded crawler system, designed for scalability and error resilience, efficiently fetched over 2.7 million records from 2010 to 2023, covering thousands of companies, including data from high-volatility periods like financial crises.
For medical imaging, we obtained brain CT datasets from established databases, yielding high-resolution, annotated images. These datasets, carefully anonymized and ethically sourced, include brain conditions like tumors and hemorrhages, providing essential materials for training models to recognize complex disease patterns. This extensive dataset collection forms the basis for our model’s secure and precise applications in critical, data-sensitive fields.

3.1.2. Data Preprocessing

For the Time-series Dataset. In this study, preprocessing the stock price dataset is essential, especially during data cleaning and handling missing values. It is common to have missing data in stock price records due to various issues, such as collection errors or delays from market fluctuations. To address these gaps, we use methods like forward fill and backward fill. Forward fill replaces a missing value with the most recent known data point, allowing continuity in the sequence. Conversely, backward fill uses data from the following day if the previous data point is also missing. These methods help maintain temporal coherence in short periods of missing data. For longer gaps or irregular patterns, linear interpolation provides a simple way to estimate missing values by averaging adjacent data points. This approach can be extended to more complex interpolation methods, like polynomial or spline interpolation, which are better suited to nonlinear trends commonly found in stock markets. Besides filling in missing data, detecting and handling outliers is another crucial aspect of data cleaning. Outliers, which significantly deviate from the rest of the data, can be detected with box plots or Z-scores. If an outlier is identified, we may choose to either remove it or replace it with the average of the neighboring values to ensure smoother input data. Finally, we apply Z-score normalization to standardize the dataset, giving it a mean of 0 and a standard deviation of 1. This sequence of steps helps ensure high-quality, reliable input data for training our deep learning models.
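A brief pandas sketch of the cleaning steps described above: forward/backward filling, linear interpolation, Z-score-based outlier replacement, and Z-score normalization. The price series, the outlier threshold, and the replacement rule are hypothetical choices made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical daily closing prices with gaps and one spurious spike.
prices = pd.Series(
    [101.0, np.nan, 103.5, np.nan, np.nan, 250.0, 104.2],
    index=pd.date_range("2023-01-02", periods=7, freq="B"),
)

# Forward fill, then backward fill any gap left at the start of the series.
filled = prices.ffill().bfill()

# Alternative for longer gaps: linear interpolation between adjacent points.
interpolated = prices.interpolate(method="linear")

# Detect outliers via Z-scores and replace them with the mean of their neighbours.
z = (filled - filled.mean()) / filled.std()
outliers = z.abs() > 2.0
filled.loc[outliers] = ((filled.shift(1) + filled.shift(-1)) / 2)[outliers]

# Z-score normalization: mean 0, standard deviation 1.
normalized = (filled - filled.mean()) / filled.std()
print(normalized.round(2))
```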
For the Image Dataset. In this study, we applied three data augmentation techniques, including CutOut, CutMix, and Mosaic, to improve the model’s robustness and adaptability, as shown in Figure 3. CutOut involves masking random sections of an image, encouraging the model to recognize objects even when parts are missing. CutMix combines two images by overlaying parts of one onto another, creating blended samples that help the model learn from mixed-feature data, which is particularly useful for addressing class imbalances. Mosaic stitches together four images into a single frame, introducing diverse contexts and patterns that enhance the model’s ability to generalize across varied backgrounds and categories.
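A minimal NumPy sketch of CutOut and CutMix under simplifying assumptions of our own (fixed square patches, CutMix label mixing omitted); Mosaic would additionally stitch four images into one canvas and is omitted for brevity.

```python
import numpy as np

def cutout(image: np.ndarray, size: int, rng: np.random.Generator) -> np.ndarray:
    """CutOut: zero out a random square patch so the model learns from partially occluded objects."""
    h, w = image.shape[:2]
    cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))     # random patch centre
    y0, y1 = max(0, cy - size // 2), min(h, cy + size // 2)
    x0, x1 = max(0, cx - size // 2), min(w, cx + size // 2)
    out = image.copy()
    out[y0:y1, x0:x1] = 0                                         # mask the patch
    return out

def cutmix(img_a: np.ndarray, img_b: np.ndarray, size: int, rng: np.random.Generator) -> np.ndarray:
    """CutMix: paste a random patch of img_b into img_a (label mixing omitted here)."""
    h, w = img_a.shape[:2]
    y0 = int(rng.integers(0, h - size))
    x0 = int(rng.integers(0, w - size))
    out = img_a.copy()
    out[y0:y0 + size, x0:x0 + size] = img_b[y0:y0 + size, x0:x0 + size]
    return out

rng = np.random.default_rng(42)
img_a, img_b = rng.uniform(size=(32, 32, 3)), rng.uniform(size=(32, 32, 3))
print(cutout(img_a, 8, rng).shape, cutmix(img_a, img_b, 8, rng).shape)
```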

3.2. Proposed Method

3.2.1. Overview

In this study, we propose a deep learning model for secure computing that integrates hash trees and skip attention mechanisms, aiming to enhance the security and computational efficiency when handling sensitive data. The design of this model thoroughly considers the security needs during data storage and transmission, while incorporating deep learning technologies to boost the model’s ability to recognize complex data patterns. The first part of the model involves the integration of a hash tree structure. The hash tree, also known as a Merkle tree, is a tree-shaped data structure where each node is linked to its children via hash values, primarily used to ensure the integrity and security of data. As demonstrated in Figure 4, the hash tree recursively combines the hash value of each data block with those of its adjacent blocks to form a new hash value, culminating in the root node’s hash value.
This process can be mathematically represented as follows:
\mathrm{Hash}(N) = \mathrm{Hash}(\mathrm{Hash}(N_1) + \mathrm{Hash}(N_2))
where $N$ represents a node, and $N_1$ and $N_2$ are its child nodes. This structure ensures that any alteration in a data block will lead to a change in the root hash, thus quickly detecting any tampering with the data. The processed data first enter the skip attention mechanism module. Our model utilizes an improved attention mechanism, the skip attention mechanism, which allows the model to skip unnecessary computational steps and directly focus on critical information through skip connections. Finally, to optimize the training process and enhance the model's sensitivity to key features, we designed the skip loss function. This loss function takes into account the important data features that the skip attention mechanism might overlook, emphasizing the learning of these features to ensure that the model does not miss information crucial for security. The research presented in this paper has broad practical application value, especially in fields where data security and privacy protection are critical, such as cloud-based medical diagnosis and finance.
In medical diagnosis applications, patients’ medical imaging data are often stored in the cloud to enable doctors or intelligent systems to perform diagnoses via remote access. However, cloud platforms face significant risks of data leakage, making it essential to effectively extract key information from images while ensuring patient privacy. The model proposed in this paper combines hash trees and the skip attention mechanism to encrypt image data, thereby protecting data privacy while ensuring the model’s ability to extract critical features from encrypted data. During medical image analysis, the model can identify important features such as edges, textures, and contrasts, ensuring the accurate recognition of lesions like tumors and fractures, thus providing a secure and efficient solution for cloud-based medical diagnosis. In financial scenarios, particularly in the analysis of sensitive data such as stocks and futures, financial institutions typically store and process customer data in the cloud to support efficient data analysis and risk management. However, the privacy and security of financial data also face significant challenges, especially in cross-border data processing, where data leakage can lead to immeasurable economic losses. The method proposed in this paper encrypts the data before they are uploaded to the cloud, ensuring that even if unauthorized users access the encrypted data, they cannot retrieve specific customer information.
In summary, the methods proposed in this paper have strong practical application potential in the fields of cloud-based medical diagnosis and financial data analysis. They not only effectively safeguard data privacy but also enhance the model’s performance in critical tasks through efficient feature extraction, providing a secure and efficient solution for scenarios with high privacy demands.

3.2.2. Hash-Tree-Based Transformer Model

In this research, we integrate hash trees with the Transformer structure to propose a novel secure computing model for deep learning, particularly suitable for the integrity protection and rapid retrieval of large-scale data. Hash trees are widely used data structures that build tree-shaped hash nodes to ensure that data are not tampered with during transmission and storage. Compared to traditional fully connected network structures, hash trees exhibit strong parallelism and fast verification speeds. When combined with the Transformer model, they further enhance the efficiency and accuracy of the model in handling complex tasks. The fundamental principle of a hash tree is to recursively segment data, with each node storing the hash values of its children, culminating in the storage of the entire dataset's hash value at the root of the tree. Suppose we have a dataset $D = \{d_1, d_2, \ldots, d_n\}$; the construction of its hash tree can be represented as
H_{\mathrm{root}} = H(H(d_1) \oplus H(d_2) \oplus \cdots \oplus H(d_n))
Here, $H(d_i)$ denotes the hash value of data block $d_i$, and $\oplus$ represents the combination operation between hash values. This structure ensures that any tampering with a data block will result in a change in the root hash value, thereby enabling rapid integrity checks. In the insertion operation of a hash tree, as shown in Figure 5, when a new node is added to the tree structure, the system determines the insertion position based on the hash value of the new data and updates the hash values of the relevant nodes along the path to maintain the consistency and integrity of the tree structure. The specific process is as follows:
1.
Locating the Insertion Position: First, we determine the position of the new node within the hash tree. This is typically carried out by selecting an appropriate leaf position for insertion according to the hash value of the data and the tree’s balancing rules.
2.
Calculating the Hash Value of the New Node: Perform a hash operation on the inserted node to generate its hash value. Assume the new node is $N_{\mathrm{new}}$ and its hash value is $H(N_{\mathrm{new}})$.
3.
Updating the Hash Value of the Parent Node: Add the hash value of the new node to its parent node’s hash calculation to update the parent’s hash value. Assume the parent node is P; its updated hash value is calculated as
H(P) = H(H(P_{\mathrm{left}}) + H(N_{\mathrm{new}}))
where $H(P_{\mathrm{left}})$ is the hash value of the original left child node. This process recursively continues up the tree until reaching the root node.
4.
Recursively Updating Hash Values Along the Path: Each update affects its parent node; therefore, the hash values of all nodes along the path from the new node to the root node need to be recalculated, ensuring the overall hash consistency of the tree structure after inserting the new node.
In the deletion operation of a hash tree, as shown in Figure 6, the system first locates the node to be deleted and then updates the hash values of the relevant nodes along the path to ensure the integrity of the tree. The specific steps are as follows:
1.
Locating the Node to Delete: Identify the position of the node to delete, $N_{\mathrm{del}}$, within the tree structure based on the node's hash value or positional information.
2.
Removing the Node: Remove the node from the tree. If the node is a leaf, it is directly removed. If the node has children, its child nodes are adjusted according to the tree’s balancing strategy to maintain the structure.
3.
Updating the Hash Value of the Parent Node: After deletion, the hash value of the parent node needs to be recalculated. Assume the deleted node is $N_{\mathrm{del}}$ and its parent node is $P$; the new hash value is calculated as
H(P) = H(H(P_{\mathrm{left}}) + H(P_{\mathrm{right}}))
where $H(P_{\mathrm{left}})$ and $H(P_{\mathrm{right}})$ are the updated hash values of the left and right child nodes, respectively.
4.
Recursively Updating Hash Values Along the Path: Starting from the parent node of the deleted node, recursively update the hash values along the path up to the root node to ensure the hash consistency of the entire tree.
This deletion operation, by updating the hash values of the relevant nodes, ensures the security of the tree structure and the immutability of data, as any deletion operation will affect the hash path of the entire tree, thereby maintaining data integrity and enabling fast verification.
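A compact sketch of a hash tree whose root reflects every insertion and deletion, as in the two procedures above. For brevity it rebuilds the hash levels on each query rather than updating only the hashes along the affected path, and the duplication of the last digest on odd-sized levels is an assumption of this sketch.

```python
import hashlib

def H(data: bytes) -> bytes:
    """SHA-256 digest used for every node in the tree."""
    return hashlib.sha256(data).digest()

class MerkleTree:
    """Hash-tree sketch: leaves are hashed and combined pairwise into a root."""

    def __init__(self, blocks: list[bytes]):
        self.leaves = [H(b) for b in blocks]

    def root(self) -> bytes:
        level = list(self.leaves) or [H(b"")]
        while len(level) > 1:
            if len(level) % 2 == 1:
                level.append(level[-1])                    # duplicate last digest on odd levels
            level = [H(l + r) for l, r in zip(level[0::2], level[1::2])]
        return level[0]

    def insert(self, block: bytes) -> None:
        self.leaves.append(H(block))                       # all parent hashes above change

    def delete(self, index: int) -> None:
        del self.leaves[index]                             # removing a leaf also changes the root

tree = MerkleTree([b"a", b"b", b"c", b"d"])
r0 = tree.root()
tree.insert(b"e")
r1 = tree.root()
tree.delete(0)
r2 = tree.root()
assert r0 != r1 and r1 != r2                               # every structural change is visible at the root
```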
The Transformer model excels in processing sequential data by capturing dependencies between different positions in a sequence through its self-attention mechanism. In the model described in this paper, the integration of hash trees with the Transformer structure creates an effective framework for data protection and processing. Specifically, each node in the hash tree corresponds to an input vector in the Transformer, and after hashing operations, it generates input features for subsequent layers. Suppose the input for the Transformer model is $X = \{x_1, x_2, \ldots, x_n\}$; then, the hash tree generates a corresponding hash value for each input data point, forming the feature set $H(X) = \{H(x_1), H(x_2), \ldots, H(x_n)\}$.
This method uses the hash values of each data point as keys, queries, and values in the attention mechanism, significantly enhancing the security during the data processing phase. In programming evaluation tasks, every step of a program can be tracked through the hash tree. Each execution step of the program corresponds to a node state in the hash tree, and changes in state after execution are reflected in updates to the node’s hash values. For example, the initial state of the program can be mapped to the root node of the hash tree, while each function call or variable update corresponds to a child node in the hash tree. The deep learning model based on the hash tree, by combining data integrity protection and efficient data processing capabilities, offers a powerful tool for the data security field. The hash tree provides a basis for data validation and dynamic updates, while the self-attention mechanism of the Transformer model enhances the model’s understanding of complex data patterns. This integration not only optimizes system performance but also ensures the security and reliability of the data.

3.2.3. Skip Attention Mechanism

In this study, we propose an improved attention mechanism—skip attention mechanism—based on the traditional Transformer structure. This mechanism is innovatively optimized to enhance the efficiency and accuracy of the model when processing large-scale datasets, as shown in Figure 7. By incorporating the skip mechanism, our model can effectively reduce unnecessary computations, particularly in handling complex tasks that require focused attention, significantly improving performance.
Traditional Transformer models utilize self-attention mechanisms to process sequential data, calculating attention weights for every element against all others in the sequence to capture the intricate relationships within the data. While powerful, this mechanism encounters substantial computational resource consumption when dealing with very large datasets. For instance, for a sequence of length n, the self-attention mechanism requires computing weights for n × n pairs of elements, which becomes highly inefficient with large amounts of data. In contrast, the skip attention mechanism introduces skip connections, allowing the model to bypass certain layers during feature extraction and relationship modelling, and directly relay information from lower layers to higher ones. This not only reduces computational load but also helps maintain the benefits of deep networks while avoiding issues like gradient vanishing. When designing the skip attention mechanism, we focused on how to achieve efficient data processing through a streamlined network structure. The specific design parameters of the model are as follows:
  • Network layers: Our model is designed with three layers, each including a skip attention unit and a feed-forward network unit. This design ensures sufficient model depth to handle complex data relationships while maintaining computational efficiency.
  • Network width and channel numbers: The width of each skip attention unit is set to 64, meaning each attention head processes features of dimension 64. This width was determined through experimental balancing of model performance and computational efficiency. The number of channels is set to 256 to ensure the network can capture sufficient information.
The core of the skip attention mechanism is the introduction of skip connections in the self-attention computation, which can be mathematically represented as
\mathrm{SkipAttention}(Q, K, V) = \mathrm{LayerNorm}(X + \mathrm{MultiHead}(Q, K, V))
Here, Q , K , V are the query, key, and value matrices, respectively, which are obtained from the outputs of the previous layer through linear transformations; MultiHead ( · ) denotes the multi-head attention mechanism; X represents the input sequence; and LayerNorm ( · ) is a layer normalization operation used to stabilize the network training process. By reducing computational steps, the skip attention mechanism can significantly improve the training and inference speed of the model on large-scale datasets. This is particularly important in secure computing tasks requiring real-time processing of vast amounts of data, such as real-time monitoring systems and online trading platforms. Additionally, this mechanism helps more effectively propagate gradients throughout the network by directly connecting input and output layers, thereby mitigating the common problem of gradient vanishing in deep networks. Mathematically, the design of the skip attention mechanism ensures that information can still be effectively transmitted even as the number of network layers increases, thus maintaining the model’s learning and generalization capabilities. This design shows significant advantages in tasks with complex structural dependencies, such as long-text analysis in natural language processing and the real-time prediction of high-frequency trading data. In summary, the introduction of the skip attention mechanism not only optimizes the computational process and enhances efficiency but also strengthens the model’s ability to capture key features in data, bringing a new, efficient solution to the field of secure computing.
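A minimal PyTorch sketch of one skip attention layer following the equation above. The three-layer depth, the width of 64, and the 256-dimensional feed-forward channels come from the design parameters listed earlier, while the head count and the remaining details are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class SkipAttentionBlock(nn.Module):
    """SkipAttention(Q, K, V) = LayerNorm(X + MultiHead(Q, K, V)), followed by a feed-forward unit."""

    def __init__(self, dim: int = 64, num_heads: int = 4, ffn_dim: int = 256):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, ffn_dim), nn.ReLU(), nn.Linear(ffn_dim, dim))
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Skip connection: the input X is added back to the multi-head attention output.
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out)
        # Second skip connection around the feed-forward unit.
        return self.norm2(x + self.ffn(x))

# Three stacked layers, as in the design parameters above (width 64, 256 channels).
model = nn.Sequential(*[SkipAttentionBlock() for _ in range(3)])
out = model(torch.randn(8, 16, 64))   # (batch, sequence length, feature dimension)
print(out.shape)
```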

3.2.4. Skip Loss Function

In deep learning models, the loss function is a crucial factor that measures the difference between model predictions and actual values, directly affecting the model’s training effectiveness and final performance. Traditional loss functions, such as cross-entropy loss or mean squared error loss, typically calculate the error between predictions and true values independently at each output node, then average these errors. While this method performs well in many tasks, particularly when the output layer structure is relatively simple, it may not fully utilize information passed through skip connections in deep networks with such connections, potentially leading to poor training outcomes or unstable training processes. The design of the skip loss function allows the model to consider inter-layer dependencies introduced by skip connections during loss computation. It accounts not only for errors at individual nodes but also for related errors across layers, enabling the model to better understand and utilize deep information transmitted through skip connections. The core of the skip loss function is the weighted loss terms, dynamically adjusted based on the path data take through the network, thereby optimizing the network’s learning of crucial information. The specific mathematical formula is as follows:
L(\theta) = \sum_{i=1}^{N} \left[ \alpha_i \cdot L_{\mathrm{main}}(y_i, \hat{y}_i) + \sum_{j \in S(i)} \beta_{ij} \cdot L_{\mathrm{skip}}(y_i, \hat{y}_j) \right]
Here, $N$ is the number of training samples, $y_i$ is the actual label for sample $i$, $\hat{y}_i$ is the corresponding model prediction, and $L_{\mathrm{main}}$ is the primary loss function, such as cross-entropy or mean squared error, used to calculate the main error between the model output and the actual labels. $\alpha_i$ is the weight of the main loss term, $S(i)$ represents the set of layers connected to sample $i$ through skip connections, $\beta_{ij}$ is the weight of the loss term for layer $j$ connected to sample $i$ through a skip connection, and $L_{\mathrm{skip}}$ is the loss function for skip connections, used to calculate errors across layer outputs.
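A hedged sketch of how the skip loss could be assembled, assuming auxiliary predictions are available from the skip-connected layers in $S(i)$; the choice of cross-entropy for both terms and the default weights are illustrative assumptions, not the exact configuration used in our experiments.

```python
import torch
import torch.nn.functional as F

def skip_loss(y_true, y_pred, skip_outputs, alpha=1.0, betas=None):
    """Skip loss sketch: main loss plus weighted losses on skip-connected layer outputs.

    `skip_outputs` holds auxiliary predictions from layers reached via skip connections;
    the per-layer weights `betas` play the role of the beta_ij terms in the formula above.
    """
    main = F.cross_entropy(y_pred, y_true)                        # L_main
    if betas is None:
        betas = [1.0 / max(len(skip_outputs), 1)] * len(skip_outputs)
    skip = sum(b * F.cross_entropy(o, y_true)                     # L_skip over S(i)
               for b, o in zip(betas, skip_outputs))
    return alpha * main + skip

# Toy example: one main output plus two skip-connected auxiliary outputs, 5 classes.
y = torch.randint(0, 5, (8,))
main_out = torch.randn(8, 5)
aux_outs = [torch.randn(8, 5), torch.randn(8, 5)]
print(skip_loss(y, main_out, aux_outs).item())
```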
The design of the skip loss function is based on mathematical and practical considerations: in deep networks, gradient transmission issues can restrict the update of features in lower layers, hindering the model’s ability to learn complex data patterns effectively. By introducing skip connections, the model can directly feedback deep features to shallower layers, but if the loss function only calculates errors at the output layer, this deep-layered information might not be sufficiently utilized. By introducing additional loss calculations for every layer with skip connections, the skip loss function allows for a more balanced update of weights across layers, enhancing information flow and learning efficiency from bottom to top layers. In complex networks utilizing the skip attention mechanism, the skip loss function offers the following advantages:
1.
Improved gradient flow: By adding extra loss calculations for cross-layer connections, it helps address the problem of vanishing gradients, especially evident in deeper networks.
2.
Enhanced model learning capability: It enables the model to more effectively learn and utilize features transmitted through skip connections, thus improving overall prediction accuracy and generalization ability.
3.
Increased training stability: By dynamically adjusting the weights of loss at different layers, it makes the training process more stable, reducing fluctuations during training.
In summary, the introduction of the skip loss function not only optimizes the mathematical structure of the model but also provides a more effective training strategy for handling complex data. In fields like secure computing, image recognition, and natural language processing, which require processing large-scale data, this design significantly enhances the model’s performance and practical applicability.

4. Results and Discussion

4.1. Experimental Setup

4.1.1. Hardware and Software Environment

In this study, to ensure efficient training and accurate testing of the model, we selected a computer cluster equipped with high-performance GPUs as the hardware platform. This platform is equipped with multiple NVIDIA Tesla V100 GPUs (Hong Kong, China), each with 32 GB of memory, capable of providing up to 100 TeraFLOPS of single-precision floating-point computational power, which is essential for processing large-scale deep learning models and complex datasets. Additionally, the cluster includes several servers with high-speed CPUs and large memory capacities, each equipped with at least 64-core Intel Xeon processors (Hong Kong, China) and 512 GB of RAM, which provides ample computational resources during the data preprocessing and model validation phases.
On the software side, we primarily used the deep learning frameworks TensorFlow 2.18.0 and PyTorch 1.8. TensorFlow, developed by Google, is favored by many researchers and developers for its flexible computational graph concept, robust scalability, and broad community support. It supports various platforms and languages, allowing for easy deployment on different hardware. PyTorch, developed by Facebook’s AI research team, is popular for its native support for dynamic computational graphs and its straightforward user interface. Its ease of use and modular design make it particularly suitable for rapid prototyping in research and development.
To enhance the efficiency and reproducibility of experiments, we also utilized acceleration libraries such as CUDA and cuDNN to optimize GPU computational performance. These libraries significantly speed up deep learning computations, especially when performing intensive calculations on large datasets. Furthermore, to manage and monitor the experimental process, we deployed machine learning experiment management tools like MLflow, which can track experiments, record parameters, and manage the lifecycle of models.

4.1.2. Hyperparameters and Training Settings

In this research, to ensure the stable performance of the model across different datasets, we employed five-fold cross-validation to assess the robustness of the model. Through this method, the entire dataset was evenly divided into five parts, with four parts used for training the model and the remaining part used for testing. This process was repeated five times, each time with a different part used as the test set, thus ensuring that every data point had the opportunity to be used in testing. This cross-validation method reduces bias and variance in the model evaluation process, enhancing the accuracy and reliability of the assessment.
For model training, we adjusted several hyperparameters to achieve optimal learning outcomes. Initially, we set a relatively high learning rate, α = 0.001 , to allow the model to converge quickly. As training progressed, we implemented a learning rate decay strategy, gradually reducing the learning rate to α = 0.0001 to ensure that the model could make finer adjustments near the global minimum and avoid excessive oscillations. Batch size is also an important hyperparameter that determines the number of data samples computed in each forward and backward propagation. In this study, we set the batch size to 64, a compromise that ensures the utilization of computational resources while maintaining the stability and memory efficiency of the model training.
The number of training epochs, also determined through multiple experiments, was set at 50 epochs. This number was adjusted based on the model’s performance on the validation set. Typically, the performance of the model would stabilize after a few epochs, but to ensure that the model could fully learn all complex patterns in the data, it was necessary to increase the number of training cycles appropriately. Moreover, to prevent overfitting, we also incorporated an early stopping mechanism, which stops training if there is no significant improvement in the model’s performance on the validation set over 10 consecutive epochs.
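A sketch of the training settings described above (initial learning rate 0.001 decayed toward 0.0001, 50 epochs, batch size 64, early stopping with a patience of 10 epochs); the placeholder model, the stand-in validation loss, and the exact decay factor are assumptions made for illustration.

```python
import torch

# Settings from the text: 50 epochs, batch size 64, early stopping patience of 10 epochs.
EPOCHS, BATCH_SIZE, PATIENCE = 50, 64, 10

model = torch.nn.Linear(32, 2)                                   # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)        # initial learning rate 0.001
# gamma chosen so the rate decays from roughly 0.001 to 0.0001 over 50 epochs (0.955**50 ~ 0.1).
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.955)

best_val, epochs_without_improvement = float("inf"), 0
for epoch in range(EPOCHS):
    # ... one pass over a training loader with batch size 64 would go here ...
    val_loss = 1.0 / (epoch + 1)                                 # stand-in validation loss
    scheduler.step()
    if val_loss < best_val - 1e-4:
        best_val, epochs_without_improvement = val_loss, 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= PATIENCE:               # early stopping
            break
```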

4.1.3. Baseline

In the field of time series prediction, mainstream models such as Support Vector Machines (SVMs) [38], Random Forest [39], Recurrent Neural Networks (RNNs) [40], and Long Short-Term Memory (LSTM) networks [41] are widely applied across various domains including finance, meteorology, and traffic. SVM excels in handling small samples due to its powerful classification capability and high predictive accuracy, making it particularly suitable for high-dimensional data processing. However, SVM has limited capacity for modeling nonlinear relationships, often requiring the integration of kernel functions to enhance its performance. Random Forest improves prediction accuracy and stability by aggregating multiple decision trees, making it especially suitable for data with complex feature relationships. The structure of RNN enables it to process sequential data, capturing information over time, which makes it appropriate for modeling time series data such as speech and text, although it is prone to gradient vanishing issues. To address this, LSTM, as a variant of RNN, effectively captures long-term dependencies by introducing memory cells and gating mechanisms, particularly excelling in long-term predictions for financial market trends and weather changes.
In the domain of image classification, models such as AlexNet [42], GoogLeNet [43], ResNet [44], and EfficientNet [45] have garnered widespread attention in recent years. AlexNet gained prominence in the 2012 ImageNet competition with its deep convolutional neural network architecture, marking the rise of deep learning in the field of image recognition. By employing ReLU activation functions and dropout regularization, it effectively reduces the risk of overfitting and achieves high classification accuracy. Subsequently, GoogLeNet introduced the Inception module, which enhances the network’s expressive power by computing features in parallel with convolutional kernels of different sizes while maintaining a low computational complexity. ResNet addresses the gradient vanishing problem in training deep networks through residual learning, allowing the network to reach significant depths and markedly improving image classification performance. Finally, EfficientNet employs a compound scaling strategy in its model design, effectively balancing the model’s depth, width, and resolution, resulting in excellent performance in terms of efficiency and accuracy, making it a popular choice in the current image classification domain. By conducting an in-depth analysis of these mainstream models, we can better understand their strengths and weaknesses, enabling us to select the best solutions for specific application scenarios and enhance model performance in complex tasks.

4.1.4. Evaluation Metrics

In this study, we adopt accuracy, precision, recall, and F1-score as evaluation metrics, each reflecting the model's performance from a different perspective. Accuracy refers to the proportion of correctly predicted samples among the total number of samples, providing an intuitive reflection of the model's overall performance; however, it may lead to misleading conclusions in cases of class imbalance. Precision is the proportion of truly positive samples among those predicted as positive by the model, assessing the accuracy of the model's positive predictions. Recall, on the other hand, refers to the proportion of correctly predicted positive samples among all actual positive samples, reflecting the model's ability to identify positive instances. The F1-score is the harmonic mean of precision and recall, serving as a comprehensive measure of the model's classification performance, especially significant when addressing class imbalance issues. These evaluation metrics are expressed by the following formulas:
\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}
P = \frac{TP}{TP + FP}
R = \frac{TP}{TP + FN}
F1\text{-score} = \frac{2 \cdot P \cdot R}{P + R}
Here, $TP$ represents True Positives, $TN$ True Negatives, $FP$ False Positives, and $FN$ False Negatives. By comprehensively applying these evaluation metrics, we can thoroughly assess the model's performance, considering both classification accuracy and model efficiency.
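The four metrics can be computed directly from the confusion-matrix counts; the counts in the example below are hypothetical and only illustrate the formulas.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for illustration only (not taken from the experiments).
print(classification_metrics(tp=89, tn=95, fp=6, fn=11))
```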

4.2. Time Series Data Prediction Results

The primary purpose of this experiment is to evaluate the performance of different models on stock price prediction and electronic health record data. By introducing multiple evaluation metrics, including precision, recall, accuracy, F1-score, memory usage, and frames per second (FPS), this study compares the proposed method with other classic models, as shown in Table 1 and Table 2 and Figure 8. These metrics provide a comprehensive assessment of each model’s performance on various datasets, considering not only prediction accuracy but also practical aspects such as computational resource consumption and operational efficiency. The handling of stock and medical data requires high standards, as these data types often exhibit complex time dependencies and volatility. Therefore, this experiment provides insights into each model’s adaptability to different data types, especially in contexts requiring data security and privacy protection. The model must demonstrate strong prediction capabilities and efficient resource management to meet these demands effectively.
From the results, the proposed method shows the highest precision, recall, and F1-score on both the stock and electronic health record datasets, achieving a precision of 0.94, a recall of 0.89, and an accuracy of around 0.92. In comparison, traditional machine learning models like SVM and Random Forest achieve relatively similar precision and recall but show significant drawbacks in memory usage and FPS metrics. For tasks involving time-series data, deep learning models such as RNN and LSTM display strong modeling capabilities for time dependencies, with relatively high precision and F1-scores, while maintaining lower memory consumption and slightly higher FPS than the proposed method, demonstrating their strengths in computational resource management. However, supported by multi-head attention mechanisms and hash tree structures, the proposed model can more precisely extract critical features from time-series data, an advantage that is particularly beneficial for handling high-dimensional data and complex feature extraction tasks. This structural advantage allows the proposed method to maintain prediction accuracy while leveraging a more complex architecture to provide deeper learning and protection of data details. This balance renders the proposed model especially valuable for tasks that demand high precision and data security.

4.3. Image Data Classification Results

The primary objective of this experiment is to evaluate the performance of the proposed model across various datasets, including Brain CT, CIFAR-10 (Canadian Institute for Advanced Research), and ImageNet, to comprehensively validate metrics such as precision, recall, memory consumption, and frames per second (FPS). In this experimental design, these metrics enable a detailed comparison of our proposed method with classical deep learning models across different tasks, assessing its adaptability and resource efficiency in image recognition tasks. Table 3, Table 4 and Table 5 and Figure 9 showcase the versatility and efficiency of our approach in multi-task scenarios. By selecting diverse datasets (e.g., Brain CT representing medical data and CIFAR-10 and ImageNet as standard computer vision data), we can observe the model’s stability and performance across varying data structures and complexities, especially in applications focused on privacy protection and information extraction for sensitive data.
The experimental results show that the proposed model achieves high precision and F1-scores across all datasets, performing particularly well on the Brain CT dataset with a precision of 0.92 and an accuracy of 0.91, surpassing classic models such as AlexNet, GoogLeNet, ResNet, and EfficientNet. However, our model also shows relatively higher memory consumption and lower FPS. Owing to its multi-head attention mechanism and hash tree structure, the model excels at accurately extracting and safeguarding critical features within image data, which makes it especially suitable for high-precision tasks such as detecting subtle lesions in medical imaging. Although this adds to memory usage and reduces FPS, the design proves advantageous in applications that prioritize high precision and data privacy. On CIFAR-10 and ImageNet, which are standard computer vision datasets, our method achieves precision and F1-scores close to those of EfficientNet but with higher memory usage. Theoretically, this performance reflects the structural complexity of our method, allowing for more effective multi-level feature extraction when handling high-dimensional image data. As such, our model performs exceptionally well in scenarios with complex dependencies, making it well suited for high-stakes tasks requiring privacy protection and precise recognition, such as medical diagnostics.

4.4. Results of the Ablation Study on Different Attention Mechanisms

This section aims to compare the performance of different attention mechanisms in the same task to discuss their impact on model performance. The primary purpose of the experiment is to verify the effects of various attention mechanisms on four key indicators: precision, recall, accuracy, and F1-score. This comparison helps deepen our understanding of how attention mechanisms enhance model performance, especially in complex data processing tasks.
The theoretical analysis of the results in Table 6 reveals significant insights. First, the standard self-attention mechanism, by computing the relationships between input features, can capture long-range dependencies. However, this mechanism may not be efficient for highly complex or noisy data, as it involves global computation of weights, which can be computationally expensive with large datasets and may obscure some crucial local features. On the other hand, the Convolutional Block Attention Module (CBAM) combines convolution operations with an attention mechanism. Not only can it capture important spatial features, but it can also enhance the model’s sensitivity to specific features through channel attention, making CBAM perform better than the standard self-attention across several metrics. CBAM’s fine-grained attention adjustment, particularly suitable for high-dimensional data like images, can more effectively discern which regions contain key information, thus improving overall prediction precision and recall. Lastly, the skip attention mechanism introduces skip connections, optimizing the flow of information within the network and reducing unnecessary computations, making the model more efficient and precise. This mechanism, while retaining the advantages of a deep network, avoids common issues like gradient vanishing and is particularly suitable for tasks that require large-scale feature integration and rapid response. The high precision and recall rates of the skip attention mechanism are due to its effective integration of information from various network layers, enabling the model not only to capture deep abstract features but also to attend to subtle variations in the input data, resulting in superior performance across all evaluation metrics.
In conclusion, the ablation study on different attention mechanisms clearly demonstrates the roles and limitations of these mechanisms in enhancing model performance. The results not only affirm the effectiveness of the skip attention mechanism in improving the model’s processing capabilities but also highlight the significance and application prospects of attention mechanisms within modern deep learning frameworks. These analyses enable researchers and practitioners to better select and design suitable attention mechanisms for specific tasks, maximizing model performance and efficiency.

4.5. Encryption Effect Visualization Analysis

In this section, the experiment compares the original data, encrypted data, and feature maps to demonstrate the encryption effect of the proposed model. The main purpose of this experiment is to visualize the encryption method’s ability to protect the privacy of the original data while still retaining feature information for model use. In the experiment, four random images from the CIFAR-10 dataset were selected as the original data and then processed by the proposed encryption model to generate the encrypted images. Subsequently, through the visualization of the feature maps from the first and second layers, the feature extraction effect of the encrypted images processed by the model is displayed, as shown in Figure 10. This visualization method allows users to observe the changes in the original data, encrypted data, and the feature extraction process, facilitating the analysis of the effectiveness of the encryption method.
As seen in the images, the first row shows the original images clearly presenting different categories from the CIFAR-10 dataset, while the second row displays the encrypted images processed by the proposed model. The encrypted images have significantly lost the visual features of the original images, making it impossible to recognize the specific content of the images with the naked eye, fully demonstrating the effectiveness of the encryption in protecting data privacy. However, in the subsequent feature map visualizations (the third and fourth rows), it can be observed that despite the encrypted input images, the model is still able to extract useful features.
To scientifically evaluate the effectiveness of the encryption methods, we introduced two commonly used metrics in the experiment: PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index), as shown in Table 7. These metrics were used to compare different encryption methods. PSNR measures the difference between the encrypted and original images, typically calculated by comparing the mean squared error (MSE) between them. A higher PSNR value indicates that the encrypted image is closer to the original image, thus preserving more critical information. The calculation of PSNR involves first determining the MSE between the original and encrypted images, then using this error to compute the PSNR value, which is the logarithmic ratio of the original image’s brightness to the MSE. On the other hand, SSIM focuses on structural similarity, which aligns more closely with human visual perception of image quality. SSIM evaluates the similarity between the encrypted and original images by comparing brightness, contrast, and structural information. An SSIM value closer to 1 indicates that the encrypted image is structurally more similar to the original, suggesting that the method effectively preserves essential image information while protecting privacy. The calculation of SSIM involves dividing the image into windows and assessing the brightness, contrast, and structural differences across these windows, which are then combined to yield the final SSIM score. Through quantitative analysis using PSNR and SSIM, our proposed method achieves a well-balanced trade-off between privacy protection and information retention. The experimental results show that the encryption method presented in this study ensures data security while effectively extracting and utilizing critical image features, offering an innovative solution that combines data privacy protection with deep learning model efficacy.
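A sketch of how PSNR and SSIM can be computed in practice, assuming a recent scikit-image for SSIM; the images below are random stand-ins rather than the CIFAR-10 images and encrypted outputs used in the experiment.

```python
import numpy as np
from skimage.metrics import structural_similarity

def psnr(original: np.ndarray, encrypted: np.ndarray, data_range: float = 1.0) -> float:
    """PSNR in dB: log ratio of the peak signal value to the mean squared error."""
    mse = np.mean((original - encrypted) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# Stand-in images in [0, 1]; real use would compare originals with their encrypted versions.
rng = np.random.default_rng(0)
original = rng.uniform(size=(32, 32, 3)).astype(np.float32)
encrypted = np.clip(original + rng.normal(scale=0.1, size=original.shape), 0, 1).astype(np.float32)

print("PSNR:", psnr(original, encrypted))
# SSIM compares luminance, contrast, and structure over local windows; values near 1
# mean the encrypted image is structurally close to the original.
print("SSIM:", structural_similarity(original, encrypted, channel_axis=-1, data_range=1.0))
```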

4.6. Discussion on Security

In this section, we discuss the security mechanisms of our model, examine potential vulnerabilities and adversarial attacks, and suggest optimization strategies to enhance robustness. Our design combines a skip attention mechanism, built on skip connections, with a hash tree structure, aiming to secure data privacy while improving feature extraction and computational efficiency. However, adversarial attacks in deep learning can exploit certain weaknesses. For instance, subtle adversarial noise, imperceptible to humans but disruptive to model predictions, can shift the attention mechanism's focus and lead to biased feature extraction. Similarly, while the hash tree is efficient for data integrity checks, it may require significant computational resources to respond to rapid changes when malicious nodes are inserted or removed.
To understand the model's behavior under adversarial conditions, we analyzed its response to common attacks such as FGSM, PGD, and CW, all of which introduce slight perturbations intended to skew model outputs. FGSM, for example, uses the loss gradient to craft a minimal but effective change in the input. Our skip attention mechanism performs multi-level feature extraction, and its skip connections allow features to propagate across levels; this disperses disturbances, making it harder for small perturbations to compromise the entire feature set. The hash tree additionally verifies data integrity at multiple levels, which hinders perturbation propagation through the network and further reinforces robustness against FGSM. Against the iterative PGD attack, which applies stronger, multi-step perturbations, the skip connections spread out and dilute the cumulative effect of the disturbances, while the hash tree's multi-level validation filters out changes at each level, so accumulated perturbations are checked at multiple stages and lose much of their effectiveness. Finally, the CW attack, which is designed to apply minimal but impactful perturbations, is also challenged by this multi-level validation: even the smallest perturbations face layered scrutiny through hash validation and the multi-level feature focusing of skip attention, making it difficult for consistent disturbances to evade detection across all layers. In summary, the combination of skip attention and hash tree structures ensures that disturbances cannot easily affect every feature layer, strengthening the model's defenses against FGSM, PGD, and CW attacks.
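As an illustration of the single-step perturbation discussed above, the following is a minimal FGSM sketch, assuming a PyTorch image classifier trained with a cross-entropy loss; the model, inputs, and epsilon value are placeholders rather than the exact configuration used in our experiments.

```python
import torch
import torch.nn.functional as F


def fgsm_attack(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
                epsilon: float = 0.03) -> torch.Tensor:
    """Single-step FGSM: perturb x along the sign of the loss gradient."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)   # classification loss on the clean labels
    loss.backward()
    # One step of size epsilon in the direction that increases the loss.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()     # keep pixels in the valid [0, 1] range
```

PGD can be viewed as iterating this step several times with a projection back onto a small neighborhood of the original input, which is why the multi-level checks described above are applied at every stage rather than only once.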

4.7. Discussion on Computational Complexities and Future Works

In large-scale deployments, our model’s computational efficiency becomes a crucial consideration, especially in high-demand settings like finance or healthcare. In our experiments, we used a multi-GPU setup to support model training and testing, which ensured that tasks like encryption and feature extraction were handled efficiently. However, in real-world production, where resources might be limited, this model’s complexity could pose challenges. The skip attention mechanism and hash tree structure, though highly effective in enhancing privacy and feature extraction, introduce some computational overhead. Distributed or cloud computing could be leveraged to distribute tasks across nodes, thereby reducing the load on individual systems and improving scalability. Furthermore, selectively applying skip attention or using shallower hash trees in specific scenarios can help reduce computational demands without compromising overall performance.
In terms of computational complexity, the skip attention mechanism optimizes traditional self-attention by using skip connections to prioritize essential features and reduce redundant calculations, which is beneficial for complex or sequential data. However, it still requires significant memory and processing power due to multi-head attention computations, especially in large-scale or real-time settings. Similarly, the hash tree structure offers rapid integrity checks but becomes resource-intensive as data volume grows, particularly with frequent insertions or deletions. To manage these costs, strategies such as incremental hash updates, parallel processing, and dynamic tree structures can be employed to localize and streamline updates. Additionally, lightweight attention modules and model compression techniques like pruning or quantization further help in reducing the model’s computational load, making it more viable for practical applications with resource limitations.
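To make the incremental hash-update strategy concrete, the following is a minimal hash-tree sketch using SHA-256; the flat-array layout and the power-of-two leaf count are simplifying assumptions for illustration, not the exact structure used in our model.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest of a byte string."""
    return hashlib.sha256(data).digest()


class MerkleTree:
    """Complete binary hash tree stored as a flat, 1-indexed array."""

    def __init__(self, leaves):
        self.n = len(leaves)                      # assumed to be a power of two
        self.nodes = [b""] * (2 * self.n)
        for i, leaf in enumerate(leaves):
            self.nodes[self.n + i] = h(leaf)      # hash each data block into a leaf
        for i in range(self.n - 1, 0, -1):
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])

    def update_leaf(self, idx: int, new_data: bytes) -> None:
        """Incremental update: rehash only the path from the changed leaf to the root."""
        i = self.n + idx
        self.nodes[i] = h(new_data)
        i //= 2
        while i >= 1:
            self.nodes[i] = h(self.nodes[2 * i] + self.nodes[2 * i + 1])
            i //= 2

    @property
    def root(self) -> bytes:
        """Root hash used to verify the integrity of the whole data set."""
        return self.nodes[1]
```

Because only the hashes on the path from the modified leaf to the root are recomputed, each update costs O(log n) hash evaluations, which is what keeps frequent insertions and deletions tractable in practice.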

5. Conclusions

This research is motivated by growing concerns about data privacy and the security of deep learning models when processing large-scale data, especially in sensitive scenarios such as finance, healthcare, and cross-border data computation. With the rapid development of deep learning technology, the central challenge in secure computing lies in ensuring data security while maintaining efficient feature extraction. This paper contributes by presenting, to our knowledge, the first combination of a hash tree with a deep learning Transformer model, aimed specifically at protecting data integrity and enhancing feature extraction. The hash tree structure allows the model to secure data during storage and transmission and to quickly detect any tampering. In addition, the paper introduces the skip attention mechanism, which reduces unnecessary computation through skip connections, improving computational efficiency while preserving the advantages of deep networks. To further improve the model's learning capability, we designed a skip loss function that optimizes feature extraction through skip connections, enhancing the model's ability to capture critical features.
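For readers who want a concrete picture of the mechanism summarized here, the following is a minimal, hypothetical sketch of an attention block in which skip (residual) connections route information around the multi-head attention and feedforward modules, in the spirit of Figure 7; it is illustrative only and does not reproduce our exact architecture or the skip loss function.

```python
import torch
import torch.nn as nn


class SkipAttentionBlock(nn.Module):
    """Illustrative attention block with skip paths around attention and the FFN."""

    def __init__(self, dim: int = 256, num_heads: int = 8, ffn_mult: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, ffn_mult * dim), nn.GELU(), nn.Linear(ffn_mult * dim, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Skip connections let features bypass the attention and feedforward modules,
        # so information flows directly between feature layers.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        x = x + self.ffn(self.norm2(x))
        return x
```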

Author Contributions

Conceptualization, Z.Z., Y.W., L.C. and C.L.; Data curation, Y.S. and M.L.; Formal analysis, T.L.; Funding acquisition, C.L.; Methodology, Z.Z., Y.W., L.C. and C.L.; Project administration, C.L.; Resources, Y.S. and M.L.; Software, Z.Z., Y.W. and L.C.; Supervision, K.X.; Validation, Y.S., T.L. and K.X.; Visualization, T.L., M.L. and K.X.; Writing—original draft, Z.Z., Y.W., L.C., Y.S., T.L., M.L., K.X. and C.L. All authors have read and agreed to the published version of the manuscript.

Funding

The authors would like to express their sincere gratitude to the Computer Association of China Agricultural University (ECC) for its valuable technical support.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We would like to express our gratitude to Tianyue Li from the School of Business Administration, Capital University of Economics and Business; Meishu Li from the Chinese Academy of Finance and Development, Central University of Finance and Economics; and Keyi Xu from the School of Foreign Languages, Beijing Forestry University, for their support.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Bharadiya, J.P. A comparative study of business intelligence and artificial intelligence with big data analytics. Am. J. Artif. Intell. 2023, 7, 24.
2. Zhang, Y.; Wa, S.; Liu, Y.; Zhou, X.; Sun, P.; Ma, Q. High-accuracy detection of maize leaf diseases CNN based on multi-pathway activation function module. Remote Sens. 2021, 13, 4218.
3. Li, D.; Han, D.; Crespi, N.; Minerva, R.; Li, K.C. A blockchain-based secure storage and access control scheme for supply chain finance. J. Supercomput. 2023, 79, 109–138.
4. Kafi, M.A.; Akter, N. Securing financial information in the digital realm: Case studies in cybersecurity for accounting data protection. Am. J. Trade Policy 2023, 10, 15–26.
5. Zhang, Y.; Wa, S.; Zhang, L.; Lv, C. Automatic plant disease detection based on tranvolution detection network with GAN modules using leaf images. Front. Plant Sci. 2022, 13, 875693.
6. Rehan, H. AI-Powered Genomic Analysis in the Cloud: Enhancing Precision Medicine and Ensuring Data Security in Biomedical Research. J. Deep. Learn. Genom. Data Anal. 2023, 3, 37–71.
7. Zhang, Y.; Yang, X.; Liu, Y.; Zhou, J.; Huang, Y.; Li, J.; Zhang, L.; Ma, Q. A time-series neural network for pig feeding behavior recognition and dangerous detection from videos. Comput. Electron. Agric. 2024, 218, 108710.
8. Mohammad, N. Enhancing Security and Privacy in Multi-Cloud Environments: A Comprehensive Study on Encryption Techniques and Access Control Mechanisms. Int. J. Comput. Eng. Technol. (IJCET) 2021, 12, 51–63.
9. Shivaramakrishna, D.; Nagaratna, M. A novel hybrid cryptographic framework for secure data storage in cloud computing: Integrating AES-OTP and RSA with adaptive key management and Time-Limited access control. Alex. Eng. J. 2023, 84, 275–284.
10. Xu, K.; Wu, Y.; Li, Z.; Zhang, R.; Feng, Z. Investigating financial risk behavior prediction using deep learning and big data. Int. J. Innov. Res. Eng. Manag. 2024, 11, 77–81.
11. Sahu, S.K.; Mokhade, A.; Bokde, N.D. An overview of machine learning, deep learning, and reinforcement learning-based techniques in quantitative finance: Recent progress and challenges. Appl. Sci. 2023, 13, 1956.
12. Ren, Y.; Huang, D.; Wang, W.; Yu, X. BSMD: A blockchain-based secure storage mechanism for big spatio-temporal data. Future Gener. Comput. Syst. 2023, 138, 328–338.
13. Suganya, M.; Sasipraba, T. Stochastic Gradient Descent Long Short-Term Memory based secure encryption algorithm for cloud data storage and retrieval in cloud computing environment. J. Cloud Comput. 2023, 12, 74.
14. Puneeth, R.; Parthasarathy, G. Security and Data Privacy of Medical Information in Blockchain Using Lightweight Cryptographic System. Int. J. Eng. 2023, 36, 925–933.
15. Lu, S.; Liu, M.; Yin, L.; Yin, Z.; Liu, X.; Zheng, W. The multi-modal fusion in visual question answering: A review of attention mechanisms. PeerJ Comput. Sci. 2023, 9, e1400.
16. Li, Q.; Ren, J.; Zhang, Y.; Song, C.; Liao, Y.; Zhang, Y. Privacy-Preserving DNN Training with Prefetched Meta-Keys on Heterogeneous Neural Network Accelerators. In Proceedings of the IEEE 2023 60th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 9–13 July 2023; pp. 1–6.
17. Zhang, Y.; Lv, C. TinySegformer: A lightweight visual segmentation model for real-time agricultural pest detection. Comput. Electron. Agric. 2024, 218, 108740.
18. Zhang, N.; Kim, J. A Survey on Attention mechanism in NLP. In Proceedings of the IEEE 2023 International Conference on Electronics, Information, and Communication (ICEIC), Singapore, 5–8 February 2023; pp. 1–4.
19. Li, X.; Li, M.; Yan, P.; Li, G.; Jiang, Y.; Luo, H.; Yin, S. Deep learning attention mechanism in medical image analysis: Basics and beyonds. Int. J. Netw. Dyn. Intell. 2023, 2, 93–116.
20. Cheng, G.; Lai, P.; Gao, D.; Han, J. Class attention network for image recognition. Sci. China Inf. Sci. 2023, 66, 132105.
21. Han, D.; Zhou, H.; Weng, T.H.; Wu, Z.; Han, B.; Li, K.C.; Pathan, A.S.K. LMCA: A lightweight anomaly network traffic detection model integrating adjusted mobilenet and coordinate attention mechanism for IoT. Telecommun. Syst. 2023, 84, 549–564.
22. Lv, Z.; Chen, D.; Cao, B.; Song, H.; Lv, H. Secure deep learning in defense in deep-learning-as-a-service computing systems in digital twins. IEEE Trans. Comput. 2023, 73, 656–668.
23. Devlin, J. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
24. Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. Gpt-4 technical report. arXiv 2023, arXiv:2303.08774.
25. Vaswani, A. Attention is all you need. arXiv 2017, arXiv:1706.03762.
26. Li, Q.; Zhang, Y.; Ren, J.; Li, Q.; Zhang, Y. You Can Use But Cannot Recognize: Preserving Visual Privacy in Deep Neural Networks. arXiv 2024, arXiv:2404.04098.
27. Li, Q.; Zhang, Y. Confidential Federated Learning for Heterogeneous Platforms against Client-Side Privacy Leakages. In Proceedings of the ACM Turing Award Celebration Conference, Changsha, China, 5–7 July 2024; pp. 239–241.
28. Zhang, P.; Dong, X.; Wang, B.; Cao, Y.; Xu, C.; Ouyang, L.; Zhao, Z.; Duan, H.; Zhang, S.; Ding, S.; et al. Internlm-xcomposer: A vision-language large model for advanced text-image comprehension and composition. arXiv 2023, arXiv:2309.15112.
29. Dong, X.; Zhang, P.; Zang, Y.; Cao, Y.; Wang, B.; Ouyang, L.; Wei, X.; Zhang, S.; Duan, H.; Cao, M.; et al. Internlm-xcomposer2: Mastering free-form text-image composition and comprehension in vision-language large model. arXiv 2024, arXiv:2401.16420.
30. Jacobs, S.A.; Tanaka, M.; Zhang, C.; Zhang, M.; Song, S.L.; Rajbhandari, S.; He, Y. Deepspeed ulysses: System optimizations for enabling training of extreme long sequence transformer models. arXiv 2023, arXiv:2309.14509.
31. Li, H.; Wang, S.X.; Shang, F.; Niu, K.; Song, R. Applications of large language models in cloud computing: An empirical study using real-world data. Int. J. Innov. Res. Comput. Sci. Technol. 2024, 12, 59–69.
32. Yao, Y.; Duan, J.; Xu, K.; Cai, Y.; Sun, Z.; Zhang, Y. A survey on large language model (llm) security and privacy: The good, the bad, and the ugly. High-Confid. Comput. 2024, 4, 100211.
33. Chen, J.; Liu, Z.; Huang, X.; Wu, C.; Liu, Q.; Jiang, G.; Pu, Y.; Lei, Y.; Chen, X.; Wang, X.; et al. When large language models meet personalization: Perspectives of challenges and opportunities. World Wide Web 2024, 27, 42.
34. Qiu, J.; Li, L.; Sun, J.; Peng, J.; Shi, P.; Zhang, R.; Dong, Y.; Lam, K.; Lo, F.P.W.; Xiao, B.; et al. Large ai models in health informatics: Applications, challenges, and the future. IEEE J. Biomed. Health Inform. 2023, 27, 6074–6087.
35. Marichamy, V.S.; Natarajan, V. Blockchain based securing medical records in big data analytics. Data Knowl. Eng. 2023, 144, 102122.
36. Richter, T.; Artzt, M. International Handbook of Blockchain Law: A Guide to Navigating Legal and Regulatory Challenges of Blockchain Technology and Crypto Assets; Kluwer Law International BV: Alphen aan den Rijn, The Netherlands, 2024.
37. Hu, S.; Lin, J.; Du, X.; Huang, W.; Lu, Z.; Duan, Q.; Wu, J. ACSarF: A DRL-based adaptive consortium blockchain sharding framework for supply chain finance. Digit. Commun. Netw. 2023.
38. Udayakumar, R.; Chowdary, P.B.K.; Devi, T.; Sugumar, R. Integrated SVM-FFNN for Fraud Detection in Banking Financial Transactions. J. Internet Serv. Inf. Secur. 2023, 13, 12–25.
39. Zheng, J.; Xin, D.; Cheng, Q.; Tian, M.; Yang, L. The Random Forest Model for Analyzing and Forecasting the US Stock Market in the Context of Smart Finance. arXiv 2024, arXiv:2402.17194.
40. Wang, J.; Hong, S.; Dong, Y.; Li, Z.; Hu, J. Predicting stock market trends using lstm networks: Overcoming RNN limitations for improved financial forecasting. J. Comput. Sci. Softw. Appl. 2024, 4, 1–7.
41. Fang, Z.; Ma, X.; Pan, H.; Yang, G.; Arce, G.R. Movement forecasting of financial time series based on adaptive LSTM-BN network. Expert Syst. Appl. 2023, 213, 119207.
42. Arias-Serrano, I.; Cruz-Varela, J.; Almeida-Galárraga, D.; Tirado-Espin, A.; Velásquez-López, P.A.; Laurido-Mora, F.C.; Villalba-Meneses, F.; Avila-Briones, L.N. Artificial Intelligence Based Glaucoma and Diabetic Retinopathy Detection Using MATLAB—Retrained AlexNet Convolutional Neural Network; Technical Report; PubMed Central (PMC): Bethesda, MD, USA, 2024.
43. Ma, L.; Wu, H.; Samundeeswari, P. GoogLeNet-AL: A Fully Automated Adaptive Model for Lung Cancer Detection. Pattern Recognit. 2024, 155, 110657.
44. Mirza, A.F.; Mansoor, M.; Usman, M.; Ling, Q. Hybrid Inception-embedded deep neural network ResNet for short and medium-term PV-Wind forecasting. Energy Convers. Manag. 2023, 294, 117574.
45. Talukder, M.A.; Layek, M.A.; Kazi, M.; Uddin, M.A.; Aryal, S. Empowering COVID-19 detection: Optimizing performance through fine-tuned efficientnet deep learning architecture. Comput. Biol. Med. 2024, 168, 107789.
46. Cortes, C. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
47. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
48. Zaremba, W. Recurrent neural network regularization. arXiv 2014, arXiv:1409.2329.
49. Hochreiter, S. Long Short-Term Memory. In Neural Computation; MIT-Press: Cambridge, MA, USA, 1997.
50. Pan, X.; Ge, C.; Lu, R.; Song, S.; Chen, G.; Huang, Z.; Huang, G. On the integration of self-attention and convolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 815–825.
51. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
52. Rijmen, V.; Daemen, J. Advanced encryption standard. Proc. Fed. Inf. Process. Stand. Publ. Natl. Inst. Stand. Technol. 2001, 19, 22.
53. Costache, A.; Smart, N.P. Homomorphic encryption without gaussian noise. Cryptol. ePrint Arch. 2017, 163, 1–24.
54. Umar, H.G.A.; Aoun, M.; Kaleem, M.A.; Rehman, S.U.; khan, M.Z.; Younis, M.; jamil, M. Cryptographic Analysis of Blur-Based Encryption an in depth examination of resilience against various attack vectors. Res. Sq. 2023.
55. Ismail, S.M.; Said, L.A.; Radwan, A.G.; Madian, A.H.; Abu-ElYazeed, M.F. A novel image encryption system merging fractional-order edge detection and generalized chaotic maps. Signal Process. 2020, 167, 107280.
Figure 1. Visualization of hash algorithms.
Figure 2. Visualization of attention mechanisms.
Figure 3. Image dataset augmentation methods: (A) is CutOut; (B) is CutMix; (C) is Mosaic.
Figure 4. Hash-tree-based transformer model.
Figure 5. Hash tree node-insertion structure diagram: This figure shows the process of node insertion within the hash tree structure. After a new node is inserted, the hash values of the relevant subtrees are updated sequentially, ultimately reflecting in the root node's hash value. This process ensures data integrity, as any change to a node will be mirrored throughout the hash tree structure, thereby guaranteeing data security and verifiability.
Figure 6. Hash tree node-deletion structure diagram: This figure illustrates the node-deletion process within the hash tree structure. When a node is removed, the hash values of the associated subtrees are correspondingly updated up to the root node. This process ensures the consistency of hash values across the entire hash tree, preserving data integrity and verifiability after the deletion operation.
Figure 7. Skip attention structure diagram: This figure illustrates the basic structure of the proposed skip attention mechanism, which introduces skip connections to optimize information flow and computational efficiency within the model. The structure includes multi-head attention and feedforward network modules, where skip connections enable direct information transfer between feature layers, effectively reducing redundant computations and allowing the model to more efficiently extract and utilize key information.
Figure 8. ROCs of different models.
Figure 9. ROCs of different models.
Figure 10. Encryption effect visualization.
Table 1. Performance of stock price dataset on various models.
Model | Precision | Recall | Accuracy | F1-Score | Memory (MB) | FPS
SVM [46] | 0.84 | 0.80 | 0.82 | 0.82 | - | 14,168
Random Forest [47] | 0.85 | 0.82 | 0.83 | 0.83 | - | 13,624
RNN [48] | 0.90 | 0.87 | 0.88 | 0.88 | 43 | 9103
LSTM [49] | 0.93 | 0.88 | 0.91 | 0.90 | 67 | 9857
Proposed Method | 0.94 | 0.89 | 0.92 | 0.91 | 381 | 6891
Table 2. Performance of electronic health records on various models.
Model | Precision | Recall | Accuracy | F1-Score | Memory (MB) | FPS
SVM [46] | 0.86 | 0.81 | 0.83 | 0.83 | - | 14,168
Random Forest [47] | 0.88 | 0.85 | 0.87 | 0.86 | - | 13,631
RNN [48] | 0.90 | 0.87 | 0.88 | 0.88 | 46 | 9124
LSTM [49] | 0.92 | 0.88 | 0.90 | 0.90 | 61 | 9859
Proposed Method | 0.93 | 0.89 | 0.91 | 0.91 | 383 | 6899
Table 3. Performance of various models on the Brain CT dataset.
Model | Precision | Recall | Accuracy | F1-Score | Memory (MB) | FPS
AlexNet [42] | 0.81 | 0.78 | 0.80 | 0.79 | 240 | 58
Inception v1 [43] | 0.84 | 0.81 | 0.83 | 0.82 | 27 | 52
ResNet50 [44] | 0.87 | 0.85 | 0.86 | 0.86 | 98 | 49
EfficientNet-B0 [45] | 0.90 | 0.88 | 0.89 | 0.89 | 29 | 37
Proposed Method | 0.92 | 0.89 | 0.91 | 0.90 | 381 | 24
Table 4. Performance of various models on the CIFAR-10 dataset.
Model | Precision | Recall | Accuracy | F1-Score | Memory (MB) | FPS
AlexNet [42] | 0.98 | 0.96 | 0.97 | 0.97 | 238 | 57
Inception v1 [43] | 0.98 | 0.97 | 0.98 | 0.97 | 27 | 52
ResNet50 [44] | 0.99 | 0.98 | 0.98 | 0.98 | 98 | 47
EfficientNet-B0 [45] | 0.99 | 0.99 | 0.99 | 0.99 | 27 | 38
Proposed Method | 0.99 | 0.99 | 0.99 | 0.99 | 385 | 26
Table 5. Performance of various models on the ImageNet dataset.
Model | Precision | Recall | Accuracy | F1-Score | Memory (MB) | FPS
AlexNet [42] | 0.90 | 0.88 | 0.89 | 0.89 | 242 | 58
Inception v1 [43] | 0.91 | 0.89 | 0.90 | 0.90 | 29 | 54
ResNet [44] | 0.93 | 0.91 | 0.92 | 0.92 | 95 | 48
EfficientNet-B0 [45] | 0.94 | 0.92 | 0.93 | 0.93 | 31 | 37
Proposed Method | 0.95 | 0.93 | 0.94 | 0.94 | 385 | 24
Table 6. Results of the ablation study on different attention mechanisms.
Model | Precision | Recall | Accuracy | F1-Score
Standard Self-Attention [50] | 0.72 | 0.70 | 0.71 | 0.71
Convolutional Block Attention Module [51] | 0.85 | 0.82 | 0.83 | 0.83
Skip Attention | 0.94 | 0.89 | 0.92 | 0.91
Table 7. Quantitative comparison of encryption methods using PSNR and SSIM metrics. Higher PSNR and SSIM values indicate better preservation of critical information while ensuring data security.
Encryption Method | PSNR (dB) | SSIM
Proposed Method | 28.5 | 0.72
Standard Encryption (AES) [52] | 18.9 | 0.60
Gaussian Noise Encryption [53] | 12.5 | 0.45
Blur-Based Encryption [54] | 15.2 | 0.51
Edge Masking Encryption [55] | 22.3 | 0.68