Feature-Attended Federated LSTM for Anomaly Detection in the Financial Internet of Things
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The topic of the paper is relevant, and the proposed work is suitable for this journal. However, it is not clear to me if the proposed work is original and if the paper has cited all relevant references. There are similar studies, and some sections overlap with the following paper:
"Feature-Attended Multi-Flow LSTM for Anomaly Detection in Internet of Things"
Although the authors thanked the authors of this paper, they do not seem to cite this paper. If this work builds on previous work, the previous study should be cited.
Some content of the paper seems to be borrowed from this source with minimal changes, such as the section on Model Training (4.1), and some text is taken with very few changes. For example, the following text is very similar:
"Long short-term memory (LSTM) [5], which is a special recurrent neural network (RNN) [6], has impressive performance in long sequence inputs, and remarkable application advantages in serialization scenarios. Consequently, LSTM has received extensive attention from researchers in anomaly detection. D. Wu et al. [7] combined LSTM with a Gaussian Bayes model for anomaly detection by utilizing the predictive error. Y. Liu et al. [8] proposed a novel LSTM model to detect anomalies, which uses an attention mechanism-based convolutional neural network (CNN) to capture fine-grained features, and trains LSTM for prediction. These methods fully utilize the time correlation of IoT with LSTM."
Furthermore, the results (F1 score) from the UNSW-NB15 dataset in this paper appear to be around the same values.
A quick search on Google Scholar reveals several other studies that have used the UNSW-NB15 dataset for federated LSTM anomaly detection. These include the "Cyber Threat Intelligence Sharing Scheme" and "Federated Deep Learning for Anomaly Detection in the Internet of Things." It is unclear how these previous studies differ from the current one. Based on this information, the novelty of this paper is not clear to me.
Comments on the Quality of English Language
The quality of the English language is adequate.
Author Response
We express our gratitude to the reviewers and editors for their valuable questions, comments and suggestions. We have read the reviewers’ comments carefully and revised the paper accordingly. In what follows, we give our detailed reply to each comment and also a description of the changes we have made in the manuscript for each comment.
In the manuscript, the revised contents are highlighted in red color.
Reviewer 1
Question 1: There are similar studies, and some sections overlap with the following paper:
"Feature-Attended Multi-Flow LSTM for Anomaly Detection in Internet of Things"
Although the authors thanked the authors of this paper, they do not seem to cite this paper. If this work builds on previous work, the previous study should be cited.
Some content of the paper seems to be borrowed from this source with minimal changes, such as the section on Model Training (4.1), and some text is taken with very few changes. For example, the following text is very similar: “Long short-term memory (LSTM) [5], which is …with LSTM.”
Furthermore, the results (F1 score) from the UNSW-NB15 dataset in this paper appear to be around the same values.
Response:
We would like to thank the reviewer for this comment. As stated at the beginning of the manuscript, this paper is an extended version of our paper published in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2022, pp. 1-6. To make this clear and avoid misunderstanding, we have made the following revisions:
(1) We have added the paper title into the conference information. For the reviewer’s convenience, the revised contents are listed below:
This paper is an extended version of our paper published in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2022, Feature-Attended Multi-Flow LSTM for Anomaly Detection in Internet of Things, pp. 1-6.
(2) We have rewritten the repeated contents with new descriptions that highlight the new ideas of this manuscript. For the reviewer’s convenience, some of the revised contents are listed below (for the complete set of revisions, please refer to the highlighted contents in the revised manuscript):
Line 1-6: Recent years have witnessed the fast development of the Financial Internet of Things (FIoT), which integrates the Internet of Things (IoT) into financial activities. At the same time, FIoT is facing an increasing number of stealthy network attacks. Long short-term memory (LSTM) can be used as an anomaly detection method to perceive such attacks, since LSTM specializes in discovering anomalous behaviours through the time correlation in FIoT traffic. However, current LSTM-based anomaly detection schemes have not considered the specific correlations among the features of the traffic as a whole.
Line 9-10: In this paper, we propose a feature attended federated LSTM (FAF-LSTM) for FIoT to address the above issues.
Line 15-16: Simulations are conducted to verify the effect of FAF-LSTM. The results show FAF-LSTM has good performance in anomaly detection.
Line 33-39: At present, there are many studies on anomaly detection in IoT, but few are specialized for FIoT. Literature [5] proposes an anomaly detection scheme based on fuzzy theory. In [4], principal component analysis (PCA) is used to detect abnormal behaviors in aging Industrial IoT. Squeezed Convolutional Variational Auto Encoder (SCVAE) for anomaly detection in Industrial IoT is proposed in [6]. Based on an adaptive learning rate and momentum, a method for trustworthy network anomaly detection is proposed in [7]. However, these methods ignore the time correlation of the network traffic, and have limitations in anomaly detection performance.
Line 71-75: We apply the federated learning architecture to solve problems such as the lack of training data faced by single detection devices, and to enhance the synergy of detection devices. According to the traffic characteristics of each detecting node, the correlation is analyzed, and the parameter aggregation strategy in cooperative training is optimized to improve the detection models.
For further revisions, please refer to the highlighted contents in the revised manuscript.
Question 2: A quick search on Google Scholar reveals several other studies that have used the UNSW-NB15 dataset for federated LSTM anomaly detection. These include the "Cyber Threat Intelligence Sharing Scheme" and "Federated Deep Learning for Anomaly Detection in the Internet of Things." It is unclear how these previous studies differ from the current one. Based on this information, the novelty of this paper is not clear to me.
Response:
We are grateful for the reviewer's comment. In this paper, we propose a feature attended federated LSTM (FAF-LSTM) for FIoT to address the above issues. FAF-LSTM combines feature attended LSTM and federated learning to make full use of the deep correlation in data, and to enhance the accuracy of the trained model via cooperation among different detecting nodes. In FAF-LSTM, the features are grouped so that the model can learn the time-spatial correlation within the flows of each group as well as their impact on the output. Meanwhile, the parameter aggregation is optimized based on feature correlation analysis.
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript addresses the topic of anomaly detection in the Financial Internet of Things (FIoT). Anomaly detection plays a crucial role in maintaining the security and integrity of financial transactions, and this manuscript proposes a novel approach called Feature-Attended Federated LSTM (FAF-LSTM).
Additionally, the manuscript demonstrates the effectiveness of the proposed methodology using the UNSW-NB15 dataset. This dataset includes the latest attack data, making it ideal for detecting modern attacks. The experimental results show that the proposed methodology outperforms traditional independent LSTM and federated averaging algorithms.
Overall, this manuscript is well-written and the comparative evaluation is well-structured. However, there are some weaknesses that, if addressed, could improve the manuscript:
- While the manuscript proposes a novel approach to anomaly detection in the Financial Internet of Things, it is necessary to verify the effectiveness of the proposed method with other types of datasets and in different environments.
- Besides the proposed algorithm, it would be beneficial to apply traditional methods from related research, such as CNN and SVM. It is questionable whether LSTM indeed provides the best performance. It is recommended to apply other models capable of federated learning.
- The references for Independent LSTM and Fed-Avg LSTM used for comparison are not provided. While Independent LSTM might be one of the studies mentioned in Chapter 2, having a precise reference would be helpful. Furthermore, a detailed explanation of Fed-Avg LSTM and its methodology should be added to Chapter 2.
Author Response
We express our gratitude to the reviewers and editors for their valuable questions, comments and suggestions. We have read the reviewers’ comments carefully and revised the paper accordingly. In what follows, we give our detailed reply to each comment and also a description of the changes we have made in the manuscript for each comment.
In the manuscript, the revised contents are highlighted in red color.
Reviewer 2
Question 1: While the manuscript proposes a novel approach to anomaly detection in the Financial Internet of Things, it is necessary to verify the effectiveness of the proposed method with other types of datasets and in different environments.
Response:
We are very grateful for this suggestion. We fully agree that testing the effectiveness of the proposed FAF-LSTM approach on additional datasets and in different environments would provide a broader validation of our methodology. Owing to the limited revision period (5 days) allowed by Applied Sciences in this round, we are unable to include this in the revision. We consider this an important direction for our future work and plan to incorporate multiple datasets and environments to further substantiate our findings. We have added the corresponding description to “Section 7 Conclusion” of the revised manuscript.
For the reviewer’s convenience, the added contents are listed below:
Line 433-436: For future work, the performance of FAF-LSTM can be verified across different datasets and environments. Other models, such as CNN or SVM, can also be combined with federated learning and compared with FAF-LSTM.
Question 2: Besides the proposed algorithm, it would be beneficial to apply traditional methods from related research, such as CNN and SVM. It is questionable whether LSTM indeed provides the best performance. It is recommended to apply other models capable of federated learning.
Response:
We are grateful for the reviewer's valuable feedback. Again, owing to the tight timeline of the manuscript revision and the computational resources required for such comparisons, we could not include these analyses in the revision. However, we aim to explore these comparisons in our subsequent studies to provide a comprehensive evaluation of FAF-LSTM against other established models capable of federated learning.
Therefore, we have added the corresponding description to “Section 7 Conclusion” of the revised manuscript. For the reviewer’s convenience, the added contents are listed below:
Line 433-436: For future work, the performance of FAF-LSTM can be verified across different datasets and environments. Other models, such as CNN or SVM, can also be combined with federated learning and compared with FAF-LSTM.
Question 3: The references for Independent LSTM and Fed-Avg LSTM used for comparison are not provided. While Independent LSTM might be one of the studies mentioned in Chapter 2, having a precise reference would be helpful. Furthermore, a detailed explanation of Fed-Avg LSTM and its methodology should be added to Chapter 2.
Response:
We are thankful for the reviewer's helpful and detailed comments. In traditional federated learning, the parameter aggregator collects the parameters of the trained model from each distributed training node and sets the average value of each parameter as the parameter value in the converged model. Therefore, traditional federated learning can be abbreviated as Fed-Avg. The method combining traditional federated learning and LSTM can be abbreviated as Fed-Avg LSTM.
We have added the corresponding description in Chapter 2, and included a precise reference for the Fed-Avg LSTM methodology. For the reviewer’s convenience, the added contents are listed below:
Line 133-137: H. Brendan et al. [24] combined Federated Learning and LSTM to train a global model in multi-node networks. In Federated Learning, the parameter aggregator collects the parameters of the trained model from each distributed training node and sets the average value of each parameter as the parameter value in the converged model. Therefore, traditional federated learning can be abbreviated as Fed-Avg.
Line 397-398: the Fed-Avg LSTM [24] models get the accuracy of 0.2287, 0.2225, 0.2170 and 0.2209 …
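The unweighted parameter averaging described in the response to Question 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fed_avg`, the dictionary-of-lists parameter layout, and the toy values are assumptions made for illustration, and weighting by local sample count, which some formulations of federated averaging use, is omitted.

```python
from typing import Dict, List

# Hypothetical layout: each node reports its model as {parameter name: flat list of values}.
Params = Dict[str, List[float]]


def fed_avg(node_params: List[Params]) -> Params:
    """Return the element-wise, unweighted mean of every parameter across nodes."""
    if not node_params:
        raise ValueError("at least one node is required")
    n = len(node_params)
    averaged: Params = {}
    for name in node_params[0]:
        # zip(*...) lines up the i-th value of this parameter from every node.
        columns = zip(*(p[name] for p in node_params))
        averaged[name] = [sum(col) / n for col in columns]
    return averaged


# Toy example with two detecting nodes (values are made up).
nodes = [
    {"w": [0.0, 2.0], "b": [1.0]},
    {"w": [2.0, 4.0], "b": [3.0]},
]
global_params = fed_avg(nodes)
print(global_params)
```

In this sketch the aggregator treats every node equally; a correlation-based aggregation such as the one the manuscript proposes would replace the plain mean with node-specific weights.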
Reviewer 3 Report
Comments and Suggestions for Authors
General reviews
- The manuscript provides a valuable section on related work, but I have noticed that it needs to include a specific discussion section. The discussion section would significantly enrich the manuscript by comparing the described model with related works.
- The article mentions using different metrics, such as precision and AUC, in various sections. However, it would greatly benefit the clarity and coherence of the work to include a specific methodology section detailing all the metrics that will be used to evaluate the models.
Introduction
- Could you elaborate on why data isolation and data shortage problems are particularly relevant in FIoT and how federated learning addresses these issues more effectively than other approaches?
Simulation
- It would be valuable to compare various datasets relevant to anomaly detection, such as ISCXTor2016, IoT-23, etc., to support the selection of the UNSW-NB15 dataset.
Conclusion
- I suggest including future research or studies based on the current investigation to broaden the scope and depth of the study.
Author Response
We express our gratitude to the reviewers and editors for their valuable questions, comments and suggestions. We have read the reviewers’ comments carefully and revised the paper accordingly. In what follows, we give our detailed reply to each comment and also a description of the changes we have made in the manuscript for each comment.
In the manuscript, the revised contents are highlighted in red color.
Reviewer 3
Question 1: The manuscript provides a valuable section on related work, but I have noticed that it needs to include a specific discussion section. The discussion section would significantly enrich the manuscript by comparing the described model with related works.
Response:
We are grateful for this helpful suggestion. We have divided Section 2 into three sub-sections to make the description clearer and more logical, added more discussion to “Section 2 Related Work”, and highlighted the innovation of our proposed model in “Section 1 Introduction”.
For the reviewer’s convenience, the revised contents are listed below:
Line 63-66: We propose the architecture of FAF-LSTM, so as to improve the utilization of the time correlations, the coordination among detection nodes, and the adaptability to dynamically changing environments. FAF-LSTM is composed of the feature-attended LSTM and the correlation-based federated learning.
Line 71-75: We apply the federated learning architecture to solve problems such as the lack of training data faced by single detection devices, and to enhance the synergy of detection devices. According to the traffic characteristics of each detecting node, the correlation is analyzed, and the parameter aggregation strategy in cooperative training is optimized to improve the detection models.
Line 125-128: However, in these methods, the time correlations within each feature (or a feature group including several features) are ignored. As a result, an important characteristic of IoT traffic, which may greatly improve the anomaly detection performance, is not utilized.
Line 133-137: H. Brendan et al. [24] combined Federated Learning and LSTM to train a global model in multi-node networks. In Federated Learning, the parameter aggregator collects the parameters of the trained model from each distributed training node and sets the average value of each parameter as the parameter value in the converged model. Therefore, traditional federated learning can be abbreviated as Fed-Avg.
Question 2: The article mentions using different metrics, such as precision and AUC, in various sections. However, it would greatly benefit the clarity and coherence of the work to include a specific methodology section detailing all the metrics that will be used to evaluate the models.
Response:
We are grateful for the reviewer's valuable suggestion. We have added a separate sub-section, “Section 6.2. Evaluation Indicators”, to explain the meaning of the evaluation indicators.
For the reviewer’s convenience, the added contents are listed below:
Line 378-393: In order to verify the anomaly detection results of the proposed algorithm, three classification metrics are used in the simulation. The first is accuracy, denoted as $A$, which directly measures the overall correctness, including the correct identification of both anomalies and normal instances. The second metric is the area under curve (AUC), representing the area under the receiver operating characteristic (ROC) curve, which assesses the algorithm's ability to discriminate between classes at various thresholds. A higher AUC indicates better algorithm performance, which is crucial for effective anomaly detection in dynamic scenarios. The last metric is the true positive rate, which indicates the proportion of actual anomalies correctly identified by the algorithm. This is crucial for financial systems, where missing an anomaly can have severe consequences. The accuracy is calculated as follows:
$$A = \frac{TP + TN}{TP + FP + FN + TN} \qquad (21)$$
where $TP$ represents the number of correctly classified target samples, $TN$ represents the number of correctly classified other samples, $FP$ represents the number of incorrectly identified target samples and $FN$ represents the number of target samples that were missed.
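As a minimal sketch of how the three indicators quoted above could be computed, the following example derives the confusion counts, the accuracy of Equation (21), the true positive rate, and a pairwise (rank-based) estimate of AUC. The function names, the convention that label 1 marks an anomaly, and the toy labels and scores are assumptions for illustration only; they are not taken from the manuscript or its experiments.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN), assuming label 1 = anomaly (target) and 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy A = (TP + TN) / (TP + FP + FN + TN)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def true_positive_rate(tp: int, fn: int) -> float:
    """Share of actual anomalies that were detected: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): true labels, thresholded predictions, and anomaly scores.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]
scores = [0.9, 0.4, 0.2, 0.3, 0.8, 0.6]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), true_positive_rate(tp, fn), auc(y_true, scores))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a small illustration; a library routine would normally be preferred for a dataset the size of UNSW-NB15.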
Question 3: Could you elaborate on why data isolation and data shortage problems are particularly relevant in FIoT and how federated learning addresses these issues more effectively than other approaches?
Response:
We are thankful for the reviewer's helpful suggestion. We have added the description in “Section 1 Introduction”. For the reviewer’s convenience, the revised contents are listed below:
Line 46-50: At the same time, financial entities, such as persons, banks, or related companies, have stricter rules on using the data generated from their FIoT systems. They cannot freely exchange data to train more powerful anomaly detection models. Anomaly detection in FIoT therefore faces problems such as data isolation and data shortage during independent training. This may impair the accuracy of the obtained models.
Question 4: It would be valuable to compare various datasets relevant to anomaly detection, such as ISCXTor2016, IoT-23, etc., to support the selection of the UNSW-NB15 dataset.
Response:
We are very grateful for this suggestion. We fully agree that comparing various datasets relevant to anomaly detection, such as ISCXTor2016, IoT-23, etc., would provide a broader validation of our methodology. Owing to the limited revision period (5 days) allowed by Applied Sciences in this round, we are unable to include this in the revision. We consider this an important direction for our future work and plan to incorporate multiple datasets to further substantiate our findings. We have added the corresponding description to “Section 7 Conclusion” of the revised manuscript.
For the reviewer’s convenience, the added contents are listed below:
Line 434-437: For future work, the performance of FAF-LSTM can be verified across different datasets and environments. Other models, such as CNN or SVM, can also be combined with federated learning and compared with FAF-LSTM.
Question 5: I suggest including future research or studies based on the current investigation to broaden the scope and depth of the study.
Response:
We are very grateful for this helpful suggestion. We have added the future work in Section 7 (Conclusion) of the revised manuscript.
For the reviewer’s convenience, the added contents are listed below:
Line 434-437: For future work, the performance of FAF-LSTM can be verified across different datasets and environments. Other models, such as CNN or SVM, can also be combined with federated learning and compared with FAF-LSTM.
Reviewer 4 Report
Comments and Suggestions for Authors
The manuscript proposes feature-attended federated LSTM for FIoT to enhance anomaly detection performance by integrating feature-attended LSTM and federated learning. It is a well-written, in-depth literature review; the methodological choice is appropriate. Moreover, the results were presented sufficiently, and a logical conclusion was drawn in line with the results.
While the manuscript presented a good coverage of the proposed FAF-LSTM method, the following aspects need to be attended to:
1. A thorough proofreading should be carried out to correct typos and grammatical errors.
2. Better insights into FAF-LSTM’s advantages and potential limitations could be revealed if comparisons with other state-of-the-art anomaly detection methods beyond traditional federated learning-based LSTM were done. If possible, the authors should consider this path.
Author Response
We express our gratitude to the reviewers and editors for their valuable questions, comments and suggestions. We have read the reviewers’ comments carefully and revised the paper accordingly. In what follows, we give our detailed reply to each comment and also a description of the changes we have made in the manuscript for each comment.
In the manuscript, the revised contents are highlighted in red color.
Reviewer 4
Question 1: A thorough proofreading should be carried out to correct typos and grammatical errors.
Response:
We would like to thank the reviewer for this comment. We have thoroughly proofread the manuscript, corrected typos and grammatical errors, and improved some descriptions.
Question 2: Better insights into FAF-LSTM’s advantages and potential limitations could be revealed if comparisons with other state-of-the-art anomaly detection methods beyond traditional federated learning-based LSTM were done. If possible, the authors should consider this path.
Response:
We are grateful for this valuable comment. We fully agree that comparing FAF-LSTM with other state-of-the-art anomaly detection methods would provide a broader validation of our methodology. However, owing to the tight timeline (5 days) of the manuscript revision and the computational resources required for such comparisons, we could not include these analyses in this revision. We aim to explore these comparisons in our subsequent studies to provide a comprehensive evaluation of FAF-LSTM against other state-of-the-art anomaly detection methods.
Therefore, we have added the corresponding description to “Section 7 Conclusion” of the revised manuscript. For the reviewer’s convenience, the added contents are listed below:
Line 434-437: For future work, the performance of FAF-LSTM can be verified across different datasets and environments. Other models, such as CNN or SVM, can also be combined with federated learning and compared with FAF-LSTM.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
My comments have been addressed.