WER-Net: A New Lightweight Wide-Spectrum Encoding and Reconstruction Neural Network Applied to Computational Spectrum
Round 1
Reviewer 1 Report
The use of CS or deep learning to compress data is not a new topic, so what are the specific problems addressed in this paper? What are the particularities of the computational spectrum?
- The rationality of the model design needs to be further clarified.
- Is the training data actually collected? How is the training data obtained? What objects are covered by the training data?
- The spectral curve of the object in Fig. 8 changes gently; is the method effective for objects whose spectral curves change sharply?
- The results should be compared to the best available methods, including deep learning-based methods.
Author Response
Thank you for your comments concerning our manuscript entitled “WER-Net: A New Lightweight Wide-Spectrum Encoding and Reconstruction Neural Network Applied to Computational Spectrum”. Those comments are valuable and very helpful. We have read through the comments carefully and have made corrections. Revisions in the text are shown using highlight for additions. The responses to the comments are presented below.
Point 1: The rationality of the model design needs to be further clarified.
R: We agree. We noticed that the article lacked a specific explanation of this issue, so we have revised the article at lines 95-105.
WER-Net is composed of an encoding network and a reconstruction network. On the one hand, for the encoding network, since both the spectral sampling process of the wide-spectrum encoding filters and the forward operation of a fully connected layer without bias are matrix multiplications, a single fully connected layer without bias can be used to simulate the wide-spectrum encoding filters and obtain the corresponding spectral transmittance curves. On the other hand, the reconstruction network simulates the process of solving the CS problem. According to the universal approximation theorem, the NP-hard CS reconstruction problem can be approximated by a neural network with more than three layers. After adjusting the network structure to balance accuracy and efficiency, we decided to use two fully connected layers and three convolutional layers to realize the reconstruction process.
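As an illustration of this structure, the following minimal PyTorch sketch shows one way the described layers could be arranged, assuming a 400-700 nm spectrum sampled at a 2 nm step (151 bands); the layer widths, kernel sizes, and activation choices are our own assumptions for illustration, not the exact configuration of WER-Net.

import torch
import torch.nn as nn

class WERNetSketch(nn.Module):
    # Minimal sketch: a bias-free fully connected encoder followed by a
    # 2-FC + 3-conv reconstructor. All sizes are illustrative assumptions.
    def __init__(self, n_bands=151, n_filters=16):
        super().__init__()
        # Encoder: one fully connected layer without bias, whose weight
        # matrix plays the role of the filters' spectral transmittance curves.
        self.encoder = nn.Linear(n_bands, n_filters, bias=False)
        # Reconstructor part 1: two fully connected layers.
        self.fc = nn.Sequential(
            nn.Linear(n_filters, n_bands), nn.ReLU(),
            nn.Linear(n_bands, n_bands), nn.ReLU(),
        )
        # Reconstructor part 2: three convolutional layers.
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 1, kernel_size=3, padding=1),
        )

    def forward(self, spectrum):              # spectrum: (batch, n_bands)
        measurement = self.encoder(spectrum)  # simulated filter sampling
        x = self.fc(measurement)
        x = self.conv(x.unsqueeze(1))         # (batch, 1, n_bands)
        return x.squeeze(1)                   # reconstructed spectrum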
Point 2: Is the training data actually collected? How is the training data obtained? What objects are covered by the training data?
R: For the first question, our original training datasets are the open-source CAVE and ICVL datasets, both of which contain real spectral data and are the most commonly used and authoritative datasets in the 400-700 nm band. For the second question, our training dataset is built from the 10 nm-step spectra of CAVE and ICVL, to which a least-squares fit is applied to obtain a simulated 2 nm high-resolution version of the real data. Since simulating real data is not convincing enough on its own, in the experimental stage we used the Ispecfield-HH spectrometer, a direct spectral detection instrument, to capture nearly a thousand real spectra at a 2 nm step to verify that WER-Net can achieve 2 nm resolution. For the third question, the CAVE and ICVL datasets used in training contain fruits, figures, paintings, furniture, and other rich objects. The data captured by the spectrometer during the verification phase include lawns, playgrounds, oceans, mountains, and other objects that are rich in nature.
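To make the upsampling step concrete, here is a minimal sketch of a least-squares polynomial fit that maps a 10 nm-step spectrum onto a 2 nm grid; the polynomial degree and the wavelength normalization are our assumptions for illustration, and the paper's exact fitting procedure may differ.

import numpy as np

# Illustrative 10 nm-step spectrum over 400-700 nm (31 samples), standing in
# for a CAVE/ICVL band vector.
wl_coarse = np.arange(400, 701, 10)
spectrum_coarse = np.random.rand(wl_coarse.size)

# Least-squares polynomial fit on normalized wavelengths (for conditioning),
# then evaluation on a 2 nm grid (151 samples). The degree is an assumption.
x_coarse = (wl_coarse - 550.0) / 150.0
coeffs = np.polyfit(x_coarse, spectrum_coarse, deg=8)

wl_fine = np.arange(400, 701, 2)
spectrum_fine = np.polyval(coeffs, (wl_fine - 550.0) / 150.0)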
Point 3: The spectral curve of the object in Fig. 8 changes gently; is the method effective for objects whose spectral curves change sharply?
R: Firstly, the WER-Net proposed in this paper is developed for a wide-spectrum encoding spectrometer intended for real-world exploration. In the real world, spectra in the 400-700 nm band are mostly gentle and continuous, while the spectral information of the first lawn in Fig. 8 is data we collected that has very large variation in its spectral curve, and WER-Net also performs very well in recovering it. Nevertheless, in response to your question, we selected spectral curves in CAVE and ICVL with very sharp variations and a large xmax-xmin, and the test results prove that WER-Net also achieves very good results, as shown below.
Point 4: The results should be compared to the best available methods, including deep learning-based methods.
R: We supplement this issue at lines 317-321 and lines 334-336 by adding to Table 2 the results of a comparison with the network in reference [8]. Reference [8] also adopts a neural network; the difference between our method and reference [8] is that we adopt convolutional layers, which have fewer parameters.
Author Response File: Author Response.pdf
Reviewer 2 Report
In this paper, a neural network named WER-Net is proposed, which includes quantitative spectral-transmittance encoding by optical filters and fast spectral reconstruction of the encoded spectral information. The architecture, methodology, and training procedure of this network are described, and experimental results are reported. Some questions are as follows:
1. To obtain a higher spectral resolution, an interpolation method (such as least-squares fitting) is used. In this process, there should be a contradiction between accuracy and efficiency; how is this contradiction resolved in this paper?
2. In the dataset augmentation, a total of 165,000 spectral data are used, and the spectra range from 400 to 700 nm. To obtain better test results, how many training samples are usually used? Is the training and testing effect of the network sensitive to the wavelength?
3. From Table 1 in Section 4.1, can you summarize the impact of noise on the performance of WER-Net?
Author Response
Thank you for your comments concerning our manuscript entitled “WER-Net: A New Lightweight Wide-Spectrum Encoding and Reconstruction Neural Network Applied to Computational Spectrum”. Those comments are valuable and very helpful. We have read through the comments carefully and have made corrections. Revisions in the text are shown using highlight for additions. The responses to the comments are presented below.
Point 1: To obtain a higher spectral resolution, an interpolation method (such as least-squares fitting) is used. In this process, there should be a contradiction between accuracy and efficiency; how is this contradiction resolved in this paper?
R: This project aims to develop a 2 nm-resolution spectrometer, but at this point the data we have from CAVE and ICVL all have a 10 nm-step resolution. Given the nature of the network architecture, we had to find a way to obtain 2 nm-resolution data. Therefore, we chose the interpolation method to obtain the "simulated 2 nm step-size data" mentioned in the article. We agree that researchers are concerned about the efficiency of the interpolation process. As mentioned in the paper, we used the simulated 2 nm step data as the training and testing datasets. To verify the performance, we used nearly a thousand independent 2 nm step-size spectra for independent Monte Carlo experiments, and the recovery performance of the network was very good. Thus, we demonstrate that WER-Net is robust when trained on a dataset that is not very accurate but is rich enough.
Point 2: In the dataset augmentation, a total of 1,650,000 spectral data are used, and the spectra range from 400 to 700 nm. To obtain better test results, how many training samples are usually used? Is the training and testing effect of the network sensitive to the wavelength?
R: For the first question, we tried using some smaller training datasets, but we encountered slow convergence and low accuracy, and the reconstructed spectral curves had many small oscillations. So we decided to use the whole CAVE and ICVL databases. WER-Net is trained with a 10:1 ratio of training data to test data. In this way, by observing the training and testing errors during training, we can fully ensure that good convergence is obtained and that overfitting does not occur. For the omission of the ratio of training set to test set in the paper, we have added it at line 259.
For the second question, we did not find any sensitivity to wavelength. But one thing is certain: if the training data fall in the 400-700 nm range, then the network can only be applied to that spectral range. For instance, through the experimental verification in the 400-700 nm band, WER-Net completes the encoding and reconstruction task very well. When the wavelength range is changed, the physical meaning of the input, the output, and the first fully connected layer without bias changes for WER-Net, but the core of the algorithm does not change. If we want a WER-Net for 700-1000 nm, we just use a 700-1000 nm dataset during training.
Point 3: From Table 1 in Section 4.1, can you summarize the impact of noise on the performance of WER-Net?
R: Like the reviewer, we anticipated that noise would affect the network's performance, so we put a lot of thought into this network, such as the engineered loss function and the hierarchical optimization.
After implementing the above measures, although each index decreases slightly as the noise increases, it does not decrease much when the noise is within a certain range. WER-Net has strong robustness and can tolerate, within a certain range, the filter errors arising from the inverse design and fabrication process, which is also an important reason why WER-Net can be applied in real engineering. We have added the omitted discussion at lines 312-315.
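To illustrate the kind of robustness check we mean, the sketch below perturbs the encoded measurements with Gaussian noise before reconstruction and reports the resulting MSE; the noise model, the noise levels, and the reuse of the WERNetSketch structure from our earlier sketch are assumptions for illustration, not the exact procedure behind Table 1.

import torch

def evaluate_under_noise(model, spectra, noise_std):
    # Add Gaussian noise to the encoder output and measure reconstruction MSE.
    # `model` is assumed to expose `encoder`, `fc`, and `conv` attributes as in
    # the illustrative WERNetSketch above; the noise sweep is illustrative.
    with torch.no_grad():
        measurement = model.encoder(spectra)
        noisy = measurement + noise_std * torch.randn_like(measurement)
        x = model.fc(noisy)
        recon = model.conv(x.unsqueeze(1)).squeeze(1)
        return torch.mean((recon - spectra) ** 2).item()

# Example sweep over illustrative noise levels:
# model = WERNetSketch(); spectra = torch.rand(100, 151)
# for std in (0.0, 0.01, 0.05):
#     print(std, evaluate_under_noise(model, spectra, std))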
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The abstract is lengthy, the background content is too much, the research-related introduction is insufficient, and the contribution is not outstanding.
Due to the high spectral resolution and high spectral vector dimension, the error obtained using MSE as the evaluation standard is very small, but it is not representative. It is suggested to add more reasonable evaluation criteria or to refine the evaluation results, for example by giving the maximum error.
Author Response
Thank you again for your thoughtful comments concerning our manuscript (ID: 1842586). Those comments are much appreciated. We have read through the comments carefully and have made the corresponding corrections. Revisions in the text are shown using highlight for additions. The responses to the comments are presented below.
Point 1: The abstract is lengthy, the background content is too much, the research-related introduction is insufficient, and the contribution is not outstanding.
R: Thank you for the suggestions. We have rewritten some of the content in the abstract and introduction and have tried to sharpen our contribution in this field. We also agree that our research still has a limited contribution and hope to improve on it in future research.
Point 2: Due to the high spectral resolution and high spectral vector dimension, the error obtained using MSE as the evaluation standard is very small, but it is not representative. It is suggested to add more reasonable evaluation criteria or to refine the evaluation results, for example by giving the maximum error.
R: We totally agree. In Table 1, we propose five evaluation standards: MSE, FWHM, peak amplitude error, peak wavelength position deviation, and reconstruction speed. Table 2 indicates that the traditional algorithms differ from the deep learning method by 2-3 orders of magnitude, so we dropped the other standards there. In future research, we will take this into consideration.
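For reference, the following minimal sketch shows one way some of these evaluation standards could be computed for a single reconstructed spectrum; the peak-based definitions here are simplified assumptions for illustration and are not necessarily the exact definitions used in the paper.

import numpy as np

def simple_metrics(true_spec, recon_spec, wavelengths):
    # Illustrative MSE, peak amplitude error, and peak wavelength deviation.
    mse = np.mean((true_spec - recon_spec) ** 2)
    i_true, i_recon = np.argmax(true_spec), np.argmax(recon_spec)
    peak_amp_err = abs(true_spec[i_true] - recon_spec[i_recon])
    peak_wl_dev = abs(wavelengths[i_true] - wavelengths[i_recon])
    return mse, peak_amp_err, peak_wl_dev

# Example on illustrative data over the 400-700 nm band at a 2 nm step.
wl = np.arange(400, 701, 2)
true_spec = np.exp(-((wl - 550) / 20.0) ** 2)
recon_spec = true_spec + 0.01 * np.random.randn(wl.size)
print(simple_metrics(true_spec, recon_spec, wl))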
Author Response File: Author Response.pdf