Article
Peer-Review Record

LMD-DARTS: Low-Memory, Densely Connected, Differentiable Architecture Search

Electronics 2024, 13(14), 2743; https://doi.org/10.3390/electronics13142743
by Zhongnian Li, Yixin Xu, Peng Ying, Hu Chen, Renke Sun and Xinzheng Xu *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 6 June 2024 / Revised: 5 July 2024 / Accepted: 10 July 2024 / Published: 12 July 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

LMD-DARTS: Low Memory Densely Connected Differentiable Architecture Search

This paper proposes a Low Memory Densely Connected Differentiable Architecture Search (LMD-DARTS) algorithm to solve the problems of large memory usage and slow search speed in traditional NAS algorithms. There are still some problems in the manuscript that need to be solved.

1. Please check whether the experimental comparison data in this article is accurate. Taking CIFAR-10 as an example, according to the results in the original DARTS paper, the algorithm has 3.3M parameters, while this paper reports 4.38M parameters. At the same time, this article mentions that the DARTS search cost is 1 GPU day. Is this based on the Tesla V100 32G graphics card used in this article?

2. According to the original PC-DARTS paper, the number of parameters is 3.6M, which is slightly different from the 3.98M mentioned in this paper. PC-DARTS mentions that the search cost is 0.1 GPU days on a GTX 1080Ti and 0.06 on a Tesla V100. However, this paper mentions that the search cost of PC-DARTS is 0.15. The figure shows the experimental results of PC-DARTS on CIFAR-10; please refer to it.

 

3. The article abstract mentions that the Low Memory Densely Connected Differentiable Architecture Search (LMD-DARTS) algorithm can solve the problems of large memory usage and slow search speed in traditional NAS algorithms, and mentions reducing memory consumption and search complexity many times. However, the experimental results in this article show that the parameter quantity indicator is higher than other algorithms and has no obvious advantage.

4. If this paper emphasizes only the optimized search cost and does not consider the number of parameters, then when comparing the search cost in GPU days there should be a note explaining under what conditions the search costs of the other algorithms were obtained. Should the results of the original papers be directly quoted, or should the other algorithms be standardized to the same hardware conditions as this paper? After all, researchers may run their algorithms on different types of machines.

5. When comparing algorithms on the ImageNet dataset, this article also compares parameter indicators with other manually designed algorithms, which can easily leave readers unclear about the optimization purpose of this article. It is recommended to clearly highlight the optimization goals and advantages of this article.

6. This article is about the optimization of the architecture search algorithm based on gradient optimization. It is recommended to add some algorithm comparisons of this search method in the ImageNet Dataset.

7. Please check if there are any errors in the references in the experimental comparison, such as DARTS and PC-DARTS.

Comments for author File: Comments.pdf



Author Response

Comments 1: Please check whether the experimental comparison data in this article is accurate. Taking CIFAR-10 as an example, according to the results in the original DARTS paper, the algorithm has 3.3M parameters, while this paper reports 4.38M parameters. At the same time, this article mentions that the DARTS search cost is 1 GPU day. Is this based on the Tesla V100 32G graphics card used in this article?

Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have conducted some experiments to reproduce the DARTS[1] paper, and the reproduced results based on the graphics card Tesla V100 32G show that the algorithm does have 4.38M parameters. In fact, the experimental comparison data for DARTS[1] in the article is the result of our own reproduction, not the result found in the original paper.

Comments 2: According to the original PC-DARTS paper, the number of parameters is 3.6M, which is slightly different from the 3.98M mentioned in this paper. PC-DARTS mentions that the search cost is 0.1 GPU days on a GTX 1080Ti and 0.06 on a Tesla V100. However, this paper mentions that the search cost of PC-DARTS is 0.15. The figure shows the experimental results of PC-DARTS on CIFAR-10; please refer to it.

Response 2: Thank you for pointing this out. We agree with this comment. Therefore, we have independently reproduced the PC-DARTS[2] study with a Tesla V100 32G GPU. The comparison data for PC-DARTS[2] that we presented are results from our independent reproduction, rather than those reported in the original paper. Through this reproduction, we verified that the search cost is indeed 0.15 GPU days.

Comments 3: The article abstract mentions that the Low Memory Densely Connected Differentiable Architecture Search (LMD-DARTS) algorithm can solve the problems of large memory usage and slow search speed in traditional NAS algorithms, and mentions reducing memory consumption and search complexity many times. However, the experimental results in this article show that the parameter quantity indicator is higher than other algorithms and has no obvious advantage.

Response 3: Thank you for pointing this out. We agree with this comment. Therefore, we have highlighted the comparative advantages of LMD-DARTS in terms of search time and performance. Specifically, on the CIFAR-10 dataset, compared to Beta-DARTS[3], CDARTS[4], AmoebaNet[5], XNAS[6], PC-DARTS[2], EPC-DARTS[7], SWD-NAS[8], Bandit-NAS[9] and VNAS[10], our algorithm takes significantly less search time despite the increase in the number of parameters. Similarly, on the ImageNet dataset, our algorithm reduces search time by 30% to 70% with comparable or better accuracy than these methods. Detailed comparisons can be found in Tables 2 and 3.

Comments 4: If this paper emphasizes only the optimized search cost and does not consider the number of parameters, then when comparing the search cost in GPU days there should be a note explaining under what conditions the search costs of the other algorithms were obtained. Should the results of the original papers be directly quoted, or should the other algorithms be standardized to the same hardware conditions as this paper? After all, researchers may run their algorithms on different types of machines.

Response 4: Thank you for pointing this out. We agree with this comment. Therefore, we have listed the hardware equipment conditions used in these studies in Tables 2 and 3, allowing for fair comparisons despite the differences in equipment. For reproducible methods, we conducted experiments on a standardized Tesla V100 GPU. This includes the following algorithms: NasNet[11], ENAS[12], AmoebaNet[5], XNAS[6], DARTS[1], and PC-DARTS[2]. For other methods, such as Beta-DARTS[3], CDARTS[4], EPC-DARTS[7], SWD-NAS[8], IS-DARTS[13], Bandit-NAS[9], and VNAS[10], we quoted the results from the original paper. Revised details can be found in Tables 2 and 3.

Comments 5: When comparing algorithms on the ImageNet dataset, this article also compares parameter indicators with other manually designed algorithms, which can easily leave readers unclear about the optimization purpose of this article. It is recommended to clearly highlight the optimization goals and advantages of this article.

Response 5: Thank you for your comments. We made several revisions to the Abstract and Introduction to better highlight the optimization goals and advantages of the LMD-DARTS algorithm. Revised details can be found in lines 4-13 and 44-53.

In the Abstract, we made the following changes:

We revised “We propose the Low Memory Densely Connected Differentiable Architecture Search (LMD-DARTS) algorithm to address the issues of abundant memory usage and slow search speed in traditional NAS algorithms.” to “To address the issues of abundant memory usage and slow search speed in traditional NAS algorithms, we propose the Low Memory Densely Connected Differentiable Architecture Search (LMD-DARTS) algorithm.”.

We revised “LMD-DARTS proposes a continuous strategy based on weights redistribution to increase the updating speed of the optional operations’ weights during the search process,” to “To increase the updating speed of the optional operation weights during the search process, LMD-DARTS introduces a continuous strategy based on weights redistribution.”.

We revised “reduce the influence of low-weight operations on classification results to reduce the number of searches. Additionally, LMD-DARTS designs a dynamic sampler to prune operations that perform poorly during the search process, reducing memory consumption and the complexity of single searches.” to “To reduce the influence of low-weight operations on classification results and minimize the number of searches, LMD-DARTS designs a dynamic sampler to prune operations that perform poorly during the search process, lowering memory consumption and the complexity of single searches.”.

We revised “an adaptively downsampling search algorithm is proposed, which sparsifies the dense connection matrix to reduce redundant connections while ensuring the network’s performance.” to “To sparsify the dense connection matrix and reduce redundant connections while ensuring the network performance, an adaptively downsampling search algorithm is proposed.”.

In the Introduction, we made the following changes:

We revised “Based on the micro-neural architecture search algorithm, a low-memory neural architecture search algorithm, LMD-DARTS, is proposed by using the dense connected lightweight search space with DAS-Block adaptively subsampling.” to “To optimize the weight of candidate operations in the continuous process of discrete space searching for micro-neural architecture, we propose a low-memory neural architecture search algorithm, LMD-DARTS, based on the micro-neural architecture search algorithm and using a densely connected lightweight search space with DAS-Block adaptively subsampling.”.

We revised “To optimize the weight of candidate operations in the continuous process of discrete space searching for micro-neural architecture, a continuous strategy of weight redistribution, WRD-Softmax, is proposed to improve the weight of candidate operations with the maximum sampling probability, weaken the influence of poor candidate operations on classification results, and realize weight redistribution.” to “To weaken the influence of poor candidate operations on classification results, and realize weight redistribution, a continuous strategy of weight redistribution, WRD-Softmax, is proposed.”.

We revised “a dynamic sampler is proposed to prune the candidate operations during the search process, and dynamically trim the poorly performing candidate operations, so as to reduce the cache consumption and search time of a single search and further improve the search efficiency without affecting the search results.” to “To reduce cache consumption and search time of a single search while further improving search efficiency without affecting the search results, a dynamic sampler is proposed.”.

Comments 6: This article is about the optimization of the architecture search algorithm based on gradient optimization. It is recommended to add some algorithm comparisons of this search method in the ImageNet Dataset.

Response 6: Thank you for pointing this out. We agree with this comment. Therefore, we have added additional algorithm comparisons of this search method on the ImageNet dataset in Table 3, specifically including Shapley-NAS[14] and VNAS[10]. The results indicate that among the neural architecture search models based on gradient optimization, LMD-DARTS demonstrates the fastest search speed, reducing the search time by approximately 30% to 70%. Revised details can be found in Table 3.

Comments 7: Please check if there are any errors in the references in the experimental comparison, such as DARTS and PC-DARTS.

Response 7: Thank you for pointing this out. We agree with this comment. Therefore, we have corrected the errors in the references in the experimental comparison. The revisions are as follows: ENAS[12] is now cited in reference 52, DARTS[1] is cited in reference 19, and PC-DARTS[2] in reference 3. Revised details can be found in Table 2.

Reference

[1] Liu, H., Simonyan, K., Yang, Y. DARTS: Differentiable architecture search. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 2019.

[2] Xu, Y., Xie, L., Zhang, X., Chen, X., Qi, G. J., Tian, Q., Xiong, H. PC-DARTS: Partial channel connections for memory-efficient architecture search. arXiv:1907.05737, 2019, preprint.

[3] Ye, P., Li, B., Li, Y., Chen, T., Fan, J., Ouyang, W. β-DARTS: Beta-decay regularization for differentiable architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 2022; 10864--10873.

[4] Yu, H., Peng, H., Huang, Y., Fu, J., Du, H., Wang, L., Ling, H. Cyclic differentiable architecture search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023; 211--228.

[5] Real, E., Aggarwal, A., Huang, Y., Le, Q. V. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, San Juan, Puerto Rico, 2016; 4780--4789.

[6] Nayman, N., Noy, A., Ridnik, T., Friedman, I., Jin, R., Zelnik, L. XNAS: Neural architecture search with expert advice. In Advances in Neural Information Processing Systems, Vancouver, Canada, 2019; 1975--1985.

[7] Cai, Z., Chen, L., Liu, H. L. EPC-DARTS: Efficient partial channel connection for differentiable architecture search. Neural Networks, 2023; 344--353.

[8] Xue, Y., Han, X., Wang, Z. Self-adaptive weight based on dual-attention for differentiable neural architecture search. IEEE Transactions on Industrial Informatics, 2024; 1--20.

[9] Lin, Y., Endo, Y., Lee, J., Kamijo, S. Bandit-NAS: Bandit sampling and training method for neural architecture search. Neurocomputing, 2024; 597.

[10] Ma, B., Zhang, J., Xia, Y., Tao, D. VNAS: Variational neural architecture search. International Journal of Computer Vision, 2024; 1--25.

[11] Qin, X., Wang, Z. NASNet: A neuron attention stage-by-stage net for single image deraining. arXiv:1912.03151, 2019, preprint.

[12] Pham, H., Guan, M., Zoph, B., Le, Q., Dean, J. Efficient neural architecture search via parameters sharing. In Proceedings of the International Conference on Machine Learning, July 2018; 4095--4104.

[13] He, H., Liu, L., Zhang, H., Zheng, N. IS-DARTS: Stabilizing DARTS through precise measurement on candidate importance. In Proceedings of the AAAI Conference on Artificial Intelligence, March 2024; 12367--12375.

[14] Xiao, H., Wang, Z., Zhu, Z., Zhou, J., Lu, J. Shapley-NAS: Discovering operation contribution for neural architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022; 11892--11901.

Reviewer 2 Report

Comments and Suggestions for Authors

There are several typos and grammatical mistakes in the paper e.g. Line 13-17 “during the search process. reduce the influence of low-weight operations on classification results to reduce the number of searches. Additionally, LMD-DARTS designs a dynamic sampler to prune operations that perform poorly during the search process, reducing memory consumption and the complexity of single searches. an adaptively downsampling search algorithm is proposed, which sparsifies the dense connection matrix to reduce redundant connections while ensuring the network’s performance.” Please attentively revise the complete paper. The English language of the paper is understandable, however, there are areas where specificity could be improved to enhance the overall quality and impact of the paper.

The abbreviations are not properly used throughout the paper; please revise the complete paper carefully.

In “Introduction” section, how does the contribution address a specific gap in the field?

In Figures 3 and 4, explanations of some nodes and blocks are missing.

The article lacks vital discussion associated with the compared technologies. What are the other practical applications?

Comments on the Quality of English Language

The English language needs to be revised. 

Author Response

Comments 1: There are several typos and grammatical mistakes in the paper e.g. Line 13-17 “during the search process. reduce the influence of low-weight operations on classification results to reduce the number of searches. Additionally, LMD-DARTS designs a dynamic sampler to prune operations that perform poorly during the search process, reducing memory consumption and the complexity of single searches. an adaptively downsampling search algorithm is proposed, which sparsifies the dense connection matrix to reduce redundant connections while ensuring the network’s performance.” Please attentively revise the complete paper. The English language of the paper is understandable, however, there are areas where specificity could be improved to enhance the overall quality and impact of the paper.

Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have checked the entire paper, improved some English language expressions, and enhanced the overall quality of the paper. The revised sentences are as follows: To reduce the influence of low-weight operations on classification results and minimize the number of searches, LMD-DARTS designs a dynamic sampler to prune operations that perform poorly during the search process, lowering memory consumption and the complexity of single searches. Additionally, to sparsify the dense connection matrix and reduce redundant connections while ensuring the network performance, an adaptively downsampling search algorithm is proposed. Revised details can be found in lines 8-12.

Comments 2: The abbreviations are not properly used throughout the paper; please revise the complete paper carefully.

Response 2: Thank you for pointing this out. We agree with this comment. Therefore, we have thoroughly reviewed the entire paper and corrected the inconsistencies in the use of abbreviations, ensuring that all instances of "DARTS," "PC-DARTS," and similar terms are consistently capitalized throughout the text. Revised details can be found in the manuscript.

Comments 3: In “Introduction” section, how does the contribution address a specific gap in the field?

Response 3: Thank you for pointing this out. We agree with this comment. Therefore, we have elaborated on our three key contributions to address specific gaps in the field of neural architecture search (NAS): Firstly, to enhance updating speed and reduce the impact of low-weight operations on classification results, we propose a continuous weight redistribution strategy, reducing required searches. Secondly, to optimize memory consumption and search complexity, we introduce a dynamic sampler that prunes poorly performing operations, improving efficiency. Finally, to streamline network architecture while maintaining performance, we propose an adaptively downsampling search algorithm that minimizes redundant connections in dense matrices.
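To make these three contributions more concrete for readers, a minimal PyTorch-style sketch is given below. It is an illustration only, not the implementation from the manuscript: the `wrd_softmax` temperature sharpening and the `prune` schedule are hypothetical stand-ins for the WRD-Softmax continuous strategy and the dynamic sampler described above.

```python
import torch
import torch.nn as nn

class MixedOp(nn.Module):
    """DARTS-style mixed operation: a weighted sum of candidate ops on one edge."""

    def __init__(self, ops):
        super().__init__()
        self.ops = nn.ModuleList(ops)                     # candidate operations (same output shape)
        self.alpha = nn.Parameter(torch.zeros(len(ops)))  # architecture weights for this edge

    @staticmethod
    def wrd_softmax(alpha, temperature=0.5):
        # Hypothetical stand-in for the paper's WRD-Softmax: a sharpened softmax
        # that boosts the highest-weight operation and suppresses low-weight ones.
        return torch.softmax(alpha / temperature, dim=-1)

    def forward(self, x):
        weights = self.wrd_softmax(self.alpha)
        # Weighted sum over the currently active candidate operations.
        return sum(w * op(x) for w, op in zip(weights, self.ops))

    def prune(self, keep_ratio=0.75):
        # Hypothetical dynamic-sampler step: drop the lowest-weight candidates so
        # that later search epochs evaluate fewer ops and use less memory.
        with torch.no_grad():
            weights = self.wrd_softmax(self.alpha)
            k = max(1, int(keep_ratio * len(self.ops)))
            keep = torch.topk(weights, k).indices.tolist()
        self.ops = nn.ModuleList([self.ops[i] for i in keep])
        self.alpha = nn.Parameter(self.alpha.detach()[keep].clone())
```

Note that in a real search loop the architecture optimizer would have to be rebuilt after `prune`, since `alpha` becomes a new, smaller parameter.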

Comments 4: In Figures 3 and 4, explanations of some nodes and blocks are missing.

Response 4: Thank you for pointing this out. We agree with this comment. Therefore, we have supplemented the description of nodes in Figures 3 and 4 to clarify their meaning. “(a) is a normal block and (b) is a reduced-dimension block. A cell is a directed acyclic graph consisting of an ordered sequence of N nodes. Each node is a feature map in convolutional networks.” Revised details can be found in Figures 3 and 4.
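As a further illustration of the caption added to Figures 3 and 4, the sketch below evaluates such a cell in the standard DARTS convention: each intermediate node is a feature map computed from all of its predecessors through per-edge operations, and a reduced-dimension cell differs only in the stride of the edges leaving the input node. The factory `make_mixed_op` is an assumed helper (for example, one returning a mixed operation for a single edge); this is not the exact DAS-Block of the paper.

```python
import torch
import torch.nn as nn

class Cell(nn.Module):
    """A cell as a DAG: node i is the sum of per-edge ops applied to nodes 0..i-1."""

    def __init__(self, num_nodes, make_mixed_op, reduction=False):
        super().__init__()
        self.num_nodes = num_nodes
        # In a reduced-dimension cell, edges leaving the input node use stride 2,
        # which halves the height and width of the feature maps.
        self.edges = nn.ModuleDict({
            f"{j}->{i}": make_mixed_op(stride=2 if (reduction and j == 0) else 1)
            for i in range(1, num_nodes)
            for j in range(i)
        })

    def forward(self, x):
        nodes = [x]  # node 0 is the input feature map of the cell
        for i in range(1, self.num_nodes):
            nodes.append(sum(self.edges[f"{j}->{i}"](nodes[j]) for j in range(i)))
        # The cell output concatenates the intermediate feature maps along channels.
        return torch.cat(nodes[1:], dim=1)
```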

Comments 5: The article lacks vital discussion associated with the compared technologies. What are the other practical applications?

Response 5: Thank you for pointing this out. We agree with this comment. Therefore, we have added a discussion of the compared techniques in Section 4 of the paper.

4.2 Compared Methods

In this section, we compare our LMD-DARTS method with various NAS technologies, highlighting the advantages of our approach in terms of search speed, and overall performance. These methods encompass reinforcement learning-based (RL) search methods, evolutionary algorithm-based (EA) search methods, and gradient-based search strategies. They have been applied to neural architecture search tasks with outstanding results, providing efficient and scalable solutions for various applications, including image classification, object detection, and other computer vision tasks.

Reinforcement learning-based search. This method optimizes neural network parameters by leveraging a search algorithm to discover optimal architectures. NASNet[1], ENAS[2], Beta-DARTS[3], and Bandit-NAS[4] belong to this category. This method can make the search process more robust.

Evolutionary algorithm-based search. This method uses a discrete search space to directly handle large-scale NAS tasks. AmoebaNet[5] belongs to this category, which performs well in tasks with discrete search spaces.

Gradient-based search. This method optimizes architecture by iteratively adjusting parameters in the gradient direction derived from the differentiable objective function. CDARTS[6], XNAS[7], DARTS[8], PC-DARTS[9], EPC-DARTS[10], SWD-NAS[11], IS-DARTS[12], VNAS[13], EfficientNet[14], and Shapley-NAS[15] belong to this category. Compared to other methods, gradient-based NAS approaches effectively mitigate performance collapse issues. Revised details can be found in lines 319-339.
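For readers unfamiliar with the gradient-based category, the outline below shows the first-order, DARTS-style alternating update that these methods build on: architecture parameters are updated on the validation loss, and network weights on the training loss. It is a generic sketch, not the training loop of any specific cited method; the two optimizers are assumed to be built over the architecture parameters and the network weights, respectively.

```python
def search_step(model, train_batch, valid_batch, w_optimizer, alpha_optimizer, criterion):
    """One first-order, DARTS-style search step (generic outline)."""
    x_train, y_train = train_batch
    x_valid, y_valid = valid_batch

    # 1) Update the architecture parameters (alpha) on the validation loss.
    alpha_optimizer.zero_grad()
    criterion(model(x_valid), y_valid).backward()
    alpha_optimizer.step()

    # 2) Update the network weights (w) on the training loss.
    w_optimizer.zero_grad()
    criterion(model(x_train), y_train).backward()
    w_optimizer.step()

# Assumed wiring (hypothetical helper methods on the search model):
#   w_optimizer     = torch.optim.SGD(model.weight_parameters(), lr=0.025, momentum=0.9)
#   alpha_optimizer = torch.optim.Adam(model.arch_parameters(), lr=3e-4)
#   criterion       = torch.nn.CrossEntropyLoss()
```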

Reference

[1] Qin, X., Wang, Z. NASNet: A neuron attention stage-by-stage net for single image deraining. arXiv:1912.03151, 2019, preprint.

[2] Pham, H., Guan, M., Zoph, B., Le, Q., Dean, J. Efficient neural architecture search via parameters sharing. In Proceedings of the International Conference on Machine Learning, July 2018; 4095--4104.

[3] Ye, P., Li, B., Li, Y., Chen, T., Fan, J., Ouyang, W. β-DARTS: Beta-decay regularization for differentiable architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 2022; 10864--10873.

[4] Lin, Y., Endo, Y., Lee, J., Kamijo, S. Bandit-NAS: Bandit sampling and training method for neural architecture search. Neurocomputing, 2024; 597.

[5] Real, E., Aggarwal, A., Huang, Y., Le, Q. V. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, San Juan, Puerto Rico, 2016; 4780--4789.

[6] Yu, H., Peng, H., Huang, Y., Fu, J., Du, H., Wang, L., Ling, H. Cyclic differentiable architecture search. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023; 211--228.

[7] Nayman, N., Noy, A., Ridnik, T., Friedman, I., Jin, R., Zelnik, L. XNAS: Neural architecture search with expert advice. In Advances in Neural Information Processing Systems, Vancouver, Canada, 2019; 1975--1985.

[8] Liu, H., Simonyan, K., Yang, Y. DARTS: Differentiable architecture search. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 2019.

[9] Xu, Y., Xie, L., Zhang, X., Chen, X., Qi, G. J., Tian, Q., Xiong, H. PC-DARTS: Partial channel connections for memory-efficient architecture search. arXiv:1907.05737, 2019, preprint.

[10] Cai, Z., Chen, L., Liu, H. L. EPC-DARTS: Efficient partial channel connection for differentiable architecture search. Neural Networks, 2023; 344--353.

[11] Xue, Y., Han, X., Wang, Z. Self-adaptive weight based on dual-attention for differentiable neural architecture search. IEEE Transactions on Industrial Informatics, 2024; 1--20.

[12] He, H., Liu, L., Zhang, H., Zheng, N. IS-DARTS: Stabilizing DARTS through precise measurement on candidate importance. In Proceedings of the AAAI Conference on Artificial Intelligence, March 2024; 12367--12375.

[13] Ma, B., Zhang, J., Xia, Y., Tao, D. VNAS: Variational neural architecture search. International Journal of Computer Vision, 2024; 1--25.

[14] Tan, M., Le, Q. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 2019; 6105--6114.

[15] Xiao, H., Wang, Z., Zhu, Z., Zhou, J., Lu, J. Shapley-NAS: Discovering operation contribution for neural architecture search. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022; 11892--11901.

Reviewer 3 Report

Comments and Suggestions for Authors

[Summary]

This paper proposes a Low Memory Densely Connected Differentiable Architecture Search algorithm, called LMD-DARTS, to address the issues of abundant memory usage and slow search speed in traditional NAS algorithms. LMD-DARTS proposes a continuous strategy based on weights redistribution to increase the updating speed of the optional operations’ weights during the search process and to reduce the influence of low-weight operations on classification results, thereby reducing the number of searches. Additionally, LMD-DARTS designs a dynamic sampler to prune operations that perform poorly during the search process, reducing memory consumption and the complexity of single searches. Finally, an adaptively downsampling search algorithm is proposed, which sparsifies the dense connection matrix to reduce redundant connections while ensuring the network’s performance.

The summarized strengths and weaknesses are listed as follows:

[Strengths]

1. This paper proposes a continuous strategy based on weights redistribution to increase the updating speed of the optional operators’ weights.

2. This paper develops a dynamic sampler to prune operations.

3. This paper presents an adaptive downsampling search algorithm.

 

[Weaknesses]

1. This paper has strong theoretical properties, which are hard for readers to follow. Could the authors provide some application examples to demonstrate the superiority of the proposed LMD-DARTS? For instance, several NAS-based techniques have been proposed to address the problems of image processing and computer vision. If the authors can give some applications in the field of computer vision, this paper will be enriched for readers.

2. The related work can provide some review about the applications in the field of image processing and computer vision, such as image restoration and enhancement (10.1109/TAI.2022.3204732, 10.1109/TCSVT.2022.3214430). I would like to understand how the proposed method can address image restoration and enhancement problem. At least, the authors can provide the potential solutions or strategies in the future work.

3. For the experiment section, the provided comparisons on the CIFAR-10 dataset cannot prove the superiority of the proposed LMD-DARTS. The authors should provide more recent state-of-the-art (SOTA) methods for comparison. Moreover, the evaluation metrics seem insufficient to represent the performance of different methods.

5. The authors should provide the limitations of the proposed LMD-DARTS. In addition, future work should also be added to the discussion.

Comments on the Quality of English Language

Some minor grammatical errors and informal expressions should be carefully checked and corrected.

Author Response

Comments 1: This paper has strong theoretical properties, which are hard for readers to follow. Could the authors provide some application examples to demonstrate the superiority of the proposed LMD-DARTS? For instance, several NAS-based techniques have been proposed to address the problems of image processing and computer vision. If the authors can give some applications in the field of computer vision, this paper will be enriched for readers.

Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have made the following additions about applications in the field of computer vision in Section 2 related work:“In recent years, Neural Architecture Search (NAS) has shown significant promise across various computer vision tasks, including image classification, recognition, restoration, enhancement, and retrieval. Several studies have applied NAS to these areas, improving performance and efficiency. For example, Ghiasi et al.[1] propose Neural Architecture Search Feature Pyramid Network (NAS-FPN), which uses reinforcement learning to sample and evaluate different feature pyramid architectures for detection tasks while minimizing the computational cost. Zhang et al.[2] propose a memory-efficient hierarchical NAS (HiNAS) to address image denoising and image super-resolution. Priyadarshi et al.[3] present DONNAv2, a computationally efficient neural architecture distillation method that showcases NAS applications in various vision tasks. Liu et al.[4] introduce Auto-DeepLab, expanding the application field of neural architecture search to semantic segmentation tasks. Additionally, Mandal et al.[5] and Liu et al.[6] propose methods for haze removal and nighttime image enhancement. These methods have set a benchmark in image restoration and enhancement. Our proposed LMD-DARTS algorithm further optimizes NAS with dynamic sampler and adaptive downsampling, enhancing image clarity and reducing artifacts, which enhance the performance of models in tasks such as image classification, recognition, and retrieval.” Revised details can be found in lines 137-153.

Comments 2: The related work can provide some review about the applications in the field of image processing and computer vision, such as image restoration and enhancement (10.1109/TAI.2022.3204732, 10.1109/TCSVT.2022.3214430). I would like to understand how the proposed method can address image restoration and enhancement problem. At least, the authors can provide the potential solutions or strategies in the future work.

Response 2: Thank you for pointing this out. We agree with this comment. Therefore, we have made the following additions about applications in the field of image processing and computer vision in Section 2 (Related Work). We have analyzed the relevance of these papers (10.1109/TAI.2022.3204732[5], 10.1109/TCSVT.2022.3214430[6]) to our work and included them in our references. In addition, we outline potential solutions and strategies for future work. Specifically, we will focus on optimizing computational efficiency so that the method can be applied to large models, and we will consider combining it with other optimization algorithms to further extend it to a wider range of application scenarios, such as image enhancement and image restoration. Revised details can be found in lines 418-421.

Comments 3: For the experiment section, the provided comparisons on the CIFAR-10 dataset cannot prove the superiority of the proposed LMD-DARTS. The authors should provide more recent state-of-the-art (SOTA) methods for comparison. Moreover, the evaluation metrics seem insufficient to represent the performance of different methods.

Response 3: Thank you for pointing this out. We agree with this comment. Therefore, we have added some recent state-of-the-art (SOTA) methods (EPC-DARTS[7], SWD-NAS[8], IS-DARTS[9], Bandit-NAS[10], VNAS[11]) in Table 2 to demonstrate the superiority of the proposed LMD-DARTS. For example, LMD-DARTS reduces parameter count by 9% and search time by 73% compared to IS-DARTS[9]. Despite a 25% increase in parameters compared to EPC-DARTS[7], it still cuts search time by 45%. These results show that LMD-DARTS significantly improves search efficiency while maintaining competitive parameter counts, highlighting its effectiveness. For detailed information, please refer to Table 3 in the revised manuscript.

In this paper, we chose to evaluate our models based on accuracy, parameters, and GPU days. These metrics are widely used in the community and provide a comprehensive assessment of different methods. Moreover, some notable methods, such as EPC-DARTS proposed by Cai et al.[7] and IS-DARTS proposed by He et al.[9], employ these standard metrics to evaluate performance, which are sufficient to represent the performance of different methods in the context of neural architecture search. Revised details can be found in Tables 2 and 3.
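For reference, the two non-accuracy metrics mentioned above are computed in a straightforward way; the short sketch below (assuming a PyTorch model and a measured wall-clock search time) shows the arithmetic.

```python
import torch.nn as nn

def params_in_millions(model: nn.Module) -> float:
    """Trainable parameter count in millions (the 'Params (M)' column)."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

def gpu_days(wall_clock_hours: float, num_gpus: int = 1) -> float:
    """Search cost in GPU days: total GPU hours divided by 24."""
    return wall_clock_hours * num_gpus / 24.0

# For example, a single-GPU search lasting 3.6 wall-clock hours corresponds to
# gpu_days(3.6) == 0.15 GPU days.
```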

Comments 5: The authors should provide the limitations of the proposed LMD-DARTS. In addition, future work should also be added to the discussion.

Response 5: Thank you for pointing this out. We agree with this comment. Therefore, we have made the following additions about limitations and future work in Section 5 Discussion: “The adaptive downsampling search algorithm used by LMD-DARTS can effectively reduce redundant connections, but it may eliminate useful connections, which affect the performance of the model. On the other hand, although LMD-DARTS has improved in reducing search time, its performance improvement may vary depending on the complexity of the task and the size of the dataset. Therefore, we consider neural architecture search on large models as an interesting challenge for the future work. In the future, we will focus on optimizing the computational efficiency, so that it can be applied to large models. Besides, we can consider combining it with other optimization algorithms to further expand it to a wider range of application scenarios such as image enhancement or image restoration.” Revised details can be found in lines 413-421.

Reference

[1] Ghiasi, G., Lin, T. Y., Le, Q. V. NAS-FPN: Learning scalable feature pyramid architecture for object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; 7036--7045.

[2] Zhang, H., Li, Y., Chen, H., Gong, C., Bai, Z., Shen, C. Memory-efficient hierarchical neural architecture search for image restoration. International Journal of Computer Vision, 2022; 1--22.

[3] Priyadarshi, S., Jiang, T., Cheng, H. P., Krishna, S., Ganapathy, V., Patel, C. DONNAv2: Lightweight neural architecture search for vision tasks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023; 1384--1392.

[4] Liu, C., Chen, L. C., Schroff, F., Adam, H., Hua, W., Yuille, A. L., Fei-Fei, L. Auto-DeepLab: Hierarchical neural architecture search for semantic image segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; 82--92.

[5] Mandal, M., Meedimale, Y. R., Reddy, M. S. K., Vipparthi, S. K. Neural architecture search for image dehazing. IEEE Transactions on Artificial Intelligence, 2022; 1337--1347.

[6] Liu, Y., Yan, Z., Tan, J., Li, Y. Multi-purpose oriented single nighttime image haze removal based on unified variational retinex model. IEEE Transactions on Circuits and Systems for Video Technology, 2022; 1643--1657.

[7] Cai, Z., Chen, L., Liu, H. L. EPC-DARTS: Efficient partial channel connection for differentiable architecture search. Neural Networks, 2023; 344--353.

[8] Xue, Y., Han, X., Wang, Z. Self-adaptive weight based on dual-attention for differentiable neural architecture search. IEEE Transactions on Industrial Informatics, 2024; 1--20.

[9] He, H., Liu, L., Zhang, H., Zheng, N. IS-DARTS: Stabilizing DARTS through precise measurement on candidate importance. In Proceedings of the AAAI Conference on Artificial Intelligence, March 2024; 12367--12375.

[10] Lin, Y., Endo, Y., Lee, J., Kamijo, S. Bandit-NAS: Bandit sampling and training method for neural architecture search. Neurocomputing, 2024; 597.

[11] Ma, B., Zhang, J., Xia, Y., Tao, D. VNAS: Variational neural architecture search. International Journal of Computer Vision, 2024; 1--25.

Reviewer 4 Report

Comments and Suggestions for Authors

1. The Abstract needs to be rewritten in a shorter format compared with the current one.

2. Three contributions are mentioned in the introduction section. These need to be rewritten in a shorter format with a clear and strong indication of the contributions.

3. There is no future work mentioned in the conclusion.

Comments on the Quality of English Language

Must be improved with a professional English language editing service.

Author Response

Comments 1: The Abstract needs to be rewritten in a shorter format compared with the current one.

Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have reformulated the Abstract to make it shorter: “Neural network architecture search (NAS) technology is the key to designing lightweight Convolutional Neural Networks (CNNs), enabling automatic search for network structures without extensive prior knowledge. However, NAS consumes extensive computing resources and time since it evaluates abundant candidate architectures. To address the issues of abundant memory usage and slow search speed in traditional NAS algorithms, we propose the Low Memory Densely Connected Differentiable Architecture Search (LMD-DARTS) algorithm. To increase the updating speed of the optional operation weights during the search process, LMD-DARTS introduces a continuous strategy based on weights redistribution. To reduce the influence of low-weight operations on classification results and minimize the number of searches, LMD-DARTS designs a dynamic sampler to prune operations that perform poorly during the search process, lowering memory consumption and the complexity of single searches. Additionally, to sparsify the dense connection matrix and reduce redundant connections while ensuring the network performance, an adaptively downsampling search algorithm is proposed. Experimental results show that LMD-DARTS reduces search time by 20% and decreases the memory consumption of the NAS algorithm. The lightweight CNNs obtained through this algorithm demonstrate good classification accuracy.” Revised details can be found in lines 1-15.

Comments 2: Three contributions are mentioned in the introduction section. These need to be rewritten in a shorter format with a clear and strong indication of the contributions.

Response 2: Thank you for pointing this out. We agree with this comment. Therefore, we have rewritten the contributions in a shorter format:

1. A continuous strategy redistributes weights to accelerate updates for optional operations during the search, minimizing the impact of low-weight operations on classification results and reducing search iterations.

2. A dynamic sampler prunes underperforming operations in real-time, cutting memory usage and simplifying individual search processes.

3. An adaptively downsampling search algorithm is proposed, which sparsifies the dense connection matrix to reduce redundant connections while ensuring the network’s performance. Revised details can be found in lines 57-64.
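To make the third contribution more concrete, the sketch below shows one simple way to sparsify a dense connection-weight matrix by keeping only the strongest incoming connections of each node. The matrix layout `W[j, i]` for the connection j→i and the `keep_k` budget are assumptions for illustration; the actual adaptive downsampling criterion is the one defined in the manuscript.

```python
import torch

def sparsify_connections(W: torch.Tensor, keep_k: int = 2) -> torch.Tensor:
    """Keep the keep_k strongest incoming connections for each node.

    W is a dense [num_nodes, num_nodes] matrix in which W[j, i] is the weight of
    the connection from node j to node i; pruned entries are set to zero.
    """
    num_nodes = W.shape[0]
    sparse = torch.zeros_like(W)
    for i in range(1, num_nodes):       # node 0 has no incoming connections
        incoming = W[:i, i]             # candidate connections j -> i with j < i
        k = min(keep_k, incoming.numel())
        keep = torch.topk(incoming, k).indices
        sparse[keep, i] = incoming[keep]
    return sparse
```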

Comments 3: There is no future work mentioned in the conclusion.

Response 3: Thank you for pointing this out. We agree with this comment. Therefore, we have made the following additions about limitations and future work in Section 5 Discussion:“The adaptive downsampling search algorithm used by LMD-DARTS can effectively reduce redundant connections, but it may eliminate useful connections, which affect the performance of the model. On the other hand, although LMD-DARTS has improved in reducing search time, its performance improvement may vary depending on the complexity of the task and the size of the dataset. Therefore, we consider neural architecture search on large models as an interesting challenge for the future work. In the future, we will focus on optimizing the computational efficiency, so that it can be applied to large models. Besides, we can consider combining it with other optimization algorithms to further expand it to a wider range of application scenarios such as image enhancement or image restoration.” Revised details can be found in lines 413-421.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

It mainly includes seven opinions, such as experimental data issues, unclear optimization indicators for research papers, and citation issues in some literature. The author has provided detailed explanations and revisions to the above questions I have raised. The background, research content, and objectives of the entire paper are relatively clear, and the ablation and comparative experiments are very specific and detailed, highlighting the advantages of this optimization algorithm and demonstrating good innovation.

Comments for author File: Comments.pdf

Author Response

Comments 1: It mainly includes seven opinions, such as experimental data issues, unclear optimization indicators for research papers, and citation issues in some literature. The author has provided detailed explanations and revisions to the above questions I have raised. The background, research content, and objectives of the entire paper are relatively clear, and the ablation and comparative experiments are very specific and detailed, highlighting the advantages of this optimization algorithm and demonstrating good innovation.

Response 1: Thank you for this positive summary. We have addressed the seven comments you raised regarding our manuscript “LMD-DARTS: Low Memory Densely Connected Differentiable Architecture Search”, including issues with experimental data, unclear optimization indicators, and citation issues in some references. We have provided detailed explanations and revisions to resolve these points as you recommended. We are pleased to hear that you found the background, research content, and objectives of the paper to be clear. Additionally, we appreciate your recognition of the specificity and detail of the ablation and comparative experiments, as well as the innovation and advantages of the proposed optimization algorithm.

Thank you for suggesting the acceptance of this paper. Your feedback has been invaluable in enhancing the quality of our research.

Reviewer 2 Report

Comments and Suggestions for Authors

Accept

Author Response

Comments 1: Accept

Response 1: Thank you for accepting our manuscript. We have proceeded with the final revisions to ensure clarity and completeness.

We appreciate the time and effort you dedicated to thoroughly reviewing our work. Your valuable comments have greatly contributed to the improvement of the paper. We are truly grateful for your acceptance of our work.

Reviewer 3 Report

Comments and Suggestions for Authors

Thank you for your response. All of my concerns have been addressed.

I recommend accepting this paper.

Comments on the Quality of English Language

Some minor grammatical issues should be carefully checked and corrected.

Author Response

Comment 1: Thank you for your response. All of my concerns have been addressed. I recommend accepting this paper.

Response 1: Thank you for your positive feedback. We have thoroughly reviewed the manuscript for minor grammatical issues and corrected them to improve the overall readability and presentation of the paper. Below are some examples of revisions, and more revised details can be found in the highlighted areas of the revised manuscript. Thank you once again for your valuable feedback and recommendation to accept our paper.

In the Abstract, we made the following changes:

We revised “the key to” to “pivotal for”.

We revised “enabling” to “facilitating the”.

We revised “NAS consumes extensive computing resources and time since it evaluates abundant candidate architectures.” to “NAS is resource-intensive, consuming significant computational power and time due to the evaluation of numerous candidate architectures.”.

We revised “Experimental results show that LMD-DARTS reduces search time by 20% and decreases the memory consumption of the NAS algorithm.” to “Our experimental results show that the proposed LMD-DARTS achieves a remarkable 20% reduction in search time, along with a significant decrease in memory utilization within NAS process.”.

We revised “The lightweight CNNs obtained through this algorithm demonstrate good classification accuracy.” to “Notably, the lightweight CNNs derived through this algorithm exhibit commendable classification accuracy, underscoring their effectiveness and efficiency for practical applications.”.

In the Introduction, we made the following changes:

We revised “with a series of major theoretical breakthroughs and method innovations in the field of artificial intelligence, Convolutional Neural Network (CNN) has been widely used in computer vision, natural language processing, speech recognition, and other fields, which promote the deep integration of artificial intelligence and all walks of life. ” to “the realm of artificial intelligence has witnessed a flurry of groundbreaking theoretical advancements and methodological innovations, propelling the Convolutional Neural Network (CNN) into widespread adoption across diverse domains such as computer vision, natural language processing, and speech recognition. This integration has fostered a profound fusion between artificial intelligence and various facets of society.”.

We revised “the complexity of convolutional neural network models is continuously improved to achieve higher network accuracy, which also severely restricts the deployment of CNN on resource-constrained devices.” to “the relentless pursuit of heightened network accuracy has necessitated the continuous escalation of CNN model complexity, posing a formidable challenge to the deployment of these networks on resource-constrained devices.”.

We revised “one is to build network structures with lower parameters and computation while ensuring network accuracy,” to “The first direction involves building network structures with fewer parameters and lower computational requirements while ensuring network accuracy,”.

We revised “models’ performance” to “performance of the models”.

We revised “At the same time” to “Simultaneously”.

We revised “network’s performance” to “performance of network”.

In the Related Work, we made the following changes:

We revised “which would be time-consuming if the traditional deep learning evaluation process are used directly.” to “which can be extremely time-consuming when using traditional deep learning evaluation methods.”.

We revised “(GPU days are used to measure the complexity of the NAS algorithm and represent the number of days the algorithm is searched on a single GPU);” to “a metric that reflects the computational complexity of the NAS algorithm by measuring the number of days required when running on a single GPU.”.

We revised “CNNs’ performance” to “performance of CNNs”.

We revised “The search result is better than most manual networks of the time.” to “surpassing most manually designed networks of the time.”.

We revised “eliminates the poorly performing architecture during the search process, selects the well-behaved architectures as the parents,” to “During the search process, it systematically eliminates poorly performing architectures, selecting the well-behaved architectures as the parents.”.

We revised “retains the structures and weights of the parent networks in the replication and mutation stages” to “In the replication and mutation stages, the evolutionary process retains the structures and weights of the parent networks,”.

We revised “migration performance” to “performance of migration”.

We revised “memory consumption is reduced to the original 1/N, which can directly search the network architecture on ImageNet dataset.” to “memory consumption is reduced to 1/N of the original, allowing for direct network architecture search on the ImageNet dataset.”.

We revised “optimize the problem that DARTS show different search preferences” to “optimize the problem of DARTS showing different search preferences”.

We revised “which applies” to “applying”.

We revised “These methods have set a benchmark in image restoration and enhancement.” to “setting benchmarks in image restoration and enhancement.”.

In the Section 3, we made the following changes:

We revised “In DARTS’s random relaxation strategy” to “In the random relaxation strategy of DARTS”.

We revised “In order to make the search process more efficient and adaptive” to “To enhance the efficiency and adaptability of the search process”.

We revised “Eight Cells are connected in sequence” to “The architecture comprises eight sequentially connected cells”.

We revised “The network architecture α inside the Cell is shared, so the search for the entire network can be simplified to the search for two types of Cells,” to “The network architecture, denoted as α, is consistent within each cell, thereby simplifying the search for the entire network to the search for two types of cells,”.

We revised “and the size of the output feature map is the same as that of the input feature map.” to “maintaining the same size for both the input and output feature maps.”.

We revised “and the height and width of the output feature map are reduced to half of those of the input feature map” to “halving the height and width of the output feature map compared to the input feature map.”.

In the Experiment, we made the following changes:

We revised “At the same time” to “Meanwhile”.

We revised “WRD-Softmax is used to continualize the search space and regularize the edges during subsequent searches, and candidate operations are pruned every 5 epochs.” to “In subsequent searches, WRD-Softmax is employed for the continualization of the search space and the regularization of the edges, with candidate operations being pruned every 5 epochs.”.

We revised “At the same time” to “Simultaneously”.

We revised “The use of DAS-Block, although it has caused a certain degree of parameter increase, but also improved the accuracy of LMD-DARTS to a certain extent.” to “Although the use of DAS-Block has caused a certain increase in parameters, it has also improved the accuracy of LMD-DARTS to some extent.”.

In the Discussion, we made the following changes:

We revised “On the other hand, although LMD-DARTS has improved in reducing search time, its performance improvement may vary depending on the complexity of the task and the size of the dataset.” to “While LMD-DARTS has improved in reducing search time, its performance improvement may vary based on task complexity and dataset size.”.

We revised “we consider neural architecture search on large models as an interesting challenge” to “Therefore, conducting neural architecture search on large models remains an intriguing challenge”.

In the Conclusion, we made the following changes:

We revised “In terms of performance evaluation strategy” to “Regarding the performance evaluation strategy”.

Reviewer 4 Report

Comments and Suggestions for Authors

Please check the English, especially in the related work section.

Comments on the Quality of English Language

Not good. English checking is required by professional service.

Author Response

Comment 1: Please check the English, especially in the related work section.

Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have revised the manuscript accordingly to address this concern. Specifically, we have reviewed and improved the English language throughout the manuscript, with particular attention to the related work section. Below are some examples of revisions, and more revised details can be found in the highlighted areas of the revised manuscript. Thank you once again for your valuable feedback. Your insights have been instrumental in enhancing the quality of our research.

In the Abstract, we made the following changes:

We revised “the key to” to “pivotal for”.

We revised “enabling” to “facilitating the”.

We revised “NAS consumes extensive computing resources and time since it evaluates abundant candidate architectures.” to “NAS is resource-intensive, consuming significant computational power and time due to the evaluation of numerous candidate architectures.”.

We revised “Experimental results show that LMD-DARTS reduces search time by 20% and decreases the memory consumption of the NAS algorithm.” to “Our experimental results show that the proposed LMD-DARTS achieves a remarkable 20% reduction in search time, along with a significant decrease in memory utilization within NAS process.”.

We revised “The lightweight CNNs obtained through this algorithm demonstrate good classification accuracy.” to “Notably, the lightweight CNNs derived through this algorithm exhibit commendable classification accuracy, underscoring their effectiveness and efficiency for practical applications.”.

In the Introduction, we made the following changes:

We revised “with a series of major theoretical breakthroughs and method innovations in the field of artificial intelligence, Convolutional Neural Network (CNN) has been widely used in computer vision, natural language processing, speech recognition, and other fields, which promote the deep integration of artificial intelligence and all walks of life. ” to “the realm of artificial intelligence has witnessed a flurry of groundbreaking theoretical advancements and methodological innovations, propelling the Convolutional Neural Network (CNN) into widespread adoption across diverse domains such as computer vision, natural language processing, and speech recognition. This integration has fostered a profound fusion between artificial intelligence and various facets of society.”.

We revised “the complexity of convolutional neural network models is continuously improved to achieve higher network accuracy, which also severely restricts the deployment of CNN on resource-constrained devices.” to “the relentless pursuit of heightened network accuracy has necessitated the continuous escalation of CNN model complexity, posing a formidable challenge to the deployment of these networks on resource-constrained devices.”.

We revised “one is to build network structures with lower parameters and computation while ensuring network accuracy,” to “The first direction involves building network structures with fewer parameters and lower computational requirements while ensuring network accuracy,”.

We revised “models’ performance” to “performance of the models”.

We revised “At the same time” to “Simultaneously”.

We revised “network’s performance” to “performance of network”.

In the Related Work, we made the following changes:

We revised “which would be time-consuming if the traditional deep learning evaluation process are used directly.” to “which can be extremely time-consuming when using traditional deep learning evaluation methods.”.

We revised “(GPU days are used to measure the complexity of the NAS algorithm and represent the number of days the algorithm is searched on a single GPU);” to “a metric that reflects the computational complexity of the NAS algorithm by measuring the number of days required when running on a single GPU.”.

We revised “CNNs’ performance” to “performance of CNNs”.

We revised “The search result is better than most manual networks of the time.” to “surpassing most manually designed networks of the time.”.

We revised “eliminates the poorly performing architecture during the search process, selects the well-behaved architectures as the parents,” to “During the search process, it systematically eliminates poorly performing architectures, selecting the well-behaved architectures as the parents.”.

We revised “retains the structures and weights of the parent networks in the replication and mutation stages” to “In the replication and mutation stages, the evolutionary process retains the structures and weights of the parent networks,”.

We revised “migration performance” to “performance of migration”.

We revised “memory consumption is reduced to the original 1/N, which can directly search the network architecture on ImageNet dataset.” to “memory consumption is reduced to 1/N of the original, allowing for direct network architecture search on the ImageNet dataset.”.

We revised “optimize the problem that DARTS show different search preferences” to “optimize the problem of DARTS showing different search preferences”.

We revised “which applies” to “applying”.

We revised “These methods have set a benchmark in image restoration and enhancement.” to “setting benchmarks in image restoration and enhancement.”.

In the Section 3, we made the following changes:

We revised “In DARTS’s random relaxation strategy” to “In the random relaxation strategy of DARTS”.

We revised “In order to make the search process more efficient and adaptive” to “To enhance the efficiency and adaptability of the search process”.

We revised “Eight Cells are connected in sequence” to “The architecture comprises eight sequentially connected cells”.

We revised “The network architecture α inside the Cell is shared, so the search for the entire network can be simplified to the search for two types of Cells,” to “The network architecture, denoted as α, is consistent within each cell, thereby simplifying the search for the entire network to the search for two types of cells,”.

We revised “and the size of the output feature map is the same as that of the input feature map.” to “maintaining the same size for both the input and output feature maps.”.

We revised “and the height and width of the output feature map are reduced to half of those of the input feature map” to “halving the height and width of the output feature map compared to the input feature map.”.

In the Experiment, we made the following changes:

We revised “At the same time” to “Meanwhile”.

We revised “WRD-Softmax is used to continualize the search space and regularize the edges during subsequent searches, and candidate operations are pruned every 5 epochs.” to “In subsequent searches, WRD-Softmax is employed for the continualization of the search space and the regularization of the edges, with candidate operations being pruned every 5 epochs.”.

We revised “At the same time” to “Simultaneously”.

We revised “The use of DAS-Block, although it has caused a certain degree of parameter increase, but also improved the accuracy of LMD-DARTS to a certain extent.” to “Although the use of DAS-Block has caused a certain increase in parameters, it has also improved the accuracy of LMD-DARTS to some extent.”.

In the Discussion, we made the following changes:

We revised “On the other hand, although LMD-DARTS has improved in reducing search time, its performance improvement may vary depending on the complexity of the task and the size of the dataset.” to “While LMD-DARTS has improved in reducing search time, its performance improvement may vary based on task complexity and dataset size.”.

We revised “we consider neural architecture search on large models as an interesting challenge” to “Therefore, conducting neural architecture search on large models remains an intriguing challenge”.

In the Conclusion, we made the following changes:

We revised “In terms of performance evaluation strategy” to “Regarding the performance evaluation strategy”.
