Development of a Smart Material Resource Planning System in the Context of Warehouse 4.0
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper represents an advanced adoption of neural networks in Warehouse 4.0, offering novel insights into automated inventory and production management. However, there are a number of areas where significant improvements need to be made, as described below.
1. The abstract needs to be enhanced in order to focus more on practical implications and measurable effects of the study.
2. The problem statement must be outlined more clearly. The introduction would be even more transparent with an explicit statement of the particular Warehouse 4.0 challenges this study addresses.
3. Provide more details regarding how 3D printed models were utilized during the laboratory experiment and what kind of metrics were measured.
4. A discussion of implications is needed in the results section. Compare critically with existing systems or methodologies to clearly show the innovation and efficiency of the proposed system.
5. The resolution and clarity of figures and diagrams, like Figure 1 and Figure 3, should be improved to better understand the workflow of the system architecture. Captions could be more descriptive.
6. Also, Figure 5 relates to the architecture of the neural network. It would be helpful to include a short explanation of the critical components directly under the figure to assist readers unfamiliar with these details.
7. The conclusion should focus more on concrete outcomes. Restate your main findings and how they can be applied in a real warehouse or manufacturing environment.
8. Add some clear future research directions.
9. Insert the paragraph between lines 130-137 into the preceding paragraph.
If these aspects are addressed, the paper can meet the publication standards expected for a contribution to the field.
Comments on the Quality of English Language
The quality of language is moderate and needs editing.
Author Response
Comments 1: The abstract needs to be enhanced to focus more on the practical implications and measurable effects of the study.
Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have revised the abstract to emphasize the practical implications and measurable outcomes of the system, particularly in real-world production environments. These changes can be found on page 1, lines 15-30.
Comments 2: The problem statement must be outlined more vividly. The introduction would be even more transparent with the statement of those particular challenges this study addresses in the case of Warehouse 4.0.
Response 2: We appreciate your suggestion. We have clarified the problem statement in the introduction, more explicitly addressing the challenges faced in Warehouse 4.0. These changes have been implemented in lines 50-70.
Comments 3: Provide more details regarding how 3D-printed models were utilized during the laboratory experiment and what kind of metrics were measured.
Response 3: Thank you for this suggestion. We have included additional details regarding the use of 3D-printed models in the laboratory experiment and provided information on the metrics that were measured. This can be found on lines 431-437.
Comments 4: The implication discussions in the results section are needed. Compare critically with any existing system or methodology to clearly show the innovation and efficiency of the system proposed.
Response 4: We have expanded the discussion section to critically compare the proposed system with existing methodologies, highlighting the innovation and efficiency of our system. These additions are on lines 564-576.
Comments 5: The resolution and clarity of figures and diagrams, like Figure 1 and Figure 3, should be improved to better understand the workflow of the system architecture. Captions could be more descriptive.
Response 5: We have improved the resolution of Figures 1 and 3 and made the captions more descriptive for better clarity. These changes are reflected in the updated manuscript.
Comments 6: Figure 5 is related to the architecture of the neural network. It is better to repeat a short explanation of the critical components directly under the figure to help readers who do not know these details.
Response 6: As suggested, we have added a brief explanation of the critical components of the neural network architecture directly beneath Figure 5 to assist readers unfamiliar with these details. This can be seen on lines 327-335.
Comments 7: The conclusion should be more focused on concrete outcomes. Repeat your main findings and how those can be applied in a real warehouse or manufacturing environment.
Response 7: We have revised the conclusion to focus more on concrete outcomes and how these findings can be applied in real-world warehouse and manufacturing environments. This can be found on lines 548-554.
Comments 8: Add some clear future research directions.
Response 8: A section outlining future research directions has been added at the end of the conclusion. This includes expanding the product database and integrating predictive analytics. This is found on lines 576-590.
Comments 9: Insert the paragraph between lines 130-137 into the preceding paragraph.
Response 9: The paragraph between lines 130-137 has been moved as requested to better align with the flow of the section. Now it can be found on lines 179-187.
We trust that these revisions address your comments and improve the manuscript significantly. We appreciate your constructive feedback and believe it has enhanced the overall quality of the paper.
Reviewer 2 Report
Comments and Suggestions for Authors
Manuscript ID – eng-3211697
Line 79: the acronym DFMA is mentioned for the first time, but its full name is not given. Neither the introduction nor the materials and methods section contains a reference for Design for Manufacturing, Assembly, and Disassembly. I suggest incorporating some citations where this research methodology is used for optimizing design and processes in manufacturing and construction.
Line 85: There is a typo: Section 3 presents...
Line 93: As mentioned before, I suggest incorporating some citations referring to the DFMA principle.
Line 120: I suggest working with the term Convolutional Neural Network. The main methodology for object detection is the CNN architecture YOLO, so I think it is better to state from the beginning that a CNN is integrated into the system.
Figure 1 has to be improved. For instance, the block “Decision maker” has a misaligned name. The variable names are also misaligned.
Line 144: Do the input images have a size of 1920 x 1080 pixels? It is not clear whether this is the same size used as input for the network. Which programming language is used to manipulate the images?
Line 189: regarding "deep neural network", I suggest deciding on one term: NN, DNN, or CNN.
Figure 4 has to be improved. I cannot see the microflow clearly; it is difficult to read. I suggest a high-quality image.
Line 226: I suggest working with the term Convolutional Neural Network, because it is the heart of YOLO. As mentioned in the comment for line 189, it would be better to focus on one term; it is confusing to alternate between Neural Network and Convolutional Neural Network for a single method. I also suggest specifying the hyperparameter values for the architecture used. For instance, how many epochs were used? What was the learning rate?
Also, I suggest incorporating some citations on the object detection problem using ML (Machine Learning). It is not clear that YOLO is the only option. YOLO is a one-stage detector; do you think a two-stage detector, such as RCNN or Fast RCNN, could work as well? I suggest clarifying this point.
Figure 5 has to be improved. I suggest a high-quality image of the YOLO architecture, with a diagram similar to Figure 3 of this reference: https://link.springer.com/article/10.1007/s11760-023-02835-1
Table 1 indicates that the input image size is (3, 640, 640), but the original image size is 1920 x 1080. How do you reshape the images? I suggest describing the procedure for obtaining the input images in detail; it would be better if an example could be shown. It is not clear that the image in Figure 6 is the input image (I assume so, because it is indicated as a training image, but its size is not clear). I suggest showing the difference, if there is any, between the original images and the input images.
In Table 1, I think the Output size values for CSP1 should be centered.
Line 282: It is indicated that 174 images were used for training and 20 images for validation. How long did the training take? Once the model is trained, how long does it take to evaluate an image? It is important to highlight the computational time, because the object detection is integrated into another system.
Line 326/327: It is indicated that, according to the results, the model is capable of generalizing and recognizing objects of different shapes and scales. I suggest discussing this point a bit more. It is not clear that YOLO can handle different scales in all scenarios. Perhaps it is possible in this case because the objects are not similar, or because the distance between the objects is sufficient; however, I am not sure whether YOLO can cope with a huge number of objects or a dense scene of objects. I suggest clarifying, for instance, the maximum number of objects that can appear in an image. I also suggest adding a figure showing the 8 classes that the model can detect.
I suspect that Python was used in some part. I suggest clarifying the programming language used for the whole system.
Author Response
Comments 1: Line 79: the acronym DFMA is mentioned for the first time, but the name is not given. In the introduction and material and methods, there is no reference for Design for Manufacturing, Assembly, and Disassembly. I suggest incorporating some citations where this research methodology is used for optimizing design and processes in manufacturing and construction.
Response 1: Thank you for pointing this out. We agree with this comment. Therefore, we have now provided the full name "Design for Manufacturing, Assembly, and Disassembly" on its first mention. Additionally, we have incorporated citations where DFMA is applied in optimizing design and processes. These changes can be found on lines 125-126 and 141-152.
Comments 2: Line 85: There is a typo: Section 3 presents...
Response 2: Thank you for pointing this out. The typo has been corrected.
Comments 3: Line 93: As I mentioned before, I suggest incorporating some citations referred to the DFMA principle.
Response 3: We appreciate your suggestion. Citations referring to the DFMA principle have been added to the relevant part of the introduction. These changes can be found on lines 141-152.
Comments 4: Line 120: I suggest working with the term Convolutional Neural Network. The main methodology for object detection is the CNN architecture YOLO. Then, I think it is better to refer from the beginning that the CNN is integrated into the system.
Response 4: We agree with your comment. Therefore, we have clarified from the beginning that the Convolutional Neural Network (CNN), specifically the YOLO architecture, is integrated into the system.
Comments 5: Figure 1 has to be improved. For instance, the block “Decision Maker” has a misaligned name. The variable names are also misaligned.
Response 5: We have updated Figure 1, aligning the "Decision Maker" block and correcting the misaligned variable names. The improved figure is now clearer and easier to follow.
Comments 6: Line 144: Do the input images have a size of 1920x1080 pixels? It is not clear if it is the same size as input for the network. Which programming language is used to manipulate the images?
Response 6: Thank you for your observation. We have clarified that the input images have a resolution of 1920x1080 pixels and are resized to 640x640 pixels for the neural network input. The images and system are processed using Python. These changes can be found on lines 462-464; the algorithm is given on lines 337-372.
Comments 7: Line 189: I suggest deciding between NN, DNN, or CNN and using the term consistently.
Response 7: We agree with your suggestion. The manuscript has been revised to consistently use "Convolutional Neural Network (CNN)" as the primary term.
Comments 8: Figure 4 has to be improved. I cannot observe very well the microflow. It is difficult to read. I suggest a high-quality image.
Response 8: We have replaced Figure 4 with a high-resolution image, making the microflow clearer and easier to read.
Comments 9: Line 226: I suggest working with the term Convolutional Neural Network because it is the heart of YOLO. Also, specify the hyperparameter values for the architecture used, such as epochs and learning rate.
Response 9: We agree with your comment. The manuscript has been revised to consistently refer to Convolutional Neural Networks (CNN) and include detailed hyperparameter values for the YOLO architecture, including epochs and learning rate. This can be found on lines 450-451.
Comments 10: Also, I suggest incorporating some citations related to object detection using Machine Learning. Do you think that a two-stage detector could work as well?
Response 10: Thank you for your suggestion. We have added citations discussing object detection using Machine Learning and provided a justification for why YOLO, as a one-stage detector, was chosen over two-stage alternatives such as RCNN. These changes can be found on lines 294-320.
Comments 11: Figure 5 has to be improved. I suggest a high-quality image with the architecture of YOLO.
Response 11: We have updated Figure 5 with a higher-quality image of the YOLO architecture, as suggested.
Comments 12: In Table 1, the input image size is listed as (3,640,640), but the original image size is 1920x1080. How do you reshape the images?
Response 12: We have added a detailed explanation of how the original images are resized from 1920x1080 to 640x640 for input into the YOLO model. An example of this process has also been provided. This can be found on lines 337-372.
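As a general illustration only (the manuscript's exact procedure is described on the lines cited above, and this sketch is not the authors' code), a common way to map a 1920x1080 frame onto YOLO's square 640x640 input is aspect-preserving "letterbox" resizing: scale the image so its longer side fits the target, then pad the shorter side symmetrically. The function name `letterbox_params` below is a hypothetical helper, not taken from the manuscript:

```python
def letterbox_params(width, height, target=640):
    """Compute the scaled size and symmetric padding that map an image
    of (width, height) onto a square target canvas, preserving aspect ratio."""
    scale = target / max(width, height)           # shrink so the longer side fits
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_x = (target - new_w) // 2                 # left/right padding, in pixels
    pad_y = (target - new_h) // 2                 # top/bottom padding, in pixels
    return new_w, new_h, pad_x, pad_y

# A 1920x1080 frame scales to 640x360 and is padded with 140-pixel
# bars above and below to reach the 640x640 network input.
print(letterbox_params(1920, 1080))  # -> (640, 360, 0, 140)
```

Naive stretching to 640x640 would distort object shapes, which is why the padding variant is the usual choice for detection pipelines.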
Comments 13: In Table 1, the values for the Output size of CSP1 should be centered.
Response 13: The formatting issue has been corrected, and the output size for CSP1 in Table 1 is now centered.
Comments 14: Line 282: It is indicated that 174 images were used for training and 20 for validation. How long did the training take? How long does it take to evaluate an image?
Response 14: We have added information on the time taken for training and for evaluating images. This is especially important, as object detection is integrated into the overall system. These details can be found on lines 438-447 and 464-465.
Comments 15: Line 326/327: Discuss in more detail how YOLO generalizes and handles objects of different shapes and scales.
Response 15: We have described this in more detail in the methodology section, on lines 373-419.
Comments 16: I suspect that Python was used in some part. I suggest clarifying the programming language used for all the system.
Response 16: We have clarified that Python was used for image manipulation and the integration of the YOLO architecture, as well as for the overall system.
We believe these revisions address your comments thoroughly and have significantly improved the manuscript. Thank you once again for your valuable feedback.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The author has implemented all my remarks and suggestions. The manuscript has been enhanced accordingly.
Author Response
Thank you very much for your positive feedback. We are grateful for your valuable insights and suggestions, which greatly contributed to improving the quality of our manuscript. We appreciate your time and effort in reviewing our work.
Reviewer 2 Report
Comments and Suggestions for Authors
The image quality and the methodology have been improved. I have a few comments related to text editing:
Line 352: I suggest using "equation 2."
Line 361: I suggest using "equations 4 and 5."
Line 370: I suggest using "equation 6."
Author Response
Comments 1: Line 352: I suggest using "equation 2."
Response 1: Thank you for your suggestion. We have revised the text to use "equation 2" as recommended. This change can be found on line 352.
Comments 2: Line 361: I suggest using "equations 4 and 5."
Response 2: We appreciate your input. We have updated the text to "equations 4 and 5" as suggested. This change is reflected on line 361.
Comments 3: Line 370: I suggest using "equation 6."
Response 3: Thank you for pointing this out. We have revised the text to refer to "equation 6" for consistency. This update has been made on line 370.
We believe these text revisions enhance the clarity and consistency of the manuscript. Thank you again for your valuable feedback. We appreciate your time and effort in reviewing our work.