In this section, we focus on evaluating the performance of our proposed cloud-edge collaborative pruning algorithm and compare it with the common centralized pruning method. We use the previously described dataset and model architecture to verify the effect of cloud-edge collaborative pruning at different pruning rates. To better observe the overall trends, some experimental results are presented with smoothing to highlight general patterns while retaining the raw experimental data.
In terms of the experimental environment, a high-performance server equipped with an NVIDIA GeForce GTX 1080 Ti accelerator card served as the central cloud server, three personal computers served as edge cloud servers, ten drones equipped with low-power AI computers served as edge terminals, and a Huawei S5720S-28P-LI-AC Gigabit Ethernet switch provided the networking. The DCECOA was implemented in Python 3.6.13, communication was realized with socket programming, and the PyTorch 1.8.0 framework was used to modify the deep neural network structure. For the simulations and analyses, we selected the ResNet-18 architecture for its balance between computational complexity and suitability for resource-constrained UAV networks. Training and testing data were sourced from the CIFAR-10 dataset to ensure data quality and consistency across experiments. To avoid additional complexity, direct sensor data from the UAVs were not utilized.
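Since the text does not include the transport code, the following minimal sketch (our illustration, not the authors' implementation) shows one common way to exchange serialized PyTorch objects over a TCP socket using a length prefix:

```python
import io
import socket
import struct

import torch

def send_obj(sock: socket.socket, obj) -> None:
    """Serialize an object with torch.save and send it with an 8-byte length prefix."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    payload = buf.getvalue()
    sock.sendall(struct.pack("!Q", len(payload)) + payload)

def recv_obj(sock: socket.socket):
    """Receive a length-prefixed payload and deserialize it with torch.load."""
    (length,) = struct.unpack("!Q", _recv_exact(sock, 8))
    return torch.load(io.BytesIO(_recv_exact(sock, length)))

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        data += chunk
    return data
```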
The distributed cloud-edge collaborative optimization algorithm requires several key parameters. The first parameter specifies the number of convolution kernels to be pruned between the terminal and the central cloud server in each round of communication; it was set to 32. The second parameter, representing the maximum percentage of convolution kernels that may be pruned out of the total number of convolution kernels, was set to 80%. The third parameter specifies how many rounds of convolution kernel pruning the terminal and central cloud server perform in total, that is, the total number of communications between the two; it is calculated dynamically from the initial number of convolution kernels in the model together with the first two parameters. For a neural network model such as AlexNet, the calculated value of this parameter is 28. We set the number of central cloud servers to 1, the number of edge clouds to 3, and the number of drones to 10.
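As a worked example of how the round count can be derived (a sketch under our assumptions, not the paper's code; the kernel count assumes the standard torchvision AlexNet layout):

```python
import torch.nn as nn
from torchvision.models import alexnet

def count_conv_kernels(model: nn.Module) -> int:
    # Each Conv2d layer contributes out_channels kernels (filters).
    return sum(m.out_channels for m in model.modules() if isinstance(m, nn.Conv2d))

KERNELS_PER_ROUND = 32   # kernels pruned per communication round
MAX_PRUNE_RATIO = 0.80   # at most 80% of all kernels may be pruned

total_kernels = count_conv_kernels(alexnet())  # 64+192+384+256+256 = 1152
rounds = int(total_kernels * MAX_PRUNE_RATIO) // KERNELS_PER_ROUND
print(total_kernels, rounds)  # 1152 28 -- matching the 28 rounds cited above
```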
5.1. Pruning Optimization Only
This subsection compares the proposed cloud-edge collaborative pruning algorithm with the centralized pruning method at different pruning rates, without any fine-tuning after pruning. In particular, we show the advantages of the algorithm in maintaining model accuracy and analyze the data transmission overhead incurred during the pruning process.
Figure 3 shows the relationship between pruning rate and model accuracy for the centralized pruning method. The following trends are observed. When the pruning rate is low (less than 30%), the accuracy of the neural network fluctuates within a narrow range and decreases relatively little. In the pruning rate range of 30% to 40%, the accuracy hardly fluctuates, and the accuracy on the test set remains at around 75%. This stability suggests that pruning at these rates does not harm model performance. However, when the pruning rate exceeds 40%, the accuracy of the model drops sharply. It should be emphasized that these results involve only pruning of the convolution kernels; no subsequent fine-tuning of the neural network was performed. This shows that the neural network contains redundant parameters and that convolution kernels can be pruned according to the specific task to reduce network complexity.
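The paper does not spell out the importance criterion used; the sketch below assumes a common L1-norm criterion and approximates kernel removal by zeroing the selected filters (structural removal would also require adjusting the following layer):

```python
import torch
import torch.nn as nn

def rank_kernels(model: nn.Module):
    """Score every convolution kernel by its L1 norm, least important first."""
    scores = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            # weight shape: (out_channels, in_channels, kH, kW) -> one score per kernel
            norms = module.weight.detach().abs().sum(dim=(1, 2, 3))
            scores.extend((name, k, float(v)) for k, v in enumerate(norms))
    return sorted(scores, key=lambda s: s[2])

def prune(model: nn.Module, prune_rate: float) -> None:
    """Zero out the lowest-scoring kernels; no fine-tuning is performed here."""
    ranked = rank_kernels(model)
    modules = dict(model.named_modules())
    with torch.no_grad():
        for name, k, _ in ranked[: int(len(ranked) * prune_rate)]:
            modules[name].weight[k].zero_()
```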
Figure 4 shows the relationship between pruning rate and model accuracy for the cloud-edge collaborative pruning method. These experiments were conducted in a data-dispersed environment, focusing on convolution kernel pruning of the neural network with no fine-tuning of the model after pruning. It is worth emphasizing that the overall trend of the cloud-edge collaborative pruning method closely mirrors that of the centralized pruning method. When the pruning rate is below 20%, the accuracy decreases relatively little; in the pruning rate range of 26% to 40%, the model accuracy can be maintained at a stable level of about 80%. Within this range, the cloud-edge collaborative pruning method shows a clear performance advantage over centralized pruning. Only when the pruning rate exceeds 45% does the accuracy of the neural network model drop sharply.
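In the collaborative setting, each terminal evaluates kernel importance on its local data and the central cloud server merges these evaluations before deciding what to prune. The aggregation rule is not given in the text; a minimal sketch assuming simple averaging:

```python
import numpy as np

def select_kernels_to_prune(terminal_scores, k=32):
    """Cloud side: merge per-terminal importance arrays (one row per terminal,
    one column per kernel) and return the k globally least-important kernels."""
    global_scores = np.stack(terminal_scores).mean(axis=0)
    return np.argsort(global_scores)[:k]

# Example: three terminals, each reporting scores for 1152 kernels.
reports = [np.random.rand(1152) for _ in range(3)]
prune_ids = select_kernels_to_prune(reports, k=32)
```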
In comparing the performance of ordinary pruning and cloud-edge collaborative pruning, Figure 5 confirms that cloud-edge collaborative pruning consistently outperforms centralized pruning; each curve in the figure represents the average of five trials. The advantage is particularly evident at higher pruning rates, where cloud-edge collaborative pruning maintains model accuracy better. This highlights the potential and effectiveness of multilevel cloud-edge collaborative pruning algorithms in addressing the accuracy degradation commonly encountered during neural network pruning.
As can be seen from the data in Figure 6, only about 1.5 MB of data were sent during the entire pruning process, while the amount of data received reached 5 GB. This is because the multilevel cloud-edge collaborative pruning algorithm transmits over the network only the terminal's convolution kernel importance evaluation data for the local neural network model; these uploads are very small and, structurally, are simple two-dimensional arrays. The volume of received data, in contrast, is driven mainly by the terminal device downloading the global model from the central cloud server: between consecutive communication rounds, the terminal must fetch the global model from the central cloud server to update its local model and support the subsequent edge-side collaborative optimization algorithm. Because edge devices usually have high downlink bandwidth and relatively low uplink bandwidth, the amount of data sent by the terminal device is negligible. The effectiveness of cloud-edge collaborative pruning is therefore limited mainly by the downlink bandwidth of the terminal device. However, according to Figure 5, the pruning process in practice only needs to reach a pruning rate of 40%, because model performance becomes unacceptable beyond that point. In this case, the terminal device needs to receive about 3000 MB of data.
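As a rough sanity check on this asymmetry, the per-round payloads can be measured directly; the sketch below uses torchvision's AlexNet and float32 weights as assumptions:

```python
import io

import torch
from torchvision.models import alexnet

def serialized_mb(obj) -> float:
    """Size in MB of an object serialized with torch.save."""
    buf = io.BytesIO()
    torch.save(obj, buf)
    return buf.getbuffer().nbytes / 2**20

upload = serialized_mb(torch.zeros(1152))          # one score per kernel: a few KB
download = serialized_mb(alexnet().state_dict())   # full global model: ~230 MB
print(f"per-round upload ~{upload:.3f} MB, download ~{download:.0f} MB")
```

Over 28 rounds with a model that shrinks as kernels are removed, downloads on the order of a few gigabytes and uploads in the kilobyte-to-megabyte range are consistent with the totals reported above.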
Figure 3 and Figure 4 show a performance comparison between ordinary centralized pruning and cloud-edge collaborative pruning under different pruning rates; cloud-edge collaborative pruning maintains better accuracy at higher pruning rates, indicating a significant advantage in preserving model performance. Figure 5 further confirms this point and highlights the improvement in pruning efficiency achieved by the cloud-edge collaborative pruning method. Meanwhile, Figure 6 quantifies the data-synchronization overhead of cloud-edge collaborative pruning through a comparison of data transmission volumes; although the communication overhead is high, the improvement in model performance makes it worthy of consideration.
5.2. Pruning and Fine-Tuning
The above experiments analyzed the impact of different pruning rates on the performance of neural network models when only pruning is performed. The results show that larger neural network models contain redundant information and that performance can be preserved or improved by removing some convolutional kernels. Comparing ordinary pruning with cloud-edge collaborative pruning, the performance of the cloud-edge collaborative pruning model is less affected by the pruning rate; the only cost is that the client must upload a small amount of data to the central cloud server and synchronize the global model with it.
To further analyze the impact of post-pruning fine-tuning on model performance, we first verified the change in model performance after fine-tuning following ordinary pruning, then after fine-tuning following cloud-edge collaborative pruning, and finally compared the two. Ordinary pruning is performed entirely on the server, while cloud-edge collaborative pruning requires the intermediate data needed by the cloud-side and edge-side algorithms to be transmitted over the network.
The experimental results for fine-tuning after ordinary pruning and after cloud-edge collaborative pruning are shown in Figure 7 and Figure 8, respectively. The horizontal axis represents the pruning rate, i.e., the percentage of convolutional kernels pruned relative to the initial number of convolutional kernels, while the vertical axis represents the accuracy of the neural network model on the test set.
After pruning the convolutional kernels with ordinary pruning, a neural network training pass was run to fine-tune the network parameters. As shown in Figure 7, the accuracy of the fine-tuned model is significantly improved and does not decline noticeably until the pruning rate approaches 50%; at a pruning rate of 50%, the accuracy on the test set can still be maintained at about 80%. Without fine-tuning, the model's accuracy drops below 80% once the pruning rate exceeds 25%, confirming that fine-tuning substantially improves the performance of the pruned model.
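The fine-tuning step itself is ordinary supervised retraining of the pruned network. A minimal sketch on CIFAR-10 follows (the hyperparameters are our assumptions; the paper does not list them):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def fine_tune(model: nn.Module, epochs: int = 5, lr: float = 1e-3) -> None:
    """Retrain a pruned model on CIFAR-10 to recover accuracy."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    loader = DataLoader(
        datasets.CIFAR10("data", train=True, download=True,
                         transform=transforms.ToTensor()),
        batch_size=128, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
```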
After pruning the convolutional kernels with cloud-edge collaborative pruning, the network was retrained using the multilevel cloud-edge collaborative training algorithm. The effect of fine-tuning on cloud-edge collaborative pruning is essentially the same as on ordinary pruning: both raise the upper limit of the usable pruning rate. However, with cloud-edge collaborative pruning the pruning rate can reach 60% while still maintaining 80% accuracy on the test set, an improvement of 10 percentage points over ordinary pruning, as shown in Figure 8.
The experimental results comparing ordinary pruning with fine-tuning against cloud-edge collaborative pruning with fine-tuning are shown in Figure 9 and Figure 10, respectively, along with the corresponding communication overhead. In both figures, the horizontal axis represents the pruning rate, i.e., the percentage of convolutional kernels pruned relative to the initial number of kernels. In the performance comparison (Figure 9), the vertical axis represents the accuracy of the neural network model on the test set, a key performance indicator of the model; in the communication overhead experiment (Figure 10), the vertical axis shows the number of bytes (in MB) sent and received by the client.
Figure 9 shows the results of cloud-edge collaborative pruning and fine-tuning averaged over five experimental runs. The data demonstrate that cloud-edge collaborative pruning combined with fine-tuning achieves a higher pruning rate than centralized pruning while maintaining the same level of model performance. This indicates that the multilevel cloud-edge collaborative pruning algorithm not only surpasses ordinary centralized pruning in pruning efficiency but also gains further when paired with the multilevel cloud-edge collaborative training algorithm. Averaging the results over multiple runs improves the robustness and reliability of the findings, providing a more accurate assessment of the algorithm's ability to maintain model accuracy at higher pruning rates.
Judging from the results, the performance improvement appears modest. One of the primary tradeoffs of DCECOA is its focus on balancing model pruning with minimizing communication overhead, as opposed to solely maximizing the pruning rate. Simpler approaches such as standard centralized pruning may achieve higher pruning rates in static environments where computational resources and bandwidth are abundant; however, these methods often result in increased communication overhead due to the need for frequent model updates and centralized coordination. In contrast, DCECOA is specifically designed to function in UAV networks, where low-latency decision-making and efficient communication are critical, making it more suitable for deployment in dynamic and resource-constrained environments.
Figure 10 demonstrates that the number of bytes sent and received by the client increases significantly when cloud-edge collaborative pruning is combined with fine-tuning. This is because the fine-tuning process requires cloud-edge collaborative training, which involves synchronizing a large number of parameters and models, increasing data transmission. The client typically uploads only the gradient updates of the local model while downloading the entire model from the cloud center, so the amount of data received is much larger than the amount sent. Additionally, as shown in Figure 9, the model's accuracy decreases significantly once the pruning rate exceeds 50%; the pruning rate should therefore be kept at or below 50% to maintain high model performance. Under this condition, the client sends approximately 3000 MB of data and receives about 7000 MB.
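The traffic pattern described above can be mirrored in a terminal-side sketch: per round, the client uploads its local gradient updates and downloads the full global model. The exact update scheme is not specified in the text, so the transport calls below are placeholders (reusing the hypothetical send_obj/recv_obj helpers from the setup sketch):

```python
import torch
import torch.nn as nn

def terminal_round(model: nn.Module, batch, loss_fn=nn.CrossEntropyLoss()):
    """One terminal-side round of collaborative fine-tuning (simplified sketch)."""
    x, y = batch
    model.zero_grad()
    loss_fn(model(x), y).backward()
    # Uploaded: the local gradient updates, one tensor per parameter.
    grads = {name: p.grad.detach().clone() for name, p in model.named_parameters()}
    # send_obj(sock, grads)            # hypothetical transport call
    # state = recv_obj(sock)           # full global model state_dict (the large download)
    # model.load_state_dict(state)     # replace the local model with the global one
    return grads
```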
The communication overhead results in Figure 10 also show that the amount of data sent and received by the client grows significantly as the pruning rate increases; beyond a 50% pruning rate, the transmission demand rises sharply. For practical deployment in UAV networks, this data transmission requirement may pose a series of challenges.
UAV networks often operate in environments with limited bandwidth and unstable connections, especially over long distances or in adverse weather. Transmitting large amounts of data may therefore cause communication delays, data loss, or even system instability, affecting the efficiency and performance of the entire model optimization process. Although cloud-edge collaborative pruning offers clear advantages for model optimization, in bandwidth-limited UAV networks the pruning rate must be weighed against the communication overhead, and the transmission strategy must be optimized, to ensure the feasibility and effectiveness of the model in actual deployment.
In this section, we assess the impact of fine-tuning on neural network models after pruning, comparing the ordinary and cloud-edge collaborative pruning methods.
Figure 7, Figure 8 and Figure 9 reveal that fine-tuning improves the accuracy of both methods, with cloud-edge collaborative pruning supporting up to 60% pruning while maintaining accuracy. However, Figure 10 shows that while cloud-edge collaborative pruning offers higher efficiency, it also leads to significant communication overhead, especially beyond a 50% pruning rate, posing challenges for deployment in bandwidth-constrained UAV networks.