1. Introduction
Wheat is the second-largest food crop in the world and is crucial for food security and social stability [1]. Wheat growth monitoring refers to recording the morphological changes in wheat during different growth and development stages [2]. It is critical on smart farms for obtaining high yields and is often performed using unmanned aerial vehicles (UAVs) and intelligent agricultural machinery [3]. Owing to technological advances in smart agriculture, intelligent agricultural machinery, and mobile devices, deep learning (DL) models and algorithms have been increasingly used in this field [4]. However, mobile devices have relatively low computing power, low battery capacity, and high energy consumption, whereas DL-based agricultural applications require devices with high computing power, high battery capacity, and low energy consumption to provide long working hours and good service quality. Thus, an imbalance exists between the high computing demands of smart agriculture and the low computing power of mobile devices. Therefore, it is necessary to develop a lightweight DL model capable of running on intelligent mobile devices for wheat growth monitoring. As the use of artificial intelligence has increased, deep reinforcement learning (DRL) has attracted extensive attention from the academic community [5]. The data generated by users show exponential growth, promoting the rapid development of DRL. The deep Q-learning network (DQN) is a reinforcement learning algorithm based on a neural network [6]. It combines the learning ability of neural networks with the decision-making ability of reinforcement learning and can make decisions in a timely and intelligent manner according to changes in the environment [7].
Edge computing is an ideal solution for real-time applications because it uploads the core parameters or data of the DRL model to the network edge for processing [8,9]. Running a DQN on an intelligent mobile device causes high battery energy consumption, and the model’s identification efficiency depends on the quality of the network service when it runs on a remote server. Therefore, the server location, the power of the mobile device, and the quality of the network service must be carefully considered so that the offloading strategy of the edge nodes can be adapted to the environment. This approach uses relatively few resources to obtain optimal results and reduces the communication and computing costs of edge computing [10,11]. In mobile edge computing, migration moves intensive computing tasks to the wireless network edge server for processing, alleviating the low computing power, poor real-time performance, and high power consumption of intelligent devices. This technology has attracted the attention of academia and industry [12,13,14,15], especially regarding optimal migration decisions and the allocation of computing resources [16,17]. Chen et al. [18] proposed a DRL-based task offloading and scheduling method for offloading decisions with dependencies in mobile edge computing. The goal was to minimize the application’s execution time. Experiments showed that the proposed algorithm has good convergence, verifying the effectiveness and reliability of the method. Tian et al. [19] deployed a cognition model at the edge and designed an intelligent recognition device based on computer vision and edge computing for crop pest image recognition. Agricultural crop images were collected in real time, and image recognition was used to identify crop pests. Zhang et al. [20] proposed an improved algorithm called the natural deep Q-learning network (NDQN) for resource scheduling and decision-making in edge computing. The results showed that the improved NDQN algorithm outperformed the local offloading and random offloading algorithms. Gu et al. [21] designed an embedded monitoring system based on edge computing that considered the different planting conditions of crops in different regions. They established neural networks and crop data processing algorithms, deployed them in embedded devices, and used UAVs for crop monitoring. However, most of the above studies designed migration algorithms for relatively large computing tasks and complex models [22,23,24,25,26], whereas few studies have designed migration strategies or algorithms based on lightweight recognition models for intelligent agricultural production scenarios.
Wheat is an important grain crop grown extensively worldwide. Wheat growth monitoring algorithms have high computational complexity, many parameters, and long task execution times; they require extensive computing resources and sufficient battery power, which general migration algorithms and intelligent equipment cannot provide. This paper proposes a lightweight wheat growth stage detection model for intelligent devices. The detection model is migrated to the wireless network edge server for processing to reduce energy consumption and computing time, and the decision cost of intelligent devices is simulated as the weighted sum of the battery energy consumption and computing time delay. The DQN algorithm is used to obtain the optimal output model because it reduces the energy consumption and computing time delay of the DL model. The proposed method enables complex computing tasks on intelligent mobile devices in smart agriculture, and its use for the accurate identification of wheat growth stages is demonstrated. The innovations of this study are as follows:
- 1.
A wheat growth stage detection model that uses depth-wise separable convolutional layers and a residual network is designed. It has low energy consumption and computing delay and high accuracy in distinguishing the seedling stage (SS), tillering stage (TS), overwintering stage (OS), greening stage (GS), and jointing stage (JS). The average recognition accuracy of the five wheat growth stages is 98.6%, compared with 99.2% for the DenseNet model.
- 2.
A dynamic migration algorithm for the wheat growth detection model is designed using the DQN. The algorithm makes optimal migration decisions by monitoring the power consumption and network service quality of the equipment in real time, considering the energy consumption and delay costs caused by migration and non-migration. At a wireless network transmission data rate of 0–8 MB/s, the overall energy consumption loss of the dynamic migration algorithm is 128.4% lower than that of the intelligent device alone.
In this paper, an artificial intelligence algorithm and experiments are used to identify wheat growth stages. A decision-making method is proposed for performing edge computing and migrating the wheat growth stage detection model to the wireless network edge server for processing. The DQN-based dynamic migration strategy of the identification model enables the execution of complex processes while minimizing energy consumption and processing time. This method is suitable for deploying application systems in agriculture. This paper is organized as follows: Section 1 presents the introduction; Section 2 describes the materials and methods; Section 3 provides the wheat growth stage detection model; Section 4 presents the migration algorithm; Section 5 describes the results; and Section 6 provides the discussion.
4. Design of Migration Algorithm
The proposed wheat growth stage detection model has low battery energy consumption and delay. However, intensive computing is needed to perform intelligent fault monitoring in smart agriculture. When there are many computing tasks, moving them to the edge server improves crop monitoring efficiency. However, dynamic changes in the computing scenarios and the quality of service of the wireless network may result in inadequate performance when tasks are executed at the edge. Therefore, intelligent mobile devices must dynamically decide whether to offload computing tasks to the edge of the network. When the wireless network transmission rate is high and the intelligent device has sufficient power, it is suitable to offload the task to the edge server, resulting in high performance. In contrast, when the wireless network transmission rate is low and the device power is insufficient, the task cannot be moved to the edge for processing. However, in real scenarios, it is often impossible to determine statically whether task offloading is beneficial because of the dynamic changes in the computing environment and the wireless network’s quality of service.
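The naive rule described above (offload only when the transmission rate is high and the device power is sufficient) can be sketched with fixed thresholds. The threshold values and function name below are illustrative assumptions, not values from this study; this is exactly the kind of static rule that the DQN-based algorithm later replaces with learned decisions:

```python
# Illustrative sketch (not the paper's algorithm): a static offloading
# rule with fixed thresholds. RATE_MIN (MB/s) and POWER_MIN (battery
# fraction) are hypothetical values chosen for demonstration only.

RATE_MIN = 2.0    # minimum wireless transmission rate to justify offloading
POWER_MIN = 0.2   # minimum residual battery fraction to afford uploading

def static_offload_decision(rate_mbps: float, battery_frac: float) -> int:
    """Return 1 to offload the task to the edge server, 0 to run locally."""
    if rate_mbps >= RATE_MIN and battery_frac >= POWER_MIN:
        return 1  # fast network and enough power: offload
    return 0      # slow network or low power: compute on the device

print(static_offload_decision(6.0, 0.8))  # fast network, charged battery -> 1
print(static_offload_decision(0.5, 0.8))  # slow network -> 0
```

Such fixed thresholds cannot adapt when the environment changes, which motivates the learned, dynamic decision policy of the following subsections.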
4.1. Design for Dynamic Migration Algorithm with a Mobile Terminal
The residual power of a mobile device is a valuable energy resource when computing services are migrated to the mobile edge. In addition to variable factors such as the dynamic characteristics of the mobile device’s environment, especially the network conditions, many factors determine the migration decision of mobile devices. The strong perception ability of DRL can be used to learn the state information of the environment and modify the decision-making so that mobile users can complete the computing task at the lowest cost. The DQN is a reinforcement learning algorithm based on a neural network. It combines the learning ability of a neural network with the decision-making ability of reinforcement learning and makes intelligent decisions in a timely manner according to the changing environment [30]. The proposed dynamic migration algorithm makes the optimal decision by monitoring the power and wireless network speed of the device in real time, considering the energy consumption and delay cost caused by the offloading/non-offloading decision, thereby minimizing the calculation delay and power consumption.
The environment state, consisting of the device’s residual power and the wireless network speed, is used as the input of the DQN. An ε-greedy method makes occasional random action selections to prevent the network from falling into a local minimum.
Figure 4 shows the flowchart of the algorithm. The DRL model considers five key elements [14]: the environment, agent, action, state, and reward (with penalties as negative rewards).
The following equation expresses the Q-value update of the DRL model:

Q(s, a) ← Q(s, a) + α[r + γ·max_a′ Q(s′, a′) − Q(s, a)] (1)

The state s is the input of the deep Q-learning network. An ε-greedy method is used to obtain the Q value, making occasional random selections to prevent the network from falling into a local optimum. The current action a in state s is executed to obtain the feature vector corresponding to the new state s′, the reward r, and the termination flag done. The tuple (s, a, r, s′, done) is stored in the experience pool, and the agent samples these experiences to learn the current value Q(s, a).
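The loop described above can be sketched minimally. This sketch uses a tabular Q function and a toy two-action space (0 = process locally, 1 = offload) instead of the paper’s neural network; the hyperparameter values are illustrative assumptions:

```python
import random
from collections import deque

# Minimal sketch of the DQN loop described above: epsilon-greedy action
# selection, an experience pool of (s, a, r, s', done) tuples, and a
# Q-update toward the target r + gamma * max_a' Q(s', a'). The tabular
# Q dictionary stands in for the paper's neural network, for clarity.

GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1
ACTIONS = [0, 1]                 # 0 = local processing, 1 = offload to edge
Q = {}                           # Q[(state, action)] -> estimated value
pool = deque(maxlen=1000)        # experience replay pool

def q(s, a):
    return Q.get((s, a), 0.0)

def select_action(s):
    if random.random() < EPSILON:                   # random exploration
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q(s, a))      # greedy exploitation

def learn(batch_size=32):
    # Sample stored experiences and move Q(s, a) toward the target value.
    batch = random.sample(pool, min(batch_size, len(pool)))
    for s, a, r, s2, done in batch:
        target = r if done else r + GAMMA * max(q(s2, b) for b in ACTIONS)
        Q[(s, a)] = q(s, a) + ALPHA * (target - q(s, a))
```

In use, the agent repeatedly observes the state (network speed, power), calls `select_action`, executes the decision, appends the resulting transition to `pool`, and calls `learn`.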
4.2. Energy Consumption and Calculation Delay of Wheat Growth Stage Detection Model
A mathematical model was established to calculate the energy consumption and delay of the wheat growth stage detection model. The processing information of the mobile device is represented as a quaternion, M_i = (c_w, u_w, d_w, f_s), where c_w is the CPU power of the mobile device, u_w and d_w are the power consumed by the mobile device to upload and download data, respectively, and f_s is the number of floating-point operations per second. The wireless network status is represented as a binary group, S_i = (v_u, v_d), where v_u represents the upload speed and v_d represents the download speed of the wireless network. The decision space is defined as x_i ∈ {0, 1}, where x_i = 0 denotes that the task is processed on the intelligent device, and x_i = 1 denotes that the task is offloaded to the edge server for processing. The delay includes the calculation delay and the communication delay: when x_i = 0, T_m represents the calculation delay of the mobile device, and when x_i = 1, T_e represents the calculation delay of the edge server. The communication delay is represented by T_c, as shown in Equations (2) and (3):

T_m = F / f_s (2)

T_c = D_u / v_u + D_d / v_d (3)

where F represents the number of floating-point operations required by the mobile device’s CPU to complete the computing task, and D_u and D_d represent the sizes of the uploaded and received data, respectively. The energy consumption consists of the computing energy consumption and the communication energy consumption; only the energy consumption of the mobile device is considered. The computing energy consumption and the communication energy consumption are calculated by Equations (4) and (5), respectively:

E_m = c_w · T_m (4)

E_c = u_w · D_u / v_u + d_w · D_d / v_d (5)
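Assuming the standard forms for these quantities (computation delay = work/speed, communication delay = data size/transmission rate, energy = power × time), the two cost branches of the decision space can be sketched as follows; all numbers in the usage example are hypothetical:

```python
# Sketch of the delay/energy model of Section 4.2, under the assumption
# that computation delay = F / f_s, communication delay =
# D_u / v_u + D_d / v_d, and energy = power * time.

def local_cost(F, f_s, c_w):
    """Delay (s) and energy (J) when the task runs on the device (x_i = 0)."""
    t_m = F / f_s          # computation delay on the mobile CPU
    e_m = c_w * t_m        # computing energy = CPU power * time
    return t_m, e_m

def offload_cost(D_u, D_d, v_u, v_d, u_w, d_w, t_e):
    """Delay (s) and device energy (J) when offloaded to the edge (x_i = 1)."""
    t_c = D_u / v_u + D_d / v_d               # communication delay
    e_c = u_w * D_u / v_u + d_w * D_d / v_d   # radio energy while transferring
    return t_e + t_c, e_c                     # edge computes; device transmits

# Hypothetical example: 2 GFLOP task, 1 GFLOPS device CPU at 2 W.
print(local_cost(2e9, 1e9, 2.0))   # -> (2.0, 4.0)
```

Note that when x_i = 1 the device spends energy only on communication; the edge server’s computing energy is excluded because only the mobile device’s battery is modeled.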
4.3. Design of Agent
After defining the energy consumption and time delay, the agent’s ability to evaluate these two parameters and decide whether to migrate the services must be determined. The DQN evaluates the energy consumption and time delay dynamically: the weight of the energy consumption is small when the mobile device has ample residual power and large when the power is low, regardless of whether the services are migrated. The time delay has an analogous weight parameter.
Figure 5 shows the structure of the agent. During the training of the DQN algorithm, the agent learns useful information as the environment changes. The agent is used to simulate the decision-making and calculation processes of intelligent devices. After the agent inputs the network and electricity status into the neural network, it calculates the energy consumption and time delay of the decision results and evaluates the decision quality to assess the rewards and penalties. Because the input consists of only two parameters (the network speed and power), the agent uses a small back propagation (BP) neural network to simulate the decision-making of intelligent devices.
Figure 5 shows that the BP neural network for decision-making has four hidden layers, and the activation function is the leaky ReLU. The decision results are obtained by inputting the network speed and power, and the agent learns using the reinforcement learning algorithm. The delay and energy consumption costs are expressed by Equations (6) and (7), which are combined into Equation (8) to optimize the time delay and energy consumption jointly:
C_t = (1 − x_i) · T_m + x_i · (T_e + T_c) (6)

C_e = (1 − x_i) · E_m + x_i · E_c (7)

C = α · C_t + β · C_e (8)

where C_t and C_e represent the delay and energy consumption costs after the agent has made a decision, and C represents the weighted sum of the delay and energy consumption costs. α and β are the delay and energy consumption coefficients, indicating the relative importance of the delay and the energy consumption. When the power is low, the energy consumption coefficient β is high, and when the network speed is high, the delay coefficient α is high.
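The agent’s cost evaluation can be sketched as follows. The coefficient-adaptation scheme (normalized weights derived from the battery level and network rate) and the 8 MB/s reference rate are illustrative assumptions, not the paper’s exact formulas:

```python
# Sketch of the weighted-sum cost C = alpha * C_t + beta * C_e, with
# coefficients adapted to the device state as described above: low power
# raises the energy weight beta, and a fast network raises the delay
# weight alpha. The adaptation formula here is an assumption.

def coefficients(battery_frac: float, rate_mbps: float, rate_max: float = 8.0):
    beta = 1.0 - battery_frac                # low battery -> energy matters more
    alpha = min(rate_mbps / rate_max, 1.0)   # fast network -> delay matters more
    total = alpha + beta or 1.0              # avoid division by zero
    return alpha / total, beta / total       # normalize so alpha + beta = 1

def weighted_cost(delay_s, energy_j, battery_frac, rate_mbps):
    alpha, beta = coefficients(battery_frac, rate_mbps)
    return alpha * delay_s + beta * energy_j

def best_decision(local, edge, battery_frac, rate_mbps):
    """Return 0 (local) or 1 (offload), whichever has the lower cost C.

    local and edge are (delay_s, device_energy_j) pairs for the two branches.
    """
    c0 = weighted_cost(*local, battery_frac, rate_mbps)
    c1 = weighted_cost(*edge, battery_frac, rate_mbps)
    return 0 if c0 <= c1 else 1
```

With a full battery and a fast network the delay term dominates and local execution wins when it is faster; with a depleted battery the energy term dominates and offloading wins when it saves device energy, mirroring the behavior the DQN is trained to learn.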