1. Introduction
With the continuous worldwide decline in the availability of fossil energy, the advantages of new energy vehicles have become increasingly prominent [1–3]. Electric vehicles, being clean and pollution-free, have received strong government support [4–6]. In recent years, the shortage of urban land resources has become increasingly acute, and the adoption of multi-level (stereo) charging garages has pushed electric vehicle charging toward unmanned operation [7]. Automatic parking and driverless technologies are gradually maturing; with them, a vehicle can arrive at the parking lot by itself and should then be charged automatically. For publicly used or time-share leased electric vehicles, charging is often delayed after a user returns a vehicle, which degrades the user experience and vehicle utilization. Charging piles are exposed to damage from weather and human factors, manual charging carries significant safety risks, and the DC charging gun cable is heavy, which makes manual plugging difficult [8]. Given these problems, automatic charging of electric vehicles is an urgent problem to be solved.
At present, some companies and research institutions have proposed their own solutions [9–19]. These solutions show that automatic charging of electric vehicles rests on two core parts: the identification and positioning of the charging port (CP), and the plug-in mechanism. The identification and positioning of the CP is the prerequisite for plugging; moreover, the accuracy and universality of CP identification are important guarantees for successful robotic plugging. Therefore, high-precision identification and positioning of the CP is of great significance for realizing automatic charging technology.
At present, the main CP recognition approach is visual positioning, which falls into two categories: (1) recognition aided by artificial features and (2) recognition without artificial features. In the first category, Lv [20] added white labels around the CP, used feature matching for rough positioning, and inserted the plug with compensation from a six-axis force sensor; recognition and positioning accuracy were not reported. Pan et al. [21] added five black-and-white labels around the CP and, from contours extracted with a morphological opening operation, used a geometric solution to calculate the location and posture (LP) of the CP; the LP errors were 1.4 mm and 1.6 degrees, respectively, and the insertion success rate was 98.9%. Among methods without artificial features, Li et al. [22] proposed a CP identification and location method based on the scale-invariant feature transform and semi-global block matching, achieving an average error of 1.51 mm. Zhang et al. [16] improved Canny edge detection combined with a morphology-based CP image correlation algorithm; no recognition accuracy was specified, and the overall insertion success rate was 95.55%. Yao et al. [23] tested the CP LP error indoors using the template matching algorithm in the Halcon commercial vision software, achieving average errors of 2.5 mm in position and 0.8 degrees in angle. Quan et al. [24] tested CP identification accuracy in multiple environments using the cluster template matching algorithm (CTMA); the LP errors were 0.91 mm and 0.87 degrees, respectively, and the plugging success rate was 95%.
In recent years, deep learning has developed rapidly in the field of target recognition, with the emergence of a series of target detection models such as Faster R-CNN, YOLO, and SSD [25–32]. These models have improved the universality of target recognition and, especially for specific targets, significantly improved recognition accuracy under complex scenes and lighting. The YOLO family is widely favored for delivering relatively high accuracy at high speed [33,34].
Based on the above research, existing CP recognition methods can only adapt to a single type of CP. Although CP dimensions follow a unified standard, CPs from different manufacturers, and even different batches from the same manufacturer, differ in detailed texture and surface roughness. Due to the limitations of traditional algorithms, different CPs require different characteristic parameters to be tuned, so universality is poor. Among current target detection algorithms, there is no recognition optimization algorithm that exploits structural features. In view of the specificity of CP features, this paper proposes a data enhancement method for the YOLOV7-tinp algorithm in which the locations of similar features determine the label category (SFLDLC), improving the classification accuracy for categories that are distinguished only by the locations of similar features. At the same time, combining the convolutional block attention module (CBAM) attention mechanism with the CTMA improves the universality and accuracy of the algorithm and underpins the LP calculation for the CP. The rough positioning stage (RPS) and precise positioning stage (PPS) use the similar projection relationship and the EPnP algorithm, respectively, to solve the LP from the recognition results (a minimal sketch of the EPnP-based solve is given after the contribution list). The robot is then guided to complete the insertion, realizing automatic charging for various vehicle CPs. Our contributions in this paper are as follows:
- (1) We propose a solution that combines deep learning methods to identify CP pose information.
- (2) We propose the SFLDLC and CTMA for CP recognition and positioning, improving recognition accuracy.
- (3) We integrate CBAM into YOLOV7-tinp for CP recognition and positioning, further improving recognition accuracy.
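For concreteness, the following is a minimal, hypothetical sketch of the EPnP-based LP solve using OpenCV's `cv2.solvePnP`. All numbers (feature geometry, pixel detections, camera intrinsics) are placeholders for illustration, not this paper's calibration or CP model.

```python
import cv2
import numpy as np

# Hypothetical 3D model points of CP features (mm) in the CP frame;
# real values would come from the CP's standardized geometry.
object_points = np.array([
    [0.0, 0.0, 0.0],
    [22.0, 0.0, 0.0],
    [11.0, 19.0, 0.0],
    [11.0, -19.0, 0.0],
], dtype=np.float64)

# Matching 2D pixel centers of the detected features (placeholder values),
# e.g., the bounding-box centers returned by the detector.
image_points = np.array([
    [612.4, 388.1],
    [688.9, 390.6],
    [651.2, 452.3],
    [649.8, 325.7],
], dtype=np.float64)

# Placeholder pinhole intrinsics; a real system uses calibrated values.
camera_matrix = np.array([
    [1200.0, 0.0, 640.0],
    [0.0, 1200.0, 360.0],
    [0.0, 0.0, 1.0],
], dtype=np.float64)
dist_coeffs = np.zeros(5)  # assume negligible lens distortion

# EPnP solves the CP pose (rotation + translation) relative to the camera
# from at least four 3D-2D correspondences.
ok, rvec, tvec = cv2.solvePnP(
    object_points, image_points, camera_matrix, dist_coeffs,
    flags=cv2.SOLVEPNP_EPNP,
)
if ok:
    rot_mat, _ = cv2.Rodrigues(rvec)  # axis-angle -> 3x3 rotation matrix
    print("translation (mm):", tvec.ravel())
```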
This paper is organized as follows: Section 2 introduces the data collection process and the identification and location methods. Section 3 conducts experimental verification in different scenarios, reporting positioning accuracy and insertion success rate. Section 4 discusses the sources of positioning errors. Section 5 summarizes the experimental results and outlines further research directions.
3. Results
The tests are conducted under the Windows 10 operating system on a machine with an Intel(R) Core(TM) i7-10700K CPU @ 3.80 GHz (3.79 GHz measured) and an Nvidia GeForce RTX 3080 graphics card. The programming language is Python 3.9 on the PyCharm platform, and PyTorch 1.6 is used as the deep learning framework. Training runs on the GPU; during the performance tests, inference is timed on the CPU so that the comparison reflects the actual application scenario.
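As a minimal, hypothetical illustration of this train-on-GPU, time-on-CPU setup (the stand-in model and input resolution are placeholders, not the actual detector):

```python
import time
import torch
import torch.nn as nn

# Hypothetical stand-in for the trained detector; in this paper it would
# be the (SFLDLC-)CBAM-YOLOV7-tinp network.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 1, 1))

# Training would use the GPU when available ...
train_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("training device:", train_device)

# ... but the reported detection times are measured on the CPU, matching
# a deployment scenario without a discrete GPU.
model = model.to("cpu").eval()
dummy = torch.randn(1, 3, 640, 640)  # placeholder input resolution

with torch.no_grad():
    start = time.perf_counter()
    _ = model(dummy)
    elapsed = time.perf_counter() - start
print(f"CPU inference time: {elapsed * 1000:.1f} ms")
```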
3.1. Judgment Basis of CP LP Error
To obtain the actual position and orientation of the CP relative to the camera during data collection, the robot was fixed on its base, with the world coordinates of the base kept unchanged, and the robot was taught the fully inserted configuration in the CP; this state was taken as the zero LP. The robot was then moved randomly out of the CP within the recognition range. From the robot's LP during data collection, combined with the zero LP, the LP of the CP relative to the manipulator end was obtained, and from it the actual LP of the CP relative to the camera was calculated. The absolute difference between this actual LP and the theoretical relative LP calculated by the proposed method was used as the basis for evaluating the accuracy of the algorithm.
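The error metric can be sketched as follows. This is a minimal illustration assuming 4x4 homogeneous transforms and xyz Euler angles; the conventions and the example numbers are assumptions, not the exact implementation of this paper.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def lp_error(T_actual, T_estimated):
    """Absolute per-axis position (mm) and orientation (deg) differences
    between ground-truth and estimated CP poses, both expressed as 4x4
    homogeneous transforms in the camera frame."""
    dpos = np.abs(T_actual[:3, 3] - T_estimated[:3, 3])
    eul_a = Rotation.from_matrix(T_actual[:3, :3]).as_euler("xyz", degrees=True)
    eul_e = Rotation.from_matrix(T_estimated[:3, :3]).as_euler("xyz", degrees=True)
    dang = np.abs(eul_a - eul_e)  # naive difference; fine for small angles
    return dpos, dang

# Example with placeholder poses: identity ground truth vs. a slightly
# translated and rotated estimate.
T_gt = np.eye(4)
T_est = np.eye(4)
T_est[:3, 3] = [0.6, 0.9, 1.2]  # mm offsets
T_est[:3, :3] = Rotation.from_euler("xyz", [1.0, 0.9, 0.5],
                                    degrees=True).as_matrix()
print(lp_error(T_gt, T_est))
```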
3.2. LP Accuracy Test in RPS
The RPS is mainly divided into feature recognition and LP resolution of the CP.
Figure 7 shows the recognition performance of the feature points in different scenarios. The theoretical LP information is obtained with the LP resolution algorithm proposed in this paper, and the actual LP error is then obtained. A comparison of the different recognition methods in the RPS is provided in Table 4.
The experimental results in Table 4 show that the precision of CBAM-YOLOV7-tinp is 0.002 higher than that of Faster RCNN, 0.003 higher than that of YOLOV3, and 0.001 higher than those of YOLOV4, YOLOV5s, and YOLOV7-tinp. Its recall is 0.02 higher than those of Faster RCNN, YOLOV3, and YOLOV4, and 0.001 higher than those of YOLOV5s and YOLOV7-tinp. Taking mAP@0.5:0.95 as an example, CBAM-YOLOV7-tinp has the highest accuracy, 0.005 higher than the same model without CBAM. In actual positioning, we aim to improve detection accuracy by reducing false recognition in order to avoid damaging the manipulator, and CBAM-YOLOV7-tinp performs best in terms of detection accuracy. Although the attention mechanism slightly increases the detection time, this increase is acceptable given the accuracy improvement on every index.
Based on this comparison, we use CBAM-YOLOV7-tinp to identify the positions of the feature targets, substitute the feature position information into the LP solution model, and obtain the LP information in the different scenarios, as shown in Table 5.
The positioning results in Table 5 show that the indoor accuracy is basically the same as that at night, and the relative accuracy is comparatively high: the average accuracy values of X, Y, and Z are 2.34 mm, 2.51 mm, and 2.64 mm, respectively. The accuracy on the sunny morning is basically the same as on the cloudy day, with average X, Y, and Z values of 2.72 mm, 2.92 mm, and 2.98 mm, respectively. At noon on the sunny day, the X, Y, and Z values are 2.81 mm, 2.99 mm, and 3.17 mm, respectively. Averaged over all cases, the values are 2.61 mm, 2.79 mm, and 2.90 mm, which meets the needs of the RPS. These accuracy differences are related to differences in image sharpness and lighting under the different light field conditions.
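Since the RPS solve rests on the similar projection (pinhole) relationship, the idea can be sketched as follows; the focal length and feature size are assumed placeholder values, not this paper's calibration.

```python
# Similar-projection (pinhole) range estimate for the RPS: a feature of
# known physical size W that appears w pixels wide lies at depth
# Z = f * W / w, where f is the focal length in pixels.

def rough_depth_mm(focal_px: float, real_width_mm: float,
                   pixel_width: float) -> float:
    """Estimate feature depth from the pinhole similar-triangle relation."""
    return focal_px * real_width_mm / pixel_width

# Placeholder numbers: a 1200 px focal length and a 70 mm CP circle
# detected as 180 px wide give a rough depth of about 467 mm.
print(round(rough_depth_mm(1200.0, 70.0, 180.0)))  # ~467
```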
3.3. LP Accuracy Test in PPS
The PPS is mainly divided into feature recognition and LP resolution of the CP.
Figure 8 shows the feature recognition performance in different scenarios. The theoretical LP information is obtained with the LP resolution algorithm in this paper, and the actual LP error is then obtained. A comparison of the different recognition methods in the PPS is provided in Table 6.
The experimental results in Table 6 show that, among Faster RCNN, YOLOV3, YOLOV4, YOLOV5s, and YOLOV7-tinp, YOLOV7-tinp outperforms the other models on all indicators. Because the results of the PPS directly determine the final positioning, YOLOV7-tinp is further improved to meet the insertion accuracy. The precision of the SFLDLC-CBAM-YOLOV7-tinp-CTMA algorithm proposed in this paper is 0.002 and 0.001 higher than those of YOLOV7-tinp and CBAM-YOLOV7-tinp, respectively; its recall is 0.002 higher than that of YOLOV7-tinp; its mAP@0.5 is 0.002 and 0.001 higher than those of YOLOV7-tinp and CBAM-YOLOV7-tinp, respectively; and its mAP@0.5:0.95 is 0.005 and 0.003 higher than those of YOLOV7-tinp and CBAM-YOLOV7-tinp, respectively. In actual positioning, damage to the manipulator is avoided by raising detection accuracy as much as possible and reducing misidentification, and SFLDLC-CBAM-YOLOV7-tinp-CTMA performs best in terms of detection accuracy. Its detection time increases slightly due to the addition of SFLDLC, CBAM, and CTMA, but this increase is acceptable given the accuracy improvement on every index.
Based on this comparison, we use SFLDLC-CBAM-YOLOV7-tinp-CTMA to identify the positions of the feature targets, substitute the feature position information into the LP solution model, and obtain the LP information in the different scenarios. The corresponding errors are shown in Table 7.
The positioning results in Table 7 show that the PPS and RPS features share the common characteristic of circular edges, so their detection and positioning accuracy follow similar trends. The positioning accuracy is basically the same indoors on sunny days, outdoors on cloudy days, and at night, and the relative accuracy is comparatively high: the average accuracy values of x, y, z, Rx, Ry, and Rz are 0.61 mm, 0.85 mm, 1.21 mm, 1.16 degrees, 0.94 degrees, and 0.54 degrees, respectively. The positioning accuracy is lower outdoors on sunny days, especially at noon, with average values of 0.70 mm, 0.95 mm, 1.30 mm, 1.24 degrees, 1.14 degrees, and 0.64 degrees, respectively. This accuracy still meets the needs of the PPS. As before, the differences are related to image sharpness and lighting under the different light field conditions.
3.4. Comparison of Results
To evaluate the progressiveness of the proposed algorithm, we compared it with three advanced electric vehicle CP identification and location methods. Table 8 shows the comparison results.
Table 8 shows that when the three advanced methods are applied to multi-category CPs, their robustness is low, their errors are high, and they are sometimes unable to identify and locate the CP at all. This verifies that the algorithm proposed in this paper is robust to multiple types of CPs and has significant application value.
3.5. Plug Test Verification
As the positioning accuracy on outdoor sunny days is lower in the above tests, while the accuracy of the other scenes is basically the same, we define these two cases as scene 1 and scene 2 and carried out 200 plug-in tests for each. In these tests, the algorithm proposed in this paper is used for positioning, combined with the three-iteration minimum mechanism, and the AUBO-i10 6-DOF articulated robot is used for plugging. Table 9 shows the test results.
Based on the identification and location algorithm proposed in this paper, the average plugging success rate of the CP is 96.5% under indoor (sunny/cloudy/night) conditions and 92.0% under outdoor sunny (morning/noon/afternoon) conditions.
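The paper does not give pseudocode for the three-iteration mechanism; the following is a hypothetical sketch of one plausible iterate-then-plug loop, with `detect_lp`, `move_toward`, and `plug` as assumed stub functions rather than the authors' actual interfaces.

```python
import random
from typing import Optional, Tuple

Pose = Tuple[float, float, float, float, float, float]  # x, y, z, Rx, Ry, Rz

MAX_ITERATIONS = 3  # refine the pose estimate up to three times before plugging

def detect_lp() -> Optional[Pose]:
    """Stub for the SFLDLC-CBAM-YOLOV7-tinp-CTMA detection + EPnP solve."""
    return (0.6, 0.9, 1.2, 1.1, 0.9, 0.5)  # placeholder pose

def move_toward(lp: Pose) -> None:
    """Stub for commanding the AUBO-i10 toward the estimated CP pose."""
    pass

def plug() -> bool:
    """Stub for the final insertion attempt."""
    return random.random() < 0.965  # indoor success rate reported in Table 9

def attempt_charging() -> bool:
    for _ in range(MAX_ITERATIONS):
        lp = detect_lp()
        if lp is None:
            return False  # CP not found within the recognition range
        move_toward(lp)   # each iteration shortens the remaining error
    return plug()

print(attempt_charging())
```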
5. Conclusions
This paper proposed an electric vehicle CP identification and location algorithm based on CBAM-YOLOV7-tinp-CTMA, which realizes CP identification and location across multiple categories, multiple scenes, and a wide range. The recognition process was divided into two stages, and a recognition and positioning model was established for each. The LP was calculated with the similar projection relationship and the EPnP algorithm, and the insertion test was completed using the mechanical arm.
Both stages were tested: the average RPS positioning errors (x, y, z) of the CP were 2.61 mm, 2.79 mm, and 2.90 mm, respectively, and the average LP errors (x, y, z, Rx, Ry, Rz) of the fine positioning were 0.64 mm, 0.88 mm, 1.24 mm, 1.19 degrees, 1.00 degrees, and 0.57 degrees, respectively. Across scenarios, higher positioning accuracy yielded a higher plugging success rate: 92.0% on outdoor sunny days and 96.5% in the other cases. Compared with the existing advanced methods, the proposed algorithm had high universality, identifying various types of CPs and completing positioning. It provides a theoretical basis for the positioning of various CPs and has high engineering application value.
In the future, more CP types and more complex environments will be added to the data. The algorithm will be further optimized to improve its adaptability and recognition accuracy, increase the plugging success rate, and reduce the impact of the plugging process on robots and vehicles. Where visual positioning proves insufficient, vibration signals could be used to compensate for visual positioning errors and thereby avoid accidents.