1. Introduction
As circular geomorphological features, craters are formed by the high-speed collision of small celestial bodies. They are of great significance for geological age estimation of the Moon [1,2] and Mars [3], terrain and evolutionary history research [4], mineral resource assessment [5], safe landing [6,7], landing site selection and obstacle avoidance for rovers [8], evaluating the influence of crater abundance on ice occurrence [9] in the lunar polar Permanently Shadowed Regions (where ARTEMIS [10] will land), and even subsurface exploration [11]. Hence, crater detection has always been a hot topic. In the early days, crater detection mainly relied on visual interpretation with astronomical telescopes and remote sensing images. However, these methods, which are based on subjective experience, are unstable and time-consuming [12]. With the development of information and space technology, more and more automatic CDAs (Crater Detection Algorithms) with high accuracy have been brought forward [13]. Now, the features extracted by deep networks have stronger discrimination and generalization abilities than hand-crafted features [14], and numerous detection methods based on deep learning have been emerging.
As one of the most popular object detection models, You Only Look Once (YOLO) makes predictions based on the global information of an image and makes good use of the context [14]. Moreover, it performs well in the detection of multi-scale, small objects. Therefore, more and more researchers use YOLO to detect craters. Benedix et al. [15] developed a model based on YOLOv3 to detect kilometer-size craters on Mars using daytime infrared data (100 m/pixel) acquired by the Thermal Emission Imaging System (THEMIS) [16] with an accuracy of = 87%. In addition, YOLOv3 performed well in detecting sub-kilometer craters using Lunar Reconnaissance Orbiter Camera Narrow Angle Camera (LROC-NAC) data, with high accuracy ( = 84%), but with low accuracy for large-scale craters ( = 62%) [17]. Cui et al. [18] trained YOLOv5 with SLDEM to detect craters with diameters ranging from 2 to 15 km in the South Pole-Aitken Basin, with an accuracy of = 95%. Tang et al. [19] utilized YOLOv5 to detect kilometer-size craters using Lunar Reconnaissance Orbiter Camera Wide Angle Camera (LROC-WAC) data, with an accuracy of = 69%. In terms of model construction, most methods directly use the baseline YOLO model to detect craters. However, the loss function of YOLO (which measures the difference or error between the predicted results of the model and the actual results) makes it difficult to solve the sample imbalance problem of small targets in the training process [20], which may lead to poor performance under complex conditions. The lunar surface has various geomorphic features [21], and some circular highlight-shadow landforms, such as volcanic cones and domes, are easily misidentified as craters. Moreover, images taken under different illumination conditions highlight different features on the lunar surface, and topographic features of craters generate less pronounced shadows at lower incidence angles [22]. Therefore, to improve the detection accuracy of small craters (diameter < 1 km), it is necessary to improve the YOLO model.
Currently, an abundance of lunar data has been used to build crater sample datasets. For example, the main image datasets are based on Chang’E CCD, LROC-NAC, and SELENE TC (Terrain Camera) data. Most topographical datasets come from Chang’E-DEM (Digital Elevation Model), LOLA-DEM, SLDEM, and SELENE-DEM. In model training, the dataset requires completeness, self-consistency, timeliness (the data keeps pace with the times and is not outdated), confidentiality, accuracy, standardization, unbiasedness, and ease of use [23]. That is to say, the quality and distribution of the dataset affect the detection accuracy. However, there is no standard sample dataset for crater detection. In fact, sample data are selected regardless of regional differences, geomorphological features, data types, resolution, object size, and so on. As for data processing, there is no systematic analysis of the impact of data visualization. Some models directly use the original data, while others use data processed with different visualization stretching methods [24,25]. However, different methods lead to different visual effects and potential information loss, which may have a great impact on detection accuracy. The Chang’E-2 (CE-2) data has good consistency in imaging mode, coverage, data processing, and spatial resolution [26], which makes it possible to build a dataset for global small lunar crater detection.
In this paper, we propose a novel crater detection model (called YOLO-Crater) by replacing the loss function and introducing the CBAM attention mechanism based on the YOLOX network structure. Then, the CE-2 DOM (Digital Orthophoto Map), DEM (Digital Elevation Model), Slope (terrain slope derived from DEM), and integrated data were used to build 23 sample datasets with different visualization stretching methods, and a YOLO-Crater model was trained on each of them. To determine the final dataset and the corresponding Lunar YOLO-Crater model, a series of comparative experiments (see Section 4.1) were conducted to analyze the visualization stretching methods and the detection model accuracy. Finally, the Martian sample dataset provided by the 2022 GeoAI Martian Challenge was used to train the Lunar YOLO-Crater model to build the Martian crater detection model (called Martian YOLO-Crater) and evaluate the YOLO-Crater’s transferability and generalization capability.
2. Dataset
The dataset is used to train and test the crater detection model. The lunar dataset was based on high-spatial-resolution data and small sample craters. The sample information includes the size and position of the sample craters. Currently, the SELENE-TC, LROC-NAC, and CE-2-CCD can provide lunar image data with high resolution [26,27,28]. The resolution of SELENE-TC data is 7.4 m/pixel, and the coverage rate (the percentage of the surface that is covered) is 92.4%. However, the mosaic image has dislocation, leakage in the middle and low latitudes, and inconsistent brightness in the polar regions. The LROC-NAC data resolution ranges from 0.5 to 2 m/pixel. However, due to inconsistent imaging conditions, there are great differences in positioning, resolution, brightness, and shadow. In contrast, the CE-2 provides global DOM data (7 m/pixel) and DEM data (20 m/pixel) with consistent imaging conditions and a positioning control network, which makes it possible to build a standard sample dataset for global lunar crater detection. As for the sample dataset, there is no publicly available and unified small crater dataset. Many lunar crater databases have been created by Head et al. [29], Salamunićcar et al. [30], Povilaitis et al. [31], Robbins [32], and others. However, the craters in these databases are larger than one kilometer, so they cannot be used to create a small crater sample dataset. Many CDAs have used the DOM, DEM, and some derived data to detect craters, but the suitability of these data has not been evaluated. Therefore, we selected DOM, DEM, Slope extracted from DEM, and integrated data (DOM, DEM, and Slope: DDS) to create the small crater sample dataset in the typical sample areas and conducted a comparative experiment to evaluate the applicability of the datasets. The dataset creation process includes data preparation and data creation (see Figure 1).
2.1. Data Preparation
Though the CE-2 image has consistent imaging conditions, differences in terrain and location make the texture different. To give the detection model better generalization capability, the crater samples should cover different types of craters. The lunar surface has different terrains, and the types of craters on different terrains are different. Therefore, the features of craters, such as reflectance and shadow direction, are also different. In this paper, we selected six sample areas in Maria and Highland. Among them, R1, R2, R3, R4, and R6 were selected by Zang et al. [33], whereas R5 was reselected to expand the Maria area. R5 covers 57.59°W and 39.41°N–40.61°N, with the same extent as R6 (in Highland).
We labeled the craters manually on the DOM data using ArcMap software and recorded the center coordinates and radius of each crater. The labeling principle is that the shadow direction of any given crater in the same area is consistent [33]. However, we found some missed labels and incorrect labels, such as short lunar wrinkle ridges with the same highlight-shadow features as craters. To improve the detection accuracy of the model and the completeness of the crater label set, we verified and modified the labels marked by Zang et al. [33]. In the end, a total of 83,620 labels were obtained. This number is significantly higher than those of Fairweather et al. [17] with 43,402, Hashimoto and Mori [34] with 4967, Yang et al. [35] with 14,406, and Lagain et al. [36] with 2142. The numbers of labels in R1 to R6 are 8632, 8857, 23,970, 34,884, 3519, and 3758, respectively. In addition, 42,006 new craters were labeled. The size-frequency distribution of labels (see Figure 2) shows that 99% are less than 1 km in diameter.
In order to evaluate the crater detection ability with DEM data, we also collected the CE-2 DEM (20 m/pixel) in the same sample areas.
2.2. Dataset Creation
DEM data consists of elevation values, which cannot be used directly by CDAs to detect craters. Thus, the DEM should be transformed and visualized as an image [37]. However, there is no reference that analyzes and evaluates the effect of the visualization transform, which may strongly affect the detection accuracy. In the compilation of the Chang’E-1 Topographic Atlas of the Moon [38], Mu conducted a comparative experiment on DEM visualization. When the global DEM data acquired by Chang’E-1 was first divided into 188 sheets and each sheet was then enhanced and visualized, the topographic details were very clear. Conversely, when the global DEM was visualized first and then divided, the topographic details in each sheet were unclear. Mu obtained the same result in compiling The Chang’E-2 High Resolution Image Atlas of Lunar Sinus Iridum [39]. Therefore, we adopted the former procedure (dividing first, then stretching each block) to visualize the dataset with several image stretching methods. The following are the steps for dataset creation:
Firstly, we cropped the data into 640 × 640 pixel blocks with a certain overlap so that craters on the block boundaries can be detected properly [40]. As shown in Figure 2, approximately 90% of the craters have a radius of less than 250 m. We weighed the completeness of the craters (a crater is considered complete when at least half of its area appears in the block) against the efficiency of model training and chose an overlap rate of 5% (250 m ÷ 7 m/pixel ÷ 640 pixels ≈ 5.6%, taken as 5%).
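The cropping step can be illustrated with a minimal sketch. It assumes the raster is addressed in pixel coordinates and that the 5% overlap is applied along both axes, which gives 32 shared pixels between neighbouring 640-pixel blocks. The function names and the handling of the last block at the image border are illustrative assumptions, not the procedure used in the paper.

```python
def tile_origins(length, tile=640, overlap_rate=0.05):
    """Start offsets of tiles along one image axis.

    Neighbouring tiles share about `overlap_rate * tile` pixels. The last
    tile is shifted back so that it ends exactly at the image border.
    """
    if length <= tile:
        return [0]  # image smaller than one tile: the caller must pad it
    stride = tile - round(tile * overlap_rate)  # 640 - 32 = 608 pixels
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] + tile < length:
        origins.append(length - tile)  # cover the remaining border strip
    return origins


def tile_boxes(height, width, tile=640, overlap_rate=0.05):
    """Pixel boxes (xmin, ymin, xmax, ymax) of all tiles, row by row."""
    return [
        (x, y, x + tile, y + tile)
        for y in tile_origins(height, tile, overlap_rate)
        for x in tile_origins(width, tile, overlap_rate)
    ]


# Overlap suggested by a 250 m crater radius at 7 m/pixel in a 640-pixel tile.
overlap_from_radius = 250 / 7 / 640  # about 0.056
```

Crater labels that fall inside a block would then be shifted by the block origin before being written to the annotation file.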
Secondly, we used seven stretching methods recommended by Gao [41] and ArcMap [42] to visualize and enhance the cropped data. The stretching methods include Maximum-Minimum Stretching (MMS), 1% and 2% Linear Truncation Stretching (1%/2% LTS), Standard Deviation Stretching (SDS), Histogram Equalization (HE), Laplacian Sharpening (LS), and Gaussian Smoothing (GS). As a result, we obtained 22 datasets [3 kinds of data (DOM, DEM, and Slope) × 7 (methods) + 1 (DOM without any enhancement) = 22] in VOC format for model training and stretching method adaptation evaluation.
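To make the linear stretches concrete, the following sketch applies MMS, LTS, and SDS to the values of a single cropped block, given here as a flat list of numbers. It is a simplified illustration, not the ArcMap implementation: the 8-bit output range, the tail-trimming rule, and the default of two standard deviations are assumptions, and HE, LS, and GS are omitted.

```python
import statistics


def linear_stretch(values, low, high):
    """Map [low, high] linearly onto 0..255, clipping values outside it."""
    if high <= low:  # flat block: nothing to stretch
        return [0] * len(values)
    scale = 255.0 / (high - low)
    return [round((min(max(v, low), high) - low) * scale) for v in values]


def min_max_stretch(values):
    """MMS: stretch between the block minimum and maximum."""
    return linear_stretch(values, min(values), max(values))


def percent_clip_stretch(values, percent=2.0):
    """LTS: drop `percent` of the samples at each tail, then stretch."""
    ordered = sorted(values)
    k = int(len(ordered) * percent / 100.0)  # samples cut from each tail
    return linear_stretch(values, ordered[k], ordered[-1 - k])


def std_dev_stretch(values, n_std=2.0):
    """SDS: stretch between mean - n_std * std and mean + n_std * std."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return linear_stretch(values, mean - n_std * std, mean + n_std * std)
```

Because the limits are computed from the values passed in, calling these functions once per cropped block corresponds to dividing first and stretching afterwards, so each block uses its own local value range.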
Finally, we divided the dataset into training, validation, and testing data. In each dataset, the training and validation data were randomly selected from R1, R2, R3, and R4 at a ratio of 9:1, and the testing data contained all of the samples in R5 and R6.
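The region-based split can be sketched as follows. The sample records, the `region` key, and the fixed random seed are hypothetical choices made for illustration; only the 9:1 ratio and the assignment of R1 to R4 versus R5 and R6 come from the text.

```python
import random


def split_samples(samples, train_ratio=0.9, seed=0):
    """Split samples into training, validation, and testing lists.

    Samples from R1-R4 are shuffled and divided 9:1 into training and
    validation data; all samples from R5 and R6 form the testing data.
    """
    pool = [s for s in samples if s["region"] in ("R1", "R2", "R3", "R4")]
    test = [s for s in samples if s["region"] in ("R5", "R6")]
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(pool)
    n_train = round(len(pool) * train_ratio)
    return pool[:n_train], pool[n_train:], test


# Hypothetical sample records: one entry per cropped block.
samples = [
    {"region": f"R{r}", "block": f"R{r}_{i:03d}"}
    for r in range(1, 7)
    for i in range(10)
]
train, val, test = split_samples(samples)
```

Keeping R5 and R6 entirely out of training means the test score reflects performance on areas the model has never seen, in both Maria and Highland terrain.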
2.3. Martian Dataset
The Martian dataset, downloaded from the 2022 GeoAI Martian Challenge in CodaLab (http://cici.lab.asu.edu/martian/#data-download (accessed on 20 April 2022)), was created by Hsu et al. [43]. The image data was THEMIS daytime infrared data with 100 m resolution and global coverage [44]. In addition, Martian sample craters were labeled with bounding boxes (BBOXs), recording the center and diameter provided by Robbins’ crater database [45]. The total number of labels was 301,912, with 27.3% between 0.2 km and 1 km in diameter, 38.94% between 1 and 1.5 km, and nearly 90% no more than 3 km. The dataset contained 102,675 images of 256 × 256 pixels. The training data contained 50,838 images with 149,560 craters, the testing data contained 50,837 images with 149,389 craters, and the validation data contained 2963 craters.
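As an illustration of how a crater record with a center and a diameter relates to a bounding box, the sketch below converts one crater into pixel coordinates on a 256 × 256 image at 100 m/pixel. The function name, the assumption that the center is already given in pixel coordinates, and the clipping to the image border are hypothetical; the actual label format of the challenge is not described here.

```python
def crater_to_bbox(cx_px, cy_px, diameter_km,
                   metres_per_pixel=100.0, image_size=256):
    """Bounding box (xmin, ymin, xmax, ymax) in pixels for one crater.

    The box is the square that encloses the crater rim, clipped to the
    image so that craters cut by the border still yield a valid box.
    """
    radius_px = diameter_km * 1000.0 / metres_per_pixel / 2.0
    xmin = max(0.0, cx_px - radius_px)
    ymin = max(0.0, cy_px - radius_px)
    xmax = min(float(image_size), cx_px + radius_px)
    ymax = min(float(image_size), cy_px + radius_px)
    return (xmin, ymin, xmax, ymax)


# A 1 km crater spans 10 pixels at 100 m/pixel.
box = crater_to_bbox(128, 128, 1.0)
```

At this resolution, a crater of 0.2 km in diameter covers only about two pixels, which indicates how small the smallest labeled targets are.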
5. Conclusions
In this paper, we proposed a novel small crater detection model (called YOLO-Crater) by replacing the IoU loss and the traditional binary cross entropy loss with the EIoU loss and VariFocal loss, respectively, and introducing the CBAM attention mechanism. To obtain more lunar small crater (diameter < 1 km) samples (based on 7 m/pixel DOM) with high accuracy, 42,006 labels were manually revised based on the existing labels made by Zang et al. [33]. A series of comparative experiments were conducted to systematically analyze the impact of data type, stretching method, terrain type, and crater size on the crater detection model.
The results showed that the data type and visualization stretching methods had an important impact on detecting accuracy. The DOM is the best data type for small crater detection. CE-2 DOM-MMS (Maximum and Minimum Stretching) was found to be the best stretching method, with a total = 87.86%, an = 66.04%, and an = 75.41%. Compared with the YOLOX, the Lunar YOLO-Crater performs better both in Maria and Highland with accuracies of = 74.66% and 76.11%, respectively. Moreover, the Lunar YOLO-Crater obtains higher accuracy in detecting small-scale craters within 400 m in radius. In addition, the Martian crater detection model (Martian YOLO-Crater) was trained by image sample data from the 2022 GeoAI Martian Challenge and achieved second place with = 88.37%, = 69.25%, and = 77.65%, which means the Martian YOLO-Crater has strong transferability and generalization capability.
In the future, more and more high-resolution data will be acquired for the Moon and other celestial bodies. The revised lunar small crater dataset could serve as a valuable supplement to GeoAI datasets, enabling more researchers to utilize, improve, and extend it to other celestial bodies. Meanwhile, the strong transferability and generalization capability of the YOLO-Crater will make it possible to detect craters with high accuracy on other celestial bodies using image data.