Figure 1.
Intra-class differences and aggregation on RSSCN7. The solid boxes contain images of the field class and the industry class, respectively. Within the same class, samples in the same dotted box have similar features, while samples in different dotted boxes differ noticeably.
Figure 2.
The t-SNE visualization on RSSCN7. Points represent samples, and colors distinguish classes. Samples of the same class are scattered overall yet show local aggregation.
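For readers who want to reproduce this kind of visualization, a minimal sketch using scikit-learn's `TSNE`; the synthetic features below stand in for the RSSCN7 backbone embeddings, and all parameter values are illustrative:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Two synthetic "classes" of 512-D embeddings with different means,
# standing in for backbone features of two scene classes.
class_a = rng.normal(0.0, 1.0, size=(30, 512))
class_b = rng.normal(3.0, 1.0, size=(30, 512))
features = np.vstack([class_a, class_b])
labels = np.array([0] * 30 + [1] * 30)

# Project the high-dimensional embeddings to 2-D for plotting.
tsne = TSNE(n_components=2, perplexity=10, random_state=0)
embedded = tsne.fit_transform(features)  # shape: (60, 2)
```

The 2-D `embedded` array can then be scatter-plotted with one color per label, which is how figures of this kind are typically produced.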
Figure 3.
The AMP method framework. The top half of the figure is the initialization module: the backbone network maps sample features into the embedding space, as shown in (a). Yellow and green represent two classes of samples, and the gray-shaded regions are the clusters produced by the clustering algorithm, as shown in (b). The lower half of the figure represents the training phase. (c) shows the initial training data in a batch. We synthesize samples within each cluster in the batch based on the information obtained in the initialization phase; dashed borders indicate synthesized samples, as shown in (d). After the synthesis phase, we assign a proxy to each cluster to express the intra-cluster features. The triangles in (e) represent proxies, the squares in (f) represent the overall features of each class, and the thickness of the line between a square and a triangle represents the weight of that proxy.
Figure 4.
Intra-cluster synthesis. We select similar samples in each batch, divide them according to the clusters obtained from the initialization module, and synthesize new samples within each cluster with a controllable degree of randomness. The left figure shows the distribution of samples before synthesis, where each color represents a class of sample features and different shapes indicate different clusters. The middle figure illustrates the proposed synthesis method with a random factor, in which a new sample is synthesized from a pair of samples in the same cluster. The right figure shows the distribution of samples synthesized for the green class; dashed edges indicate synthesized samples.
Figure 5.
The figure shows the two original samples; the circle with a dashed border represents the synthesized sample, and the dashed straight line represents the range over which samples can be synthesized. At the largest setting of the randomness parameter, samples can be synthesized anywhere between the two originals, with the location completely determined by the random parameter r. At half that setting, the range of synthesis locations is reduced by half, and at the smallest setting the sample is always synthesized at the midpoint of the two originals.
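The captions above describe a synthesis range that shrinks with the randomness parameter: the full segment between the two samples, then half of it, then only the midpoint. The paper's exact expression is not recoverable from these captions, but one formulation consistent with that behavior, as a hedged NumPy sketch:

```python
import numpy as np

def synthesize(x_i, x_j, a, rng):
    """Synthesize a sample on the segment between x_i and x_j.

    a in [0, 1] controls the degree of randomness: a = 1 lets the
    random factor r sweep the whole segment, a = 0.5 restricts it to
    the middle half, and a = 0 always yields the midpoint.
    This is an illustrative formulation, not the paper's verbatim one.
    """
    r = rng.uniform(0.0, 1.0)
    # Interpolation coefficient centered at 0.5 with width a.
    lam = (1.0 - a) / 2.0 + a * r
    return x_i + lam * (x_j - x_i)
```

With `a = 0` the coefficient collapses to 0.5 (the midpoint), and with `a = 0.5` it stays within [0.25, 0.75], i.e., half the segment, matching the three cases in the caption.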
Figure 6.
Example images of the UCM dataset: (1) agricultural, (2) airplane, (3) baseball diamond, (4) beach, (5) buildings, (6) chaparral, (7) dense residential, (8) forest, (9) freeway, (10) golf course, (11) harbor, (12) intersection, (13) medium residential, (14) mobile home park, (15) overpass, (16) parking lot, (17) river, (18) runway, (19) sparse residential, (20) storage tanks, (21) tennis court.
Figure 7.
Example images of the RSSCN7 dataset: (1) Grass, (2) Field, (3) Industry, (4) River Lake, (5) Forest, (6) Resident, (7) Parking.
Figure 8.
Example images of the AID dataset: (1) Airport, (2) Bare Land, (3) Baseball Field, (4) Beach, (5) Bridge, (6) Center, (7) Church, (8) Commercial, (9) Dense Residential, (10) Desert, (11) Farmland, (12) Forest, (13) Industrial, (14) Meadow, (15) Medium Residential, (16) Mountain, (17) Park, (18) Parking, (19) Playground, (20) Pond, (21) Port, (22) Railway Station, (23) Resort, (24) River, (25) School, (26) Sparse Residential, (27) Square, (28) Stadium, (29) Storage Tanks, (30) Viaduct.
Figure 9.
Example images of the PatternNet dataset: (1) airplane, (2) baseball field, (3) basketball court, (4) beach, (5) bridge, (6) cemetery, (7) chaparral, (8) Christmas tree farm, (9) closed road, (10) coastal mansion, (11) crosswalk, (12) dense residential, (13) ferry terminal, (14) football field, (15) forest, (16) freeway, (17) golf course, (18) harbor, (19) intersection, (20) mobile home park, (21) nursing home, (22) oil gas field, (23) oil well, (24) overpass, (25) parking lot, (26) parking space, (27) railway, (28) river, (29) runway, (30) runway marking, (31) shipping yard, (32) solar panel, (33) sparse residential, (34) storage tank, (35) swimming pool, (36) tennis court, (37) transformer station, (38) wastewater treatment plant.
Figure 10.
The number of proxies assigned to each class in the UCMD dataset by our proposed adaptive multi-proxy assignment method.
Figure 11.
The number of proxies assigned to each class in the AID dataset by our proposed adaptive multi-proxy assignment method.
Figure 12.
The number of proxies assigned to each class in the PatternNet dataset by our proposed adaptive multi-proxy assignment method.
Figure 13.
The number of proxies assigned to each class in the RSSCN7 dataset by our proposed adaptive multi-proxy assignment method.
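Figures 10–13 report per-class proxy counts, but the captions do not spell out the selection rule. A plausible sketch, assuming the count is chosen by clustering each class's embeddings and scoring candidate cluster numbers with the silhouette criterion (scikit-learn; the criterion itself is an assumption, not the paper's stated method):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def num_proxies_for_class(features, max_k=5, random_state=0):
    """Pick a proxy count for one class by clustering its embeddings.

    Tries k = 2 .. max_k, keeps the k with the best silhouette score,
    and falls back to a single proxy when the class is too small to
    split. Illustrative only: the paper's actual criterion may differ.
    """
    n = len(features)
    if n < 4:
        return 1
    best_k, best_score = 1, -1.0
    for k in range(2, min(max_k, n - 1) + 1):
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        cluster_labels = model.fit_predict(features)
        score = silhouette_score(features, cluster_labels)
        if score > best_score:
            best_k, best_score = k, score
    return best_k
```

Run once per class over its embedded samples, this yields the kind of per-class proxy histogram shown in Figures 10–13.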
Figure 14.
Examples of UCMD retrieval results. Column 1 is the query sample, columns 2–4 are the success cases of our method, and columns 5–7 are the failure cases.
Figure 15.
Examples of RSSCN7 retrieval results. Column 1 is the query sample, columns 2–4 are the success cases of our method, and columns 5–7 are the failure cases.
Figure 16.
Examples of AID retrieval results. Column 1 is the query sample, columns 2–4 are the success cases of our method, and columns 5–7 are the failure cases.
Figure 17.
Examples of PatternNet retrieval results. Column 1 is the query sample, columns 2–4 are the success cases of our method, and columns 5–7 are the failure cases.
Figure 18.
A t-SNE visualization of the proposed method. (a–e) are the t-SNE visualizations of the first five epochs of all classes in the dataset UCMD. (f–i) are the t-SNE visualizations at the 10th, 20th, 30th, and 40th epochs, respectively. Different colors in the diagram represent different classes.
Table 1.
Experiment settings.
Settings Name | Value
---|---
Batch size | 150
Embedding size | 512
Epochs | 50
Evaluation metrics | Recall@K, mAP, mAP@R
Dataset division methods | 50% for training and 50% for testing in each class; 80% for training and 20% for testing in each class; 50% of classes for training and the other 50% of classes for testing
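
As a reference for the Recall@K metric listed above, a self-contained sketch (Euclidean distances between embeddings; the paper's exact distance and tie-breaking are assumptions):

```python
import numpy as np

def recall_at_k(embeddings, labels, ks=(1, 2, 4, 8)):
    """Recall@K for retrieval: a query scores 1 if any of its K nearest
    neighbours (excluding itself) shares its label, 0 otherwise."""
    # Pairwise Euclidean distance matrix.
    d = np.linalg.norm(embeddings[:, None] - embeddings[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)        # exclude self-matches
    order = np.argsort(d, axis=1)      # neighbours, nearest first
    return {
        k: float((labels[order[:, :k]] == labels[:, None]).any(axis=1).mean())
        for k in ks
    }
```

mAP and mAP@R follow the same retrieval setup but average precision over the ranked list rather than checking only the top K.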
Table 2.
Validation of the clustering algorithm on UCMD and AID. The best results are shown in bold.
Cluster Algorithm | UCMD R@1 | UCMD R@2 | UCMD R@4 | UCMD R@8 | UCMD mAP | AID R@1 | AID R@2 | AID R@4 | AID R@8 | AID mAP
---|---|---|---|---|---|---|---|---|---|---
K-means | **97.82** | **98.67** | **99.25** | **99.71** | **94.56** | 95.95 | 96.80 | **97.65** | **98.73** | **94.15**
DBSCAN | 97.78 | 98.63 | 99.19 | 99.65 | 94.51 | **95.96** | **96.81** | 97.62 | 98.69 | 93.99
BIRCH | 97.81 | 98.61 | 99.17 | 99.68 | 94.47 | 95.87 | 96.77 | 97.61 | 98.67 | 94.08
Table 3.
Validation of the sample synthesis method on UCMD and AID. The best results are shown in bold.
Method | UCMD R@1 | UCMD R@2 | UCMD R@4 | UCMD R@8 | UCMD mAP | AID R@1 | AID R@2 | AID R@4 | AID R@8 | AID mAP
---|---|---|---|---|---|---|---|---|---|---
No synthesis | 97.38 | 98.31 | 98.86 | 99.59 | 93.13 | 95.27 | 96.37 | 97.21 | 97.67 | 93.13
Synthesis | 97.59 | 98.39 | 98.99 | 99.63 | 93.74 | 95.61 | 96.49 | 97.48 | 98.17 | 93.62
Synthesis + a | **97.82** | **98.67** | **99.25** | **99.71** | **94.56** | **95.95** | **96.80** | **97.65** | **98.73** | **94.15**
Table 4.
Validation of the parameter a on UCMD and AID. The best results are shown in bold.
a | UCMD R@1 | UCMD R@2 | UCMD R@4 | UCMD R@8 | UCMD mAP | AID R@1 | AID R@2 | AID R@4 | AID R@8 | AID mAP
---|---|---|---|---|---|---|---|---|---|---
0.2 | 97.62 | 98.43 | 99.09 | 99.65 | 93.85 | 95.66 | 96.52 | 97.51 | 98.22 | 93.77
0.4 | 97.71 | 98.53 | 99.17 | 99.68 | 94.12 | 95.79 | 96.71 | 97.58 | 98.55 | 93.92
0.6 | **97.82** | **98.67** | **99.25** | **99.71** | **94.56** | **95.95** | 96.80 | **97.65** | **98.73** | **94.15**
0.8 | 97.81 | 98.63 | 99.21 | 99.68 | 94.51 | 95.92 | **96.81** | 97.63 | 98.70 | 94.10
1 | 97.73 | 98.57 | 99.13 | 99.61 | 94.29 | 95.73 | 96.67 | 97.53 | 98.46 | 93.88
Table 5.
Validation of the number of proxies and the adaptive proxy assignment method on UCMD and AID. The best results are shown in bold.
Number of Proxies | UCMD R@1 | UCMD R@2 | UCMD R@4 | UCMD R@8 | UCMD mAP | AID R@1 | AID R@2 | AID R@4 | AID R@8 | AID mAP
---|---|---|---|---|---|---|---|---|---|---
1 | 97.42 | 98.34 | 98.98 | 99.57 | 93.07 | 95.27 | 96.37 | 97.21 | 97.67 | 93.13
2 | 97.55 | 98.41 | 99.05 | 99.59 | 93.32 | 95.42 | 96.46 | 97.33 | 97.93 | 93.54
3 | 97.72 | 98.52 | 99.12 | 99.62 | 93.68 | 95.73 | 96.61 | 97.45 | 98.41 | 93.82
4 | 97.67 | 98.49 | 99.11 | 99.63 | 93.71 | 95.75 | 96.59 | 97.47 | 98.37 | 94.05
5 | 97.69 | 98.51 | 99.13 | 99.61 | 93.75 | 95.78 | 96.62 | 97.46 | 98.39 | 94.01
adaptive | **97.82** | **98.67** | **99.25** | **99.71** | **94.56** | **95.95** | **96.80** | **97.65** | **98.73** | **94.15**
Table 6.
Validation of the proxy weighting method on UCMD and AID. The best results are shown in bold.
Method | UCMD R@1 | UCMD R@2 | UCMD R@4 | UCMD R@8 | UCMD mAP | AID R@1 | AID R@2 | AID R@4 | AID R@8 | AID mAP
---|---|---|---|---|---|---|---|---|---|---
average | 97.51 | 98.32 | 98.99 | 99.53 | 93.41 | 95.37 | 96.46 | 97.37 | 98.03 | 93.56
adaptive | **97.82** | **98.67** | **99.25** | **99.71** | **94.56** | **95.95** | **96.80** | **97.65** | **98.73** | **94.15**
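
Table 6 contrasts uniform averaging with adaptive proxy weighting when combining a class's proxies into a class-level feature. The weighting rule itself is not given in these captions; the sketch below assumes each proxy is weighted by its cluster's share of class samples, one plausible reading of the weighted edges in Figure 3f, not necessarily the paper's rule:

```python
import numpy as np

def class_feature(proxies, cluster_sizes, adaptive=True):
    """Combine one class's proxies into a single class-level feature.

    'average' weighs proxies uniformly; the adaptive variant weighs
    each proxy by the fraction of class samples in its cluster.
    Illustrative assumption, not the paper's verbatim weighting.
    """
    proxies = np.asarray(proxies, dtype=float)
    if adaptive:
        w = np.asarray(cluster_sizes, dtype=float)
        w = w / w.sum()                       # normalize cluster shares
    else:
        w = np.full(len(proxies), 1.0 / len(proxies))
    return (w[:, None] * proxies).sum(axis=0)  # weighted combination
```

Under this reading, proxies representing larger intra-class clusters pull the class feature toward themselves more strongly than proxies of small clusters.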
Table 7.
Recall@K(%), mAP, and mAP@R performance comparison on UCMD and AID. The best results are shown in bold.
UCMD uses the 50-50 split and AID the 80-20 split.

Method | UCMD R@1 | UCMD R@2 | UCMD R@4 | UCMD R@8 | UCMD mAP | UCMD mAP@R | AID R@1 | AID R@2 | AID R@4 | AID R@8 | AID mAP | AID mAP@R
---|---|---|---|---|---|---|---|---|---|---|---|---
Triplet Loss [48] | 92.58 | 94.58 | 95.91 | 98.49 | 82.36 | 78.34 | 87.74 | 91.18 | 95.26 | 96.04 | 76.95 | 69.39
N-pair Loss [53] | 95.38 | 96.76 | 97.55 | 98.86 | 85.35 | 81.93 | 90.93 | 92.98 | 95.15 | 96.83 | 82.31 | 78.13
LS Loss [49] | 96.12 | 97.29 | 98.15 | 99.01 | 87.84 | 84.49 | 92.77 | 94.05 | 95.64 | 96.97 | 83.92 | 81.09
BIER [62] | 80.31 | 85.28 | 90.11 | 91.65 | 75.72 | 58.98 | 80.72 | 86.39 | 92.01 | 95.38 | 68.88 | 59.24
A-BIER [63] | 86.52 | 89.96 | 92.61 | 94.76 | 80.82 | 72.11 | 82.28 | 89.51 | 93.55 | 96.37 | 71.59 | 70.51
DCES [64] | 87.45 | 91.02 | 94.27 | 96.32 | 81.53 | 78.93 | 85.39 | 91.02 | 95.27 | 96.63 | 74.28 | 72.53
ABE [65] | 93.71 | 95.57 | 96.96 | 98.32 | 86.53 | 82.91 | 88.33 | 91.39 | 95.56 | 96.89 | 77.83 | 76.91
MS Loss [29] | 96.84 | 97.98 | 98.36 | 99.02 | 90.04 | 85.12 | 93.23 | 94.57 | 95.81 | 97.18 | 87.18 | 84.32
Proxy-NCA Loss [33] | 96.09 | 97.27 | 98.09 | 99.13 | 88.36 | 83.49 | 91.85 | 93.11 | 95.43 | 96.96 | 83.47 | 82.61
SoftTriple Loss [39] | 96.21 | 96.98 | 97.84 | 98.98 | 89.09 | 82.61 | 92.12 | 93.84 | 95.46 | 96.82 | 84.09 | 83.82
Proxy Anchor Loss [32] | 97.22 | 98.19 | 98.75 | 99.42 | 92.41 | 87.35 | 95.12 | 96.29 | 97.11 | 97.47 | 92.85 | 87.12
Our Method | **97.82** | **98.67** | **99.25** | **99.71** | **94.56** | **90.43** | **95.95** | **96.80** | **97.65** | **98.73** | **94.15** | **90.18**
Table 8.
Recall@K(%), mAP, and mAP@R performance on RSSCN7 and PatternNet under different dataset division methods.
Dataset | R@1 | R@2 | R@4 | R@8 | mAP | mAP@R
---|---|---|---|---|---|---
RSSCN7 (50-50) | 94.64 | 96.25 | 97.85 | 98.57 | 90.71 | 82.76
RSSCN7 (80-20) | 95.14 | 97.36 | 98.43 | 99.01 | 92.64 | 85.67
PatternNet (50-50) | 99.65 | 99.79 | 99.83 | 99.88 | 99.55 | 99.01
PatternNet (80-20) | 99.87 | 99.88 | 99.90 | 99.91 | 99.74 | 99.51
Table 9.
Recall@K(%), mAP, and mAP@R performance for unseen classes on RSSCN7, UCMD, AID, and PatternNet. Half of the classes were used for training and the remaining half for testing.
Dataset | R@1 | R@2 | R@4 | R@8 | mAP | mAP@R
---|---|---|---|---|---|---
RSSCN7 | 96.92 | 98.83 | 99.58 | 99.92 | 74.32 | 55.87
UCMD | 93.40 | 96.10 | 97.70 | 98.70 | 57.21 | 40.56
AID | 87.93 | 93.24 | 96.21 | 98.11 | 48.29 | 34.21
PatternNet | 99.11 | 99.57 | 99.76 | 99.84 | 76.87 | 65.32