Article

Generating 3D Models for UAV-Based Detection of Riparian PET Plastic Bottle Waste: Integrating Local Social Media and InstantMesh

by Shijun Pan, Keisuke Yoshida, Daichi Shimoe, Takashi Kojima and Satoshi Nishiyama
1 Graduate School of Environmental and Life Science, Okayama University, Okayama 700-8530, Japan
2 TOKEN C.E.E. Consultants Co., Ltd., Tokyo 170-0004, Japan
* Author to whom correspondence should be addressed.
Drones 2024, 8(9), 471; https://doi.org/10.3390/drones8090471
Submission received: 2 August 2024 / Revised: 30 August 2024 / Accepted: 5 September 2024 / Published: 9 September 2024

Abstract:
In recent years, waste pollution has become a severe threat to riparian environments worldwide. Along with the advancement of deep learning (DL) algorithms (i.e., object detection models), related techniques have become useful for practical applications. This work develops a data generation approach for producing datasets for small-target recognition, especially in remote sensing images. A relevant point is that the similarity between the data used for model training and the data used for testing is crucially important for object detection performance. Therefore, obtaining training data with high similarity to the monitored objects is a key objective of this study. Currently, Artificial Intelligence Generated Content (AIGC), such as single target objects generated by Luma AI, is a promising data source for DL-based object detection models. However, most of the training data supporting the generated results are not from Japan. Consequently, the generated data are less similar to monitored objects in Japan, having, for example, different label colors, shapes, and designs. For this study, the authors developed a data generation approach that combines social media (Clean-Up Okayama) and single-image-based 3D model generation algorithms (e.g., InstantMesh) to provide a reliable reference for the future generation of localized data. The YOLOv8 model trained in this research on the S2PS (Similar to Practical Situation) AIGC dataset produced encouraging results (high F1 scores, approximately 0.9) in scenario-controlled UAV-based riparian PET bottle waste identification tasks. The results of this study show the potential of AIGC to supplement or replace real-world data collection and to reduce the on-site workload.

1. Introduction

Recently, polyethylene terephthalate (PET) bottles in riparian environments, especially as pollution in oceans, have come to pose a global environmental threat [1]. A main cause of this severe problem is the dumping of waste in natural environments such as riverbanks or water surfaces [2]. PET bottles are an important recyclable resource, but they are also an abundant source of waste [1,2,3]. Once the waste enters the ocean (Figure 1), plastics are broken down by sunlight and the mechanical weathering of wind and waves, eventually being degraded and abraded into smaller fragments known as microplastics; the degradation of these microplastics can take hundreds of years [4,5,6]. In the meantime, both large pieces of plastic and microplastics place a substantial burden on marine life, putting animals at risk of becoming entangled and, if they ingest the plastic, of starving to death. Plastics’ entry into the food chain can also lead to the absorption or ingestion of toxic substances by animals, with dire consequences for the animals and for the humans who eat them, the degrees of which remain largely unknown [7,8,9].
Rivers are the main conduit for plastic waste transport to oceans [3,6,10]. Researchers examining the Jiu Long River coastline in the port city of Xiamen are using cameras to track the slow progress of plastic trash flowing into the ocean. The large amounts of visual data collected using the cameras are useful for identifying how the waste moves downstream [11]. To reduce the amount of plastic entering the ocean, the concept of riparian plastic monitoring has spread to several cities worldwide, including the River Thames in London, England [12] and the Buriganga River in Dhaka, the capital of Bangladesh [13]. Along storm drains in the city of Hobart, Australia, monitoring cameras occasionally take photographs of trash floating by on the river [14]. These sites are part of an international project initiated by the Commonwealth Scientific and Industrial Research Organisation (CSIRO). To date, the project has collected more than 6000 photographs from water bodies in the three cities and has used artificial intelligence (AI) to train computers to identify plastic pollution in the photographs automatically and to sort the items into 30 categories. The most common types of waste have been found to be food packaging and plastic bottles (e.g., PET). Such detailed information is crucially important: knowing what kinds of trash are present in most rivers can help cities target the sources of their trash problems and implement countermeasures more specifically [15].
In some places, drones have become important tools for tracking plastics [16,17,18]. Researchers in the Philippines, in collaboration with the German Research Center for AI, have used a fleet of drones to document plastic pollution in rivers feeding into the heavily polluted Manila Bay [19]. The researchers have applied machine learning to analyze the resulting video and combine it with footage taken by bridge-mounted cameras to detect trash entering the bay. To date, the World Bank-funded project has identified several hotspots at which plastic pollutants collect before entering the bay [20]. Ultimately, the project is expected to use drone mapping and additional field surveys to inform government interventions that eliminate waste at the source.
To date, several advances have occurred in unmanned aerial vehicle (UAV)-based river waste detection technologies and datasets [21,22,23]. A technology company from the UK, Ellipsis Earth Ltd., has developed a system that uses drones, fixed cameras, and vehicles to create detailed maps of waste, identify hotspots, and elucidate the movement of pollution. This company can detect and categorize individual litter objects or dumping site locations at a low UAV flight height. Furthermore, the company has used UAV and AI technologies to detect and categorize beach and river trash with more than 95% accuracy. Nevertheless, this system covers only a few countries, which do not include Japan.
Although the system described above does not cover Japan, waste pollution problems also exist there. Figure 2 presents a scene from the Hyakken River Basin, Japan. The authors have embarked on an exploration of the waste pollution dynamics in this region. River inspection and river patrol are two river management services in Japan, each comprising several items; among these is waste pollution detection. River patrols, according to the example of river patrol regulations [24] from the Ministry of Land, Infrastructure, Transport and Tourism (MLIT), Japan, “patrol rivers regularly and systematically as part of river management under normal conditions, detecting abnormalities and changes, and generally monitor the river”, and play an important role in elucidating the ever-changing conditions of rivers. Currently, river patrols in Japan rely on the human vision of the drivers and passengers of patrol vehicles (visual inspection only). They are fundamentally conducted at least twice a week on large, government-managed rivers.
Based on the information described above, and considering time consumption and efficiency, UAV-based river patrols can provide several benefits compared with human vision. Furthermore, MLIT plans to apply UAVs to regular river patrols in the near future [25]. However, producing a suitable dataset for training models for UAV-based river patrols is becoming a topic deserving attention from MLIT managers. The authors have used several public datasets from open source projects (e.g., UAV-BD, UAVVaste, Nisha484) [26,27,28] for training models; their limitations are listed hereinafter.
  • Dataset size and coverage: Although some large datasets are available, their size and coverage remain limited for practical applications. Many datasets cover only specific regions or waste types and cannot adequately represent pollution in different environments.
  • Ground sample distance (GSD) or resolution: The GSD of each dataset was set for its own practical application; finding a dataset that matches all practical needs is difficult.
  • Real-time performance and adaptability: Existing models and datasets must still be improved for detecting and adapting to different environmental changes in real time. Especially in complex natural environments, the robustness and adaptability of the models must be further improved.
Considering the limitations described above, the authors previously tried to apply AI-generated content (AIGC) to practical waste pollution detection tasks in riparian environments using UAVs and deep learning [29]. That study explored the potential of AIGC as an alternative data source for training waste detection models. Using the Stable Diffusion text-to-image model, synthetic images of riparian waste were generated from engineered prompts. Three datasets were created and compared: purely AIGC-based, AIGC combined with background-changed real images, and real-world UAV images. YOLOv5 object detection models were trained on these datasets and were evaluated using both a held-out test set of real UAV images and two public benchmark datasets. The results demonstrated that, whereas real-world data performed best on the held-out test set, the AIGC-based models outperformed the real-world model on the public datasets, particularly in cases with simpler backgrounds. These findings show that text-to-image AIGC has the potential to supplement or partially replace real-world data collection for environmental monitoring, consequently reducing the need for resource-intensive UAV flights and manual annotation.
Another study presents a comprehensive analysis of riparian waste pollution in the Hyakken River Basin, Japan, using smartphone-based image collection and advanced deep learning techniques [30,31]. Research was conducted to elucidate the real-world conditions of waste pollution after long-term exposure to environmental factors. High-resolution images were taken using a smartphone’s 48 MP (megapixels) camera and were annotated using Roboflow’s Smart Polygon function (Roboflow is a comprehensive platform designed for building and deploying computer vision models). A YOLOv8n-seg model was trained on the created Hyakken River Basin Waste Dataset (HRB-WD), with the inclusion of data augmentation techniques. The trained model was tested using smartphone video frames from different field sections, with results compared to those obtained using pre-trained models and the Segment Anything Model (SAM). Results demonstrated that the HRB-WD-trained model performed better than pre-trained models for detection of on-site waste pollution. This study revealed important differences between real-world waste conditions and those represented in benchmark datasets or AI-generated content. Practical implications were explored through volunteer cleanup efforts, highlighting the need for accurate workload estimation tools. The results of this research can contribute to environmental monitoring by providing novel datasets and by emphasizing the importance of real-world data for developing effective waste detection models for riparian areas.
A technology company, Luma AI (San Francisco, CA, USA), supports the creation of realistic three-dimensional (3D) models through the web [32]. One of its standout features is GENIE, a tool that allows users to create 3D models from text prompts, similarly to AI image generation. With GENIE, users can generate 3D models of objects that might not even exist in real life. Considering the wide applicability of single-object 3D models to accumulating waste pollution data resources, single-object 3D model generation is expected to continue developing. Furthermore, as Figure 3 shows, the authors strove to generate PET bottle data using Luma AI GENIE.
Based on a comparison of the txt2img AIGC described above, the on-site HRB-WD, and the online Luma AI GENIE generations (as shown in Figure 4), the authors found that the AI-generated results (txt2img AIGC and Luma AI GENIE generations) differ from local PET bottles (HRB-WD) in terms of Japanese style, label, color, and design. Applying the AI-generated PET bottles, specifically the samples portrayed in Figure 4, as input for training a model to detect local, on-site waste pollution can therefore create a feature-based gap between the training and test data (e.g., the dust texture attached to a bottle, or the change in bottle body color caused by sunlight). To bridge the gap between the AI-generated data and the on-site data, similar-to-practical-situation AIGC (i.e., S2PS AIGC, including but not limited to images and 3D models) should be given due consideration.
Understanding the necessity of collecting S2PS AIGC, the authors sought an approach that could meet the corresponding needs. During the search process, the authors found a community-driven initiative in Okayama prefecture called “Clean-Up Okayama” (okayama.pref.pirika.org (accessed on 10 June 2024); referred to as the SNS hereinafter; Pirika Municipal Edition Visualization Page “The Land of Sunshine Cleanup Okayama”) [33]. This initiative encourages residents to participate in waste collection and environmental cleanup activities. The project leverages the social media platform “Pirika” to visualize and document volunteers’ efforts across the region. Participants can upload photographs of the waste they collect. The images are then shared on the platform to raise awareness and inspire others to join the cause. The initiative also supports large-scale cleanup activities by covering the costs of waste transportation and disposal for volunteer efforts. The goal is to create a cleaner and more beautiful Okayama, befitting of its nickname “The Land of Sunshine.” By connecting individual efforts, the project aims to foster a sense of community and collective responsibility for the environment.
Figure 5 displays the interface of the web version of the SNS, which is separated into four sections. Panels (1) and (4) of Figure 5 mainly show the number of participants and the amount of collected waste. Furthermore, based on the waste from all periods, panel (2) of Figure 5 visualizes the city-unit-based number of waste pieces per 10,000 people. Panel (3) of Figure 5 shows where participants can share information in the community (i.e., hashtags); this is the main section in which images can be confirmed and selected. Figure 6 depicts some samples derived from section 3 of Figure 5.
Figure 7 presents the process of applying the SNS to collect S2PS AIGC. As shown in Figure 7 (1), users pick up on-site waste locally. In the subsequent steps, as shown in Figure 7 (2 and 3), images of the picked-up bottles are captured and uploaded, which is necessary for analyzing the waste amount and location data. The main domain is Okayama prefecture. In this process, there is no manual guiding the user on how to photograph the objects (e.g., camera angle, distance to the objects, or background color). As the final step in Figure 7 (4), a single-image-based 3D generator converts the 2D information into a 3D model (i.e., the S2PS AIGC in this study).

2. Materials and Methods

2.1. Study Sites

As shown in Figure 8, the authors selected the Hyakken River, Okayama Prefecture, Japan, as the study site because of the high density of data collected there over the entire period. The Hyakken River, an artificial river, allows water to overflow and drain into the sea during floods. It also serves as an irrigation canal for the surrounding farmland.

2.2. InstantMesh

InstantMesh [34] is a cutting-edge model that is able to generate high-fidelity 3D meshes from a single image in about 5 s. This model is notable for its efficiency and accuracy, even reconstructing the non-visible parts for 3D model creation. It is a feed-forward framework based on the Large Reconstruction Model (LRM)/Instant3D architecture. For this study, as shown in Figure 9, the authors created 3D models using the Hugging Face Space by TencentARC (Applied Research Center). This model was selected mainly for its ease of use and its acceptable 2D-to-3D reconstruction ability.
The process of generating the 3D model using InstantMesh includes the following steps (a scripted calling sketch follows the list):
  • Selecting the object image:
    The authors selected objects with and without labels near the riverbank. Most of the selected objects are empty; several are filled with cigarette litter.
  • Generating a multi-view image:
    This step was mainly completed by using the input image to generate images from different views (up, down, left, and right) and combining these generated images into one.
  • Reconstructing the 3D Model:
    Based on the generated multi-view image, the Sparse-view Large Reconstruction Model reconstructed the 3D model.
  • Outputting the 3D Model file:
    After the 3D model reconstruction, the result can be output as a .obj (the model will be flipped) or .glb (the model shown here has a darker appearance) format file.
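For readers who prefer to script this step rather than use the web interface, a minimal sketch using the gradio_client package to call the same TencentARC/InstantMesh Hugging Face Space is shown below. The Space name is real, but the endpoint name, argument order, and input file name (sns_bottle.jpg) are assumptions; the actual signatures should be confirmed with client.view_api() before use.

```python
# Sketch: calling the TencentARC/InstantMesh Hugging Face Space from Python.
# The endpoint name and arguments below are HYPOTHETICAL placeholders; run
# client.view_api() to see the real ones exposed by the Space.
from gradio_client import Client, handle_file

client = Client("TencentARC/InstantMesh")   # public Space hosting the model
client.view_api()                           # prints the available endpoints and signatures

# Hypothetical call: upload one SNS bottle photograph and receive the outputs
# (multi-view image and/or .obj/.glb mesh paths, depending on the endpoint).
result = client.predict(
    handle_file("sns_bottle.jpg"),          # single input image selected from the SNS
    api_name="/generate_mvs",               # placeholder endpoint name
)
print(result)
```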

2.3. Generating S2PS AIGC

Figure 10 shows the three main steps of generating S2PS AIGC using InstantMesh: 1. Input Image, selecting one image from the SNS that has a clear outline of the bottles; 2. Generate Multi-views, in which InstantMesh generates the non-visible parts of the bottles that are necessary for 3D model creation; and 3. Output Model (OBJ Format), in which a single-object-derived S2PS AIGC is output and saved in an editable format.
As shown in Figure 11, there are 10 S2PS AIGC samples. These selected samples have different conditions: with/without labels, with/without contents inside, dirty/clean external appearance, and with/without shape change. From an appearance-based perspective, the main parts of the bottles (top, neck, and body) have been recreated clearly. However, some limitations of the S2PS AIGC derive from the multi-view images; for example, some objects show changes in bottle body shape or lack the top part.

2.4. Single Object Extraction

As a supplement to riparian waste datasets, the S2PS AIGC cannot be applied directly to model training. As shown in Figure 12, an S2PS AIGC model was dropped as input into the glTF Viewer (gltf-viewer.donmccurdy.com, developed by Don McCurdy, San Francisco, CA, USA), a website that can display 3D models. By choosing autoRotate in the glTF Viewer, the S2PS AIGC model rotates automatically. The authors recorded the automatic rotation videos and output them as .mp4 files.
As shown in Figure 13, frame images must be extracted from the automatic rotation videos. Furthermore, these frames (i.e., single object extraction) are applicable to future model training as supplemental data.
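A minimal sketch of this frame-extraction step with OpenCV is given below. The video file name (rotation.mp4), the output folder, and the choice of keeping every tenth frame are illustrative assumptions rather than the authors' exact settings.

```python
# Extract frames from a recorded autoRotate video and save them as PNG images.
import os
import cv2

os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("rotation.mp4")      # assumed file name of the rotation video
saved, index = 0, 0
step = 10                                   # keep every 10th frame to avoid near-duplicates
while True:
    ok, frame = cap.read()
    if not ok:
        break                               # end of the video
    if index % step == 0:
        cv2.imwrite(f"frames/frame_{saved:04d}.png", frame)
        saved += 1
    index += 1
cap.release()
print(f"Extracted {saved} frames")
```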

2.5. Generating Specific Datasets

To prove the possibility of applying the specific single-object extraction to detecting an object in a drone image, one specific frame image was selected, as shown in Figure 14. This specific object was selected because it is similar to the practical size of the object shown in the drone image. After the background is made transparent, size adjustment (i.e., from 832.5 px to 85.1 px, where px stands for pixels) is performed in combination with one drone image. Thereby, one new resource image can be generated. This generated new resource image alone is insufficient for model training; pre-training data augmentation is necessary to increase the number of images in the dataset. To minimize color feature changes before model training, the data augmentation operations were chosen as Rotate (i.e., Horizontal), Flip (i.e., Clockwise, Counter-Clockwise, Upside-Down), and Blur (up to 2.5 px). The generated specific dataset includes the generated new resource image and all data-augmented images.
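The following Pillow sketch illustrates the compositing and augmentation steps described above. The file names (bottle_cutout.png, drone.jpg), the paste position, and the target width of 85 px are illustrative assumptions; the authors' actual workflow may have relied on image-editing or annotation tools rather than a script.

```python
# Composite a transparent bottle cutout onto a drone image and create two augmented variants.
from PIL import Image, ImageFilter

bottle = Image.open("bottle_cutout.png").convert("RGBA")   # cutout with transparent background
drone = Image.open("drone.jpg").convert("RGBA")            # UAV background image

# Scale the cutout from its original width (~832 px) down to the on-image size (~85 px).
target_w = 85
target_h = round(bottle.height * target_w / bottle.width)
bottle_small = bottle.resize((target_w, target_h), Image.LANCZOS)

# Paste at an assumed location, using the cutout's alpha channel as the mask.
composite = drone.copy()
composite.paste(bottle_small, (600, 400), bottle_small)
composite.convert("RGB").save("resource_image.jpg")

# Two of the augmentations named above (flip and mild blur); rotation is analogous.
composite.transpose(Image.FLIP_TOP_BOTTOM).convert("RGB").save("aug_upside_down.jpg")
composite.filter(ImageFilter.GaussianBlur(radius=2.5)).convert("RGB").save("aug_blur.jpg")
```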
After generation of the specific dataset from one resource image, multiple resource images were generated for more challenging tasks, as shown in Figure 15. After background and object selection and size adjustment (i.e., to around 0.1 times the original size), multiple resource images were generated as the S2PS AIGC Dataset. The directions and locations of the PET bottles were set randomly.
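A sketch of this multiple-resource-image variant is shown below: several cutouts are pasted at random positions and orientations onto one background, and YOLO-style labels are written alongside. The object count, file names, and fixed 0.1x scale are assumptions for illustration.

```python
# Generate one synthetic multi-object resource image with random bottle placement,
# plus a YOLO-format label file (class x_center y_center width height, normalized).
import random
from PIL import Image

background = Image.open("background.jpg").convert("RGBA")
bottle = Image.open("bottle_cutout.png").convert("RGBA")

scale = 0.1                                               # "around 0.1 times" size adjustment
small = bottle.resize((max(1, int(bottle.width * scale)),
                       max(1, int(bottle.height * scale))), Image.LANCZOS)

labels = []
for _ in range(5):                                        # assumed number of bottles per image
    rotated = small.rotate(random.uniform(0, 360), expand=True)
    x = random.randint(0, background.width - rotated.width)
    y = random.randint(0, background.height - rotated.height)
    background.paste(rotated, (x, y), rotated)
    labels.append((0,                                     # class 0 = PET bottle
                   (x + rotated.width / 2) / background.width,
                   (y + rotated.height / 2) / background.height,
                   rotated.width / background.width,
                   rotated.height / background.height))

background.convert("RGB").save("s2ps_resource_0001.jpg")
with open("s2ps_resource_0001.txt", "w") as f:
    f.writelines(f"{c} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}\n" for c, cx, cy, bw, bh in labels)
```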

2.6. YOLOv8

After generation of the specific dataset, it must be evaluated using an object detection model. For this study, You Only Look Once version 8 (YOLOv8) [35] was chosen as the evaluation model. YOLOv8 was developed by Ultralytics Inc., Austin, TX, USA, which earlier released YOLOv5. This model was chosen for its state-of-the-art performance balancing speed and accuracy in real-time applications, its support for both CPU and GPU operation, and its range of model sizes (n, s, m, l, and x). As shown in Table 1, Table 2, Table 3, Table 4 and Table 5, all the necessary information was confirmed before model training.
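As a reference, a training call with the Ultralytics package using the Table 2 parameters might look like the sketch below; dataset.yaml is an assumed configuration file pointing to the generated train/valid splits and the single PET-bottle class.

```python
# Train YOLOv8 (nano) on the generated specific dataset with the Table 2 settings.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")        # model size "n", as listed in Table 2
results = model.train(
    data="dataset.yaml",          # assumed dataset configuration file
    imgsz=512,                    # image size
    epochs=1000,                  # 10,000 in the second experiment
    batch=16,
    patience=50,                  # 1000 in the second experiment
    lr0=0.01,
)
```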

3. Results

After training YOLOv8 using the generated specific dataset, the training results and samples of the batch images were obtained; they are shown in Figure 16 and Figure 17 as references. One test image was chosen for evaluating the trained YOLOv8 model. Figure 18 presents the process of choosing the test image.
Figure 19 shows the relation between the inference image size and the corresponding confidence value. Up to around 1.5 times the original size (i.e., from 512 px to 734 px), the confidence value remains greater than 0.6.
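A sketch of how such an image-size sweep can be run against the trained model is given below; the checkpoint path, test image name, and list of inference sizes are assumptions.

```python
# Run inference on one test image at several sizes and print the detection confidences.
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")   # checkpoint written by the training run
for size in (512, 640, 734, 896, 1024):             # assumed sweep of inference image sizes
    results = model.predict("test.jpg", imgsz=size, verbose=False)
    confs = [round(c, 2) for c in results[0].boxes.conf.tolist()]
    print(size, confs)
```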
Considering the practical application of detecting a similar object against a similar background to those in the training dataset, and because this test is more difficult, the authors changed the number of epochs from 1000 (train/test using the same object and the same background) to 10,000 (train/test using a similar object and a similar background), as shown in Figure 20. As shown in Figure 21 and Figure 22, most of the objects have a confidence value of 0.9, and the validation results detected all objects in the images with a 1.0 F1 score.
The main reason the confidence values are high is that, under conditions of similar objects and backgrounds, the object sizes are similar. As shown in Figure 23, the pixel-based sizes of three objects are almost identical (i.e., 85.1 px, 84.8 px, and 85.9 px). Furthermore, the backgrounds are selected from the same location or are similar, and the color features of the objects are also similar.
Based on the results derived from tests 1 and 2, the authors prepared test 3, which includes more background types (e.g., road, grass, and embankment) and location details (e.g., the bottle location in the image). Considering the time constraints, the authors selected 1000 epochs and a batch size of 256, as shown in Figure 24, with the training batch samples in Figure 25. As shown in Figure 26 and Figure 27, the validation results detected most of the objects in the images, with an overall F1 score of 0.93. Regarding the undetected objects, the insufficient variety of background images used to generate the dataset was the main reason they were missed.

4. Discussion

The trained YOLOv8 model in this research, obtained from the S2PS AIGC dataset, produced encouraging results in scenario-controlled UAV-based riparian PET bottle waste identification tasks. After proving the possibility of applying a single object, the application of waste-group generation (Figure 28) must also be considered in future studies. Furthermore, after demonstrating the possibility of applying an SNS as the input resource, in this case “Clean-Up Okayama”, the River Management Data Intelligent System (RiMaDIS), which includes on-site waste images, can also be regarded as a data input. Additionally, apart from the successful S2PS AIGC, some 3D models failed to be generated. One reason might be that, as shown in Figure 29, the cover (in this case grass) over the PET bottle and the non-90° camera angle prevented InstantMesh from capturing key information for the successful generation of the 3D model. The current process also involves several manual steps (frame extraction, size adjustment, etc.); future work should focus on automating these processes in a more scalable way for larger datasets. While this study focused on the Hyakken River for testing the models, it is also important to test performance in diverse riparian environments to ensure the generalizability of the models. After all the considerations mentioned above, integrating the S2PS AIGC approach with existing waste detection systems in practical applications will be a valuable topic for river management. After several specific image-based data processes (i.e., frame image extraction, object size adjustment, background transparency/exchange, and data augmentation) under controlled conditions, SNS-resourced, local-object-referenced 3D models can be applied to generate specific riparian waste datasets for UAV-based riparian PET bottle waste detection.

5. Conclusions

The trained YOLOv8 model obtained from the S2PS AIGC dataset produced encouraging results (high F1 scores, approximately 0.9) in scenario-controlled UAV-based riparian PET bottle waste identification tasks. This suggests that the approach established in this work has practical applications in river management and environmental monitoring. Finally, the approach described in this study has the potential to greatly minimize the resources required for data collection and annotation in environmental monitoring applications, making trash detection systems more deployable in a variety of locales.

6. Future Works

Originally, the method in this research was designed for plastics on riverbanks that can be observed from the UAV view, e.g., on or inside vegetation. In flowing water (where the water surface has no other noise, such as bubbles), macroplastic is much easier to detect than on riverbanks because of the simpler background, so the trained algorithms are also applicable to monitoring macroplastic in flowing water. The authors therefore also intend to apply the method in this research to other possible practical applications (e.g., macroplastic on the water surface), not just to riverbanks.
Another point needs to be mentioned: because the authors did not collect a dataset that captures the same objects in different seasons at the same location with the same GSD (i.e., ground sample distance), it is difficult to discuss seasonal impacts on the method’s accuracy and general utility in riverbank environments. The authors will attempt to address this topic in future research.
Beyond the content mentioned above, several points also warrant further research, as follows:
  • 3D Waste Group Generation;
  • RiMaDIS-based Data Source Expansion;
  • Limitation of 3D Model Generation (or S2PS AIGC);
  • Scalability and Automation of the 3D Models Generation;
  • Performance of S2PS AIGC Across Different Environments;
  • Integration of S2PS AIGC with Existing Waste Detection Systems;
  • Multiple Object Detection Algorithms for Verification;
  • Multiple Study Sites/Locations/Backgrounds for Verification.

Author Contributions

Conceptualization, S.P. and K.Y.; methodology, S.P. and T.K.; software, S.P. and T.K.; validation, S.P., T.K. and D.S.; formal analysis, S.P.; investigation, S.P.; resources, S.P.; data curation, S.P. and D.S.; writing—original draft preparation, S.P.; writing—review and editing, K.Y.; visualization, S.P.; supervision, K.Y. and S.N.; project administration, K.Y.; funding acquisition, S.P. and K.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by JST SPRING, grant number JPMJSP2126. This research was funded by Okayama University with donations from Dowa Holdings Co., Ltd. and Toho Electric Industrial Co., Ltd. This research was funded by the River Fund of the River Foundation, Japan, project number 2024-5211-067.

Data Availability Statement

Data related to this research can be made available by request from the corresponding author.

Acknowledgments

This work was supported by JST SPRING, Japan, Grant Number JPMJSP2126. The authors acknowledge the financial support provided by Okayama University with donations from Dowa Holdings Co., Ltd. and Toho Electric Industrial Co., Ltd. This study was supported by the River Fund of the River Foundation, Japan. Data collection from the Pirika Municipal Edition Visualization Page “The Land of Sunshine Cleanup Okayama” (Japanese only) was authorized by the Ministry of the Environment/Agency for Cultural Affairs, Recycling-Based Society Promotion Division, Okayama prefecture.

Conflicts of Interest

Author Takashi Kojima was employed by the company TOKEN C.E.E. Consultants Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

AI: Artificial Intelligence
AIGC: Artificial Intelligence Generated Content
DL: Deep Learning
GSD: Ground Sample Distance
HRB-WD: Hyakken River Basin Waste Dataset
MLIT: Ministry of Land, Infrastructure, Transport and Tourism
PET: Polyethylene Terephthalate
SAM: Segment Anything Model
S2PS: Similar to Practical Situation
UAV: Unmanned Aerial Vehicle
YOLOv5(8): You Only Look Once version 5(8)

References

  1. Das, S.K.; Eshkalak, S.K.; Chinnappan, A.; Ghosh, R.; Jayathilaka, W.A.D.M.; Baskar, C.; Ramakrishna, S. Plastic Recycling of Polyethylene Terephthalate (PET) and Polyhydroxybutyrate (PHB)—A Comprehensive Review. Mater. Circ. Econ. 2021, 3, 9. [Google Scholar] [CrossRef]
  2. Ferronato, N.; Torretta, V. Waste Mismanagement in Developing Countries: A Review of Global Issues. Int. J. Environ. Res. Public Health 2019, 16, 1060. [Google Scholar] [CrossRef] [PubMed]
  3. Bratovcic, A.; Nithin, A.; Sundaramanickam, A. Microplastics pollution in rivers. In Microplastics in Water and Wastewater; Springer: Berlin/Heidelberg, Germany, 2022; pp. 123–145. [Google Scholar] [CrossRef]
  4. Lin, Y.-D.; Huang, P.-H.; Chen, Y.-W.; Hsieh, C.-W.; Tain, Y.-L.; Lee, B.-H.; Hou, C.-Y.; Shih, M.-K. Sources, Degradation, Ingestion and Effects of Microplastics on Humans: A Review. Toxics 2023, 11, 747. [Google Scholar] [CrossRef] [PubMed]
  5. Cai, Z.; Li, M.; Zhu, Z.; Wang, X.; Huang, Y.; Li, T.; Gong, H.; Yan, M. Biological Degradation of Plastics and Microplastics: A Recent Perspective on Associated Mechanisms and Influencing Factors. Microorganisms 2023, 11, 1661. [Google Scholar] [CrossRef] [PubMed]
  6. Newbould, R.A.; Powell, D.M.; Whelan, M.J. Macroplastic Debris Transfer in Rivers: A Travel Distance Approach. Front. Water 2021, 3, 724596. [Google Scholar] [CrossRef]
  7. Eze, C.G.; Nwankwo, C.E.; Dey, S.; Sundaramurthy, S.; Okeke, E.S. Food chain microplastics contamination and impact on human health: A review. Environ. Chem. Lett. 2024, 22, 1889–1927. [Google Scholar] [CrossRef]
  8. Saeedi, M. How microplastics interact with food chain: A short overview of fate and impacts. J. Food Sci. Technol. 2024, 61, 403–413. [Google Scholar] [CrossRef] [PubMed]
  9. New Link in the Food Chain? Marine Plastic Pollution and Seafood Safety. Available online: https://ehp.niehs.nih.gov/doi/pdf/10.1289/ehp.123-A34 (accessed on 1 February 2015).
  10. Maharjan, N.; Miyazaki, H.; Pati, B.M.; Dailey, M.N.; Shrestha, S.; Nakamura, T. Detection of River Plastic Using UAV Sensor Data and Deep Learning. Remote Sens. 2022, 14, 3049. [Google Scholar] [CrossRef]
  11. Xiamen Leads Way in Tackling Ocean Trash. Available online: https://dialogue.earth/en/pollution/12586-ocean-trash-xiamen-china/ (accessed on 10 January 2020).
  12. River Thames ‘Severely Polluted with Plastic’. Available online: https://www.bbc.com/news/science-environment-53479635 (accessed on 21 July 2020).
  13. 30,000 Tonnes of Plastic in 4 Rivers. Available online: https://en.prothomalo.com/environment/pollution/30000-tonnes-of-plastic-in-4-rivers (accessed on 18 October 2020).
  14. Monitoring Plastic Pollution with AI. Available online: https://research.csiro.au/ending-plastic-waste/wp-content/uploads/sites/408/2021/09/Factsheet_AITrash_FINAL.pdf (accessed on 28 September 2020).
  15. 8 Amazing Solutions to Stop Plastic Flowing into the World’s Oceans. Available online: https://www.weforum.org/agenda/2021/06/rivers-plastic-waste-clean-up-projects-trash/ (accessed on 28 June 2021).
  16. Jakovljevic, G.; Govedarica, M.; Alvarez-Taboada, F. A Deep Learning Model for Automatic Plastic Mapping Using Unmanned Aerial Vehicle (UAV) Data. Remote Sens. 2020, 12, 1515. [Google Scholar] [CrossRef]
  17. Geraeds, M.; van Emmerik, T.; de Vries, R.; bin Ab Razak, M.S. Riverine Plastic Litter Monitoring Using Unmanned Aerial Vehicles (UAVs). Remote Sens. 2019, 11, 2045. [Google Scholar] [CrossRef]
  18. Yang, Q.; Liu, M.; Zhang, Z.; Yang, S.; Ning, J.; Han, W. Mapping Plastic Mulched Farmland for High Resolution Images of Unmanned Aerial Vehicle Using Deep Semantic Segmentation. Remote Sens. 2019, 11, 2008. [Google Scholar] [CrossRef]
  19. Wolf, M.; van den Berg, K.; Garaba, S.P.; Gnann, N.; Sattler, K.; Stahl, F.; Zielinski, O. Machine learning for aquatic plastic litter detection, classification and quantification (APLASTIC-Q). Environ. Res. Lett. 2020, 15, 114042. [Google Scholar] [CrossRef]
  20. Market Study for Philippines: Plastics Circularity Opportunities and Barriers. Available online: https://www.worldbank.org/en/country/philippines/publication/market-study-for-philippines-plastics-circularity-opportunities-and-barriers-report-landing-page (accessed on 21 March 2021).
  21. Majchrowska, S.; Mikołajczyk, A.; Ferlin, M.; Klawikowska, Z.; Plantykow, M.A.; Kwasigroch, A.; Majek, K. Deep learning-based waste detection in natural and urban environments. Waste Manag. 2022, 138, 274–284. [Google Scholar] [CrossRef] [PubMed]
  22. Han, W. UAV Data Monitoring Plastic Waste Dataset [DS/OL]. V1. Science Data Bank, 2021. CSTR:31253.11.sciencedb.01121. Available online: https://cstr.cn/31253.11.sciencedb.01121 (accessed on 20 July 2024).
  23. Github. Available online: https://github.com/wimlds-trojmiasto/detect-waste (accessed on 24 October 2022).
  24. MLIT. Available online: https://www.mlit.go.jp/river/shishin_guideline/ (accessed on 5 May 2011).
  25. MLIT. Available online: https://www.mlit.go.jp/river/shinngikai_blog/kentoukai/drone/dai01kai/pdf/3_drone_katsuyou.pdf (accessed on 7 July 2019).
  26. Github. Available online: https://github.com/jwwangchn/UAV-BD (accessed on 6 September 2018).
  27. UAVVaste. Available online: https://uavvaste.github.io/ (accessed on 1 March 2021).
  28. Github. Available online: https://github.com/Nisha484/Nisha (accessed on 2 February 2022).
  29. Pan, S.; Yoshida, K.; Kojima, T. Application of the Prompt Engineering-assisted Generative AI for the Drone-based Riparian Waste Detection. Intell. Inform. Infrastruct. 2023, 4, 50–59. [Google Scholar] [CrossRef]
  30. Pan, S.; Yoshida, K.; Kojima, T. Comprehensive Analysis of On-Site Riparian Waste Pollution: A Case Study on the Hyakken River Basin. Intell. Inform. Infrastruct. 2023, 5, 98–110. [Google Scholar] [CrossRef]
  31. Pan, S.; Yoshida, K.; Kojima, T. Riparian Waste Pollution Dataset for the Hyakken River Basin (Version 1.0). J-STAGE 2023, 5, 98–103. [Google Scholar] [CrossRef]
  32. GENIE. Available online: https://lumalabs.ai/genie (accessed on 10 January 2024).
  33. Visualizing Website “Sunny Land Cleanup Okayama”. Available online: http://okayama.pref.pirika.org (accessed on 4 April 2017).
  34. Xu, J.; Cheng, W.; Gao, Y.; Wang, X.; Gao, S.; Shan, Y. InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models. arXiv 2024. [Google Scholar] [CrossRef]
  35. Github. Available online: https://github.com/ultralytics (accessed on 2 August 2024).
Figure 1. Process of plastics’ entry into the food chain. The lifecycle of plastic waste in aquatic ecosystems includes the following steps: initial river entry; breakdown of oceanic macroplastics into microplastic pollution under the force of wind and sunlight; and biomagnification through the marine food chain into the human body.
Figure 2. On-site waste pollution detected in the Hyakken River, Japan.
Figure 3. Bottle-related models generated by Luma AI GENIE (website page in the upper panel). From the several results, four bottle-related models generated using different prompts were selected (lower panel).
Figure 4. Samples of PET bottle waste derived from txt2img AIGC, on-site, and Luma AI GENIE generations.
Figure 5. Clean-Up Okayama website (English content in this figure was derived from image-based Google Translate) that includes four main parts: 1. Total number of participants and the amount of waste picked up in Okayama prefecture; 2. Number of waste items from the whole period distributed in the Okayama prefecture, with mapping derived from Google Maps, Alphabet Inc., Mountain View, CA, USA; 3. Comments and field images collected and uploaded by the website users, with obscured user names and profile logos; 4. Chart of waste collection activities in Okayama prefecture, including number of people and waste by date.
Figure 6. Pick-up sample images from section 3 in Figure 5.
Figure 7. Process of collecting local bottle waste-based objects: 1. Collecting on-site waste; 2. Taking the image; 3. Uploading the image to the website; 4. Generating the 3D model.
Figure 8. The upper map displays the locations (icons) of the waste-related image capture and upload. The lower satellite image shows the Hyakken River area (both are derived from Google Maps; Alphabet Inc.).
Figure 9. InstantMesh model architecture.
Figure 10. Process of generating S2PS AIGC using InstantMesh: 1. Inputting the image; 2. Generating the multiple views derived from a single input image; 3. Outputting the GLB/OBJ-formatted S2PS AIGC.
Figure 11. Samples of the S2PS AIGC.
Figure 12. Process of generating the automatic rotating bottle videos using the autoRotate function in glTF Viewer.
Figure 13. Frame images derived from the automatic rotating bottle video.
Figure 14. Process of generating specific datasets (one resource image): 1. Selecting one specific frame image; 2. Transparency of the black background; 3. Selecting one drone image derived using a 75° camera angle and 2 cm GSD (ground sample distance) resource images; 4. Adjusting the bottle size from Step 2 to match the bottle size of the drone image in Step 3; 5. Generating the new image with the bottle shown in Step 4 and the drone image in Step 3; 6. Preparing data augmentation for the specific dataset, mainly including image direction change and blur.
Figure 15. Process of generating S2PS AIGC Dataset (multiple resource images): 1. Pre-Processing—selecting the background and object images; 2. Setting the parameters—mainly adjusting the image size between the background and the object; 3. Generating multiple resource images.
Figure 16. Results of training the model, which include train/valid-based box_loss/cls_loss/dfl_loss, precision/recall, and mAP50/50–95 (one source image, epoch 1000, batch-size 16, patience 50).
Figure 17. Samples of the batch images used for training: train_batch 0, train_batch 1, and train_batch 2 (one source image, epoch 1000, batch-size 16, patience 50).
Figure 18. Process of selecting the test image: Left panel, using 50 drone images to reconstruct the 3D model; Right panel, zooming in on the screen closer to the object and outputting the image. The described process was performed with the free version of the photogrammetry software 3DF Zephyr, which can create 3D models from photographs. This process selects an object of similar size using the zoom-in function of the 3D model.
Figure 19. Relation between inference image size and corresponding confidence value (test 1).
Figure 20. Results of training the model (one source image, epoch 10,000, batch-size 16, patience 1000).
Figure 21. Validation results (test 2): Left panel, inference with confidence value; Right panel, true label.
Figure 22. F1 score derived from the validation results (one source image).
Figure 23. Object sizes (left panel, generated resource image; middle panel, a 3D-derived test with the same background; right panel, a similar object test with a similar background).
Figure 24. Results of training the model, which include train/valid-based box_loss/cls_loss/dfl_loss, precision/recall and mAP50/50–95 (one source image, epoch 1000, batch-size 256, patience 1000).
Figure 25. Samples of batch images used for training: train_batch 0, train_batch 1, train_batch 2, train_batch 19800, train_batch 19801, and train_batch 19802.
Figure 26. Validation results: Left panel, inference with confidence value; Right panel, true label.
Figure 27. F1 score derived from the validation results (multiple source images).
Figure 28. Samples of the 3D waste group generations.
Figure 29. Samples of failed 3D model generations.
Table 1. Computer parameters.
Parameter (One Source Image/Multiple Source Images)
Memory (GB): 11/21
Time per epoch (s/epoch): 0.474/10.7
Table 2. Parameters for YOLOv8 training (one source image).
Parameter (Train/Valid)
Image Size: 512
Epochs: 1000 (10,000)
Batch Size: 16
Patience: 50 (1000)
Lr0: 0.01
Model Size: n
Table 3. Ratio of the dataset (one source image).
Split: Train / Valid / Test
Image Number: 20 / 3 / 1 (15)
Table 4. Parameters for YOLOv8 training (multiple source images).
Parameter (Train/Valid)
Image Size: 512
Epochs: 1000
Batch Size: 256
Patience: 1000
Lr0: 0.01
Model Size: n
Table 5. Ratio of the dataset (multiple source images).
Split: Train / Valid / Test
Image Number: 5000 / 1000 / 36
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
