Article

Research on Using K-Means Clustering to Explore High-Risk Products with Ethylene Oxide Residues and Their Manufacturers in Taiwan

by Li-Ya Wu, Fang-Ming Liu *, Wen-Chou Lin, Jing-Ting Qiu, Hsu-Yang Lin and King-Fu Lin
Food and Drug Administration, Ministry of Health and Welfare, Taipei 115209, Taiwan
* Author to whom correspondence should be addressed.
Foods 2024, 13(16), 2510; https://doi.org/10.3390/foods13162510
Submission received: 28 June 2024 / Revised: 7 August 2024 / Accepted: 10 August 2024 / Published: 11 August 2024
(This article belongs to the Section Food Quality and Safety)

Abstract

Considering the frequency of ethylene oxide (EtO) residues found in food, the health effects of EtO have become a concern. Between 2022 and 2023, 489 products were inspected in Taiwan using the purposive sampling method, and nine unqualified products, all of them imported, were found; subsequently, border control measures were enhanced. To ensure the safety of all imported foods, the current study used the K-means clustering method to identify foods at high risk of EtO residues and their manufacturers. Data on finished products and raw materials with EtO residues from international public opinion bulletins were collected for analysis. After matching them with the Taiwan Food Cloud, 90 high-risk food items with EtO residues and 1388 manufacturers were screened. The Taiwan Food and Drug Administration set up border controls and grouped the manufacturers using K-means clustering, an unsupervised learning algorithm. For this study, 37 manufacturers were selected for priority inspection, and 52 high-risk finished products and raw materials that might contain residual EtO were sampled. While EtO was not detected, the study concluded the following: 1. Using international food safety alerts to strengthen border control can effectively ensure domestic food safety; 2. K-means clustering can validate the risk-based purposive sampling results to ensure food safety and reduce costs.

1. Introduction

In August 2022, the European Commission revised the law to regulate the residual amount of ethylene oxide (EtO) in food to within 0.1 mg/kg, thereby circumventing the previous law enforcement difficulties caused by the unclear source of EtO. Since then, the EU Rapid Alert System for Food and Feed (RASFF) has successively reported many foods containing EtO residues, such as sesame, ice cream, flour products, bread, biscuits, spice plants, curry, empty capsules for food use, and other products [1]. In other countries, including Japan and South Korea [2,3], the maximum residue level must be below 0.01 ppm in food products, while the corresponding values in Canada are 7 ppm and 0.1 ppm for dried vegetables and other foods, respectively [4]. To maintain food safety and prevent products that may contain EtO residues from entering the country, the Taiwan Food and Drug Administration (TFDA) announced on 14 March 2022 an inspection of products that could contain this pesticide residue; the TFDA has not approved the use of EtO as a pesticide. In addition, the Taiwan Use Scope, Limits, and Specifications of Food Additives policy does not allow the use of EtO gas as a food additive, so EtO must not be detected in food [5]. EtO is a colorless flammable gas that is soluble in water and organic solvents and reacts readily with chlorine to produce 2-chloroethanol (2-CE). It is mainly used in the production of ethylene glycol (as an antifreeze) and is also used for the sterilization of medical equipment, pharmaceuticals, and cosmetics. It blocks cell metabolism and replication by alkylating proteins and nucleic acids, which in turn leads to the death of microorganisms, hence achieving bactericidal effects [6].
The International Agency for Research on Cancer (IARC) has classified EtO as a Group 1 carcinogen, which can enter the human body through inhalation, ingestion, and skin contact. The US Department of Labor's Occupational Safety and Health Administration (OSHA) has stated that exposure to EtO can cause symptoms such as dizziness, headache, nausea, vomiting, coughing, convulsions, blistering, and even difficulty breathing and blurred vision, while occupational exposure to EtO is associated with lymphoma, leukemia, and breast cancer [7,8,9]. Although the United States and Canada currently allow EtO to be used as a fumigant for crops such as sesame, spices, mint, and dried vegetables, Taiwan, the European Union, Japan, and Australia have not approved its use as a pesticide.
Currently, the risk of residual EtO in food is an international concern. The food cloud database was established by the TFDA in 2015. The main purpose of its establishment was to assist health authorities in quickly obtaining product flow information from the database when a problematic product occurs to prevent it from entering the market. The food cloud includes the registration data of suppliers, upstream and downstream flow data of products in the market, inspection data (or import data) of border products, product inspection data, raw product materials, and ingredient data.
To determine whether the high-risk raw materials or food products notified by the European Union and the United States are also present in the domestic market, the current study investigated whether Taiwan faces similar or identical risks of EtO residues in food and sought to enhance active inspection of risky products and manufacturers through big data risk analysis. This evidence may help prevent problematic foods from entering the market and affecting consumer health.
According to the Taiwan Import Food Information System (IFIS) data, from 1 March 2022 to 28 February 2023, a total of 1623 tests for EtO residues were conducted on foods at Taiwan’s borders imported from 46 countries. Of these, 44 batches were found to have EtO residues exceeding the permissible limits, resulting in a failure rate of 2.71%. Food imported from eight countries contained EtO residues. The failure rates in these countries were 1.45% in Japan, 2.03% in South Korea, 5.03% in Vietnam, 12.16% in Indonesia, 1.57% in France, 3.03% in Switzerland, 11.76% in the Philippines, and 33.33% in Brazil. The largest number of unqualified batches was 18 in Indonesia, and the highest unqualified rate was 33.33% in Brazil [10].
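The overall failure rate quoted above follows directly from the two counts given in the text. The short Python sketch below only reproduces that arithmetic; the function name `failure_rate` is introduced here for illustration and is not part of the IFIS.

```python
# Counts reported in the text (IFIS, 1 March 2022 to 28 February 2023).
TESTS_TOTAL = 1623     # EtO residue tests conducted at the border
BATCHES_FAILED = 44    # batches exceeding the permissible limit


def failure_rate(failed: int, tested: int) -> float:
    """Return the failure rate as a percentage of tested batches."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * failed / tested


overall = failure_rate(BATCHES_FAILED, TESTS_TOTAL)
print(f"{overall:.2f}%")  # 2.71%
```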
Before the implementation of this study, the TFDA had successfully intercepted imported batches of sesame, instant noodles, cheese, ice cream, and spicy seasoning with EtO residues at the border through the artificial intelligence assistance of the EL V2 risk-prediction model and requested them to be returned or destroyed [11]. For the same problematic products (including different batch numbers) that have flowed into the country and are still within the validity period, the flow was immediately traced, and random inspections were conducted.
The most common sampling methods used for food safety purposes include simple random sampling, risk-based purposive sampling, and model-assisted sampling [12,13]. The safety control measures taken by the TFDA for post-market foods include on-site spot inspections of products and raw materials. The inspection method is “risk-based purposive sampling”, which draws on the experience of senior inspectors. Statistical analysis of spot inspection data of Taiwan’s food from 2022 to 2023 showed that 9 of 489 products contained EtO residues in the post-market stage (Table A1). The nine unqualified cases included one sesame case, four spice cases (including spice seasonings and curry powder), two cheese cases, and two instant noodle cases (Table 1). These problematic cases involved imported products that had to be recalled, destroyed, or withdrawn from the market to ensure the safety of public consumption.

2. Materials and Methods

2.1. Choice of Analytical Method for Sampling

If we can identify the risky food items with EtO residues that have been detected internationally, we can use food cloud big data to obtain information on the manufacturers and sellers who produce or sell the same raw food materials and products in Taiwan. Another consideration would be the type of risk analysis method that is used to plan the “Priority Inspection List”.
K-means clustering is an unsupervised learning algorithm developed by MacQueen in 1967 and is one of the most widely used clustering analysis methods [14]. It is a major tool in the field of data mining. Unsupervised learning is commonly used when no labeled examples are available to learn from. Since the training data contain no standard answer, the algorithm must find a set of grouping rules from the data by itself. The advantage of unsupervised learning is that data need not be labeled in advance; only the data features are required as input when training the model. The algorithm then automatically determines the grouping rules and divides the data into K groups.
The K-means clustering algorithm is widely used in many fields, such as historical retail transaction data clustering [15], abnormal cell image segmentation detection [16], and commercial building temperature clustering [17]. Mohsen et al. used 143 Sentinel-2 images to analyze the dynamics of surface-suspended sediments using a K-means unsupervised classification algorithm [18]. Miolla et al. investigated the consumers’ acceptance of bread enriched with oenological by-products by applying K-means clustering, thereby providing insights into the characterization of consumer groups based on their specific features and declared attitudes [19]. To improve the quality assessment of wheat during storage, Liu collected and analyzed the data from more than 20 regions, including information on storage environmental parameters and changes in wheat pesticide residue concentrations. Based on these factors, a model was developed to predict the changes in wheat pesticide residue concentrations during storage. A comprehensive wheat quality assessment index was set for the predicted and true values of pesticide residue concentrations, and it was then combined with the K-means algorithm to assess the quality of wheat during storage. The results of this study demonstrated that the model achieved optimal prediction results and the smallest error values [20]. Ferreira-Paiva et al. proposed an urban clustering approach—namely, K-means clustering—to identify the most critical sectors related to greenhouse gas emissions, based on the percentage contribution of agriculture, land-use change, energy, and waste sectors to total emissions [21]. These studies were based on the selected characteristic factors, and a cluster analysis was performed on the target groups to classify and group them. This approach is commonly known as “birds of a feather flock together”. It can help judge and analyze the characteristics and properties of each group according to the characteristic factors. 
Therefore, based on the previous studies, the current study used the K-means cluster analysis method to identify risk vendors. First, we must determine the characteristic factors of risk manufacturers (the premise of this research is that data on risk characteristic factors must exist). We believe that the K-means cluster analysis method would help obtain a list of prioritized manufacturers for inspection.

2.2. The Principle of K-Means Clustering

Cluster analysis is used in data mining and machine learning to classify similar objects into clusters. K-means clustering is a widely used cluster analysis method that divides a set of objects into K-clusters. The sum of squared distances between the objects and their assigned clusters is then minimized. K-means clustering is a method of grouping data by calculating their similarity. This similarity is based on the Euclidean distance formula in n-dimensional space to calculate the distance between each data point and the nearest cluster center [22]. A K-means clustering algorithm was selected for this study since its theory is comprehensible. The steps of operation of this algorithm are as follows: First, a dataset is divided into K-clusters, the cluster center (CC) is randomly selected, and the Euclidean distance (ED) to the cluster center is calculated for each data point. Next, each data point is assigned to the nearest cluster center to form a cluster. Subsequently, the new cluster centers are recalculated and updated [23]. The above steps are repeated until none of the cluster centers change. From the above steps, the entire process involves the classification of similar data into the same cluster and dissimilar data into different clusters.
First, a set of n d-dimensional data points, $x_i \in \mathbb{R}^d$, $i = 1, 2, 3, \ldots, n$, and an artificially set number K (must be ≤ n) of clusters $\{S_1, S_2, \ldots, S_K\}$ are assumed; then, K-means clustering is expected to minimize the distance within each cluster and make the sum of squared errors around the cluster centers as small as possible. Here, $\mu_c$ is the center of cluster c, and $\lVert x - y \rVert$ denotes the ED. The K-means clustering algorithm can be defined by the following formulas (Formulas (1)–(4)):
  • Initially, K cluster centers are set randomly
    $$\mu_c^{(0)} \in \mathbb{R}^d, \quad c = 1, 2, \ldots, K \tag{1}$$
  • The distance between each sample and each cluster center is calculated, with (t) being the t-th operation. Then, each sample is assigned to the cluster with the shortest distance.
    $$S_c^{(t)} = \left\{ x_i : \lVert x_i - \mu_c^{(t)} \rVert \le \lVert x_i - \mu_{c^*}^{(t)} \rVert \ \ \forall c^* = 1, \ldots, K;\ i = 1, \ldots, n \right\} \tag{2}$$
  • CC is updated ($n_c$ data points are in cluster c)
    $$\mu_c^{(t+1)} = \frac{\mathrm{sum}\left(S_c^{(t)}\right)}{n_c} = \frac{1}{n_c} \sum_{i=1}^{n_c} x_i, \quad x_i \in S_c^{(t)} \tag{3}$$
  • Steps 2–3 are repeated until the CC does not change, that is,
    $$S_c^{(t+1)} = S_c^{(t)}, \quad c = 1, \ldots, K \tag{4}$$
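The iteration in Formulas (1)–(4) can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation used in this study (the analysis tools are listed in Section 2.3). The function name `kmeans`, the `seed` and `max_iter` parameters, the choice of K data points as initial centers, and the toy data are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd-style K-means following Formulas (1)-(4).

    points: list of equal-length tuples of floats; k: number of clusters.
    Returns (centers, labels), where labels[i] is the cluster index of points[i].
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must satisfy 1 <= k <= n")
    rng = random.Random(seed)
    # Formula (1): choose K initial centers at random (assumption: K data points).
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Formula (2): assign each point to its nearest center (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Formula (4): stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Formula (3): move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Toy data: two well-separated groups of three points each.
toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
toy_centers, toy_labels = kmeans(toy, k=2)
print(toy_labels)
```

On this toy data the first three points end up in one cluster and the last three in the other; which cluster receives index 0 depends on the random initialization.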
The greatest challenge in unsupervised learning is to determine the K-value (number of clusters) [24]. This study used the elbow method to determine the K values. The elbow method must be executed before using the K-means clustering algorithm [25]. In the same group of data, continuous increments by 1 are made from K = 2. Initially, the curve decreases rapidly. When K reaches a certain value, the curve begins to decline slowly. The first K value after slowdown is the optimal value [26] (Figure 1). This concept involves calculating the sum of squared errors (SSE) for each value of K to estimate the best elbow point [27]. The formula is as follows (Formula (5)):
$$\mathrm{SSE} = \sum_{i=1}^{K} \sum_{p \in C_i} \lVert p - m_i \rVert^2 \tag{5}$$
K: total number of clusters;
Ci: represents one of the clusters;
mi: central point of the cluster.
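To make Formula (5) and the elbow idea concrete, the sketch below computes the SSE of a given partition and then picks an elbow from a series of SSE values. The second-difference rule in `elbow` is an assumption chosen for illustration; in this study the elbow was read from a plot. The SSE values in the example are hypothetical and are not taken from Table 5.

```python
import math


def sse(clusters):
    """Formula (5): sum of squared distances from each point to its cluster mean.

    clusters: list of clusters, each a list of equal-length tuples of floats.
    """
    total = 0.0
    for members in clusters:
        if not members:
            continue
        center = tuple(sum(col) / len(members) for col in zip(*members))
        total += sum(math.dist(p, center) ** 2 for p in members)
    return total


def elbow(sse_by_k):
    """Return the K where the SSE curve bends most (largest second difference).

    sse_by_k: dict mapping consecutive K values to their SSE.
    Heuristic assumed for illustration only.
    """
    ks = sorted(sse_by_k)
    if len(ks) < 3:
        raise ValueError("need at least three K values")
    best_k, best_bend = None, -math.inf
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        bend = (sse_by_k[prev] - sse_by_k[k]) - (sse_by_k[k] - sse_by_k[nxt])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Splitting one group into two lowers the SSE.
print(sse([[(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]]))    # 56.0
print(sse([[(0.0, 0.0), (2.0, 0.0)], [(10.0, 0.0)]]))  # 2.0

# Hypothetical SSE curve: fast decline up to K = 5, almost flat afterwards.
hypothetical = {2: 1000.0, 3: 600.0, 4: 350.0, 5: 150.0, 6: 140.0, 7: 135.0}
print(elbow(hypothetical))  # 5
```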

2.3. Data Sources and Analytical Tools

In order to control the risk of EtO residues in food, the current study used the collected international food safety alerts and changed the previous purposive sampling to an unsupervised learning method to conduct “food safety risk prediction”. Improvements in post-market sampling methods further optimized the selection of riskier samples. The research methodology is described in the following sections in detail.
The data used in this study were mainly obtained from real-time open information in the food cloud. The collected content included the names of raw materials or finished products, date of notification, country, and manufacturers. These data were collected from the Consumer Agency of Japan, Rappel Conso of France, Food Safety Authority of Ireland (FSAI), EU RASFF, US Food and Drug Administration (FDA), USDA’s Food Safety and Inspection Service (FSIS), Food Standards Australia New Zealand (FSANZ), British Food Standards Agency (FSA), Canadian Food Inspection Agency (CFIA), Hong Kong Food and Environmental Hygiene Department Food Safety Center, Singapore Food Agency (SFA), and other international sources. Further, the data sources included supplier registration, inspection, testing, and product flow information in the food cloud established by the Food and Drug Administration of the Taiwan Ministry of Health and Welfare. The food cloud used the five major systems of TFDA as its core, namely, the Registration Platform of Food Businesses System (RPFBS), Food Traceability Management System (FTMS), Inspection Management System (IMS), Product Management Decision System (PMDS), and IFIS (Table 2). In addition, it comprised cross-agency data communication, including financial and tax electronic invoices, customs electronic gate verification data, national business tax registration data, industrial and commercial registration data, indicated chemical substance flow data, domestic industrial oil flow data, imported industrial flow data, waste oil flow data, toxic chemical substance flow data, feed oil flow data, campus food ingredient logins, and inspection data.
Imported foods and food-related products have to apply for import inspection through the IFIS from the TFDA, and only compliant products are permitted to enter the domestic market. The relevant business data must be registered with the RPFBS, national business tax registration data, and business registration data. The flow of information generated by domestic and imported products entering the market from the border should be recorded in the IFIS and FTMS, as well as in electronic invoices and electronic gate goods import and export verification records. All products sampled and inspected by the government were recorded in the PMDS, IFIS, and IMS. A company’s product information can also be obtained through the RPFBS and FTMS [28] (The full name of the abbreviation can be found in Table A2).
The analytical tools used in this study were R 4.0.0, SPSS 25.0, and Microsoft Excel 2010.

2.4. Selection of Key Risk Factors

To date, all problematic food products or raw materials with EtO residues found in Taiwan have been imported. In the absence of unqualified cases, we selected products or raw materials that had been reported to be contaminated with EtO abroad as a reference for our sampling project in Taiwan. The collection period for the above information was from 1 January 2021 to 31 March 2023. After integrating the problematic product information reported internationally in this study, 913 problematic products and raw material notifications with EtO residues were identified. Among them, we analyzed 90 clearly identifiable finished products (or categories) and raw materials that were screened out and were within the scope of this study.
The names of the 90 products and raw materials with EtO residues in Table 3 were matched with those of the products and raw materials in the FTMS. A list of risky product manufacturers (or suppliers) was obtained using the aforementioned steps. To plan the “Priority inspection list” for risky product manufacturers, variables associated with the manufacturers were selected to design the key risk factors. The design concept of these factors was that there must be “existing data”, so that the factors can be selected from the food cloud database. Five key risk factors determined by consensus among the experts in this study included “Capital amount (C)”, “Number of occupations (N)”, “Manufacturers that have detected unqualified finished products or raw materials with EtO residues during border inspections (Bb)”, “Quantity of products or raw materials that may be involved in EtO risk (Q)”, and “Manufacturers whose finished products or raw materials have been detected to contain ethylene oxide residues during the post-marketing period (Bp)”.
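The name-matching step described above can be pictured with a small Python sketch. Everything concrete in it is hypothetical: the record layout `(manufacturer_id, product_name)`, the identifiers, the sample names, and the exact-match rule after simple normalization are assumptions for illustration and do not describe the actual FTMS schema or the matching rules used in this study.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Lower-case a name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def match_manufacturers(risk_items, records):
    """Return {manufacturer_id: set of matched risk item names}.

    risk_items: names of products or raw materials notified for EtO residues.
    records: iterable of (manufacturer_id, product_name) pairs (hypothetical layout).
    """
    risk = {normalize(n) for n in risk_items}
    matched = defaultdict(set)
    for manufacturer, product in records:
        key = normalize(product)
        if key in risk:
            matched[manufacturer].add(key)
    return dict(matched)


# Hypothetical example data.
risk_items = ["Sesame", "Xanthan gum", "Curry powder"]
records = [
    ("M001", "sesame"),
    ("M001", "xanthan  gum"),
    ("M002", "Rice"),
    ("M003", "Curry Powder"),
]
matched = match_manufacturers(risk_items, records)
# One possible reading of factor Q: the number of matched risky items per manufacturer.
q_factor = {m: len(items) for m, items in matched.items()}
print(q_factor)  # {'M001': 2, 'M003': 1}
```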

2.5. Research Methodology

To understand whether there are risky finished products or raw materials with EtO residues in Taiwan’s consumer market, the current study used preliminary research references to identify high-risk products and related food manufacturers to protect people from the risk of EtO. The steps of this study are detailed below and illustrated in Figure 2.
Step 1. The research topic was determined: to conduct a preliminary study on whether there are potentially risky finished products or raw materials containing EtO residues in Taiwan.
Step 2. The names of risky finished products or raw materials (RFPRMs) with EtO residues were collected and sorted.
Step 3. RFPRM data were matched with the FTMS data in the food cloud database to obtain a list of domestic manufacturers (or suppliers; MS) who produce or sell the same finished products or raw materials.
Step 4. The key risk factors required for the research were analyzed, matched, and organized. This step included the following sub-steps:
  • MS was used to match the RPFBS data in the food cloud database to obtain key risk factor 1—the Capital Amount (C) of MS. The reason this study used the C-factor is related to the manufacturer’s production volume, product circulation scope, and market share. These products are easily accessible to consumers and, therefore, closely related to food safety.
  • MS was used to match the RPFBS data in the food cloud database to obtain the key risk factor 2—the number of occupations (N) of MS. There are five categories of N factor: import industry, manufacturing industry, logistics industry, catering industry, and sales industry. More occupations mean that products can reach consumers through more extensive circulation channels or scopes. It is closely related to food safety.
  • MS was used to match the IFIS data in the food cloud database to obtain the key risk factor 3—“list of suppliers whose unqualified finished products or raw materials with EtO residues were detected during border inspection (referred to as Bb)”.
  • The number of finished products and raw materials that may have EtO residues was calculated to determine key risk factor 4, “quantity of finished products or raw materials that may involve EtO risk (referred to as Q)”.
  • MS was used to match the PMDS data in the food cloud database to obtain key risk factor 5—“Manufacturers whose finished products or raw materials with residues have been detected in Taiwan’s market (the factor is referred to as Bp)”.
The quantitative interpretation table of the aforementioned five factors is provided in Table 4.
Step 5. K-means clustering was used to cluster MS. Before clustering, the number of clusters (the K-value) had to be determined. This study used the simple and easy-to-understand elbow method for this purpose. Starting from K = 2, the K-value was increased in steps of 1, the SSE was calculated for each value (Table 5), and a graph of the SSE against the K-value was created (Figure 3). After K = 4, the curve began to decline slowly, and after K = 5, there was only an extremely small change in the curve. Based on these results, we chose K = 5 for the K-means clustering.
Step 6. The risk score of each cluster was calculated, and the cluster with the highest cluster-center risk score (HCG) was selected as the initial target range for this study.
Step 7. The cluster with the highest risk score for the cluster center was selected, and the distance of each point in the cluster (each point represents a manufacturer or supplier) from the cluster center was calculated. When the distance between a point and the cluster center was small, the similarity between the point and the cluster center was high. In this study, points with shorter distances were prioritized for inspection.
Step 8. The inspection list was organized. After sorting according to Step 7, manufacturers whose target products or raw materials had sales records for the past 3 months in the FTMS were selected.
Step 9. A priority inspection list was obtained, and inspections were initiated. A list of priority inspection manufacturers was used to match the RFPRM produced or sold to facilitate the selection of samples during the inspection.
Step 10. Sampling was performed from February to June 2024, and the inspection results were evaluated.
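Steps 6 and 7 can be summarized in a short Python sketch: score each cluster center by the unweighted sum of its factor values, pick the highest-scoring cluster, and rank its members by Euclidean distance to that center. The function names, the four-value feature tuples, and all numbers below are hypothetical; the actual quantification of the factors follows Table 4 and is not reproduced here.

```python
import math


def highest_risk_cluster(centers):
    """Step 6: index of the center whose unweighted factor scores sum highest."""
    return max(range(len(centers)), key=lambda c: sum(centers[c]))


def rank_by_distance(members, center):
    """Step 7: sort (manufacturer_id, features) pairs by distance to the center.

    Returns a list of (manufacturer_id, distance), nearest first.
    """
    return sorted(
        ((mid, math.dist(features, center)) for mid, features in members),
        key=lambda pair: pair[1],
    )


# Hypothetical cluster centers over four factors (C, N, Bb, Q).
centers = [(1.0, 1.0, 0.0, 2.0), (3.0, 4.0, 1.0, 5.0), (2.0, 2.0, 0.0, 1.0)]
target = highest_risk_cluster(centers)
print(target)  # 1

# Hypothetical members of the selected cluster.
members = [
    ("M-C", (5.0, 4.0, 1.0, 7.0)),
    ("M-A", (3.0, 4.0, 1.0, 5.0)),
    ("M-B", (3.0, 5.0, 1.0, 5.0)),
]
for manufacturer, distance in rank_by_distance(members, centers[target]):
    print(manufacturer, round(distance, 3))
```

Manufacturers at the top of this ranking are the ones most similar to the highest-risk center, which is the ordering used to prioritize inspections in Step 7.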

3. Results

3.1. High-Risk Manufacturers Obtained after Clustering

To find the target of priority inspection under a limited inspection workforce and resources, this study used the theoretical method of “birds of a feather flock together” in K-means clustering to discover the potential risky product manufacturers, thereby enabling priority inspection. After sorting the international alert notification information from 2020 in this study, there were a total of 913 notifications on problematic finished products and raw materials with EtO residues. After further data screening and cleaning, 90 finished products, categories, and raw material names were identified. These were then matched with the names in the food cloud database to obtain detailed information on 1388 domestic risky product manufacturers (MS).
In this study, the value of K was set to 5 to perform the calculation of K-means clustering for the aforementioned 1388 manufacturers (MS). In the clustering process, waiting until the cluster center is no longer moving is necessary before it can be stopped. This process went through four calculations iteratively (Table 6). The minimum distance between the final cluster centers was 2.731 (clusters 4 and 5), and the longest distance was 8.162 (clusters 4 and 1), as shown in Table 7. This result indicated that the risk profiles of clusters 4 and 5 were more similar than that of cluster 1.
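The comparison of distances between final cluster centers can be sketched as follows. The center coordinates are hypothetical and do not reproduce Table 7; the sketch only shows how the closest and farthest pairs of clusters are identified from pairwise Euclidean distances.

```python
import math
from itertools import combinations


def center_distances(centers):
    """Return {(i, j): distance} for every pair of centers, numbered from 1."""
    return {
        (i + 1, j + 1): math.dist(centers[i], centers[j])
        for i, j in combinations(range(len(centers)), 2)
    }


# Hypothetical final cluster centers.
example_centers = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
distances = center_distances(example_centers)
closest = min(distances, key=distances.get)
farthest = max(distances, key=distances.get)
print(closest, distances[closest])    # (1, 3) 1.0
print(farthest, distances[farthest])  # (1, 2) 5.0
```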
Further, analysis of variance was performed on the five key risk factors, and the results showed that the C, N, Bb, and Q factors were suitable for inclusion in this study owing to the significant differences (*** p < 0.001) and independence among them. However, Bp was not suitable for inclusion in this study as a key risk factor at this stage, probably because no domestic manufacturer was found to involve EtO residue risk in their finished products or raw materials (Table 8). Based on the above, the four risk factors C, N, Bb, and Q were mainly used in this study for clustering.
The results showed that cluster 4 was the riskiest group of manufacturers. Among the five clusters established, the cluster center (CC) risk score of cluster 4 was 26, which was the highest among all (Table 9), and 212 manufacturers in total were included in the cluster. In this study, the four factors of C, N, Bb, and Q were first quantified, standardized, and unweighted; then, the scores of these four factors were summed up to obtain the risk score. Since K-means clustering did not use Bp in the cluster calculation process, it had a risk score of zero.

3.2. The Risk-Ranked List of Manufacturers Obtained by Euclidean Distance

In cluster analysis, each point represents a vendor, and the cluster center (CC) is the most representative virtual core of the cluster. In this study, the ED was calculated between each point in cluster 4 and the cluster center (CC). The smaller the ED, the higher the similarity between each vendor and the CC (the core with the highest risk). In this study, the conditions for selecting the preferred manufacturers for inspection were as follows: 1. Manufacturers with a relatively short ED; 2. More than two items manufactured or sold by the manufacturer must be finished products or raw materials that belong to international risk notification (Table 2); 3. The items manufactured or sold by the manufacturer were evaluated by experts as items frequently consumed by the Taiwanese and were also notified internationally as RFPRMs. Finally, 52 samples were carefully selected for priority sampling in this study, including sesame and its products, oats, vegetable sauce, ketchup, pepper, berry products, baked goods or ingredients, xanthan gum, sodium carboxymethyl cellulose, buckwheat, stevia leaves, coffee beans, chili (dry powder products), chocolate, carotene, cream, and rosemary (dried powder; Table 10). We believe that inspection programs for foods or additives, such as ice cream, instant noodles, guar gum, and locust bean gum, have been implemented in the past; therefore, they were not included in this study.
All 52 finished products and raw materials tested were found to be EtO-free. This result suggests that the risk of finished food products and raw materials containing EtO residues in Taiwan is low.

3.3. Verification of the Results

To verify the effectiveness and representativeness of K-means clustering sampling, during the same research period, eight food science experts were invited to provide a suggested sampling list, and a total of 45 samples were collected, including cottage cheese, jam, sauces, and cheese. No EtO residue was detected, which suggested that K-means clustering has the same sampling effect as purposive sampling. Using K-means clustering could avoid the expenses of large-scale sampling that arise because “random sampling requires a sufficient number of samples” or because “in purposive sampling, it is difficult to determine the number of samples”. Moreover, it can focus on high risks more scientifically and efficiently.

4. Discussion

4.1. Retaining the Border Anti-Blocking Policy

Since none of the products and raw materials evaluated in this study contained EtO residues, the risk of food with EtO residues in the Taiwanese market appears to be relatively small. However, since the international risk of EtO in food circulation remains high, an EtO prevention policy focused on the border should remain key to maintaining food safety at this stage.
Based on the international risk notification of EtO in food, the border of Taiwan started the “inspection of EtO in food” in August 2021 and gradually increased the sampling inspection rate for specific categories, namely, finished products or raw materials with EtO risk, as shown in Table 11. From 31 August 2021 to 30 November 2023, 4146 batches were inspected and 80 batches with EtO residues were found to be unqualified, with an unqualified rate of 1.9%. Unqualified food items included locust bean gum, instant noodles, ice cream, spices, curry products, dairy products, and other pastes and flavors. At present, the Taiwan border has strictly inspected and controlled the risky food items that have been notified internationally to achieve the purpose of comprehensively preventing the entry of problematic food into the country, thereby safeguarding the health of the people. These policies should be continued.

4.2. Use of K-Means Clustering to Select High-Risk Samples Can Reduce the Cost of Traditional Sampling

In the early stages of this study, to ensure domestic food safety, we reviewed food information from other countries that had reported EtO residues in the past, analyzed this information using K-means clustering, and selected 52 food samples with higher risks in sampling inspection (37 manufacturers). The method reduced the labor costs required for traditional random sampling or purposive sampling of at least 100 samples. When international warning cases of food with EtO residues appear, TFDA border control measures are immediately implemented to set up fences to block the food products or increase the sampling rate so that problematic products are immediately blocked from overseas and are not allowed to be imported. This study used K-means clustering to conduct post-market food sampling to detect EtO residues in food. The results showed that all samples qualified, hence indicating that food in Taiwan is safe.

4.3. Inspection May Be Planned for the Uninspected Manufacturers of Cluster 4

In this study, 52 finished products and raw materials from 37 manufacturers were sampled to confirm the presence of any EtO residue. In the previous section, we concluded that cluster 4 has the highest risk; further, the smaller the distance between a point and the cluster center, the greater the risk similarity. In this study, 212 manufacturers were in cluster 4 (Figure 4). The ED of the manufacturers in cluster 4 ranged approximately between 0.810 and 3.983. According to the screening principle described in Section 4.2, 52 manufacturers with the shortest ED of 0.810 were selected as priority sampling inspection objects. Sampling inspections could not be conducted for some of these manufacturers because their products or raw materials were sold out or had insufficient inventory. In the future, with increased financial resources allocated toward conducting larger-scale sampling, it will be possible to sample and inspect manufacturers and products in cluster 4 that have not yet been assessed.

4.4. Other Algorithms May Be Used in the Future to Evaluate and Compare the Different Methods

This study used partitional clustering, a grouping algorithm in unsupervised learning, because the lack of unqualified food (including food with EtO residues) in Taiwan made it impossible to label data in advance. K-means clustering is a partitional clustering method. In addition to K-means clustering, agglomerative hierarchical clustering could also be used. It initially treats each sample as its own cluster and repeatedly merges the most similar clusters from the bottom of the tree structure upward. If the number of groups generated is greater than the expected number of groups, the two closest groups are merged until the number of groups falls within the conditional range. The completed clusters are presented in a tree structure called a “dendrogram” [29,30,31]. Clustering algorithms based on probability distributions, such as Gaussian mixture models, could also be used. A Gaussian mixture model is a statistical model that decomposes the distribution of features into multiple Gaussian probability density functions, which allows it to quantify that distribution more accurately [32,33].
All of the above methods can be used to plan future inspections. The manufacturer lists produced by the different methods can then be compared, and the manufacturers common to these lists (the intersection) can be designated as priority inspection objects. Artificial intelligence and machine learning can support the implementation of inspection management, helping ensure adherence to standards.
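The comparison of manufacturer lists across methods can be sketched as a set intersection. The example below is only an illustration under stated assumptions: the method names are labels, and the manufacturer identifiers are hypothetical rather than outputs of any actual model.

```python
from collections import Counter


def common_manufacturers(lists_by_method):
    """Return manufacturers flagged by every method (the intersection)."""
    sets = [set(ids) for ids in lists_by_method.values()]
    return sorted(set.intersection(*sets)) if sets else []


def agreement_counts(lists_by_method):
    """Count how many methods flagged each manufacturer."""
    counts = Counter()
    for ids in lists_by_method.values():
        counts.update(set(ids))
    return counts


# Hypothetical high-risk lists, invented for illustration only.
flagged = {
    "k_means": ["M001", "M003", "M007"],
    "agglomerative": ["M003", "M001", "M010"],
    "gaussian_mixture": ["M001", "M003", "M007"],
}

common = common_manufacturers(flagged)
counts = agreement_counts(flagged)
print(common)
```

Manufacturers flagged by all methods form the strictest priority list; the agreement counts offer a softer ranking when the full intersection is too small to fill an inspection plan.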

4.5. Research Limitations

Owing to the legal requirements related to government data, the names of manufacturers and products in this study cannot be presented in this paper to ensure information security and protection of sensitive data.
Empty capsules for food use are produced by four pharmaceutical factories in Taiwan. Previously, these factories primarily adhered to international drug regulations, which set the upper limit of EtO residue at “not to exceed 1 mg/kg”. After receiving counseling and making necessary corrections, they are now in compliance with the regulations. In addition, prior to conducting this study, inspections were conducted for EtO residue levels in domestically produced ice cream, instant noodles, guar gum, locust bean gum, and xanthan gum. The results indicated that the risk of EtO residue in these domestically manufactured products was low (EtO-free). Therefore, these products and raw materials are not included in the scope of this study. However, since the international residual risk of EtO remains high, stringent border control measures will continue to be implemented in order to ensure food safety.

5. Conclusions

The following conclusions were made from the study:
  • The use of international food safety alerts to strengthen border control can effectively ensure domestic food safety.
  • K-means clustering can be used to construct a risk analysis model that verifies the results of purposive sampling and helps ensure post-market product safety; this approach could serve as a reference for food control agencies when planning similar projects.
Based on the results of the 52 samples tested in this study, EtO-contaminated products from the international alert appear to be relatively uncommon in Taiwan. Further analysis of the sources of the samples revealed that some were domestically produced in Taiwan while some were from abroad. This indicated that the use of EtO as a pesticide or fungicide to treat raw materials or finished products is relatively rare in Taiwan, or that the food manufacturers selected in this study used compliant raw materials and implemented self-management as per law. The forecasting model of this study may be further adjusted by adding risk parameters or factors and adjusting algorithms to identify the target manufacturer or product item more accurately. Future studies should aim to develop methods capable of accurately detecting problematic manufacturers and products to help ensure food safety.
Industrial food production has created global fears over food safety. Although the methods for collecting risk data may vary for different hazardous substances, the principles of using big data for risk analysis have not changed. By first identifying products and raw materials reported as contaminated internationally and matching them with food cloud big data and cross-ministerial information, we could identify domestic manufacturers and suppliers who sell the same products and raw materials. Moreover, through the risk analysis model established in this research, high-risk manufacturers and products could be identified and included in the inspection projects of public health departments. In addition, based on the results of border and domestic inspections, the risk analysis model could be further used to screen high-risk manufacturers and products. This “two-way feedback” model could strengthen supervision at the border and in the domestic market, effectively intercept unqualified products, check domestic food, and ensure food safety.

Author Contributions

Conceptualization, L.-Y.W. and F.-M.L.; methodology, L.-Y.W.; software, L.-Y.W., F.-M.L. and W.-C.L.; validation, L.-Y.W., F.-M.L., H.-Y.L. and K.-F.L.; formal analysis, L.-Y.W. and J.-T.Q.; investigation, L.-Y.W. and F.-M.L.; resources, F.-M.L., W.-C.L., H.-Y.L. and K.-F.L.; data curation, L.-Y.W. and W.-C.L.; writing—original draft preparation, L.-Y.W. and J.-T.Q.; writing—review and editing, L.-Y.W. and F.-M.L.; visualization, F.-M.L. and W.-C.L.; supervision, F.-M.L.; project administration, L.-Y.W.; funding acquisition, no. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

We would like to express our gratitude to the Food and Drug Administration of the Ministry of Health and Welfare of Taiwan for approving the execution of this study in April 2023 (approval document No. 1122001801).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Results of previous purposive sampling inspection from 2022 to 2023.

| Name of Category | Sample Name | Country of Origin | Number of Samples | Is EtO Detected? (Number of Unqualified Samples) |
|---|---|---|---|---|
| Baked goods | Potato chips, square puff pastry | Taiwan | 2 | No (0) |
| Cocoa | Black cocoa mass | Taiwan | 1 | No (0) |
| Coffee beans | Coffee beans | Taiwan | 2 | No (0) |
| Edible oils and fats | Sesame oil, fragrant pig oil, blended oil, white sesame oil, black sesame oil, sesame oil, peanut oil, shortening, camellia oil | Taiwan, Thailand, South Korea | 58 | No (0) |
| Flour and its products | Corn starch, Camel brand purple camel all-purpose flour, Danish pineapple croissant, white bread | China, Taiwan | 4 | No (0) |
| Food additives | Guar gum, locust bean gum, H-Sodium carboxymethyl cellulose, H-Stevia leaf 094, Xanthan gum | Pakistan, Spain, China, Taiwan, France | 9 | No (0) |
| Fruits | Kyoho grapes | Taiwan | 7 | No (0) |
| Ice cream and related products | Ice cream, smoothies, popsicles | Taiwan, Japan, France, Australia, South Korea, Latvia, United States | 81 | No (0) |
| Instant noodles (noodles, oil packets, or seasoning powder packets) | Beef noodle soup, dried noodles, nabe yaki egg noodles, bak kut teh flavor instant noodles, chicken instant vermicelli, spicy stinky tofu noodles, instant chow mein, laksa, rice noodles, mung bean thread, soba noodles | Taiwan, Vietnam, Malaysia, Japan, South Korea, Singapore, Indonesia, Thailand, China | 163 | Yes (2) |
| Milk and its products | Italian provolone cheese slices, pimento cheese, cheese powder, grated mozzarella cheese, mixed cheese, yogurt, diluted active fermented milk, skim milk powder | United States, Malaysia, Taiwan, New Zealand | 14 | Yes (2) |
| Nut | Unsalted cashews | Vietnam | 1 | No (0) |
| Sauces | Tomato sauce and vegetarian tomato sauce | Taiwan | 4 | No (0) |
| Sesame and its products | Black sesame, sesame powder, white sesame, black sesame paste and sauce, red sesame | Sri Lanka, Myanmar, India, Taiwan, Thailand, Paraguay, Togo, Bolivia | 48 | Yes (1) |
| Spices and products | Black peppercorns, turmeric powder, chili powder, white pepper powder, black pepper powder, dried chili peppers, curry powder, green curry sauce, chili sauce, medium-spicy instant curry cubes, star anise, herb roast chicken powder, rosemary powder, steak spices, chicken seasoning, mixed spice seasoning powder, pepper and salt powder, red pepper powder, spicy soup powder, vanilla flavored seasoning powder (collagen), vanilla powder, New Orleans flavored seasoning | South Africa, India, South Korea, Indonesia, China, Vietnam, Taiwan, Thailand, Japan, Malaysia, Morocco, United States, Spain | 61 | Yes (4) |
| Whole grains | Mung beans, red beans, soybeans, quinoa, millet, white rice, barley, oatmeal, black beans, peanuts | Taiwan, China, Argentina, Thailand, Australia, Peru | 34 | No (0) |
| Total | | | 489 | Yes (9) |
Table A2. Table of abbreviations.

| Abbreviated Name | Full Name |
|---|---|
| CC | Cluster Center |
| CFIA | Canadian Food Inspection Agency |
| ED | Euclidean Distance |
| EL V2 | Second-generation ensemble learning prediction model |
| EtO | Ethylene Oxide |
| FDA | Food and Drug Administration |
| FSA | Food Standards Agency |
| FSANZ | Food Standards Australia New Zealand |
| FSIS | Food Safety and Inspection Service |
| FTMS | Food Traceability Management System |
| HCG | Highest Cluster Center |
| IARC | International Agency for Research on Cancer |
| IFIS | Import Food Information System |
| IMS | Inspection Management System |
| OSHA | Occupational Safety and Health Administration |
| PMDS | Product Management Decision System |
| RASFF | Rapid Alert System for Food and Feed |
| RFPRM | Risky Finished Products or Raw Materials with EtO |
| RPFBS | Registration Platform of Food Businesses System |
| SFA | Singapore Food Agency |
| SSE | Sum of squared errors |
| TFC | Taiwan Food Cloud |

References

  1. European Union Directorate-General for Health and Food Safety. Ethylene Oxide Incident/Food Additive. Available online: https://food.ec.europa.eu/safety/acn/acn-incidents/ethylene-oxide-incident-food-additive_en (accessed on 22 January 2024).
  2. Japan Food Chemical Research Foundation. The Japanese Positive List System for Agricultural Chemical Residues in Foods. 2006. Available online: https://www.ffcr.or.jp/en/zanryu/the-japanese-positive/the-japanese-positive-list-system-for-agricultural-chemical-residues-in-foods-enforcement-on-may-29-.html#:~:text=The%20maximum%EE%80%80%20residue (accessed on 10 May 2024).
  3. Ministry of Food and Drug Safety. Pesticides and Veterinary Drugs Information. 2015. Available online: http://www.foodsafetykorea.go.kr/residue/prd/mrls/list.do?currentPageNo=4&searchCode=&searchFoodCode=&menuKey=1&subMenuKey=161&subChildMenuKey=&searchConsonantFlag=en&searchConsonantFlag2=en&searchValue2=E&searchFlag=prd&searchClassLCode=&searchClassMCode=&searchClassScode=&searchValue=E (accessed on 10 May 2024).
  4. Government of Canada. Maximum Residue Limits Search. 2020. Available online: https://pest-control.canada.ca/pesticide-registry/en/mrl-search.html (accessed on 10 May 2024).
  5. Ministry of Health and Welfare. Standards for Specification, Scope, Application and Limitation of Food Additives. 2023. Available online: http://consumer.fda.gov.tw/Law/FoodAdditivesList.aspx?nodeID=521&rand=342050149 (accessed on 10 May 2024).
  6. Healy, K.; Ducheyne, P.; Hutmacher, D.W.; Grainger, D.W.; Kirkpatrick, C.J. Comprehensive Biomaterials II, 2nd ed.; Elsevier: Amsterdam, The Netherlands, 2017.
  7. International Agency for Research on Cancer. Ethylene Oxide. Chem. Agents Relat. Occup. 2012, 100F, 379–400.
  8. NTP (National Toxicology Program), 2021; Report on Carcinogens, Fifteenth Edition. Research Triangle Park, NC: U.S. Department of Health and Human Services, Public Health Service. Available online: https://ntp.niehs.nih.gov/go/roc15 (accessed on 15 June 2024).
  9. U.S. Department of Labor Occupational Safety and Health Administration. 2002; Ethylene Oxide Fact Sheet. Available online: https://www.osha.gov/sites/default/files/publications/ethylene-oxide-factsheet.pdf (accessed on 10 May 2023).
  10. Food and Drug Administration, Ministry of Health and Welfare of Taiwan. Surveillance of Ethylene Oxide Residues in Imported and Post-Market Food Products. Annu. Rep. Food Drug Res. 2023, 14, 90–96. Available online: http://www.fda.gov.tw/TC/PublishResearch.aspx?pn=2 (accessed on 20 May 2023).
  11. Taiwan Food and Drug Administration Annual Report. 2023, 2, 33–35. Available online: http://www.fda.gov.tw/upload/ebook/AnnuaReport/2023/2023_E/index.html (accessed on 10 May 2023).
  12. Powell, M.R. Optimal food safety sampling under a budget constraint. Risk Anal. 2014, 34, 93–100.
  13. Wang, Z.; Van Der Fels-Klerx, H.J.; Lansink, A.G.J.M.O. Optimization of sampling for monitoring chemicals in the food supply chain using a risk-based approach: The case of aflatoxins and dioxins in the Dutch dairy chain. Risk Anal. 2020, 40, 2539–2560.
  14. Zahra, S.; Ghazanfar, M.A.; Khalid, A.; Azam, M.A.; Naeem, U.; Prugel-Bennett, A. Novel centroid selection approaches for KMeans-clustering based recommender systems. Inf. Sci. 2015, 320, 156–189.
  15. Harish, A.S.; Malathy, C. Customer segment prediction on retail transactional data using K-means and Markov model. Intell. Autom. Soft Comput. 2023, 36, 589–600.
  16. Sarkar, P.; Roy, B.; Gupta, M.; De, S. Melanoma cell detection by using kmeans clustering segmentation and abnormal cell detection technique. In Artificial Intelligence on Medical Data. Lecture Notes in Computational Vision and Biomechanics; Gupta, M., Ghatak, S., Gupta, A., Mukherjee, A.L., Eds.; Springer: Singapore, 2023; Volume 37, pp. 193–202.
  17. Wickramasinghe, A.; Muthukumarana, S.; Loewen, D.; Schaubroeck, M. Temperature clusters in commercial buildings using k-means and time series clustering. Energy Inform. 2022, 5, 1.
  18. Mohsen, A.; Kovács, F.; Mezősi, G.; Kiss, T. Sediment transport dynamism in the confluence area of two rivers transporting mainly suspended sediment based on Sentinel-2 satellite images. Water 2021, 13, 3132.
  19. Miolla, R.; Palmisano, G.O.; Roma, R.; Caponio, F.; Difonzo, G.; De Boni, A. Functional foods acceptability: A consumers’ survey on bread enriched with oenological by-products. Foods 2023, 12, 2014.
  20. Liu, Y.; Zhang, Q.; Dong, W.; Li, Z.; Liu, T.; Wei, W.; Zuo, M. Autoformer-based model for predicting and assessing wheat quality changes of pesticide residues during storage. Foods 2023, 12, 1833.
  21. Ferreira-Paiva, L.; Locatel Suela, A.G.; Alfaro-Espinoza, E.R.; Cardona-Casas, N.A.; Valente, D.S.M.; Neves, R.V.A. A k-means-based-approach to analyze the emissions of GHG in the municipalities of MATOPIBA region, Brazil. IEEE Lat. Am. Trans. 2022, 20, 2339–2345.
  22. Han, J.; Kamber, M. Data Mining: Concepts and Techniques; Morgan Kaufmann: San Francisco, CA, USA, 2001.
  23. Blazejczyk, K.; Epstein, Y.; Jendritzky, G.; Staiger, H.; Tinz, B. Comparison of UTCI to selected thermal indices. Int. J. Biometeorol. 2012, 56, 515–535.
  24. Karimi, L.; Joshi, J. An Unsupervised Learning Based Approach for Mining Attribute Based Access Control Policies. In Proceedings of the 2018 IEEE International Conference on Big Data, Seattle, WA, USA, 10–13 December 2018; pp. 1427–1436.
  25. Thorndike, R.L. Who belongs in the family? Psychometrika 1953, 18, 267–276.
  26. Bholowalia, P.; Kumar, A. EBK-Means: A clustering technique based on elbow method and K-means in WSN. Int. J. Comput. Appl. 2014, 105, 17–24.
  27. Shi, C.; Wei, B.; Wei, S.; Wang, W.; Liu, H.; Liu, J. A quantitative discriminant method of elbow point for the optimal number of clusters in clustering algorithm. Eurasip. J. Wirel. Commun. Netw. 2021, 31.
  28. Food Safety Information Network. Available online: https://www.ey.gov.tw/ofs/A236031D34F78DCF (accessed on 22 July 2022).
  29. Zhou, S.; Xu, Z.; Liu, F. Method for determining the optimal number of clusters based on agglomerative hierarchical clustering. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 3007–3017.
  30. Díaz, J.J.V.; Pozo, R.F.; González, A.B.R.; Wilby, M.R.; Ávila, C.S. Hierarchical agglomerative clustering of bicycle sharing stations based on ultra-light edge computing. Sensors 2020, 20, 3550.
  31. Kawa, N.; Cucchi, K.; Rubin, Y.; Attinger, S.; Heße, F. Defining hydrogeological site similarity with hierarchical agglomerative clustering. Ground Water 2023, 61, 563–573.
  32. Valenti, M.; Ferrigno, G.; Martina, D.; Yu, W.; Zheng, G.; Shandiz, M.A.; Anglin, C.; Momi, E.D. Gaussian mixture models based 2D-3D registration of bone shapes for orthopedic surgery planning. Med. Biol. Eng. Comput. 2016, 54, 1727–1740.
  33. McNicholas, P.D.; Murphy, T.B. Model-based clustering of microarray expression data via latent Gaussian mixture models. Bioinformatics 2010, 26, 2705–2712.
Figure 1. Schematic diagram of distortion score elbow for K-means clustering.
Figure 2. Flow chart of this study.
Figure 3. Diagram of distortion score elbow for K-means clustering. (The red circle represents K = 5).
Figure 4. Euclidean distance distribution of manufacturers in cluster 4.
Table 1. List of non-compliant samples of previous post-market inspection.

| No. | Name of Category | Sample Name | Origin | EtO Value (mg/kg) |
|---|---|---|---|---|
| 1 | Sesame raw materials | Black sesame (Premium product) | India | 1.340 |
| 2 | Spicy seasoning | Ban Cabe Chili Powder (extra spicy level 30) | India | 14.610 |
| 3 | | McCormick Garlic Mixed Spice Seasoning Powder | USA | 0.100 |
| 4 | | BonCabe Seasoned Chili Powder (medium spicy level 10) | Indonesia | 39.340 |
| 5 | | Xianguo brand Indian curry powder | Nine countries * | 16.497 |
| 6 | Cottage cheese | Formaggio Fresh Mozzarella | USA | 1.407 |
| 7 | | Sonoma Sincerely Brigitte Variety Cheese | USA | 0.200 |
| 8 | Instant noodles (noodles, oil packets, or powder packets) | Penang Ah Lai White Curry Noodle (noodle/sauce packets) | Malaysia | 0.065/0.084 |
| 9 | | Indomie Instant Noodles Rasa Ayam Special (Special Chicken Flavor), seasoning powder packet | Indonesia | 0.187 |

Note: * Products manufactured in Taiwan. The factory inspection results showed no EtO risk in the manufacturing process, and the cause of this problem is likely from the raw material source countries.
Table 2. The five main systems of Taiwan Food Cloud.

| Data Source | Supplementary Note |
|---|---|
| Registration Platform of Food Business System (RPFBS) | According to Article 8 of the Taiwan Food Safety and Sanitation Management Act, food businesses whose category and scale have been announced need to register with RPFBS to operate. The system records the basic information of food companies in different industries, such as the product category and name with the largest turnover, business category, etc. |
| Food Traceability Management System (FTMS) | According to Article 9 of the Taiwan Food Safety and Sanitation Management Act, food businesses whose categories and scales have been announced must electronically declare and establish the source and flow of raw materials, semi-finished products, and finished products. Once a food safety incident occurs, the food industry and health authorities can quickly deal with the problematic products. This study used the raw materials and finished products declared in FTMS to link risky raw materials and finished products that may be related to international public opinion, yielding a list of operators that may have EtO risks. |
| Inspection Management System (IMS) | The database of this system integrates product test data from central health authorities and local government health bureaus. |
| Product Management Decision System (PMDS) | The database of this system is the integrated data established by the central health authority and the local government health bureau after the inspection and testing of food in the post-market stage. The content of the information includes the name of the product, the sampling site, the country of origin, and the inspection and test results. |
| Import Food Information System (IFIS) | TFDA established this system for the inspection of imported food in 2011. The database content of the system includes imported food inspection information, such as product name, country of origin, net weight, inspection method, and test results. |
Table 3. Finished products or raw materials with EtO residues in the international risk notification (data collection period: 1 January 2021 to 31 March 2023).

| Serial Number | Finished Products or Raw Materials with EtO Residues | Submitted Number of Notifications | Total Number of Notifications |
|---|---|---|---|
| 1 | Sesame and its products: Sesame | 461 | 470 |
| 2 | Sesame and its products: Sesame paste | 5 | |
| 3 | Sesame and its products: Sesame oil | 4 | |
| 4 | Locust bean gum | 57 | 57 |
| 5 | Xanthan gum | 21 | 21 |
| 6 | Guar gum | 12 | 12 |
| 7 | Baked goods or ingredients: Bread | 13 | 21 |
| 8 | Baked goods or ingredients: Pastry | 3 | |
| 9 | Baked goods or ingredients: Biscuit | 2 | |
| 10 | Baked goods or ingredients: Baking powder | 1 | |
| 11 | Baked goods or ingredients: Other baked cooking and steaming foods | 2 | |
| 12 | Ice cream | 21 | 21 |
| 13 | Instant noodles | 31 | 31 |
| 14 | Vanilla | 20 | 20 |
| 15 | Ginger (powder) | 13 | 13 |
| 16 | Turmeric | 13 | 13 |
| 17 | Curry products: Curry powder | 5 | 10 |
| 18 | Curry products: Curry | 4 | |
| 19 | Curry products: Curry paste | 1 | |
| 20 | Chili | 9 | 9 |
| 21 | Black pepper | 9 | 9 |
| 22 | Psyllium | 7 | 7 |
| 23 | Cinnamon | 5 | 5 |
| 24 | Fenugreek | 4 | 4 |
| 25 | Dairy and its products: Milk powder | 2 | 5 |
| 26 | Dairy and its products: Cheese | 1 | |
| 27 | Dairy and its products: Fermented milk | 1 | |
| 28 | Dairy and its products: Cream | 1 | |
| 29 | Noodle products and their raw materials: Noodle | 2 | 4 |
| 30 | Noodle products and their raw materials: Flour | 2 | |
| 31 | Capsule | 4 | 4 |
| 32 | Calcium carbonate | 7 | 7 |
| 33 | Sodium carboxymethyl cellulose | 2 | 2 |
| 34 | Barbecue products: Barbecue powder | 1 | 3 |
| 35 | Barbecue products: Barbecue sauce | 2 | |
| 36 | Ketchup | 2 | 2 |
| 37 | Ashwagandha | 4 | 4 |
| 38 | Okra | 4 | 4 |
| 39 | Amaranth | 3 | 3 |
| 40 | Onion | 3 | 3 |
| 41 | Dehydrated vegetables | 5 | 5 |
| 42 | Moringa | 5 | 5 |
| 43 | Chia seeds | 2 | 2 |
| 44 | Chickpeas | 2 | 2 |
| 45 | Barley grass | 2 | 2 |
| 46 | Kidney bean | 2 | 2 |
| 47 | Shallot | 2 | 2 |
| 48 | Gotu Kola | 2 | 2 |
| 49 | Curry leaves | 2 | 2 |
| 50 | Bacopa monnieri | 2 | 2 |
| 51 | Coriander seeds (Caraway seeds) | 2 | 2 |
| 52 | Carob powder (Chocolate substitute) | 2 | 2 |
| 53 | Coconut products: Coconut milk | 1 | 2 |
| 54 | Coconut products: Coconut juice | 1 | |
| 55–90 | Carotene, Chocolate, Coffee beans, Buckwheat, Oat, Rosemary, Stevia leaves, Barley grass powder, Basil, Cardamom, Celery, Clove, Coleus forskohlii, Cranberry, Cumin seeds, Flaxseed, Ginseng, Gluten, Green beans, Hawthorn, Indian fennel, Lemon, Lemon Leaf, Lentils, Meat, Meatballs, Melatonin, Mustard seed, Nutmeg, Quinoa, Spirulina Powder, Sunflower seeds, Tangerine, Tea, Vitamin B12, Wheatgrass powder | One case of each item | |

Note: 1. This table is sorted by the number of notifications from large to small. 2. A total of 81 cases had unclear names of finished products or raw materials and could not be included in the table.
Table 4. Interpretation table of factor quantification.
FractionRisk Factors from ManufacturersRisk Factors from Finished Food Products or Raw Materials
CNBpQBb
10More than 100 million51069–1169510
830 million to 100 million4 20–68
6200,000 to 30 million3 7–19
41–200,0002 3–6
20 or null1 1–2
1 1 1
0-- -
Table 5. Sum of the squared errors (SSE) for the value of K.

| K | Distance Score |
|---|---|
| 2 | 9842.59 |
| 3 | 7355.31 |
| 4 | 5719.75 |
| 5 | 5319.43 |
| 6 | 4939.79 |
| 7 | 4352.80 |
| 8 | 3892.47 |
| 9 | 3592.26 |
| 10 | 4064.35 |
| 11 | 3541.50 |
| 12 | 3483.60 |
| 13 | 3128.03 |
| 14 | 2823.35 |
| 15 | 2743.12 |
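The SSE values in Table 5 can be inspected numerically to see where the curve flattens. The sketch below only computes the drop in SSE from one K to the next, using the values as read from Table 5; it is not the authors' elbow procedure, which selected K = 5 from the plot in Figure 3.

```python
# SSE ("distance score") per K, as read from Table 5.
SSE_BY_K = {
    2: 9842.59, 3: 7355.31, 4: 5719.75, 5: 5319.43, 6: 4939.79,
    7: 4352.80, 8: 3892.47, 9: 3592.26, 10: 4064.35, 11: 3541.50,
    12: 3483.60, 13: 3128.03, 14: 2823.35, 15: 2743.12,
}


def marginal_drops(sse_by_k):
    """Map each K to the SSE reduction gained by moving from K-1 to K."""
    ks = sorted(sse_by_k)
    return {k: sse_by_k[prev] - sse_by_k[k] for prev, k in zip(ks, ks[1:])}


drops = marginal_drops(SSE_BY_K)
for k, drop in drops.items():
    print(f"K={k:2d}  SSE drop from K-1: {drop:9.2f}")
```

The drop shrinks sharply once K reaches 5, which is consistent with an elbow in that region. The negative drop at K = 10 reflects the reported values, where the SSE at K = 10 is higher than at K = 9.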
Table 6. Iterative process.

| Iteration | Cluster Center Change: 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 3.113 | 4.411 | 3.023 | 3.577 | 2.701 |
| 2 | 0.585 | 0.770 | 0.228 | 0.333 | 0.157 |
| 3 | 0.154 | 0.156 | 0.013 | 0.000 | 0.000 |
| 4 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |

Note: Convergence is reached when there are no or minor changes in the cluster centers. The maximum absolute coordinate change for any center was 0.000. The number of iterations was four. The minimum distance between the initial centers was 8.307.
Table 7. Distance between final cluster centers.

| Cluster | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | | 4.589 | 4.168 | 8.162 | 6.232 |
| 2 | 4.589 | | 4.926 | 4.643 | 2.965 |
| 3 | 4.168 | 4.926 | | 5.664 | 4.518 |
| 4 | 8.162 | 4.643 | 5.664 | | 2.731 |
| 5 | 6.232 | 2.965 | 4.518 | 2.731 | |
Table 8. Analysis of variance.

| Variable | Cluster Mean Square | Cluster Degrees of Freedom | Error Mean Square | Error Degrees of Freedom | F | Significance |
|---|---|---|---|---|---|---|
| C | 974.442 | 4 | 0.726 | 1356 | 1341.361 | 0.000 |
| N | 259.753 | 4 | 1.451 | 1356 | 178.962 | 0.000 |
| Bb | 13.816 | 4 | 0.210 | 1356 | 65.849 | 0.000 |
| Q | 2382.174 | 4 | 1.535 | 1356 | 1551.686 | 0.000 |
| Bp | 0.000 | 4 | 0.000 | 1356 | . | . |
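As a consistency check on Table 8, the F statistic in a one-way analysis of variance is the cluster mean square divided by the error mean square. The sketch below recomputes that ratio from the rounded mean squares in the table; small gaps from the reported F values are expected because the inputs are rounded to three decimals. Bp is omitted because both of its mean squares are reported as 0.000 and no F value is given.

```python
# (cluster mean square, error mean square, reported F), as read from Table 8.
ANOVA_ROWS = {
    "C": (974.442, 0.726, 1341.361),
    "N": (259.753, 1.451, 178.962),
    "Bb": (13.816, 0.210, 65.849),
    "Q": (2382.174, 1.535, 1551.686),
}


def f_ratio(cluster_ms, error_ms):
    """F statistic as the ratio of between-cluster to within-cluster mean square."""
    return cluster_ms / error_ms


recomputed = {name: f_ratio(c, e) for name, (c, e, _) in ANOVA_ROWS.items()}

# Relative gap between the recomputed and the reported F value.
relative_gap = {
    name: abs(recomputed[name] - reported) / reported
    for name, (_, _, reported) in ANOVA_ROWS.items()
}

for name in ANOVA_ROWS:
    print(f"{name:2s}  recomputed F = {recomputed[name]:9.3f}  gap = {relative_gap[name]:.4%}")
```

The larger the F value, the more strongly a variable separates the clusters; on these figures Q and C contribute most to the separation.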
Table 9. Final cluster center.

| Variable | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 |
|---|---|---|---|---|---|
| C | 6 | 6 | 9 | 9 | 9 |
| N | 4 | 4 | 6 | 6 | 4 |
| Bb | 0 | 1 | 0 | 1 | 1 |
| Q | 3 | 7 | 4 | 10 | 8 |
| Bp | 0 | 0 | 0 | 0 | 0 |
| Risk score for the cluster | 13 | 18 | 19 | 26 | 22 |
| Number of manufacturers in each cluster | 438 | 342 | 177 | 212 | 192 |

Note: There is a missing value of 27. This means that 27 manufacturers could only be independent individuals and not be classified into the five clusters in the table; however, such a result does not mean that they are risk-free.
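The risk scores reported in Table 9 coincide with the sum of each cluster's center coordinates. The sketch below reproduces them on that assumption, which is an observation drawn from the table values rather than a formula taken from the text, and then picks the highest-scoring cluster.

```python
# Final cluster centers from Table 9, in the factor order (C, N, Bb, Q, Bp).
CENTERS = {
    1: (6, 4, 0, 3, 0),
    2: (6, 4, 1, 7, 0),
    3: (9, 6, 0, 4, 0),
    4: (9, 6, 1, 10, 0),
    5: (9, 4, 1, 8, 0),
}

# Assumption: a cluster's risk score is the sum of its center coordinates.
risk_scores = {cluster: sum(center) for cluster, center in CENTERS.items()}

# The cluster with the largest score is treated as the highest-risk group.
highest_risk_cluster = max(risk_scores, key=risk_scores.get)

print(risk_scores)
print(highest_risk_cluster)
```

Under this reading, cluster 4 has the highest score, which matches its selection as the highest-risk group in Section 4.3.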
Table 10. Summary table of testing finished products or raw materials for EtO.

| Serial Number | Name of Category | Sample Name | Product Name | Country of Origin | Is EtO Detected? | Government Uniform Invoice Number * |
|---|---|---|---|---|---|---|
| 1 | Baked goods or ingredients | Cookie | Oyster omelet potato chips | Taiwan | No | 56670000 |
| 2 | | Flour and its products | Super Bread Flour No. 1 | Taiwan | No | 22522000 |
| 3 | | | Egg square cookies | Taiwan | No | 66600000 |
| 4 | | | White bread | Taiwan | No | 27610000 |
| 5 | | | Danish pineapple croissant | Taiwan | No | 73250000 |
| 6 | | | Camel Brand Purple Camel Heart Flour | Taiwan | No | 28427000 |
| 7 | Cocoa | Chocolate | Dark Cocoa Mass | Taiwan | No | 27359000 |
| 8 | | | Chocolate golden candy | Malaysia | No | 84150000 |
| 9 | Coffee beans | Coffee beans | Fandi Roasted Special Coffee Bean Manor Italian Coffee Bean, Northern Italian Style | Taiwan | No | 86017000 |
| 10 | | | Mandheling and Brazilian coffee | Taiwan | No | 86570000 |
| 11 | Food Additives | Carotene | β-carotene 1% CWS/M | Australia, New Zealand | No | 16930000 |
| 12 | | | β-carotene 1% powder | China | No | 5013000 |
| 13 | | Sodium carboxymethyl cellulose | H-Sodium carboxymethyl cellulose | China | No | 86017000 |
| 14 | | | Sodium carboxymethyl cellulose | Taiwan | No | 80130000 |
| 15 | | Xanthan gum | Xanthan gum | Taiwan | No | 3028000 |
| 16 | | | Xanthan gum 80SF | China | No | 3341000 |
| 17 | | | Xanthan gum RD | China | No | 24547000 |
| 18 | Milk and its products | Butter | Salted butter | New Zealand | No | 31264000 |
| 19 | | | New Zealand salted butter | New Zealand | No | 15319000 |
| 20 | | Cheese | Cheese powder | Taiwan | No | 23520000 |
| 21 | | | High-melt cheese (processed cheddar) | Taiwan | No | 80610000 |
| 22 | | Cream | Unicorn Cream (No Salt Added) | Australia, New Zealand | No | 70443000 |
| 23 | | Fermented milk | Diluted fermented milk (less sugar and fibers) | Taiwan | No | 12680000 |
| 24 | | | Original yogurt | Taiwan | No | 73250000 |
| 25 | | Milk powder | Skimmed milk powder | Australia | No | 73250000 |
| 26 | | | Whole milk powder | New Zealand | No | 16930000 |
| 27 | | Salad sauce | Lemon yogurt salad sauce | New Zealand | No | 22880000 |
| 28 | Other plant products | Rosemary (dried powder) | Rosemary leaves | Taiwan | No | 86384000 |
| 29 | | | Rosemary powder | Taiwan | No | 53260000 |
| 30 | | Stevia leaves | H-Stevia Leaf 094 | Taiwan | No | 86017000 |
| 31 | | Vanilla and its products | Vanilla chicken baking powder | Taiwan | No | 73000000 |
| 32 | | | Vanilla powder | Taiwan | No | 89410000 |
| 33 | | | Vanilla flavoring powder (collagen) | Taiwan | No | 23520000 |
| 34 | Healthy food (plant juices and extracts) | Healthy food with ginger and turmeric extract | Sentosa Curcumin and Zn Male Complex Tablets | Taiwan | No | 12197000 |
| 35 | | | Grape King Yi Turmeric Ex Flagship Capsules | Taiwan | No | 11880000 |
| 36 | Sauces | Barbecue sauce | Bill Head Barbecue Sauce (Sha Cha) | Taiwan | No | 80550000 |
| 37 | | Ketchup | Ketchup | Taiwan | No | 23930000 |
| 38 | | | Ketchup | Taiwan | No | 3028000 |
| 39 | | | Veggie Ketchup | Taiwan | No | 3028000 |
| 40 | | | Tomato paste | China | No | 24547000 |
| 41 | Sesame | Sesame and its products | Natural and unsweetened Extra Thick Black Sesame Paste | Taiwan | No | 81133000 |
| 42 | Spices | Chili (dry powder products) | Spicy Soup Powder | Taiwan | No | 24293000 |
| 43 | | | 100% Pure White Pepper | Malaysia and Indonesia | No | 31264000 |
| 44 | | | Chili powder | New Zealand | No | 70470000 |
| 45 | | Curry ingredients | Seasoned Curry powder | Taiwan | No | 86384000 |
| 46 | | Curry Products | Issuta Medium Spicy Instant Curry | Taiwan | No | 86384000 |
| 47 | | Pepper | Extra hot pepper | Taiwan | No | 97301000 |
| 48 | | | Pure White Pepper AF | Taiwan | No | 23930000 |
| 49 | Whole grains | Buckwheat | Golden Buckwheat Tea Bags | Taiwan | No | 86017000 |
| 50 | | Oat | Mayushan High-Fiber Large Oatmeal | Taiwan | No | 81133000 |
| 51 | | | Oats | New Zealand | No | 59660000 |
| 52 | | | Oats with red yeast rice and buckwheat | Taiwan | No | 22100000 |

Note: * The government uniform invoice number has been de-identified.
Table 11. EtO inspection table for foods at the border of Taiwan (data time interval: from 31 August 2021 to 30 November 2023).

| Name of the Category (Start Time of Inspection) | Number of Inspection Batches | Number of Unqualified Batches | Unqualified Rate (%) |
|---|---|---|---|
| Sesame (31 August 2021) | 131 | 0 | 0 |
| Guar gum (18 October 2021) | 13 | 0 | 0 |
| Locust bean gum (18 October 2021) | 17 | 1 | 5.9 |
| Flour: Instant noodles (with and without meat; 1 March 2022); Other noodles (other pasta and couscous; 1 November 2022) | 1640 | 44 | 2.7 |
| Empty capsule: Empty capsules and medicinal capsules containing medicines (15 March 2022) | 25 | 0 | 0 |
| Ice cream and related products: Ice cream (15 March 2022); Other edible ice (1 December 2022) | 358 | 2 | 0 |
| Spices: Pepper, capsicum, all spices, and herbs (1 November 2022); Fennel, coriander seeds, turmeric, cinnamon, cardamom, and cloves (2 March 2023); Bay leaves (31 October 2023) | 315 | 24 | 7.6 |
| Curry Products (10 February 2023): Curry powder, curry cubes, and curry paste | 62 | 1 | 1.6 |
| Rice (1 April 2023): Brown rice, glutinous rice, and white rice | 80 | 0 | 0 |
| Cereals (1 April 2023): Wheat, rye, barley, maize, buckwheat, millet, and quinoa | 111 | 0 | 0 |
| Nuts (1 April 2023): Brazil nuts, cashews, almonds, hazelnuts, walnuts, chestnuts, pistachios, macadamias, ginkgo nuts, and pine nuts | 75 | 0 | 0 |
| Dried beans (1 April 2023): Peas, chickpeas, mung beans, black beans, red beans, kidney beans, lentils, and broad beans | 46 | 0 | 0 |
| Seeds (1 April 2023): Peanuts, pumpkins, sunflower seeds, and melon seeds | 25 | 0 | 0 |
| Coffee beans (1 April 2023): Roasted and unroasted | 95 | 0 | 0 |
| Dairy products (1 April 2023): Condensed milk, butter, cream, cheese, anhydrous butter, fresh milk, long-life milk, flavored milk, and yogurt | 340 | 2 | 0.6 |
| Wheat gluten powder (1 April 2023) | 4 | - | - |
| Xanthan gum (1 April 2023) | 3 | 0 | 0 |
| Chocolate products (1 August 2023): Cocoa beans, cocoa paste, cocoa butter, cocoa powder, and other chocolate preparations | 122 | 0 | 0 |
| Tomato sauce products (1 August 2023): Tomato ketchup and other tomato sauces | 27 | 0 | 0 |
| Sweetener (1 August 2023) | 2 | 0 | 0 |
| Other paste and flavors (21 August 2023) | 653 | 6 | 0.9 |
| Jam (27 September 2023) | 2 | 0 | 0 |
| Mustard oil (4 December 2023) | 1 | 0 | 0 |
| Total | 4146 | 80 | 1.9 |
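The unqualified rate in Table 11 is the number of unqualified batches divided by the number of inspection batches, expressed as a percentage. The sketch below recomputes it for a few rows whose figures can be read unambiguously from the table; the helper name is an illustrative choice.

```python
# (inspection batches, unqualified batches, reported rate in %), from Table 11.
BORDER_ROWS = {
    "Locust bean gum": (17, 1, 5.9),
    "Flour": (1640, 44, 2.7),
    "Spices": (315, 24, 7.6),
    "Curry products": (62, 1, 1.6),
    "Dairy products": (340, 2, 0.6),
    "Other paste and flavors": (653, 6, 0.9),
    "Total": (4146, 80, 1.9),
}


def unqualified_rate(batches, unqualified):
    """Return the unqualified rate in percent, rounded to one decimal place."""
    return round(100 * unqualified / batches, 1)


rates = {
    name: unqualified_rate(batches, unqualified)
    for name, (batches, unqualified, _) in BORDER_ROWS.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate}% (reported {BORDER_ROWS[name][2]}%)")
```

Ranking categories by this rate is one simple way to see which product groups drive the overall border non-compliance figure of 1.9%.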
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wu, L.-Y.; Liu, F.-M.; Lin, W.-C.; Qiu, J.-T.; Lin, H.-Y.; Lin, K.-F. Research on Using K-Means Clustering to Explore High-Risk Products with Ethylene Oxide Residues and Their Manufacturers in Taiwan. Foods 2024, 13, 2510. https://doi.org/10.3390/foods13162510

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.