5.4. Correlation Analysis
Correlation analysis is a valuable statistical tool for determining the strength and direction of the linear relationship between two variables [51]. A correlation coefficient between −1 and 1 is typically used to express the strength and direction of this relationship [51]. A correlation coefficient of 1 signifies a perfect positive correlation, while a correlation coefficient of −1 indicates a perfect negative correlation. On the other hand, a correlation coefficient of 0 suggests no linear association between the variables. The Spearman correlation coefficient is a nonparametric method that gauges the rank correlation and does not assume a normal data distribution [51]. This study used the Spearman correlation coefficient to assess the relationships between the variables. The outcomes of the correlation analysis between variables are presented in Table 7.
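As an illustration of how a rank correlation of this kind is computed, the following minimal sketch ranks each variable (averaging the ranks of tied values, which are common in Likert-type responses) and then takes the Pearson correlation of the ranks. The function names and the sample responses are hypothetical and are not taken from the study's data or code.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical 5-point responses for two constructs (illustration only).
pu_scores = [1, 2, 3, 4, 5, 5]
iu_scores = [1, 3, 2, 4, 5, 5]
rho = spearman(pu_scores, iu_scores)
```

Because only the ranks enter the calculation, any strictly increasing relationship yields a coefficient of 1, even when it is not linear.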
The Spearman correlation coefficients between IU and PU, PEOU, SAT, MGC, and UGC showed that IU had a strong positive correlation with MGC (r = 0.920), PEOU (r = 0.870), PU (r = 0.863), and SAT (r = 0.863), and a moderate positive correlation with UGC (r = 0.759). These results suggest that IU is associated with all five variables, most strongly with MGC. Likewise, the findings indicate a strong positive correlation between PU and all the other variables. The strongest correlation is between PU and PEOU (r = 0.990), which means that the PU and PEOU of social media platforms are closely related. If the PU is high, the PEOU tends to be high as well, and vice versa. The correlation between PU and IU is also strong and positive (r = 0.863), which means that as the PU of social media platforms increases, so does the IU of those platforms to plan travel to Saudi Arabia, and vice versa. The correlation between PU and SAT is also strong and positive (r = 0.858), which indicates that satisfaction with social media platforms increases with their PU for planning travel to Saudi Arabia, and vice versa. Furthermore, there is a strong positive correlation (r = 0.958) between PU and MGC. This implies that the greater the PU of social media platforms, the greater the motivation to use MGC on those platforms to plan travel to Saudi Arabia, and vice versa. The weakest correlation is between PU and UGC (r = 0.834), which means that PU is less closely tied to the motivation to use UGC on social media platforms to plan travel to Saudi Arabia than it is to the other variables.
According to the results, there is also a strong positive correlation between PEOU and all other variables. The correlation between PEOU and SAT is strong and positive (r = 0.859), which means that if social media platforms have a high PEOU, users are more likely to be satisfied with using them to plan travel to Saudi Arabia. Similarly, if users are satisfied with the platforms, they are more likely to perceive them as easy to use. The correlation between PEOU and MGC is strongly positive (r = 0.964), which means that the easier social media platforms are to use, the more motivated users are to use MGC on those platforms to plan travel to Saudi Arabia. The weakest correlation is between PEOU and UGC (r = 0.829), which means that, although people are more likely to use UGC to plan travel to Saudi Arabia when social media platforms are more user-friendly, this relationship is weaker than those with the other variables.
For SAT, the strongest correlation is with UGC (r = 0.925), which is a strong positive correlation. This means that the higher the satisfaction with the social media platforms, the higher the motivation to use UGC on social media platforms to plan travel to Saudi Arabia, and vice versa. The correlation between SAT and IU is also strong and positive (r = 0.863), which means that the higher the satisfaction with social media platforms, the higher the intention to use social media platforms to plan travel to Saudi Arabia, and vice versa. The correlation between SAT and PEOU is also strong and positive (r = 0.859), which means that the higher the satisfaction with social media platforms, the higher the PEOU of the social media platforms, and vice versa. The correlation between SAT and MGC is strongly positive (r = 0.905), which means that the higher the satisfaction with social media platforms, the higher the motivation to use MGC on the social media platforms to plan travel to Saudi Arabia, and vice versa. The weakest correlation is between SAT and PU (r = 0.858), which means that satisfaction with social media platforms correlates positively with PU, but that this is the weakest of the relationships involving SAT.
The results of the correlation analysis of MGC with all variables show that the strongest correlation is between MGC and PEOU (r = 0.964), which indicates that when there is a high motivation to use MGC on social media platforms for planning travel to Saudi Arabia, there is also a high PEOU of those platforms for the same purpose, and vice versa. The correlation between MGC and IU is also strong and positive (r = 0.920), which means that as the motivation to use MGC on social media platforms increases, so does the intention to use social media platforms to plan travel to Saudi Arabia, and vice versa. There is a strong positive correlation (r = 0.905) between the motivation to use MGC on social media for planning trips to Saudi Arabia and the satisfaction with social media platforms for such planning. This means that as the motivation to use MGC increases, so does the satisfaction with social media platforms for trip planning, and vice versa. The correlation between MGC and PU is also strong and positive (r = 0.958), which means that if a tourist is highly motivated to use MGC for planning travel to Saudi Arabia on social media platforms, the tourist is likely to have a higher PU of those platforms for travel planning purposes, and vice versa. The weakest correlation is between MGC and UGC (r = 0.833), which is still a strong positive correlation. This means that the motivations to use MGC and UGC on social media platforms for planning travel to Saudi Arabia rise together, but this relationship is weaker than those between MGC and the other variables.
The correlation analysis of UGC with all variables shows that the strongest correlation is between UGC and SAT (r = 0.925), which is a strong positive correlation. This means that the higher the motivation to use UGC on social media platforms to plan travel to Saudi Arabia, the higher the satisfaction with the social media platforms, and vice versa. The correlation between UGC and PU is also strong and positive (r = 0.834), which implies that the more motivated a tourist is to use UGC on social media platforms to plan their travel to Saudi Arabia, the higher their PU of the social media platforms will be, and vice versa. The correlation between UGC and PEOU is also strong and positive (r = 0.829), which indicates that higher motivation to use UGC on social media platforms to plan travel to Saudi Arabia is associated with higher PEOU of social media platforms, and vice versa. The correlation between UGC and MGC is also strong and positive (r = 0.833), indicating that a greater desire to use UGC on social media platforms for planning travel to Saudi Arabia is associated with a greater desire to use MGC, and vice versa. The weakest correlation is between UGC and IU (r = 0.759), which is a moderate positive correlation. This means that the higher the motivation to use UGC on social media platforms for planning travel to Saudi Arabia, the greater the IU for social media platforms, but that this relationship is weaker than those between UGC and the other variables.
A heatmap illustrating the results of the correlation analysis is shown in Figure 7.
5.5. Hypothesis Testing
This study investigated the behavior of tourists on social media by identifying their utilization of social media as a new technology when planning to travel to Saudi Arabia and determining the factors that influence their decisions. The tourist-based theoretical framework hypotheses were developed and tested with OLS to achieve this study’s objectives.
5.5.1. The Influence of PU, PEOU, SAT, MGC, and UGC
This study suggested that tourists’ intentions to use social media platforms for planning travel to Saudi Arabia are influenced by all the variables, PEOU, PU, SAT, MGC, and UGC.
For H1 (PU positively affects tourists’ intentions to use social media for planning travel to Saudi Arabia), the OLS results give a coefficient of 0.9706, with a standard error of 0.019, a t-value of −4.279, and a p-value of 0.000011. Since the p-value is less than 0.05, the null hypothesis can be rejected, and the alternative hypothesis can be accepted. Therefore, the results support H1, which means that PU positively impacts tourists’ intentions to use social media for planning travel to Saudi Arabia. In simple terms, tourists are more likely to use social media for travel planning if they find it useful.
For H2 (PEOU positively affects tourists’ intentions to use social media for planning travel to Saudi Arabia), the OLS results give a coefficient of 0.9086, with a standard error of 0.016, a t-value of −3.294, and a p-value of 0.000524. Since the p-value is less than 0.05, the null hypothesis can be rejected, and the alternative hypothesis can be accepted. Therefore, the results support H2, which means that PEOU positively impacts tourists’ intentions to use social media for planning travel to Saudi Arabia. In simple terms, tourists are more likely to use social media for travel planning if they find it easy to use.
For H3 (SAT positively affects tourists’ intentions to use social media for planning travel to Saudi Arabia), the OLS results give a coefficient of 1.7337, with a standard error of 0.041, a t-value of 4.190, and a p-value of 0.000016. Since the p-value is less than 0.05, the null hypothesis can be rejected, and the alternative hypothesis can be accepted. Therefore, the results support H3, which means that SAT positively impacts tourists’ intentions to use social media for planning travel to Saudi Arabia. In simple terms, tourists are more likely to use social media for travel planning if they are satisfied with the content on social media platforms.
For H4 (MGC positively affects tourists’ intentions to use social media for planning travel to Saudi Arabia), the OLS results give a coefficient of 1.1389, with a standard error of 0.016, a t-value of −3.966, and a p-value of 0.000041. Since the p-value is less than 0.05, the null hypothesis can be rejected, and the alternative hypothesis can be accepted. Therefore, the results support H4, which means that MGC positively impacts tourists’ intentions to use social media for planning travel to Saudi Arabia. In simple terms, tourists are more likely to use social media for travel planning if they are satisfied with MGC on social media platforms.
For H5 (UGC positively affects tourists’ intentions to use social media for planning travel to Saudi Arabia), the OLS results give a coefficient of 1.6279, with a standard error of 0.053, a t-value of 3.946, and a p-value of 0.000045. Since the p-value is less than 0.05, the null hypothesis can be rejected, and the alternative hypothesis can be accepted. Therefore, the results support H5, which means that UGC positively impacts tourists’ intentions to use social media for planning travel to Saudi Arabia. In simple terms, tourists are more likely to use social media for travel planning if they are satisfied with UGC on social media platforms.
The hypothesis testing shows that tourists have a positive attitude towards using social media for planning travel to Saudi Arabia. This is mainly due to the user-friendly and useful features provided by various social media platforms. Also, the content related to Saudi Arabia’s tourism on social media, whether MGC or UGC, significantly impacts tourists’ decision-making processes, ultimately leading to a better travel experience. The results of testing hypotheses H1 to H5 are presented in Table 8.
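To make the quantities reported for each hypothesis concrete, the sketch below fits a one-predictor OLS model and derives the coefficient, its standard error, the t-value (coefficient divided by standard error, so it carries the sign of the coefficient), and a two-sided p-value. It is a minimal illustration under stated assumptions: the data are hypothetical, the function name is invented for this example, and the p-value uses a normal approximation to the t distribution, which is reasonable only for large samples. It does not reproduce the study's model or figures.

```python
from math import sqrt, erfc

def simple_ols(x, y):
    """Fit y = intercept + slope * x; return slope, SE, t-value, approx. p-value."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    sigma2 = sum(r * r for r in residuals) / (n - 2)
    std_err = sqrt(sigma2 / sxx)
    t_value = slope / std_err
    # Two-sided p-value under a normal approximation (assumption: large n).
    p_value = erfc(abs(t_value) / sqrt(2))
    return slope, std_err, t_value, p_value

# Hypothetical predictor and outcome scores (illustration only).
predictor = [1, 2, 3, 4, 5, 6, 7, 8]
outcome = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1, 6.8, 8.1]
slope, std_err, t_value, p_value = simple_ols(predictor, outcome)

# Decision rule used in the text: reject the null hypothesis when p < 0.05.
reject_null = p_value < 0.05
```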
5.5.2. The Influence of the Tourists’ Characteristics
This study delves into the impact of MGC and UGC on the intention of tourists to use social media platforms for planning travel to Saudi Arabia. To examine how tourist characteristics affect their intention to use social media to plan travel to Saudi Arabia, the hypotheses were tested using OLS.
H6 and H7 examine how MGC and UGC influence the intention of tourists, based on their gender, to use social media for planning travel to Saudi Arabia. The OLS results of “H6: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by gender” reveal a coefficient of 0.3620, with a standard error of 0.075, a t-value of 4.848, and a p-value of 0.000031, indicating a significant effect of gender on the impact of MGC. This means that male tourists are more likely to be influenced by MGC than female tourists when making travel plans to Saudi Arabia. On the other hand, the OLS results of “H7: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by gender” demonstrate a coefficient of 0.1719, with a standard error of 0.102, a t-value of 1.685, and a p-value of 0.0894884, indicating no significant effect of gender on the impact of UGC. This means that there is no significant difference in the way that male and female tourists are influenced by UGC when making travel plans to Saudi Arabia. The OLS results therefore indicate that gender affects the influence of MGC, but not UGC, on tourists’ intentions to use social media for planning travel to Saudi Arabia. As a result, H6 is accepted, while H7 is rejected.
H8 and H9 examine how MGC and UGC influence the intention of tourists, based on their age, to use social media for planning travel to Saudi Arabia. The OLS results of “H8: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by age” reveal a coefficient of 0.1842, with a standard error of 0.038, a t-value of 4.784, and a p-value of 0.000397, indicating a significant effect of age on the impact of MGC. Meanwhile, the OLS results of “H9: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by age” demonstrate a coefficient of 0.0605, with a standard error of 0.071, a t-value of 0.857, and a p-value of 0.000186, indicating a significant effect of age on the impact of UGC. Therefore, H8 and H9 are accepted. The age group analysis is also informative. The largest age group comprised 245 participants aged 26 to 35, or 42.75% of the total sample. This suggests that younger tourists are a vital target audience for tourism marketing in Saudi Arabia.
The OLS results of the hypothesis testing for hypotheses H6 to H9 are summarized in Table 9.
H10 and H11 investigate how MGC and UGC influence the intention of tourists, based on their nationality, to use social media for planning travel to Saudi Arabia. The OLS results of “H10: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by nationality” reveal a coefficient of 0.0536, with a standard error of 0.008, a t-value of 6.505, and a p-value of 0.017071, indicating a significant effect of nationality on the impact of MGC. On the other hand, the OLS results of “H11: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by nationality” demonstrate a coefficient of 0.0036, with a standard error of 0.016, a t-value of 0.230, and a p-value of 0.081823, indicating no significant effect of nationality on the impact of UGC. The OLS results therefore show that nationality influences the impact of MGC but not UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia. As a result, H10 is accepted, while H11 is rejected.
H12 and H13 investigate how MGC and UGC influence the intention of tourists, based on their education level, to use social media for planning travel to Saudi Arabia. The OLS results of “H12: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by education level” reveal that the coefficient is 0.2152, with a standard error of 0.041, a t-value of 5.221, and a p-value of 0.002171, indicating a significant positive effect of education level on the impact of MGC. Meanwhile, the OLS results of “H13: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by education level” demonstrate that the coefficient is 0.0362, with a standard error of 0.061, a t-value of 0.593, and a p-value of 0.0653, indicating no significant effect of education level on the impact of UGC. Therefore, H12 is accepted, while H13 is rejected.
H14 and H15 investigate how MGC and UGC influence the intention of tourists, based on their monthly income, to use social media for planning travel to Saudi Arabia. The OLS results of “H14: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by monthly income” reveal a coefficient of 0.2480, with a standard error of 0.014, a t-value of 17.553, and a p-value of 0.01912, indicating a significant positive effect of monthly income on the impact of MGC. However, the OLS results of “H15: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by monthly income” demonstrate a coefficient of 0.0050, with a standard error of 0.041, a t-value of 0.121, and a p-value of 0.09034, indicating no significant effect of monthly income on the impact of UGC. The OLS results therefore show that monthly income influences the impact of MGC but not UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia. As a result, H14 is accepted, while H15 is rejected.
The OLS results of the hypothesis testing for hypotheses H10 to H15 are summarized in Table 10.
In summary, OLS was used to examine the influence of tourist characteristics on the impact of MGC and UGC when using social media for planning travel to Saudi Arabia. The findings revealed that tourist characteristics generally moderate the impact of MGC but, with the exception of age, not that of UGC on the intention to use social media for travel planning.
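The text does not state how the "will differ by" hypotheses were specified in the OLS models. One common approach, sketched here purely as an assumption, is to compare the slope of the outcome on the predictor across groups. For a two-level characteristic, the difference between the two group slopes equals the coefficient of the interaction term in a model that includes the predictor, a group indicator, and their product. The data, group labels, and function names below are hypothetical, and the sketch omits the significance test of that difference.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def slope_by_group(rows):
    """rows: iterable of (predictor, outcome, group); returns slope per group."""
    grouped = {}
    for predictor, outcome, group in rows:
        xs, ys = grouped.setdefault(group, ([], []))
        xs.append(predictor)
        ys.append(outcome)
    return {group: ols_slope(xs, ys) for group, (xs, ys) in grouped.items()}

# Hypothetical (MGC score, IU score, gender) records, for illustration only.
rows = [
    (1, 2.0, "male"), (2, 4.0, "male"), (3, 6.0, "male"), (4, 8.0, "male"),
    (1, 1.5, "female"), (2, 2.5, "female"), (3, 3.5, "female"), (4, 4.5, "female"),
]
slopes = slope_by_group(rows)

# Difference in slopes: the interaction coefficient in the equivalent
# model with a group indicator and a predictor-by-group product term.
interaction = slopes["male"] - slopes["female"]
```

A difference close to zero would correspond to the "no significant effect" outcomes reported for most of the UGC hypotheses, while a clearly nonzero difference would correspond to the moderated MGC effects.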
5.5.3. The Influence of the Visit Characteristics
To investigate how visit characteristics affect the tourists’ intentions to use social media to plan travel to Saudi Arabia, the hypotheses were tested using OLS. H16 and H17 examine whether the effects of MGC and UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia differ depending on whether tourists have visited the country before. Based on the OLS results of “H16: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ based on whether they have made previous visits”, the coefficient is 0.4007, with a standard error of 0.058, a t-value of 6.857, and a p-value of 0.01833, which means that the effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia differs significantly based on whether they have made previous visits. MGC can help to build trust and confidence among these tourists, making them more likely to use social media for travel planning. Meanwhile, the OLS results of “H17: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ based on whether they have made previous visits” show that the coefficient is −0.2472, with a standard error of 0.108, a t-value of −2.293, and a p-value of 0.07822, which means that the effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia does not differ significantly based on whether they have made previous visits. UGC can create confidence among all tourists when using social media for travel planning. Therefore, H16 is accepted, while H17 is rejected.
H18 and H19 examine how MGC and UGC affect tourists’ intentions to use social media for planning travel to Saudi Arabia, differing depending on the season that tourists prefer. The OLS results of “H18: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by season” show that the coefficient is 0.0672, with a standard error of 0.065, a t-value of 13.822, and a p-value of 0.00122, which means there is a significant difference in the effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia based on the season that they prefer. The OLS results of “H19: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ by season” show that the coefficient is 0.3676, with a standard error of 0.027, a t-value of 1.031, and a p-value of 0.06303, which means there is no significant difference in the effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia based on the season that they prefer. Therefore, H18 is accepted, while H19 is rejected.
H20 and H21 examine whether the effects of MGC and UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia differ depending on the length of stay. Based on the OLS results of “H20: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ based on the length of stay”, the coefficient is 0.2652, with a standard error of 0.030, a t-value of 8.868, and a p-value of 0.0458, indicating that the length of stay influences the impact of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia. The survey results show that the largest share of respondents (46.88%) favored a seven-day stay; this indicates that tourists planning a brief stay in Saudi Arabia may rely on MGC to obtain comprehensive information about the country. The OLS results of “H21: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ based on the length of stay” show a coefficient of 0.0897, with a standard error of 0.064, a t-value of 1.404, and a p-value of 0.08161, indicating that the length of stay does not affect the impact of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia. Therefore, H20 is accepted, while H21 is rejected.
Further, H22 and H23 examine how MGC and UGC affect tourists’ intentions to use social media for planning travel to Saudi Arabia, differing depending on the type of accommodation. The OLS results of “H22: The effect of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ based on the type of accommodation” show a coefficient of 0.2648, with a standard error of 0.019, a t-value of 13.802, and a p-value of 0.00138, indicating that the type of accommodation influences the impact of MGC on tourists’ intentions to use social media for planning travel to Saudi Arabia. The reliance of tourists on MGC is different based on the accommodation type. According to the survey analysis, tourists looking for a motel or economical hotel will rely on MGC more than other tourists. The OLS results of “H23: The effect of UGC on tourists’ intentions to use social media for planning travel to Saudi Arabia will differ based on the type of accommodation” show that the coefficient is 0.0363, with a standard error of 0.047, a t-value of 0.775, and a p-value of 0.094391, indicating that the type of accommodation does not affect the impact of UGC on tourists’ intentions to use social media for planning travel in Saudi Arabia. Therefore, H22 is accepted, while H23 is rejected.
In summary, the results of the OLS analysis showed that visit characteristics affect MGC but not UGC when it comes to tourists’ intentions to use social media for travel planning.
Table 11 summarizes the findings of the OLS analysis and hypothesis testing.
5.6. Tourist-Based ML Classification Model
The tourist-based ML classification model was trained using a combination of selected classification algorithms. The MGC class is labelled 0, and the UGC class is labelled 1. The performance of each model was assessed by calculating its accuracy, precision, recall, f1-score, and support for each class label.
5.6.1. Linear Classification Models
Linear classification models, including logistic regression, LDA, and LinearSVC, were trained. The logistic regression classification model achieved an accuracy score of 0.9593, indicating a correct prediction of class labels for 95.93% of test instances. The model showed strong results, with high precision and recall scores for both classes: precision was higher for class UGC (0.98) than for class MGC (0.93), while recall was higher for class MGC (0.97) than for class UGC (0.95). The f1-score was high for both classes, with class UGC (0.97) showing slightly better results than class MGC (0.95). The support shows more instances in class UGC (108) than in class MGC (64) in the test set, and the model distinguished well between the two classes despite this imbalance.
The LDA classification model achieved an accuracy score of 0.9070, indicating a correct prediction of class labels for 90.70% of test instances. The model demonstrated high precision and recall scores for both classes, with higher precision for class UGC (0.94) than for class MGC (0.85) and the same recall (0.91) for both classes. The f1-score was also high for both classes, with class UGC (0.92) showing slightly better results than class MGC (0.88). The support shows more instances in class UGC (108) than in class MGC (64) in the test set, and the model distinguished well between the two classes despite this imbalance.
The LinearSVC classification model achieved an accuracy score of 0.9767, indicating that the model accurately predicted the class labels for 97.67% of the test cases. For class UGC, the model achieved a perfect precision score of 1.00, meaning that the model made no false positive errors for this class. However, the model missed 4% of the actual positives for the UGC class, as indicated by the recall score of 0.96. The f1-score for class UGC was 0.98, indicating a good balance between precision and recall. There were 108 instances of the UGC class in the test set. For class MGC, the model achieved a precision score of 0.94, indicating that the model made 6% false positive errors for the MGC class. However, the model did not miss any actual positives for the MGC class, as indicated by the recall score of 1.00. The f1-score for class MGC was 0.97, indicating a good balance between precision and recall. There were 64 instances of the MGC class in the test set. Overall, the model demonstrated high accuracy and performance for both classes.
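The per-class metrics above all follow from a confusion matrix, which the sketch below makes explicit. The counts used are not given in the text: they are reconstructed, as an assumption, from the reported LinearSVC support (64 MGC, 108 UGC), the MGC recall of 1.00, and the UGC precision of 1.00 and recall of 0.96. The function name is hypothetical.

```python
def class_report(cm, labels):
    """cm[i][j] = count of instances with true class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    report = {}
    for i, label in enumerate(labels):
        true_pos = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column total: predicted as i
        support = sum(cm[i])                   # row total: truly class i
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    return correct / total, report

# Assumed counts (rows: true MGC, true UGC; columns: predicted MGC, UGC).
# Reconstructed from the reported figures, not taken from the source.
cm = [[64, 0],
      [4, 104]]
accuracy, report = class_report(cm, ["MGC", "UGC"])
```

With these assumed counts, the four UGC instances predicted as MGC account for both the MGC precision below 1.00 and the UGC recall below 1.00.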
5.6.2. Tree-Based Classification Models
The tree-based classification models trained included the decision tree classifier, extra tree classifier, and random forest classifier. The decision tree classifier model achieved an accuracy of 0.9069, correctly predicting the class label for 90.69% of the test instances. The classification report reveals that class MGC had a precision of 0.84 and class UGC had a precision of 0.95, which signifies that predictions of UGC were correct more often than predictions of MGC. In addition, the model had a recall score of 0.92 for class MGC and 0.90 for class UGC, indicating that it detected most instances of each class. The f1-score for class MGC was 0.88 and for class UGC was 0.92, indicating a good balance between precision and recall. The support value indicates that 64 instances belonged to class MGC and 108 to class UGC in the test set.
The extra tree classifier model performed well, with an accuracy of 0.8837, correctly predicting 88.4% of the test cases. The classification report shows that the model had a precision of 0.83 for MGC and 0.92 for UGC, indicating a relatively low rate of false positives, particularly for UGC. The recall for MGC (0.86) and UGC (0.90) indicates that the model captured most of the actual instances of each class. The f1-score, which measures the balance between precision and recall, was 0.85 for MGC and 0.91 for UGC. The support shows 64 instances of MGC and 108 instances of UGC in the test set.
The random forest classifier model achieved an accuracy of 0.9476, correctly predicting 94.77% of the test cases. The model had a high precision for both classes—0.91 for MGC and 0.97 for UGC—meaning that it had a low rate of false positives. The model also had a high recall for both classes—0.95 for MGC and 0.94 for UGC—meaning that it had a low rate of false negatives. The f1-score measures the balance between precision and recall, and the model had a high f1-score for both classes: 0.93 for MGC and 0.96 for UGC. The support shows 64 instances of MGC and 108 instances of UGC in the test set.
5.6.3. MLP and KNN Models
The MLP and KNN models were trained. The MLP classifier model achieved an accuracy of 0.8546, correctly predicting 85.5% of the test cases. The classification report shows that the model had a moderate precision for both classes (0.77 for MGC and 0.91 for UGC), meaning that it had a moderate rate of false positives. The model also had a recall of 0.86 for MGC and 0.85 for UGC, meaning that it had a low rate of false negatives. The f1-scores for MGC and UGC were 0.81 and 0.88, respectively. The support shows 64 instances of MGC and 108 instances of UGC in the test set.
The KNN classifier model achieved an accuracy of 0.8662, meaning that it correctly predicted 86.63% of the test cases. The classification report shows that the model had a moderate precision for both classes—0.84 for MGC and 0.88 for UGC—meaning that it had a moderate rate of false positives. The model also had a high recall for both classes—0.80 for MGC and 0.91 for UGC—meaning it had a low rate of false negatives. The f1-scores for MGC and UGC were 0.82 and 0.89, respectively. The support shows 64 MGC instances and 108 instances of UGC in the test set.
Based on the performance results displayed in Table 12, it was observed that the LinearSVC, random forest classifier, and logistic regression models exhibited superior performance. Consequently, these models were selected for further evaluation in the next phase to determine the best ML classification algorithm for the tourist-based classification model.
5.7. Tourist-Based ML Model Evaluation
This study aimed to determine the most effective ML model for classifying tourist data. Three models, namely logistic regression, random forest classifier, and LinearSVC, were evaluated using the K-Fold Cross Validation (KF-CV) approach to ensure accuracy and reliability. The results showed that the random forest classifier and logistic regression models achieved high accuracy scores of 0.95 after KF-CV was applied. However, the LinearSVC model stood out with an accuracy score of 0.99, suggesting that it is less prone to overfitting than the other models. The accuracy scores calculated after KF-CV was applied are shown in Table 13.
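The idea behind KF-CV can be sketched as follows: the data are split into k folds, each fold serves once as the test set while the remaining folds are used for training, and the k accuracy scores are averaged. The number of folds used in the study is not stated, so k = 5 here is an assumption. The toy one-feature threshold classifier and the data are hypothetical stand-ins for the actual models and features.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Yield (train, test) index lists for k shuffled, non-overlapping folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]

def fit_threshold(xs, ys):
    """Toy classifier: threshold midway between the two class means."""
    class0 = [x for x, y in zip(xs, ys) if y == 0]
    class1 = [x for x, y in zip(xs, ys) if y == 1]
    mean0 = sum(class0) / len(class0)
    mean1 = sum(class1) / len(class1)
    return (mean0 + mean1) / 2, mean1 >= mean0

def predict_threshold(model, x):
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)

def cross_val_accuracy(xs, ys, k=5):
    """Return the mean accuracy over k folds and the per-fold scores."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        hits = sum(predict_threshold(model, xs[i]) == ys[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / k, scores

# Hypothetical, well-separated data: label 0 (MGC) low, label 1 (UGC) high.
xs = [i / 10 for i in range(10)] + [5 + i / 10 for i in range(10)]
ys = [0] * 10 + [1] * 10
mean_accuracy, fold_scores = cross_val_accuracy(xs, ys, k=5)
```

Because every instance is used for testing exactly once, the averaged score is less dependent on one particular train/test split than a single hold-out accuracy.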
LinearSVC is a linear support vector machine, a family of versatile and practical ML models that can be used for classification and regression tasks. One of its key strengths is its linear kernel, which calculates the dot product of two feature vectors, allowing large datasets to be processed efficiently while still supporting accurate predictions on complex and nuanced data [52]. The LinearSVC model is particularly well-suited for working with linearly separable data. When a straight line (or, with more than two features, a hyperplane) can separate the classes, the model can achieve perfect accuracy on those data, making it a suitable choice for a wide range of classification tasks. Additionally, the LinearSVC model is fast to train and deploy, making it an excellent choice for working with large datasets [52]. Another key advantage of the LinearSVC model is its interpretability. Unlike some ML models, the LinearSVC model is easy to understand and explain, making it a valuable tool for researchers, analysts, and other professionals who need to interpret and make sense of their results.
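The dot-product view and the interpretability claim can be illustrated with a small sketch of a linear decision function: the score is the dot product of a weight vector and a feature vector plus a bias, and the sign of the score gives the class. The feature names, weights, and bias below are hypothetical values chosen for illustration; they are not the fitted parameters of the study's model.

```python
def linear_kernel(u, v):
    """Dot product of two feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def decision_function(weights, bias, features):
    """Signed score; its sign indicates the side of the separating hyperplane."""
    return linear_kernel(weights, features) + bias

def predict(weights, bias, features):
    """Label 1 (UGC) for a positive score, otherwise label 0 (MGC)."""
    return 1 if decision_function(weights, bias, features) > 0 else 0

# Hypothetical model parameters (not taken from the study).
feature_names = ["PU", "PEOU", "SAT"]
weights = [0.8, -0.3, 1.1]
bias = -4.0

score = decision_function(weights, bias, [4, 3, 2])
label = predict(weights, bias, [4, 3, 2])

# Interpretability: rank features by the magnitude of their weights.
ranked = sorted(zip(feature_names, weights), key=lambda p: abs(p[1]), reverse=True)
```

Reading the weights directly in this way is what makes a linear model comparatively easy to explain: a larger absolute weight means the feature moves the score more per unit change, provided the features are on comparable scales.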
Overall, the LinearSVC model is a highly accurate and reliable choice for classifying tourist data, given its high accuracy, resistance to overfitting, and interpretability. Its use can lead to the development of more effective and efficient Saudi tourism-related content and digital marketing strategies on social media platforms, meaning that this research contributes to ML, tourism, and marketing.