Article

Empowering Consumer Decision-Making: Decoding Incentive vs. Organic Reviews for Smarter Choices Through Advanced Textual Analysis †

Department of Information Science, University of North Texas, Denton, TX 76203, USA
* Author to whom correspondence should be addressed.
† This paper won the 2023 IEEE CISOSE Best Paper award at the 2023 IEEE International Conference on Artificial Intelligence Testing (IEEE AITest 2023), Athens, Greece, 17–20 July 2023.
Electronics 2024, 13(21), 4316; https://doi.org/10.3390/electronics13214316
Submission received: 1 September 2024 / Revised: 19 October 2024 / Accepted: 28 October 2024 / Published: 2 November 2024

Abstract

Online reviews play a crucial role in influencing seller–customer dynamics. This research evaluates the credibility and consistency of reviews based on volume, length, and content to understand how incentives affect customer review behaviors, review quality, and purchase decision-making. The data analysis reveals major factors, such as cost, support, usability, and product features, that may shape these effects. The analysis also highlights the indirect impact of company size, the direct impact of user experience, and the varying impacts of changing conditions over the years on the volume of incentive reviews. This study uses methodologies such as Sentence-BERT (SBERT), TF-IDF, spectral clustering, t-SNE, A/B testing, hypothesis testing, and bootstrap distribution to investigate how semantic variances in reviews could be used for personalized shopping experiences. It reveals that incentive reviews have minimal to no impact on purchasing decisions, which is consistent with the credibility and consistency analysis in terms of volume, length, and content. The negligible impact of incentive reviews on purchase decisions underscores the importance of authentic online feedback. This research clarifies how review characteristics sway consumer choices and provides strategic insights for businesses to enhance their review mechanisms and customer engagement.

1. Introduction

The internet has revolutionized communication, with social media’s evolution profoundly impacting e-commerce. This transformation reshapes seller–customer relationships and influences consumer decision-making. According to Kargozari et al. [1], online reviews, which are specific types of electronic word-of-mouth (eWOM), are crucial in guiding shoppers’ decisions. They are among the key determinants of online purchasing, alongside other forms of eWOM, price, and website/business reputation [2]. This underscores the importance of understanding review classifications in today’s digital marketplace.
Reviews are categorized by different criteria. Valence or polarity-based classification ranks products or services as positive, negative, or neutral based on the sentiment expressed in customer reviews [3,4]. This can be binary (positive/negative) or ternary (positive/negative/neutral) classification [5]. Aspect-based classification depends on context-specific aspects [6]. For instance, categorizing reviews based on their motives, such as monetary rewards, denotes them as organic (no incentive or non-incentive) [7], incentive (incentivized), or fake reviews. While organic reviews [8,9,10] are based on real experiences and free from external motivation or incentives, some individuals are influenced by incentives, i.e., rewards offered to encourage specific actions.
This results in either genuine incentive reviews [8,9,11,12,13,14,15] based on actual product experiences, or fake reviews without an experiential basis [13,16,17]. Aspect-based sentiment classification combines aspect-based and sentiment classifications to effectively sort review aspects by sentiment [18,19,20]. Star-rating systems categorize reviews by rating to help infer product quality and reduce information asymmetry [21]. In specific contexts like industrial research, satisfaction factors are classified into satisfiers and dissatisfiers [22], as well as criticals and neutrals [23]. These categorizations offer valuable insights into product performance and customer satisfaction, aiding businesses in identifying areas for improvement.
Factors like social presence in online reviews enhance purchase intentions by adding warmth and sociability during the shopping experience [24]. Accurate, credible reviews are crucial for informed customer decisions and mitigating bias in seller descriptions. They positively impact brand perception [25]. Contrary to belief, negative reviews are not inherently more credible than positive ones [26]. Source credibility moderates how review comprehensiveness affects usefulness [27,28], influencing consumer behavior and purchasing decisions [29]. Consistency impacts the credibility of both high and low-quality reviews [30], enhancing online review credibility [31]. While consistency can reduce informational influence [32] and review helpfulness [33,34], it improves perceptions of review usefulness [35] and positively influences brand attitudes [36].
Sellers can influence review quality through the design of review systems, including templates, presentation styles, and metrics [11]. This, in turn, affects the helpfulness and credibility of the reviews, which impacts seller outcomes, reviewer behavior, and consumer perception [37]. Although ensuring truthful, high-quality responses is vital [38], limited research explores methods for generating high-quality reviews [37]. While businesses can save costs with organic reviews [2], the challenge lies in encouraging customers to post them.
The link between review count and sales drives sellers to incentivize reviews [12,39,40]. Minimal agent manipulation, lack of authenticity, and poor incentives can bias feedback [38]. Incentives, with possible positive or negative effects [9], impact review credibility and trust [41]. They can enhance positive sentiments [12], influencing purchase intentions, trust, and satisfaction [2,8].
Customer behavior in posting reviews plays a crucial role in shaping review quality [42,43,44]. Review quality, driven by star ratings and content, affects product evaluation and customer decisions [45]. Verifying authenticity and mitigating fraud is crucial [46] for enhancing review quality, which builds trust and aids in purchase decisions [9]. Incentive reviews can inflate ratings, mislead customers [47], and cause confusion due to their high volume [13]. Therefore, maintaining high review quality is key to ensuring informed customer decisions and bridging the gap between perceived and actual product value.
Existing research highlights the importance of reviews but lacks a comprehensive comparison of the impact of content versus ratings of incentive and organic reviews on review quality and consumer decisions. In particular, there is limited exploration of how incentives affect customers’ review-posting behavior and how review characteristics like volume, length, and content influence a review’s credibility (referring to the trustworthiness and reliability of reviews, particularly how accurately they reflect true product experiences) [25,26,27,28] and consistency (referring to the degree to which incentive and organic reviews of a product or service align across various dimensions, such as semantic content, language use, sentiment, ratings, and distribution patterns) [30,31,32,33,34,35,36]. Moreover, the dynamics of how incentives shape reviews over time remain underexplored, particularly in the context of their impact on trust and purchase decisions.
To address these gaps, this study aims to investigate the distinct impacts of incentive and organic reviews on review quality and consumer decision-making through their content and ratings, focusing on the credibility and consistency of the reviews. Specifically, the study aims to assess how key review characteristics—such as volume, length, and content—affect review quality, customer behavior, and decision-making, particularly in the presence of incentives. By leveraging existing evidence and employing advanced analytical techniques such as Sentence-BERT (SBERT), term frequency-inverse document frequency (TF-IDF), and A/B testing, the research seeks to enhance customer decision-making based on the characteristics of online reviews and provide businesses with actionable insights to optimize review systems. This study proposes novel approaches to enhance review quality by emphasizing review credibility and consistency and evaluating these dimensions based on reviews’ volume, length, and content to assess their impact on consumer purchase decisions and review behavior.
To achieve these goals, we formulated the following three research questions:
RQ1. 
What are the significant differences between incentive and organic reviews?
RQ2. 
How do incentives influence the quality of purchase reviews through changes in customer behavior?
RQ3. 
How does the quality of incentive and organic reviews impact decision-making in purchases?
The following methods were used to identify underlying review patterns and differences. We performed exploratory data analysis (EDA) and sentiment analysis, focusing on the “incentivized” status. We propose a comprehensive analysis using advanced techniques like Sentence-BERT (SBERT) and term frequency-inverse document frequency (TF-IDF) to capture semantic differences and term frequencies within reviews. Spectral clustering categorizes reviews into distinct clusters, distinguishing incentive reviews from organic ones based on semantic content and term frequency. Subsequently, t-distributed stochastic neighbor embedding (t-SNE) visually projects these clusters onto two dimensions, comparing the similarity between incentive and organic reviews. Spectral clustering and t-SNE enhance our understanding of semantic links and provide a detailed analysis of the review landscape. Additionally, A/B testing of review rating scores examines the impact of incentives on customer purchase decisions.
We hypothesize that this deeper understanding can enhance recommendation systems, leading to a more customer-centric shopping experience. Our study establishes a framework for differentiating reviews and assessing their impact on customer behavior, supporting reliable e-commerce solutions.
Our research explores how company size and user experience duration affect the effectiveness of software reviews, an area scarcely addressed in existing literature. Our comprehensive analysis extends beyond general purchase reviews to dissect software-specific feedback, distinguishing between incentive and organic reviews and their effects on consumer decisions. We investigate whether review content or ratings better reflect their impact on review quality and purchasing behavior.
By evaluating review volume, length, and content, we provide insights into reviews’ credibility and consistency. This informs strategies to enhance review quality and offers actionable business recommendations to refine review processes and boost customer engagement, addressing critical research gaps.
The article’s structure is as follows: Section 2 introduces the related work, leading into hypothesis development in Section 3. The methodology is proposed in Section 4. The results are then detailed in Section 5, with further discussion in Section 6. The article concludes with Section 7.

2. Related Work

This research investigates the influence of incentive and organic reviews on review quality, credibility, and consumer behavior. Previous studies have extensively examined how social cues, review content, and ratings shape customer trust and purchasing decisions. However, the specific effects of incentives on review behavior and quality are still being explored. Our study builds on this by analyzing the differences between incentive and organic reviews, providing insights into how these reviews affect decision-making and review credibility.
With techniques such as sentiment analysis, semantic links, clustering, and experimental and hypothesis testing, we offer a deeper understanding of how incentives influence online review dynamics, which can guide businesses in improving review management strategies.

2.1. Online Review

Online reviews significantly influence consumer perceptions and purchasing decisions, with trust in online reviews being equivalent to trust in the recommendations of friends [48]. The goal is to enhance review quantity and quality while minimizing bias [39,40]. Managing online reviews requires distinguishing organic from non-organic reviews, including fake or incentive reviews, to maintain review authenticity and influence customer behavior [46].
Despite efforts to improve review quality, challenges such as bias, authenticity issues, and inconsistent quality persist. Businesses prefer organic reviews for their cost-effectiveness and credibility, although uncertainty about their authenticity can impact purchase decisions. Encouraging sincere customer feedback enhances their influence [2]. Higher review volumes are believed to attract customers and boost credibility [29], but higher reviewer status often leads to more anonymous reviews, raising credibility concerns [49]. The increase in deceptive and incentive reviews undermines authenticity and affects purchase decisions [50].
Figurative language in reviews enhances social connections and influences purchase decisions for experience-based products. The review language style impacts perceived social presence and customer purchasing behavior, particularly by product type [24]. Keywords from numerous reviews offer consistent and credible product evaluations, shaping consumer decisions [51].
Studies show inconsistent results regarding the relationship between online reviews and sales. Meta-analyses highlight different effects of review-related factors on sales, emphasizing their complex role in customer decisions and sales performance [52]. Companies use incentives to boost review volume and sales as more reviews often lead to more purchases; however, the effectiveness of this approach remains contentious. Some believe incentives enhance review quality and quantity, while others argue they lead to overly positive feedback, harming authenticity and helpfulness [12]. Inadequate validation and incentives due to minimal agent effort can result in biased feedback, compromising the trustworthiness of online reviews. This highlights the need for mechanisms to encourage truthful and informative responses [38].

2.2. Incentive vs. Organic

Online reviews influence purchase intentions by affecting customer trust through perceived information quality and social presence [2]. While reviewers’ contribution and readability levels rise initially [14], incentives can also enhance review quality over time and stabilize numerical rating behavior [15]. According to social exchange theory (SET), incentives can motivate social behavior and encourage review writing by fulfilling individual needs [15]. For companies, incentives attract attention [9], increase ratings, reduce returns, and contribute to company success [15].
However, incentive reviews can be fake, where sellers reward reviewers intending to make the reviews appear organic. Network analysis reveals that products with fake reviews are more clustered in the review network, sharing common reviewers, which helps in detecting them with high accuracy [53].
Although disclosing incentives seems more authentic and less betraying [54], which may maintain trust, reduce bias, boost helpfulness, and increase sales and review volume, it may not always enhance credibility [16]. Practices like monitoring for authenticity, exposing fake reviews, building community, and endowing status to reviewers enhance consumer trust in the platform [55]. However, the impact of disclosing statements on product quality judgment depends on whether it is integral or incidental [56]. Prompting reviewers to be more truthful and consumers to be more discerning through disclosure is not supported by empirical evidence [47], as the disclosure of incentive reviews may mislead consumers, misguide consumer decisions, allow for accuracy failures [16], and cause customer dissatisfaction.

2.3. Incentive and Decision-Making in Purchases

Decision-making in purchases involves complex considerations that can be simplified by utility-driven systems providing detailed information [57]. Purchase decisions are influenced by factors like price discounts, shipping offers, and online reviews [52]. The positive relationship between quantity (volume) and quality (credibility) of the reviews strengthens their influence on customer purchase intentions and decisions [29].
Incentives make users more active [58] and increase the number of review writers by aligning with social norms [10], which makes review writing more enjoyable [12]. Therefore, incentives contribute to an increase in both the volume [15] and length of reviews [10,11]. Consequently, the increased volume of provided information aids new customers in making better purchase decisions [10].
According to loss aversion theory, review valence (emotional impact) is more influential than review usefulness in decision-making [59], with incentive reviews positively affecting purchase decisions by increasing review valence [15], which means increasing emotional words in customer reviews [11,12]. Incentive reviews enhance the effectiveness of review signals for new customers [60].
Conversely, existing studies have investigated the importance of avoiding incentives. Offering and accepting incentives can decrease trust because it follows market norms rather than social norms, highlights human behavioral issues, raises moral concerns, increases review fraud that undermines credibility, and establishes conflicting interests between businesses and reviewers [17]. In addition, incentives can also lead to biased positive reviews [10,50]. Despite differing views [10,11], incentives may reduce user effort to write lengthy, informative reviews [61]. Moreover, customers may provide negative reviews, which offer valuable insights [62], when they are uncomfortable receiving incentives for their opinions [10].
The following two tables provide a summary of the existing works: Table 1 outlines the goals, gaps, and methodologies, while Table 2 details the findings, contributions, and limitations. While many existing studies have explored the effects of incentives on review volume, valence, and sales, they often overlook how incentives influence deeper semantic aspects, such as consistency across pros, cons, and descriptions. These works typically focus on high-level metrics like ratings or overall sentiments but do not investigate how language and structure change between incentive and organic reviews.
Our study fills this gap by using SBERT and TF-IDF to capture subtle semantic differences between these review types. SBERT provides deeper insights into semantic relationships, while TF-IDF highlights shifts in term frequency that signal changes in review content.
Additionally, previous research has primarily focused on surface-level sentiment analysis, without fully exploring how incentives influence review credibility over time. By using advanced methods like spectral clustering and t-SNE, our approach provides a more comprehensive understanding of how incentives affect both the consistency and credibility of reviews, offering deeper insights into how incentives impact consumer trust and decision-making.
This comprehensive approach offers a better understanding of how incentives influence not only review sentiment but also the credibility and trustworthiness of reviews, contributing important insights into consumer decision-making.

3. Hypothesis Development

The impact of online reviews on purchase intention and decision-making is well-documented, yet opinions on the nature of this impact vary significantly. Key factors such as reviewer reputation, product age, and incentives can influence the relationship between online reviews, motivation factors, and sales outcomes [52].
To explore these relationships in depth, particularly considering the role of incentives, we propose several hypotheses based on existing literature and our preliminary findings.

3.1. Review Credibility and Consistency

Credibility and consistency are vital dimensions of review quality that determine the overall value and trustworthiness of reviews in shaping consumer decisions. Various factors, such as volume, length, and content, contribute to these values.

3.1.1. Review Credibility

Existing research highlights the significant role of online review volume in influencing perceived credibility and consumer behavior. A larger volume of reviews is often correlated with greater perceived credibility, signaling consumer engagement and product reliability, which positively impacts purchase intention [29]. However, this relationship is complex, as an excessive volume of reviews, particularly from high-status reviewers—which can lead to concerns about anonymity—may diminish perceived credibility [49].
Additionally, deceptive reviews can diminish the credibility of online reviews [50]. Given the diverse perspectives on how the volume of reviews (considering associated contextual factors such as sentiment, rating distribution, and content) influences the credibility of online reviews, and the insufficient evidence on the role of incentives, we propose the following hypothesis:
H1a. 
Incentive reviews are less credible than organic reviews when considering their volume and associated contextual factors.
Similarly, longer reviews tend to be viewed as more credible because they offer more detailed information [15], which positively influences customer decision-making [10,11]. However, in the context of incentive reviews, it is unclear whether the review length contributes similarly to credibility. Thus, we propose the following:
H1b. 
Incentive reviews are less credible than organic reviews based on their average length.
Depending on the purpose of the review, review content includes elements such as overall rating, description, pros and cons, purchase details, and personal demographics that provide insight into the reviewer’s experience. This study focuses on the review description and the pros and cons.
Research highlights the relationship between content, incentives, and perceived authenticity. While one study highlights that the tone and detail of incentive reviews differ from those of organic ones, affecting credibility [55], another study found no significant differences in content, aside from sentiment variation in certain parts of the text, suggesting that incentive reviews may not necessarily be biased [61].
Additionally, some incentive reviews, especially those compensated after a five-star rating, are crafted to appear organic and evade platform detection filters. These reviews are often categorized as fake and tend to cluster in the review network, contrasting with the more dispersed nature of organic reviews [53]. Given the mixed findings regarding the influence of incentives on review content, we propose the following hypothesis:
H1c. 
Incentive reviews are less credible than organic reviews based on their content, due to differences in tone and detail.

3.1.2. Review Consistency

Consistency, while less discussed than credibility, is equally important in evaluating the quality of reviews, reflecting their reliability. Previous studies suggest that when large volumes of reviews are summarized effectively, they can create a more consistent representation of consumer opinions [51]. Recognizing its impact on customer views and decisions, the following hypotheses are proposed:
H2a. 
Incentive reviews are more consistent than organic reviews when considering their volume and associated contextual factors.
The review length is also associated with consistency: longer reviews are typically viewed as more detailed and uniform, providing greater context and evidence to support the reviewer’s experience. However, there is not enough evidence on whether length directly contributes to higher consistency in reviews [67]. Nevertheless, the additional context offered by longer reviews could lead to perceptions of greater consistency, even if this relationship is not always clear. Thus, the following hypothesis is proposed:
H2b. 
Incentive reviews are more consistent than organic reviews based on their average length.

3.2. Impact on Customer Decision-Making

Decision-making in purchasing often relies on online reviews. Incentive reviews, unlike organic ones, may distort this process by creating a network of reviews that lack authentic consumer experiences, making it harder to distinguish genuine reviews and favoring products with manipulated ratings [53]. Incentives also increase review volume and subtly influence their emotional tone, affecting consumer perception [12].
Disclosure practices further impact review credibility [68]. While mandatory disclosure leads to more trustworthy reviews, voluntary disclosure can introduce bias and reduce credibility [16]. When manipulation is unnoticed, incentive reviews may boost purchase intentions, but awareness of manipulation negatively impacts behavior [50].
Given the complex relationship between incentive and organic reviews in shaping customer decisions, we propose the following hypothesis:
H3. 
Incentive reviews have less effect on customer decision-making than organic reviews.

4. Research Methodology

This section outlines a research methodology that distinguishes the impact of incentive versus organic reviews on consumer behavior. Using SBERT and TF-IDF, this study analyzes the semantics and emotional signals in reviews, which are crucial in shaping consumer perceptions and decisions.
This approach provides a robust framework, as shown in Figure 1, for understanding how various forms of online reviews influence consumer trust and purchasing behavior. This offers valuable insights for future studies in online marketing and consumer behavior.

4.1. Data Collection

Data were collected from software review websites, including Capterra (https://www.capterra.com/project-management-software, accessed on 25 October 2021), Software Advice (https://www.softwareadvice.com/project-management, accessed on 1 October 2021), and GetApp (https://www.getapp.com/customer-management-software/crm, accessed on 10 November 2021), containing user-revealed experiences. We focused on review sections, including “Personal Information”, “Itemized Scores”, “Review time and source”, and “Review text”; see Figure 2. Reviews of 1189 software products were scraped using Python code with the aid of the Selenium web scraper [69] and Beautiful Soup [70].
Combining Selenium with Beautiful Soup enhanced web scraping by leveraging Selenium’s ability to handle dynamic content and Beautiful Soup’s fast HTML parsing. Selenium navigated and scrolled through pages, while Beautiful Soup quickly extracted review data, including titles, descriptions, pros, cons, ratings, and review details such as name, date, company, and prior product used. This process generated a CSV file with 43 attributes from 62,423 unique reviews.
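To illustrate this collection step, the following minimal sketch combines Selenium for rendering dynamic pages with Beautiful Soup for parsing, as described above. The listing URL, the “review-card” container class, and the field selectors are illustrative placeholders, not the actual structure of the review sites.

```python
# Minimal scraping sketch; page URL and CSS class names are hypothetical placeholders.
import csv
import time
from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()                                   # Selenium renders dynamic content
driver.get("https://www.capterra.com/project-management-software")  # example listing page
time.sleep(3)                                                 # allow JavaScript-loaded reviews to render
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")  # trigger lazy loading

soup = BeautifulSoup(driver.page_source, "html.parser")       # fast static HTML parsing
rows = []
for card in soup.find_all("div", class_="review-card"):       # hypothetical review container class
    title = card.find("h3")
    pros = card.find("p", class_="pros")                      # hypothetical field selectors
    cons = card.find("p", class_="cons")
    rows.append({
        "title": title.get_text(strip=True) if title else "",
        "pros": pros.get_text(strip=True) if pros else "",
        "cons": cons.get_text(strip=True) if cons else "",
    })
driver.quit()

with open("reviews.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "pros", "cons"])
    writer.writeheader()
    writer.writerows(rows)
```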

4.2. Data Preprocessing

We used Python to preprocess the data, removing “None” values from the incentivized feature, which left 49,998 instances. Null values in other attributes were retained to preserve critical information, with the data categorized by the presence or absence of incentives, as follows:
  • 29,597 incentive reviews and 14,658 organic reviews from Capterra.
  • 861 incentive reviews and 1397 organic reviews from GetApp.
  • 2280 incentive reviews and 1205 organic reviews from Software Advice.
We reclassified reviews based on their incentivized status, consolidating the original five groups into two binary categories. For this purpose, we grouped reviews labeled as “NominalGift” and “VendorReferredIncentivized” into the “Incentive” category. Meanwhile, reviews labeled as “NoIncentive”, “NonNominalGift”, and “VendorReferred” were classified under “NoIncentive”. The results of this reclassification are stored in a new column labeled “Incentivized”.
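A minimal pandas sketch of this binary reclassification is shown below; the name of the raw label column (assumed here to be “incentivized”) and the input file are illustrative placeholders.

```python
# Sketch of the binary reclassification; raw column and file names are assumptions.
import pandas as pd

df = pd.read_csv("reviews.csv")

incentive_labels = {"NominalGift", "VendorReferredIncentivized"}
no_incentive_labels = {"NoIncentive", "NonNominalGift", "VendorReferred"}

def to_binary(label):
    if label in incentive_labels:
        return "Incentive"
    if label in no_incentive_labels:
        return "NoIncentive"
    return None  # unlabeled values are dropped below

df["Incentivized"] = df["incentivized"].map(to_binary)
df = df.dropna(subset=["Incentivized"])  # mirrors the removal of "None" values
```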
In data pre-processing, “expanding contractions” was used to replace shortened words with their full forms, including the original root or base words, ensuring each word could be analyzed individually as a separate token [71]. This step preceded tokenization and was followed by the removal of non-alphabetic and non-numeric characters, such as punctuation marks.
Lemmatization, a natural language processing (NLP) technique, enhances sentiment analysis by converting words to their root form, improving accuracy, and reducing dimensionality. This technique simplifies the recognition of the fundamental meanings of words through morphological analysis without altering their sentiment value [72,73]. It ensures consistency across analytical models, which is crucial for comparative analysis, and aids in identifying differences in how models process review text [74].
Tokenization breaks text into smaller units, such as words or phrases called tokens, aiding in identifying meaningful keywords and enhancing text classification and sentiment analysis accuracy. Removing stop-words like “a”, “an”, and “the” also enhances these processes by reducing noise, text dimensionality, and computational resources [75].
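The pipeline described above (contraction expansion, character cleaning, tokenization, stop-word removal, and lemmatization) can be sketched as follows, assuming the `contractions` and `nltk` packages; the column name is illustrative.

```python
# Minimal preprocessing sketch, assuming the `contractions` and `nltk` packages.
import re
import contractions
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

stop_words = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()

def preprocess(text: str) -> str:
    text = contractions.fix(text)                          # expand contractions first
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)            # drop punctuation and symbols
    tokens = word_tokenize(text.lower())                   # tokenize
    tokens = [t for t in tokens if t not in stop_words]    # remove stop words (includes "not")
    tokens = [lemmatizer.lemmatize(t) for t in tokens]     # reduce words to root forms
    return " ".join(tokens)

df["preprocessed_pros"] = df["pros"].fillna("").apply(preprocess)
```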
Data pre-processing was followed by EDA, sentiment and semantic analyses, spectral clustering, t-SNE, A/B testing, and recommendations. Key attributes used included “incentivized”, “overallRating”, “value_for_money”, “ease_of_use”, “features”, “customer_support”, “likelihood_to_recommend”, “year”, “company_size”, “time_used”, “source”, “preprocessed_pros”, “pros_Sentiment”, “preprocessed_cons”, “cons_Sentiment”, “preprocessed_ReviewDescription”, “ReviewDescription_Sentiment”, and “Incentivized”.

4.3. Data Analysis

4.3.1. Exploratory Data Analysis (EDA)

We conducted EDA to uncover underlying patterns, relationships, and characteristics in our dataset. EDA employed statistical graphics to summarize the dataset’s main features and provided valuable insights to guide subsequent text analysis techniques [76].

4.3.2. Sentiment Analysis

Sentiment analysis was employed to compare the emotional tones of incentive and organic reviews [77], enhancing personalized experiences, informed purchase decisions [64], and business strategies [65,78].
We employed the HuggingFace Transformers library (https://github.com/huggingface/transformers, accessed on 20 January 2022), utilizing various NLP techniques, including lemmatization, tokenization, embedding, and classification, to determine the sentiment polarity and intensity of review texts. Due to a model input limitation of 200 characters, sentiments for review descriptions, pros, and cons were analyzed separately and stored as “ReviewDescription_Sentiment”, “pros_Sentiment”, and “cons_Sentiment”. Spearman’s correlation coefficient measured the correlation between incentive and organic review ratings based on sentiment. This method is well suited to ordinal data, especially when the data do not follow a linear relationship or a normal distribution [79,80].
Considering the review description, incentivized status, and sentiment, a random sample of 4000 reviews per category was analyzed. A 95% confidence interval was determined using the z-test.
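A minimal sketch of this step is given below, using the default model of the Transformers sentiment pipeline (the specific checkpoint used in the study is not assumed) and SciPy’s Spearman correlation; column names follow Section 4.2, and the correlated pair is only an example.

```python
# Sentiment labeling and Spearman correlation sketch; the pipeline's default model is assumed.
from transformers import pipeline
from scipy.stats import spearmanr

sentiment = pipeline("sentiment-analysis")               # default sentiment model

def label(text: str) -> str:
    if not isinstance(text, str) or not text:
        return "NEUTRAL"                                 # sketch choice for empty text
    return sentiment(text[:200])[0]["label"]             # truncate to the 200-character limit

df["pros_Sentiment"] = df["preprocessed_pros"].apply(label)

# Example: Spearman's rank correlation between two ordinal rating attributes
# within the incentive group (the study reports such correlations per group).
inc = df[df["Incentivized"] == "Incentive"]
rho, p_value = spearmanr(inc["overallRating"], inc["likelihood_to_recommend"])
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```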

4.3.3. Semantic Link Analysis

“Semantic links” refer to the relationships between words based on their meanings, derived from semantic networks [81]. To explore connections between incentive and organic reviews, we used TF-IDF to assess the significance of words or phrases by comparing their frequencies within the document to the entire corpus [82]. We randomly selected 15,000 reviews from each of the incentive and organic categories and applied feature extraction to the “preprocessed_CombinedString” to generate trigrams. These trigrams, more meaningful than bigrams, were analyzed for their frequency in each review set. The TF-IDF scores highlighted the importance of trigrams in the document corpus, enabling us to compare and rank trigrams between the two categories by overall score. For further analysis of these frequencies, we calculated cosine similarity, which ranges from −1 (diametrically opposed, dissimilar vectors) to 1 (identical vectors), with 0 (orthogonal vectors, no similarity) in between. This process is supplemented by the t-statistic and p-value to assess differences between the review categories.
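One plausible implementation of this trigram comparison is sketched below, treating each sampled category as an aggregated document before comparing the resulting TF-IDF vectors with cosine similarity and a two-sample t-test; the aggregation is an assumption, not a verbatim reproduction of the study’s code.

```python
# Trigram TF-IDF comparison sketch; each category is aggregated into one document (assumption).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from scipy.stats import ttest_ind

# 15,000 random reviews per category, as described above
inc_sample = df[df["Incentivized"] == "Incentive"]["preprocessed_CombinedString"].sample(15000, random_state=42)
org_sample = df[df["Incentivized"] == "NoIncentive"]["preprocessed_CombinedString"].sample(15000, random_state=42)

vectorizer = TfidfVectorizer(ngram_range=(3, 3))          # trigrams only
tfidf = vectorizer.fit_transform([" ".join(inc_sample), " ".join(org_sample)])

similarity = cosine_similarity(tfidf[0], tfidf[1])[0, 0]  # similarity of trigram profiles
t_stat, p_value = ttest_ind(tfidf[0].toarray().ravel(), tfidf[1].toarray().ravel())
print(f"cosine = {similarity:.3f}, t = {t_stat:.3f}, p = {p_value:.3f}")
```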
To uncover deeper semantic links, we implemented the SBERT model [83] (https://huggingface.co/sentence-transformers/bert-base-nli-mean-tokens, accessed on 22 January 2024), an advanced NLP technique that maps text to a 768-dimensional vector space. Unlike BERT, which focuses on word-level embeddings and understands the contextual meaning of morphological words [73], SBERT generates semantically meaningful sentence-level embeddings for more effective text comparisons using cosine similarity. The two categories of the data, incentive and organic, were used for model deployment. The “SentenceTransformer” package was imported to initiate the “SentenceTransformer” class, automatically downloading the “bert-base-nli-mean-tokens” model, which is adept at capturing sentence semantics. Reviews in each category were encoded into embeddings, and average embeddings were calculated to represent the average semantic content of each review category. Cosine similarity was then used to quantify the similarity between these vectors.
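The SBERT comparison can be sketched as follows: encode each category with the sentence-transformers library, average the embeddings, and compare the mean vectors via cosine similarity.

```python
# SBERT mean-embedding comparison sketch.
import numpy as np
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bert-base-nli-mean-tokens")   # 768-dimensional sentence embeddings

incentive_texts = df.loc[df["Incentivized"] == "Incentive", "preprocessed_CombinedString"].tolist()
organic_texts = df.loc[df["Incentivized"] == "NoIncentive", "preprocessed_CombinedString"].tolist()

incentive_mean = np.mean(model.encode(incentive_texts), axis=0)  # average semantic content
organic_mean = np.mean(model.encode(organic_texts), axis=0)

similarity = util.cos_sim(incentive_mean, organic_mean).item()
print(f"cosine similarity between mean embeddings: {similarity:.3f}")
```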
To support semantic link findings, we also employed spectral clustering and t-SNE.

4.3.4. Spectral Clustering in Topic Modeling

Spectral clustering, an efficient method for clustering large datasets [84], was applied to topic modeling with SBERT, using the same preprocessed data to maintain consistency in the results. Key Python libraries included “SentenceTransformers” for embedding generation, “nltk” for text processing, and “scikit-learn” for classification, clustering, and dimensionality reduction. Essential tools such as “SentenceTransformer” for sentence embeddings, “TruncatedSVD” for dimensionality reduction [85], “SpectralClustering” for clustering, and “silhouette_score” for cluster evaluation were utilized.
We generated embeddings from the “preprocessed_CombinedString” column using the “bert-base-nli-mean-tokens” model via the “SentenceTransformer” framework. This BERT-based model, pre-trained on natural language inference tasks, provided mean-pooled token embeddings that capture the text information needed for clustering. To enhance clustering, we reduced dimensionality to 50 components by applying “truncated singular value decomposition (SVD)” to the embeddings. Cluster analysis was performed using “spectral clustering”, testing 2 to 10 clusters based on the “silhouette score”, which evaluates cluster quality by comparing cohesion within clusters to separation from other clusters. A higher score (range: −1 to 1) indicates better-defined clusters [86]. The silhouette score decreased from 0.150 to 0.050 as the number of clusters increased, identifying 2 clusters as optimal. This transformation into a lower-dimensional space, executed via the “TruncatedSVD” function with “n_components = 50”, was crucial for handling large datasets and improving subsequent analyses.
The “spectral clustering” function was configured with the “nearest_neighbors” affinity and a fixed “random_state” to ensure reproducible results. We calculated the average “silhouette score” for each cluster count to identify the optimal number of clusters. To visualize the clustering in a two-dimensional (2D) space, “TruncatedSVD” was applied again when the reduced embeddings exceeded two dimensions. To understand the characteristics and tendencies of each cluster, we examined the distribution of “Incentive” and “NoIncentive” cases within each cluster and calculated key metrics such as the mean and standard deviation for each review type. Finally, we visualized the clustered data in 2D, highlighting the incentivized status of the reviews to clarify the data structure.
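A minimal sketch of this clustering step follows, reusing SBERT embeddings of the combined review text; the parameter values stated above (50 SVD components, 2 to 10 clusters, nearest-neighbors affinity) are kept, and everything else uses library defaults.

```python
# Spectral clustering sketch: SBERT embeddings -> truncated SVD -> silhouette-based cluster scan.
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import SpectralClustering
from sklearn.metrics import silhouette_score

texts = df["preprocessed_CombinedString"].tolist()
embeddings = SentenceTransformer("bert-base-nli-mean-tokens").encode(texts)

reduced = TruncatedSVD(n_components=50, random_state=42).fit_transform(embeddings)

best_k, best_score = None, -1.0
for k in range(2, 11):                                    # test 2 to 10 clusters
    labels = SpectralClustering(
        n_clusters=k, affinity="nearest_neighbors", random_state=42
    ).fit_predict(reduced)
    score = silhouette_score(reduced, labels)             # cohesion vs. separation, range -1..1
    if score > best_score:
        best_k, best_score = k, score

print(f"optimal number of clusters: {best_k} (silhouette = {best_score:.3f})")
```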

4.3.5. t-Distributed Stochastic Neighbor Embedding (t-SNE)

The t-SNE is an unsupervised machine learning algorithm that visualizes high-dimensional data by mapping each data point into a two- or three-dimensional space. This non-linear dimensionality reduction improves upon stochastic neighbor embedding (SNE) by reducing the tendency to crowd points together in the map center, facilitating better visualizations that help in identifying patterns and clusters [87]. For this purpose, we integrated the “t-SNE” model using Python libraries such as “SentenceTransformers” for advanced text embeddings, “scikit-learn” for machine learning algorithms, and “plotly” for interactive plots. We used “SentenceTransformer” with the “bert-base-nli-mean-tokens” model to encode review texts into vector embeddings that encapsulate the semantics of the text. The 2D visualizations, created from two components, facilitate exploring complex relationships in high-dimensional text data. We evaluated the embedding quality using the “Kullback–Leibler (KL) divergence” to ensure the reduced dimensions accurately represented the original data and to validate the integrity and reliability of the dimensionality reduction.
We then used the “DBSCAN clustering” algorithm to segment t-SNE results and facilitate review analysis. A color palette was generated to assign a unique color to each cluster, which was plotted as distinct scatter traces in the t-SNE-reduced space, annotating the data with the “incentivized” status and count. This method effectively visualized the distribution of incentive and organic reviews within clusters, providing an interactive view of the clustering dynamics.
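The projection and segmentation can be sketched as below, reusing the SBERT embeddings from the previous sketch; the DBSCAN parameters are illustrative assumptions, not the values used in the study.

```python
# t-SNE projection plus DBSCAN segmentation sketch; eps/min_samples are illustrative.
from sklearn.manifold import TSNE
from sklearn.cluster import DBSCAN

tsne = TSNE(n_components=2, random_state=42)
coords = tsne.fit_transform(embeddings)                 # 2D coordinates per review
print(f"KL divergence: {tsne.kl_divergence_:.2f}")      # fit quality of the low-dimensional map

clusters = DBSCAN(eps=3.0, min_samples=10).fit_predict(coords)   # label -1 marks noise
df["tsne_x"], df["tsne_y"], df["tsne_cluster"] = coords[:, 0], coords[:, 1], clusters
```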

4.4. Statistical Testing and Validation

4.4.1. A/B Testing

A/B testing, a popular controlled experiment also known as split testing, was conducted to compare incentive (A) and organic (B) reviews. We analyzed customer reviews to test the null hypothesis of no significant differences between the two groups. Using 10,000 repetitions, we measured mean differences across six rating attributes: “overAllRating”, “value_for_money”, “ease_of_use”, “features”, “customer_support”, and “likelihood_to_recommend”.
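One common way to implement such a repetition-based comparison is a permutation test on the group labels, sketched below for a single rating attribute; the two-sided p-value construction is an assumption about the study’s exact procedure.

```python
# Permutation-style A/B test sketch (10,000 repetitions) for one rating attribute.
import numpy as np

def ab_test(a: np.ndarray, b: np.ndarray, n_rep: int = 10_000, seed: int = 42):
    rng = np.random.default_rng(seed)
    observed = a.mean() - b.mean()                        # observed mean difference
    pooled = np.concatenate([a, b])
    diffs = np.empty(n_rep)
    for i in range(n_rep):
        rng.shuffle(pooled)                               # break any group effect
        diffs[i] = pooled[:len(a)].mean() - pooled[len(a):].mean()
    p_value = np.mean(np.abs(diffs) >= abs(observed))     # two-sided p-value
    return observed, p_value

a = df.loc[df["Incentivized"] == "Incentive", "overAllRating"].to_numpy()
b = df.loc[df["Incentivized"] == "NoIncentive", "overAllRating"].to_numpy()
print(ab_test(a, b))
```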

4.4.2. Hypothesis Testing and Bootstrap Distribution

Hypothesis testing and bootstrap distribution validated the robustness of the A/B testing results. These methods provided further statistical support to ensure the differences observed between incentive and organic reviews were significant and not due to random variation, reinforcing the reliability of the A/B testing outcomes.
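A minimal bootstrap sketch that complements the permutation test above: resample each group with replacement and build a percentile confidence interval for the mean difference.

```python
# Bootstrap distribution sketch for the mean difference between review groups.
import numpy as np

def bootstrap_diff_ci(a, b, n_rep=10_000, alpha=0.05, seed=42):
    rng = np.random.default_rng(seed)
    diffs = [
        rng.choice(a, size=len(a), replace=True).mean()
        - rng.choice(b, size=len(b), replace=True).mean()
        for _ in range(n_rep)
    ]
    lower, upper = np.percentile(diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper          # an interval excluding zero suggests a non-random difference

a = df.loc[df["Incentivized"] == "Incentive", "overAllRating"].to_numpy()
b = df.loc[df["Incentivized"] == "NoIncentive", "overAllRating"].to_numpy()
print(bootstrap_diff_ci(a, b))
```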

4.5. Recommendation

A/B testing revealed that organic reviews have a stronger impact on customer decisions. To improve customer experience, we developed a recommendation system using TF-IDF and SBERT. Users could input preferences as queries, which were matched to organic reviews and their corresponding listing IDs, providing the top five most similar reviews for better decision-making.
We reevaluated data preprocessing to ensure accurate ground-truth data labeling. Organic reviews were extracted and stratified by ensuring each product ID was proportionally represented in both the training and test datasets. For this purpose, we first filtered out the listing IDs with fewer than two reviews and then applied “StratifiedShuffleSplit” (stratified on the listing IDs) to split the organic reviews into 60% “main_data” and 40% “ground_truth_data”. TF-IDF vectorization with trigram consideration was used to convert text into numerical vectors. We applied the Euclidean norm (L2 norm) to ensure that the vector length does not influence the model’s behavior in the similarity calculation. The same vectorizer was used to process user queries, calculating cosine similarities between queries and review vectors while considering listing IDs. Using SBERT for semantic search, we installed the “SentenceTransformer” library, preprocessed and split the data, and used the “bert-base-nli-mean-tokens” model to generate embeddings that captured the semantic essence of reviews. Cosine similarity identified the top five similar reviews based on these embeddings. We evaluated both models by calculating precision, recall, F1 score, accuracy, match ratio (1), and mean reciprocal rank (MRR) (2). The models provided the top five most relevant reviews for a given query based on cosine similarity, together with their associated listing IDs representing the specific product. The definitions and formulas of the match ratio and MRR metrics are given below.
$$\text{Match Ratio} = \frac{\text{Number of Matching Top Reviews}}{\text{Total Number of Top Reviews}} \tag{1}$$
  • Script-wise match ratio: number of top reviews identified by the model that were actually present in the ground truth data (to evaluate the relevance of the model’s predictions)
$$\mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i} \tag{2}$$

where $|Q|$ is the number of queries and $\mathrm{rank}_i$ is the rank of the first relevant answer for query $i$.
  • Script-wise MRR: calculates average reciprocal ranks of results for a query set (evaluating ranking-based system performance where the order of the results matters).
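The retrieval and the two metrics can be sketched as follows; `main_data`, `ground_truth_data`, and the `listing_id` column are illustrative names for the splits and identifiers described above, not confirmed variable names from the study.

```python
# SBERT-based top-5 retrieval plus match ratio and MRR sketch; data/column names are assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bert-base-nli-mean-tokens")
review_embeddings = model.encode(main_data["preprocessed_CombinedString"].tolist())

def top5_listing_ids(query: str) -> list:
    """Return the listing IDs of the five reviews most similar to the query."""
    scores = util.cos_sim(model.encode(query), review_embeddings)[0].numpy()
    top_idx = np.argsort(scores)[::-1][:5]
    return main_data.iloc[top_idx]["listing_id"].tolist()

def match_ratio(predicted: list, relevant: set) -> float:
    # Share of the top-5 recommendations present in the ground-truth data (Equation (1))
    return sum(p in relevant for p in predicted) / len(predicted)

def mean_reciprocal_rank(all_predicted: list, all_relevant: list) -> float:
    # Average reciprocal rank of the first relevant result per query (Equation (2))
    reciprocal_ranks = []
    for predicted, relevant in zip(all_predicted, all_relevant):
        rank = next((i + 1 for i, p in enumerate(predicted) if p in relevant), None)
        reciprocal_ranks.append(1.0 / rank if rank else 0.0)
    return float(np.mean(reciprocal_ranks))
```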

5. Results and Analysis

This study reveals how incentive and organic reviews affect consumer trust, purchasing decisions, and perceived product quality through the empirical analysis of content and ratings. These findings enhance our understanding of online consumer behavior and offer new insights into the complex relationship between review traits and consumer reactions.

5.1. EDA Results

After removing null values from the “incentivized” feature, the EDA shows that—among the 49,998 remaining reviews—44,255 are from Capterra, 3485 are from Software Advice, and 2258 are from GetApp. The five review categories derived from the “incentivized” feature are categorized into two groups. The first two categories, 29,466 “NominalGift” and 3272 “VendorReferredIncentivized”, have 32,738 reviews labeled as “Incentive”; the last three, i.e., 16,812 “NoIncentive”, 90 “NonNominalGift”, and 358 “VendorReferred”, contain 17,260 reviews labeled as “NoIncentive”.

5.2. Sentiment Analysis Results

Table 3 shows that incentive reviews outnumber organic reviews for ratings of 2 and above. Zero scores for cost and customer support are prevalent in incentive reviews, negatively impacting the overall product recommendations. This trend suggests a critical relationship between incentive reviews and a decline in product endorsement, particularly highlighting cost and customer support issues. The data highlight biases in rating patterns and their effect on product perception and consumer trust.
Table 4 illustrates the changing volume of software reviews from 2017 to 2021, with a noticeable peak in positive incentive reviews in 2018 followed by a decline in 2020. This pattern may suggest that external factors, such as social media engagement, adjustments in incentive programs, and a surge in the posting of genuine feedback, could influence review trends. The reduction in review volumes, particularly for incentive reviews, might be associated with factors like the COVID-19 pandemic, diminished customer trust due to increasing awareness, and policies limiting incentive reviews.
Users with over two years of product usage experience tend to post more positive and fewer negative incentive reviews, likely due to product familiarity and incentive benefits. In contrast, new users in the free trial phase post fewer reviews, mostly incentive reviews, indicating limited experience may increase susceptibility to incentives, as shown in Table 4.
Table 4 highlights that smaller companies (i.e., 11–50 employees) have over 7000 reviews, mainly positive incentive reviews, while larger companies (i.e., 5001–10,000 employees) have fewer than 510 reviews. This disparity likely stems from smaller companies being easier to establish and more likely to offer incentives for posting reviews.
Among the 324 product listing IDs, 264 contain both incentive and organic reviews, 34 have only organic reviews, and 26 include only incentive reviews. In the 264 IDs with both review types, the review descriptions comprise 32,508 incentive reviews (23,676 positive vs. 8832 negative) and 16,952 organic reviews (12,847 positive vs. 4105 negative). A similar pattern, with more positive than negative entries, holds for the pros of incentive reviews (24,082 positive vs. 8426 negative) and organic reviews (12,875 positive vs. 4105 negative). For the cons, however, incentive reviews show more negative comments (26,439 negative vs. 6069 positive) than organic reviews (12,481 negative vs. 447 positive). This pattern is consistent across the 34 IDs with only 317 organic reviews and the 26 IDs with 230 incentive reviews, where incentive reviews generally present more favorable outlooks but also more critical cons compared to organic reviews, reflecting a similar skew across all categories.
The word cloud results in Table 5 reveal the top 20 words extracted from each review text, including descriptions, pros, and cons, emphasizing the impact of incentivization. Words such as “great” and “good” dominate both positive incentive reviews and organic reviews, frequently appearing even in negative review descriptions and pros. The absence of typically negative terms may result from the removal of negative words like “not” as stop words during preprocessing, which may inadvertently filter out expressions of negative sentiment. This aligns with prior research indicating that positive incentive reviews are longer. Notably, positive incentive reviews contain a higher volume of top words than organic reviews, suggesting that incentives strongly influence review content. In contrast, fewer top words in negative incentive reviews indicate a reluctance to leave negative feedback when incentivized. This discrepancy in word frequency underscores the complexity of assessing review authenticity. It suggests that incentives may amplify positive sentiment, potentially skewing perceptions of product quality. Table 5 presents these findings for review descriptions.
Measuring the average length of review descriptions reveals that negative organic reviews, 153.91 characters, are longer than negative incentive reviews, 125.17 characters. This suggests that organic reviews may reflect more detailed dissatisfaction and deeper reviewer engagement. Negative reviews, in general, tend to be longer than positive ones. The close average lengths of positive reviews—104.41 characters for incentive reviews and 104.14 for organic reviews—imply that incentives do not significantly impact the level of detail users provide. This indicates that users share similar levels of positive sentiment regardless of whether they are incentivized, which aligns with previous studies [10,11].
Our “Spearman’s rank correlation coefficient” test on review rating scores, considering the 95% confidence interval, reveals a stronger correlation among organic reviews, especially negative ones; see Figure 3. The highest correlation, at 0.80, is observed between “likelihood_to_recommend” and “overAllRating”, driven by strong correlations of “overAllRating” with “features” at 0.78 and “ease_of_use” at 0.76. Similarly, it is influenced by strong correlations of “likelihood_to_recommend” with “features” at 0.73 and “ease_of_use” at 0.72. A similar but weaker pattern appears in negative incentive reviews. The correlation between “features” and “ease_of_use” highlights the value of user-friendly software. Additionally, the correlation between “value_for_money” and “customer_support” is weaker in negative incentive reviews, 0.60, compared to negative organic ones, 0.66. All correlations are significant with 95% confidence, as shown by p-values of zero.

5.3. Semantic Link Results

Semantic links compare the contents of incentive and organic reviews using TF-IDF and SBERT methods.

5.3.1. Semantic Link Results Using TF-IDF

Trigram analysis with the TF-IDF technique reveals distinct language differences between organic and incentive reviews, as shown in Table 5. Organic reviews feature unique terms like “sensitive content hidden” and “everything one place”, while incentive reviews often use phrases like “project management tool” and “steep learning curve”.
However, phrases such as “great customer service” and “software easy use” are common to both with varying frequencies.
Despite overlaps in trigrams across both categories, differences in order and prevalence highlight distinct priorities and areas of focus. Organic (“NoIncentive”) reviews emphasize “software easy use”, while incentive reviews focus on a “project management tool”. A cosine similarity of 0.675 between incentive and organic reviews suggests a moderate to high similarity in their trigram representations. This indicates that offering incentives does not significantly alter review language. A t-test result of −0.867 shows a lower average TF-IDF score for incentive reviews compared to organic reviews, but the small t-statistic indicates a minimal difference in means. The difference is not statistically significant, as evidenced by the p-value of 0.389, which is higher than 0.05. This implies that any content differences between incentive and organic reviews are likely due to chance rather than a systematic cause.

5.3.2. Semantic Link Results Using Sentence-BERT

We used the “SentenceTransformer” model to capture the contextual meaning of the reviews using embeddings. The average embeddings for both review categories, summarized into mean vectors, showed a cosine similarity of 0.999. This indicates that both types of reviews are nearly identical in topics, information, and sentiments, suggesting that incentives may not impact the content or language of the reviews.

5.4. Spectral Clustering in Topic Modeling Results

Supporting the semantic link results, spectral clustering uncovers nonlinear similarities among the review groups. The “silhouette scores” for 2 to 10 clusters decrease from 0.125 to 0.048, with 2 clusters showing the highest score. This indicates optimal clustering that maximizes within-cluster similarity and minimizes between-cluster similarity. This suggests a binary nature of reviews, primarily distinguished by incentivized status.
Statistical analysis reveals mean values of 0.15 for incentive reviews and 0.17 for organic reviews, with nearly identical standard deviation values (0.36 for incentive reviews and 0.37 for organic reviews). This indicates slight differences between the two review category clusters but similar distribution patterns across clusters. Despite the higher volume of incentive reviews, both types are evenly distributed, implying that incentives influence review volume but not their inherent characteristics.
The spectral clustering graph, Figure 4, shows reviews plotted based on two principal components derived from “TruncatedSVD”. Each point on the plot represents a review, positioned by its first and second SVD components. The axis ranges, derived from the SVD transformation, reflect the variance captured by each component, transforming the original high-dimensional data into a two-dimensional space for visualization.
The two clusters highlight the concentration of data points with red star centroids marking average positions. In each cluster, incentive reviews outnumber organic reviews by nearly 2:1, mirroring the actual dataset ratio. A gradient from dark to light indicates each point’s cluster affiliation, showing organic reviews have a broader spread. Darker areas suggest higher review densities. After SVD dimensionality reduction, reviews were grouped by semantic content, confirming high cosine similarity. This visual clustering, using spectral clustering, reinforces the patterns identified by cosine similarity, validating the method’s robustness.

5.5. t-SNE Results

The t-SNE reduced our high-dimensional data to two dimensions, enabling more effective data exploration after testing semantic similarities in review text through embeddings generated by the SBERT model. While t-SNE itself is not a clustering algorithm, combining it with “DBSCAN (density-based spatial clustering of applications with noise)” effectively grouped the data and identified noise. The t-SNE visualization quickly highlighted outliers, aiding in assessing the impact of noisy data, which may uncover unique insights into customer experiences, as shown in Figure 5. The t-SNE mapped reviews into clusters, revealing semantic relationships and trends in customer feedback, which are crucial in product development, customer service, and recommendations. Due to the high density of certain clusters, especially in the center of the graph, circles and crosses overlap to form shapes that resemble squares, making individual markers often indistinguishable in areas where reviews share close semantic similarities.
Cluster −1 (DBSCAN’s noise label), containing 0.058% of the data (28 reviews), represents noise, which is crucial to identify for accurate analysis. The majority of the data (99.89%) falls into Cluster 0, indicating central density and consistency. The dominance of this cluster suggests that most reviews share semantic similarities, aiding in analyzing customer preferences.
The first and second t-SNE components represent the data structures in two dimensions, with each x and y pair reflecting the relative positioning of reviews to one another. The axis values result from the algorithm’s scaling to fit data into two dimensions and are arbitrary, with no fixed reference outside this model. These values primarily reflect the structure and relationships within the high-dimensional space.
The cost function used, “Kullback–Leibler (KL) divergence”, measured 4.42. The model reduced dimensions and iteratively adjusted data in two dimensions to minimize KL divergence. In t-SNE, KL divergence measures the difference between the high-dimensional probability distribution, representing input data similarities, and the low-dimensional probability distribution, representing data point similarities in the compressed two-dimensional space. A KL divergence of 4.42 suggests that while the low-dimensional representation may not capture all high-dimensional characteristics, one distinct cluster meets the desired outcome, making the KL divergence’s exact value less critical.

5.6. Statistical Testing and Validation Results

The analysis of hypothesis testing, bootstrap distribution, and A/B testing reveals key differences in how incentive and organic reviews impact customer decision-making; see Table 6. Organic reviews consistently show higher variability, as evidenced by a standard deviation of 0.913 for the “overall rating” compared to 0.702 for incentive reviews and 1.350 for “ease of use” compared to 0.890, respectively. This suggests that despite incentive reviews, organic reviews capture a broader range of customer experiences, reflecting more diverse opinions that could lead to more informed decision-making. In the “value for money” category, the observed difference of 0.296, supported by a t-value of −16.294 and a p-value of 0.000, indicates that incentive reviews may underestimate the true values of products, which could potentially mislead customers regarding cost-effectiveness, negatively impacting their purchasing decisions. Furthermore, the “customer support” attribute shows a significant difference of 0.505 (t-value of −26.961, p-value of 0.000), indicating that organic reviews are more critical and likely provide a more accurate assessment of support quality, a factor crucial for customer satisfaction and retention. Conversely, for the “ease of use” and “features” attributes, despite higher ratings in incentive reviews, the non-significant p-values (both 1.000) suggest that these differences do not substantially impact customer perception. Additionally, the “likelihood to recommend” attribute, with a difference of −0.177 and a p-value of 1.000, indicates that incentives do not significantly alter customers’ willingness to recommend a product, thus having minimal influence on this aspect of decision-making.
Smaller standard errors in incentive reviews suggest more consistent estimates, offering a stable but narrower perspective, while larger errors in organic reviews reflect greater variability, offering a broader but less precise perspective. This variability in organic reviews can lead to more informed decision-making by offering a wider range of perspectives, although with less certainty. The 5% significance threshold ensures reliability, emphasizing that organic reviews, despite their variability, provide a more comprehensive and accurate foundation for making informed purchasing decisions.
Furthermore, regardless of review sentiment, the results of statistical testing indicate greater variability in the length of organic reviews based on the difference in standard deviation (132.151 for organic reviews compared to 107.383 for incentive reviews). The slightly higher standard error for organic reviews, 1.006 vs. 0.593, suggests less precision in the average length estimate. The observed difference in length is minimal, at 0.304, with a non-significant p-value of 1.000 and a t-value of −0.260, indicating that this difference is not statistically significant and is unlikely to impact customer perceptions or decision-making. This suggests that the review length is consistent across both review types, with minimal influence from incentives; see Table 6.
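As a concrete illustration of this testing procedure, the following is a minimal sketch of the kind of two-sample comparison reported in Table 6: an observed mean difference, a Welch t-test, and a bootstrap distribution of the difference. The synthetic arrays and the attribute they stand for are assumptions for illustration only, not the study's data.

import numpy as np
from scipy import stats

def compare_groups(incentive, organic, n_boot=10_000, seed=0):
    rng = np.random.default_rng(seed)
    observed = incentive.mean() - organic.mean()                           # observed difference
    t_stat, p_val = stats.ttest_ind(incentive, organic, equal_var=False)   # Welch t-test

    # Bootstrap distribution of the mean difference (resampling with replacement).
    boot = np.empty(n_boot)
    for i in range(n_boot):
        boot[i] = (rng.choice(incentive, size=incentive.size).mean()
                   - rng.choice(organic, size=organic.size).mean())
    ci = np.percentile(boot, [2.5, 97.5])                                  # 95% bootstrap interval
    return observed, t_stat, p_val, ci

# Hypothetical "value for money" ratings, for illustration only.
incentive = np.random.default_rng(1).normal(4.2, 0.7, 3000).clip(0, 5)
organic = np.random.default_rng(2).normal(4.5, 0.9, 1500).clip(0, 5)
print(compare_groups(incentive, organic))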

5.7. Recommendation Results

To enhance customer decision-making, we focused on organic reviews based on our A/B testing results, which showed their stronger influence on customer choices. We tested six queries, detailed in Table 7. The first three are unique, user-generated preferences, and the last three are variations of existing organic reviews: a complete review, a partial excerpt of that complete review, and a synonym-substituted version of it.
Comparing the top five listing IDs and similarity scores, Table 8 shows that SBERT significantly outperforms TF-IDF in all queries. The TF-IDF keyword identification method excels with simpler queries, whether seen or unseen, but fails to capture deeper semantics. The lower similarity scores and frequent zero scores for TF-IDF highlight its limitations in aligning with the query’s meaning. SBERT, being context-aware, generally provides better results with complex, seen texts than with simple, unseen texts. It also performs well with the synonym-replaced query 6 (Q 6), but struggles with overly simple texts (e.g., Q 3).
To compare the TF-IDF and Sentence-BERT models, in addition to evaluating the top five listing IDs and similarity scores, we assessed their key metrics; see Table 9. Both models achieved perfect precision (1.000) for all queries, indicating all top five recommended IDs were relevant. The optimal MRR of 1.000 confirms that the top five listing IDs were always the first relevant results. SBERT’s higher similarity scores and MRR highlight its effectiveness in identifying closely related listing IDs. Despite high accuracy (0.995) for both models, the models’ low recall scores indicate a limited ability to find all relevant IDs. Although the split was stratified to ensure a fair representation of data in the analysis, a combination of high precision and low recall suggests a potential data imbalance, reflected in a low F1 score. This implies that the models correctly identified most matches but missed a significant number of relevant items, possibly due to an imbalance in the distribution of review listing IDs. Additionally, the dataset’s imbalanced sentiment distribution and the presence of detailed or complex reviews among more generic ones may affect the model’s matching accuracy. Despite these challenges, both models show a perfect match ratio (1.000) for both seen and unseen queries, indicating consistent accuracy in matching the top five results with the ground truth data.
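To make the reported metrics concrete, the following is a minimal sketch of computing precision@5 and the mean reciprocal rank (MRR) over a set of queries; the listing IDs and relevant sets are hypothetical placeholders, not values from Table 9.

def precision_at_k(retrieved, relevant, k=5):
    # Fraction of the top-k retrieved listing IDs that are relevant.
    top_k = retrieved[:k]
    return sum(1 for item in top_k if item in relevant) / k

def reciprocal_rank(retrieved, relevant):
    # 1 / rank of the first relevant listing ID, or 0 if none is retrieved.
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical retrieved listing IDs per query and their relevant sets.
retrieved_ids = [["p12", "p7", "p3", "p44", "p9"], ["p5", "p12", "p8", "p1", "p2"]]
relevant_ids = [{"p12", "p7", "p3", "p44", "p9"}, {"p5", "p12", "p8", "p1", "p2"}]

precisions = [precision_at_k(r, rel) for r, rel in zip(retrieved_ids, relevant_ids)]
mrr = sum(reciprocal_rank(r, rel) for r, rel in zip(retrieved_ids, relevant_ids)) / len(retrieved_ids)
print("mean precision@5:", sum(precisions) / len(precisions), "MRR:", mrr)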

6. Discussion

This research delves into how online reviews offer insights into consumer decision-making, expanding our understanding of the motivational factors behind purchasing decisions. It challenges existing assumptions by exploring the effectiveness of disclosure practices, revealing how consumers interpret incentive and organic reviews differently. This section starts by comparing these review types to address the first research question. It then answers the second and third research questions by discussing the impact of incentives on review posting behavior, review quality, and purchasing decisions, ultimately determining which type of review better assists customer decision-making.

6.1. Incentive vs. Organic

The analysis shows that incentive reviews often have more positive descriptions and pros, more negative cons, and generally higher ratings, with a minority of lower scores. Unlike organic reviews, the volume of incentive reviews has fluctuated significantly over the years, influenced by factors such as pandemics, economic issues, social platform growth, changes in incentive program structure, a rise in authentic feedback, growing customer skepticism, stricter regulations on incentive reviews, and the expansion of smaller companies. Despite these differences, incentive and organic reviews share a significant common language, suggesting that incentives do not entirely change the focus or sentiment of reviews. Customers may therefore view both review types as having similar content, which may encourage companies to shift from offering incentives to improving advertising and consumer awareness. Based on the statistical results, compared with organic reviews, incentive reviews have a higher total rating and a more consistent overall rating.

6.2. Incentives, Customer Behavior, and Review Quality

To answer the question “How do incentives influence the quality of purchase reviews through changes in customer behavior?”, we rely on our findings from the first research question.
Considering customer behavior and its impact on review quality, the findings show that incentives significantly increase review volume. Reviewers often rate products more positively, even when giving negative feedback, driven by the expectation of rewards, as reflected in the higher sum of ratings for incentive reviews. Despite changes over time due to factors like commerce and technology, incentive reviews consistently outnumber organic ones, supporting H2a and indicating that incentives increase the likelihood of posting reviews and slightly influence customer perspectives. However, the higher volume of incentive reviews is associated with lower credibility and increased bias, often due to non-experience-based content aimed at boosting ratings. This issue is more pronounced in smaller companies, where fake reviews may undermine credibility, supporting H1a. While experienced users provide more credible reviews, incentive reviews tend to show less consistent ratings than organic reviews, as positive ratings increase and negative ones decrease, leading to the rejection of H2a for experience-related reviews. Regarding cost and customer support, the significant number of zero ratings in incentive reviews, compared to organic ones, shows that incentives do not always increase positivity. Both review types show similar volumes of zero ratings for recommendation likelihood, indicating greater consistency in organic reviews, especially considering that the overall volume of organic reviews is almost half that of incentive reviews, which does not support H2a. A higher negative-to-positive cons ratio in organic reviews suggests customer sensitivity to negative feedback, enhancing the credibility of negative reviews and supporting H1a. Additionally, negative software reviews correlate more strongly than positive ones, indicating that negative reviews are often more credible [41].
While our study emphasizes the relationship between review volume and credibility, it is important to acknowledge that a high volume of short or superficial reviews could potentially mislead consumers. This is particularly evident in incentive reviews, where lack of detail may compromise credibility despite the larger number of reviews. Our analysis of the review length demonstrates that organic reviews tend to be longer and more detailed, and generally offer more comprehensive insights. This indicates that lengthier reviews are associated with higher credibility and more valuable consumer feedback. This observation underscores that while volume is a factor, the depth and detail of the reviews are also critical in determining their trustworthiness.
The high frequency of the top 20 words in positive incentive reviews indicates bias and reduced credibility compared to organic reviews, supporting H1a. The frequent use of company-favoring phrases in incentive reviews, compared to customer-oriented phrases in organic ones, underscores the greater credibility of organic reviews, based on the volume of customer-related phrases. This observation further supports H1a and suggests that incentives may reduce content diversity and increase consistency in review content, thus supporting H2a. The emphasis on specific content, such as "project management tool", in incentive reviews points to targeted promotion, potentially undermining credibility and supporting H1c.
The higher volume of these words in negative organic reviews suggests more detailed, lengthy, and credible content, thus supporting H1b. Longer, more informative negative organic reviews reveal higher credibility and lower bias compared to incentive reviews, further supporting H1b. While the average length of positive incentive reviews and organic reviews is similar, 104.41 vs. 104.14, the greater volume of incentive reviews results in more consistent lengths, supporting H2b. However, the higher rating scores in incentive reviews, driven by rewards, raise credibility concerns. Consistency in length does not equate to higher quality, as incentives can reduce the authenticity and reliability of reviews. Understanding the direction of incentive reviews can help guide customers toward more authentic feedback.
While our study highlights patterns of incentive reviews in the software sector, findings in other sectors show both similarities and differences. For example, in the retail context, incentives often lead to biased, positively skewed reviews and influence purchase decisions, similar to our observations in software reviews [50]. However, the impact of incentive disclosures may vary. Studies on social sampling reveal that undisclosed incentives can significantly alter consumer perceptions [68]. In retail, disclosing incentives can mislead consumers and reduce credibility, with concerns about accuracy emerging [16]. While some findings may generalize across sectors, the differences suggest that each sector exhibits specific dynamics, warranting further exploration of sector-based variations.
Our analysis captures how different emotional tones related to positive, neutral, or negative sentiments influence customer decision-making, particularly alongside attributes such as “features”, “customer_support”, and “value_for_money”. For instance, organic reviews, which tend to be more critical and varied in nature, reflect deeper emotional tones by offering a broader range of perspectives, contributing to more informed decision-making. This aligns with the study’s objectives of exploring review quality, particularly credibility. The variability in sentiment and review characteristics reveals that organic reviews, particularly negative ones, are generally more credible and detailed compared to incentive reviews. The higher standard deviation observed in attributes like customer support implicitly highlights the broader range of emotional tones within negative sentiments, providing a deeper understanding of customer feedback. These findings underscore the importance of emotional tone in influencing consumer trust and purchase decisions, especially in contexts like customer support and cost, where review authenticity is crucial.

6.3. Incentive Review and Decision-Making in Purchases

A higher volume of incentive reviews often emphasizes rating and quantity over content, resulting in a less accurate and comprehensive view of the products or services. This reduces the consistency of incentive reviews, leading to lower credibility and less informed purchase decisions. A willingness to post positive incentive reviews, reflected in the higher sum of all rating scores, may indicate excessive positivity, which could affect purchase decisions.
Incentive reviews emphasize project management tools, favoring businesses and boosting brand recognition, while organic reviews focus more on customer-friendly aspects like “customer support” and “ease of use”. This business-centric content in incentive reviews can reduce their credibility and trustworthiness for new customers, supporting H3.
Statistical validation of these findings was conducted using A/B testing, hypothesis testing, and bootstrap distribution, as shown in Table 4. The observed differences and t-values demonstrate the statistical significance of key attributes, such as overall rating, value for money, and customer support. For example, customer support ratings showed a statistically significant difference (p = 0.000), confirming that incentives influence customer ratings. The empirical p-values further reinforce that these results are unlikely to have occurred by chance. Further, these statistical results show that incentive reviews have less influence on purchase decisions based on the overall rating, have little to no impact on decisions regarding software costs and features, and have no significant effect on ease of use. Additionally, incentive reviews negatively impact decisions related to customer support, all of which support H3.
Moreover, the use of bootstrapping ensures that these findings are robust and reliable across various samples. This method validates that the observed patterns in incentive versus organic reviews are not random but reflect genuine behavioral differences. Thus, while incentive reviews may present higher ratings overall, their credibility and impact on purchase decisions are statistically confirmed to be lower than organic reviews in some key areas, such as customer support.

6.4. Recommendations and Decision-Making in Purchases

Based on semantic link analysis, spectral clustering, and t-SNE results, which showed the minimal content difference between incentive and organic reviews, we conducted statistical tests, revealing that organic reviews have a greater influence on customer decision-making, supporting H3. We then developed a recommendation system using SBERT and TF-IDF, emphasizing organic reviews to present the top five most relevant products based on customer preferences.
It is crucial to consider the nature of the data and the objectives of the study. Therefore, we stratified the split of organic reviews by product ID, ensuring a fair, proportional representation of each product in the analysis in line with the research goal of examining consumer reviews, which leads to more reliable and generalizable models. This approach captures the true distribution of reviews and honest consumer experiences, reflecting actual customer satisfaction and product performance. The uneven review counts per product ID highlight real-world product popularity and offer insights into customer behavior and market trends. The findings from this approach offer a realistic and relevant exploration of consumer opinions, and this methodological choice provides a solid foundation for the conclusions drawn and the recommendations made from this analysis.
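The following is a minimal sketch of a split stratified by product ID, so that each product keeps a proportional share of its reviews in both partitions; the file name, column names, and split ratio are illustrative assumptions about the dataset's schema rather than the exact configuration used here.

import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed input: one row per organic review, with a product_id column.
reviews = pd.read_csv("organic_reviews.csv")

train, test = train_test_split(
    reviews,
    test_size=0.2,
    stratify=reviews["product_id"],   # keep per-product proportions in both splits
    random_state=42,
)

# The per-product shares in the training split should mirror the full dataset.
print(train["product_id"].value_counts(normalize=True).head())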
Initially, we used “seen queries” to evaluate the model, ensuring its performance, functional integrity, and consistent responses in similar scenarios while identifying potential issues. Although this approach with seen data proved beneficial for initial verification, we recognized the need to extend validation to unseen data. This step was crucial to mitigate the risk of high-performance metrics due to overfitting and more accurately assess the model’s generalization ability. Our methodology involved using detailed but not overly complex review content to optimize the recommendation model’s performance. Despite the challenges of an imbalanced dataset typical of online reviews, the SBERT model’s high performance demonstrated its potential for achieving reliable and accurate outcomes with careful model selection.

6.5. Comparison of the Proposed Approach with the State-of-the-Art

The comparative analysis between the more advanced SBERT model and the baseline TF-IDF model demonstrates SBERT’s superior performance in recommendation effectiveness and semantic link applications. SBERT achieved a near-perfect cosine similarity score of 0.999, compared to 0.675 for TF-IDF, highlighting its enhanced ability to discern similarities among review groups. In particular, for complex review texts, SBERT consistently outperforms TF-IDF in handling semantic relationships, as discussed in detail in the recommendation results shown in Table 8.
Although SBERT requires more computational resources due to its complex sentence embeddings and high-dimensional vector cosine similarity calculations, it offers richer semantic insights. In contrast, TF-IDF is faster and more computationally efficient, making it better suited for large-scale real-time applications where speed and scalability are prioritized over deeper semantic understanding. However, for this study, where a deeper understanding of review content is critical, SBERT provides more accurate and semantically rich results.
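As an illustration of the trade-off described above, the following is a minimal sketch that scores the same query against the same reviews with both pipelines: sparse TF-IDF vectors and dense Sentence-BERT embeddings, each compared by cosine similarity. The model name and example texts are assumptions for demonstration, not the study's configuration.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

query = "affordable project management tool with responsive customer support"
reviews = [
    "Great value for money, and the support team answers quickly.",
    "Steep learning curve, but the feature set is rich.",
]

# TF-IDF: fast, lexical; only overlapping words raise the score.
tfidf = TfidfVectorizer().fit(reviews + [query])
tfidf_scores = cosine_similarity(tfidf.transform([query]), tfidf.transform(reviews))

# SBERT: slower, contextual; captures paraphrases and synonyms.
sbert = SentenceTransformer("all-MiniLM-L6-v2")   # assumed model choice
sbert_scores = cosine_similarity(sbert.encode([query]), sbert.encode(reviews))

print("TF-IDF:", tfidf_scores.round(3))
print("SBERT:", sbert_scores.round(3))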
When compared to recent state-of-the-art models, SBERT continues to show strong performance. A recent study reported an 89% accuracy using BERT for sentiment analysis, which is comparable to SBERT’s performance in analyzing complex content [65]. Similarly, another study achieved 91% accuracy in sentiment scoring with RoBERTa, highlighting SBERT’s broader capabilities in processing complex text [88].
In another comparison, a hybrid recommendation system using bi-LSTM achieved 93.39% accuracy [89]. While bi-LSTM performs well in sentiment classification, SBERT’s strength lies in its ability to manage semantic link analysis, providing a more comprehensive approach to software review recommendations. Additionally, the effectiveness of Transformer models in semantic-rich tasks like conversational AI further supports SBERT’s role in advanced text understanding [90].
In summary, SBERT offers deep semantic insights and strong recommendation capabilities, positioning it as a highly competitive model for review analysis, comparable to state-of-the-art approaches in sentiment classification.

6.6. Comparable Analysis with Existing Studies

Our study explored how the review length, volume, and content affect credibility, consistency, and purchase decisions, considering the incentivized and sentiment status of the reviews. We focused on the differences between incentive and organic reviews and found that the impact of these factors varies depending on the review’s nature and context. Extensive research already exists on this matter.
A meta-analysis underscores the importance of consistent, well-argued reviews in enhancing eWOM credibility and shaping customer perception [91]. Another study examines how review characteristics such as volume, ratings, and length affect sales, finding that volume and ratings positively influence sales but review length does not [52]. Qiu [29] highlighted a positive correlation between review volume and credibility, influencing customer purchase intentions. However, our findings suggest that a high volume of incentive reviews may introduce bias and negatively impact review quality. Li [92] noted that review volume and valence can inflate customer expectations and increase return rates. Studies show that financial incentives can lead to dishonest reviews [66]. While undisclosed incentive reviews can boost purchases [50], they ultimately reduce trust [9], aligning with our findings.
Furthermore, while other research explores the complex interaction between review volume and emotional tendencies on product diffusion [93], our study did not consider emotions in conjunction with volume. Similar to our findings, Tang [61] observed that incentive reviews do not significantly differ in content from organic ones except in the sentiment of certain parts of the review text, suggesting they might not always lead to biased reviews. Moreover, while incentive reviews might appear more positive due to length and ratings, their sentiment content is not substantially different [40].

7. Conclusions and Future Work

This study examines the impact of incentive versus organic reviews on consumer decisions through advanced textual analysis, offering strategic insights for businesses and online review platforms. It advises companies to preserve review authenticity and develop robust methods to detect incentive reviews, ensuring trust. Our findings were validated through rigorous statistical significance tests, including A/B testing, hypothesis testing, and bootstrap distribution, further enhancing the reliability of our conclusions. The statistical results confirmed that while incentive reviews tend to yield higher ratings, they possess less credibility and consistency compared to organic reviews, particularly in critical aspects such as customer support. This study also contributes to managing online consumer feedback, suggesting that maintaining a balance between incentive and organic reviews is crucial to managing online feedback in e-commerce.
This research employs a comprehensive suite of methodologies, including EDA, sentiment analysis, semantic link analysis using SBERT and TF-IDF, spectral clustering, t-SNE, statistical analysis, and recommendations to explore differences and semantic variances in reviews and their impact on consumer behavior and purchase decisions. EDA provided an initial understanding of review distribution, highlighting key features and outliers in data. Sentiment analysis assessed the emotional tone of the review, revealing how incentives might influence consumer sentiment. The SBERT model captured semantic differences between incentive and organic reviews, while TF-IDF quantified word importance, identifying key terms that distinguish review types. Semantic link analysis using TF-IDF further explored how specific terms contribute to the perceived helpfulness of reviews.
Spectral clustering, using SBERT embeddings, grouped similar reviews based on semantic content, effectively categorizing incentive versus organic reviews by underlying themes or sentiments. The t-SNE then visualized these clusters in two-dimensional space, clearly distinguishing between the two review types. A/B testing, hypothesis testing, and bootstrap distribution were conducted to statistically assess the impact of review types on consumer behavior and decision-making. Finally, a recommendation system integrating SBERT and TF-IDF was developed to enhance personalized shopping experiences by matching consumer preferences with relevant product reviews.
We also formulated six hypotheses, several of which are strongly supported by our results. However, the hypotheses related to length and volume may be supported or rejected depending on the diversity of the situations to which they apply.
This research highlights the minimal impact of incentive reviews on purchasing decisions, emphasizing the importance of authentic online feedback. Our use of SBERT outperformed traditional models like TF-IDF, improving our ability to analyze the semantic differences in reviews that influence consumer perceptions and actions.
Existing studies often yield conflicting results due to differences in study populations, research methods, and approaches. For instance, Woolley and Sharif [12] found that incentives make writing reviews more enjoyable, while Garnefeld et al. [15] noted that incentives increase review rates. In contrast, another study showed that incentivized customers tend to leave negative reviews. The authors also noted that incentives boost the volume and length of reviews, providing new customers with more information for better purchase decisions [10]. While our findings support some existing research and oppose others, these opposing results underscore the need for further investigation across diverse populations and methods. Such research is crucial for improving the quality of online reviews and recommendation systems, potentially through collaborations among companies.

7.1. Implications of the Study

Our research uniquely contributes to the fields of e-commerce and consumer behavior by employing a multifaceted analytical approach that significantly advances the understanding of how incentive reviews influence consumer perceptions and behavior. Using advanced methodologies such as Sentence-BERT (SBERT), TF-IDF, and t-SNE, we explored deep semantic variances in reviews, assessed their impacts on consumer decision-making, and analyzed key semantic and qualitative review characteristics. Additionally, statistical testing, including bootstrap distribution and hypothesis testing, contributes to robust evidence regarding review quality and credibility. These combined approaches provide important insights into the differences between incentive and organic reviews while also offering practical insights for businesses on how to optimize review systems to improve authenticity and credibility. Together, these contributions underscore the significance of this research in both academic understanding and real-world applications.
Although incentive software reviews currently outnumber organic ones by nearly two-to-one, this could shift due to factors like time, business size, platform, and user awareness and experience. Key factors like cost, software features, ease of use, and customer support strongly influence ratings in both incentive and organic reviews. Despite the high volume and ratings of incentive reviews, our findings suggest they may not significantly impact customer purchase decisions.
Our research highlights how organic reviews influence consumer decisions and demonstrates how advanced models like SBERT and TF-IDF can manage this impact despite dataset imbalances. For e-commerce experts, integrating these insights can lead to more customer-centric, trustworthy, and engaging recommendation systems.
Our approach, balancing seen and unseen data, sets a standard for future research in data-driven decision support systems. This study enriches the existing literature and offers practical insights for businesses to refine review strategies and enhance customer engagement. It also contributes to the broader discourse on how AI and machine learning can be intentionally applied to enhance user experiences in digital marketplaces, advocating for continued exploration and innovation in online review systems.

7.2. Strengths and Limitations

This study has wide applicability beyond the fast-growing world of reviews. As business growth and product diversity intensify sales competition, accessing high-quality reviews becomes essential. Our research can enhance product review systems, boosting customer satisfaction by saving time and money. Moreover, our unique contribution, which focuses on software review quality, particularly incentive reviews, distinguishes our work from existing research in the field. Our study’s strength lies in its rigorous methodology, combining advanced sentiment analysis, SBERT and TF-IDF semantic analyses, and innovative use of clustering algorithms and statistical analysis. This comprehensive approach deepens our understanding of review quality and consumer behavior.
While this work has strengths, it also has weaknesses. Current methods struggle to accurately determine review sentiment due to the subjective nature of human emotions and expressions. A large dataset makes manual sentiment annotation impractical, and even human efforts can be error-prone. The challenges posed by the large volume of data necessitate using automated tools. Despite their efficiency, these tools often fail to fully capture the spectrum of emotions expressed in reviews, especially in longer reviews.
The model’s constraints limit our sentiment analysis to processing only 200 characters per review. This limitation may have caused important details in longer reviews to be missed, potentially overlooking critical context and sentiments that offer deeper insights into consumer attitudes and behaviors.
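To make this constraint concrete, the following is a minimal sketch of how a review longer than the limit would be truncated before sentiment scoring; the pipeline's default model and the sample text are assumptions for illustration, not the exact tooling used in this study.

from transformers import pipeline

# Assumed sentiment pipeline with its default model.
classifier = pipeline("sentiment-analysis")

review = ("The interface is clean and onboarding was smooth, but after the trial "
          "ended the pricing changed and customer support stopped responding. ") * 5

truncated = review[:200]          # only the first 200 characters are analyzed
print(classifier(truncated))      # later details in the review never reach the model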

7.3. Future Work

Building on our analysis of differences in purchase reviews and the assessment of review credibility and consistency, we gained insights into how incentive reviews influence customer decisions. Our research used EDA, sentiment analysis, semantic links, spectral clustering, t-SNE, and statistical testing to assess review quality, examine its impact on purchase decisions, and enhance our recommendation algorithm.
We aim to extend our research by surveying new software users to gather and analyze feedback, addressing the subjectivity in online reviews and decision-making processes. We plan to explore key review quality dimensions—objectivity [94,95], depth [96,97], authenticity [98,99], and helpfulness [100,101]—with a focus on the incentivized status of the reviews. This comprehensive analysis will compare user perceptions of incentive review quality with their actual impact on purchasing behavior. We also plan to refine our recommendation system by integrating NLP techniques to incorporate sentiment analysis, offering deeper insights into customer reviews and their impact on the e-commerce landscape.
While our current sentiment analysis captures positive, neutral, and negative reviews, future research could explore how varying ranges of emotional tones—such as highly positive and moderately positive within these categories—offer deeper insights into how customer decisions vary across categories like price concern or product quality. More subtle variations in emotional feedback might play a more significant role in decision-making.
Our study also opens new avenues for investigating the complexities of online review systems across various domains. Future research could examine the long-term impact of incentive reviews on consumer trust and preference. We are also interested in investigating cross-cultural differences in how consumers perceive and respond to incentive versus organic reviews, considering diverse behaviors and trust mechanisms across regions. Developing advanced analytical tools is essential for better detecting authentic reviews and identifying manipulative practices on online platforms. Leveraging AI and machine learning to automate review analysis could significantly enhance the credibility and usefulness of online reviews, boosting consumer trust and enabling more informed purchasing decisions.

Author Contributions

Conceptualization, K.K. and J.D.; methodology, K.K., J.D. and H.C.; software, K.K.; validation, K.K. and J.D.; formal analysis, K.K.; investigation, K.K.; resources, K.K. and H.C.; data curation, K.K.; writing—original draft preparation, K.K. and J.D.; writing—review and editing, K.K., J.D. and H.C.; visualization, K.K.; supervision, K.K. and J.D.; project administration, K.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science Foundation, NSF award nos. 2225229 and 2231519.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author. The data are not publicly available due to privacy.

Acknowledgments

This paper is a substantially extended version of the IEEE AITest 2023 conference paper “Evaluating the Impact of Incentive/Non-incentive Reviews on Customer Decision-making”. The authors would like to thank Bhanu Prasad Gollapudi for contributing to the data collection and preparation.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kargozari, K.; Ding, J.; Chen, H. Evaluating the Impact of Incentive/Non-incentive Reviews on Customer Decision-making. In Proceedings of the 2023 IEEE International Conference on Artificial Intelligence Testing (AITest), Athens, Greece, 17–20 July 2023; pp. 160–168. [Google Scholar]
  2. Zhu, L.; Li, H.; Wang, F.; He, W.; Tian, Z. How online reviews affect purchase intention: A new model based on the stimulus-organism-response (S-O-R) framework. Aslib J. Inf. Manag. 2020, 72, 463–488. [Google Scholar] [CrossRef]
  3. Yu, Y.; Yang, Y.; Huang, J.; Tan, Y. Unifying Algorithmic and Theoretical Perspectives: Emotions in Online Reviews and Sales. MIS Q. 2023, 47, 127–160. [Google Scholar] [CrossRef]
  4. Alqaryouti, O.; Siyam, N.; Abdel Monem, A.; Shaalan, K. Aspect-based sentiment analysis using smart government review data. Appl. Comput. Inform. 2024, 20, 142–161. [Google Scholar] [CrossRef]
  5. Alamoudi, E.S.; Alghamdi, N.S. Sentiment classification and aspect-based sentiment analysis on yelp reviews using deep learning and word embeddings. J. Decis. Syst. 2021, 30, 259–281. [Google Scholar] [CrossRef]
  6. Jain, D.K.; Boyapati, P.; Venkatesh, J.; Prakash, M. An intelligent cognitive-inspired computing with big data analytics framework for sentiment analysis and classification. Inf. Process. Manag. 2022, 59, 102758. [Google Scholar] [CrossRef]
  7. Qiao, D.; Rui, H. Text performance on the vine stage? The effect of incentive on product review text quality. Inf. Syst. Res. 2023, 34, 676–697. [Google Scholar] [CrossRef]
  8. Petrescu, M.; O’Leary, K.; Goldring, D.; Ben Mrad, S. Incentivized reviews: Promising the moon for a few stars. J. Retail. Consum. Serv. 2018, 41, 288–295. [Google Scholar] [CrossRef]
  9. Ai, J.; Gursoy, D.; Liu, Y.; Lv, X. Effects of offering incentives for reviews on trust: Role of review quality and incentive source. Int. J. Hosp. Manag. 2022, 100, 103101. [Google Scholar] [CrossRef]
  10. Burtch, G.; Hong, Y.; Bapna, R.; Griskevicius, V. Stimulating Online Reviews by Combining Financial Incentives and Social Norms. Manag. Sci. 2018, 64, 2065–2082. [Google Scholar] [CrossRef]
  11. Costa, A.; Guerreiro, J.; Moro, S.; Henriques, R. Unfolding the characteristics of incentivized online reviews. J. Retail. Consum. Serv. 2019, 47, 272–281. [Google Scholar] [CrossRef]
  12. Woolley, K.; Sharif, M. Incentives Increase Relative Positivity of Review Content and Enjoyment of Review Writing. J. Mark. Res. 2021, 58, 539–558. [Google Scholar] [CrossRef]
  13. Imtiaz, M.N.; Ahmed, M.T.; Paul, A. Incentivized Comment Detection with Sentiment Analysis on Online Hotel Reviews. Authorea 2020. [Google Scholar] [CrossRef]
  14. Zhang, M.; Wei, X.; Zeng, D. A matter of reevaluation: Incentivizing users to contribute reviews in online platforms. Decis. Support Syst. 2020, 128, 113158. [Google Scholar] [CrossRef] [PubMed]
  15. Garnefeld, I.; Helm, S.; Grötschel, A.K. May we buy your love? psychological effects of incentives on writing likelihood and valence of online product reviews. Electron. Mark. 2020, 30, 805–820. [Google Scholar] [CrossRef]
  16. Cui, G.; Chung, Y.; Peng, L.; Zheng, W. The importance of being earnest: Mandatory vs. voluntary disclosure of incentives for online product reviews. J. Bus. Res. 2022, 141, 633–645. [Google Scholar] [CrossRef]
  17. Luca, M.; Zervas, G. Fake it till you make it: Reputation, competition, and yelp review fraud. Manag. Sci. 2016, 62, 3412–3427. [Google Scholar] [CrossRef]
  18. Li, H.; Bruce, X.B.; Li, G.; Gao, H. Restaurant survival prediction using customer-generated content: An aspect-based sentiment analysis of online reviews. Tour. Manag. 2023, 96, 104707. [Google Scholar] [CrossRef]
  19. Alhumoud, S.O.; Al Wazrah, A.A. Arabic sentiment analysis using recurrent neural networks: A reviews. Artif. Intell. Rev. 2022, 55, 707–748. [Google Scholar] [CrossRef]
  20. Samah, K.A.F.A.; Jailani, N.S.; Hamzah, R.; Aminuddin, R.; Abidin, N.A.Z.; Riza, L.S. Aspect-Based Classification and Visualization of Twitter Sentiment Analysis Towards Online Food Delivery Services in Malaysia. J. Adv. Res. Appl. Sci. Eng. Tech. 2024, 37, 139–150. [Google Scholar]
  21. Martin-Fuentes, E.; Fernandez, C.; Mateu, C.; Marine-Roig, E. Modelling a grading scheme for peer-to-peer accommodation: Stars for Airbnb. Int. J. Hosp. Manag. 2018, 69, 75–83. [Google Scholar] [CrossRef]
  22. Singh, H.P.; Alhamad, I.A. Deciphering key factors impacting online hotel ratings through the lens of two-factor theory: A case of hotels in the makkah city of Saudi Arabia. Int. Trans. J. Eng. Manag. Appl. Sci. Technol. 2021, 12, 1–12. [Google Scholar]
  23. Singh, H.P.; Alhamad, I.A. A Novel Categorization of Key Predictive Factors Impacting Hotels’ Online Ratings: A Case of Makkah. Sustainability 2021, 14, 16588. [Google Scholar] [CrossRef]
  24. Liu, Z.; Lei, S.H.; Guo, Y.L.; Zhou, Z.A. The interaction effect of online review language style and product type on consumers’ purchase intentions. Palgrave Commun. 2020, 6, 1–8. [Google Scholar] [CrossRef]
  25. Chakraborty, U.; Bhat, S. Credibility of online reviews and its impact on brand image. Manag. Res. Rev. 2018, 41, 148–164. [Google Scholar] [CrossRef]
  26. Mackiewicz, J.; Yeats, D.; Thornton, T. The Impact of Review Environment on Review Credibility. J. Bus. Res. 2016, 59, 71–88. [Google Scholar] [CrossRef]
  27. Aghakhani, N.; Oh, O.; Gregg, D.; Jain, H. How Review Quality and Source Credibility Interacts to Affect Review Usefulness: An Expansion of the Elaboration Likelihood Model. Inf. Syst. Front. 2022, 25, 1513–1531. [Google Scholar] [CrossRef]
  28. Filieri, R.; Hofacker, C.F.; Alguezaui, S. What makes information in online consumer reviews diagnostic over time? The role of review relevancy, factuality, currency, source credibility and ranking score. Comput. Hum. Behav. 2018, 80, 122–131. [Google Scholar] [CrossRef]
  29. Qiu, K.; Zhang, L. How online reviews affect purchase intention: A meta-analysis across contextual and cultural factors. Data Inf. Manag. 2023, 8, 100058. [Google Scholar] [CrossRef]
  30. Zhao, K.; Stylianou, A.C.; Zheng, Y. Sources and impacts of social influence from online anonymous user reviews. Inf. Manag. 2018, 55, 16–30. [Google Scholar] [CrossRef]
  31. Tran, V.D.; Nguyen, M.D.; Lương, L.A. The effects of online credible review on brand trust dimensions and willingness to buy: Evidence from Vietnam consumers. Cogent Bus. Manag. 2022, 9, 2038840. [Google Scholar] [CrossRef]
  32. Hung, S.W.; Chang, C.W.; Chen, S.Y. Beyond a bunch of reviews: The quality and quantity of electronic word-of-mouth. Inf. Manag. 2023, 60, 103777. [Google Scholar] [CrossRef]
  33. Aghakhani, N.; Oh, O.; Gregg, D. Beyond the Review Sentiment: The Effect of Review Accuracy and Review Consistency on Review Usefulness. In Proceedings of the International Conference on Information Systems (ICIS), Seoul, Republic of Korea, 10–13 December 2017. [Google Scholar]
  34. Xie, K.L.; Chen, C.; Wu, S. Online Consumer Review Factors Affecting Offline Hotel Popularity: Evidence from Tripadvisor. J. Travel Tour. Mark. 2016, 33, 211–223. [Google Scholar] [CrossRef]
  35. Aghakhani, N.; Oh, O.; Gregg, D.G.; Karimi, J. Online Review Consistency Matters: An Elaboration Likelihood Model Perspective. Inf. Syst. Front. 2021, 23, 1287–1301. [Google Scholar] [CrossRef]
  36. Wu, H.H.; Tipgomut, P.; Chung, H.F.; Chu, W.K. The mechanism of positive emotions linking consumer review consistency to brand attitudes: A moderated mediation analysis. Asia Pacific J. Mark. Logist. 2020, 32, 575–588. [Google Scholar] [CrossRef]
  37. Gutt, D.; Neumann, J.; Zimmermann, S.; Kundisch, D.; Chen, J. Design of review systems—A strategic instrument to shape online reviewing behavior and economic outcomes. J. Strateg. Inf. Syst. 2019, 28, 104–117. [Google Scholar] [CrossRef]
  38. Kamble, V.; Shah, N.; Marn, D.; Parekh, A.; Ramchandran, K. The Square-Root Agreement Rule for Incentivizing Objective Feedback in Online Platforms. Manag. Sci. 2023, 69, 377–403. [Google Scholar] [CrossRef]
  39. Le, L.T.; Ly, P.T.M.; Nguyen, N.T.; Tran, L.T.T. Online reviews as a pacifying decision-making assistant. J. Retail. Consum. Serv. 2022, 64, 102805. [Google Scholar] [CrossRef]
  40. Zhang, H.; Yang, A.; Peng, A.; Pieptea, L.F.; Yang, J.; Ding, J. A Quantitative Study of Software Reviews Using Content Analysis Methods. IEEE Access 2022, 10, 124663–124672. [Google Scholar] [CrossRef]
  41. Kusumasondjaja, S.; Shanka, T.; Marchegiani, C. Credibility of online reviews and initial trust: The roles of reviewer’s identity and review valence. J. Vacat. Mark. 2012, 18, 185–195. [Google Scholar] [CrossRef]
  42. Jamshidi, S.; Rejaie, R.; Li, J. Characterizing the dynamics and evolution of incentivized online reviews on Amazon. Soc. Netw. Anal. Min. 2019, 9, 22. [Google Scholar] [CrossRef]
  43. Gneezy, U.; Meier, S.; Rey-Biel, P. When and why incentives (don’t) work to modify behavior. J. Bus. Res. 2011, 25, 191–210. [Google Scholar] [CrossRef]
  44. Chen, T.; Samaranayake, P.; Cen, X.; Qi, M.; Lan, Y.C. The Impact of Online Reviews on Consumers’ Purchasing Decisions: Evidence from an Eye-Tracking Study. Front. Physiol. 2022, 13, 2723. [Google Scholar] [CrossRef] [PubMed]
  45. Noh, Y.G.; Jeon, J.; Hong, J.H. Understanding of Customer Decision-Making Behaviors Depending on Online Reviews. Appl. Sci. 2023, 13, 3949. [Google Scholar] [CrossRef]
  46. Truong Du Chau, X.; Toan Nguyen, T.; Khiem Tran, V.; Quach, S.; Thaichon, P.; Jo, J.; Vo, B.; Dieu Tran, Q.; Viet Hung Nguyen, Q. Towards a review-analytics-as-a-service (raaas) framework for smes: A case study on review fraud detection and understanding. Australas. Mark. J. 2024, 32, 76–90. [Google Scholar] [CrossRef]
  47. Park, S.; Shin, W.; Xie, J. Disclosure in Incentivized Reviews: Does It Protect Consumers? Manag. Sci. 2023, 69, 7009–7021. [Google Scholar] [CrossRef]
  48. Bigne, E.; Chatzipanagiotou, K.; Ruiz, C. Pictorial content, sequence of conflicting online reviews and consumer decision-making: The stimulus-organism-response model revisited. J. Bus. Res. 2020, 115, 403–416. [Google Scholar] [CrossRef]
  49. Zhang, Z.; Zhang, Z.; Liu, S.; Zhang, Z. Are high-status reviewers more likely to seek anonymity? Evidence from an online review platform. J. Retail. Consum. Serv. 2024, 78, 103792. [Google Scholar] [CrossRef]
  50. Zhong, M.; Yang, H.; Zhong, K.; Qu, X.; Li, Z. The Impact of Online Reviews Manipulation on Consumer Purchase Decision Based on The Perspective of Consumers’ Perception. J. Internet Technol. 2023, 24, 1469–1476. [Google Scholar] [CrossRef]
  51. Lu, B.; Ma, B.; Cheng, D.; Yang, J. An investigation on impact of online review keywords on consumers’ product consideration of clothing. J. Theor. Appl. Electron. Commer. Res. 2023, 18, 187–205. [Google Scholar] [CrossRef]
  52. Li, K.; Chen, Y.; Zhang, L. Exploring the influence of online reviews and motivating factors on sales: A meta-analytic study and the moderating role of product category. J. Retail. Consum. Serv. 2020, 55, 102107. [Google Scholar] [CrossRef]
  53. He, S.; Hollenbeck, B.; Overgoor, G.; Proserpio, D.; Tosyali, A. Detecting fake-review buyers using network structure: Direct evidence from Amazon. Proc. Natl. Acad. Sci. USA 2022, 119, e2211932119. [Google Scholar] [CrossRef] [PubMed]
  54. Gerrath, M.H.; Usrey, B. The impact of influencer motives and commonness perceptions on follower reactions toward incentivized reviews. Int. J. Res. Mark. 2021, 38, 531–548. [Google Scholar] [CrossRef]
  55. Beck, B.B.; Wuyts, S.; Jap, S. Guardians of Trust: How Review Platforms Can Fight Fakery and Build Consumer Trust. J. Mark. Res. 2023, 61, 00222437231195576. [Google Scholar] [CrossRef]
  56. Du Plessis, C.; Stephen, A.T.; Bart, Y.; Goncalves, D. When in Doubt, Elaborate? How Elaboration on Uncertainty Influences the Persuasiveness of Consumer-Generated Product Reviews When Reviewers Are Incentivized. SSRN Electron. J. 2016, 59, 2821641. [Google Scholar] [CrossRef]
  57. Yin, H.; Zheng, S.; Yeoh, W.; Ren, J. How online review richness impacts sales: An attribute substitution perspective. J. Assoc. Inf. Sci. Technol. 2021, 72, 901–917. [Google Scholar] [CrossRef]
  58. Jamshidi, S.; Rejaie, R.; Li, J. Trojan horses in amazon’s castle: Understanding the incentivized online reviews. In Proceedings of the 10th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2018), Barcelona, Spain, 28–31 August 2018; pp. 335–342. [Google Scholar]
  59. Jia, Y.; Liu, I.L. Do consumers always follow “useful” reviews? The interaction effect of review valence and review usefulness on consumers’ purchase decisions. J. Assoc. Inf. Sci. Technol. 2018, 69, 1304–1317. [Google Scholar] [CrossRef]
  60. Siering, M.; Muntermann, J.; Rajagopalan, B. Explaining and predicting online review helpfulness: The role of content and reviewer-related signals. Decis. Support Syst. 2018, 108, 1–12. [Google Scholar] [CrossRef]
  61. Tang, M.; Xu, Z.; Qin, Y.; Su, C.; Zhu, Y.; Tao, F.; Ding, J. A Quantitative Study of Impact of Incentive to Quality of Software Reviews. In Proceedings of the 9th International Conference on Dependable Systems and Their Applications (DSA 2022), Wulumuqi, China, 4–5 August 2022; pp. 54–63. [Google Scholar]
  62. Li, X.; Wu, C.; Mai, F. The effect of online reviews on product sales: A joint sentiment-topic analysis. Inf. Manag. 2019, 56, 172–184. [Google Scholar] [CrossRef]
  63. Danilchenko, K.; Segal, M.; Vilenchik, D. Opinion Spam Detection: A New Approach Using Machine Learning and Network-Based Algorithms. In Proceedings of the Sixteenth International AAAI Conference on Web and Social Media (ICWSM 2022), Atlanta, GA, USA, 6–9 June 2022; Volume 11, pp. 125–134. [Google Scholar]
  64. Liu, Z.; Liao, H.; Li, M.; Yang, Q.; Meng, F. A deep learning-based sentiment analysis approach for online product ranking with probabilistic linguistic term sets. IEEE Trans. Eng. Manag. 2023. [Google Scholar] [CrossRef]
  65. Ali, H.; Hashmi, E.; Yayilgan Yildirim, S.; Shaikh, S. Analyzing Amazon Products Sentiment: A Comparative Study of Machine and Deep Learning, and Transformer-Based Techniques. Electronics 2024, 13, 1305. [Google Scholar] [CrossRef]
  66. Victor, V.; James, N.; Dominic, E. Incentivised dishonesty: Moral frameworks underlying fake online reviews. Int. J. Consum. Stud. 2024, 48, e13037. [Google Scholar] [CrossRef]
  67. Husain, A.; Alsharo, M.; Jaradat, M.I.R. Content-rating consistency of online product review and its impact on helpfulness: A fine-grained level sentiment analysis. Interdiscip. J. Inf. Knowl. Manag. 2023, 18, 645–666. [Google Scholar] [CrossRef] [PubMed]
  68. Liao, J.; Chen, J.; Jin, F. Social free sampling: Engaging consumer through product trial reports. Inf. Technol. People. 2023, 36, 1626–1644. [Google Scholar] [CrossRef]
  69. Joseph, E.; Munasinghe, T.; Tubbs, H.; Bishnoi, B.; Anyamba, A. Scraping Unstructured Data to Explore the Relationship between Rainfall Anomalies and Vector-Borne Disease Outbreaks. In Proceedings of the 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 15–18 December 2021; pp. 4156–4164. [Google Scholar]
  70. Dogra, K.S.; Nirwan, N.; Chauhan, R. Unlocking the Market Insight Potential of Data Extraction Using Python-Based Web Scraping on Flipkart. In Proceedings of the 2023 International Conference on Sustainable Emerging Innovations in Engineering and Technology (ICSEIET), Ghaziabad, India, 14–15 September 2023; pp. 453–457. [Google Scholar]
  71. Naseem, U.; Razzak, I.; Eklund, P.W. A survey of pre-processing techniques to improve short-text quality: A case study on hate speech detection on Twitter. Multimed. Tools Appl. 2021, 80, 35239–35266. [Google Scholar] [CrossRef]
  72. Gupta, H.; Patel, M. Method of text summarization using LSA and sentence-based topic modeling with Bert. In Proceedings of the 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS), Coimbatore, India, 25–27 March 2021; pp. 511–517. [Google Scholar]
  73. Özçift, A.; Akarsu, K.; Yumuk, F.; Söylemez, C. Advancing natural language processing (NLP) applications of morphologically rich languages with bidirectional encoder representations from transformers (BERT): An empirical case study for Turkish. Automatika 2021, 62, 226–238. [Google Scholar] [CrossRef]
  74. Yuan, L.; Zhao, H.; Wang, Z. Research on News Text Clustering for International Chinese Education. In Proceedings of the 2023 International Conference on Asian Language Processing (IALP), Singapore, 18–20 November 2023; pp. 377–382. [Google Scholar]
  75. Bawa, S.S. Implementing Text Analytics with Enterprise Resource Planning. Int. J. Simul. Syst. Sci. Technol. 2023, 24. [Google Scholar] [CrossRef]
  76. Jebb, A.T.; Parrigon, S.; Woo, S.E. Exploratory data analysis as a foundation of inductive research. Hum. Resour. Manag. Rev. 2017, 27, 265–276. [Google Scholar] [CrossRef]
  77. Basiri, M.E.; Ghasem-Aghaee, N.; Naghsh-Nilchi, A.R. Exploiting reviewers’ comment histories for sentiment analysis. J. Inf. Sci. 2014, 40, 313–328. [Google Scholar] [CrossRef]
  78. Catelli, R.; Pelosi, S.; Esposito, M. Lexicon-based vs. Bert-based sentiment analysis: A comparative study in Italian. Electronics 2022, 11, 374. [Google Scholar] [CrossRef]
  79. Arroni, S.; Galán, Y.; Guzmán Guzmán, X.M.; Núñez Valdéz, E.R.; Gómez Gómez, A. Sentiment analysis and classification of hotel opinions in twitter with the transformer architecture. Int. J. Interact. Multimed. Artif. Intell. 2023, 8, 53. [Google Scholar] [CrossRef]
  80. Schober, P.; Boer, C.; Schwarte, L.A. Correlation coefficients: Appropriate use and interpretation. Anesth Analg. 2018, 126, 1763–1768. [Google Scholar] [CrossRef] [PubMed]
  81. Gomaa, W.H.; Fahmy, A.A. A survey of text similarity approaches. Int. J. Comput. Appl. 2013, 68, 13–18. [Google Scholar]
  82. Qaiser, S.; Ali, R. Text mining: Use of TF-IDF to examine the relevance of words to documents. Int. J. Comput. Appl. 2018, 181, 25–29. [Google Scholar] [CrossRef]
  83. Reimers, N.; Gurevych, I. Sentence-bert: Sentence embeddings using siamese bert-networks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, China, 3–7 November 2019. [Google Scholar]
  84. Huang, D.; Wang, C.D.; Wu, J.S.; Lai, J.H.; Kwoh, C.K. Ultra-scalable spectral clustering and ensemble clustering. IEEE Trans. Knowl. Data Eng. 2019, 32, 1212–1226. [Google Scholar] [CrossRef]
  85. Hansen, P.C. The truncated SVD as a method for regularization. BIT Numer. Math. 1987, 27, 534–553. [Google Scholar] [CrossRef]
  86. Rousseeuw, P.J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 1987, 20, 53–65. [Google Scholar] [CrossRef]
  87. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
  88. Kumar, B.; Badiger, V.S.; Jacintha, A.D. Sentiment Analysis for Products Review based on NLP using Lexicon-Based Approach and Roberta. In Proceedings of the 2024 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), Bangalore, India, 24–25 January 2024; pp. 1–6. [Google Scholar]
  89. Alatrash, R.; Priyadarshini, R. Fine-grained sentiment-enhanced collaborative filtering-based hybrid recommender system. J. Web Eng. 2024, 22, 983–1035. [Google Scholar] [CrossRef]
  90. Sharma, D.; Hamed, E.N.; Akhtar, N.; Vignesh, G.; Thomas, S.A.; Sekhar, M. Next-Generation NLP Techniques: Boosting Machine Understanding in Conversational AI Technologies. J. Comput. Anal. Appl. 2024, 33, 100–109. [Google Scholar]
  91. Verma, D.; Dewani, P.P.; Behl, A.; Pereira, V.; Dwivedi, Y.; Del Giudice, M. A meta-analysis of antecedents and consequences of eWOM credibility: Investigation of moderating role of culture and platform type. J. Bus. Res. 2023, 154, 113292. [Google Scholar] [CrossRef]
  92. Li, X.; Ma, B.; Chu, H. The impact of online reviews on product returns. Asia Pac. J. Mark. Logist. 2018, 33, 1814–1828. [Google Scholar] [CrossRef]
  93. Sun, B.; Kang, M.; Zhao, S. How online reviews with different influencing factors affect the diffusion of new products. Int. J. Consum. Stud. 2023, 47, 1377–1396. [Google Scholar] [CrossRef]
  94. Hair, M.; Ozcan, T. How reviewers’ use of profanity affects perceived usefulness of online reviews. Mark. Lett. 2018, 29, 151–163. [Google Scholar] [CrossRef]
  95. Luo, C.; Luo, X.R.; Xu, Y.; Warkentin, M.; Sia, C.L. Examining the moderating role of sense of membership in online review evaluations. Inf. Manag. 2015, 52, 305–316. [Google Scholar] [CrossRef]
  96. Bi, S.; Liu, Z.; Usman, K. The influence of online information on investing decisions of reward-based crowdfunding. J. Bus. Res. 2017, 71, 10–18. [Google Scholar] [CrossRef]
  97. Janze, C.; Siering, M. ‘Status Effect’in User-Generated Content: Evidence from Online Service Reviews. In Proceedings of the 2015 International Conference on Information Systems: Exploring the Information Frontier (ICIS 2015), Fort Worth, TX, USA, 13–16 December 2015; pp. 1–15. [Google Scholar]
  98. Chatterjee, S.; Chaudhuri, R.; Kumar, A.; Wang, C.L.; Gupta, S. Impacts of consumer cognitive process to ascertain online fake review: A cognitive dissonance theory approach. J. Bus. Res. 2023, 154, 113370. [Google Scholar] [CrossRef]
  99. Campagna, C.L.; Donthu, N.; Yoo, B. Brand authenticity: Literature review, comprehensive definition, and an amalgamated scale. J. Mark. Theory Pract. 2023, 31, 129–145. [Google Scholar] [CrossRef]
  100. Xu, C.; Zheng, X.; Yang, F. Examining the effects of negative emotions on review helpfulness: The moderating role of product price. Comput. Hum. Behav. 2023, 139, 107501. [Google Scholar] [CrossRef]
  101. Luo, L.; Liu, J.; Shen, H.; Lai, Y. Vote or not? How language mimicry affect peer recognition in an online social Q&A community. Neurocomputing 2023, 530, 139–149. [Google Scholar]
Figure 1. Methodological framework.
Figure 2. A glance at the data from CACOO reviews, 2022.
Figure 3. Correlation among review rating scores.
Figure 4. Spectral clustering of all data.
Figure 5. t-SNE (t-distributed stochastic neighbor embedding) of all data.
Table 1. Existing Studies' gap(s), goal(s), and method(s).
Study | Year | Gap(s) | Goal(s) | Method(s)
[2] | 2022 | Lack of focus on info quality; Difficulty measuring fragmented info | Study impact of online reviews; Explore mediating/moderating roles | Smart PLS analysis; Web-based experiment and survey
[7] | 2023 | Lack of focus on text quality; Limited research on coherence | Study effect of incentives on text quality; Explore coherence and aspect richness | Two-way fixed-effect model; Randomized MTurk experiment
[8] | 2018 | Limited studies on influencer marketing; Few works on incentivized reviews | Study effect of incentivized reviews; Analyze reviewer motivations | Qualitative and quantitative analyses; Content analysis and surveys
[9] | 2022 | Lack of focus on eWOM trust; Limited exploration of norms conflict | Study impact of incentives on trust; Explore norms conflict mediation role | Three experiments; Bootstrap analysis
[10] | 2018 | Limited studies on incentives vs. norms; Lack of combined strategy research | Study effect of incentives and norms; Examine their joint impact on reviews | Two randomized experiments; Econometric analysis
[11] | 2019 | Lack of study on identifying incentivized reviews using text | Predict incentivized reviews; Explore text features and sentiment | Decision trees (C5.0, C&RT); Random forest, sentiment analysis
[12] | 2021 | Limited focus on content positivity; Lack of review-writing enjoyment data | Incentives' impact on review positivity; Examine enjoyment of review writing | Seven controlled experiments; NLP and human judgment analysis
[13] | 2020 | Lack of studies on incentivized reviews in the hotel sector | Detect incentivized reviews; Perform sentiment analysis | Random forest, KNN, SVM; Sentiment analysis (VADER)
[14] | 2020 | Lack of research on reevaluation mechanisms in incentives | Study how reevaluation-based incentives affect reviewer behavior | Propensity score matching (PSM); Difference-in-differences (DID)
[15] | 2020 | Lack of studies on incentive effects on review valence | Investigate psychological effects of incentives on review valence | Pilot study, two experiments; Content analysis
[16] | 2022 | Limited research on mandatory vs. voluntary disclosure effects | Compare mandatory and voluntary disclosures on review bias | Propensity score matching; Sentiment analysis
[38] | 2023 | Lack of effective reward mechanisms for objective feedback | Propose SRA to incentivize objective, truthful evaluations | Square-root agreement rule (SRA); Numerical experiments
[42] | 2019 | Lack of quantitative study on incentivized reviews' prevalence | Detect and characterize incentivized reviews on Amazon | Machine learning classification; Regular expression patterns
[46] | 2024 | Limited frameworks for SMEs on fraudulent review detection | Develop RAaaS framework for SMEs to detect fake reviews | Cloud-based framework; NLP, sentiment analysis, unsupervised learning
[47] | 2023 | Lack of empirical study on disclosure effectiveness | Investigate if incentivized review disclosures protect consumers | Difference-in-differences (DID); Regression analysis
[50] | 2023 | Lack of studies on deceptive reviews' impact on purchase decisions | Study how deceptive reviews affect consumer purchase decisions | Questionnaire survey; Empirical analysis using SPSS
[54] | 2021 | Lack of focus on influencer motives for accepting incentives | Examine how acceptance motives affect follower reactions | Survey study, experiments; Field study with blog data
[57] | 2021 | Lack of focus on review richness impacts | Investigate the impact of review richness on sales | Regression models; Online experiments
[61] | 2022 | Lack of clarity on the impact of incentivized reviews | Investigate the impact of incentives on review quality | Sentiment analysis; A/B testing, similarity analysis
[62] | 2019 | Limited study on joint sentiment-topic models | Investigate how numerical and textual reviews affect sales | Joint sentiment-topic model; Mediation analysis
[63] | 2022 | Insufficient labeled data for opinion spam detection | Develop a new opinion spam detection using few-shot learning | Machine learning, network algorithms, belief propagation
[64] | 2023 | Limited accuracy of PLTS in sentiment analysis | Develop a deep learning approach for PLTS generation | Deep learning, sentiment analysis, PLTS
[65] | 2024 | Lack of comparative study on sentiment analysis methods | Compare ML, DL, and Transformer-based sentiment models | NLP, BERT, CNN, Bi-LSTM, random forest, TF-IDF
[66] | 2024 | Limited empirical study on moral frameworks in fake reviews | Investigate how incentives affect dishonest reviews; Identify moral heuristics involved | Survey, hypothetical scenarios; Philosophical moral framework measure
Table 2. Existing studies' finding(s), contribution(s), and limitation(s).
Study | Finding(s) | Contribution(s) | Limitation(s)
[2] | Info quality improves trust; social presence improves trust; positive reviews drive intention | Insights on trust and intention; extends S-O-R to online reviews | Sample mostly Chinese students; no time dimension considered
[7] | Incentives improve text coherence; aspect richness increases with incentives | Insights into text quality improvements; encourages detail-rich reviews | Limited to the Amazon Vine program; data until August 2015 only
[8] | Incentivized reviews boost review numbers; positive reviews increase purchase potential | Applies exchange theory to reviews; insights on influencer marketing effects | Limited generalizability across platforms; focused on one product category
[9] | Incentives lower trust in eWOM; high-quality reviews boost trust | Insights on trust restoration; concrete strategies for eWOM management | Focused only on monetary incentives; only positive reviews analyzed
[10] | Incentives drive review volumes; norms lengthen reviews | Insights on incentives and norms; combines social and financial incentives | Limited to specific retail contexts; limited generalizability across platforms
[11] | Incentivized reviews are longer; positive sentiment is higher | Text mining model for detection; practical rules to spot bias | Limited to two product categories; assumed disclaimers may miss bias
[12] | Incentives increase review positivity; incentives boost review-writing enjoyment | Highlights enjoyment's role in review writing; extends literature on incentives and reviews | Limited to short-term incentives; only online reviews considered
[13] | Random forest has a 94.4% accuracy; VADER performs well for polarity | Provides a methodology for detecting incentivized hotel reviews | Limited to hotel reviews; small sample size
[14] | Reviewers increase review frequency and quality in the short term | Shows the long-term impact of reevaluation on content quality | Focused on the Yelp Elite Squad only; limited geographic scope
[15] | Incentives increase review numbers; psychological costs reduce review valence | Explores reciprocity and resistance; highlights unintended effects of incentives | Limited to monetary incentives; potential bias in the participant sample
[16] | Mandatory disclosure reduces bias; voluntary disclosure increases ratings | Highlights the importance of mandatory disclosure for consumer trust | Focused only on the Amazon platform; limited generalizability
[38] | SRA incentivizes truthful behavior; effective in homogeneous settings | Proposes SRA as a new reward mechanism for online platforms | Limited to objective feedback; assumes homogeneous responses
[42] | EIRs show different patterns; EIRs affect non-EIR submissions | Quantitative analysis of EIRs; temporal analysis of EIRs | Limited to two product categories; focused on Amazon only
[46] | Fake reviews affect rankings; fake reviews are shorter; emotional bias in fake reviews | Provides cost-effective review analytics for SMEs; insights into the characteristics and patterns of fake reviews | Limited to English reviews; focused on two datasets
[47] | Disclosure does not remove inflation; sales increase despite disclosure | Highlights limitations of disclosure; proposes an alternative (platform-initiated IR) | Limited to the Amazon platform; time constraints for post-policy data
[50] | Perceived deception lowers trust; fake reviews affect purchase decisions | Insights into the impact of fake reviews on behavior | Small sample size; focused only on Taobao
[54] | Intrinsic motives mitigate negative effects on credibility | Shows the importance of motives in incentivized review acceptance | Limited to review and lifestyle influencers
[57] | Richer reviews boost sales; more impact on utilitarian products | Introduces review richness as a key factor in sales | Limited to the JD.com platform; focused on specific product categories
[61] | Incentives do not strongly impact overall review quality | Proposes evaluation of multiple review dimensions for quality | Focused on software reviews; limited to G2 platform data
[62] | Textual reviews complement numerical ratings | Proposes a new model linking reviews to sales | Limited to tablet products; short time frame
[63] | CRSDnet outperforms other spam detection algorithms | Introduces CRSDnet, a novel spam detection method | Limited to Yelp datasets; not tested on other platforms
[64] | High prediction accuracy with the PLTS method | Introduces deep learning for PLTS generation | Limited to product reviews; focused on specific datasets
[65] | BERT achieved the highest sentiment analysis accuracy | Provides insight into the comparative performance of sentiment models | Limited to Amazon reviews; tested on limited product categories
[66] | Incentives increase fake reviews; utilitarian and egoism frameworks dominate | Shows the link between incentives and moral frameworks in reviews | Limited to food delivery platforms; focused on a single Indian city
Table 3. Number of reviews by rating score.
AttributeIncentivized012345678910
overAllRatingNoIncentive-554335711403711,623-----
Incentive-129346216211,32818,773-----
Value for moneyNoIncentive299966832497929309360-----
Incentive73678177483283741113,612-----
Ease of useNoIncentive909505413138343959655-----
Incentive43320985436210,44616,582-----
FeaturesNoIncentive909486342144249689413-----
Incentive43176672882111,72316,303-----
Customer supportNoIncentive3026842275704227610,047-----
Incentive86674718233142654013,095-----
Likelihood to recommendNoIncentive237212412011886803331995224228577714
Incentive21841192253093671192149536056288621710,737
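The counts in Table 3 amount to a cross-tabulation of reviews by rating score and review type. A minimal sketch of that kind of tabulation with pandas is shown below; the column names (incentivized, overall_rating, value_for_money) and the toy data are assumptions rather than the study's actual schema.

# Illustrative sketch of tabulating review counts by rating score and review type,
# in the spirit of Table 3.
import pandas as pd

reviews = pd.DataFrame({
    "incentivized":    ["Incentive", "NoIncentive", "Incentive", "NoIncentive", "Incentive"],
    "overall_rating":  [5, 4, 5, 3, 4],
    "value_for_money": [4, 5, 3, 2, 5],
})

# One count table per rating attribute: rows = review type, columns = rating score.
for attribute in ["overall_rating", "value_for_money"]:
    counts = pd.crosstab(reviews["incentivized"], reviews[attribute])
    print(f"\nNumber of reviews by {attribute}:")
    print(counts)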
Table 4. Number of reviews by year, product usage duration, and company size based on the number of employees for the review description.
Number of Reviews by Year for Review Description
IncentivizedSentiment20172018201920202021----
NoIncentivePositive12073777371225531878----
Negative35110131125931755----
IncentivePositive10278702750033592251----
Negative9433301281717321276----
Number of Reviews by Product Usage Duration for Review Description
IncentivizedSentimentFree Trial<6 months6–12 months1–2 years2+ years----
NoIncentivePositive6114172254320582751----
Negative28813869187001002----
IncentivePositive8954553506542075870----
Negative4921679174718823056----
Number of Reviews by Company Size Based on the Number of Employees for Review Description
IncentivizedSentimentMyself only1–1011–5051–200201–500501–10001001–50005001–10,00010,001+
NoIncentivePositive1452378928431469604371484163445
Negative47413179884811911241183293
IncentivePositive1782512452703572165211581812367805
Negative727206520091382577369444142265
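Table 4 slices the review descriptions by sentiment and by year, product usage duration, or company size. The sketch below shows one way such counts could be produced with a pandas groupby, assuming a sentiment label already exists on each row; the column names and data are illustrative only, and the derivation of the sentiment label is not shown.

# Rough sketch of the aggregation behind Table 4: counting positive and negative
# review descriptions by year for each review type.
import pandas as pd

reviews = pd.DataFrame({
    "incentivized": ["Incentive", "Incentive", "NoIncentive", "NoIncentive", "Incentive"],
    "sentiment":    ["Positive",  "Negative",  "Positive",    "Positive",    "Positive"],
    "year":         [2019, 2020, 2019, 2021, 2021],
})

counts_by_year = (
    reviews.groupby(["incentivized", "sentiment", "year"])
           .size()
           .unstack("year", fill_value=0)   # years become columns, as in Table 4
)
print(counts_by_year)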
Table 5. The top 20 words from review descriptions and trigrams from combined strings.
Top 20 Words from Review Descriptions
Positive NoIncentive | Positive Incentive | Negative NoIncentive | Negative Incentive
great | use | use | use
use | great | software | need
good | good | need | CRM
work | work | CRM | work
business | need | work | software
software | business | good | great
help | team | time | tool
team | well | business | good
need | software | product | email
well | help | great | easy
easy | make | help | make
love | easy use | company | well
CRM | client | one | company
make | company | support | one
easy use | tool | make | business
tool | CRM | client | sale
client | sale | easy | project
support | project | email | help
company | easy | feature | time
project | customer | system | client
Top 20 Trigrams from Combined Strings
Trigrams in NoIncentive | Frequency | Trigrams in Incentive | Frequency
software easy use | 18.02 | project management tool | 14.34
would like see | 15.13 | software easy use | 13.07
sensitive content hidden | 13.39 | easy use easy | 11.95
easy use great | 11.58 | would like see | 11.14
easy use easy | 11.39 | project management software | 9.41
great customer service | 10.84 | easy use great | 9.29
project management tool | 9.62 | help keep track | 9.01
help keep track | 8.50 | use free version | 7.90
user-friendly easy | 8.44 | steep learning curve | 7.84
great customer support | 7.61 | user-friendly easy | 7.67
really easy use | 7.57 | super easy use | 7.08
save lot time | 7.39 | simple easy use | 7.06
everything one place | 7.34 | really easy use | 6.74
project management software | 7.08 | easy keep track | 6.70
would like able | 6.90 | bit learning curve | 6.58
software user-friendly | 6.85 | take time learn | 6.35
would highly recommend | 6.51 | would highly recommend | 6.28
customer service great | 6.48 | great project management | 6.15
customer service team | 6.47 | save lot time | 6.14
product easy use | 6.35 | customer relationship management | 5.95
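Table 5 lists the most frequent words and trigrams in the review descriptions. As a hedged sketch, the snippet below extracts top unigrams and trigrams with scikit-learn's CountVectorizer on an invented corpus; the paper's own preprocessing and weighting (the reported frequencies appear to be scaled rather than raw counts) are not reproduced here.

# Minimal sketch: top words and trigrams from a small, invented set of descriptions.
from sklearn.feature_extraction.text import CountVectorizer

descriptions = [
    "software easy use great customer support",
    "project management tool easy use easy",
    "would like see better CRM email features",
    "steep learning curve but easy keep track",
]

def top_terms(texts, ngram_range, k=5):
    vec = CountVectorizer(ngram_range=ngram_range, stop_words="english")
    counts = vec.fit_transform(texts).sum(axis=0).A1   # total frequency per term
    terms = vec.get_feature_names_out()
    return sorted(zip(terms, counts), key=lambda t: -t[1])[:k]

print("Top words:   ", top_terms(descriptions, (1, 1)))
print("Top trigrams:", top_terms(descriptions, (3, 3)))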
Table 6. The results of A/B testing, hypothesis testing, and bootstrap distribution.
Attribute | Incentivized | Mean | Std | Std Error | 5% Threshold | Observed Difference | Empirical p | Observed t-Value
Overall Rating | NoIncentive | 4.497 | 0.913 | 0.007 | 4.483–4.511 | 0.023 | 0.000 | −2.848
Overall Rating | Incentive | 4.474 | 0.702 | 0.004 | 4.467–4.482
Value for Money | NoIncentive | 3.637 | 1.916 | 0.015 | 3.609–3.665 | 0.296 | 0.000 | −16.294
Value for Money | Incentive | 3.341 | 1.965 | 0.011 | 3.319–3.362
Ease of Use | NoIncentive | 4.133 | 1.35 | 0.010 | 4.113–4.153 | −0.146 | 1.000 | 12.774
Ease of Use | Incentive | 4.279 | 0.89 | 0.005 | 4.269–4.288
Features | NoIncentive | 4.144 | 1.329 | 0.010 | 4.124–4.164 | −0.174 | 1.000 | 15.749
Features | Incentive | 4.319 | 0.815 | 0.005 | 4.310–4.328
Customer Support | NoIncentive | 3.657 | 1.954 | 0.015 | 3.628–3.686 | 0.505 | 0.000 | −26.961
Customer Support | Incentive | 3.152 | 2.060 | 0.011 | 3.129–3.173
Likelihood to Recommend | NoIncentive | 7.666 | 3.431 | 0.026 | 7.615–7.717 | −0.177 | 1.000 | 5.880
Likelihood to Recommend | Incentive | 7.843 | 2.685 | 0.015 | 7.813–7.872
Length | NoIncentive | 110.143 | 132.151 | 1.006 | 108.194–112.113 | 0.304 | 1.000 | −0.260
Length | Incentive | 109.839 | 107.383 | 0.593 | 108.694–111.023
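Table 6 compares NoIncentive and Incentive reviews attribute by attribute using A/B testing, hypothesis testing, and a bootstrap distribution of the observed difference. The snippet below is a generic sketch of that style of analysis on synthetic ratings, using a Welch t-test and a percentile bootstrap; it does not reproduce the paper's exact test design, thresholds, or sign conventions.

# Hedged sketch: compare an attribute (e.g., overall rating) between two groups
# with a Welch t-test and a simple bootstrap of the difference in means.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
no_incentive = rng.normal(4.50, 0.91, size=2_000)   # synthetic "NoIncentive" ratings
incentive    = rng.normal(4.47, 0.70, size=3_000)   # synthetic "Incentive" ratings

observed_diff = no_incentive.mean() - incentive.mean()
t_stat, p_value = stats.ttest_ind(no_incentive, incentive, equal_var=False)

# Bootstrap distribution of the mean difference (10,000 resamples with replacement).
boot_diffs = np.array([
    rng.choice(no_incentive, no_incentive.size).mean()
    - rng.choice(incentive, incentive.size).mean()
    for _ in range(10_000)
])
ci_low, ci_high = np.percentile(boot_diffs, [2.5, 97.5])

print(f"observed difference: {observed_diff:.3f}")
print(f"Welch t = {t_stat:.3f}, p = {p_value:.3f}")
print(f"95% bootstrap CI for the difference: [{ci_low:.3f}, {ci_high:.3f}]")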
Table 7. Queries used for recommendations.
Query | Nature of Query | Query Text
Query 1 (Q 1) | Complex customer preferences | For my work I need the software to facilitate my work and give me the will to recommend that to others as I am frustrated with other software I have used. I need the software to work well, no matter if it is complex or not as I like challenges, with good CRM, and good customer support, has enough features and I can work with that on my phone. The price is not that important.
Query 2 (Q 2) | Moderate customer preferences | I need the product with good features, which has a low price, I can learn how to work with that fast and easily
Query 3 (Q 3) | Simple customer preferences | I need Good CRM
Query 4 (Q 4) | One NoIncentive review | Surprised Franklin Covey would even advertise think the program would good could get work customer support beyond horrible there no pro point possibly layout great but would not know since can not get work tired sync w ical with no success when you call to support you route voice mailit take least hour someone calls you back in sale hour later not in my office in front computer etc work out issue
Query 5 (Q 5) | Part of NoIncentive review | Would not know since can not get workI tired sync w ical with no success when you call support you route voice mailit take least hour someone call you back in sale hour later not in my office in front computer etc work out issue
Query 6 (Q 6) | Synonyms replacement in review | Astonished would even publicize think program would decent could get work customer provision yonder awful there no pro opinion perhaps design countless but would not know since can not get workI exhausted synchronize w l with no achievement when you call support you way voice mailit take smallest hour someone call you back in transaction hour later not in my office in forward-facing computer etc. work out problem
Table 8. Similarity scores of the top five recommended listing IDs.
Query | Model | Listing ID1 | Similarity Score 1 | Listing ID2 | Similarity Score 2 | Listing ID3 | Similarity Score 3 | Listing ID4 | Similarity Score 4 | Listing ID5 | Similarity Score 5
Q 1 | TF-IDF | 113213 | 0.042 | 109395 | 0.029 | 10317 | 0.015 | 101405 | 0.015 | 119723 | 0.013
Q 1 | SBERT | 91179 | 0.862 | 9448 | 0.856 | 20406 | 0.852 | 10317 | 0.850 | 102533 | 0.848
Q 2 | TF-IDF | 90941 | 0.027 | 9908 | 0.005 | 102517 | 0.003 | 106331 | 0.002 | 10317 | 0.000
Q 2 | SBERT | 106331 | 0.844 | 102445 | 0.828 | 90844 | 0.826 | 9531 | 0.825 | 91196 | 0.824
Q 3 | TF-IDF | 102517 | 0.008 | 10317 | 0.000 | 90859 | 0.000 | 104247 | 0.000 | 106331 | 0.000
Q 3 | SBERT | 2046686 | 0.724 | 106331 | 0.702 | 2035403 | 0.695 | 9401 | 0.694 | 106331 | 0.693
Q 4 | TF-IDF | 90602 | 0.011 | 90859 | 0.007 | 9908 | 0.005 | 90507 | 0.004 | 10317 | 0.002
Q 4 | SBERT | 91203 | 0.920 | 113901 | 0.919 | 10317 | 0.914 | 91203 | 0.914 | 90602 | 0.913
Q 5 | TF-IDF | 10317 | 0.000 | 90859 | 0.000 | 104247 | 0.000 | 106331 | 0.000 | 90844 | 0.000
Q 5 | SBERT | 91203 | 0.916 | 2348 | 0.905 | 142099 | 0.892 | 104265 | 0.891 | 109561 | 0.891
Q 6 | TF-IDF | 91734 | 0.004 | 10317 | 0.000 | 90859 | 0.000 | 104247 | 0.000 | 106331 | 0.000
Q 6 | SBERT | 90602 | 0.913 | 90507 | 0.913 | 113901 | 0.911 | 2348 | 0.910 | 91203 | 0.907
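Table 8 reports the top five listings retrieved for each query by the TF-IDF and SBERT models. The sketch below shows the general retrieval step, cosine similarity between a query and listing texts, under invented listing texts and IDs; the SBERT model name is likewise an assumption, not necessarily the one used in the study.

# Illustrative sketch: rank listings against a query with TF-IDF and SBERT embeddings,
# then return the top five by cosine similarity.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

listing_ids = ["10317", "90602", "91203", "106331", "113901", "2348"]  # hypothetical IDs
listing_texts = [
    "CRM with good customer support and email features",
    "project management tool, easy to use on the phone",
    "software with a steep learning curve but rich features",
    "simple CRM, low price, quick to learn",
    "collaboration software with great customer service",
    "sales pipeline tracking with mobile app",
]
query = "I need Good CRM"   # cf. Query 3 in Table 7

# TF-IDF ranking
tfidf = TfidfVectorizer(stop_words="english")
doc_matrix = tfidf.fit_transform(listing_texts)
tfidf_scores = cosine_similarity(tfidf.transform([query]), doc_matrix).ravel()

# SBERT ranking (model name is an assumption)
sbert = SentenceTransformer("all-MiniLM-L6-v2")
sbert_scores = cosine_similarity(sbert.encode([query]), sbert.encode(listing_texts)).ravel()

def top5(scores):
    order = np.argsort(scores)[::-1][:5]
    return [(listing_ids[i], round(float(scores[i]), 3)) for i in order]

print("TF-IDF top 5:", top5(tfidf_scores))
print("SBERT  top 5:", top5(sbert_scores))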
Table 9. Evaluation results (based on the top five recommended items).
Query | Model | Precision | Recall | F1-Score | Accuracy | Match Ratio | Mean Reciprocal Rank
Q 1 | TF-IDF | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 1 | SBERT | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 2 | TF-IDF | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 2 | SBERT | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 3 | TF-IDF | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 3 | SBERT | 1.000 | 0.019 | 0.038 | 0.995 | 1.000 | 1.000
Q 4 | TF-IDF | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 4 | SBERT | 1.000 | 0.019 | 0.038 | 0.995 | 1.000 | 1.000
Q 5 | TF-IDF | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 5 | SBERT | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 6 | TF-IDF | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
Q 6 | SBERT | 1.000 | 0.020 | 0.038 | 0.995 | 1.000 | 1.000
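Table 9 scores the top five recommendations with precision, recall, F1-score, accuracy, match ratio, and mean reciprocal rank. The function below is one plausible way to compute those metrics for a single query; the relevance set, catalog size, and the exact definition of match ratio are assumptions rather than the paper's implementation.

# Hedged sketch: score a top-5 recommendation list against a hypothetical relevance set.
def evaluate_top_k(recommended, relevant, catalog_size, k=5):
    top_k = recommended[:k]
    hits = [item for item in top_k if item in relevant]

    precision = len(hits) / k
    recall = len(hits) / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    # Accuracy over the whole catalog: items correctly recommended or correctly left out.
    true_pos = len(hits)
    false_pos = k - true_pos
    false_neg = len(relevant) - true_pos
    true_neg = catalog_size - true_pos - false_pos - false_neg
    accuracy = (true_pos + true_neg) / catalog_size

    match_ratio = 1.0 if hits else 0.0          # at least one relevant item in the top k
    mrr = next((1.0 / (i + 1) for i, item in enumerate(top_k) if item in relevant), 0.0)

    return dict(precision=precision, recall=round(recall, 3), f1=round(f1, 3),
                accuracy=round(accuracy, 3), match_ratio=match_ratio, mrr=mrr)

recommended = ["91179", "9448", "20406", "10317", "102533"]              # top-5 from one model
relevant = {"91179", "9448", "20406", "10317", "102533", "90602", "2348"}  # hypothetical relevant set
print(evaluate_top_k(recommended, relevant, catalog_size=1000))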
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
