1. Introduction
Wine plays a crucial role in international trade, with its global export value reaching EUR 36 billion in 2023—the second highest ever recorded. This trade primarily occurs between developed economies. Specifically, the top three wine-exporting countries by volume are Italy, Spain, and France. Additionally, significant wine exports come from Chile, Australia, South Africa, Germany, Portugal, Canada, the USA, Argentina, and New Zealand.
Conversely, the leading wine-importing nations by volume are Germany, the United Kingdom, and the USA, followed by France, the Netherlands, Russia, Canada, Belgium, Portugal, China, Japan, Italy, and Sweden [1].
The importance of wine as a consumer product is evident from numerous studies conducted in the field of consumer research [2,3,4]. These studies often focus on the organoleptic characteristics of wine, evaluated by experienced assessors [5]. The practice of comparing organoleptic evaluations between consumers and trained assessors is not limited to wine and has been applied to other products as well [6].
The significance of the organoleptic evaluation and sensory analysis of wine extends beyond the market; it also plays a crucial role in wine production. The organoleptic characteristics of a wine can be influenced by several factors, including grape variety [7], grape-growing region, harvest conditions, soil type, and winemaking practices [8,9]. Numerous research publications have illustrated the use of sensory science in comparing winemaking parameters. For example, researchers have studied the organoleptic properties of wines with varying proanthocyanidin contents [10], of wines aged using oak chips [11,12], and of wines during bottle aging [13]. Additionally, studies have successfully categorized wines based on their organoleptic characteristics related to specific growing regions [13,14]. Sensory analysis is therefore useful for oenologists and winemakers, as winemaking practices [15,16] and cultivation regions significantly influence the organoleptic characteristics. Organoleptic analysis could also become a useful tool during the blending of wines to create the final product.
Numerous organoleptic test methods have been applied to wine research. These include triangular tests [16]; rank rating [8]; sensory descriptive profiling using discontinuous 3-point, 5-point, and 7-point scales [11,13]; the frequency of attribute citation method [7]; and sensory descriptive profiling using a continuous scale of 0–10 [10,17,18]. In research with either discontinuous or continuous scales, the average is the most widespread way of calculating the value of each attribute [10,11,13,17]. These methods are mostly employed by experienced wine tasters who undergo further training during the research process. Although the training methodology is briefly described, detailed information is usually not provided. For the statistical validation of the citation frequency-based descriptive methods that have been developed, the reproducibility index (Ri) is mainly used [19,20,21,22]. This index is based on the number of descriptors that a judge chooses in both replicates, together with the numbers of descriptors that the judge chooses in the first and in the second replicate.
The composition of the organoleptic panel significantly influences its performance. To explore this, researchers have conducted comparative studies involving different groups of assessors. For example, professional sommeliers have been compared with trained assessors, with both groups engaged in descriptive analysis [23]. Such comparisons extend beyond wine tasting. In the organoleptic analysis of complex odors, researchers have compared the performance of trained and untrained assessors [24]. Similar studies have been applied to food evaluation, where the performance of a group of trained assessors was compared to that of consumers [6].
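The reproducibility index described above can be illustrated with a short sketch. It assumes the form commonly given in the citation-frequency literature, Ri = (1/n) Σ 2·des_com / (des_rep1 + des_rep2), where des_com is the number of descriptors a judge cites in both replicates of a wine, des_rep1 and des_rep2 are the numbers cited in each replicate, and n is the number of wines. The function name and the example descriptors are hypothetical and are not taken from the cited studies.

```python
def reproducibility_index(replicate_pairs):
    """Average descriptor overlap of one judge across duplicated wines.

    replicate_pairs: list of (rep1, rep2) tuples, each a set of descriptor
    strings cited by the judge for the same wine in two replicates.
    Assumed formula: Ri = (1/n) * sum(2 * common / (len(rep1) + len(rep2))).
    """
    if not replicate_pairs:
        raise ValueError("at least one wine is required")
    total = 0.0
    for rep1, rep2 in replicate_pairs:
        cited = len(rep1) + len(rep2)
        if cited == 0:
            raise ValueError("a judge must cite at least one descriptor per wine")
        common = len(set(rep1) & set(rep2))  # descriptors chosen in both replicates
        total += 2 * common / cited
    return total / len(replicate_pairs)


# Hypothetical judge evaluating two wines in duplicate (illustrative data only).
judge = [
    ({"cherry", "vanilla", "pepper"}, {"cherry", "vanilla", "oak"}),  # 2 of 3 shared
    ({"plum", "leather"}, {"plum", "leather"}),                       # identical
]
print(round(reproducibility_index(judge), 3))  # 0.833
```

Under this assumed form, a value of 1 means the judge cited exactly the same descriptors in both replicates of every wine, and a value of 0 means no descriptor was repeated.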
The importance of sensory analysis is demonstrated by the multitude of standards issued by the International Organization for Standardization (ISO) in this field. These standards describe general guidelines for the selection of assessors, using tests for color vision, ageusia, and anosmia, as well as for the training and monitoring of selected assessors [25]. In addition, instructions are provided for determining perception thresholds [26], for investigating the sensitivity of taste [27], and for training assessors in the detection and recognition of odors [28]. The standards also cover the stages that follow training, including instructions for establishing a sensory profile [29] and guidelines for monitoring the performance of a quantitative sensory panel [30]. These standards have been assimilated into the standards, official documents, or EU Regulations on the sensory analysis of specific products.
The International Organization of Vine and Wine (OIV) has published a review document on the sensory analysis of wine [31], in which many elements from the ISO standards have been assimilated. This document represents the consensus reached by the members of the OIV expert group called the Subcommission Methods of Analysis and is a proposal for the selection and training of tasters for the organoleptic analysis of wine. It provides very detailed and precise instructions for a series of tests, including tests to detect disabilities and determine sensory awareness. The document covers various topics, such as describing odors, identifying defects in wine using solutions, and addressing the most common defects in wine. Additionally, it includes guidelines for the sensory evaluation of different types of tannins, as well as instructions for assessing both assessor and panel performance. Specifically, this document presents a list of the main visual, olfactory, tactile, and taste-altered characteristics of wines, which are intended for use in taster training. Although the document provides detailed guidance in some areas, such as the initial training, much of it offers only general guidance that requires specialization and expansion. Furthermore, it draws a connection between the accreditation for sensory analysis of wines and related work conducted in the field of olive oil.
The International Olive Council (IOC) has published a series of standards on the sensory analysis of olive oil. These standards describe the method of organoleptic testing for virgin olive oil [32] and provide very detailed guidelines for meeting the requirements of ISO 17025 for sensory testing laboratories for virgin olive oil [33]. The IOC has also issued a guide for the selection and training of tasters, which includes specific objectives for assessing candidate tasters and determining their acceptance or rejection [34]. Furthermore, internal quality control guidelines for sensory laboratories [35] give detailed instructions and examples for evaluating individual tasters and panel performance. This evaluation involves using duplicates and calculating metrics, such as the precision number and deviation number, with specific acceptance criteria for each parameter. The accompanying standards include a general basic vocabulary [36] and a method for the organoleptic assessment of extra virgin olive oil, which is applied to the designation of origin [37].
The literature on the sensory analysis of milk and milk products is also well developed. For this product category, the International Organization for Standardization (ISO) has issued three standards. These standards provide general guidance for the recruitment, selection, training, and monitoring of assessors [38]. Additionally, they recommend methods for sensory evaluation for each category of product, proposing attributes for each product [39]. Furthermore, the ISO standards offer guidance on a method for evaluating compliance with product specifications, including instructions for creating a relevant test report [40]. In more detail, the standard on assessors [38] provides comprehensive tests for basic odor and taste recognition, as well as for the training and evaluation of assessors across different product categories. Specific scoring targets are set for the selection or rejection of each assessor. The evaluation tests cover both olfactory and gustatory aspects, including positive attributes and defects.
In the realm of wine, the need to establish comparable panels and achieve methodological harmonization in wine sensory analysis has prompted research efforts. This research aims to provide a guide on selecting and training tasters, with a specific focus on the taste attributes of wines bearing a Protected Designation of Origin (PDO) rating [18]. The study provides detailed instructions for training assessors in the use of a continuous scale, employing standards to assess certain taste parameters and their corresponding concentrations.
The purpose of the present study is to develop a standardized method for the organoleptic analysis of wine by employing visual, olfactory, and taste attributes. This method is based on the proposals of the OIV. This study not only implements and refines the specific OIV proposals but also extends them by incorporating standards from other domains, such as virgin olive oil, milk, and milk products, as well as drawing upon relevant research in the field of wine.
The present study outlines a detailed process for training, selecting, and monitoring assessors. It provides an in-depth description of the analysis method and a comprehensive quality control methodology for both the sensory team and individual assessors. The detailed procedures include descriptions of the solutions used, a defined scoring system for assessor selection, and specific performance acceptance criteria. The ultimate goal is to create a standardized method that can be easily applied and replicated by other sensory panels.
At the same time, the two groups of candidate assessors for the developed method are compared: an accredited group for the organoleptic evaluation of virgin olive oil, which has not received training in the organoleptic analysis of wine, and an untrained group of candidate assessors, with no experience in sensory analysis, known as the “virgin” group. The aim is to evaluate whether the groups involved in the organoleptic evaluation of virgin olive oil can serve as a pool of candidate assessors that can be leveraged for the organoleptic analyses of wine, potentially increasing their chances of final selection.
Finally, the developed method is applied to the analysis of 25 commercial PDO Nemea wines made from the Agiorgitiko variety. The results are used to investigate the discrimination ability of the standardized sensory method. Additionally, the study explores whether there is any correlation between the retail price of Nemea products and specific organoleptic attributes.
The main purpose of this specific study is to offer a sensory analysis method that is easily applicable and produces repeatable results. This method will serve as a valuable tool for oenologists and winemakers during the wine production process and in the international trade of these products.
3. Results
3.1. Taster’s Performance during Basic Training
In the ageusia tests, the success rates of the Team A tasters ranged from 80% to 100%, with an average of 90%. The success rates of the Team B tasters ranged from 60% to 90%, with an average of 78%; two candidate assessors scored below 80%. These two candidates repeated the test and then achieved at least 80% correct answers.
In the anosmia tests, the success rates of the Team A tasters ranged from 63% to 100%, with an average of 93%, while those of the Team B tasters ranged from 38% to 100%, with an average of 75%. Two candidate assessors from Team A and six from Team B scored below 80%. These candidates were given a short training session by the team leader and retested, after which they achieved at least 80% correct answers.
In the Ishihara test, the two teams had exactly the same results. All candidates had 100% success results, except two—one from Team A and one from Team B—who had 94% success.
Team A had a lower perception threshold than Team B in all five basic tastes, as shown in Table 1.
In the tests that determined organoleptic awareness, during the process of detecting taste stimuli using triangle tests, all but one taster in Team A achieved success, resulting in a total of 59 correct answers out of 60. Similarly, all tasters in Team B were successful, except for two tasters, yielding a total of 55 correct answers out of 60.
In the taste identification tests, each taster had to identify 10 flavored solutions. For each correct identification, the taster received 1 point, resulting in a maximum score of 10 points per taster. Team A scored 87 points, and Team B scored 78 points.
In the ranking tests, all tasters successfully ranked the color of red wines, leading to identical performances for Teams A and B. For odor discrimination, which involved four aromatic substances (acetic acid, TCA, ethyl acetate, and ethyl phenol), each taster could earn up to 16 points. Team A had four tasters with perfect scores (16/16) and an average score of 14.6. In contrast, Team B had no tasters with a perfect score and an average score of 12.4.
For taste and astringency discrimination, which included the five basic tastes and astringency, each candidate assessor could earn up to 24 points. Team A had four tasters with perfect scores (24/24) and an average score of 22.8, while Team B also had four tasters with perfect scores (24/24) and an average score of 22.1.
Finally, in the odor description tests, each candidate assessor could earn up to 30 points (10 aromas with a maximum score of 3 points per aroma). The scores of Team A ranged from 12 to 30 points, with an average of 19.2, while those for Team B ranged from 9 to 19 points, averaging 14.0.
3.2. Evaluation of Training
In the evaluation of basic training (“recognition and ranking tests”), 53 points were available. Training was considered successful when the score was at least 65%, equivalent to 35 points. All the tasters in Team A achieved successful results. The minimum score observed was 35 (65%), and the maximum score reached 53 (100%), resulting in an average of 44.8 points (84%). In contrast, Team B had five tasters with successful outcomes, while the remaining five tasters did not reach the pass mark. The minimum score for Team B was 24 (45%), the maximum was 38 (72%), and the average score was 33.5 points (63%). These results are presented in Table 2.
When identifying wine defects and wine characteristics using spiked wine, the number of available points for each candidate assessor was 48, with the minimum score for a successful result set at 32 points. All tasters in Team A achieved successful results, with an average score of 43.0 points. The lowest individual taster score in Team A was 32 points, while the highest reached 48 points. In Team B, seven tasters achieved successful results, while the remaining three tasters did not reach the pass mark. The mean score for Team B was 36.4 points, with a minimum individual candidate assessor score of 26 points and a maximum of 46 points, as shown in Table 2.
In the descriptive tests, which focused on the accurate identification of positive or negative wine aromas, as well as the description of different tannins, the maximum achievable score was 10 points. Results equal to or greater than 5 points were considered successful. In Team A, all tasters achieved successful results, with an average score of 8.5 points. In Team B, eight tasters succeeded, while two tasters did not, resulting in an average score of 6.1 points, as shown in Table 2.
In the overall evaluation of each candidate assessor’s training, the training was considered successful if the candidate achieved successful results in all three areas: the recognition and ranking tests, the triangle test for wine, and the descriptive tests. According to this criterion, all 10 candidate assessors from Team A achieved successful results, while only 3 candidate assessors from Team B met the criteria. The total scores of the 13 successful candidates were then ranked to facilitate the final selection. The top 10 assessors formed Panel C, comprising 8 tasters from Team A and 2 from Team B. The remaining 10 tasters (2 from Team A and 8 from Team B) constituted Panel D, which served the research purposes of this study.
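The selection rule described in this section can be sketched as follows. The pass marks (35, 32, and 5 points) are those reported above; the area names, the candidate labels, the example scores, and the choice to rank by the raw sum of the three scores are assumptions made for illustration, since the text does not specify how the total score was computed.

```python
# Pass marks reported in Section 3.2 for the three training areas.
# The dictionary keys are hypothetical labels chosen for this sketch.
PASS_MARKS = {"recognition_ranking": 35, "spiked_wine": 32, "descriptive": 5}


def passed_training(scores):
    """True when the candidate reaches the pass mark in every area."""
    return all(scores[area] >= mark for area, mark in PASS_MARKS.items())


def select_panel(candidates, panel_size=10):
    """Split candidates into (selected, remaining) lists of names.

    candidates: dict mapping a candidate name to a dict of area scores.
    Ranking by the raw sum of the three scores is an assumption.
    """
    successful = [name for name, s in candidates.items() if passed_training(s)]
    ranked = sorted(successful, key=lambda n: sum(candidates[n].values()), reverse=True)
    selected = ranked[:panel_size]
    remaining = [name for name in candidates if name not in selected]
    return selected, remaining


# Hypothetical scores, not data from the study.
candidates = {
    "A1": {"recognition_ranking": 50, "spiked_wine": 46, "descriptive": 9},
    "A2": {"recognition_ranking": 36, "spiked_wine": 33, "descriptive": 6},
    "B1": {"recognition_ranking": 38, "spiked_wine": 40, "descriptive": 7},
    "B2": {"recognition_ranking": 30, "spiked_wine": 45, "descriptive": 8},  # fails one area
}
selected, remaining = select_panel(candidates, panel_size=2)
print(selected, remaining)  # ['A1', 'B1'] ['A2', 'B2']
```

In this sketch, a candidate who fails any single area is excluded before ranking, so a high total cannot compensate for a failed test.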
3.3. Validation of the Method
3.3.1. Checking the Panel’s Performance
Panel C yielded successful results in all duplicate sample measurements, both for normalized error and precision number, across all three organoleptic parameters tested. Panel D, however, exhibited unsuccessful precision number results in three duplicate measurements, specifically for the defect (animal) parameter. All other duplicate measurements in Panel D were successful in terms of both normalized error and precision number for the three organoleptic parameters, as detailed in Table 3.
The overall performance of the panel was assessed using the precision number (PNp). Panel C achieved successful results (PNp ≤ 2.00) across all three organoleptic parameters. In contrast, Panel D exhibited an unsuccessful outcome in defect determination but achieved successful results for the other two organoleptic parameters. Notably, the precision number for Panel C was lower than that of Panel D across all three organoleptic parameters, as shown in Table 4.
3.3.2. Checking the Taster’s Performance
All 10 tasters in Panel C achieved successful results for all three organoleptic parameters: both the precision number (PN) and the deviation number (DN) were at or below 2.00, as shown in Table 4. In contrast, none of the 10 tasters in Panel D achieved successful results across all three organoleptic parameters. For the fruity attribute, seven tasters failed to meet the precision number threshold, and eight tasters failed to meet the deviation number threshold. For the animal defect, six tasters did not achieve the required precision number, while seven tasters exceeded the acceptable deviation number. Finally, for the sour attribute, three tasters failed to meet the precision number requirement, and three failed to meet the deviation number requirement, as shown in Table 4.
3.3.3. Reproducibility
Panel C exhibited a relative standard deviation (RSD) well below 10%, ranging from 3.6% to 5.8% across the three organoleptic parameters tested. In contrast, Panel D displayed an RSD of approximately 20%, with values ranging from 18.4% to 24.5% for the same set of organoleptic parameters, as indicated in Table 5.
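As a reminder of how a relative standard deviation is obtained, the following minimal sketch computes the RSD as the sample standard deviation divided by the mean, expressed as a percentage. The use of the sample (n − 1) standard deviation, the function name, and the example scores are assumptions; the text does not state the replicate design used for Table 5.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return the RSD in percent: 100 * sample standard deviation / mean."""
    if len(values) < 2:
        raise ValueError("at least two measurements are required")
    average = mean(values)
    if average == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / average


# Hypothetical repeated panel scores for one attribute of one sample.
repeated_scores = [4.0, 5.0, 6.0]
print(relative_standard_deviation(repeated_scores))  # 20.0
```

Because the RSD scales the spread by the mean, it allows attributes scored at different intensity levels to be compared on the same percentage basis.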
3.3.4. Sensory Analysis of Wines
A total of 25 commercial Nemea wines were analyzed by Panel C. The median values for each organoleptic parameter, together with the retail price, are shown in Table 6.
Five wines exhibited low-intensity aroma defects, with defect intensities ranging from 1.1 to 3.2. The tasters characterized two defects as animal, two as oxidized, and one as reductive.
The two samples displaying an animal defect were analyzed using GC-MS/MS to determine the aroma compounds. The detailed analysis methodology, including the extraction method, GC-MS/MS operating conditions, and selected mass transitions for each compound, was described in a previous study by the authors [45]. In the first sample, with a sensory aroma defect intensity of 3.2, the following concentrations were observed: 4-ethylphenol (223 μg/L), 4-ethylguaiacol (90 μg/L), and 4-vinylphenol (1119 μg/L). In the second sample, with a sensory aroma defect intensity of 1.1, the concentrations were as follows: 4-ethylphenol (127 μg/L), 4-ethylguaiacol (31 μg/L), and 4-vinylphenol (314 μg/L).
Regarding the appearance organoleptic parameters, both color and hue exhibited substantial variations. Specifically, color ranged from 3.9 to 8.3, with an average value of 6.8, while hue ranged from 1.8 to 7.6, with an average value of 5.0.
For the aroma attributes, fruity aroma spanned between 3.2 and 6.0, with an average value of 4.5, and barrel aroma ranged from 0.0 to 4.9, with an average value of 3.3. Notably, seven samples showed higher barrel aroma intensity than fruity aroma.
The flavor intensity varied between 2.6 and 4.8, with an average value of 3.9. In terms of taste attributes, sour varied from 3.6 to 5.0, with an average value of 4.1; sweet ranged from 0.4 to 2.0, with an average value of 0.9; bitter exhibited similar values to sweet, ranging from 0.5 to 1.9, with an average value of 0.9; astringent values ranged between 2.8 and 4.1, with a mean value of 3.3; and aftertaste spanned from 3.7 to 5.1, with a mean value of 4.6.
3.3.5. Data Analysis
A one-way analysis of variance (ANOVA) showed that all 11 organoleptic parameters differentiated the products significantly (p < 0.0001). Additionally, principal component analysis (PCA) demonstrated a robust positive correlation between the retail price and the organoleptic parameter “barrel aroma”, as evidenced by the results presented in Table 7. At the same time, the retail price exhibited a negative correlation with the organoleptic parameters “fruity aroma” and “bitter”. These findings are visually depicted in Figure 1.
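The one-way ANOVA used for product differentiation can be sketched in a few lines. The example below computes the F statistic for a single attribute, treating each wine as a group and each taster's score as an observation. This layout, the function name, and the example scores are assumptions for illustration; the text does not state the software or the exact model used.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists; each inner list holds the scores that one product
    received for a single attribute (assumed here: one score per taster).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and more scores than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the product means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of individual scores around their own product mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variation")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores of three tasters for three wines (not data from the study).
scores_by_wine = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(scores_by_wine)
print(round(f_stat, 2), df_b, df_w)  # 13.0 2 6
```

If SciPy is available, the corresponding p-value could be obtained from the upper tail of the F distribution, for example with `scipy.stats.f.sf(f_stat, df_b, df_w)`.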
4. Discussion
In the basic training stage, both teams exhibited identical performance for the appearance parameters, specifically in the Ishihara test and the red wine color classification test. However, clear differences emerged for the olfactory and taste parameters. Team A clearly outperformed Team B in the olfactory tests and also performed better in the taste tests, although by a smaller margin.
During the subsequent training evaluation, Team A demonstrated markedly superior results compared to Team B. Notably, the disparity in olfactory performance between the two groups was much greater than their differences in taste parameters.
The findings from the basic training and evaluation tests highlight the advantage of candidate assessors participating in an accredited olive oil organoleptic evaluation panel. Members of such panels undergo a rigorous selection process distinct from that of non-participating testers [34]. This difference in performance can also be attributed to the ability of candidate assessors from accredited panels for olive oil evaluation to concentrate better during assessments, as opposed to new candidate assessors.
Interestingly, the similar performance of Teams A and B in terms of visual parameters can be explained by the fact that olive oil organoleptic evaluation does not incorporate any visual attributes [32]. In order to eliminate any potential psychological influence from sample appearance, an appropriate glass is used to conceal the color of the olive oil [46]. Conversely, the focus of olive oil organoleptic evaluation lies in olfactory parameters, which accounts for the substantial performance gap observed between Team A and Team B in this domain.
While candidate assessors from an accredited panel for olive oil evaluation generally excelled, some individuals from Team B outperformed their counterparts in Team A during the training assessments. Specifically, two Team B candidate assessors surpassed two tasters from Team A. Consequently, the selection process for sensory wine analysis (Panel C) included eight assessors from Team A and two from Team B. The unselected tasters formed Panel D, continuing the tests to evaluate the reliability of the selection method.
The evaluation of panel performance and individual taster performance was based on three key sensory attributes: a defect, an odor, and a taste attribute. These parameters were selected following the approach of the organoleptic evaluation of olive oil, where a defect and the fruity attribute are used to monitor team and taster performance [35].
The results of this study support the appropriateness of the assessor selection process. Panel C outperformed Panel D by a wide margin. Although Panel D had several results falling outside the acceptance criteria, it performed slightly better for the taste attribute than for the olfactory parameters. Notably, Panel C exhibited a remarkably low relative standard deviation compared to other studies involving trained and untrained panels [24], providing additional evidence of effective training, evaluation, and final selection.
Among the Nemea wine samples, the sensory analysis identified five samples with aroma defects. These defects were judged to be of low intensity, as their values were below 3.5 [32]. Two samples exhibited an oxidized defect, possibly resulting from chemical aging or storage conditions. Another sample showed a reductive defect, likely arising from random errors introduced by a bottle or batch. Additionally, two samples exhibited an animal defect, which was confirmed through GC-MS/MS analysis. Previous research has demonstrated that the presence of 4-ethylguaiacol, 4-ethylphenol, and 4-vinylphenol contributes to the phenolic, animal, and stable-like characteristics of wine [47]. The detection of 4-ethylphenol, 4-ethylguaiacol, and 4-vinylphenol in both samples suggests potential Brettanomyces growth. Specifically, Brettanomyces bruxellensis can produce 4-ethylguaiacol from ferulic acid, as well as 4-ethylphenol and 4-vinylphenol from p-coumaric acid [48]. Furthermore, the confirmation of the organoleptic defect via the GC-MS/MS results underscores the effectiveness of the panel’s training in identifying specific defects.
In seven wine samples, the barrel aroma intensity surpassed the fruity aroma intensity. Interestingly, the average retail price of all the surveyed products was EUR 11.8, but these seven specific products commanded an average retail price of EUR 16.7, representing a substantial 42% increase.
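The 42% figure follows directly from the two average prices quoted above, as this short calculation shows:

```latex
\[
\frac{16.7 - 11.8}{11.8} \times 100\%
  = \frac{4.9}{11.8} \times 100\%
  \approx 41.5\%,
\]
```

which rounds to the reported 42%. Note that the EUR 11.8 average covers all 25 products, including the seven barrel-dominant wines themselves.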
Principal component analysis (PCA) revealed that barrel aroma was the only attribute with a strong positive correlation with retail price, while fruity aroma and bitter were negatively correlated with retail price. This correlation can be interpreted in terms of the aging of wines in oak barrels. Such aging leads to an increase in barrel aroma, a decrease in fruity aroma, and a reduction in bitterness, the latter attributed to the polymerization of tannins. The pricing pattern is explained by the fact that barrel usage increases production costs, which are then passed on to consumers through higher retail prices. Interestingly, our findings diverge slightly from other studies conducted in different countries, where higher astringency, fruity character, and oak influence were associated with more expensive products [5].
The one-way analysis of variance (ANOVA) of the sensory descriptive data for the 25 wines showed that all 11 attributes differed significantly across the wines, each with a p-value of less than or equal to 0.0001 (p ≤ 0.0001). This result supports the choice of sensory attributes and the adequacy of the panel’s training. Only two olfactory parameters, fruity aroma and barrel aroma, were chosen. This choice aligns with previous studies, which highlight that humans have limited ability to evaluate the intensities of various odors within complex aroma compositions [24,49]. For the same reason, in 1992, the International Olive Council (IOC) revised its method to enhance the consistency of panel assessments, limiting the olfactory parameters to fruitiness and aroma defects [50].