1. Introduction
Consumers are increasingly interested in aspects of food production such as use of high-quality, local raw materials, adherence to traditional processing methods, and the preservation of nutritional values [
1,
2]. However, the complexity of food supply chains and the often inadequate controls by local authorities pose challenges to ensuring food authenticity and the integrity of food systems [
3]. As reported in the literature, the implementation of robust traceability systems can help combat food fraud effectively and communicate food safety and quality information to consumers [
1,
4].
Among food categories, dairy products, and cheese in particular, are among the main hotspots for food safety and food fraud [
5]. It is often challenging to develop a thorough and reliable traceability system for cheese production. Cheese is often produced from milk of different sources, which is later mixed and processed through complex procedures. For instance, feta cheese in Greece is produced under the protected designation of origin (PDO) certification, which requires that it must be made from at least 70% sheep milk and up to 30% of goat milk of animals grazed in Greek territories [
6]. The raw material for feta cheese production is often a mixture of milk from different small-scale farms, as the dairy sector in Greece is highly fragmented, with 80% of raw milk coming from family-owned farms with fewer than 100 animals [
7]. Even though the current legal framework stipulates specific requirements for feta cheese processing procedures (National Food Law 5039/2023), Pidiaki et al. (2016) reported that more than 40% of examined feta cheese samples in Greece illegally contained cow milk [
8]. Such counterfeits of feta cheese can hardly be detected by consumers through consumption alone. Consumers can therefore rely only on the product information to ascertain the authenticity of feta cheese at the point of sale. Given recent food fraud incidents, consumers’ trust in food systems can be eroded, which can subsequently lead to a decrease in consumption of the food products concerned [
9]. In addition, the recall cost for dairy products corresponds directly to the recall time, as shown by Velthuis et al. [
10]; thus, timely and accurate response to food safety incidents by food recall is vital for food business operators, which can be achieved by having an effective traceability system in place.
The key areas that motivate this research include the following considerations.
Due to increased cost, the centralised and monolithic nature of the traceability systems, and the lack of a systematic and automated framework of traceability data management, dairy products have become a food commodity with increased fraud.
There is limited connection between measurement of operational parameters (such as pH, temperature, duration of maturation) and the qualitative characteristics of the dairy product. In addition, the traceability information offered to the consumer is limited, so the (mandatory) alignment with the legislation framework and the (optional) alignment with best practices are not necessarily recognised by the consumers.
While blockchain appears to be a promising technology, its application in the food chain appears ‘coarse-grained’, not necessarily considering individual steps or parameters. The codification of the legislative framework into rules and subsequently smart contracts is challenging.
This study introduces a traceability platform built upon blockchain technology to provide a trustworthy system to record traceability data among food chain actors and communicate such data to consumers transparently. Blockchain offers (1) immutability, as data registration is permanent and cannot be tampered with, (2) decentralisation of the platform, and (3) a controlled level of transparency among the stakeholders and the consumers [
11,
12]. However, two common criticisms of blockchain technology are the relatively slow processing of (data) transactions and the costs of those transactions [
13]. In this study, we detail the selection of blockchain consensus algorithms that boost performance and speed, and a design that limits the number of transactions that involve costs.
Given its specific production requirements and challenges, feta cheese was chosen as the product of interest for developing this advanced traceability platform. The boundaries of the platform include the collection of the raw materials (milk), cheese production processes, and final product packaging. Quantitative data (related to pH and temperature values, time duration, test results) are selected according to their relationship with cheese characteristics and quality (origin, taste, nutritional value, and proportion between sheep and goat milk types). The relevant processes are monitored, without inflicting disruptive changes, and information is registered in the blockchain.
The objectives of the study are threefold. First, this study aims to define the boundaries of the production chain considering three main stages: (1) collection of raw materials, (2) processing procedures, and (3) final product packaging. Second, a platform is designed to register traceability information in the blockchain ledger and then extract such information to communicate with consumers in a transparent way. Lastly, the evaluation of the mechanism and quantification of the overheads on behalf of the stakeholders is performed.
The study captures the intricacies of a complex production process involving multiple stakeholders and diverse sources of raw materials, specifically milk, within the context of cheese production. These raw materials undergo a series of critical stages, including mixing and maturation. The study places a strong emphasis on acquiring highly detailed information regarding the roles of various stakeholders and their complementary responsibilities, all of which are integrated into the blockchain system. This integration ensures the integrity of the data as they enter the blockchain infrastructure, ultimately enabling consumers to verify traceability information through an interactive online platform.
Moreover, the identification of pertinent production parameters was achieved through a co-creation process. In this process, key stakeholders from the dairy supply chain engaged in formal focus group discussions with technology developers. This collaborative effort ensured that the selected parameters accurately reflect the needs and priorities of the stakeholders involved. Additionally, a preliminary consumer study was conducted to validate the relevance and meaningfulness of the information provided to consumers through the proposed traceability platform. This multifaceted approach underscores the robustness and practicality of the traceability system.
From a technical standpoint, the study introduces an approach to address one of the inherent weaknesses of blockchain: the difficulty of handling large and/or dynamically changing volumes of data efficiently. To overcome this limitation, we have applied a dual-chain system. By design, the data-intensive transactions are performed in the private chain, while the public infrastructure registers only the Merkle roots (which are small in data volume). This anchors the private chain to the public one and allows public access for data verification by the consumer. The anchoring strategy is designed to optimise the utilisation of blockchain resources, enhance transaction speed, and limit the number of transactions that involve costs. Using the private blockchain network (Quorum) involves no monetary costs beyond hosting and managing the network, since gas there serves only as a measure of the computational load of each transaction.
The current work aligns with and contributes to the establishment of the architectural approach and mechanisms of sensor and actuator networks. Specifically, based on physical activity monitoring, a series of actuations are triggered involving the activation of the Blockchain smart contracts that immutably register the information. These smart contract actuations are discrete, well-defined events propagated in the Blockchain networks, informing traceability applications.
The immutable registration of sensor-measured values using the Blockchain technology allows for data integrity contributing to the security aspects at the upper layers of sensor networks. The current research work also mitigates the challenge of managing the vast set of sensor network measurements in a twofold manner: (a) the selection of concrete process steps and associated parameters composes a filtering mechanism applied to the full set of measured values and (b) the employment of anchored private and public blockchains allows data registration with minimal impact on the delays and costs.
The rest of this article is structured as follows.
Section 2 describes a literature review on current approaches in distributed ledgers and the application of blockchain-based traceability technology in the food industry.
Section 3 describes the methodology followed, the key decisions taken and the design of the system.
Section 4 includes the results and our discussion. The paper closes with conclusions and future work in
Section 5.
3. Methodology
The methodology, depicted in
Figure 1, includes the identification of the key product features, their association with measurable operational parameters, and verifiable rules. The key traits of the feta cheese product, as selected by the consumers according to a consumer study conducted in 2022, include the food integrity as reflected through the compliance with the legislative framework, the compliance with the traditional methods, the quality, and the origin of the raw material. These traits are associated with quantifiable, operational parameters, which can be retrieved from the underlying production processes. The acceptable or recommended range of values for the identified parameters is quantified as ‘rules’ based on the legislative framework and the best practices followed by the stakeholders (mainly the dairy). The operational parameters are retrieved automatically, through sensors, or manually with human intervention/registration of the values, and their values are immutably registered in the blockchain.
In the last phase, consumers retrieve the trustworthy traceability information relevant to the product they intend to purchase (or already purchased) and verify the integrity of the data and the compliance of the process with the established rules.
3.1. Current Value Chain
The production process of feta cheese has been captured in a life cycle assessment (LCA) study [
41], involving a network of supply chain stakeholders.
3.1.1. Actors and Legislation Involved
The stakeholders involved include the goat and sheep breeders, the milk collector, the dairy processor, and the laboratory employee. The breeders are responsible for the milk-producing animals and supplying the milk needed for cheese production. The milk collector receives and inspects the milk collected from the breeder premises. The milk collector is also responsible for the safe transportation of the milk to the dairy. The dairy processor serves as a central stakeholder in the chain, assuming responsibility for the actual cheese production process. This role involves the transformation of milk into feta cheese. The laboratory staff conduct periodic microbiological tests on the milk. These tests help in determining the recent presence or absence of antibiotics in the milk-producing animals, ensuring the quality and safety of the milk used for cheese production. Consumers represent the end-point of the chain, and can make informed purchase decisions on the feta cheese if trustworthy traceability information is accessible to them.
The national legislation, as issued in the official Greek Government Gazette, requires the use of milk that is free of antibiotics and has a fat content of at least 6%. Additionally, it stipulates that only sheep and goat milk from designated Greek regions and at prescribed percentages, i.e., a minimum of 70% sheep milk, can be used. Furthermore, it mandates a maturation duration of at least 2 months.
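The legislative requirements listed above lend themselves to a simple rule check. The following Python sketch is illustrative only and is not the platform's smart contract code: the `LotRecord` type, its field names, and the 60-day reading of "2 months" are assumptions introduced here.

```python
from dataclasses import dataclass

# Thresholds taken from the text; the 60-day figure is an assumed
# day-count reading of "at least 2 months".
MIN_SHEEP_SHARE = 0.70
MIN_FAT_PERCENT = 6.0
MIN_MATURATION_DAYS = 60


@dataclass
class LotRecord:
    """Hypothetical per-lot summary; field names are illustrative."""
    sheep_milk_kg: float
    goat_milk_kg: float
    fat_percent: float
    antibiotics_detected: bool
    maturation_days: int


def check_legislation(lot: LotRecord) -> dict[str, bool]:
    """Return one pass/fail flag per legislative rule."""
    total = lot.sheep_milk_kg + lot.goat_milk_kg
    sheep_share = lot.sheep_milk_kg / total if total > 0 else 0.0
    return {
        "sheep_share": sheep_share >= MIN_SHEEP_SHARE,
        "fat_content": lot.fat_percent >= MIN_FAT_PERCENT,
        "no_antibiotics": not lot.antibiotics_detected,
        "maturation": lot.maturation_days >= MIN_MATURATION_DAYS,
    }
```

Returning one flag per rule, rather than a single boolean, keeps each rule individually reportable, which matches the idea of presenting rule-by-rule compliance to the consumer.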
Dairies also adhere to their own set of best practices, in addition to legislation requirements, developed empirically to ensure the quality of raw materials, the successful completion of each step in the production process and of course the characteristic taste of the product. For instance, the dairy processor can extend the maturation period, which may have a positive impact on the taste of the cheese. These practices are quantified through the application of specific thresholds for the procedural steps and rigorous organoleptic inspections.
3.1.2. Workflows in the Examined Dairy Supply Chain
The workflow is structured into the phases of milk collection, production initiation, maturation, and packaging. Each phase is monitored and managed using qualitative and quantitative parameters that can be validated through rules. Such rules are based on the national and European legislation framework and the best practices of the dairy.
Phase 1—Milk Collection: At regular intervals, usually once a day, the milk collector retrieves the milk from the cooperating breeders. The breeders are in the vicinity of the dairy processor’s facility and have a long-standing relationship with the dairy processor. The milk collection vehicles visit the breeders’ farms and collect the milk. The type of milk (sheep’s and goat’s milk), the quantity (weight), the pH and the temperature of the collected milk (considered indicators of its quality) are measured and registered by the collector in the presence of each breeder. In parallel, samples of the milk are retrieved and analysed by an external laboratory for fat content (which should be at least 6% w/w), protein, lactose, somatic cells, and the presence (or absence) of antibiotics, as stated in EC Regulation 853/2004.
After weighing, the milk is poured into the collection vehicle and mixed with the previously collected milk. Each vehicle has compartments associated with milk types (goat, sheep). Sheep’s milk is considered of high quality if the protein content exceeds 5.5% (per weight, w/w). Upon completion of the milk collection, the milk is temporarily stored in the dairy in cooling tanks at a temperature of below 4 °C.
Phase 2—Production preparation and initiation: The production process begins with milk purification through a series of treatments, including filtration, centrifugal separation, and pasteurisation (for microbial stabilisation), performed in pasteurisation chambers. After purification, the milk is ready for the production phase. The necessary components are added, and the milk begins to form coagulum (curd). When coagulation is completed, the coagulum is divided into cubes and placed in plastic moulds a few minutes later. Over the next twenty-four hours, the cheese is overturned in three stages to reduce the moisture. Then, the cheese is removed from the plastic moulds, its surface is salted, and it is placed in plastic barrels. One or two days later, the cheese is removed from the original storage media (plastic barrels) and stored in wooden barrels and sheet-metal (tin) containers. The tin containers are filled with brine.
Phase 3—Maturation: The first phase of ripening (first maturation) usually lasts about two weeks and is completed when the pH drops to 4.45–4.50. During this phase, the temperature remains at 17 °C. When this phase is completed, the containers and barrels are stored in refrigerators for the second maturation. The total duration of maturation should be at least two months, according to legislation. Visual and organoleptic checks (performed by the processor) confirm successful completion.
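As an illustration of how the first-maturation criterion could be evaluated from logged measurements, the sketch below scans a series of daily pH readings. It assumes that "drops to 4.45–4.50" means reaching a value at or below 4.50 and that there is one reading per day; the function and constant names are hypothetical.

```python
from typing import Optional

# Upper end of the 4.45-4.50 target range given in the text (assumed cut-off).
FIRST_MATURATION_PH = 4.50


def first_maturation_day(daily_ph: list[float]) -> Optional[int]:
    """Return the 1-based day on which pH first reaches the target, or None."""
    for day, ph in enumerate(daily_ph, start=1):
        if ph <= FIRST_MATURATION_PH:
            return day
    return None
```

A `None` result signals that the first maturation has not yet completed, so the lot should not be moved to refrigerated storage on the basis of pH alone.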
Phase 4—Packaging: When the maturation is complete, the cheese is packed in plastic containers and vacuum packs. During packaging, a subset of the information (on ingredients and nutritional value) is printed on the packaging. More detailed traceability information, which is not typically available to the consumer, is linked to a QR code printed on the packaging. A different QR code refers to each specific production lot. The consumer has access to this traceability information, which has been immutably registered in the blockchain, and a verification of a set of rules for the specific package (lot) at hand.
Table 1 summarises the phases, the actors involved, the product traits as associated with the quantitative parameters, and the rules to be applied.
Traceability is supported in the current chain as a legislative obligation with a strictly internal (inbound) focus, i.e., information is curated, managed, and accessed by the dairy to verify and control production and lots. The information provided to the consumer contains the necessary data according to the legal requirements mainly related to ingredients and nutritional value.
3.2. Architecture
The three-tier ICT platform interconnects the processes monitored with the blockchain infrastructure and offers the traceability services to the consumer. Measurements are retrieved automatically through networked sensors or manually when the measuring equipment is not networked.
The blockchain infrastructure must be able to:
- (a) Register data, the volume of which can be dynamic. This volume affects the number of blockchain transactions, the delays, and the costs. An example of the dynamic volume of the data is the number of breeders who contribute milk to the production, which may vary daily.
- (b) Verify the integrity of the information and the compliance with the rules based on on-chain data.
- (c) Allow consumers to verify the information presented after QR scanning using independent mechanisms and infrastructures such as the public Ethereum network.
A public blockchain network allows public access to the registered information, while private ones have policies allowing access to permissioned members. Private networks allow faster transactions and lower or no costs. Due to the relatively low transaction rate of blockchain systems (especially public ones), the volume of persisted data should be as limited as possible. To combine the benefits of both approaches, we have adopted an anchoring approach. The full set of (necessary and effectively coded) information is registered in the private chain, while its secure hash is registered in the public chain, which in turn ensures the integrity and immutability of the full data.
Ethereum was chosen as the public ledger due to its rich ecosystem and its Turing-complete language (Solidity), executed on the Ethereum virtual machine (EVM), which allows the development of decentralised applications (DApps). Ethereum 2.0 has recently adopted proof of stake (PoS) as its consensus algorithm, replacing proof of work (PoW), which reduces energy consumption and accelerates the transaction rate. The transaction rate of the Ethereum network is still relatively low (in the range of 12–13 transactions per second according to etherscan.io in September 2023). This limited rate has little impact on the feta cheese production process, which lasts for two months.
Quorum has been selected as the private ledger to ensure interoperability and compatibility with the Ethereum ecosystem. Quorum is an Ethereum fork offering permissioned/private access to the blockchain data/contracts. In addition, it employs a consensus algorithm that achieves adequate throughput while also being energy-efficient [
42]. A Quorum node can participate in the network only through authorisation by its designated authority. Scalability and performance are boosted in Quorum through the use of the IBFT (Istanbul Byzantine Fault Tolerance) consensus algorithm, which has been inspired by proof of authority (PoA) [
43]. The approach is depicted in
Figure 2.
On the left, the processes are monitored, and the values of the selected operational parameters are extracted and stored in the ICT platform. Then, they are effectively coded and immutably registered in the private blockchain (Quorum), while in parallel the rules are verified. These activities are executed as smart contract transactions (designed and implemented in Solidity). Both the ICT system and the private blockchain are accommodated in the infrastructure of the stakeholders (middle of
Figure 2) and the Quorum network consists of 4 collaborating nodes.
On the right, the architecture is completed with the public blockchain (Ethereum), which operates independently of the stakeholders, hosted in thousands of nodes. The private chain is anchored to the public through the registration of the hash of the data registered in the private chain. The transactions are registered in public Testnets, which operate on separate ledgers from the main networks, simulating the behaviour of the main networks without cost. After scanning the QR code of the package at hand, consumers (upper) can be directed to a service offered by the ICT system that presents the registered information as retrieved by the private blockchain. Consumers are offered the capability of verifying the integrity of the data presented and the compliance with the rules by directly interacting with the public blockchain.
3.3. Smart Contracts
Smart contracts have been used for two purposes: (a) to register the information on the private and public blockchains, and (b) to verify the conformance of the measured values (according to the rules described in
Table 2). The former provide immutability and verify the integrity of the data, while the latter verify alignment with the rules employing the on-chain data.
Regarding data registration (a), the ICT platform registers the data in the private ledger and builds the respective Merkle tree, calculating the root hash as a representation of the registered data from the preparation phase. Then, it registers the root hash and the production lot on the public ledger. After successful registration of the data, the corresponding URL becomes available, and the hash (Merkle root) is accessible through an Ethereum block explorer for the testnet used (such as Goerli or Sepolia). The information itself, on the other hand, is stored in the private chain (Quorum), where access can be controlled and protected.
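The Merkle-root calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the platform's implementation: SHA-256 stands in for whichever hash function the platform uses, an odd node is paired with itself, and the record strings in the usage example are invented.

```python
import hashlib


def _h(data: bytes) -> bytes:
    # SHA-256 is an assumed stand-in for the platform's hash function.
    return hashlib.sha256(data).digest()


def merkle_root(records: list[str]) -> str:
    """Return the hex Merkle root of the coded records.

    Leaves are the hashes of the individual records; on levels with an
    odd number of nodes the last node is duplicated (an assumed convention).
    """
    if not records:
        raise ValueError("at least one record is required")
    level = [_h(r.encode("utf-8")) for r in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


# Usage: only this short root would be anchored on the public chain.
lot_records = ["B01#6.6#4.0#120#30", "B02#6.7#3.8#95#0", "B03#6.5#4.1#0#40"]
anchored_root = merkle_root(lot_records)
```

Because any change to a single record changes the root, a consumer-facing service can recompute the root from the privately stored records and compare it with the publicly anchored value.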
Regarding compliance with the rules,
Table 1 is expanded with the quantified rules, as in
Table 2. Each smart contract corresponds to a specific phase.
Figure 3 depicts the code of the smart contract responsible for registering in Ethereum the hash of the information relevant to the initiation of the production (already stored in the private chain). The smart contract is deployed in Ethereum, and its source indicates the Solidity version (v0.8) and the license selected (GNU AGPLv3). When executed, the smart contract automatically emits the ProductionStart event to the chain, notifying the listeners subscribed to this event.
5. Discussion
The methodology and the platform have been applied under realistic conditions. Specifically, a series of cheese productions have been performed with similar characteristics as the commercial productions with the addition of blockchain-enabled traceability. During the pilot, a series of challenges and limitations emerged, which we discuss in this section as well as relevant design decisions and mitigations.
5.1. Limitations and Challenges
First, the process of applying the smart contracts is rigorous. Once designed and implemented, the smart contracts cannot change. This characteristic increases the trustworthiness of the information and validation, and at the same time necessitates special attention to the design and specification of the contracts.
The volume of the operational data to be registered can be dynamic. For instance, the number of breeders participating in a milk collection can vary per day, and this is reflected in the volume of data to be registered. In addition, the thoroughness and verbosity of describing the measurable parameters can vary, affecting the volume of the data that will be registered in the blockchain. The volume of data typically affects the number of blockchain transactions.
Growth and volatility in data volume and transaction rate are challenging for blockchain infrastructures, especially public ones, because of the associated delays and costs. Depending on the consensus algorithm, the transactions can be executed in one or more nodes, affecting the delay. For example, in the Ethereum network, every transaction is executed by each participating node, and the chain employs a fee structure relative to the computational complexity of each transaction, using gas as the unit of measurement. This results in a relatively low transaction rate in the ledger. This limitation affects the registration of information, i.e., the production steps. While this aspect matters from a scalability perspective, in our case the interval between the production steps significantly exceeds such delays, so the impact is negligible. Furthermore, the consumer only retrieves information, which is a much lighter task than executing transactions. Retrieval performance is therefore not affected by the transaction rate and is comparable with communication with a typical ICT system, so the impact on the consumer is minimal.
The blockchain can verify data integrity and authenticity after it is registered in the network. However, it is challenging to ensure the validity of the data at the edge of the network. These challenges are further amplified considering that measurements may be taken and registered manually, as in the case of black-boxed equipment, i.e., not interoperable with the rest of the ICT infrastructure. The level of automation and networking of the operational processes can vary. For example, in some cases, milk-collecting vehicles are equipped with networked sensors that can retrieve and share measurements in real time, while in other cases, the measurements are taken with human intervention. Human intervention can be a vulnerable point for data authenticity due to unintended errors and/or data tampering.
One more challenge is related to the dynamic nature of the blockchain infrastructures as well as the evolution of the consensus algorithms applied. For example, the Ethereum network migrated from proof-of-work to proof-of-stake consensus algorithms, in a transition known as the Merge (See more:
https://ethereum.org/en/roadmap/merge/). This drastic change has necessitated the transition of the smart contracts to the new testnets. While the logic of the smart contracts for this study has been straightforward, frequent changes could potentially pose difficulties.
To mitigate the transaction rate limitations, a dual-chain (private and public) architecture has been applied. The private ledger, Quorum, uses the IBFT consensus algorithm, which by design offers increased throughput (transaction rate). It performs the transactions that register the full set of information. The public chain, with a lower transaction rate, registers (anchors) the Merkle roots of the data registered in the private one. This combination provides both efficiency and public visibility.
5.2. Effective Management of On-Chain Data
The rationalisation and management of the on-chain data volume has been based on effective data coding, data ordering, and memory allocation.
Considering the types and formats of operational parameters, we have employed ‘laconic’ data coding, avoiding unnecessary/redundant information and separating values with a special character (#).
The (optimised and compressed) data-coding schema is as follows:
BreederID#pH_value#temperature_value#sheepQuantity_value#goatQuantity_value
The values of the parameters are included in a specific order, while the breeders are referred to through their ID without repetition of the names within the on-chain data. This results in an explainable but not verbose format (contrary to the more interoperable formatting of JSON structures).
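To make the trade-off between the '#'-separated coding and a JSON structure concrete, the sketch below encodes and decodes one record following the field order of the schema. The `MilkRecord` type, its field names, and the sample values are assumptions for illustration, and the sketch presumes that breeder IDs never contain the '#' character.

```python
import json
from typing import NamedTuple


class MilkRecord(NamedTuple):
    """Illustrative record type; fields follow the order of the schema."""
    breeder_id: str
    ph: float
    temperature: float
    sheep_quantity: float
    goat_quantity: float


def encode(rec: MilkRecord) -> str:
    """Join the fields in schema order, separated by '#'."""
    return "#".join(str(value) for value in rec)


def decode(text: str) -> MilkRecord:
    """Split on '#' and restore the field types."""
    breeder_id, ph, temp, sheep, goat = text.split("#")
    return MilkRecord(breeder_id, float(ph), float(temp), float(sheep), float(goat))


def size_comparison(rec: MilkRecord) -> tuple[int, int]:
    """Return (length of '#'-coded form, length of the equivalent JSON)."""
    return len(encode(rec)), len(json.dumps(rec._asdict()))
```

The compact form omits the field names entirely, so its meaning depends on every reader agreeing on the field order, which is the price paid for the smaller on-chain footprint.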
In terms of data ordering, the data stored on smart contracts are ordered according to their data type. The Ethereum virtual machine stores contract state in slots of 256 bits, and the variables are declared in an order that ensures the minimum number of slots used. If, for example, we declare two uint128 (unsigned integers of 128 bits) variables and a string variable on a smart contract, each uint128 variable occupies 128 bits, while the string occupies a whole slot (256 bits) as the minimum. In this case, it is preferable to declare the two uint128 variables first, one after the other, as they then fit in a single slot and two slots are occupied in total, as depicted in
Figure 8. Instead, if the string variable is declared in between the other variables, three slots will be occupied.
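The effect of declaration order can be reproduced with a small simulation. The Python sketch below is a simplified model rather than the full Solidity layout rules: variables are packed in declaration order, a variable that does not fit in the remaining space starts a new slot, and a string is modelled as taking a full 256-bit slot, as described in the text.

```python
SLOT_BITS = 256


def slots_used(sizes_in_bits: list[int]) -> int:
    """Count the 256-bit slots needed when packing in declaration order."""
    slots = 0
    free = 0  # bits still free in the current slot
    for size in sizes_in_bits:
        if not 0 < size <= SLOT_BITS:
            raise ValueError("each size must be between 1 and 256 bits")
        if size > free:
            # The variable does not fit: open a new slot.
            slots += 1
            free = SLOT_BITS
        free -= size
    return slots


# Two uint128 variables followed by a string (modelled as 256 bits).
packed = slots_used([128, 128, 256])
# The string declared between the two uint128 variables.
unpacked = slots_used([128, 256, 128])
```

In this model the first ordering yields two slots and the second yields three, matching the example given in the text.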
To rationalise the memory used by the smart contracts, the variables are allocated only the memory needed. This is calculated considering the lower and upper boundaries of the operational parameters. If, for example, a variable is expected to have a value below or equal to 255, an 8-bit unsigned integer is preferable to a wider integer type. Regarding the milk collection items, the maximum number of collection items is set to 65,000 so that it can be coded as an unsigned integer of 16 bits (maximum value 65,535).
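The width selection described above amounts to finding the smallest unsigned integer type whose maximum value covers the expected upper bound. A minimal sketch follows; it relies on the fact that Solidity offers unsigned integer widths from 8 to 256 bits in steps of 8, and the function name is hypothetical.

```python
def smallest_uint_bits(max_value: int) -> int:
    """Smallest unsigned width (multiple of 8, up to 256) that holds max_value."""
    if max_value < 0:
        raise ValueError("unsigned types cannot hold negative values")
    for bits in range(8, 257, 8):
        if max_value <= (1 << bits) - 1:
            return bits
    raise ValueError("value does not fit in 256 bits")
```

For an upper bound of 255 this selects 8 bits, and for the 65,000 collection items mentioned in the text it selects 16 bits.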
5.3. Data Verification at the Edge
To mitigate the risk of faulty data input at the edge of the infrastructure, an internal, sequential verification of the information among the supply chain actors is employed to ensure the authenticity of the input data. This is performed by stakeholders with complementary (or even competitive) roles. Particularly, during the milk collection phase, the collector records the initial information about the collected milk. Such recorded data must be subsequently verified by the processor (who receives the milk) and breeders (who provide the milk).
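The sequential verification among actors can be pictured as a record that becomes eligible for registration only after every required role has confirmed it. The sketch below is a hypothetical off-chain illustration: the class name, the payload format, and the idea of comparing each actor's observed payload with the recorded one are assumptions, not the platform's actual mechanism.

```python
# Roles named in the text for the milk collection phase.
REQUIRED_ROLES = ("collector", "processor", "breeder")


class CollectionEntry:
    """Hypothetical milk-collection entry awaiting confirmation by each role."""

    def __init__(self, payload: str) -> None:
        self.payload = payload
        self.confirmations: dict[str, bool] = {}

    def confirm(self, role: str, observed_payload: str) -> None:
        """Record whether the given role observed the same data as recorded."""
        if role not in REQUIRED_ROLES:
            raise ValueError(f"unknown role: {role}")
        self.confirmations[role] = observed_payload == self.payload

    def ready_for_registration(self) -> bool:
        """True only when every required role has confirmed matching data."""
        return all(self.confirmations.get(role, False) for role in REQUIRED_ROLES)
```

Requiring agreement from parties with complementary or competing interests means that a single actor cannot push altered values into the ledger unnoticed.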
Another countermeasure that can be applied is to accompany the measurements with further ‘proof’, such as a geolocated and timestamped photograph of the measured parameter that is inserted and linked with the blockchain, e.g., through IPFS (the InterPlanetary File System). While this can be a mitigation measure, it cannot be applied in all cases. For example, in specific parts of the processing, the operator cannot use, or is not allowed to use, extra equipment such as a camera and/or smartphone due to hygiene restrictions.
5.4. Migration of the Smart Contracts
The deployment and piloting of smart contracts were affected by the Ethereum Merge in September 2022 (See more:
https://ethereum.org/en/roadmap/merge/). After this merge, the mainnet Ethereum operations were shifted to PoS (proof of stake) consensus algorithms. While this development/upgrade has been positive (considering the expected decrease of the carbon footprint of the network), it has made the testnets in use at that time (i.e., Ropsten, Rinkeby and Kovan) deprecated. The smart contracts have been migrated to testnets compatible with the consensus algorithm, namely, Goerli and Sepolia. The migration has been performed smoothly, and the PoS algorithm is currently used without any need for contract adaptation.
6. Conclusions
In this work, we have modelled the production of the PDO feta cheese and associated high-level product characteristics with measurable parameters. Through the analysis of the process steps (milk collection, production, maturation and packaging), the legislation framework, and best practices, we have derived a set of quantifiable rules. Based on these rules, structured traceability information is extracted and registered in the blockchain infrastructure. The rules, codified as blockchain smart contracts, verify the compliance with legislation. The rigorous and stringent process for feta production contributed to the structured and verifiable traceability.
From the blockchain perspective, the architecture involves a dual infrastructure, using a private ledger (Quorum) anchored to the public one (Ethereum). This schema ensures data immutability while addressing the challenges of inadequate transaction throughput and constrained, costly storage capacity. Efficient data coding, ordering, and memory allocation also contributed to effective data management. The adoption of the PoS (proof of stake) consensus algorithm in the Ethereum platform necessitated the migration of smart contracts, which was seamless and confirmed backward compatibility. In addition, the consumer application has been designed to provide information and tools to perform direct verification of the traceability data and interact with the public blockchain.
Blockchain systems, while not immune to the possibility of inaccurate input data, offer robust protection against data tampering. Once data, whether accurate or not, are entered into a blockchain-based system, they become immutable and resistant to manipulation. Deliberate misinformation by supply chain actors can be swiftly traced and attributed to them, thanks to the inherent transparency and accountability of blockchain technology. Consequently, such misconduct can be deterred over time, as supply chain participants become collectively aware that falsifying data will not go unnoticed. To minimise the risk of erroneous data entry, corrective measures can be implemented by adding accurate information to supplement and rectify the erroneous data within the blockchain system. In our study, multiple independent chain actors participate to verify input information at the edge before data are registered in the ledger (e.g., the quantity of collected milk is verified by the breeder, the collector, and the processor).
Future work involves fine-tuning of the consumer application to ensure user-friendliness and the intuitiveness of the user interface, as well as the extension of the modelling process and the platform towards new dairy products (such as yoghurt). Furthermore, additional sources of information, either context-based (such as animal welfare) or generic, i.e., not associated with a specific lot (such as recipes), can be used as feeds to the blockchain infrastructure.