1. Introduction
With the advent of the Internet of Things (IoT) era, the way devices produce and exchange data is changing. The IoT is a connected environment in which every device can communicate seamlessly and participate in different communication channels. The data emitted by each IoT device will move beyond mere raw data into personalized insights tailored to user preferences and, in some cases, will be aggregated with other data. The basic concept of the IoT is simple, but its widespread application is expected to spark innovation and push traditional technologies forward.
In contrast to the pre-IoT era, where users primarily relied on data provided by service providers, the advent of the IoT grants users direct access to sensors. This enables users to send instructions directly to applications, facilitating seamless and relevant operations. The data harnessed from the IoT not only transforms the user experience but also serves as the foundation for novel services catering to industries, academia, and individuals alike [1].
Currently, popular approaches use various database technologies, including distributed databases, for data management. However, data owners often specialize within specific industries. As enterprises grow and diversify into multiple business divisions, each division accumulates its own dataset. These datasets are frequently siloed, stored separately, and governed by their own definitions. The result is a set of isolated data islands that hinder the connection and interaction of data across different divisions within an enterprise [2]. This phenomenon underscores the need for an efficient and secure data management paradigm that facilitates collaboration among diverse data owners in the IoT.
As a new type of system, blockchain technology is expected to change the organization of IoT systems. Unlike traditional methods of routing data through a central processing unit, blockchain provides decentralized point-to-point connections for seamless data transfer. This decentralization empowers distributed computing to handle a staggering volume of transactions, reaching into the realm of billions [3]. Simultaneously, the latent computing power, storage capacity, and bandwidth residing in millions of idle devices dispersed across various locations can be harnessed to their full potential. This utilization of idle resources contributes significantly to the processing of transactions while substantially reducing computing and storage costs [4].
Combining blockchain technology with smart contracts turns every smart device into an autonomous network node. These nodes execute predefined or embedded rules, facilitating functions such as information exchange and identity verification with other nodes. This approach helps IoT products remain relevant and functional throughout their life cycle, minimizing equipment maintenance costs and mitigating the risk of obsolescence [5].
The application of blockchain technology provides a way of thinking about the problems that exist in the Internet of Things, diminishing or eliminating the need for third-party authentication. It offers a clear solution to scalability, single points of failure, time stamping, logging, privacy, trust, and reliability concerns. Through the utilization of smart contracts, IoT devices can participate in secure message exchange, simulating agreements between parties without centralized authorization. The fusion of blockchain and the IoT has prompted extensive research efforts. For instance, Kumar et al. [6] have proposed scalable blockchain frameworks, ensuring data integrity and secure transmission, with the added security layer of Ethereum smart contracts. BCoT (Blockchain for IoT), introduced by Banerjee et al. [7], delineates an architecture that amalgamates blockchain characteristics with IoT, outlining promising application prospects. Azbeg et al. [8] have delved into designing a secure medical system, addressing concerns such as security, scalability, and processing time. Their solution incorporates data hashing, smart contracts, and the Inter-Planetary File System (IPFS) to ensure data security and credibility.
Blockchain technology can provide inherent anonymity to all parties to a transaction; in other words, the nodes in the blockchain are all peers. The use of hash-value addresses as unique identifiers on the blockchain, while ensuring privacy, may pose challenges in facilitating seamless data flow between these parties. Decentralized Identifiers (DIDs) have emerged as a crucial solution to address identity concerns within decentralized blockchain systems [9]. DIDs play a pivotal role in minimizing the risk of user credential exposure by furnishing relevant contextual information based on the specific information that needs to be disclosed. In our innovative solution, DIDs are employed to establish and verify both user and data identities.
In order to offer greater flexibility, users have the option to deploy smart contracts for the transaction of data rights and reduce the need for direct data delivery. This forward-thinking approach, facilitated by federated learning [10], introduces a dynamic paradigm for the exchange of data rights. The blockchain serves as a comprehensive repository, where all transaction and storage information can be seamlessly queried, contributing to the overall traceability of IoT data [11]. This not only fortifies transparency but also establishes a foundation for building trust within the intricate ecosystem of IoT data transactions.
Building upon prior research findings, we present a framework dedicated to the secure management and transaction of IoT data based on blockchain technology. Our platform is designed to capitalize on the distinctive features of blockchain, creating an IoT data trading system that facilitates transactions involving data sourced from IoT devices and sensors. Our system enables individuals to acquire data from diverse IoT sources, ensuring both the secure transmission of data and the execution of payments through peer-to-peer (P2P) transactions. The proposed system allows users to upload summaries of data to the blockchain, providing them with a way to generate revenue. The underlying mechanisms of blockchain technology guarantee the integrity of transaction data and promote trust and reliability. The addition of smart contracts further improves the efficiency of transactions on the blockchain, simplifying the process and enhancing the user experience.
The remainder of this article is organized as follows: First, we introduce the technologies involved in the system we designed and analyze their inherent advantages and disadvantages to understand their implications. Next, we explain our system design, including the overall architecture, the blockchain composite layer, and the data transaction process. Following that, we introduce the practical implementation of the proposed system, test the relevant parameters, and discuss and analyze the results. Finally, we summarize the paper and offer suggestions for future research.
3. Proposed Hybrid Framework
Existing frameworks are mainly divided into two categories: Public-Chains (PB), based on Ethereum and others, and solutions based on Consortium Blockchain (CB). Our solution modifies the blockchain structure and content with reference to the Bitcoin source code, so that it is well adapted to the application scenarios of IoT data circulation. As stated in Table 1, in terms of transaction speed, BDIDA-IoT can appropriately reduce the difficulty of reaching consensus through the trust foundation of decentralized identity. In terms of supervision, the behavior of a user can be traced based on the DID. In terms of identity management, BDIDA-IoT combines the mechanism of decentralized identity, while PB requires additional compatibility.
BDIDA-IoT is a blockchain platform that references the source code of Bitcoin. Ethereum’s transactions require the additional cost of Gas, while the associated features deployed using smart contracts lack good scalability. In our solution, we can add different plug-ins to support more services as needed. Related functions can be exposed as APIs in the future to support more development needs.
In terms of trust mode, PB allows any user to access the blockchain network and ensures the trustworthiness of the block through a high degree of difficulty. CB requires users to register with a third party to demonstrate the trustworthiness of their identity. BDIDA-IoT uses the DID scheme to map users and data into concrete DID Documents. The transaction information in the blockchain is replaced with a change in DID permissions, which means the nodes in the blockchain do not need to care about the identity information of other users, thus reducing the block time. In terms of data flow, PB and CB cannot directly identify the change of permission of data. BDIDA-IoT realizes the fine-grained permission changes of data through the unique identification pattern of DID and the addition of related data structures.
Issues such as data tampering, unauthorized access, and data privacy exist in traditional IoT architectures. In terms of data tampering, a data digest is calculated for the IoT data and added to the DID Document before it is uploaded to the chain. During the data transfer process, we can determine whether the data have been tampered with by checking whether the data digest is the same before and after the transfer, thereby ensuring data consistency. In terms of unauthorized access, we can check the Controller field in the DID Document to determine whether the user has the relevant permissions for the data, thereby solving the data permission problem. In terms of privacy protection, the original data are not stored on the blockchain. Users can query the entire life cycle of the data they own through the blockchain, and they can address their privacy concerns about the data by controlling the DID Document.
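The tampering and permission checks described above can be illustrated with a minimal sketch. It assumes SHA-256 as the digest algorithm and a plain dictionary as the DID Document; neither choice is specified in the text, and the function names are hypothetical.

```python
import hashlib


def data_digest(data: bytes) -> str:
    """Return a hex digest of the raw IoT data (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def is_untampered(received: bytes, digest_on_chain: str) -> bool:
    """Compare the digest of the delivered data with the digest recorded on-chain."""
    return data_digest(received) == digest_on_chain


def has_permission(did_document: dict, user: str) -> bool:
    """Check whether the user appears in the Controller field of the DID Document."""
    return user in did_document.get("controller", [])


# Hypothetical sensor reading and DID Document used only for illustration.
reading = b'{"sensor": "t-01", "temp_c": 21.5}'
document = {"controller": ["user-a"], "digest": data_digest(reading)}
```

Only the digest and the Controller list need to be visible on-chain for these two checks, which is consistent with the original data staying off the blockchain.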
3.1. Framework Description
The proposed blockchain system framework in this study is shown in Figure 4. Based on the traditional three-tier structure of the IoT, the framework adds a blockchain composite layer. The addition of the blockchain composite layer adds a buffer between the application layer and the transport layer and relieves the pressure of the application layer when facing massive data delivery. The perception layer, at the bottom of the framework, consists of IoT devices large and small and is a key part of information collection. The transport layer is a network cluster with seamless connection and all-round coverage of the IoT, and its main function is to transmit information acquired by the perception layer. The network transmission function of the transport layer overlaps with the network layer in the blockchain composite layer, which is set as a separate layer for convenient representation. The function of the application layer is information processing. The application layer, together with the lowest sensing layer, is the salient feature and core of the IoT. The application layer can calculate, process, and mine the data collected by the sensing layer, so as to realize the real-time control, accurate management, and scientific decision-making of the physical world.
In the proposed hybrid framework, IoT data can be transmitted to the sensing layer via IoT protocols and then be converted into the appropriate blockchain transaction format and transmitted to other nodes through the blockchain network. Blockchain nodes can verify the legitimacy of transactions and add them to the blockchain to ensure the security and immutability of the data. This combination can provide greater security, trust, and transparency to IoT systems, while enabling data exchange and sharing across devices and platforms.
In our framework, as more and more IoT devices connect to the system, the volume of data transaction information grows accordingly. In our design, a block can store up to 4000 transactions, each transaction can declare 500 MB of data, and each block can be up to 1 MB. This means that 1 GB of space can claim up to 1000 blocks and 500 GB of data, so our framework can scale to handle an increasing number of IoT devices and transactions. It is worth mentioning that as more and more IoT devices begin to connect to the blockchain network, the processing power and stability of our system will also increase.
In the blockchain composite layer, the data layer defines the data structure of the blockchain and the chain and formulates relevant standards for various information in the blockchain, such as the storage of blockchain information, Unspent Transaction Output (UTXO), and Data Stored Sets (DSSs). The network layer is responsible for the access and verification of blockchain nodes (users). The consensus layer is responsible for synchronizing local blockchain, UTXO, and DSS information between nodes; verifying the legality of each packaged block; and maintaining the operation sequence and fairness of the system. The incentive layer sets up relevant rewards for users who participate in maintaining the blockchain. A smart contract is a piece of code deployed on a blockchain, where users can deploy the corresponding smart contract to realize data-related functions. It differs from the application layer in that it is a piece of code on a blockchain and can provide lightweight and decentralized services.
The flow of data is based on DIDs. A DID should be issued by one or more trusted third parties, and it can be verified by other users and provide relevant verification services. In the blockchain system, users can be represented by public key information, which uniquely identifies a user and also serves as the identifier of the DID. As shown in Figure 5, a DID Document should include the Context, ID, Controller, Public-Key, Service, and Authentication fields. The content of Context is the version information of the DID protocol and some explanations about the DID, such as data size, data type, data organization form, and usage method. ID is the number of the DID, and a hash value is generated according to the content of the DID Document to represent it. Controller is the owner of the DID Document, and this field can contain multiple users. Public-Key is a collection of public keys used to verify a user’s ownership of the DID. Service is the relevant service information provided by the issuer of the DID, and the content is a URL. Authentication is the organization that issues the DID, and this field can be a collection. In the process of data flow, the fields of the DID Document corresponding to a set of data may change at any time. In particular, the Controller and Public-Key fields may need to be added to or modified. The changes are saved on the blockchain, and each change needs to generate a data digest pointing to the historical change record of the DID, so that we can perform relevant traceability operations.
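As a rough illustration of the DID Document fields and the digest that links each change to its history, consider the following sketch. The field types, the use of SHA-256 over a JSON serialization, the `previous_digest` field, and the helper names are all assumptions made for illustration; the text does not specify how the document is encoded or how the history pointer is stored.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import List, Optional


@dataclass
class DIDDocument:
    context: str               # protocol version and notes about the data
    controller: List[str]      # owners of the document (may be several users)
    public_key: List[str]      # keys used to verify ownership of the DID
    service: str               # URL of the service provided by the issuer
    authentication: List[str]  # organizations that issued the DID
    id: str = ""               # hash-derived identifier, set at creation
    previous_digest: Optional[str] = None  # hypothetical link to the prior version

    def digest(self) -> str:
        """Digest of the whole document, used to reference this version later."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def new_document(context, controller, public_key, service, authentication):
    """Create the first version; its ID is derived from the initial content."""
    doc = DIDDocument(context, list(controller), list(public_key),
                      service, list(authentication))
    doc.id = doc.digest()
    return doc


def add_controller(doc: DIDDocument, user: str, key: str) -> DIDDocument:
    """Return a new version with an extra controller, linked to the old version."""
    return replace(doc,
                   controller=doc.controller + [user],
                   public_key=doc.public_key + [key],
                   previous_digest=doc.digest())
```

Following `previous_digest` from the newest version back to the first one yields the change history of a DID, which is the traceability property the paragraph above relies on.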
3.2. Blockchain Composite Layer
The user’s data transaction process is realized at the blockchain composite layer. Users join our platform by connecting to the blockchain network. Whether or not a user is allowed to access the blockchain network was addressed in a study by Steichen et al. [23], who utilized smart contracts to maintain access control lists (ACLs), which enable network access control for users. In the blockchain network, users can earn coins by providing data and mining, and coins are used in our system to measure the trustworthiness of a user and also to reflect the user’s activity. There are two types of transactions in the transaction pool: storage transactions and data transactions. Users declare their data ownership to other users through storage transactions. A data transaction corresponds to the act of buying and selling data between users. Data transactions correspond to changes in UTXO, while storage transactions correspond to changes in DSS. We introduce the DSS to speed up queries on the amount of data and the smart contracts owned by users, so that we do not have to spend significant time traversing the blockchain.
3.3. Data Stored Set
In addition to maintaining UTXO locally, nodes that join the blockchain also need to maintain a Data Stored Set (DSS), as shown in Figure 6. The purpose of the DSS is to speed up queries on the amount of data that nodes have declared on the chain. Instead of traversing the entire blockchain and spending significant time, users can simply look up the information in the DSS.
The red transaction in the figure represents a storage transaction. Each storage transaction indicates that a certain address has declared 100 MB of data in the blockchain. The green transaction in the figure represents a contract transaction, which is the public information published after the user deploys a smart contract. Other users can obtain the information needed to call the smart contract by querying the contract transaction, which improves the efficiency of the transaction.
The DSS is a collection of hash tables that can be looked up through address-value pairs. When a user successfully publishes a storage transaction, each node in the blockchain needs to add the user’s address to the DSS and add the related records. Values are chained structures, where each block in the chain records information, time, data hash, and contract address. We stipulate that each storage transaction can declare only 100 MB of data at a time, and the data digest algorithm reduces each 100 MB of data to a 32 B digest. A block of the blockchain is 1 MB in size, with an 80 B block header and a variable-size block body. The transaction information is stored in the block body, which means that a block can contain up to 2.5 GB of data storage information, thus preventing the blockchain from being too long. When a user successfully publishes a storage transaction, the node first locates the user’s address in the DSS. If there is no record of the address, a new record is added. If it exists, the chain structure in the value corresponding to the address is traversed and a new block is linked to it. Each block represents 100 MB of data for the address. Querying a user’s data volume therefore amounts to traversing the chain structure of that address.
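The lookup and update rules of the DSS can be sketched as follows. This is only an illustrative model: a Python dictionary stands in for the hash table, a list stands in for the chained structure, and the class and method names are hypothetical. The 100 MB per record follows the rule stated above.

```python
from dataclasses import dataclass
from typing import Dict, List

MB_PER_RECORD = 100  # each storage transaction declares 100 MB of data


@dataclass
class StorageRecord:
    info: str                   # free-form description of the declared data
    time: int                   # timestamp of the storage transaction
    data_hash: str              # digest of the declared data
    contract_address: str = ""  # empty when no smart contract is deployed


class DataStoredSet:
    """Address -> chain of storage records (a list stands in for the chain)."""

    def __init__(self) -> None:
        self._table: Dict[str, List[StorageRecord]] = {}

    def add(self, address: str, record: StorageRecord) -> None:
        # Create the entry if the address is new, then link the record.
        self._table.setdefault(address, []).append(record)

    def data_volume_mb(self, address: str) -> int:
        # Traverse the chain for the address; each record counts for 100 MB.
        return sum(MB_PER_RECORD for _ in self._table.get(address, []))

    def contract_addresses(self, address: str) -> List[str]:
        # Collect the contracts announced by this address, if any.
        return [r.contract_address
                for r in self._table.get(address, [])
                if r.contract_address]
```

A query for an unknown address returns a volume of zero without touching the blockchain itself, which is the speed-up the DSS is meant to provide.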
3.4. Transaction Process
Figure 7 is an overview of the proposed transaction process, which is mainly composed of various edge computing devices, the transaction pool, and the blockchain. We assume that all nodes have already joined the blockchain network. The nodes in Figure 7 represent individual user entities with data; miners can be entities that do not own data but have enough computing power, or they can be entities that own data. Miners are the main workers that maintain the normal operation of the blockchain, and they receive corresponding rewards through mining. The transaction pool stores the data transactions and storage transactions published by the nodes; the blockchain stores information about activity between nodes.
If node A, which holds a large amount of IoT data, wants to provide data for other nodes to use in return for a reward, node A first needs to publish a storage transaction to the transaction pool. The storage transaction includes the hash value of the data, the timestamp, and node A’s address. Miners select the transaction from the transaction pool and package it into a block, which can be uploaded to the blockchain after other miners verify that the block is valid. When other nodes need to use the data, they can query the blockchain to see which nodes hold the data and then initiate transactions.
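The publication step can be sketched as below. The sketch assumes SHA-256 for the data hash and models the transaction pool as a plain list; the per-block limit is passed in as a parameter, and block verification by other miners is left out. All names are hypothetical.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StorageTransaction:
    data_hash: str  # hash value of the declared data
    timestamp: int  # time at which the transaction was created
    address: str    # address of the publishing node


def make_storage_transaction(data: bytes, address: str) -> StorageTransaction:
    """Build a storage transaction; only the hash of the data leaves the node."""
    return StorageTransaction(hashlib.sha256(data).hexdigest(),
                              int(time.time()), address)


def package_block(pool: List[StorageTransaction],
                  max_tx: int) -> List[StorageTransaction]:
    """Move up to max_tx transactions from the pool into a new block body."""
    block = pool[:max_tx]
    del pool[:max_tx]
    return block
```

For example, a node would append `make_storage_transaction(raw_bytes, "node-a")` to the pool, and a miner would later call `package_block(pool, max_tx)` to obtain a candidate block body.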
If node A needs to use the local data of node B, the process is as follows: Node A searches the DSS to see whether node B has enough data and whether a smart contract has been deployed. If there is enough data and a smart contract has been deployed, node A can call the smart contract directly to complete the relevant transaction. If there is no smart contract, node A sends the transaction to the transaction pool to wait for confirmation. The transaction information stores the address of node B, the address of node A, the payment amount, and the timestamp. Miners select transactions from the transaction pool and package them into a block, which can be uploaded to the blockchain after other miners verify that the block is valid. A successful upload to the blockchain means that the transaction is completed, and node B then sends its local encrypted data to node A through the transport layer.
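The decision logic of this process can be outlined in a short sketch. The DSS is reduced here to a dictionary of per-seller summaries, and the returned status strings are purely illustrative. The text does not say what happens when the seller has too little data, so the sketch assumes the request is simply dropped.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SellerEntry:
    data_mb: int                             # data volume declared in the DSS
    contract_address: Optional[str] = None   # set if a contract is deployed


@dataclass
class DataTransaction:
    seller: str     # address of node B
    buyer: str      # address of node A
    amount: int     # payment amount
    timestamp: int  # time of the request


def request_data(dss: Dict[str, SellerEntry], pool: List[DataTransaction],
                 buyer: str, seller: str, needed_mb: int,
                 amount: int, now: int) -> str:
    """Decide how the buyer obtains data from the seller."""
    entry = dss.get(seller)
    if entry is None or entry.data_mb < needed_mb:
        return "insufficient-data"  # assumption: the request is abandoned
    if entry.contract_address is not None:
        # A deployed contract can be called directly to complete the trade.
        return "call-contract:" + entry.contract_address
    # Otherwise the transaction waits in the pool for miners to confirm it.
    pool.append(DataTransaction(seller, buyer, amount, now))
    return "queued"
```

Only the "queued" branch touches the transaction pool; the contract branch bypasses it, which is where the efficiency gain of deploying a smart contract comes from in this model.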
Meanwhile, node B can also deploy smart contracts on the blockchain. Smart contracts are a large part of why blockchain is called “decentralized”, allowing us to perform traceable, irreversible, and secure transactions without the need for third parties. Once a smart contract has been uploaded to the chain, all nodes connected to the blockchain can execute its code locally, which carries out the obligations between the parties to the contract. In other words, node A can execute the smart contract deployed by node B on the blockchain to complete the data transaction process, which makes the whole transaction more decentralized and automated.
In the above process, transactions between users do not require trust endorsement from a third party, and user data only needs to be stored locally, which greatly reduces the risk of data leakage. At the same time, users only need to upload their own address, hash value, and timestamp of the data, and then they can start transactions between other nodes, which greatly reduces the communication overhead of the IoT data flow in the blockchain network. It is worth noting that after the user uploads the data hash value, the hash value will be compared with the data hash value delivered during the transaction to verify whether the data has been tampered with, which further improves the reliability of the IoT data.
5. Conclusions
In this paper, we propose a blockchain-based trusted data management and trading platform for the IoT. After analyzing the credibility of the platform, the data provider, and the data provider’s autonomy over the data in the traditional data management mode of the IoT, we consider two kinds of transactions, namely payment transactions and storage transactions, and realize the on-chain registration and trusted trading of IoT data by utilizing the characteristics of blockchain. At the same time, in order to improve transaction automation and efficiency, we designed a smart contract to complete this process. Finally, we implemented the proposed system.
While the experimental tests show positive results, in an actual IoT data flow scenario the participants in a transaction do not belong to the same network environment, so the time to reach consensus on the blockchain will increase. In addition, our tests were carried out with a small number of nodes. In a real-world scenario, the number of nodes participating in the blockchain would be higher, which would increase the data processing burden on each node and reduce performance. A running blockchain system may also face growing demand, so it is necessary to consider system scalability and a more convenient way of updating versions.
In terms of system scalability, we have layered the system, and extensions can be deployed at the application layer in the form of plug-ins. At the same time, our blockchain platform is modified with reference to the source code of Bitcoin, and the relevant data structures can be flexibly modified to adapt to more needs. In the future, we will expose the various functions of the blockchain as APIs, so that the application layer only needs to use these APIs to achieve good scalability.
Future work can proceed in the following directions:
- (1) The experiment in this article was conducted on a single machine running multiple nodes. In the future, we will consider using clusters or edge computing services for deployment to further verify the performance of the system.
- (2) In the future, more physical equipment can be used to test the reliability and throughput of the system.
- (3) Future research can try to improve scalability and support the integration of more IoT applications. At the same time, the content of our blockchain platform is modifiable, which means we can add more services at the application layer through plug-ins.
- (4) Future research can consider adding a federated learning framework to deal with the distributed training scenarios of large-scale IoT data and achieve this expansion by dividing the roles of nodes in the blockchain system.