1. Introduction
The Internet of Things (IoT) has become a cornerstone in the evolution of a digitally connected world, enabling various sectors to collect and analyze data in real time [1]. By embedding sensors and software in physical objects, IoT technologies allow for unprecedented levels of monitoring and automation, paving the way for more innovative and efficient systems [2]. One of the most impactful applications of IoT is in the domain of Intelligent Transportation Systems (ITSs). An IoT-enabled ITS aims to optimize traffic flow, improve road safety, and enhance the overall transportation experience for individuals and logistics providers [3] through interconnected sensors, vehicles, and traffic management tools. These systems are becoming particularly crucial in urban environments, where managing complex, congested networks is a growing challenge [4]. One of the most formidable challenges, and opportunities, posed by IoT-enabled ITSs is the generation of voluminous and highly complex data, often called Big Data [5]. These systems employ interconnected sensors, vehicles, traffic lights, and other IoT devices that continuously collect and transmit real-time data. The data can range from vehicle speed and location to weather conditions, road quality, and driver behavior [6]. The diversity of data types, including structured, semi-structured, and unstructured data, adds another layer of complexity.
The data is generated at an unprecedented velocity, requiring rapid processing for actionable insights. Given the velocity, volume, and variety, which are the three Vs of Big Data, it becomes evident that traditional data processing systems are not equipped to handle the complexities of data flow and analytics in an IoT-enabled ITS [7]. This enormous scale and complexity of data not only necessitates more advanced Big Data analytics but also makes it imperative to address challenges related to data storage, privacy, integration, and real-time processing [8]. IoT-enabled ITSs inherently generate colossal amounts of data due to the continuous real-time collection and transmission of various types of information [9]. This ever-growing mountain of data falls under the category of Big Data, characterized by its high velocity, volume, and variety [10]. While Big Data provides opportunities for deep analytics and insights, it also presents many challenges. Traditional data processing frameworks often need to be revised to handle this data’s sheer scale and complexity, which requires real-time analysis for actionable insights [11]. Furthermore, integrating disparate data types and sources adds a layer of complication to the analytical processes. One of the most pressing challenges in dealing with Big Data from IoT-enabled ITSs is the issue of data privacy.
Since individual vehicles and devices contribute sensitive information, the centralized collection and processing of these data raise significant privacy and security concerns [12]. Federated Learning (FL) offers a novel approach to tackling data privacy challenges. In a federated model, machine learning algorithms are trained across multiple decentralized devices or servers holding local data samples without exchanging them [13]. This allows for effective model training while ensuring that the data remain on the local device, thereby maintaining individual privacy. Unlike the traditional centralized machine learning approach, FL transmits only model updates, rather than raw data, between participating entities. This decentralized approach addresses significant privacy and security concerns, especially in domains with sensitive data. Its core benefit lies in enabling machine learning on edge devices, preserving data ownership, and minimizing data transmission overheads.
Big Data analytics using FL offers a transformative approach that addresses critical challenges like data privacy, scalability, and real-time analysis. By allowing machine learning models to be trained across multiple decentralized devices or servers, FL eliminates the need to move data to a central location. This ensures privacy compliance, optimizes resource usage, and enhances model robustness. Moreover, FL can handle real-time data analytics, non-IID data distributions, and data imbalance and heterogeneity, making it a promising solution for future Big Data analytics [14]. FL thus emerges as a potent solution for analyzing the Big Data produced in the IoT-enabled ITS environment. An ITS produces a wealth of data from various IoT sensors embedded in the transportation infrastructure and vehicles. FL offers a decentralized approach to model training, allowing these devices to perform localized analytics without sending sensitive or voluminous data to a central server [15].
This addresses data privacy concerns and adds a layer of efficiency and real-time responsiveness crucial for transportation systems. The localized analytics provided by FL can lead to more accurate and personalized models, better traffic management, and enhanced safety measures, positioning FL as a critical enabler for more innovative and more secure IoT-based ITSs. One well-known constraint in Federated Learning is the network bandwidth that limits the rate at which local updates from different organizations can be combined in the cloud. To mitigate this, Federated Averaging (FedAvg) uses local data for gradient descent optimization before conducting a weighted average aggregation of the models uploaded by each client. The algorithm proceeds iteratively, updating the global model in each training round based on the contributions from participating organizations. Given these challenges and opportunities, this paper proposes an improved Big Data analytics architecture incorporating FL for IoT-enabled ITS. Using FL, our architecture aims to provide robust, real-time analytics while preserving user privacy. This approach will lead to more efficient data management in ITS, providing a scalable and effective solution for modern urban transportation systems. The contributions of this work include:
We introduce a Big Data analytics architecture that synergizes FL and IoT-enabled ITS to address critical issues such as data integration, data processing, and privacy, offering comprehensive solutions within the architecture.
Diverging from conventional Federated Averaging techniques, we introduce a more personalized algorithm.
We enhance the FedAvg algorithm with several personalization methods: local fine-tuning and weighted averaging tailor the global model to individual client data, custom learning rates further boost performance, and regular evaluations maintain model efficacy.
In personalized Federated Averaging, individual contributions from clients are weighted based on their data volume and model performance. We combine an improved FedProx with FedAvg for robust aggregation that accounts for system heterogeneity and stragglers, and we deploy adaptive aggregation techniques that factor in the attributes of client updates for a better-informed global update.
We execute a broad range of tests using real-world data to prove the efficacy of our suggested strategies.
The remainder of this paper is structured as follows:
Section 2 reviews related works in ITS, Big Data analytics, and Federated Learning.
Section 3 describes the proposed architecture.
Section 4 presents our empirical findings.
Section 5 provides a discussion on findings, and
Section 6 concludes the paper while providing directions for future research.
2. Literature Review
Big Data analytics involves examining, cleaning, transforming, and modeling data to discover useful information, draw conclusions, and support decision making. With the increasing importance of data privacy and distributed data sources, FL is emerging as a powerful tool that complements traditional Big Data analytics. Utilizing FL techniques in Big Data analytics allows for decentralized model training across a myriad of data sources without the need for central data aggregation. This provides an efficient and privacy-preserving mechanism for harnessing insights from vast amounts of data scattered across multiple locations or organizations. Unlike traditional machine learning, FL enables model training across multiple decentralized nodes without requiring raw data to be shared centrally, thus ensuring data privacy and reducing data movement [16]. One of the main challenges in Big Data analytics is data privacy. FL stands out as a privacy-preserving method since it enables model training without requiring raw data to be transferred to a central server, aligning with privacy regulations like GDPR and HIPAA [17]. Big Data is often characterized by its enormous volume and the speed at which it is generated. The scalability of FL allows it to handle the challenges of Big Data efficiently by facilitating decentralized training across multiple nodes.
The architecture of FL enables real-time data analytics because data is analyzed at the source, avoiding the latency involved in sending the data to a centralized location for processing. This feature is crucial for applications requiring immediate insights [18]. Traditional Big Data analytics often requires the assumption that data are independently and identically distributed (IID). FL can handle non-IID data distributions, enabling more personalized and accurate model training. Data transfer over the network is resource-intensive [19]. FL alleviates this issue by localizing the data and reducing the need to send data over the network. Instead, model updates are the only information exchanged, conserving computational resources [20]. In Big Data analytics, one of the goals is to generalize findings across diverse and complex datasets. FL contributes to model robustness by aggregating learning from diverse data sources. The issue of imbalanced and heterogeneous data is also present in Big Data analytics. FL can adapt to these challenges due to its flexible and distributed architecture. Federated Learning presents a promising avenue for tackling the challenges of Big Data analytics, offering solutions for data privacy, scalability, real-time analysis, and more [21]. Its features complement the goals of Big Data analytics, paving the way for more secure and efficient data analysis techniques.
The exponential growth of the IoT has been a driving force behind significant advancements in ITS, notably in urban environments. ITS leverages advancements in communication technologies and data analytics to enhance the efficiency and intelligence of transport networks, making them more responsive to the needs of modern urban environments. One of the groundbreaking integrations in ITS is the incorporation of FL. By leveraging FL, ITS can enable vehicles and transportation infrastructure to engage in collaborative learning. This collaboration is pivotal in optimizing traffic flow, enhancing safety protocols, and improving the overall efficiency of travel routes [22]. A standout feature of this approach is its emphasis on data privacy. Unlike traditional systems, FL does not require data generated by individual vehicles or sensors to be sent to a central repository. Instead, learning and model improvements occur at the edge, so the predictive models are built and improved while the data remain decentralized. The envisioned model for ITS is a highly interconnected network in which vehicles are not isolated entities but part of a larger ecosystem, communicating continuously with each other and with intelligent infrastructure [23]. However, implementing such a vision is not without challenges. These challenges can be broadly categorized into four main areas:
Related Work
FedGRU, an algorithm that combines Federated Learning with Gated Recurrent Unit (GRU) networks, is proposed for privacy-focused traffic flow prediction [26]. This approach excels in both preserving privacy and prediction accuracy while employing Federated Averaging to reduce communication overhead. In contrast, another study integrates Federated Learning and blockchain technology to maintain data privacy and integrity in Intelligent Transport Systems (ITSs), using a blockchain-based smart contract to securely aggregate threat-detection models trained on individual vehicles [27]. However, this approach shows a slight trade-off with a 7.1% decrease in detection accuracy and precision. A survey offers a comprehensive overview of combining blockchain and Federated Learning to address data privacy and security in the Internet of Vehicles (IoV), identifying key challenges and future research directions [28]. Similarly, a blockchain-based asynchronous Federated Learning scheme called DBAFL is introduced for intelligent public transportation systems [29]. This scheme balances efficiency, reliability, and learning performance using a committee-based consensus algorithm and a dynamic scaling factor.
A thorough review of Federated Learning applications in Connected and Automated Vehicles (CAVs) analyzes data modalities, evaluates various applications, and outlines future research directions [30]. Another study proposes a contextual client selection pipeline for Federated Learning in transportation systems, using Vehicle-to-Everything (V2X) messages to predict latency and select clients accordingly [31]. A Federated Learning framework designed for autonomous controllers in CAVs is introduced, presenting a novel algorithm called Dynamic Federated Proximal (DFP) that outperforms traditional machine learning solutions in various traffic scenarios [32]. The transformation of the Internet of Vehicles into Intelligent Transportation Systems through advancements like 5G networks is discussed, identifying key challenges such as scalability and data privacy while proposing Federated Learning as a solution [33]. A study addresses the non-identical data distribution across clients in Federated Learning systems, introducing a new FedOT scheme based on the Optimal Transport theory [34]. Lastly, communication challenges in Federated Learning within dynamic and dense vehicular networks are addressed, introducing a Communication Framework for Federated Learning (CF4FL) that reduces training convergence time by 39% [35].
Federated Optimal Transport (FedOT) is introduced to address data distribution issues in Federated Learning, validated through numerical tests [36]. Selective Federated Reinforcement Learning (SFRL) aims to improve the efficiency and adaptability of Connected Autonomous Vehicles through a unique selection process, confirmed by extensive simulations [37]. FedSup employs Bayesian Convolutional Neural Networks for fatigue detection in the Internet of Vehicles, showcasing reduced communication costs and improved training [38]. Federated Transfer-Ordered-Personalized Learning (FedTOP) is tailored for driver monitoring, demonstrating improved accuracy, efficiency, and scalability across two real-world datasets [39,40]. A Hybrid Federated and Centralized Learning (HFCL) framework merges the advantages of federated and centralized learning, achieving up to 20% higher accuracy and 50% less communication overhead [41]. Driver Activity Recognition (DAR) is explored through a Federated Learning model, showing competitive performance in centralized and decentralized settings while considering data privacy and computational resources [42].
While the existing body of literature extensively covers various aspects of Big Data using FL in IoT-enabled ITS, it primarily focuses on privacy and data distribution. A notable gap remains in exploring FL systems’ real-time adaptability and resilience to dynamic changes in performance and data distribution in vehicular settings. The literature generally does not offer an integrated, IoT-enabled Big Data architecture that addresses data integration and real-time processing while maintaining data privacy. Moreover, current studies often rely on generic Federated Averaging techniques and lack a personalized approach tailored to the unique data characteristics of individual clients in a vehicular network. Our work fills these critical gaps by introducing a Big Data analytics architecture that synergizes FL and IoT technologies for a more robust ITS. We diverge from conventional Federated Averaging techniques by introducing a personalized algorithm enhanced by local fine-tuning, weighted averaging, and custom learning rates. Custom learning rates refer to adjusting the learning rate during training rather than using a fixed rate. The learning rate is a hyperparameter that controls how much the model changes in response to the estimated error each time the model weights are updated. Because FL involves training on multiple decentralized devices or servers (clients) and aggregating the updates on a central server, the learning rate can be crucial in both the client and server updates. Additionally, we employ transfer and ensemble learning strategies to optimize pre-existing models for specialized tasks, thereby improving prediction accuracy. These contributions are empirically validated through a comprehensive suite of tests using real-world data.
3. Proposed Framework
Our proposed architecture is designed to seamlessly equip Big Data analytics with FL in an IoT-enabled ITS. It follows a personalized FL approach that tailors the global aggregation and averaging for improved performance. Various personalization methods are utilized to enhance the Federated Averaging (FedAvg) algorithm: local fine-tuning and weighted averaging tailor the global model to individual client data, custom learning rates further boost performance, and regular evaluations maintain model efficacy. The architecture comprises five major modules or layers, each addressing specific requirements to ensure robust, real-time analytics while preserving user privacy, and it leverages ensemble techniques to enhance model performance. The proposed model is depicted in Figure 1.
3.1. Big Data Preprocessing
The first layer of our architecture serves a crucial role in preprocessing the extensive volume of Big Data generated by IoT-enabled ITSs. In real-world scenarios, data often comes with a lot of ’noise’ that can adversely affect the performance of machine learning models. Hence, this step is crucial for maintaining the dataset’s integrity. It is worth noting that these preprocessing techniques were tailored explicitly for our dataset, which contained numerous missing values and needed modification to suit the problem of vehicle detection. These comprehensive preprocessing steps have been vital for preparing the dataset for further analytics, ensuring quality and making it amenable to solving complex problems like vehicle detection. This layer consists of four integral sub-modules designed to address specific challenges:
The Missing Values Management sub-module employs a sophisticated imputation algorithm to address the issue of data gaps. Given that our dataset had many missing values, this sub-module ensures that the dataset remains comprehensive and reliable for further analysis.
The Data Reduction sub-module comes into play to make the dataset more manageable in size and computational complexity. We utilize advanced techniques like Principal Component Analysis (PCA), which reduce the data’s dimensionality and retain the most critical features and relationships within the dataset. This is particularly important in Big Data analytics, where computational resources could become a bottleneck.
The Data Filtration sub-module is designed to enhance the data quality by using statistical methods to identify and remove noise and outliers.
Data encoding is a critical preprocessing technique in Big Data analytics and machine learning. It involves converting raw data into a format easily ingested and analyzed by data algorithms. The primary aim is to transform the data into a form that reduces complexity and size while retaining the essential features and relationships within the data. In Big Data, which often involves massive and heterogeneous datasets, encoding is vital for reducing storage space, improving computational efficiency, and enabling faster data processing.
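The sub-modules above can be illustrated with a minimal sketch. The text does not specify the imputation algorithm, the outlier rule, or the encoding scheme, so median imputation, a z-score threshold, and one-hot encoding are assumptions chosen for illustration; the function names and the sample readings are hypothetical, and PCA-based reduction is omitted for brevity.

```python
from statistics import mean, median, pstdev

def impute_median(column):
    """Fill missing readings (None) with the median of the observed ones."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]

def keep_inliers(column, z_max=2.5):
    """Return indices whose z-score magnitude does not exceed z_max."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return list(range(len(column)))
    return [i for i, v in enumerate(column) if abs(v - mu) / sigma <= z_max]

def one_hot(labels):
    """Encode categorical labels as one-hot rows, classes in sorted order."""
    classes = sorted(set(labels))
    position = {c: i for i, c in enumerate(classes)}
    rows = [[int(position[label] == j) for j in range(len(classes))]
            for label in labels]
    return rows, classes

# Hypothetical speed readings (km/h) with two gaps and one sensor glitch.
speeds = [42.0, None, 45.0, 44.0, 43.0, 400.0, 41.0, None, 46.0, 44.0, 43.0, 45.0]
filled = impute_median(speeds)          # Missing Values Management
kept = keep_inliers(filled)             # Data Filtration
clean = [filled[i] for i in kept]
encoded, classes = one_hot(["car", "bus", "car"])  # Data Encoding
```

The median is used for imputation here because it is not pulled toward the glitch value, which keeps the later z-score filter meaningful.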
3.2. Federated Data Preparation
Federated Learning requires data to be in a specific format. This layer transforms the pre-processed data into a format suitable for FL through data tokenization, batching, and serialization. Our methodological framework emphasizes the creation of meta-files, the arrangement of directories, and the strategic segmentation of data. Initially, we produced metadata to assist with efficient data mapping. Subsequently, our image directories were systematically organized for effortless data retrieval. We then divided our dataset into discrete ‘shards’ to facilitate experimental operations and established ordered dictionaries to guarantee consistent data access. In addition, our dataset was custom-formatted to support Federated Averaging, a vital element in FL that aggregates local model updates from multiple devices in a decentralized manner, enabling efficient global model training without compromising data privacy. Beyond that, we constructed a comprehensive directory hierarchy for image categorization, which aids in distributing data more evenly across client devices. Sorting images into separate class-specific directories allows machine learning algorithms to discern class-specific features better, which contributes to a decrease in classification errors and ultimately improves the overall accuracy and effectiveness of the model. The federated data preparation, including client map generation, is described in Algorithm 1.
Algorithm 1 Generate Clients Map for Federated Learning
procedure gen_clients_map(train_ds, size, alias)
    Initialize train_iter as iterator from train_ds
    Initialize empty ordered dictionary clients_map
    Compute batch_from_ds = floor(length(train_ds) / size)
    for index_ in 0 to size - 1 do
        Initialize empty lists data_list, label_list
        for batch in 1 to batch_from_ds do
            Retrieve image, label from train_iter.next()
            Convert label, image to numpy arrays
            Append image to data_list
            Append label to label_list
        end for
        Create ordered dictionary data with keys ’pixels’ (data_list) and ’label’ (label_list)
        Add data to clients_map with key client_alias_index_
        Clear Keras session
    end for
    return clients_map
end procedure
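As an illustration of Algorithm 1, the following is a minimal, framework-free sketch. It assumes `train_ds` is any iterable of (image, label) pairs, composes the client key from the alias and index in one plausible way, and leaves out the numpy conversion and the Keras session clearing, so it should be read as a simplification rather than the authors' implementation.

```python
from collections import OrderedDict

def gen_clients_map(train_ds, size, alias="node"):
    """Split (image, label) pairs into `size` equally sized client shards."""
    samples = list(train_ds)
    # Integer division: samples left over after an even split are not assigned.
    batch_from_ds = len(samples) // size
    train_iter = iter(samples)
    clients_map = OrderedDict()
    for index_ in range(size):
        data_list, label_list = [], []
        for _ in range(batch_from_ds):
            image, label = next(train_iter)
            data_list.append(image)
            label_list.append(label)
        # Assumed key layout; the pseudocode only names the parts of the key.
        key = f"client_{alias}_{index_}"
        clients_map[key] = OrderedDict(pixels=data_list, label=label_list)
    return clients_map

# Hypothetical toy dataset: ten one-pixel "images" with alternating labels.
toy_ds = [([i], i % 2) for i in range(10)]
clients = gen_clients_map(toy_ds, size=3, alias="veh")
```

With ten samples and three clients, each client receives three samples and the tenth is left unassigned, which mirrors the integer division in the pseudocode.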
3.3. Simulating Collaborative Learning Environment
To mimic real-world applications, a collaborative learning environment is simulated. This involves:
The creation of various clients: Various nodes or clients are created to simulate a distributed environment. A dynamic algorithm is proposed to vary the number of clients for multiple experiments and verification.
Data distribution: Data that have been pre-processed and formatted are allocated across these clients to mimic real-world conditions. A dynamic algorithm is also introduced to distribute the data among various nodes, allowing for varying data sizes and samples to be held by different nodes.
Within our FL procedure, we load directories pre-configured with positive, normal, and overall image data into memory for either the training or testing phase. These directories serve multiple roles, such as facilitating label assignment, determining batch size, resizing images, shuffling data, and managing color channels. This ensures optimized resource use while mitigating the risk of memory overload. By organizing images into batches for training, we align with FL’s standard practices for data segmentation and enable targeted performance evaluations, which are vital when dealing with classes of varying or imbalanced attributes. For the node-mapping aspect of FL, the dataset of images is transformed into an ordered dictionary, which allows TensorFlow objects to facilitate the partitioning of nodes, each receiving a distinct subset of the entire dataset. This function iterates through the dataset, dividing it into smaller batches that are subsequently allocated to simulated client nodes. This mimics a dispersed transportation data setting where each vehicle can access only a fraction of the collective dataset. By disseminating data among multiple nodes, our methodology replicates authentic FL conditions and supports decentralized model training. Each node conducts training on its specific data subset; afterward, improvements to the global model are synthesized from all the updates received from the nodes. This enhances the FL process as a whole. Furthermore, our proposed algorithm has unique dynamic node creation and data allocation capabilities. The number of nodes can be dynamically generated through the algorithm, offering flexibility in creating a scalable and adaptable collaborative learning environment. The process of creating the collaborators is depicted in Algorithm 2.
Algorithm 2 Creating Collaborators for Federated Learning Round
procedure creating_collaborators(dataset, client_data)
    Initialize Sample Size:
        Calculate sample_size as half of the total number of client IDs in client_data
    Sample Clients:
        Randomly select sample_size number of client IDs without replacement
        Store these in sampled_clients_ids
    Generate Sampled Client Datasets:
        for each client_id in sampled_clients_ids do
            Generate a TF dataset
        end for
    Preprocess Datasets:
        Preprocess each dataset using preprocess() function
        Store the preprocessed datasets in sampled_clients_data
    return sampled_clients_data
end procedure
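The client sampling in Algorithm 2 can be sketched without TensorFlow as follows. Here `client_data` is assumed to be a mapping from client ID to that client's local data, the `preprocess` argument stands in for the preprocess() function named in the pseudocode, and the `seed` parameter is a hypothetical addition that makes the sampling reproducible.

```python
import random

def creating_collaborators(client_data, preprocess=lambda ds: ds, seed=None):
    """Pick half of the clients at random and preprocess their local data."""
    rng = random.Random(seed)
    client_ids = list(client_data)
    sample_size = len(client_ids) // 2
    # random.sample draws without replacement, so no client is picked twice.
    sampled_clients_ids = rng.sample(client_ids, sample_size)
    sampled_clients_data = [preprocess(client_data[cid])
                            for cid in sampled_clients_ids]
    return sampled_clients_ids, sampled_clients_data

# Hypothetical six-client federation, each holding a tiny local dataset.
federation = {f"client_{i}": [i, i + 10] for i in range(6)}
ids, round_data = creating_collaborators(federation, seed=0)
```

Unlike the pseudocode, this sketch also returns the sampled IDs, which is convenient for logging which clients took part in a round.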
3.4. Client Model Training
Machine learning models, such as MLP, CNN, and VGG16, are deployed for training within each client. The system’s architecture incorporates two key strategies. First, it utilizes Transfer Learning by fine-tuning pre-existing models like VGG16 to better suit the specialized task. Second, it employs Ensemble Learning by integrating the outputs from multiple models, thereby enhancing the overall prediction accuracy. We employed Multi-Layer Perceptrons (MLPs) as a foundational algorithm to rigorously validate our hypotheses. An MLP is a class of artificial neural network that consists of at least three layers of nodes or neurons: an input layer, one or more hidden layers, and an output layer. As a supervised learning model, an MLP is trained using labeled data for tasks like prediction and classification. Each neuron within a layer is connected to every neuron in the following layer via weighted connections. To optimize global accuracy, we explored a variety of algorithms for different client nodes. We deployed MLPs in the client scenarios where we assessed they would yield favorable outcomes. In parallel, we also implemented other models, like CNN and VGG16, to enrich our proposed FL framework. During the FL collaborative environment simulation, each client model is trained on the local dataset specifically allocated to that client node. By generating a diverse set of clients, we could closely emulate real-world scenarios and observe how these different algorithms perform in a Federated Learning context. Each client’s learning rate is personalized based on its local loss landscape. Some nodes benefit from a faster learning rate, while others might need a slower one for better convergence.
3.5. Proposed Personalized Server/Global Aggregation
One well-known constraint in FL is the network bandwidth that limits the rate at which local updates from different organizations can be combined in the cloud. To mitigate this, FedAvg uses local data for gradient descent optimization before conducting a weighted average aggregation of the models uploaded by each node. The algorithm proceeds iteratively, updating the global model in each training round based on the contributions from participating organizations. Traditional centralized learning approaches merge data from different organizations into a single database. This results in considerable communication costs and risks to data privacy. To tackle these challenges, we introduce a privacy-preserving module equipped with a prediction algorithm for vehicle detection. Our solution starts by leveraging the FedAvg algorithm for parameter aggregation, collecting gradient data from various nodes. We then introduce an enhanced version of FedAvg to minimize communication overhead and perform efficient aggregation. This is particularly beneficial for large-scale and distributed prediction tasks.
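The standard FedAvg round described above, local gradient descent followed by a data-size-weighted average, can be sketched on a toy problem. The one-parameter linear model, the learning rate, and the client datasets below are hypothetical and serve only to make the round structure concrete; this is plain FedAvg, not the enhanced variant proposed in this section.

```python
def local_sgd(w, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on the squared error of y ~ w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def fedavg_round(global_w, clients, lr=0.1, epochs=5):
    """One round: every client trains locally, the server averages by data size."""
    local_models = [local_sgd(global_w, data, lr, epochs) for data in clients]
    sizes = [len(data) for data in clients]
    return sum(w * n for w, n in zip(local_models, sizes)) / sum(sizes)

# Hypothetical clients whose data all follow y = 2x but differ in size.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(1.0, 2.0)]]
w = 0.0
for _ in range(20):
    w = fedavg_round(w, clients)
```

Only the scalar model `w` travels between the clients and the server in each round; the (x, y) pairs never leave the client that holds them.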
The server layer aggregates the trained models from all clients to create a comprehensive global model. This is carried out using:
Individual contributions from clients are weighted based on their data volume and model performance.
An improved Federated Proximal (FedProx) algorithm combined with Federated Averaging is used for robust aggregation, accounting for system heterogeneity and stragglers.
Instead of simple averaging, we used weighted averaging, where the weights are determined based on each client’s data distribution, quality, or performance metrics. This will give more influence to clients with more relevant or high-quality data.
In place of straightforward averaging, we deploy advanced aggregation techniques that factor in the statistical attributes of client updates, such as variance or confidence intervals, for a better-informed global update.
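For readers unfamiliar with the Federated Proximal algorithm mentioned in the list above, the sketch below shows the standard FedProx local update, which adds a proximal term (mu / 2) * (w - w_global)^2 to each client's loss so that local models do not drift far from the global one. It illustrates the baseline FedProx idea on a hypothetical one-parameter model and is not the improved variant used in this work; `mu`, `lr`, and the data are illustrative assumptions.

```python
def fedprox_local(global_w, data, mu=0.1, lr=0.1, epochs=5):
    """Local training on y ~ w * x with a proximal pull toward global_w."""
    w = global_w
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        grad += mu * (w - global_w)  # gradient of (mu / 2) * (w - global_w) ** 2
        w -= lr * grad
    return w

# Hypothetical client whose local optimum (w = 2) is far from the global model.
data = [(1.0, 2.0)]
plain = fedprox_local(0.0, data, mu=0.0)    # no proximal term: plain local SGD
anchored = fedprox_local(0.0, data, mu=10.0)
```

A larger `mu` keeps the local model closer to the global model, which is what limits the influence of heterogeneous clients and partially trained stragglers.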
The aim of using weighted averaging is to consider data’s uneven distribution and quality across clients. By doing so, we prevent clients with minimal or low-quality data from dominating the global model update. Instead of uniformly averaging the model updates from each client, we assigned weights to each client’s update. The weights were calculated based on the client’s data distribution, quality, and training performance.
Weight Calculation: For client $i$, let $D_i$ be its data size, $Q_i$ represent the quality score (based on internal metrics), and $P_i$ represent its training performance. The weight $w_i$ for the client can be formulated as:
$$w_i = \alpha D_i + \beta Q_i + (1 - \alpha - \beta) P_i$$
where $\alpha$ and $\beta$ are hyperparameters determining the significance of data size and data quality, respectively.
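A minimal sketch of such a weighting follows. It assumes the weight is a convex combination of the three client scores, with the remaining share 1 - alpha - beta going to training performance; it further assumes that data size is expressed as a fraction of the total and that the final weights are normalized to sum to one. The hyperparameter values and client figures are hypothetical.

```python
def client_weight(size_share, quality, performance, alpha=0.4, beta=0.3):
    """Combine the three client scores; the remaining share weighs performance."""
    return alpha * size_share + beta * quality + (1 - alpha - beta) * performance

def normalized_weights(sizes, qualities, performances, alpha=0.4, beta=0.3):
    """Per-client aggregation weights that sum to one."""
    total = sum(sizes)
    raw = [client_weight(n / total, q, p, alpha, beta)
           for n, q, p in zip(sizes, qualities, performances)]
    norm = sum(raw)
    return [r / norm for r in raw]

# Hypothetical two-client example: the second client holds more and better data.
weights = normalized_weights(sizes=[100, 300],
                             qualities=[0.5, 0.9],
                             performances=[0.6, 0.8])
```

Expressing data size as a share keeps all three scores on a comparable 0 to 1 scale, so that `alpha` and `beta` are not dominated by raw sample counts.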
The idea behind adaptive aggregation is to consider the variations in model updates from different clients. By accounting for these attributes, we ensure a robust global model update. Instead of naive averaging, we integrated the statistical attributes of client updates to formulate the global update. This method ensures that outliers or divergent updates do not adversely impact the global model.
Aggregation Formula: Let $U_i$ be the model update from client $i$, and $\sigma_i^2$ be its variance. The aggregated update $U$ is then computed from the client updates and their variances, where $\lambda$ is a hyperparameter determining the influence of variance on the aggregation.
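A variance-aware aggregation of this kind can be sketched as follows. The aggregation formula itself is not reproduced here, so both choices in the sketch are assumptions: each client's coefficient is proportional to 1 / (1 + lam · variance), and the variance is taken over the entries of that client's update vector. The name `variance_aware_aggregate` and the parameter `lam` are hypothetical.

```python
import numpy as np

def variance_aware_aggregate(updates, lam=1.0):
    """Down-weight high-variance client updates (assumed inverse-variance form)."""
    stacked = np.stack([np.asarray(u, dtype=float) for u in updates])
    variances = stacked.var(axis=1)         # one variance per client update
    coeffs = 1.0 / (1.0 + lam * variances)  # lam = 0 recovers a plain mean
    coeffs = coeffs / coeffs.sum()
    return coeffs @ stacked, coeffs

# Two well-behaved updates and one divergent update (hypothetical values).
updates = [[0.1, 0.2, 0.1], [0.1, 0.3, 0.2], [5.0, -4.0, 6.0]]
aggregated, coeffs = variance_aware_aggregate(updates, lam=1.0)
print(coeffs)
print(aggregated)
```

Setting `lam` to zero reduces the scheme to naive averaging, while larger values suppress divergent updates more strongly.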
Our proposed algorithm for model aggregation combines the client updates to construct a unified global model, aiming for a well-balanced and precise representation of data from all collaborating clients. During each communication round, each device calculates a local update, which is then transmitted to a central server for aggregation. This loop continues until the model converges or a predefined number of communication rounds is reached. Training loss and accuracy are tracked in every round. The aggregation not only merges the local updates but also refines the global model by considering the volume and quality of each client’s data. Training is cyclical: each round selects a subset of client data for training and updates the server state accordingly, and performance indicators such as accuracy are logged continuously to show how the model fares over time. This strategy promotes decentralized and cooperative model training with fair and comprehensive data representation. Alongside the standard FedAvg, we also experimented with an optimized version of FedAvg to yield personalized outcomes, and we implemented an advanced version of FedProx to further improve global accuracy. The proposed algorithm is depicted in Algorithm 3.
Algorithm 3 Personalized Federated Averaging Algorithm
procedure personalized_algorithm(node_data, num_rounds)
    Initialize Metrics and Optimizers:
        Define Client and Server optimizer functions
    Specify Model Input:
        input_spec = get_input_spec(node_data)
    Build Federated Averaging Process:
        Build the federated averaging process using TFF
        Perform weighted averaging for global model customization
    Initialize Federated Averaging:
        Initialize state for federated averaging
    for round_num in range(num_rounds) do
        Create federated_train_data
        Update state and metrics
    end for
    Local Fine-Tuning:
        Fine-tune model locally, utilize custom learning rates
        Regularly evaluate model for efficacy
    Clear Session:
        Clear Keras session
    return losses, accuracy
end procedure
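The control flow of Algorithm 3 can be illustrated with a small framework-free simulation. This is a sketch under stated assumptions, not the TFF-based implementation: it replaces the detection model with a linear regression trained by gradient descent, weights clients by data size only, and reports per-round loss rather than accuracy. All function names, learning rates, and the synthetic data are hypothetical.

```python
import numpy as np

def local_train(w, X, y, lr, epochs):
    """Run full-batch gradient descent on mean squared error."""
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def personalized_fedavg(node_data, num_rounds, client_lr=0.05, local_epochs=5,
                        finetune_lr=0.05, finetune_epochs=20):
    w_global = np.zeros(node_data[0][0].shape[1])
    sizes = np.array([len(y) for _, y in node_data], dtype=float)
    coeffs = sizes / sizes.sum()  # data-size weights for aggregation
    losses = []
    for _ in range(num_rounds):
        # Each client trains locally, starting from the current global model.
        local_models = [local_train(w_global.copy(), X, y, client_lr, local_epochs)
                        for X, y in node_data]
        # The server forms the weighted average of the client models.
        w_global = sum(c * w for c, w in zip(coeffs, local_models))
        losses.append(sum(c * mse(w_global, X, y)
                          for c, (X, y) in zip(coeffs, node_data)))
    # Local fine-tuning: each client adapts the final global model to its data.
    personalized = [local_train(w_global.copy(), X, y, finetune_lr, finetune_epochs)
                    for X, y in node_data]
    return w_global, personalized, losses

# Three hypothetical clients whose true coefficients differ slightly (non-IID data).
rng = np.random.default_rng(0)
node_data = []
for shift in (-0.5, 0.0, 0.5):
    X = rng.normal(size=(200, 2))
    y = X @ (np.array([2.0, -1.0]) + shift) + 0.01 * rng.normal(size=200)
    node_data.append((X, y))

w_global, personalized, losses = personalized_fedavg(node_data, num_rounds=20)
print(losses[0], losses[-1])
```

In this toy setting the fine-tuned models fit their own clients at least as well as the shared global model, which is the motivation for the personalization step.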
5. Discussion
This research aimed to address the challenges arising from the heterogeneity of devices, the dynamic conditions of ITSs, and data privacy concerns in the Big Data landscape. Our proposed architecture leverages an optimized Federated Averaging strategy to address these issues, targeting scalability, real-time decision making, and data privacy preservation. Our empirical findings are aligned with these objectives. The accuracy of 93.27% underscores the model’s proficiency in real-time Big Data analytics and highlights its capability in a real-life federated environment. This result supports our claim that personalized approaches to Federated Averaging are effective and practical for modern ITSs utilizing Big Data.
The primary focus of our study has been the accuracy and robustness of the proposed architecture. Our personalized Federated Averaging method consistently achieved higher accuracy than the standard FedAvg, with a top accuracy of 93.27%, obtained with 10 nodes.
In terms of efficiency and convergence speed, the personalization techniques that improve accuracy can occasionally lengthen convergence slightly compared to the standard FedAvg. Nevertheless, the benefit of the improved accuracy outweighs the marginal increase in training time, especially given the critical nature of decision making in real-world ITS scenarios. The testing phase of our approach remained comparable in speed to the standard FedAvg, ensuring timely decision making. Our model therefore balances accuracy and efficiency, making it a promising solution for modern ITSs utilizing Big Data.
Concerning stability, it is essential to distinguish between brief variations and sustained stability. Although our technique shows larger short-term variations after convergence, its overall trend indicates that it outperforms the standard FedAvg over extended durations. The tailored strategy keeps the global model resilient even when data distributions differ across individual nodes. Notably, the fluctuations we observed stay within a range of 4–5%, without any significant deviations.
The proposed model shows more pronounced fluctuations in loss and accuracy after convergence compared to the standard FedAvg. This is because of the following factors:
Personalized learning approach: Our approach diverges from standard Federated Averaging by incorporating personalized techniques. While this results in better-tailored models for individual nodes, it can also introduce variability in the global model, especially when individual client models differ significantly.
Local fine-tuning and weighted averaging: We employ local fine-tuning and weighted averaging mechanisms that result in diverse local updates, contributing to oscillations during global model updates.
Custom learning rates: We use custom learning rates during local fine-tuning, which can sometimes lead to more pronounced fluctuations, especially when the learning rate is not optimally set for some training rounds.
6. Conclusions
This paper tackles the challenges posed by the intersection of Big Data analytics, the Internet of Things (IoT), and ITS. With data volume, variety, and velocity becoming increasingly formidable, traditional data analytics frameworks must be revised. Our research addresses this gap by introducing a comprehensive Big Data analytics architecture tailored for an IoT-enabled ITS. Leveraging FL, we address data integration, real-time analytics, and privacy concerns. Departing from conventional Federated Averaging methods, we adopt a more personalized approach that refines global models to better suit individual client data. This personalization is achieved through local fine-tuning, weighted averaging, and custom learning rates. Transfer and ensemble learning approaches further improve the model’s accuracy and robustness. Empirical validation using the Udacity Self-Driving Car Dataset supports the efficacy of our architecture in terms of scalability, real-time decision making, and data privacy preservation. Overall, this work advances the state of the art in FL and ITS by showing how personalized, real-time Big Data analytics can be conducted effectively in complex, dynamic urban transportation environments.
We attained accuracy levels of 93.27%, 92.89%, and 92.96% for our proposed model in a Federated Learning architecture with 10 nodes, 20 nodes, and 30 nodes, respectively. Accuracy thus stays within about 0.4 percentage points across the three client counts, showing the algorithm’s resilience as the network grows, whereas FL systems typically degrade as the number of clients increases. The architecture we present, built on an optimized Federated Averaging strategy, offers an effective solution for preserving data privacy.