Article

Hypervector Approximation of Complex Manifolds for Artificial Intelligence Digital Twins in Smart Cities

Sachin Kahawala, Nuwan Madhusanka, Daswin De Silva, Evgeny Osipov, Nishan Mills, Milos Manic and Andrew Jennings

1 Centre for Data Analytics and Cognition, La Trobe University, Bundoora, VIC 3086, Australia
2 Department of Computer Science, Luleå University of Technology, 971 87 Luleå, Sweden
3 Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Smart Cities 2024, 7(6), 3371-3387; https://doi.org/10.3390/smartcities7060131
Submission received: 12 October 2024 / Revised: 5 November 2024 / Accepted: 6 November 2024 / Published: 7 November 2024

Highlights

What are the main findings?
  • The proposed AI approach is effective at hypervector approximation of complex manifolds in smart city settings.
  • The Hyperseed algorithm can generate fine-grained local variations that can be tracked for anomalies and temporal changes, as well as incremental changes in dynamic data streams.
What is the implication of the main finding?
  • This approach can be integrated into AI digital twins that have to process complex manifolds of high-dimensional datasets and data streams generated by smart cities.
  • The interplay between digital twins and novel AI approaches is crucial in unpacking the complexities of urban systems and shaping sustainable and resilient smart cities.

Abstract

The United Nations Sustainable Development Goal 11 aims to make cities and human settlements inclusive, safe, resilient and sustainable. Smart cities have been studied extensively as an overarching framework to address the needs of increasing urbanisation and the targets of SDG 11. Digital twins and artificial intelligence are foundational technologies that enable the rapid prototyping, development and deployment of systems and solutions within this overarching framework of smart cities. In this paper, we present a novel AI approach for hypervector approximation of complex manifolds in high-dimensional datasets and data streams such as those encountered in smart city settings. This approach is based on hypervectors, few-shot learning and a learning rule based on a single-vector operation, which collectively maintain low computational complexity. Starting with high-level clusters generated by the K-means algorithm, the approach interrogates these clusters with the Hyperseed algorithm, which approximates the complex manifold into fine-grained local variations that can be tracked for anomalies and temporal changes. The approach is empirically evaluated in the smart city setting of a multi-campus tertiary education institution where diverse sensors, buildings and people movement data streams are collected, analysed and processed for insights and decisions.

1. Introduction

More than half of the global population currently resides in urban areas, and this rate is predicted to reach 70% by 2050 [1]. Smart cities have been proposed as an overarching and enabling framework that can support and sustain this rapid urbanisation [2,3]. However, smart cities are impacted by a unique set of challenges, including the development and interoperability of new technologies with legacy systems; the safety, privacy and confidentiality of people, systems and operations; regulatory frameworks that promote citizen, community and organisational engagement in decision-making; and preparation for potential disasters and emergencies [4]. Smart cities have been defined and described by various research studies, organisations and government entities, and an aggregate of these definitions can be presented as, “urban environments that deploy and leverage innovative technology solutions and platforms to enhance the quality of life through improved urban services, reduced costs, and increased engagement across all stakeholders” [5,6].
Artificial intelligence (AI) has been recognised as a potential foundational technology, if not the strategic layer of technological infrastructure, that promotes innovation, efficiencies, productivity and resilience within smart city environments [7,8]. With the recent developments in generative AI (Gen AI), the value and contribution of AI-generated content in improving workflows, increasing engagement, fostering innovation and reducing complexities within smart city environments have been amply demonstrated and studied [9,10]. Collectively, AI and Gen AI have been defined by the Organisation for Economic Co-operation and Development (OECD) as “a machine-based system that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments. Different AI systems vary in their levels of autonomy and adaptiveness after deployment” [11]. Despite this collective definition, the distinction between narrow AI (or task-oriented AI) and Gen AI is crucial in recognising the potential application areas and domains of relevance for AI in smart cities [12]. Narrow AI can be further deconstructed into the core capabilities of prediction, classification, association and optimisation. Diverse applications have been researched and reported in smart city settings, such as energy systems [13,14], intelligent transport [15,16] and digital health [17,18]. Alongside its transformative potential, AI is also being scrutinised for the risks posed by bias, inaccuracies, lack of transparency, lack of accountability and potential harm to physical and mental health [19]. Responsible AI adoption and AI regulation are making headway in addressing some of these risks [20,21].
Digital twins are a complementary technology capability to AI that has also found diverse and effective applications in smart city settings. Digital twins are defined as “virtual representations (or replicas) of a physical asset—an object, person, process or system that can be used to simulate behaviours to improve our understanding and management of this entity” [22,23]. The Internet of Things (IoT) and edge computing have accelerated the development and adoption of digital twins, which capture the physics of structures and their evolving conditions through sensor and IoT measurements processed at the edge [24,25]. Digital twins run simulations that explore limitations and challenges, as well as options for improvement that can be applied to the physical asset through service updates [26,27]. Digital twins for smart cities have been deliberated in various studies, including the inclusion of socio-economic components [23], virtual feedback loops where citizens can interact and report feedback on planned changes in the city [28], disaster management [29], transportation and smart grids [30] and climate-adaptive resilience [31].
A computational challenge in smart city digital twins is in the complexity and large volumes of data that need to be processed, analysed, learned and leveraged for AI capabilities such as predictions and classifications [32,33]. Smart cities by definition are large urban areas that will continue to grow and evolve as populations within those settings increase, and this means the volumes and complexities of data will continue to grow. These data typically exist in high-dimensional spaces and can exhibit intricate, non-linear relationships. The diversity, magnitude and variability of such datasets and data streams with intersecting properties can be recognised as “complex manifolds” [34]. A manifold is a mathematical structure that locally resembles Euclidean space but can have more intricate global properties, making it a useful concept for understanding non-linear data landscapes [35]. In the context of smart cities, manifolds can model multiple interacting components, accommodating non-linearities and relationships between various urban systems such as transportation networks, energy grids and technology platforms. Manifolds, including examples like Riemannian and Finsler manifolds, are defined over the field of real numbers and provide flexibility for modelling a wide range of complex systems. While these manifolds capture the complexity of urban environments, novel AI algorithms need to be developed to navigate and analyse these spaces, generating meaningful insights for predictions and classifications [35,36]. Recent work includes AI digital twins that mitigate undesirable emergent behaviours [37] and digital twin-driven prognostics for complex medical equipment [38]. Hypervectors are the encoded data vectors of hyperdimensional computing, an unconventional computing paradigm that aims to emulate brain-like processing, where the encoding process maps the input data from the scalar domain to a hyperspace. Hypervectors have very low computational needs compared to standard counterparts due to straightforward processing through simple logic elements. Hypervectors are also robust to errors and missing data, which makes them further ideal candidates for representing noisy and irregular data streams found in smart city settings.
In this paper, we present a novel AI approach for hypervector approximation of complex manifolds in high-dimensional datasets and data streams such as those encountered in smart city settings. This approach begins with the simple yet effective clustering capability of the K-means algorithm, followed by the Hyperseed algorithm [39] that unpacks each cluster to preserve internal structures and similarities while overcoming limitations in handling complex, non-linear manifolds. Additionally, we extend the Hyperseed algorithm to support incremental learning, enabling the model to adapt dynamically to continuous data streaming at varied frequencies and magnitudes. This approach is empirically evaluated within the smart city setting of a multi-campus tertiary education institution where diverse sensors, buildings and people movement data streams are collected, analysed and processed for insights and decisions. These experiments conducted on temperature sensor data streams collected at 15-min intervals from all buildings across campuses demonstrate its effectiveness in capturing local structures and incrementally adapting to new data streams.
The rest of this paper is organised as follows. Section 2 provides the necessary background on vector symbolic architectures (VSAs) and related concepts essential for understanding our proposed approach. Section 3 details the methodology, introducing the proposed approach and its incremental extension for approximating complex manifolds within high-dimensional datasets. Section 4 presents the experimental setup, including the dataset description, implementation details and the results obtained from applying our approach to real-world temperature sensor data from the La Trobe Energy Analytics Platform. Section 5 concludes this paper by summarising the key contributions and discussing the implications of our findings.

2. Background

A comprehensive review of hyperdimensional computing and hypervectors can be found in several recent studies. In this section, we outline only the aspects directly relevant to the proposed approach: hypervector binding, fractional power encoding and the resonator network. The bundling and ordering operations relevant to encoding input data involve combining multiple hypervectors to represent complex data structures and arranging them in a specific sequence to preserve temporal or spatial relationships. Bundling refers to the process of summing or combining hypervectors to aggregate information, while ordering ensures that the sequence of data points is maintained within the hyperdimensional space. These operations are integral to accurately encoding input data into hypervectors that capture the essential characteristics and relationships inherent in the original data [39].

2.1. Hypervector Binding

The binding operation is used to associate two hypervectors, where the result of the binding is another hypervector. For example, for two hypervectors $v_1$ and $v_2$, the result of binding their hypervectors (denoted as $b$) is calculated as $b = v_1 \circ v_2$, where the notation $\circ$ denotes the binding operation. In Holographic Reduced Representations [40], the binding operation was implemented as a circular convolution of $v_1$ and $v_2$, which can in turn be implemented as a component-wise multiplication in the frequency domain. This observation inspired the Fourier Holographic Reduced Representation (FHRR) model, where the atomic hypervectors are already in the frequency domain in the form of phasors, so that the component-wise multiplication, which is equivalent to the addition of phase angles modulo $2\pi$, plays the role of the binding operation. There are two important properties of this binding operation. First, the resultant hypervector $b$ is dissimilar to the hypervectors being bound, i.e., the similarity between $b$ and $v_1$ or $v_2$ is approximately 0. Second, the binding operation preserves similarity. That is, the distribution of the similarity measure between hypervectors from some set $S$ is preserved after binding each hypervector in $S$ with the same hypervector $v$. The binding operation is reversible and has been demonstrated in generalisations that require only integer operations for weight updates [41]. The unbinding, denoted as $\oslash$, is implemented via component-wise multiplication with the complex conjugate of the corresponding hypervector: $v_2 \oslash b = v_1$. Being the inverse of the binding operation, the unbinding has the same similarity preservation property when performed on all hypervectors in $S$ with the same hypervector $v$.
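To make these operations concrete, the following is a minimal sketch, under the assumption of a NumPy implementation of FHRR with unit-magnitude phasor components, of binding as component-wise multiplication and unbinding as multiplication by the complex conjugate; the dimensionality and helper names are illustrative choices, not taken from this paper.

```python
# A minimal FHRR binding/unbinding sketch (our illustration, not the authors' code):
# components are unit-magnitude phasors, binding multiplies them component-wise,
# and unbinding multiplies by the complex conjugate.
import numpy as np

d = 10_000                                     # hypervector dimensionality (illustrative)
rng = np.random.default_rng(0)

def random_phasor(d, rng):
    # Each component is e^(j*2*pi*U(0,1)): a random unit-magnitude phasor.
    return np.exp(1j * 2 * np.pi * rng.random(d))

def bind(a, b):
    # Component-wise multiplication == addition of phase angles modulo 2*pi.
    return a * b

def unbind(b, v):
    # Multiplying by the complex conjugate inverts the binding with v.
    return b * np.conj(v)

def cos_sim(a, b):
    # Real part of the normalised inner product of two complex hypervectors.
    return np.real(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

v1, v2 = random_phasor(d, rng), random_phasor(d, rng)
b = bind(v1, v2)
print(cos_sim(b, v1))               # approximately 0: bound vector is dissimilar to v1
print(cos_sim(unbind(b, v2), v1))   # approximately 1: unbinding with v2 recovers v1
```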

2.2. Fractional Power Encoding

Fractional power encoding (FPE) was first proposed in [42,43] as a method for representing subsymbolic data in a vector space. The method was further generalised to represent functions in a vector space in [44]. A data sample is encoded by exponentiating a fixed random base hypervector. Consider an example of transforming a two-dimensional data domain. Let x and y denote each dimension. To represent a value in each dimension, we first generate two random base hypervectors $x_0, y_0 \in \mathbb{C}^d$ as $x_0 \leftarrow e^{j \cdot 2\pi \cdot U(0,1)}$ and $y_0 \leftarrow e^{j \cdot 2\pi \cdot U(0,1)}$. In this context, d is set to a high value to ensure that each hypervector has sufficient dimensionality to capture the complexity of the data being encoded. The imaginary unit j is crucial for generating hypervectors with properties suitable for representing complex data structures in hyperdimensional space. Let us denote by $\beta$ the bandwidth parameter regulating the similarity between the adjacent values $(x_i, x_{i+1})$ and $(y_i, y_{i+1})$ in the vector space. Value i along the x and y dimensions will be transformed via FPE as $x_i = x_0^{\beta \cdot i}$, $y_i = y_0^{\beta \cdot i}$. FPE together with VSA operations allows representing multi-dimensional continuous-valued spaces by binding together the FPE hypervectors representing the values of each dimension. Hypervector $p_{(i,j)}$ representing a pair of values $(x_i, y_j)$ is constructed as $p_{(i,j)} = x_i \circ y_j$. With this transformation, the inner product between the hypervectors of any two data vectors represents a similarity kernel applied to the original data domain. Specifically, the kernel of FPE, when the base hypervector phases are generated in the interval $(0, 2\pi)$, is the sinc function: $K(x) = \frac{1}{2\pi}\,\mathrm{sinc}(\pi x)$. Figure 1 shows the similarity between the FPE of the values x = 15 and y = 15 and all other FPEs of integer values on the 50 × 50 2D grid.
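As an illustration of FPE, the following sketch, assuming the same phasor conventions as above, encodes a 2D point by exponentiating two base hypervectors and binding the results; the values of d and β are illustrative choices rather than settings from this paper.

```python
# A minimal fractional power encoding (FPE) sketch for a 2D data domain;
# exponentiating a phasor base hypervector scales its phase angles.
import numpy as np

d, beta = 10_000, 0.05                        # dimensionality and bandwidth (illustrative)
rng = np.random.default_rng(1)
phase_x = 2 * np.pi * rng.random(d)           # phase angles of base hypervector x_0
phase_y = 2 * np.pi * rng.random(d)           # phase angles of base hypervector y_0

def fpe(phase, value, beta):
    # Fractional exponentiation of a phasor base: x_0^(beta * value).
    return np.exp(1j * phase * beta * value)

def encode_point(x_val, y_val):
    # Bind the per-dimension FPE hypervectors to encode the pair (x, y).
    return fpe(phase_x, x_val, beta) * fpe(phase_y, y_val, beta)

p, q = encode_point(15, 15), encode_point(16, 15)
print(np.real(np.vdot(p, q)) / d)             # nearby values yield similarity close to 1
```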

2.3. Resonator Network

The resonator network [45] is a method for factoring compound hypervectors constructed by binding several hypervectors (e.g., $a \circ b \circ c$). Finding the factor hypervectors by brute-force checking of all possible combinations of the arguments is impractical, since the number of such combinations grows exponentially with the number of arguments involved. The resonator network depicted in Figure 2 solves this problem by a parallel search in the space of all possible combinations.
The resonator network assumes that none of the arguments are given but that they are contained in separate dictionaries (item memories), which are known to the resonator network. In a nutshell, the resonator network is a recurrent neural network that uses VSA principles to solve a combinatorial optimisation problem. Due to space limitations, we omit a detailed description of the concept and refer the reader to [45] for examples of factoring hypervectors of data structures and to [46] for a comparison with other standard optimisation-based methods.
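To give a sense of how such a parallel search proceeds, the following is a minimal sketch, under our own simplifying assumptions (FHRR phasors, random codebooks and a similarity-weighted cleanup against each item memory), of a resonator-style loop factoring $s = a \circ b \circ c$; it illustrates the idea rather than reproducing the reference implementation of [45].

```python
# An illustrative resonator-style factoring loop for s = a*b*c over three codebooks.
import numpy as np

d, m = 10_000, 20                                   # dimensionality, codebook size
rng = np.random.default_rng(2)
phasor = lambda shape: np.exp(1j * 2 * np.pi * rng.random(shape))

A, B, C = phasor((m, d)), phasor((m, d)), phasor((m, d))   # item memories
ia, ib, ic = 3, 7, 11                                      # ground-truth factor indices
s = A[ia] * B[ib] * C[ic]                                  # compound hypervector

def cleanup(est, M):
    # Project the estimate onto the item memory (similarity-weighted superposition),
    # then renormalise every component back to a unit phasor.
    proj = (M.conj() @ est) @ M
    return proj / np.abs(proj)

xa, xb, xc = (M.mean(axis=0) for M in (A, B, C))           # start from superpositions
for _ in range(30):
    xa = cleanup(s * np.conj(xb * xc), A)                  # unbind the other estimates
    xb = cleanup(s * np.conj(xa * xc), B)
    xc = cleanup(s * np.conj(xa * xb), C)

decode = lambda x, M: int(np.argmax(np.real(M.conj() @ x)))
print(decode(xa, A), decode(xb, B), decode(xc, C))         # expected: 3 7 11
```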

3. The Proposed Approach

The proposed approach aims to approximate a complex manifold by first applying the K-means algorithm to capture local neighbourhoods and then using the Hyperseed algorithm on these local clusters. Each cluster can subsequently be represented in a 2D map using FPE. This approach leverages FHRR representations and FPE to preserve the internal structure and similarities within each cluster. By applying K-means initially, the complexity of the manifold is reduced, allowing Hyperseed to operate more effectively within these local clusters, thereby mitigating its limitations when dealing with complex, non-linear manifolds. The result is an efficient factorisation and representation of the data. This combined method facilitates the management and analysis of high-dimensional datasets, such as smart city data streams, by capturing local structures and providing a meaningful 2D representation. The architectural view of the proposed approach is presented in Figure 3, where the approach begins with K-means clustering, followed by the Hyperseed algorithm and the incremental Hyperseed algorithm that allows the model to adapt as new data are introduced. The algorithmic flow of this approach is presented in Algorithm 1.

3.1. K-Means Clustering

To manage the complexity and structure of high-dimensional data, we first apply the K-means clustering algorithm to partition the data into K clusters. Each cluster represents a subset of the data that is homogeneously distributed in the high-dimensional space. K-means clustering is a widely used method for partitioning a dataset into a set of distinct, non-overlapping groups. It operates by iteratively assigning each data point to the nearest cluster centre and then updating the cluster centres as the mean of the assigned points. Despite its simplicity, K-means can effectively handle clustering tasks in high-dimensional spaces, although several challenges and considerations arise in such contexts. The K-means algorithm is summarised in Algorithm 2.
Algorithm 1 Algorithmic flow of the proposed approach
1: Input: High-dimensional dataset $X \subset \mathbb{C}^d$, number of clusters K
2: Output: Clustered and approximated high-dimensional data in VSA space
3: Step 1: K-means Clustering
4: Initialize K cluster centers randomly
5: repeat
6:   for each data point $x_i \in X$ do
7:     Assign $x_i$ to the nearest cluster center
8:   end for
9:   for each cluster $j \in \{1, 2, \ldots, K\}$ do
10:    Update cluster center $c_j$ as the mean of all points assigned to cluster j
11:  end for
12: until cluster centers converge
13: Step 2: Deploy Hyperseed on Clusters
14: for each cluster $j \in \{1, 2, \ldots, K\}$ do
15:   Initialize HD-map $P_j$ using FPE
16:   Apply Hyperseed algorithm to transform data hypervectors in cluster j to HD-map $P_j$
17: end for
Algorithm 2 K-means clustering
1: Input: Dataset $X = \{x_1, x_2, \ldots, x_n\}$, number of clusters K
2: Output: Cluster centers $C = \{c_1, c_2, \ldots, c_K\}$, cluster assignments $A = \{a_1, a_2, \ldots, a_n\}$
3: Initialize K cluster centers randomly
4: repeat
5:   for each data point $x_i \in X$ do
6:     Assign $x_i$ to the nearest cluster center $c_j$ using Euclidean distance
7:     Update cluster assignment $a_i \leftarrow j$
8:   end for
9:   for each cluster $j \in \{1, 2, \ldots, K\}$ do
10:    Update cluster center $c_j$ as the mean of all points assigned to cluster j
11:  end for
12: until cluster assignments no longer change significantly
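As a usage illustration (not the authors' implementation), a scikit-learn sketch of this clustering step is shown below; K = 12 and the 96-value daily profile shape follow the experimental setup in Section 4, while the data here are random placeholders.

```python
# Minimal K-means step over daily profiles (placeholder data, illustrative parameters).
import numpy as np
from sklearn.cluster import KMeans

profiles = np.random.rand(5000, 96)            # placeholder: one 96-value profile per row
kmeans = KMeans(n_clusters=12, n_init=10, random_state=0).fit(profiles)
labels, centers = kmeans.labels_, kmeans.cluster_centers_  # inputs to the Hyperseed step
```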

3.2. Hypervector Mapping

To ensure that the high-dimensional data in each cluster can be effectively represented in a VSA space, the next step is to map the original data hypervectors D to an HD-map P , as illustrated in Figure 4 using the unbinding operation, ensuring the preservation of the data’s internal similarity layout.
More formally, consider two vector spaces:
  • $D \subset \mathbb{C}^d$: contains distributed representations of data sampled from an arbitrary manifold with unknown structure.
  • $P \subset \mathbb{C}^d$: represents a subsymbolic data domain as an n-dimensional FPE tensor.
The goal is to construct P such that the transformation F ( D ) maps vectors from D to vectors in P , preserving the similarity layout. This ensures that the unknown similarity layout of D is accurately reflected in P .
$D \oslash s \rightarrow P$
Here, s is a seed hypervector obtained through an unsupervised learning rule, allowing the similarity structure of the original data to be preserved in the transformed high-dimensional (HD)-map.
The factorisation of the unbinding result is efficiently solved using the Resonator network, which leverages the similarity preservation property of the (un)binding operation. This approach allows for the effective clustering and representation of high-dimensional data, facilitating more efficient data processing and analysis. By combining K-means clustering with the Hyperseed algorithm, high-dimensional data can be approximated and managed more effectively, preserving its intrinsic structure and enabling scalable analysis.

3.3. The Hyperseed Algorithm

Hyperseed [39] is an unsupervised learning algorithm that maps original data hypervectors to an HD-map using Fourier Holographic Reduced Representations (FHRRs) and fractional power encoding (FPE). The detailed steps of the Hyperseed algorithm are provided in Algorithm 3. The computational complexity of Hyperseed is $O(I \cdot N \cdot d \cdot |P|)$, where I is the number of iterations, N is the number of training data hypervectors, d is the dimensionality of hypervectors and |P| is the size of the HD-map. When implemented on neuromorphic hardware, the search for the best matching vector (BMV) happens in constant time, further optimising the algorithm’s performance. Despite its low computational complexity, Hyperseed is primarily effective for datasets where the underlying data structure can be captured through relatively simple linear mappings. However, its performance degrades when dealing with more complex datasets that require mapping to non-linear manifolds. In these cases, where the data lie on or near a non-linear manifold with intricate feature interactions and high-dimensional correlations, Hyperseed may struggle to accurately capture the manifold’s structure, leading to reduced accuracy and robustness. This limitation impacts its applicability in scenarios where the data’s complexity is inherently non-linear and manifold-based.

3.4. Computational Complexity Analysis

The proposed approach integrates K-means clustering with the Hyperseed algorithm, aiming to enhance scalability and efficiency in handling high-dimensional data.

3.4.1. Complexity of the Proposed Approach

K-means clustering has a computational complexity of $O(t \cdot K \cdot N \cdot d)$, where t is the number of iterations, K the number of clusters, N the number of data points and d the dimensionality.
The Hyperseed algorithm operates with a complexity of $O(I \cdot \frac{N}{K} \cdot d \cdot |P|)$ per cluster, where I is the number of iterations and |P| the size of the HD-map. On neuromorphic hardware, this reduces to $O(I \cdot N) \cdot \tau$, with $\tau$ representing constant-time search operations.
Overall, the combined complexity is $O(t \cdot K \cdot N \cdot d) + O(I \cdot N \cdot d \cdot |P|)$, demonstrating linear scalability with respect to N and d.
Algorithm 3 Hyperseed algorithm
1: Input: Set of data hypervectors $D \subset \mathbb{C}^d$
2: Output: Seed hypervector s, HD-map P
3: Initialization:
4: Generate initial random unit hypervectors $x_0, y_0 \in \mathbb{C}^d$
5: Generate HD-map P using FPE:
6: for each grid coordinate (i, j) do
7:   $x_i = x_0^{\epsilon \cdot i}$
8:   $y_j = y_0^{\epsilon \cdot j}$
9:   $p_{(i,j)} = x_i \circ y_j$
10: end for
11: Initialize seed hypervector $s \leftarrow e^{j \cdot 2\pi \cdot U(0,1)}$
12: Learning Phase:
13: for each iteration do
14:   Search Procedure:
15:   for each data hypervector $d_i \in D$ do
16:     Compute noisy vector $p_i^* = d_i \oslash s$
17:     Find BMV in P with highest cosine similarity to $p_i^*$
18:   end for
19:   Weakest Match Search (WMS) Phase:
20:   Initialize lowest similarity variable $D_{min} = 1$
21:   for each $d_i \in D$ do
22:     Compute $p_i^*$ and find BMV in P
23:     Compute similarity $D_{BMV}$
24:     if $D_{BMV} < D_{min}$ then
25:       Update $D_{min} = D_{BMV}$
26:       Select $d_i$ for update
27:     end if
28:   end for
29:   Update Phase:
30:   Select target hypervector $p_{target}$ using FFTR heuristic
31:   Compute perfect mapping hypervector $d_i \oslash p_{target}$
32:   Update seed hypervector: $s = s + d_i \oslash p_{target}$
33: end for
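The following condensed sketch, under our own assumptions (NumPy phasors, and the FFTR target selection simplified to a random grid cell), illustrates how the search, weakest-match and update phases of Algorithm 3 fit together; it is a conceptual illustration rather than the released Hyperseed implementation.

```python
# Condensed Hyperseed learning loop: FPE HD-map, weakest-match search, seed update.
import numpy as np

d, grid, eps, iters = 10_000, 10, 0.01, 50      # illustrative parameter choices
rng = np.random.default_rng(3)
px, py = 2 * np.pi * rng.random(d), 2 * np.pi * rng.random(d)  # phases of x_0, y_0

# HD-map P: one FPE hypervector per grid cell, p(i, j) = x_0^(eps*i) bound with y_0^(eps*j).
P = np.array([np.exp(1j * (px * eps * i + py * eps * j))
              for i in range(grid) for j in range(grid)])

def hyperseed(D, P, iters):
    s = np.exp(1j * 2 * np.pi * rng.random(d))           # random seed hypervector
    P_norm = P / np.linalg.norm(P, axis=1, keepdims=True)
    for _ in range(iters):
        U = D * np.conj(s)                               # unbind every d_i with the seed
        U_norm = U / np.linalg.norm(U, axis=1, keepdims=True)
        sims = np.real(U_norm @ P_norm.conj().T)         # cosine similarity to each HD-map cell
        worst = int(np.argmin(sims.max(axis=1)))         # weakest-match search (WMS)
        target = P[rng.integers(len(P))]                 # simplified stand-in for the FFTR heuristic
        s = s + D[worst] * np.conj(target)               # bundle the "perfect mapping" correction
    return s

D = np.exp(1j * 2 * np.pi * rng.random((100, d)))        # placeholder data hypervectors
seed = hyperseed(D, P, iters)
```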

3.4.2. Comparison with Existing Manifold Learning Methods

Traditional manifold learning techniques such as t-SNE, UMAP and Isomap exhibit higher computational complexities, making them less scalable for large datasets. Specifically, t-SNE has a complexity of $O(N^2 \cdot d)$, which poses scalability challenges due to its quadratic growth with the number of data points. UMAP improves upon this with a complexity of $O(N \log N)$, offering better scalability than t-SNE but still higher than our proposed method for large N. Isometric Feature Mapping (Isomap), on the other hand, involves three main computational steps: K-nearest neighbour computation with a complexity of $O(N^2 D)$, shortest path computation using dynamic programming at $O(N^2 k + N^2 \log N)$, where k is the number of nearest neighbours, and multidimensional scaling (MDS) with a complexity of $O(N^2 d)$, where d is the dimensionality of the output space. Consequently, the overall complexity of Isomap simplifies to $O(N^2 \cdot \max(D, k, \log N, d))$, highlighting its quadratic dependence on N and thereby restricting its applicability to smaller datasets. In contrast, our approach maintains linear scalability, making it more suitable for large-scale datasets.

3.4.3. Advantages of the Proposed Approach

The integration of K-means and Hyperseed offers superior scalability, parallelisability and memory efficiency. Additionally, theoretical guarantees from K-means ensure convergence to a local optimum, while Hyperseed ensures stable transformation of hypervectors, especially when accelerated by neuromorphic hardware.

3.5. Incremental Hyperseed Algorithm

The proposed approach is further refined to accommodate new data samples in an online learning scenario, where dynamic updates drive clustering and representation mechanisms. This online learning process ensures that the system can adapt to new data samples by either updating existing clusters and HD-maps or creating new ones when necessary. This dynamic approach allows for continuous learning and representation of high-dimensional data while maintaining the efficiency and scalability of the system. The detailed steps of the incremental Hyperseed algorithm are provided in Algorithm 4.
For each new data sample, we perform the steps outlined in Algorithm 4.
Algorithm 4 Incremental Hyperseed algorithm
1: Input: New data sample $x_{new} \in \mathbb{C}^d$, existing cluster centers $\{c_1, c_2, \ldots, c_K\}$, threshold $\theta$
2: Output: Updated clusters and HD-maps
3: Step 1: Assign to Nearest Cluster
4: Assign $x_{new}$ to the nearest cluster center $c_k$
5: Step 2: Check Threshold Criterion
6: Compute the unbinding result $r = x_{new} \oslash s$, where s is the seed hypervector
7: Reconstruct $x_{new}$ as $\hat{r}$ using base vectors $b_1, b_2$ of the HD space
8: Compute the similarity $D(x_{new}, \hat{r})$
9: if $D(x_{new}, \hat{r}) \geq \theta$ then
10:   Update cluster $c_k$ with $x_{new}$
11:   Update HD-map $P_k$ using the Hyperseed algorithm
12: else
13:   Create a new cluster with $x_{new}$ as the initial center
14:   Initialize a new HD-map $P_{new}$ using FPE
15:   Apply Hyperseed algorithm to transform $x_{new}$ to HD-map $P_{new}$
16: end if
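A minimal sketch of the threshold test in Step 2, under assumed data structures (a cluster's seed hypervector and its HD-map as a NumPy array) and an illustrative θ, is given below; the reconstruction is approximated by the best-matching HD-map vector rather than an explicit decoding through the base vectors.

```python
# Decide whether a new sample's hypervector fits an existing cluster's HD-map
# (update it) or falls below the threshold (spawn a new cluster and HD-map).
import numpy as np

def cos_sim(a, b):
    return np.real(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

def needs_new_cluster(x_new_hv, seed, hd_map, theta=0.2):
    r = x_new_hv * np.conj(seed)                        # unbinding with the seed s
    best = max(hd_map, key=lambda p: cos_sim(r, p))     # best-matching HD-map vector
    return cos_sim(r, best) < theta                     # below threshold -> new cluster

# If needs_new_cluster(...) is True: initialise a new HD-map with FPE and run Hyperseed
# on it; otherwise update the matched cluster and its HD-map with the new sample.
```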

4. Experiments

The empirical evaluation was conducted on data streams from the La Trobe Energy AI Platform (LEAP), part of La Trobe University’s Net Zero Carbon Emissions Program. Located in Victoria, Australia, La Trobe University is a multi-campus, multi-functional tertiary education institution. LEAP serves as the core AI technology stack for the Net Zero Carbon Emissions Program, which aims to reduce the university’s carbon footprint to net zero emissions by 2029, alongside improving energy efficiency and increasing resource utilisation. These experiments were conducted on the data streams of energy usage, energy generation and internal temperature sensor data collected at 15-min intervals from all buildings within the university campuses, and they include post-COVID data from January 2023 to December 2023. For each experiment, we selected a single office building comprising 14 distinct zones, each equipped with temperature sensors. The temperature readings in these zones typically range between 15 °C and 30 °C. Data were collected at 15-min intervals, resulting in 96 readings per day per zone. Over the 12-month period, this setup generated approximately 490,560 temperature readings (96 readings/day × 14 zones × 365 days). The buildings are characterised by specific energy consumption patterns, where each day’s data are aggregated into a daily profile consisting of 96 readings (4 readings per hour).
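As an illustration of how such 15-min readings can be arranged into the 96-value daily profiles described above, the following pandas sketch assumes a flat table with timestamp, zone and temperature columns; the file name and column names are our assumptions, not those of the LEAP platform.

```python
# Aggregate 15-minute temperature readings into one 96-value profile per zone-day.
import pandas as pd

df = pd.read_csv("temperature_readings.csv", parse_dates=["timestamp"])     # assumed schema
df["slot"] = df["timestamp"].dt.hour * 4 + df["timestamp"].dt.minute // 15  # slot 0..95
profiles = (df.pivot_table(index=[df["timestamp"].dt.date, "zone"],
                           columns="slot", values="temperature")
              .to_numpy())                       # rows: zone-days, columns: 96 daily slots
```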

4.1. Baseline Results with t-SNE

We utilised t-SNE [47] to establish baseline results for empirical comparison. The t-distributed stochastic neighbour embedding (t-SNE) algorithm is a widely accepted dimensionality reduction technique for visualising high-dimensional data. By reducing the data to two or three dimensions, t-SNE helps in understanding the structure and distribution of the data, identifying clusters and detecting patterns. Given that our approach employs a piecewise approximation technique, which differs fundamentally from traditional manifold learning methods, a direct comparison with other such methods is not applicable. Therefore, t-SNE serves as an appropriate baseline for evaluating the effectiveness of our proposed algorithm.
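For reference, a scikit-learn sketch of such a baseline embedding is shown below; the perplexity value and the placeholder data are illustrative and not settings reported in this paper.

```python
# Baseline 2D t-SNE embedding of the daily temperature profiles.
import numpy as np
from sklearn.manifold import TSNE

profiles = np.random.rand(5000, 96)             # placeholder: 96-value daily profiles
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(profiles)
# embedding has shape (5000, 2) and can be plotted alongside the predefined profiles.
```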
The t-SNE visualisation (Figure 5) provides insights into the overall distribution and clustering of the temperature sensor data. This visualisation also includes predefined profile vectors, which are described as follows:
  • Low_Profile: a low consistent temperature profile, represented by a constant temperature of 15 °C.
  • High_Profile: a high consistent temperature profile, represented by a constant temperature of 27 °C.
  • Average_Profile: an average consistent temperature profile, represented by a constant temperature of 20 °C.
  • Typical_8-5_Profile: a typical daily temperature profile, with lower temperatures before 8 a.m. and after 5 p.m. and slightly higher temperatures (21 °C) during the 8 a.m. to 5 p.m. working hours.
These predefined profiles serve as benchmarks for identifying typical patterns and deviations within the dataset. By comparing the data points to these reference profiles, we can more effectively assess the performance of the projection algorithms and identify anomalies or patterns that require further investigation.
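For concreteness, the four reference profiles listed above can be constructed as 96-value vectors as in the sketch below; the 18 °C off-hours value of the Typical_8-5_Profile is our assumption, since the text only states that temperatures are lower outside working hours.

```python
# Construct the predefined 96-value (15-minute resolution) reference profiles.
import numpy as np

slots = np.arange(96)                               # 15-minute slots in a day
low_profile     = np.full(96, 15.0)                 # constant 15 degC
high_profile    = np.full(96, 27.0)                 # constant 27 degC
average_profile = np.full(96, 20.0)                 # constant 20 degC
working_hours   = (slots >= 8 * 4) & (slots < 17 * 4)      # 8 a.m. to 5 p.m.
typical_8_5     = np.where(working_hours, 21.0, 18.0)      # off-hours value assumed
```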

4.2. Local Clusters with Hyperseed

In the first experiment, we apply the proposed algorithm to visualise local clusters within the temperature sensor data. This method is particularly effective in capturing the subtle patterns of the local clusters, which might be overlooked by traditional clustering algorithms. For K-means, we use 12 as the number of clusters to initially capture local neighbourhoods. We determined the optimal number of clusters using the elbow method, which indicated that 12 clusters provide a suitable balance between computational efficiency and cluster compactness. The Hyperseed algorithm is then applied with an epsilon value of ϵ = 0.01 and a codebook size of 50 to refine these clusters.
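The elbow-method selection mentioned above can be sketched as follows; the candidate range and placeholder data are illustrative.

```python
# Inertia curve over candidate K values; the "elbow" where the decrease flattens
# indicates a suitable number of clusters (K = 12 in the experiments above).
import numpy as np
from sklearn.cluster import KMeans

profiles = np.random.rand(5000, 96)              # placeholder daily profiles
inertias = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(profiles).inertia_
            for k in range(2, 21)]
```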
The Hyperseed visualisations (Figure 6) reveal granular details about the local temperature variations within the sensor data. By providing a more detailed view, these visualisations help us understand intricate temperature patterns, allowing us to pinpoint specific locations with anomalous temperature settings and address potential issues in those regions. By following a given sensor profile over time, we can understand specific time-dependent patterns in different regions of the building. This dynamic analysis helps us track how temperature readings evolve, offering insights into temporal changes and trends. We demonstrate how different regions evolve over time between different Hyperseed HD planes. This temporal evolution visualisation helps us identify patterns such as gradual increases in temperature, periodic fluctuations or sudden anomalies that could indicate malfunctions or inefficiencies in the building’s climate control system.
The temporal evolution visualisation (Figure 7) highlights how specific regions of the building respond to various conditions over time. For example, during the first week of June, there is a noticeable deviation of almost all region profiles from their normal behaviour, which could indicate an anomalous event in the building. By comparing these patterns across different Hyperseed HD planes, we can better understand the dynamics of the building’s internal environment and develop targeted strategies for improving energy efficiency and maintaining optimal temperature conditions.

4.3. Incremental Hyperseed

The second experiment evaluates the incremental extension of the algorithm using the remaining data. This involves continuously updating the model with new data from December 2023 onwards to evaluate its adaptability and performance over time.
The incremental learning results (Figure 8) demonstrate the model’s adaptability to new data. For instance, Region 2 shows convergence to a new HD plane, indicating the model’s ability to recognise and adapt to evolving data patterns, which may reflect a historically unforeseen event requiring attention for that region. Additionally, the changes consistently align with the historical patterns, emphasising the model’s capability to maintain performance and provide accurate predictions as new data become available. This validates the model’s robustness and suitability for dynamic, real-world environments. As demonstrated in the incremental learning experiment, the algorithm effectively adapts to outliers—instances that deviate from normal conditions—by creating new HD planes upon detection of anomalous data points. If these outliers do not persist over time, the corresponding HD planes are subsequently removed, thereby ensuring that the model maintains its integrity by not retaining transient or irrelevant anomalies. This dynamic adjustment underscores the robustness of our approach in handling outliers while preserving the overall stability and accuracy of the model.

5. Conclusions

This paper presents a novel AI approach for hypervector approximation of complex manifolds in AI digital twins for smart cities. Starting with the broad-based clustering capability of the K-means algorithm, the approach then leverages the Hyperseed algorithm to preserve internal structures and similarities of the non-linear manifolds, followed by incremental learning to accommodate online, continuous learning of the complex manifold. As reported in our experiments, the t-SNE algorithm provided a baseline comparator, while the proposed AI approach based on the Hyperseed algorithm offered a more detailed analysis, enabling us to detect finer details and specific anomalies within the temperature sensor data that reside on complex manifolds. The results confirm that by applying the proposed algorithm, we can uncover local variations within complex manifolds and recognise specific anomalous regions. It also demonstrates the temporal evolution of temperature profiles to detect historically unforeseen events and validates the model’s robustness through incremental learning, showcasing its adaptability to new data while maintaining performance. Additionally, the incremental learning approach demonstrated effectiveness in adapting to new data, showcasing the robustness and practicality of our methods in a dynamic, real-world environment. The integration of AI with digital twin technology offers a powerful approach to managing the complexities of urban systems. By continuously updating digital twins with real-time data, the proposed AI approach can also anticipate and respond to emerging challenges effectively. As urban areas continue to evolve, the interplay between AI, digital twins and the underlying complexities of urban systems will be crucial in shaping sustainable and resilient smart cities.

Author Contributions

Conceptualization, S.K., D.D.S. and E.O.; Methodology, S.K., N.M. (Nuwan Madhusanka), D.D.S., E.O., M.M. and A.J.; Software, N.M. (Nishan Mills); Validation, N.M. (Nishan Mills) and M.M.; Formal analysis, S.K., N.M. (Nuwan Madhusanka), E.O., M.M. and A.J.; Investigation, S.K., D.D.S., E.O., N.M. (Nishan Mills), M.M. and A.J.; Resources, N.M. (Nuwan Madhusanka) and N.M. (Nishan Mills); Data curation, N.M. (Nuwan Madhusanka); Writing—original draft, S.K., N.M. (Nuwan Madhusanka), D.D.S., E.O., N.M. (Nishan Mills), M.M. and A.J.; Visualization, N.M. (Nuwan Madhusanka); Supervision, D.D.S.; Project administration, A.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Department of Climate Change, Energy, the Environment and Water of the Australian Federal Government, as part of the International Clean Innovation Researcher Networks (ICIRN) program, grant number ICIRN000077, and supported in part by the Swedish Research Council (VR grant no. 2022-04657).

Data Availability Statement

The data presented in this study are openly available at https://github.com/CDAC-lab/UNICON (accessed on 1 October 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Cremonini, L.; Carotenuto, F.; Famulari, D.; Fiorillo, E.; Nardino, M.; Neri, L.; Georgiadis, T. Urban Environment and Human Health: Motivations for Urban Regeneration to Adapt. Environ. Sci. Proc. 2023, 27, 28. [Google Scholar] [CrossRef]
  2. Gracias, J.S.; Parnell, G.S.; Specking, E.; Pohl, E.A.; Buchanan, R. Smart cities—A structured literature review. Smart Cities 2023, 6, 1719–1743. [Google Scholar] [CrossRef]
  3. Moreno, C.; Allam, Z.; Chabaud, D.; Gall, C.; Pratlong, F. Introducing the “15-Minute City”: Sustainability, resilience and place identity in future post-pandemic cities. Smart Cities 2021, 4, 93–111. [Google Scholar] [CrossRef]
  4. Syed, A.S.; Sierra-Sosa, D.; Kumar, A.; Elmaghraby, A. IoT in smart cities: A survey of technologies, practices and challenges. Smart Cities 2021, 4, 429–475. [Google Scholar] [CrossRef]
  5. Lai, C.S.; Jia, Y.; Dong, Z.; Wang, D.; Tao, Y.; Lai, Q.H.; Wong, R.T.; Zobaa, A.F.; Wu, R.; Lai, L.L. A review of technical standards for smart cities. Clean Technol. 2020, 2, 290–310. [Google Scholar] [CrossRef]
  6. Ramaprasad, A.; Sánchez-Ortiz, A.; Syn, T. A unified definition of a smart city. In Proceedings of the Electronic Government: 16th IFIP WG 8.5 International Conference, EGOV 2017, St. Petersburg, Russia, 4–7 September 2017; Proceedings 16. Springer: Berlin/Heidelberg, Germany, 2017; pp. 13–24. [Google Scholar]
  7. Nikitas, A.; Michalakopoulou, K.; Njoya, E.T.; Karampatzakis, D. Artificial intelligence, transport and the smart city: Definitions and dimensions of a new mobility era. Sustainability 2020, 12, 2789. [Google Scholar] [CrossRef]
  8. Cugurullo, F. Urban artificial intelligence: From automation to autonomy in the smart city. Front. Sustain. Cities 2020, 2, 38. [Google Scholar] [CrossRef]
  9. Dashkevych, O.; Portnov, B.A. How can generative AI help in different parts of research? An experiment study on smart cities’ definitions and characteristics. Technol. Soc. 2024, 77, 102555. [Google Scholar] [CrossRef]
  10. Xu, H.; Omitaomu, F.; Sabri, S.; Li, X.; Song, Y. Leveraging Generative AI for Smart City Digital Twins: A Survey on the Autonomous Generation of Data, Scenarios, 3D City Models, and Urban Designs. arXiv 2024, arXiv:2405.19464. [Google Scholar]
  11. OECD. OECD AI Definition. Available online: https://oecd.ai/en/wonk/definition (accessed on 1 October 2024).
  12. Tupayachi, J.; Xu, H.; Omitaomu, O.A.; Camur, M.C.; Sharmin, A.; Li, X. Towards next-generation urban decision support systems through ai-powered construction of scientific ontology using large language models—A case in optimizing intermodal freight transportation. Smart Cities 2024, 7, 2392–2421. [Google Scholar] [CrossRef]
  13. Bose, B.K. Artificial intelligence techniques in smart grid and renewable energy systems—Some example applications. Proc. IEEE 2017, 105, 2262–2273. [Google Scholar] [CrossRef]
  14. De Silva, D.; Yu, X.; Alahakoon, D.; Holmes, G. Semi-supervised classification of characterized patterns for demand forecasting using smart electricity meters. In Proceedings of the 2011 International Conference on Electrical Machines and Systems, Beijing, China, 20–23 August 2011; pp. 1–6. [Google Scholar]
  15. Nallaperuma, D.; De Silva, D.; Alahakoon, D.; Yu, X. Intelligent detection of driver behavior changes for effective coordination between autonomous and human driven vehicles. In Proceedings of the IECON 2018—44th Annual Conference of the IEEE Industrial Electronics Society, Washington, DC, USA, 21–23 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 3120–3125. [Google Scholar]
  16. Bandaragoda, T.; De Silva, D.; Kleyko, D.; Osipov, E.; Wiklund, U.; Alahakoon, D. Trajectory clustering of road traffic in urban environments using incremental machine learning in combination with hyperdimensional computing. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1664–1670. [Google Scholar]
  17. Adikari, A.; De Silva, D.; Ranasinghe, W.K.; Bandaragoda, T.; Alahakoon, O.; Persad, R.; Lawrentschuk, N.; Alahakoon, D.; Bolton, D. Can online support groups address psychological morbidity of cancer patients? An artificial intelligence based investigation of prostate cancer trajectories. PLoS ONE 2020, 15, e0229361. [Google Scholar] [CrossRef] [PubMed]
  18. De Silva, D.; Ranasinghe, W.; Bandaragoda, T.; Adikari, A.; Mills, N.; Iddamalgoda, L.; Alahakoon, D.; Lawrentschuk, N.; Persad, R.; Osipov, E.; et al. Machine learning to support social media empowered patients in cancer care and cancer treatment decisions. PLoS ONE 2018, 13, e0205855. [Google Scholar] [CrossRef] [PubMed]
  19. Wolniak, R.; Stecuła, K. Artificial Intelligence in Smart Cities—Applications, Barriers, and Future Directions: A Review. Smart Cities 2024, 7, 1346–1389. [Google Scholar] [CrossRef]
  20. Wang, W.; Chen, L.; Xiong, M.; Wang, Y. Accelerating AI adoption with responsible AI signals and employee engagement mechanisms in health care. Inf. Syst. Front. 2023, 25, 2239–2256. [Google Scholar] [CrossRef]
  21. Arrieta, A.B.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; García, S.; Gil-López, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion 2020, 58, 82–115. [Google Scholar] [CrossRef]
  22. Botín-Sanabria, D.M.; Mihaita, A.S.; Peimbert-García, R.E.; Ramírez-Moreno, M.A.; Ramírez-Mendoza, R.A.; Lozoya-Santos, J.d.J. Digital twin technology challenges and applications: A comprehensive review. Remote Sens. 2022, 14, 1335. [Google Scholar] [CrossRef]
  23. Shahat, E.; Hyun, C.T.; Yeom, C. City digital twin potentials: A review and research agenda. Sustainability 2021, 13, 3386. [Google Scholar] [CrossRef]
  24. Ieva, S.; Loconte, D.; Loseto, G.; Ruta, M.; Scioscia, F.; Marche, D.; Notarnicola, M. A Retrieval-Augmented Generation Approach for Data-Driven Energy Infrastructure Digital Twins. Smart Cities 2024, 7, 3095–3120. [Google Scholar] [CrossRef]
  25. Ramadan, R.; Huang, Q.; Zalhaf, A.S.; Bamisile, O.; Li, J.; Mansour, D.E.A.; Lin, X.; Yehia, D.M. Energy Management in Residential Microgrid Based on Non-Intrusive Load Monitoring and Internet of Things. Smart Cities 2024, 7, 1907–1935. [Google Scholar] [CrossRef]
  26. Semeraro, C.; Lezoche, M.; Panetto, H.; Dassisti, M. Digital twin paradigm: A systematic literature review. Comput. Ind. 2021, 130, 103469. [Google Scholar] [CrossRef]
  27. Wright, L.; Davidson, S. How to tell the difference between a model and a digital twin. Adv. Model. Simul. Eng. Sci. 2020, 7, 13. [Google Scholar] [CrossRef]
  28. White, G.; Zink, A.; Codecá, L.; Clarke, S. A digital twin smart city for citizen feedback. Cities 2021, 110, 103064. [Google Scholar] [CrossRef]
  29. Ariyachandra, M.M.F.; Wedawatta, G. Digital twin smart cities for disaster risk management: A review of evolving concepts. Sustainability 2023, 15, 11910. [Google Scholar] [CrossRef]
  30. Jafari, M.; Kavousi-Fard, A.; Chen, T.; Karimi, M. A review on digital twin technology in smart grid, transportation system and smart city: Challenges and future. IEEE Access 2023, 11, 17471–17484. [Google Scholar] [CrossRef]
  31. Makvandi, M.; Li, W.; Li, Y.; Wu, H.; Khodabakhshi, Z.; Xu, X.; Yuan, P.F. Advancing Urban Resilience Amid Rapid Urbanization: An Integrated Interdisciplinary Approach for Tomorrow’s Climate-Adaptive Smart Cities—A Case Study of Wuhan, China. Smart Cities 2024, 7, 2110–2130. [Google Scholar] [CrossRef]
  32. Shulajkovska, M.; Smerkol, M.; Noveski, G.; Gams, M. Enhancing Urban Sustainability: Developing an Open-Source AI Framework for Smart Cities. Smart Cities 2024, 7, 2670–2701. [Google Scholar] [CrossRef]
  33. Hammoumi, L.; Maanan, M.; Rhinane, H. Characterizing Smart Cities Based on Artificial Intelligence. Smart Cities 2024, 7, 1330–1345. [Google Scholar] [CrossRef]
  34. Kodaira, K. Complex Manifolds and Deformation of Complex Structures; Springer Science & Business Media: Berlin, Germany, 2012; Volume 283. [Google Scholar]
  35. Schleich, B.; Anwer, N.; Mathieu, L.; Wartzack, S. Shaping the digital twin for design and production engineering. CIRP Ann. 2017, 66, 141–144. [Google Scholar] [CrossRef]
  36. Ma, Y.; Zhou, H.; He, H.; Jiao, G.; Wei, S.X. A digital twin-based approach for quality control and optimization of complex product assembly. In Proceedings of the 2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM), Dublin, Ireland, 16–18 October 2019. [Google Scholar] [CrossRef]
  37. Grieves, M.; Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems; Kahlen, J., Flumerfelt, S., Alves, A., Eds.; Springer: Cham, Switzerland, 2016; pp. 85–113. [Google Scholar] [CrossRef]
  38. Tao, F.; Zhang, M.; Liu, Y.; Nee, A.Y.C. Digital twin driven prognostics and health management for complex equipment. CIRP Ann. 2018, 67, 169–172. [Google Scholar] [CrossRef]
  39. Osipov, E.; Kahawala, S.; Haputhanthri, D.; Kempitiya, T.; Silva, D.; Alahakoon, D.; Kleyko, D. Hyperseed: Unsupervised Learning With Vector Symbolic Architectures. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 6583–6597. [Google Scholar] [CrossRef] [PubMed]
  40. Plate, T.A. Holographic Reduced Representations: Convolution Algebra for Compositional Distributed Representations. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Sydney, Australia, 24–30 August 1991; pp. 30–35. [Google Scholar]
  41. Kleyko, D.; Osipov, E.; De Silva, D.; Wiklund, U.; Alahakoon, D. Integer self-organizing maps for digital hardware. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–8. [Google Scholar]
  42. Plate, T.A. Holographic Recurrent Networks. In Proceedings of the Advances in Neural Information Processing Systems (NIPS), Denver, CO, USA, 30 November–3 December 1992; pp. 34–41. [Google Scholar]
  43. Plate, T.A. Distributed Representations and Nested Compositional Structure. Ph.D. Thesis, University of Toronto, Toronto, ON, Canada, 1994. [Google Scholar]
  44. Frady, E.P.; Kleyko, D.; Kymn, C.J.; Olshausen, B.A.; Sommer, F.T. Computing on Functions Using Randomized Vector Representations. In Proceedings of the 2022 Annual Neuro-Inspired Computational Elements Conference, Online, 28 March–1 April 2021; pp. 1–33. [Google Scholar]
  45. Frady, E.P.; Kent, S.J.; Olshausen, B.A.; Sommer, F.T. Resonator Networks, 1: An Efficient Solution for Factoring High-Dimensional, Distributed Representations of Data Structures. Neural Comput. 2020, 32, 2311–2331. [Google Scholar] [CrossRef] [PubMed]
  46. Kent, S.J.; Frady, E.P.; Sommer, F.T.; Olshausen, B.A. Resonator Networks, 2: Factorization Performance and Capacity Compared to Optimization-Based Methods. Neural Comput. 2020, 32, 2332–2388. [Google Scholar] [CrossRef] [PubMed]
  47. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
Figure 1. Similarity distribution in a VFA vector space. The FPE bandwidth β = 0.05.
Figure 2. An example of a resonator network with three arguments, which is factoring a compound hypervector $s = a \circ b \circ c$; A, B and C denote the corresponding item memories containing atomic hypervectors for the a, b and c arguments, respectively.
Figure 3. Architecture diagram of the proposed approach. The colour is demonstrative of the diversity of data points.
Figure 4. Transformation of domain D to P with the binding operation.
Figure 5. Baseline t-SNE visualisation of the initial dataset with predefined profile vectors.
Figure 6. Visualisation of local clusters using the Hyperseed algorithm.
Figure 7. Temporal evolution of different regions between Hyperseed HD planes.
Figure 8. Incremental learning and adaptation between Hyperseed HD planes.