Review

A Survey on Neuromorphic Architectures for Running Artificial Intelligence Algorithms

by Seham Al Abdul Wahid *,†, Arghavan Asad and Farah Mohammadi
Electrical and Computer Engineering Department, Toronto Metropolitan University, 350 Victoria St., Toronto, ON M5B 2K3, Canada
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Electronics 2024, 13(15), 2963; https://doi.org/10.3390/electronics13152963
Submission received: 30 June 2024 / Revised: 13 July 2024 / Accepted: 24 July 2024 / Published: 26 July 2024
(This article belongs to the Special Issue Neuromorphic Device, Circuits, and Systems)

Abstract

Neuromorphic computing, a brain-inspired non-Von Neumann computing system, addresses the challenges posed by the memory wall phenomenon associated with Moore's law scaling. It has the capability to enhance performance while maintaining power efficiency. Neuromorphic chip architecture requirements vary depending on the application, and optimising an architecture for large-scale applications remains a challenge. Neuromorphic chips are programmed using spiking neural networks, which provide them with important properties such as parallelism, asynchronism, and on-device learning. Widely used spiking neuron models include the Hodgkin–Huxley model, Izhikevich model, integrate-and-fire model, and spike response model. Hardware implementation platforms for the chip follow three approaches: analogue, digital, or a combination of both. Each platform can be implemented using various memory topologies which interconnect with the learning mechanism. Current neuromorphic computing systems typically use unsupervised spike-timing-dependent plasticity learning algorithms; however, algorithms such as voltage-dependent synaptic plasticity have the potential to enhance performance further. This review summarises potential neuromorphic chip architecture specifications and highlights the applications for which each is suitable.

1. Introduction

Artificial intelligence (AI) is a continuously evolving field used for a wide range of applications and purposes. Conventional technologies for running complex AI algorithms rely on Von Neumann computers, which have a high rate of power consumption and contribute to a carbon emission problem. Von Neumann computers are based on an architecture in which instructions and data share a common memory and are processed sequentially by a central processing unit (CPU) comprising a control unit and an arithmetic and logic unit (ALU) [1]. As stated by the authors of [2], training a single deep learning (DL) model can produce the same total lifetime carbon footprint as five cars and consumes approximately 656,347 kilowatt-hours of energy. In addition, recent advancements in computing speed and capacity have reached a saturation level in performance, as the continuous application of Moore's law has led to the memory wall phenomenon. Moore's law refers to the steady shrinking of transistors on digital integrated chips to achieve faster performance. This shrinking has also increased data movement: as computational speed continuously increased, overall memory performance remained largely unchanged, leading to the memory wall phenomenon and the saturation of system performance [3]. As AI algorithms continue to evolve, a new technology that meets the requirements of high performance, energy efficiency, and large bandwidth is needed [4]. Neuromorphic computing, a brain-inspired computing system, has the capability to enhance performance at a decreasing level of power consumption. Neuromorphic computers are non-Von Neumann computers composed of neurons and synapses rather than separate CPUs and memory units [5].
Neuromorphic chips are programmed using spiking neural networks (SNNs), which provide a more energy-efficient and computationally powerful network with fast, massively parallel data processing compared to artificial neural networks (ANNs). ANNs process information using fixed continuous values and structured layers, whereas SNNs mimic biological neural systems by using discrete, time-dependent spikes for information representation and processing. SNNs are implemented using one of the four main spiking neuron models (shown in Figure 1): the Hodgkin–Huxley (HH) model, Izhikevich model, integrate-and-fire (IF) model, and spike response model (SRM). These models closely reproduce the characteristics and behaviours of biological neurons [6,7]. In neuromorphic computing, various architectures can be developed based on the hardware implementation platform, network topologies, and neural models. Hardware implementation platforms follow three approaches: analogue, digital, and a combination of both, as depicted in Figure 2. The subsections of a neuromorphic unit include the computational unit (neural model), the information storage unit (synaptic model), the communication unit (dendrites and axons), and the learning mechanism (weight updates). Considering the advantages of both digital and analogue implementation methods, they can be combined or used separately to implement the subsections of neuromorphic computing hardware. Additionally, various memory technologies can be employed in both analogue and digital systems for two important reasons: synaptic plasticity (non-volatile information storage) and weight updates (fast read and write capabilities), as presented in Figure 2 [8].
An analogue device for neuromorphic computing is a more cost-effective approach than a digital design and can provide in-memory computing, but it lacks flexibility. In a digital implementation, data exchange is required between an arithmetic logic unit (ALU) and memory cells, making implementation at a large scale challenging. However, a digital implementation can realise almost any learning algorithm and allows for more customisation and flexibility [9]. A mixed design approach that combines the advantages of both analogue and digital implementations can overcome several limitations. For example, digital communication in the form of digital spikes can be utilised within analogue neuromorphic systems, extending how long the synaptic weights can be stored and improving the reliability of the system [10].
Analogue circuits for neuromorphic computing can be implemented using memristors, CMOS, or resistive RAM. Memristors are an emerging memory device with fast operational speed, low energy consumption, and small feature size; they switch between resistance states in response to programming pulses. They can be classified into non-volatile and volatile types, where the non-volatile type is capable of supporting in-memory computing and the volatile type is typically utilised for synapse emulators, selectors, hardware security, and artificial neurons [11,12]. Complementary metal oxide semiconductor (CMOS) transistors have been successfully used to implement neurons and synapses for neuromorphic architectures. In addition, they are widely used for large-scale spiking neural networks (SNNs) [13,14]. Lastly, resistive random access memory (ReRAM) is a two-terminal nanodevice that is promising for neuromorphic computing, as it can enable highly parallel, ultra-low-power in-memory computing for AI algorithms. It is structurally simple and can thus be easily integrated into a system at a low rate of power consumption [15].
A digital implementation of a neuromorphic architecture can be realised using FPGAs, ASICs, or a heterogeneous system composed of CPUs and GPUs. Field-programmable gate arrays (FPGAs) provide several advantages for neuromorphic computing, including flexibility, high performance, reconfiguration capability, and excellent stability. In addition, they can implement SNNs due to their parallel processing ability and local memory of sufficient size to store weights. Recent implementations of FPGA-based neuromorphic systems utilise random access memory (RAM) to optimise the latency of memory access [9]. Application-specific integrated circuit (ASIC) implementations of neuromorphic systems are less flexible, have a higher production cost than FPGAs, and are limited to specific neuron models and algorithms [7,10]. However, ASICs provide low power consumption and high-density local memory, which are attractive features for the development of neuromorphic systems [16]. Modern ASICs include flash memory, as it has a long retention time (>10 years). Flash memory has a three-terminal structure, is charge-based, and is non-volatile [6]. A heterogeneous system architecture composed of both central processing units (CPUs) and graphics processing units (GPUs) can provide flexibility in programming through the CPUs as well as parallel processing and accelerated computing through the GPUs [17]. However, such systems cannot be easily scaled due to their high energy demands [16]. RAM or ReRAM can be utilised in the heterogeneous system to store the weights [18].
As illustrated in Figure 3, three main machine learning methods are commonly used: supervised learning, unsupervised learning, and reinforcement learning [7]. Non-machine-learning methods are less common but can also be used in neuromorphic computing for applications that solve a particular task [5]. Learning mechanisms are an essential step in developing neuromorphic systems, as they are used to adapt the system to the specified application. On-chip training, which refers to learning performed within the neuromorphic chip, is highly desirable for many applications. Off-chip training implements learning externally, for example in software; the resulting weights are then post-processed and used in fabricating the neuromorphic system [7].
Supervised learning is the training of data using labelled datasets and commonly relies on backpropagation and gradient descent algorithms. Unsupervised learning is the training of data with an unlabelled dataset and can be divided into STDP (spike-timing-dependent plasticity) and VDSP (voltage-dependent synaptic plasticity) algorithms. Lastly, reinforcement learning is when the algorithm learns from experiences and feedback without any labelled data. It is an iterative, long-term process and can be divided into Q-learning and deep Q-network (DQN) algorithms [19].
Neuromorphic computing can be used for various applications and industries, including medicine, large-scale operations and product customisation, artificial intelligence, and imaging. Its design parameters ultimately depend on the desired application, and several companies have each implemented neuromorphic chips with different architectures to solve different tasks [19]. This review focuses on the various possible neuromorphic chip architectures and their capabilities.

2. Background

Neuromorphic chips consist of artificial neurons and synapses that achieve functions similar to those of the human brain. The human brain contains 10¹⁰–10¹² neurons, each with on the order of 10⁴ synaptic connections, operating simultaneously and communicating with each other through spike signals. The brain inspired the development of these chips due to its ability to perform high-order intelligence tasks at a low rate of energy consumption [6]. Neuromorphic chips are described as non-Von Neumann because both processing and memory are handled by neurons and synapses and inputs are received as spikes. They operate in parallel and are asynchronous (event-driven). In contrast, Von Neumann computers are composed of separate CPUs and memory units, and information is encoded as numerical values. They perform sequential processing and are synchronous (clock-driven) [5]. The main differences between the Von Neumann architecture and neuromorphic architecture are illustrated in Figure 4.
Neuromorphic chips provide various advantages over current Von Neumann computers due to their operational properties, which include the following:
  • Connectionism: This is described using neural networks (NN) which consist of many simple units (neurons) interconnected together with weights. Determining the appropriate weights results in the NNs’ ability to learn and solve a given problem [9].
  • Parallelism: All neurons work in parallel to each other to simultaneously perform various functions and ensure the efficient and successful operation of neural networks [9].
  • Asynchrony: To achieve parallelism, synchronisation of all neurons is not required as each neuron performs a specified task. Asynchrony reduces the power consumption that would otherwise be required to achieve synchronisation [9].
  • Impulse nature of information transmission: The information encoded as spikes differs between different pairs of neurons and is not transmitted instantly. A synapse is therefore characterised by a weight and a time delay, which provides advantages over traditional neural networks: communication is asynchronous, dynamic data can be used because a time component is included, and the network is a complex non-linear dynamic system. Moreover, a neuron is only activated upon receipt of a spike, reducing power consumption because the inactive state consumes very little energy [9].
  • On-device learning: It has the ability to learn in a continuous and incremental manner which in turn allows the customisation and personalisation of smart devices based on the user’s needs while maintaining privacy through the avoidance of user data transmission to the cloud [9].
  • Local learning: Conventional neural networks use backpropagation algorithms, which introduce two problems: the weight transport problem and the update locking problem. The weight transport problem is the system's inability to exchange information about weight values, and the update locking problem is the requirement that forward-pass activation values be stored for the backward pass. Local learning is an alternative to backpropagation and uses a spike-timing-dependent plasticity (STDP) model in which a synapse is strengthened if the incoming spike arrives before the neuron generates its own spike and weakened if it arrives after. As a result, local learning can train a network of any size, as it does not require large amounts of global data transfer operations [9].
  • Sparsity: Not all neurons are activated to perform a task. Neuromorphic chips exhibit temporal, spatial, and structural sparsity. Temporal sparsity is sparsity of data in time, achieved by transmitting only the changed part of a signal (a minimal delta-encoding sketch follows this list). Spatial sparsity is sparsity in the data streams that results from neurons activating only upon reaching a certain threshold value. Structural sparsity refers to the data flow with respect to the network topology, as each neuron has a limited number of connections and the neurons are not all fully interconnected [9,20].
  • Analog computing: Digital computing is limited due to its high costs. Analog circuits can be used to model the dynamics of the membrane potential and to model synaptic operations. Analog circuits provide a more time- and energy-efficient alternative [9].
  • In-memory computing: Each individual neuron has its own memory or stored state, which eliminates the need to transfer intermediate data or compete for memory access [9].
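As a minimal illustration of the temporal-sparsity idea in the list above (transmitting only the changed part of a signal), the following Python sketch delta-encodes a sampled signal so that an event is emitted only when the value changes by more than a threshold. The threshold, the example signal, and the function name are illustrative assumptions, not part of any particular chip.

```python
# Minimal sketch of temporal sparsity via delta (send-on-change) encoding.
# Only samples that differ from the last transmitted value by more than
# `threshold` generate an event; everything else is suppressed.

def delta_encode(samples, threshold=0.1):
    events = []                      # (index, value) pairs actually transmitted
    last_sent = samples[0]
    events.append((0, last_sent))    # always transmit the initial value
    for i, x in enumerate(samples[1:], start=1):
        if abs(x - last_sent) > threshold:
            events.append((i, x))
            last_sent = x
    return events

signal = [0.0, 0.02, 0.03, 0.5, 0.52, 0.51, 1.0, 1.0, 0.2]
events = delta_encode(signal)
print(f"{len(events)} events transmitted for {len(signal)} samples: {events}")
```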

2.1. Spiking Neural Networks (SNN)

Neuromorphic chips are programmed using spiking neural networks (SNNs) rather than artificial neural networks (ANNs) due to their biological functionality and their use of biological neuron models such as the integrate-and-fire model, leaky integrate-and-fire (LIF) model, and Izhikevich model, all of which allow communication between neurons through the generation of spike signals [6]. SNNs provide a more energy-efficient and computationally powerful network with fast, massively parallel data processing for neuromorphic chips compared to ANNs. They are implemented using differential equations and have memory, while ANNs are implemented using activation functions and have no memory [7]. The spike signals sent to a neuron accumulate in the neuron's membrane potential, and the signal is passed to other connected neurons only when the membrane potential reaches a certain threshold [6]. If the threshold is not reached, the accumulated charge leaks away and dissipates over time. In addition, outgoing synapses can be affected by axonal delays, which in turn delay the information. Figure 5 illustrates pre-synaptic and post-synaptic neurons connected by synapses that carry the associated weight values. A weight value is excitatory if positive and inhibitory if negative. The synapses are trained using the selected learning mechanism to alter the weights and activate a synapse only when needed. SNNs are organised into layers, and their ability to transmit information at different times is known as the asynchronous operation of neuromorphic chips, which helps reduce power consumption [5].
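To make the accumulate-leak-fire behaviour described above concrete, here is a minimal Python sketch of a leaky integrate-and-fire neuron driven by a weighted input spike train. The parameter values (weight, leak factor, threshold, reset potential) are illustrative assumptions rather than values from any specific chip.

```python
# Minimal leaky integrate-and-fire (LIF) neuron sketch.
# Incoming spikes are weighted and accumulated into the membrane potential,
# which leaks toward zero each time step and fires when it crosses a threshold.

def simulate_lif(input_spikes, weight=0.6, leak=0.9, threshold=1.0, v_reset=0.0):
    v = 0.0
    output_spikes = []
    for t, spike_in in enumerate(input_spikes):
        v = leak * v + weight * spike_in   # leak, then integrate the weighted input
        if v >= threshold:                 # fire when the threshold is reached
            output_spikes.append(t)
            v = v_reset                    # reset the membrane potential
    return output_spikes

# A sparse input spike train: the neuron only does work when spikes arrive.
spikes_in = [1, 0, 1, 1, 0, 0, 1, 1, 1, 0]
print("output spike times:", simulate_lif(spikes_in))
```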
Hardware implementations of SNNs fall into two categories: large-scale neuromorphic accelerators and low-power SNN accelerators. A large-scale neuromorphic accelerator provides high throughput and a scalable architecture with multi-core on-chip and multi-chip configurations, while a low-power SNN accelerator is a small-scale platform focusing on power consumption and accuracy optimisation, suitable for edge applications [21].
A large-scale SNN involves extensive memory access for each neuron update. To avoid the memory bottleneck of traditional Von Neumann architectures, multiple neuromorphic cores operating in parallel are used. Each core has dedicated neuron update logic and synapse memory, allowing local handling of neuron state and synapse weight updates. Communication between different cores is facilitated by a network-on-chip, typically using the address-event-representation (AER) protocol. In the AER protocol, each neuron is assigned a unique address, which is sent to the AER bus whenever the neuron fires, with the firing time encoded in real time. Four systems that provide large-scale neuromorphic accelerator implementations are SpiNNaker, TrueNorth, Neurogrid, and Loihi. SpiNNaker provides a digital implementation with 18 processor cores per chip and 48 chips per board, designed to simulate spiking neural networks in real time, but it has low energy efficiency for complex behaviours. TrueNorth is a 28 nm digital CMOS chip by IBM with 4096 synaptic cores, optimised for ultra-low-power consumption and capable of reaching 46 billion synaptic operations per second. Neurogrid is a mixed-signal SNN emulator supporting complex neural behaviour with 16 cores per board, capable of handling up to 1 million neurons and billions of synaptic connections. Lastly, Loihi, developed by Intel using a 14 nm FinFET process, features 128 neural cores per chip and supports various adaptive neuron models and online learning algorithms, with a throughput of up to 30 billion synaptic operations per second [21].
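The AER idea described above can be illustrated with a short sketch: each neuron is identified only by its address, and the interconnect carries (address, time) events instead of dense activation vectors, with a router forwarding each event to the cores that hold the neuron's fan-out synapses. The data structures and fan-out table below are illustrative assumptions, not the protocol details of any of the chips named here.

```python
# Sketch of address-event representation (AER): when a neuron fires, only its
# address (plus an implicit firing time) is placed on the event bus, and a
# router delivers the event to the cores holding its fan-out synapses.

from collections import namedtuple

Event = namedtuple("Event", ["address", "timestep"])

# Hypothetical fan-out table: source neuron address -> destination core ids.
fanout = {
    0: [1, 2],
    1: [2],
    7: [0, 3],
}

def route(events):
    """Deliver each AER event to the cores listed in the fan-out table."""
    deliveries = []
    for ev in events:
        for core in fanout.get(ev.address, []):
            deliveries.append((core, ev))
    return deliveries

bus = [Event(address=0, timestep=3), Event(address=7, timestep=3), Event(address=1, timestep=4)]
for core, ev in route(bus):
    print(f"core {core} receives spike from neuron {ev.address} at t={ev.timestep}")
```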
Low-power SNN accelerators are commonly used for image recognition applications due to their ability to achieve high accuracy and energy efficiency. Frenkel et al. [22] developed a digital neuromorphic chip supporting shallow SNNs and on-chip spike-driven plasticity, achieving 84.5% accuracy with 15 nJ of energy consumption per inference. Yin et al. [23] introduced a modified leaky integrate-and-fire (LIF) model enabling offline backpropagation training, achieving over 99% accuracy with energy consumption ranging from 51.4 to 773 nJ per inference. Zheng et al. [24] proposed event-driven neuromorphic hardware for small-scale SNNs with weight-dependent spike-timing-dependent plasticity (STDP) algorithms, achieving 90% accuracy at 1.12 µJ per inference, which is less energy efficient than the previously mentioned works. Chen et al. [25] presented a reconfigurable neuromorphic hardware platform with 4096 neurons, supporting LIF models and STDP-based learning, reaching up to 97.7% accuracy at 1.7 µJ per classification by leveraging structural sparsity and approximate computing techniques [21].

2.2. Spiking Neuron Models

Four popular and widely used spiking neuron models are the Hodgkin–Huxley (HH) model, Izhikevich model, integrate-and-fire (IF) model, and spike response model (SRM). These models exhibit behaviours and characteristics similar to biological neurons, whereas ANN models use computational units such as sigmoid, rectified linear unit (ReLU), or tanh activations [7].
The first spiking neuron model to be developed was the HH model. It describes the initiation and propagation of a neuron's action potential, providing a mathematical description of the electric current through the membrane potential. It is the most accurate model in terms of mimicking real neurons; however, it is computationally expensive, requiring approximately 1200 floating-point operations (FLOPs) per 1 ms of simulation, and is therefore hard to use for large-scale neural network simulations. The second model is the Izhikevich model, which is two-dimensional and offers a good trade-off between biological plausibility and computational efficiency. It requires only 13 FLOPs per 1 ms of simulation, making it a better alternative for implementing a large-scale neural network. The IF model is a simple model that generates an output spike upon reaching a defined threshold. The LIF model is a type of IF neuron model with a leak added to the membrane potential. The LIF model requires only 5 FLOPs, making it the model with the lowest computational cost, and it is widely used due to its added benefits of accuracy in mimicking the spiking behaviour of biological neurons and simulation speed. It is well suited to large-scale network simulation and is commonly used for analogue hardware implementations due to its ease of integration and modelling using transistors and capacitors. However, it is challenging to use for machine intelligence applications, as the role of different firing patterns in learning and cognition is unclear and additional adaptation variables increase the model's complexity [7,10]. Lastly, the SRM uses response kernels (filters) rather than differential equations to achieve behaviour similar to the LIF model, with the output spike generated when the internal membrane potential reaches the threshold. It requires 50 FLOPs per 1 ms of simulation, which is higher than the previous two models but is still considered a low computational cost. In addition, it provides a less accurate representation of a biological neuron than the HH model and is computationally complex if implemented digitally. Analogue implementations of the SRM are less complex and can be realised using charging and discharging RC circuits [7].
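As an example of the trade-off discussed above, the Izhikevich model needs only two update equations per time step. The sketch below uses the standard published form of the model with regular-spiking parameters (a = 0.02, b = 0.2, c = -65, d = 8) and simple Euler integration; the constant input current and simulation length are illustrative assumptions.

```python
# Sketch of the two-variable Izhikevich neuron model (Euler integration, 1 ms step):
#   v' = 0.04*v^2 + 5*v + 140 - u + I     (membrane potential, mV)
#   u' = a*(b*v - u)                      (recovery variable)
#   if v >= 30 mV: emit a spike, then v <- c and u <- u + d

def simulate_izhikevich(I=10.0, t_max_ms=200, a=0.02, b=0.2, c=-65.0, d=8.0):
    v, u = c, b * c
    spike_times = []
    for t in range(t_max_ms):
        # Two 0.5 ms half-steps for v improve numerical stability.
        for _ in range(2):
            v += 0.5 * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += a * (b * v - u)
        if v >= 30.0:              # spike: record it, then reset
            spike_times.append(t)
            v, u = c, u + d
    return spike_times

print("spike times (ms):", simulate_izhikevich())
```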

2.3. SNN Testing

Neuromorphic computing is an emerging field with a limited number of datasets available to assess its performance. Since each chip is designed for a specific application or task rather than being broadly versatile, assessing its overall performance can be a challenge. A study [26] developed an online testing methodology for neuromorphic hardware that can detect abnormal operation caused by hardware-level faults in real time. It assesses the confidence of each SNN prediction using a lightweight and non-intrusive on-die symptom detector that operates in parallel with the SNN. It determines whether the current input will be correctly predicted by the SNN using a system of two classifiers, one strict and one lenient. If both classifiers agree, it outputs a high-confidence decision; otherwise, the test decision has low confidence. The algorithm was tested on an FPGA-based neuromorphic hardware platform and achieved trustworthy operation with zero-latency, transparent decisions for over 99.6% of the SNN inferences [26].
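A minimal sketch of the two-classifier confidence scheme described above, assuming the strict and lenient classifiers are simply two functions that return a predicted label: when they agree, the decision is accepted with high confidence; otherwise the inference is flagged as low confidence. The classifier internals below are placeholders and not the on-die detector of [26].

```python
# Sketch of the strict/lenient agreement check used to flag low-confidence
# SNN inferences. The two classifiers below are placeholders; in the actual
# methodology they are lightweight symptom detectors running alongside the SNN.

def confidence_decision(features, strict_classifier, lenient_classifier):
    strict_label = strict_classifier(features)
    lenient_label = lenient_classifier(features)
    if strict_label == lenient_label:
        return strict_label, "high-confidence"
    return None, "low-confidence"   # disagreement: do not trust this inference

# Placeholder classifiers for illustration only.
strict = lambda x: int(sum(x) > 2.0)
lenient = lambda x: int(sum(x) > 1.0)

print(confidence_decision([0.5, 0.7, 0.3], strict, lenient))   # they disagree
print(confidence_decision([1.5, 1.0, 0.9], strict, lenient))   # they agree
```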

3. Neuromorphic Circuit Design

3.1. Analog Design

There are three main types of analogue implementations of neuromorphic chips: memristors, CMOS, and resistive RAM. Memristors, also known as resistive memory devices, operate on the principle that modifying the material at the atomic level causes a change of resistance. Both neuromorphic and memristive computing aim to enhance computational efficiency by drawing inspiration from biological systems. Memristive computing leverages the unique properties of memristors to perform in-memory computing and integrate computation with storage. Integrating memristors into a neuromorphic chip architecture can therefore offer good scalability and in-memory computing. Memristors can include resistive-switching random access memory (RRAM), phase-change memory (PCM), magnetic random access memory (MRAM), or ferroelectric random access memory (FeRAM); the choice depends on the required parameters of the application [16]. Memristors offer characteristics similar to biological synapses and provide various advantages for neuromorphic chips, such as in-memory computation, power efficiency, fast operational speed, and small feature size [12,27]. A single-layer memristor configuration can reach a memory density of up to 4.5 terabits per square inch. Memristors can be categorised into non-volatile memory switching (MS) and volatile threshold switching (TS) types. Non-volatile MS offers high-density memory and in-memory computing, while volatile TS is useful for synapse emulators, selectors, hardware security, and artificial neurons. Bifunctional memristors, which combine the functions of volatile and non-volatile memristors to mimic artificial synapses and neurons, are optimal for neuromorphic chips. However, their downsides are the large storage windows they require, which are not guaranteed, along with limited endurance and the difficulty of implementing both functions simultaneously. Versatile memristors for multi-function circuits have yet to be successfully developed. Memristors can implement SNNs using the LIF neuron model thanks to their neuron-like threshold switching and artificial synapse properties [12].
Another method of implementing analogue circuits is the use of CMOS technology. CMOS-based neuromorphic chips have successfully simulated the functions of neurons and synapses but are limited by insufficient on-chip memory, which prevents them from storing the weights needed to implement a large-scale neural network. In addition, the off-chip DRAM storage used instead consumes a great amount of power. However, CMOS technology and CMOS-compatible devices continue to be researched and developed, as they are low in production cost, computationally efficient, and support high-density integration. In addition, they are extremely reliable and stable, allowing neuromorphic devices to operate for extended periods of time without compromising their performance [13,15].
ReRAM devices are a good alternative for tackling the limitations introduced by CMOS devices. They offer advantages such as low programming voltage, fast switching speed, high integration density, and excellent performance scalability. However, they still suffer from inherent sneak-path leakage, signal noise, and limited conductance states due to weight drifting, which reduce computational accuracy. ReRAM devices can achieve the synaptic function of STDP [15]. Ye et al. [29] introduced a novel Bayesian optimisation method called BayesFT (Bayesian optimisation for fault-tolerant neural network architecture) that makes neural network architectures robust against weight drifting in ReRAM devices by automating the search for the architecture's dropout rates [29].
As discussed by Ye et al. [28], the implementation of analogue deep neural networks (DNNs) can further enhance the speed and power efficiency of neuromorphic computations. The crossbar computing architecture considered in [28] allows in-place dot product computations, significantly reducing the need for data transfer between memory and processing units and enhancing computational efficiency [28].
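The in-place dot product performed by a resistive crossbar, as described above, follows Ohm's law and Kirchhoff's current law: applying input voltages to the rows of a conductance matrix yields column currents equal to the matrix-vector product. The sketch below models only this ideal behaviour in NumPy; real devices add the noise, conductance drift, and sneak-path effects discussed above, and the conductance and voltage values are illustrative assumptions.

```python
# Idealised model of an analogue crossbar matrix-vector multiplication:
# column current I_j = sum_i V_i * G_ij (Ohm's law + Kirchhoff's current law),
# i.e. the dot product is computed in place, where the weights are stored.

import numpy as np

rng = np.random.default_rng(0)

G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # conductances (siemens) acting as weights
V = np.array([0.2, 0.0, 0.5, 0.1])          # input voltages applied to the rows

I_columns = V @ G                            # column currents = matrix-vector product
print("output currents (A):", I_columns)
```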

3.2. Digital Design

Digital designs can be implemented using FPGAs, ASICs, or a heterogeneous system combining CPUs and GPUs. Overall, digital implementations are more flexible and cost-effective for processing large-scale SNN models than analogue implementations. Digital hardware represents all neuron variables using bits, and the bit precision is chosen in light of energy consumption and memory requirements, meaning that the precision of the variables is controllable and guaranteed. FPGAs may be more suitable for this purpose than ASICs or CPUs/GPUs due to their shorter design and implementation time and excellent stability. Using a single FPGA device to implement an SNN can improve speed and lower power consumption. In addition, FPGAs support the parallel processing that is essential for neuromorphic computing and contain sufficient local memory for weight storage. A study [7] demonstrated that FPGA hardware running a complex network with a large number of filters and convolutional layers is able to process one image per second when implementing SNNs in real time, whereas a CPU with a much simpler network can process only one image per minute. However, FPGA implementations still have some limitations, such as the time-consuming implementation of neural networks compared to CPUs and GPUs; CPUs and GPUs remain more widely used for neural networks because of their ease of programming [7]. Heterogeneous systems are more beneficial than using CPUs or GPUs individually: CPUs provide flexibility in programming but cannot handle large-scale SNN computations, which slows performance and requires longer training periods, while GPUs excel in parallel processing and can handle large-scale SNN computations with high training speed and fast inference. However, as stated above, their downside is their high energy consumption [17].

3.3. Mixed Design

A mixed design incorporating digital and analogue implementations can overcome the limitations introduced by analogue hardware. Analogue circuits can be used for the neuromorphic computation itself, while the synaptic weights are stored in digital memory for reliability and longer retention. In addition, digital communication can be used within the chip through the generation of digital spikes [10].

4. Machine Learning Algorithms

4.1. Supervised Learning

Supervised learning trains on well-labelled datasets and covers two task types: regression and classification. Regression identifies the relationship between dependent and independent variables, while classification categorises the output variables, which are then used to predict the output's label [19]. Supervised learning is less commonly used in neuromorphic computing, as it requires complex neuron and synaptic models or the communication of floating-point gradient values between layers and neural cores; as a result, its hardware implementation is complex. Common supervised algorithms are backpropagation (shown in Figure 6 [5]) and gradient descent. They are successful methods for traditional artificial neural networks; however, they are challenging to apply when training SNNs due to the non-differentiable nature of spike events. Both algorithms offer less efficiency and stability when computing complex algorithms, as they require adaptations because they do not map directly onto SNNs [5,7]. Gradient descent is used within the backpropagation algorithm to update the network's weights and minimise the loss function. An alternative to applying backpropagation directly is to train a deep neural network offline and then convert it into an SNN, an approach that has achieved substantial state-of-the-art performance [5]. In addition, backpropagation is not suitable for memristor-based hardware, as these devices lack ideal properties, suffering from limited endurance, non-linear conductance modulation, and device variability. Moreover, continuous and incremental learning is not possible with backpropagation algorithms [16].
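The ANN-to-SNN conversion route mentioned above can be illustrated with a toy rate-coding sketch: a ReLU unit trained offline is replaced by an integrate-and-fire neuron whose firing rate over a time window approximates the original activation. The weights, input, threshold, and window length below are illustrative assumptions, not a recipe from the cited works.

```python
# Toy illustration of ANN-to-SNN conversion by rate coding: an integrate-and-fire
# neuron driven by a constant (analogue) input approximates a ReLU activation
# with its firing rate over T time steps.

import numpy as np

def relu_unit(w, x):
    return max(0.0, float(np.dot(w, x)))

def if_unit_rate(w, x, T=1000, threshold=1.0):
    drive = float(np.dot(w, x))     # same weighted sum, applied at every time step
    v, spikes = 0.0, 0
    for _ in range(T):
        v += drive
        if v >= threshold:
            spikes += 1
            v -= threshold          # "soft" reset preserves the residual charge
    return spikes / T               # firing rate approximates the ReLU output

w = np.array([0.4, -0.2, 0.7])
x = np.array([0.5, 1.0, 0.3])
print("ReLU activation:", relu_unit(w, x))
print("IF firing rate :", if_unit_rate(w, x))
```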

4.2. Unsupervised Learning

Unsupervised learning trains on unlabelled data and is typically used to identify hidden patterns in the data. It covers two task types: clustering and association. Clustering groups similar entities together, while association determines relations between the variables or features of the dataset [19]. Unsupervised learning algorithms include spike-timing-dependent plasticity (STDP) and voltage-dependent synaptic plasticity (VDSP).
STDP is the most widely implemented algorithm in spiking neuromorphic systems, as it is inspired by brain function and is straightforward to implement, especially in analogue hardware. It allows fast, real-time, online, and asynchronous learning without excessive computational complexity [4,7,27]. STDP operates by adjusting the weights based on the relative spike timings of the pre- and post-synaptic neurons, as shown in Figure 7 [5]. A synapse is strengthened when the pre-synaptic spike arrives before the post-synaptic neuron fires and weakened when it arrives after. STDP enables local learning, which reduces the amount of global data transfer operations and can therefore train networks of unlimited size [9]. A hybrid system consisting of CMOS neurons and memristive synapses implementing STDP can accelerate neuromorphic computing and provide high-density connectivity and an efficient implementation of matrix-vector multiplication [10,30]. Although STDP is primarily considered an unsupervised learning mechanism, it can also be integrated into supervised learning frameworks. In some applications, it can be combined with global signals that guide the synaptic adjustments. Supervised STDP can incorporate error signals that provide feedback on the network's performance and in turn enhance the system's learning capability and flexibility [31].
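A minimal pair-based STDP update, consistent with the rule described above: if the pre-synaptic spike precedes the post-synaptic spike the weight is potentiated, otherwise it is depressed, with an exponentially decaying dependence on the timing difference. The learning rates, time constants, and weight bounds are illustrative assumptions.

```python
# Minimal pair-based STDP weight update. Potentiate when the pre-synaptic spike
# precedes the post-synaptic spike (causal pair), depress otherwise, with an
# exponential dependence on the spike-time difference.

import math

def stdp_update(w, t_pre, t_post, a_plus=0.05, a_minus=0.055,
                tau_plus=20.0, tau_minus=20.0, w_min=0.0, w_max=1.0):
    dt = t_post - t_pre                        # spike-time difference (ms)
    if dt > 0:                                 # pre before post: strengthen
        w += a_plus * math.exp(-dt / tau_plus)
    elif dt < 0:                               # post before pre: weaken
        w -= a_minus * math.exp(dt / tau_minus)
    return min(max(w, w_min), w_max)           # keep the weight in bounds

w = 0.5
w = stdp_update(w, t_pre=10.0, t_post=15.0)    # causal pair -> potentiation
w = stdp_update(w, t_pre=42.0, t_post=30.0)    # anti-causal pair -> depression
print("updated weight:", w)
```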
VDSP has been proposed to overcome two limitations of STDP. The first is its requirement to store precise spike times and traces in memory, which must be read at every update. This added memory requirement is costly in digital implementations of STDP and challenging in analogue implementations due to the circuit area and power it consumes. The second limitation of STDP is its fixed time window, which must contain the spike-time difference between post- and pre-synaptic neurons for the weight to be updated. Good performance is achieved only when the time window is tuned to the temporal dynamics of the spike signals, and it is challenging both to choose an appropriate STDP time window and to design flexible circuits to accommodate it. VDSP does not use a fixed time window to update the weights and can be easily incorporated into in-memory computing hardware because it preserves local computation. Rather than using spike timings to evaluate the correlation between pre- and post-synaptic neurons, VDSP relies on the membrane potential of the pre-synaptic neuron. Using the LIF (leaky integrate-and-fire) SNN model for VDSP, the membrane potential exhibits exponential decay and thus conveys information about the neuron's spike timing: a high membrane potential indicates that a neuron is about to fire, while a low membrane potential indicates that the neuron has already fired [32].
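Following the description above, a VDSP-style update can be sketched as follows: on each post-synaptic spike, the rule reads the pre-synaptic membrane potential instead of stored spike times; a low potential (the pre-neuron fired recently) is treated as a causal pairing and potentiates the weight, while a high potential (the pre-neuron is about to fire) depresses it. This is only a schematic reading of the rule; the thresholds, learning rate, and bounds are illustrative assumptions.

```python
# Schematic VDSP-style update: on a post-synaptic spike, the weight change is
# decided from the PRE-synaptic membrane potential rather than stored spike times.
# Low potential -> pre-neuron recently fired (causal) -> potentiate.
# High potential -> pre-neuron about to fire (anti-causal) -> depress.

def vdsp_update_on_post_spike(w, v_pre, v_rest=0.0, v_threshold=1.0,
                              lr=0.05, w_min=0.0, w_max=1.0):
    # Normalise the pre-synaptic potential to [0, 1] between rest and threshold.
    depolarisation = (v_pre - v_rest) / (v_threshold - v_rest)
    if depolarisation < 0.5:                 # pre-neuron recently fired and was reset
        w += lr * (1.0 - depolarisation)     # potentiate
    else:                                    # pre-neuron is close to firing
        w -= lr * depolarisation             # depress
    return min(max(w, w_min), w_max)

print(vdsp_update_on_post_spike(0.5, v_pre=0.1))   # low potential -> potentiation
print(vdsp_update_on_post_spike(0.5, v_pre=0.9))   # high potential -> depression
```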

4.3. Reinforcement Learning

Reinforcement learning is a long-term, iterative algorithm that learns from previous experiences or feedback without using any labelled datasets; its accuracy increases with the amount of feedback received. Implementing reinforcement learning on bio-inspired hardware such as neuromorphic chips remains a challenge [19]. One study used reinforcement learning with reward-modulated STDP (R-STDP), a three-factor learning rule that achieves the same effect as STDP, namely identifying correlations between pre- and post-synaptic neurons, while also capturing a reward signal that represents the progress of learning in any given iteration [33]. Further development of R-STDP could yield a beneficial algorithm and help improve the overall performance of such systems.
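A schematic three-factor (reward-modulated) STDP update along the lines described above: the pair-based STDP term is accumulated into a decaying eligibility trace, and the actual weight change is gated by a global reward signal. The parameters, trace decay, and example event sequence are illustrative assumptions, not the specific rule of the cited study.

```python
# Schematic reward-modulated STDP (R-STDP): a pair-based STDP term is stored in a
# decaying eligibility trace, and the weight only changes when a global reward
# signal arrives, scaled by that trace.

import math

def stdp_term(dt_ms, a_plus=0.05, a_minus=0.055, tau=20.0):
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau)
    return 0.0

w, trace = 0.5, 0.0
trace_decay = 0.9

# Each step: decay the trace, add the STDP term for any spike pair, then apply
# the reward-gated update  w <- w + lr * reward * trace.
events = [(+5.0, 0.0), (-3.0, 0.0), (+8.0, 1.0)]   # (spike-time difference, reward)
for dt, reward in events:
    trace = trace_decay * trace + stdp_term(dt)
    w += 0.1 * reward * trace

print("weight after reward-gated updates:", w)
```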

5. Neuromorphic Projects

Various neuromorphic projects have been implemented in industry and academia. They differ in implementation method (digital or analogue), in whether learning is performed on-chip or externally, and in their features. A study by Ivanov et al. provides a summary of these projects, as shown in Table 1 [9]. Comparing the properties of each project, it can be observed that in-memory computation has not been implemented in digital designs, as they require data exchange between the arithmetic logic unit (ALU) and memory cells, which introduces complexity and added cost. This limitation can be mitigated, as in the Loihi and TrueNorth projects, by using more SRAM (static random access memory) to move the memory closer to the computation [9].

6. Proposed Method and Future Work

Designing a heterogeneous quantum neuromorphic computing system could further enhance performance and reduce energy consumption in artificial neurons. Quantum computing processes information according to the principles of quantum mechanics, allowing simultaneous parallel exploration of different possibilities. Information is represented using quantum bits (qubits), which exploit the principle of superposition and can exist in multiple states (0 and 1) at once. The use of quantum computing and quantum materials could leverage the excellent pattern recognition capabilities of neuromorphic computing while reducing its overall power consumption. However, implementing quantum neural networks directly in hardware poses a challenge due to the need for precise control over connection strengths. Quantum coherence is susceptible to dissipation and dephasing, making hardware implementation complex. In addition, large spatial variations in heating and temperature can occur in such a heterogeneous system. Further research is required on these limitations before the system can operate successfully [34,35,36].
In our previous work [37], we set out an architecture for the efficient processing of neural networks through neuromorphic processing. The NeuroTower is effectively a 2D mesh-connected network-on-chip with stacks of DRAM (dynamic random access memory) integrated on top as 3D stacked memory. This architecture employs programmable neurosequence generators, which act as the communication medium in the system and aid the retrieval of data between the DRAM stacks and the processing elements. Our research introduces a pruning component to exploit sparsity and reduce network-on-chip traffic, a significant source of power consumption in many hardware accelerators. The pruning unit prevents ineffectual operations from being executed, leaving only the effectual data required for processing.
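The pruning idea described above can be illustrated with a small sketch that filters out ineffectual (zero-valued) weight/activation pairs before they are dispatched over the network-on-chip. The data layout and function names are purely illustrative assumptions and not the NeuroTower implementation.

```python
# Illustrative sketch of the pruning idea: drop ineffectual (zero-valued)
# weight/activation pairs before dispatching them to processing elements, so
# they generate neither network-on-chip traffic nor multiply-accumulate work.

def prune_operands(weights, activations):
    """Keep only pairs where both operands are non-zero (effectual work)."""
    return [(w, a) for w, a in zip(weights, activations) if w != 0.0 and a != 0.0]

weights = [0.3, 0.0, -0.8, 0.0, 0.5]
activations = [1.0, 2.0, 0.0, 0.0, 0.7]

effectual = prune_operands(weights, activations)
print(f"dispatched {len(effectual)} of {len(weights)} pairs:", effectual)
print("partial sum:", sum(w * a for w, a in effectual))
```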
In the NeuroTower, the memory is integrated as a stack of multiple DRAM chips, each divided into 16 partitions. A column of vertically aligned partitions forms a vault, as shown in Figure 8 below. Each vault has an associated vault controller, which manages data movement between the vault and the other elements of the NeuroTower. Each vault is connected to one processing element to allow parallel processing, and these connections are realised using high-speed through-silicon vias (TSVs) [37,38]. The DRAM stack is crucial to the operation of the system, as all the information needed for processing is contained there: every layer of the neural network, its state, and its connectivity weights are stored in the vaults of the DRAM. This implies that the data movement paths are known before processing begins. To exploit this, the paths are compiled into finite state machine descriptions that drive the programmable neurosequence generators (PNGs) [37,39]. To initiate processing, the host loads these state machine descriptions into the PNGs, which then begin the data-driven processing of each layer of the neural network.

7. Conclusions

With further advancements in neuromorphic computing, including large-scale implementations and on-chip learning, it has the potential to replace current Von Neumann computers for running complex algorithms. Neuromorphic computing's power efficiency and learning capabilities allow it to drastically enhance a system's performance. Future research should focus on optimising neuromorphic chips' properties and learning techniques and on using them for a wide range of applications rather than a single specified application. Adopting a digital or mixed-design hardware approach for running complex AI algorithms, with a NeuroTower architecture coupled with quantum computing, could result in a flexible computing system with large memory and enhanced performance and speed while reducing energy consumption.

Author Contributions

Conceptualization, A.A. and S.A.A.W.; writing—original draft preparation, S.A.A.W.; writing—review and editing, S.A.A.W.; visualization, S.A.A.W.; supervision, F.M.; project administration, A.A. and F.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All data underlying the results are available as part of the article and no additional source data are required.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Arikpo, I.I.; Ogban, F.U.; Eteng, I.E. Von neumann architecture and modern computers. Glob. J. Math. Sci. 2008, 6, 97–103. [Google Scholar] [CrossRef]
  2. Luo, T.; Wong, W.F.; Goh, R.S.M.; Do, A.T.; Chen, Z.; Li, H.; Jiang, W.; Yau, W. Achieving Green AI with Energy-Efficient Deep Learning Using Neuromorphic Computing. Commun. ACM 2023, 66, 52–57. [Google Scholar] [CrossRef]
  3. Kumar, S.; Wang, X.; Strachan, J.P.; Yang, Y.; Lu, W.D. Dynamical memristors for higher-complexity neuromorphic computing. Nat. Rev. Mater. 2022, 7, 575–591. [Google Scholar] [CrossRef]
  4. Xu, B.; Huang, Y.; Fang, Y.; Wang, Z.; Yu, S.; Xu, R. Recent Progress of Neuromorphic Computing Based on Silicon Photonics: Electronic–Photonic Co-Design, Device, and Architecture. Photonics 2022, 9, 698. [Google Scholar] [CrossRef]
  5. Schuman, C.D.; Kulkarni, S.R.; Parsa, M.; Mitchell, J.P.; Date, P.; Kay, B. Opportunities for neuromorphic computing algorithms and applications. Nat. Comput. Sci. 2022, 2, 10–19. [Google Scholar] [CrossRef]
  6. Byun, K.; Choi, I.; Kwon, S.; Kim, Y.; Kang, D.; Cho, Y.W.; Yoon, S.K.; Kim, S. Recent Advances in Synaptic Nonvolatile Memory Devices and Compensating Architectural and Algorithmic Methods toward Fully Integrated Neuromorphic Chips. Adv. Mater. Technol. 2022, 8, 2200884. [Google Scholar] [CrossRef]
  7. Javanshir, A.; Nguyen, T.T.; Mahmud, M.A.P.; Kouzani, A.Z. Advancements in Algorithms and Neuromorphic Hardware for Spiking Neural Networks. Neural Comput. 2022, 34, 1289–1328. [Google Scholar] [CrossRef]
  8. Bartolozzi, C.; Indiveri, G.; Donati, E. Embodied neuromorphic intelligence. Nat. Commun. 2022, 13, 1024. [Google Scholar] [CrossRef]
  9. Ivanov, D.; Chezhegov, A.; Kiselev, M.; Grunin, A.; Larionov, D. Neuromorphic artificial intelligence systems. Front. Neurosci. 2022, 16, 959626. [Google Scholar] [CrossRef] [PubMed]
  10. Shrestha, A.; Fang, H.; Mei, Z.; Rider, D.P.; Wu, Q.; Qiu, Q. A Survey on Neuromorphic Computing: Models and Hardware. IEEE Circuits Syst. Mag. 2022, 22, 6–35. [Google Scholar] [CrossRef]
  11. Wei, Q.; Gao, B.; Tang, J.; Qian, H.; Wu, H. Emerging Memory-Based Chip Development for Neuromorphic Computing: Status, Challenges, and Perspectives. IEEE Electron Devices Mag. 2023, 1, 33–49. [Google Scholar] [CrossRef]
  12. Guo, T.; Pan, K.; Jiao, Y.; Sun, B.; Du, C.; Mills, J.P.; Chen, Z.; Zhao, X.; Wei, L.; Zhou, Y.N.; et al. Versatile memristor for memory and neuromorphic computing. Nanoscale Horiz. 2022, 7, 299–310. [Google Scholar] [CrossRef]
  13. Zhu, Y.; Mao, H.; Zhu, Y.; Wang, X.; Fu, C.; Ke, S.; Wan, C.; Wan, Q. CMOS-compatible neuromorphic devices for neuromorphic perception and computing: A review. Int. J. Extrem. Manuf. 2023, 5, 042010. [Google Scholar] [CrossRef]
  14. Kimura, M.; Shibayama, Y.; Nakashima, Y. Neuromorphic chip integrated with a large-scale integration circuit and amorphous-metal-oxide semiconductor thin-film synapse devices. Sci. Rep. 2022, 12, 5359. [Google Scholar] [CrossRef]
  15. Li, B.; Zhong, D.; Chen, X.; Liu, C. Enabling Neuromorphic Computing for Artificial Intelligence with Hardware-Software Co-Design. Artif. Intell. 2023. [CrossRef]
  16. Christensen, D.V.; Dittmann, R.; Linares-Barranco, B.; Sebastian, A.; Le Gallo, M.; Redaelli, A.; Slesazeck, S.; Mikolajick, T.; Spiga, S.; Menzel, S.; et al. 2022 roadmap on neuromorphic computing and engineering. Neuromorphic Comput. Eng. 2022, 2, 022501. [Google Scholar] [CrossRef]
  17. Pham, M.D.; D’Angiulli, A.; Dehnavi, M.M.; Chhabra, R. From Brain Models to Robotic Embodied Cognition: How Does Biological Plausibility Inform Neuromorphic Systems? Brain Sci. 2023, 13, 1316. [Google Scholar] [CrossRef]
  18. Zhang, H.; Ho, N.M.; Polat, D.Y.; Chen, P.; Wahib, M.; Nguyen, T.T.; Meng, J.; Goh, R.S.M.; Matsuoka, S.; Luo, T.; et al. Simeuro: A Hybrid CPU-GPU Parallel Simulator for Neuromorphic Computing Chips. IEEE Trans. Parallel Distrib. Syst. 2023, 34, 2767–2782. [Google Scholar] [CrossRef]
  19. Das, R.P.; Biswas, C.; Majumder, S. Study of Spiking Neural Network Architecture for Neuromorphic Computing. In Proceedings of the 2022 IEEE 11th International Conference on Communication Systems and Network Technologies (CSNT), Indore, India, 23–24 April 2022. [Google Scholar] [CrossRef]
  20. Panzeri, S.; Janotte, E.; Pequeño-Zurro, A.; Bonato, J.; Bartolozzi, C. Constraints on the design of neuromorphic circuits set by the properties of neural population codes. Neuromorphic Comput. Eng. 2023, 3, 012001. [Google Scholar] [CrossRef]
  21. Nguyen, D.-A.; Tran, X.-T.; Iacopi, F. A Review of Algorithms and Hardware Implementations for Spiking Neural Networks. J. Low Power Electron. Appl. 2021, 11, 23. [Google Scholar] [CrossRef]
  22. Frenkel, C.; Lefebvre, M.; Legat, J.-D.; Bol, D. A 0.086-mm2 12.7-pJ/SOP 64k-Synapse 256-Neuron Online-Learning Digital Spiking Neuromorphic Processor in 28 nm CMOS. IEEE Trans. Biomed. Circuits Syst. 2018, 13, 145–158. [Google Scholar] [CrossRef] [PubMed]
  23. Yin, S.; Venkataramanaiah, S.K.; Chen, G.K.; Krishnamurthy, R.; Cao, Y.; Chakrabarti, C.; Seo, J.-S. Algorithm and hardware design of discrete-time spiking neural networks based on back propagation with binary activations. In Proceedings of the 2017 IEEE Biomedical Circuits and Systems Conference (BioCAS), Turin, Italy, 19–21 October 2017. [Google Scholar] [CrossRef]
  24. Zheng, N.; Mazumder, P. A Low-Power Hardware Architecture for On-Line Supervised Learning in Multi-Layer Spiking Neural Networks. In Proceedings of the 2018 IEEE International Symposium on Circuits and Systems (ISCAS), Florence, Italy, 27–30 May 2018. [Google Scholar] [CrossRef]
  25. Chen, G.K.; Kumar, R.; Sumbul, H.E.; Knag, P.C.; Krishnamurthy, R.K. A 4096-Neuron 1M-Synapse 3.8-pJ/SOP Spiking Neural Network with On-Chip STDP Learning and Sparse Weights in 10-nm FinFET CMOS. IEEE J. Solid-State Circuits 2019, 54, 992–1002. [Google Scholar] [CrossRef]
  26. Spyrou, T.; Stratigopoulos, H.-G. On-Line Testing of Neuromorphic Hardware. In Proceedings of the 2023 IEEE European Test Symposium (ETS), Venezia, Italy, 22–26 May 2023. [Google Scholar] [CrossRef]
  27. Frenkel, C.; Bol, D.; Indiveri, G. Bottom-Up and Top-Down Approaches for the Design of Neuromorphic Processing Systems: Tradeoffs and Synergies between Natural and Artificial Intelligence. Proc. IEEE 2023, 111, 623–652. [Google Scholar] [CrossRef]
  28. Ye, N.; Cao, L.; Yang, L.; Zhang, Z.; Fang, Z.; Gu, Q.; Yang, G.-Z. Improving the robustness of analog deep neural networks through a Bayes-optimized noise injection approach. Commun. Eng. 2023, 2, 25. [Google Scholar] [CrossRef]
  29. Ye, N.; Mei, J.; Fang, Z.; Zhang, Y.; Zhang, Z.; Wu, H.; Liang, X. BayesFT: Bayesian Optimization for Fault Tolerant Neural Network Architecture. In Proceedings of the 2021 58th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 5–9 December 2021. [Google Scholar] [CrossRef]
  30. Zhong, Y.; Wang, Z.; Cui, X.; Cao, J.; Wang, Y. An Efficient Neuromorphic Implementation of Temporal Coding Based On-chip STDP Learning. IEEE Trans. Circuits Syst. II-Express Briefs 2023, 70, 4241–4245. [Google Scholar] [CrossRef]
  31. Agebure, M.A.; Wumnaya, P.A.; Baagyere, E.Y. A Survey of Supervised Learning Models for Spiking Neural Network. Asian J. Res. Comput. Sci. 2021, 9, 35–49. [Google Scholar] [CrossRef]
  32. Clark, K.; Wu, Y. Survey of Neuromorphic Computing: A Data Science Perspective. In Proceedings of the 2023 IEEE 3rd International Conference on Computer Communication and Artificial Intelligence (CCAI), Taiyuan, China, 26–28 May 2023. [Google Scholar] [CrossRef]
  33. Garg, N.; Balafrej, I.; Stewart, T.C.; Portal, J.M.; Bocquet, M.; Querlioz, D.; Rouat, J.; Beilliard, Y.; Alibart, F. Voltage-dependent synaptic plasticity: Unsupervised probabilistic Hebbian plasticity rule based on neurons membrane potential. Front. Neurosci. 2022, 16, 983950. [Google Scholar] [CrossRef] [PubMed]
  34. Wunderlich, T.; Kungl, A.F.; Müller, E.; Hartel, A.; Stradmann, Y.; Aamir, S.A.; Grübl, A.; Heimbrecht, A.; Schreiber, K.; Stöckel, D.; et al. Demonstrating Advantages of Neuromorphic Computation: A Pilot Study. Front. Neurosci. 2019, 13, 260. [Google Scholar] [CrossRef]
  35. Ghosh, S.; Nakajima, K.; Krisnanda, T.; Fujii, K.; Liew, T.C.H. Quantum Neuromorphic Computing with Reservoir Computing Networks. Adv. Quantum Technol. 2021, 4, 2100053. [Google Scholar] [CrossRef]
  36. Hoffmann, A.; Ramanathan, S.; Grollier, J.; Kent, A.D.; Rozenberg, M.J.; Schuller, I.K.; Shpyrko, O.G.; Dynes, R.C.; Fainman, Y.; Frano, A.; et al. Quantum materials for energy-efficient neuromorphic computing: Opportunities and challenges. APL Mater. 2022, 10, 070904. [Google Scholar] [CrossRef]
  37. Asad, A.; Kaur, R.; Mohammadi, F. A Survey on Memory Subsystems for Deep Neural Network Accelerators. Future Internet 2022, 14, 146. [Google Scholar] [CrossRef]
  38. Asad, A.; Mohammadi, F. NeuroTower: A 3D Neuromorphic Architecture with Low-Power TSVs. In Lecture Notes in Networks and Systems; Springer International Publishing: Cham, Switzerland, 2022; pp. 227–236. [Google Scholar] [CrossRef]
  39. Kaur, R.; Asad, A.; Mohammadi, F. A Comprehensive Review on Processing-in-Memory Architectures for Deep Neural Networks. Computers 2024, 13, 174. [Google Scholar] [CrossRef]
Figure 1. Four main spiking neuron models used in neuromorphic chips.
Figure 2. Neuromorphic architecture characterisation diagram.
Figure 3. Neuromorphic computing learning methods characterisation diagram.
Figure 4. Von Neumann architecture versus neuromorphic architecture.
Figure 5. Example of SNN and information transmission between neurons through synapses.
Figure 6. Backpropagation algorithm network structure.
Figure 7. Spike-timing-dependent plasticity architecture where the weights are adjusted based on the spike timings of the pre-synaptic neurons (i) and post-synaptic neurons (j).
Figure 8. NeuroTower architecture with depiction of stacked memory.
Table 1. Summary of neuromorphic project properties [9].

TrueNorth: in-memory computation: near-memory; signal: spikes; neurons/synapses: 1 M/256 M; on-device learning: no; analogue: no; event-based: yes; process: 28 nm; features: first industrial neuromorphic chip without training (IBM).
Loihi: in-memory computation: near-memory; signal: spikes; neurons/synapses: 128 K/128 M; on-device learning: STDP; analogue: no; event-based: yes; process: 14 nm; features: first neuromorphic chip with training (Intel).
Loihi 2: in-memory computation: near-memory; signal: real numbers, spikes; neurons/synapses: 120 K/1 M; on-device learning: STDP; analogue: no; event-based: yes; process: 7 nm; features: non-binary spikes, programmable neurons.
Tianjic: in-memory computation: near-memory; signal: real numbers, spikes; neurons/synapses: 40 K/10 M; on-device learning: no; analogue: no; event-based: yes; process: 28 nm; features: hybrid chip.
SpiNNaker: in-memory computation: near-memory; signal: real numbers, spikes; neurons/synapses: -; on-device learning: STDP; analogue: no; event-based: no; process: 22 nm; features: scalable computer for SNN simulation.
BrainScaleS: in-memory computation: yes; signal: real numbers, spikes; neurons/synapses: 512/130 K; on-device learning: STDP; analogue: yes; event-based: yes; process: 65 nm; features: analogue neurons, large size.
GrAIOne: in-memory computation: near-memory; signal: real numbers, spikes; neurons/synapses: 200 K/-; on-device learning: no; analogue: no; event-based: yes; process: 28 nm; features: NeuronFlow architecture, effective support of sparse computations.
Akida: in-memory computation: near-memory; signal: spikes; neurons/synapses: 1.2 M/10 B; on-device learning: STDP; analogue: no; event-based: yes; process: 28 nm; features: incremental, one-shot, and continuous learning for CNNs.
Memristor (IBM): in-memory computation: yes; signal: spikes; neurons/synapses: 512/64 K; on-device learning: yes; analogue: yes; event-based: yes; process: 50 nm; features: allows each synaptic cell to operate asynchronously.