1. Introduction
In 1943, McCulloch and Pitts introduced a mathematical representation of neural activity, laying the foundation for ANNs [1]. Although ANNs have gained prominence, they come with challenges, such as hyperparameter tuning complexities, the necessity for abundant labeled training datasets, excessive model refinement, and inherent opaqueness stemming from their black-box nature [2,3]. Interestingly, these approaches often overlook the essence of deep neural networks: the efficient data processing capability of individual neurons. Instead, they frequently boost model performance through statistical theories or intricate learning strategies. Some researchers have also chosen to reconceptualize neuron models rather than merely deepening neural networks [4]. Furthermore, the McCulloch–Pitts neuron model, which solely represents connection strength between two neurons using a weight, has faced criticism for its simplification [5].
A real biological neuron possesses intricate spatial and temporal features. Drawing inspiration from the neuron’s ability to process temporal information, a unique and biologically accurate dendritic neuron model (DNM) was proposed. This model stands out due to its distinct architecture and excitation functions, incorporating sigmoid functions for synaptic interactions and a multiplication operation to emulate dendritic interactions [6]. As a relatively new model, the DNM leaves ample scope for refinement and improvement. When classification problems are addressed through combinatorial optimization of algorithmically trained artificial neurons, pairing more advanced algorithms with more sophisticated neuron models can substantially enhance performance. Recently, an upgraded iteration of the DNM has emerged as a promising choice for training neurons, demonstrating superior outcomes in classification tasks. Notably, among these advancements, the refined DNM-R model has achieved the highest classification accuracy when paired with the same algorithm [7]. Nonetheless, this optimization approach comes with a notable drawback: the DNM employed as the training model encompasses numerous parameters requiring adjustment. Typically, at least three parameters must be tuned, namely, k, q, and M, which correspond to the amplification factor, the discrimination factor, and the number of dendritic branches, respectively. In prior investigations, k and q were each typically assigned five candidate values for the tuning experiments, while M sometimes took as many as 20 candidate values. From the perspective of orthogonal experiments, the parametric experiments alone would therefore have to be repeated at least 5 × 5 × 20 = 500 times. This imposes a substantial burden, both on research endeavors and on practical applications. Therefore, it is imperative to introduce a methodology capable of adaptively optimizing the model parameters.
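To make the roles of k, q, and M concrete, the following is a minimal sketch of a DNM forward pass under the commonly cited formulation of sigmoid synaptic functions, multiplicative dendritic branches, a summing membrane, and a sigmoid soma. The shared amplification factor k at the soma and all variable names are illustrative assumptions, and the DNM-R variant of [7] may differ in detail.

```python
import numpy as np

def dnm_forward(x, w, q, k=5.0, q_soma=0.5):
    """Minimal DNM forward pass (illustrative, not the exact DNM-R variant).

    x      : (D,)   input features
    w, q   : (M, D) synaptic weights and thresholds, one row per branch
    k      : amplification factor, assumed shared by synapses and soma
    q_soma : soma threshold (discrimination factor)
    """
    # Synaptic layer: sigmoid response of every synapse on every branch.
    Y = 1.0 / (1.0 + np.exp(-k * (w * x - q)))    # shape (M, D)
    # Dendritic layer: multiplicative interaction along each branch.
    Z = np.prod(Y, axis=1)                        # shape (M,)
    # Membrane layer: sum the M branch outputs.
    V = np.sum(Z)
    # Soma layer: final sigmoid firing decision.
    return 1.0 / (1.0 + np.exp(-k * (V - q_soma)))

# Example: M = 4 dendritic branches on a 3-dimensional input.
rng = np.random.default_rng(0)
print(dnm_forward(rng.random(3), rng.standard_normal((4, 3)),
                  rng.standard_normal((4, 3))))
```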
Given that training the neuron model may constitute an NP-hard problem [8], evolutionary algorithms (EAs) emerge as a potent solution. The genesis of EAs can be traced back to the genetic algorithm (GA) [9], which subsequently spurred the creation of various algorithms, including differential evolution (DE) [10] and success-history-based parameter adaptation for differential evolution (SHADE) [11]. A key enhancement of DE over GA is its mutation strategy, which leverages differences among individuals instead of mere random variations. SHADE, in turn, refines DE by steering offspring toward randomly selected top-ranking individuals (the current-to-pbest/1 mutation strategy). Moreover, the control parameters of successful individuals are preserved across iterations in a historical memory to guide the learning of subsequent generations. The efficacy of DE-derived algorithms has been well documented [12], with their enhanced versions frequently securing leading spots in the IEEE CEC contests [13].
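To illustrate the mechanism described above, the sketch below shows SHADE's current-to-pbest/1 mutation and the sampling of control parameters from the success-history memory. It is a simplified outline: the external archive and the memory-update step are omitted, the function names are illustrative, and the clipping of F is a simplification of the usual resampling rule, so it is not the exact implementation used in this study.

```python
import numpy as np

def sample_parameters(M_CR, M_F, rng):
    """Draw CR ~ N(M_CR[r], 0.1) and F ~ Cauchy(M_F[r], 0.1) from a random memory slot."""
    r = rng.integers(len(M_F))
    CR = float(np.clip(rng.normal(M_CR[r], 0.1), 0.0, 1.0))
    # Cauchy sample via the inverse CDF; values are clipped here instead of resampled.
    F = float(np.clip(M_F[r] + 0.1 * np.tan(np.pi * (rng.random() - 0.5)), 1e-8, 1.0))
    return CR, F

def current_to_pbest_1(pop, fitness, i, F, p=0.1, rng=None):
    """DE/current-to-pbest/1 mutation (lower fitness is better; no external archive)."""
    rng = rng or np.random.default_rng()
    N = len(pop)
    n_best = max(1, int(round(p * N)))                     # size of the p-best pool
    x_pbest = pop[rng.choice(np.argsort(fitness)[:n_best])]
    r1, r2 = rng.choice([j for j in range(N) if j != i], size=2, replace=False)
    # v_i = x_i + F*(x_pbest - x_i) + F*(x_r1 - x_r2)
    return pop[i] + F * (x_pbest - pop[i]) + F * (pop[r1] - pop[r2])
```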
Various adaptation strategies can be employed in optimization processes, including random perturbation of the crossover rate and scale factor using probability distributions such as the normal and Cauchy distributions, a common practice in many differential evolution algorithms [14]. Additionally, some research endeavors have explored the utilization of fitness–distance balance strategies to adaptively fine-tune algorithmic parameters [15,16]. These investigations have, to varying degrees, shown that adaptive strategies not only obviate the need for meticulous parameter tuning but also contribute to enhanced algorithmic performance. The implementation of suitable adaptive strategies is particularly pertinent when optimizing real-world problems, such as classification tasks. It is important to recognize that training artificial neurons with an algorithm essentially means iterating their weights, which is an adaptive process in itself. Consequently, the algorithm can be construed as an efficient adaptive instrument endowed with nonlinear properties, setting it apart from conventional mathematical techniques. Advanced algorithms are inherently designed to tackle intricate black-box problems, which typically necessitate robust exploitation and exploration capabilities; these attributes render them well suited to the adaptive adjustment of hyperparameters. In essence, it is advisable to treat the hyperparameters within artificial neurons as variables subject to iteration within the algorithmic framework, leveraging the algorithm’s evolutionary prowess to fine-tune these hyperparameters.
In this study, we integrated the key hyperparameters of the DNM into SHADE and implemented an adaptive hyperparameter-adjustment approach that leverages the algorithm's inherent evolutionary capability. The resulting optimization framework is denoted hyperparameter-tuning success-history-based parameter adaptation for differential evolution (HSHADE), and it yields a type of neuron that can self-evolve as the algorithm iterates. We conducted a comprehensive evaluation of HSHADE on a benchmark of 10 real-world problems commonly employed for assessing algorithmic performance in classification tasks. Comparative analyses were performed against the original algorithm, well-established algorithms with a track record of effectiveness, and contemporary state-of-the-art algorithms on the same problem set. The findings demonstrate that HSHADE exhibits a notable advantage in classification accuracy and significantly streamlines the parameter-tuning process.
The main contributions of this study are summarized below:
- (1) HSHADE successfully iterates the fixed parameters of the DNM adaptively using evolutionary algorithms, thus reducing its tuning workload.
- (2) HSHADE achieves the same or better accuracy than the state-of-the-art algorithms on the same set of classification problems.
- (3) HSHADE maintains the fast problem-solving characteristics of a single neuron and also achieves very high accuracy.
- (4) The power-law distribution of information interaction networks observed during the iterative process of HSHADE provides new insights for future neural model training.
The remainder of the paper is structured as follows: the DNMs, EAs, and self-evolving DNM are formulated in Section 2; the experimental results are analyzed in Section 3; and Section 4 presents the discussion and conclusions.
4. Discussion and Conclusions
In the realm of artificial neural networks, the deep understanding and optimization of individual neuron dynamics have always held profound significance. The DNM is one such example that has caught the attention of the research community, particularly due to its unique architectural nuances and data processing capabilities. Despite its promising features, the DNM poses certain complexities, especially when it comes to parameter tuning. Recognizing this challenge, this study has pioneered the development of HSHADE—an innovative approach that synergistically marries the evolutionary prowess of the SHADE algorithm with the DNM’s hyperparameters. This integration heralds a new generation of neurons, ones that are characterized by their ability to self-evolve during algorithmic iterations.
The comparative evaluations and benchmarks indicate that the HSHADE framework significantly outperforms its contemporaries, particularly in terms of classification accuracy. Moreover, it presents a more streamlined and efficient method for parameter tuning, thereby addressing one of the primary challenges associated with the DNM. However, as with most pioneering endeavors, HSHADE is not without its limitations. A salient remaining challenge is the auto-tuning of the parameter M, which corresponds to the number of dendritic branches in the neuron model. The crux of the challenge is that any change in M alters the computational dimensions, making matrix operations infeasible. In essence, while the HSHADE framework has made significant strides in auto-tuning certain parameters, automation of the M parameter remains elusive.
As we gaze into the future of this research, the overarching goal will pivot towards devising methodologies that can seamlessly and effectively auto-tune the M parameter. As previously highlighted, variations in M result in non-uniform dimensions among individuals within the population. In this context, the problem being optimized is termed a metameric variable-length problem [28]. Contemporary mainstream algorithms rely on matrix operations to generate new individuals, making them incompatible with the metameric variable-length problem setting. Essentially, this study serves as an intermediary step. Our primary objective was to incorporate all parameters (including M) into the individuals of the evolutionary algorithm, facilitating the self-evolution of these parameters. However, only the self-evolution of k and q was successfully achieved. Tackling this challenge would not only enhance the efficacy of the HSHADE framework but also further solidify its position as a game-changer in the domain of artificial neuron modeling and optimization.
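To make the dimensionality obstacle concrete, the short sketch below uses a hypothetical helper, vector_length, based on the illustrative encoding from the Introduction: an individual's length depends on M, so individuals built with different M values have different dimensions and cannot be combined by SHADE's component-wise difference and crossover operations.

```python
# Hypothetical helper: length of the extended decision vector under the
# encoding sketched in the Introduction (2*M*D synaptic parameters plus
# two hyperparameter genes); the exact encoding in HSHADE may differ.
def vector_length(D, M, n_hyper=2):
    return 2 * M * D + n_hyper

print(vector_length(D=13, M=10))   # 262
print(vector_length(D=13, M=15))   # 392
# Vectors of length 262 and 392 cannot be mixed by component-wise DE
# operators, which is why M could not simply be appended as another gene.
```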
In terms of validity threats, the sole variable examined in this study is the integration of hyperparameters into the algorithm's iterative process, with all other experimental conditions held constant. Consequently, there are no discernible threats to internal validity at present. Nonetheless, given that our selection was confined to datasets from the UCI Machine Learning Repository, future studies should broaden the spectrum of datasets used to address potential threats to external validity.