Article

Specifications for Modelling of the Phenomenon of Compression of Closed-Cell Aluminium Foams with Neural Networks

1 Faculty of Civil Engineering, Cracow University of Technology, 31-155 Cracow, Poland
2 Faculty of Electrical and Computer Engineering, Cracow University of Technology, 31-155 Cracow, Poland
3 Faculty of Mechanical Engineering and Robotics, AGH University of Science and Technology, 30-059 Cracow, Poland
* Author to whom correspondence should be addressed.
Materials 2022, 15(3), 1262; https://doi.org/10.3390/ma15031262
Submission received: 6 December 2021 / Revised: 2 February 2022 / Accepted: 4 February 2022 / Published: 8 February 2022

Abstract

The article presents a novel application of the most up-to-date computational approach, i.e., artificial intelligence, to the problem of the compression of closed-cell aluminium. The objective of the research was to investigate whether the phenomenon can be described by neural networks and to determine the details of the network architecture so that the assumed criteria of accuracy, prognosis ability and repeatability would be satisfied. The methodology consisted of the following stages: experimental compression of foam specimens, choice of machine learning parameters, implementation of an algorithm for building different structures of artificial neural networks (ANNs), a two-step verification of the quality of the built models and, finally, the choice of the most appropriate ones. The studied ANNs were two-layer feedforward networks with varying numbers of neurons in the hidden layer. The following evaluation measures were assumed: mean square error (MSE), sum of absolute errors (SAE) and mean absolute relative error (MARE). The obtained results show that networks trained with the assumed learning parameters which had 4 to 11 neurons in the hidden layer were appropriate for modelling and prognosing the compression of closed-cell aluminium in the assumed domains; however, they fulfilled the accuracy and repeatability conditions to different degrees. The network with six neurons in the hidden layer provided the best accuracy of prognosis, at MARE ≈ 2.7%, but little robustness. On the other hand, the structure with a complexity of 11 neurons gave a similarly high quality of prognosis, at MARE ≈ 3.0%, but with a much better robustness indication (80%). The results also allowed the determination of the minimum threshold of the accuracy of prognosis: MARE ≈ 1.66%. In conclusion, the research shows that the phenomenon of the compression of aluminium foam can be described by neural networks within the frame of the assumptions made and allowed for the determination of detailed specifications of structure and learning parameters for building models with good accuracy and robustness.

1. Introduction

1.1. Problem Origins

Closed-cell aluminium is a well-known engineering material, mostly used where light-weight applications require satisfactory mechanical properties [1,2,3] or where energy absorption is a determinant [2,4]. Other properties, which make this material multifunctional, are: sound wave attenuation [5,6], electromagnetic wave absorption [7,8], vibration damping [9], thermal conductivity [10,11], relatively easy shape tailoring [12] and potential for usage in composites [13,14,15]. Examples of the usage of aluminium foams include, among others: the automotive industry, the space industry, the energy and battery field, military applications and machine construction [16,17,18,19,20]. We would also like to highlight civil engineering and architecture here, since these application fields are unjustly underestimated in the metal foam industry even though they have significant potential. Examples of the usage of closed- and open-cell metals include: structural elements (e.g., wall slabs, staircase slabs, parking slabs) [17,21,22], interior and exterior architectural design [23,24], highway sound absorbers [5,25], architectural electromagnetic shielding [26], sound absorbers in metro tunnels [17], dividing wall slabs with sound insulation (e.g., for lecture halls) [27] and the novel concept of earthquake protection against building pounding [28].
Considering so much engineering and design interest in closed-cell aluminium foams, it is a natural consequence that much scientific attention is drawn to the appropriate description of this material in various aspects. A significant number of works focus on structural characterization, e.g., [16,29,30,31], property analysis, e.g., [1,2,3,4,5,6,7,8,9,10,11,32], manufacturing methods, e.g., [12,16,33], experimental investigations, e.g., [34,35,36,37,38,39] and modelling. The field of modelling is widely researched, and the number of publications on this subject is extensive. They cover the modelling of basic mechanical properties or constitutive relations with different approaches: the analytical derivation of models based on the foam's cell geometry, incorporation of a probabilistic approach, application of the theory of elasticity and numerical solutions with finite-element methods, e.g., [40,41,42,43,44,45,46,47,48]. Applications of the most up-to-date numerical tool, i.e., neural networks, to the modelling of mechanical characteristics of (open-cell) metal foams can be found in papers [49,50,51,52]. The authors are not aware of any works which apply this valuable method to the modelling of base relations in closed-cell metal foams. However, the authors note that neural networks have been used in the modelling of closed-cell cellular polymers [53]. Neural networks are more often used for the analysis of specific features of metal foams and sponges, mainly heat exchange, e.g., [54,55].

1.2. Problem Statement and Proposed Solution’s Generals

There are extensive specific material models for closed-cell aluminium foams, which could be a starting point in the present discussion, such as the general relation given in Expression (1) [40]. It reflects the intuitive dependence of the mechanical behaviour of foam on its structural nature:
$$\frac{\text{cellular material property}}{\text{skeleton property}} = C\left(\frac{\rho}{\rho_s}\right)^n,$$
Formula (1) relates by a power law a chosen cellular material's property and a respective skeleton's property to both the cellular material's density $\rho$ and the skeleton's density $\rho_s$. Parameters $C$ and $n$ are supposed to be determined experimentally for the given material. This formula may assume specific forms, depending on what kind of property is desired (compressive strength, material's modulus, etc.) and on a general characterization of the considered material (closed- or open-cellular; elastic, plastic or brittle). However, it does not express a continuous model. Parameters $C$ and $n$ were already determined for some specific foams and sponges [40], but they are given mostly in the form of intervals of values and should be confronted with experimental data each time. Additionally, Relation (1) requires one to know the skeleton's density and the respective skeleton's property in order to determine the analogous foam's property. This fact may be a serious inconvenience in the case when the foam is bought as a ready product from an external supplier and data about the skeleton's material are inaccessible. However, despite all its shortcomings, the crucial premise of (1) is that a material's density is reflected in its behaviour. This dependency was a key for the assumption of the form of a general relationship that was the basis for neural network modelling in the presented research:
$$\sigma = f(\varepsilon, \rho).$$
Formula (2) refers to a general relationship between strain $\varepsilon$, the material's apparent density $\rho$ and its response in uniaxial static compression, expressed by stress $\sigma$.
The main goal of the presented research was then to use neural networks to find the most appropriate model, which, according to the above general formula, would be able to estimate a stress response for a given strain of a closed-cell aluminium foam of a given density.
Together with the assumption about the general form of Relation (2), some choices about the artificial intelligence tool also had to be determined. The discussion on neural network structural specifications such as the number of layers, activation functions, the number of inputs, the optimization of weights, the number of neurons, preprocessing of inputs, the choice of learning parameters, inclusion of statistical approach, etc., has been ongoing in recent years, e.g., [56,57,58,59,60]. However, it is a common belief that no universal method for assuming these parameters exists or that there is rather little guidance [57,61,62]. In consequence, the approach to each data set and each application has to be designed individually or within a class of similarity. In the face of a lack of prior works using neural networks for the modelling of closed-cell foams, it was decided that the main directions for the network structure in the presented study would be based on previous research on open-cell aluminium [49,50,51].

1.3. Research Significance

Artificial intelligence is an interesting, modern approach in engineering [63,64,65]. It can be used to address, among other things, mechanical problems in structural engineering [66,67], in civil engineering and architecture [68,69,70,71] and in material engineering [72,73,74,75,76]. As has already been said, metal foams can find their place in these fields, so building good-quality models for cellular metals with the help of neural networks is a new, tempting solution worth investigation and development. The starting point has already been reached for open-cell aluminium [49,50,51], and now the research has been continued for closed-cell aluminium foam—the results of which are reported in this article.
The presented research consisted of a few stages. It was decided that compressive tests would first be performed to obtain experimental data for network training (Section 2 and Section 3.1). Next, a general form of the network structure was accepted: a two-layer feedforward network with the Levenberg–Marquardt training algorithm (Section 3.2). The specification of hyperparameters was performed with a specially designed algorithm (Section 3.3 and Section 3.4). The results were assessed in a two-step evaluation procedure according to the assumed measures (Section 3.5 and Section 4). All in all, the research was aimed at answering the following questions:
  • Is it possible to describe the phenomenon of compression of aluminium foams with a model generated from neural networks based on the assumed general relation?
  • What assumptions/general choices about the networks’ structure and learning parameters should be determined?
  • How should the obtained results be evaluated? What criteria and what measures should be assumed?
  • What structure and learning parameters should be assumed to most adequately describe the phenomenon?
  • Is the model valid only for the training data (particular model), or is it capable of prognosing for new data (general model)?
The obtained results prove that these questions can be answered positively, with the details described in the present paper.

2. Material and Experiment

2.1. Material

The studied material was aluminium foam with the following general morphological characteristics: closed-cell, stochastic and isotropic (in representative volume). The material was cut into cubic specimens of 5 × 5 × 5 cm. Detailed specimen dimensions were determined according to the procedure from [77] using callipers with a 0.01 mm scale: VIS (VIS, Warszawa, Poland) and Vogel 202040.3 (Vogel Germany GmbH & Co. KG, Kevelaer, Germany). The masses of the foams were measured using a WPS600/C balance (Radwag, Poland). The apparent density of the specimens was calculated as the ratio of mass to volume; the average density was ρ = 0.240 g/cm³. Details of the specimens' characteristics are presented in Table 1, and a photo of one of the samples is shown in Figure 1a. Photos of all specimens before the experiments are enclosed in Appendix A.

2.2. Uniaxial Compression Experiments

Samples were tested using an MTS 810 testing machine of class 1 (MTS Systems Corporation, Eden Prairie, MN, USA) with an additional force sensor interface (capacity: 25 kN). Experimental data were gathered with the help of the computer programme TestWorks4 (MTS Systems Corporation, Eden Prairie, MN, USA, version V4.08D). Photographs were taken with a Casio Exilim EX-Z55 camera (Casio Computer Co., Ltd., Tokyo, Japan). The tests were conducted at room temperature. The compression procedure was performed quasi-statically, with the crosshead displacement rate set to 2.5 × 10⁻⁴ m/s. The initial (preload) force was assumed as 10 N. Figure 1b presents one of the specimens in the testing machine. All specimens were compressed up to a strain of ~70%. Testing was performed in accordance with the procedure from [34,35,36,37].
Figure 2 shows the results of the compression experiments in the form of a stress–strain graph. One can observe that the material's response is connected to its density: the plot values of the lightest samples are the lowest and those of the heaviest are the highest. Additionally, all plots exhibit the traits characteristic of the compression of a closed-cell metal foam [40]: an initial steep region interpreted as the elastic phase, then the first local maximum associated with compressive strength, followed by the plateau region where progressive cell collapse occurs and, lastly, the densification region where the curves become steep again. It is worth mentioning that during the plateau individual cells or cell groups collapse plastically, which is reflected in the graph by many local maxima and minima appearing alternately along the plateau.
General material features determined from the experimental results included the average compressive strength $\sigma_c = 1.40$ MPa, the average plateau strength $\sigma_{pl} = 1.44$ MPa and the average plateau end $\varepsilon_{pl.f} = 45.78\%$.

3. Methods: Computations with Artificial Neural Networks

The main concept of the research stage devoted to neural networks was to generate and train a considerable set of comparable networks, then assess them according to assumed criteria and finally—based on the choice of the ‘best’ network—determine the most adequate neural network structure and learning parameters for building a model of the phenomenon of closed-cell aluminium compression.
Before the realization of this idea, a set of assumptions was determined. The most important pre-choice was designing a two-step evaluation; this assumption affected all specific research actions. It was decided that we would generate and train all networks using experimental data for 11 (out of 12) specimens. The obtained networks were then evaluated for the first time in terms of whether they were good-quality models of the compression of those particular 11 samples. This step was important from the point of view of understanding the complexity of the physical phenomenon of aluminium foam compression that was to be described. The data for the left-out (12th) sample were used in the second evaluation, in terms of whether the networks were capable of adequate prognosis. This step was important for the generalization potential of the obtained models.
Other assumptions involved choosing the neural network learning parameters, designing the path for the building and training of networks and selecting the criteria measures. They will be discussed below, together with a detailed description of the proposed computational method. First, the processing and preparation of the experimental data will be reported (Section 3.1). Next, detailed information about the structure of the assumed networks will be given (Section 3.2). Following that, the choice of learning parameters (Section 3.3) and the algorithm used to build and train the networks (Section 3.4) will be described. Finally, the evaluation criteria for accuracy will be discussed (Section 3.5).
Calculations were performed using Matlab R2017b and R2019a in conjunction with Excel 2016.

3.1. Data for the Networks

During the experimental stage, 12 aluminium foam specimens underwent compression. The collected data were initially preprocessed to serve as arguments and targets for the neural networks (Section 3.1.1). Thereafter, the data set was divided into parts dedicated to network learning and verification (Section 3.1.2). The last aspect of data preprocessing was normalization, which was performed within the NN computations. The reverse procedure (denormalization) had to be performed after the training of the networks (Section 3.1.3).

3.1.1. Initial Preprocessing of Experimental Data

Raw data collected during the experiments with a data acquisition frequency of 100 Hz were subject to initial preprocessing consisting of smoothing (to attenuate noise on the load and displacement transducer signals) and rediscretization (to set a uniform strain data vector common to all specimens). Smoothing of the data was performed in the time domain using cubic smoothing splines [78,79]. The aim of smoothing was to eliminate the scatter of the raw data and, at the same time, to preserve the original stress–strain response, as exemplified in Figure 3. For this purpose the inbuilt Matlab function csaps was used with a smoothing parameter p = 0.01 [80]. The value of the parameter p was chosen by visual examination of the stress–strain plots corresponding to the raw and smoothed data over the interval $p \in \langle 0.01; 0.99 \rangle$ (examples are depicted in Figure A2 in Appendix B). As a result, the smoothed stress–strain data for every specimen contained 1000 data pairs in the strain range from 0 to 69%, which was the widest common range recorded for all specimens. Such a number of points ensured sufficiently precise mapping of the considered stress–strain curves. Due to the assumption that the strain vector was common to all specimens, the neural networks were expected to predict only the stress values as the responses to the given strains. Simultaneously, the same number of data points assures the same impact of each specimen on the learning/validation process of the neural networks.
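As a minimal illustration of this step, the sketch below smooths one specimen's record directly against strain (a simplification: the study smoothed in the time domain) and evaluates the spline on the common 1000-point grid; the variable names are illustrative, not taken from the authors' code.

```matlab
% Smoothing and rediscretization of one specimen's raw record.
p = 0.01;                              % smoothing parameter chosen in the study
strainGrid = linspace(0, 69, 1000);    % common 1000-point strain vector, 0-69%
% csaps (Curve Fitting Toolbox) fits a cubic smoothing spline and, given a
% fourth argument, evaluates it directly on the new grid:
stressSmooth = csaps(strainRaw, stressRaw, p, strainGrid);
```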
Depending on the specimens' density and stochastic cellular structure, their compressive response varied, which (let it be recalled again) can be seen in Figure 2 above: plots for lighter samples are in the bottom part, while plots for the heavier ones are at the top. Density, then, along with strain, had to be an input argument for the networks. The target was to be stress. This is in agreement with the already cited theoretical approach of Formula (1) and with the primary assumption expressed in Formula (2). Thus, arguments for the neural networks were set into n = 12,000 vectors (1000 for each sample): $A_i = [\varepsilon_i, \rho_i]^T$, where $i = 1, 2, \ldots, n$; $\varepsilon_i$ is the initially preprocessed value of strain from the experiments for the given sample in (%); and $\rho_i$ is the apparent density of the given sample in (g/cm³). The targets were the experimental values of stress $\sigma_i$ in (MPa) after initial preprocessing. The sequences of indices $i$ in arguments and targets correspond, of course.

3.1.2. Division of the Data Set

The experimental data set obtained for 12 specimens was divided in general into two subsets:
  • Data of 11 specimens, which were devoted to building the NN model of the phenomenon of compression of these particular aluminium foam samples;
  • Data of 1 specimen, which were to be used later for verification of whether the obtained model could be used as a general model, that is, for prognosing the phenomenon of the compression of aluminium foam with respect to different materials’ apparent density.
As for building the particular model, the assumed neural network learning procedure consisted of three stages: training, validation (network self-verification) and testing (exposure to new data) [81,82,83]. The data set from the compression of 11 specimens thus had to be subdivided to provide data for all three stages. It was assumed that 60% of the data would be devoted to the training phase, 20% to validation and 20% to testing. In practice, this was conducted by assigning every fifth input–target pair starting from i = 4 to the validation data subset and every fifth input–target pair starting from i = 5 to the test data subset. The remaining data constituted the data subset for the training phase. The division into subsets was not changed, so the subsets contained exactly the same data for each studied network.
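A sketch of this fixed split is given below, assuming the 11-specimen data are already assembled as inputs X (2 × n) and targets T (1 × n) with n = 11,000, and that net is a feedforward network object (its setup is sketched in Section 3.3); the names are illustrative.

```matlab
% Fixed 60/20/20 division, identical for every studied network.
n = size(X, 2);
valInd   = 4:5:n;                            % every 5th pair from i = 4 -> validation
testInd  = 5:5:n;                            % every 5th pair from i = 5 -> test
trainInd = setdiff(1:n, [valInd, testInd]);  % remaining 60% -> training
net.divideFcn = 'divideind';                 % divide by explicit, fixed indices
net.divideParam.trainInd = trainInd;
net.divideParam.valInd   = valInd;
net.divideParam.testInd  = testInd;
```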
Experimental results for specimen Z_14_p were separated as the data for verification of the prognosis capability. This sample was chosen because its graph was more or less in the middle of all individual stress–strain plots (Figure 2).

3.1.3. Normalization and Denormalization

As for the normalization, the inbuilt Matlab function mapminmax was used [84,85,86]. This function is a linear transformation of data from a certain range into an interval with the given desired boundaries and can be expressed as in Formula (3):
$$V' = \frac{V - V_{\min}}{V_{\max} - V_{\min}} \cdot (V'_{\max} - V'_{\min}) + V'_{\min}.$$
In Formula (3): $V$ is the value to be transformed; $V'$ is the new value; $V_{\max}, V_{\min}$ are the original interval boundaries; and $V'_{\max}, V'_{\min}$ are the desired range boundaries (in normalization they are assumed as −1 and 1). In our study the vectors $A_i$ were normalized respectively into the following input vectors: $X_i = [x_{1,i} = \varepsilon_{\mathrm{norm},i},\ x_{2,i} = \rho_{\mathrm{norm},i}]^T$, where $i = 1, 2, \ldots, n$.
Data prepared in this way were used in network training. The networks' outputs, which were supposed to correspond to stresses, were obtained; yet their values were within the normalization interval $\langle -1, 1 \rangle$: $y_i = \sigma_{\mathrm{appr.norm},i}$. Hence, the reverse procedure of output denormalization was necessary: $y_i = \sigma_{\mathrm{appr.norm},i} \xrightarrow{\text{postprocessing}} y'_i = \sigma_{\mathrm{appr},i}$.
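A minimal sketch of this round trip with mapminmax, assuming X and T are the assembled input and target matrices (names illustrative):

```matlab
% mapminmax defaults to the target range [-1, 1].
[Xn, psX] = mapminmax(X);             % normalize inputs, store the settings psX
[Tn, psT] = mapminmax(T);             % normalize stress targets, store psT
% ... network training on (Xn, Tn) yields normalized outputs Yn ...
Y = mapminmax('reverse', Yn, psT);    % denormalize the outputs back to MPa
```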

3.2. Assumed Artificial Network Architecture

The authors assumed the general network structure type and activation function types according to what is recommended for nonlinear function approximation in the literature [87] and also to what had been proved to work well in a previously investigated case of open-cell metals [49,50].
Figure 4 shows a detailed scheme of the network architecture, which is explained below. The index $i = 1, 2, \ldots, n$, which indicates the numbering of the given input data and the respective target, is omitted in the discussion below and in Expressions (4)–(13) for simplification. This does not affect the logic of the reasoning, since the networks use all inputs and targets for training and verification, so all data (all i-s) are used, and each is used only once.
The neural network architecture was chosen as a feedforward network with two layers: one hidden layer, labelled in the research with {1}, and one output layer, labelled with {2}. Argument A, after normalization, entered the hidden layer {1} as input X. The number of neurons in the hidden layer was assumed to vary within the range $s^{\{1\}} \in \langle 1; 50 \rangle$. The function tansig, a hyperbolic tangent sigmoid (mathematically equivalent to tanh [84]), was chosen as the activation function for the hidden layer. It was denoted as $f^{\{1\}}_{\mathrm{activ}}$ and expressed as in Formula (4):
$$f^{\{1\}}_{\mathrm{activ}}(arg^{\{1\}}) = \frac{e^{2 \cdot arg^{\{1\}}} - 1}{e^{2 \cdot arg^{\{1\}}} + 1} = \tanh(arg^{\{1\}}),$$
where the argument of the transfer function in the hidden layer was defined as:
$$arg^{\{1\}} = W^{\{1\}} \cdot X + B^{\{1\}}.$$
Symbols in Formula (5) denoted the following magnitudes:
  • $X$—the input vector, mathematically formulated as in Equation (6) below;
  • $B^{\{1\}}$—the column vector of biases for layer {1}, mathematically formulated as in Equation (7) below;
  • $W^{\{1\}}$—the matrix of weights of inputs for layer {1}, mathematically formulated as in Equation (8) below:

$$X = [x_1, x_2]^T,$$

$$B^{\{1\}} = [b_1^{\{1\}}, b_2^{\{1\}}, \ldots, b_{p-1}^{\{1\}}, b_p^{\{1\}}, b_{p+1}^{\{1\}}, \ldots, b_{s^{\{1\}}}^{\{1\}}]^T,$$

$$W^{\{1\}} = \begin{bmatrix} w_{1,1}^{\{1\}} & w_{1,2}^{\{1\}} \\ w_{2,1}^{\{1\}} & w_{2,2}^{\{1\}} \\ \vdots & \vdots \\ w_{p-1,1}^{\{1\}} & w_{p-1,2}^{\{1\}} \\ w_{p,1}^{\{1\}} & w_{p,2}^{\{1\}} \\ w_{p+1,1}^{\{1\}} & w_{p+1,2}^{\{1\}} \\ \vdots & \vdots \\ w_{s^{\{1\}},1}^{\{1\}} & w_{s^{\{1\}},2}^{\{1\}} \end{bmatrix}.$$
Computations in the hidden layer {1} led to the column vector of outputs, $Y^{\{1\}}$, of the hidden layer {1}. This vector had the form shown in Formula (9):
$$Y^{\{1\}} = [y_1^{\{1\}}, y_2^{\{1\}}, \ldots, y_{p-1}^{\{1\}}, y_p^{\{1\}}, y_{p+1}^{\{1\}}, \ldots, y_{s^{\{1\}}}^{\{1\}}]^T.$$
Vector $Y^{\{1\}}$ then entered the output layer {2}. The number of neurons in layer {2} was fixed at $s^{\{2\}} = 1$, taking into account the single-variable output [88]. The activation function for the output layer, $f^{\{2\}}_{\mathrm{activ}}$, was chosen as purelin [84] and expressed as in Formula (10):
$$f^{\{2\}}_{\mathrm{activ}}(arg^{\{2\}}) = a \cdot arg^{\{2\}},$$
where $a$ was a directional coefficient assumed as constant, $a = 1$, and the argument of the transfer function in the output layer was defined by Formula (11):
$$arg^{\{2\}} = W^{\{2\}} \cdot Y^{\{1\}} + b_1^{\{2\}}.$$
Symbols in the above expression denote the following magnitudes:
  • $Y^{\{1\}}$—the hidden layer outputs, as in Formula (9);
  • $b_1^{\{2\}}$—the bias for the output layer, a scalar value;
  • $W^{\{2\}}$—the row vector of weights of inputs for layer {2}, mathematically formulated as in Equation (12) below:

$$W^{\{2\}} = [w_{1,1}^{\{2\}}, w_{1,2}^{\{2\}}, \ldots, w_{1,j-1}^{\{2\}}, w_{1,j}^{\{2\}}, w_{1,j+1}^{\{2\}}, \ldots, w_{1,s^{\{1\}}}^{\{2\}}].$$
The final result of network training, y, was the output of layer {2}, $y^{\{2\}}$, as in Formula (13) [89]:
$$y = y^{\{2\}} = f^{\{2\}}_{\mathrm{activ}}\left(W^{\{2\}} \cdot f^{\{1\}}_{\mathrm{activ}}\left(W^{\{1\}} \cdot X + B^{\{1\}}\right) + b_1^{\{2\}}\right).$$
In the last stage, the outputs underwent denormalization so as to express approximated stress.
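As an illustration, Formula (13) can be evaluated directly with the toolbox activation functions. The sketch below assumes trained parameters W1 (of size s{1} × 2), B1 (s{1} × 1), W2 (1 × s{1}) and the scalar b2, together with a normalized input Xn; the names are illustrative.

```matlab
% One forward pass of the assumed two-layer feedforward network.
Y1 = tansig(W1 * Xn + B1);     % hidden layer {1}: Formulas (4) and (5)
y  = purelin(W2 * Y1 + b2);    % output layer {2}: Formulas (10) and (11)
% y is then denormalized to express the approximated stress.
```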

3.3. Choice of Learning Parameters

The selection of learning parameters, such as the activation functions, training algorithm, performance function and its goal, learning rate, momentum and others, should correspond with the specific data assigned to the learning process and the phenomenon they represent [90]. The choices determined for this study, and their justification, are presented below. The numerical values of the assumed learning parameters are summarized in Table 2.
As for the activation functions, a hyperbolic tangent sigmoid function (Formula (4)) was implemented in the hidden layer {1}. According to [87], tansig is recommended for addressing nonlinear problems, and the compression of closed-cell aluminium is a nonlinear phenomenon. Additionally, the hyperbolic tangent sigmoid function was successfully verified in preliminary calculations and in previous studies on the modelling of open-cell aluminium [49,50]. The activation function for the output layer {2} was the linear activation function purelin (Formula (10)).
Regarding the training algorithm, the Levenberg–Marquardt procedure was selected [91]. For this procedure the mean square error, as defined in Formula (14), was chosen as the performance function:
$$MSE = \frac{\sum_{i=1}^{n}(t_i - o_i)^2}{n} = \frac{\sum_{i=1}^{n} e_i^2}{n},$$
where:
  • $t_i$—the $i$-th target for the network;
  • $o_i$—the $i$-th output of the network;
  • $i$—the individual data index;
  • $n$—the number of all data.
The error was defined as in Expression (15):
$$e_i = t_i - o_i.$$
The performance function's goal was set as 0, and the minimum performance gradient was assumed as 10⁻¹⁰. Based on the former application of neural network modelling to the compressive behaviour of cellular metals [49,50], the number of epochs to train was set as 100,000.
The learning rate and momentum were assumed as the result of a specially designed procedure. The procedure consisted of examining a number of networks, assigning them various values of these two learning parameters and comparing the obtained values of the performance function (MSE) in each case. Based on the robustness analysis for the related phenomenon of compression in open-cell aluminium [49,50], it was decided that the examined networks would have a complexity of 12 neurons in the hidden layer. Learning rate values were taken from the range $\langle 0.05; 1.00 \rangle$ with the step 0.05. Momentum values were taken from the range $\langle 0.1; 3.0 \rangle$ with the step 0.1. Results are shown in Figure 5; the chosen values were 2.0 for momentum and 0.05 for the learning rate, and they occurred for the minimum $MSE_{\min} = 0.023$ MPa². Additional remarks about the presented calibration can be found in Appendix C, Figure A3.
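The sketch below shows how such a network could be configured with the stock Deep Learning Toolbox API under the parameters of Table 2. Note that in stock MATLAB the learning rate (lr) and momentum constant (mc) fields govern the gradient-descent training functions, while trainlm is steered by its mu parameters; how the two calibrated values enter the training is therefore stated here only as an assumption.

```matlab
% Two-layer feedforward network with Levenberg-Marquardt training.
s1  = 12;                               % hidden layer size used in the calibration
net = feedforwardnet(s1, 'trainlm');    % one hidden + one output layer
net.layers{1}.transferFcn = 'tansig';   % hidden activation, Formula (4)
net.layers{2}.transferFcn = 'purelin';  % output activation, Formula (10)
net.performFcn            = 'mse';      % performance function, Formula (14)
net.trainParam.goal       = 0;          % performance function's goal
net.trainParam.min_grad   = 1e-10;      % minimum performance gradient
net.trainParam.epochs     = 100000;     % maximum number of training epochs
```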

3.4. Algorithm for Building and Training Networks

In order to generate a considerable number of comparable networks modelling the aluminium foam compression, the algorithm shown in Figure 6 was implemented. The algorithm consisted of two procedures: P1 (parent) and P2 (nested). The main aim of P1 was to provide varying unit network architecture parameters by attributing a given number of neurons $s^{\{1\}}$ to the hidden layers of the NNs. The aim of P2 was to build, train, validate and test the given network, structured according to the parameters from P1, and also to compute the measures used in the further criterial network evaluation. Please note that, in accordance with the general research concept and the main assumption explained at the beginning of Section 3, the data used in P2 were the subset for 11 out of 12 specimens. In conclusion, the algorithm (P1 + P2) served to build, train and test individual models; the first-step collective evaluation, however, was performed later (Section 3.5).
The range of $s^{\{1\}}$ in P1 was assumed as $s^{\{1\}} \in \langle 1; 50 \rangle$. Such a range was selected due to the specificity of the data for the NNs. Additionally, previous studies regarding open-cell metals [49,50] have shown that such an interval allows for additional conclusions about robustness and overfitting [82,87].
In the first iteration of a network's learning process, the initial values of weights and biases in the first layer are assigned randomly. This means that networks with the same architecture specifications almost certainly lead to different solutions. Taking this fact into account, the designed algorithm not only built networks varying in hidden layer size but also, for each given $s^{\{1\}}$, created, trained, validated and tested 10 networks (procedure P2). These calculations were labelled as approaches and numbered consecutively from 1 to 10. Such repetitions increased the probability of obtaining the global minimum of the performance function [84,89]. An additional advantage of multiple approaches is that they enable the discussion of robustness.
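A minimal sketch of the P1/P2 nesting follows; configureNet and evaluateNet are hypothetical helpers standing for the setup of Section 3.3 and the measures of Section 3.5, and Xn, Tn are the normalized 11-specimen data.

```matlab
results = cell(50, 10);                  % container for the evaluation measures
for s1 = 1:50                            % P1 (parent): vary the architecture
    for approach = 1:10                  % P2 (nested): repeat with random init
        net = configureNet(s1);          % hypothetical helper: Section 3.3 setup
        net = init(net);                 % fresh random weights and biases
        [net, tr] = train(net, Xn, Tn);  % train/validate/test, fixed data split
        results{s1, approach} = evaluateNet(net, tr, Xn, Tn);  % hypothetical
    end
end
```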

3.5. Evaluation Criteria

At this point it should be noted that the choice of learning parameters (mostly the selection of the performance function and its goal, Section 3.3) already imposed certain aspects of the evaluation approach. This is inevitable and ‘internally’ connected to building networks and assuming their structure and learning mode.
As for the evaluation of network results 'from the outside', multiple approaches may be assumed. The two most obvious paths are: one may expect the network either to provide the most accurate outputs or to provide results in a short time, which favours a simpler model. Additionally, the repeatability of results may play a role. In general, then, balancing between, or combining, different evaluation strategies is what designers choose most often, and the same was implemented in the present research. A formal description of the mentioned evaluation approaches and mathematical expressions for the respective criteria are given in Section 3.5.1, Section 3.5.2, Section 3.5.3 and Section 3.5.4.
In the present research, the complexity of the structure of the neural network consists in the number of neurons in the hidden layer {1}. The parameter which characterizes this complexity is $s^{\{1\}}$. As explained in Section 3.2, this parameter determines the sizes of the matrix $W^{\{1\}}$ and the vectors $B^{\{1\}}$ and $W^{\{2\}}$. One can say, then, that the number of neurons in the hidden layer characterizes the modelled phenomenon together with the data assigned to the learning process. The aim of the evaluation in the present study is thus to choose an $s^{\{1\}}$ which provides the most appropriate model.

3.5.1. The Idea of a Two-Step Evaluation

This study was designed to conduct a two-step evaluation (compare: the key assumption described in the beginning of Section 3), which will now be explained thoroughly.
The networks built and trained in the algorithmic computations were particular models of the phenomenon of the compression of 11 physical objects composed of closed-cell aluminium. Part of the experimental data for these specimens was devoted to learning: 3/5 of data to the training stage and 1/5 of data to the validation stage. The remaining 1/5 of data were devoted to testing the model against unknown information, which still concerned the 11 specimens. Results from the test stage were the subject of the first-step evaluation. This evaluation allowed for a collective view of all particular models and the choice of the most appropriate model of the compression of the 11 specimens.
The following step consisted of exposing the trained networks obtained from the algorithm to data for another physical object—the 12th sample. Results of this mapping were subject to the second-step evaluation. This time the performed evaluation allowed for the assessment of whether the networks could be used not only as particular models but also as general models capable of prognosis. If the answer was positive, the second-step evaluation also allowed for choosing the best general model.
Such a design of the evaluation stages reflects the bias of the data that we intentionally wanted to introduce; by the assumption of the relation type expressed in Formula (2), it was assumed that apparent density affected the response of aluminium foam subjected to compression. This was the basis for holding one specimen out, so that the prediction potential of the given model was verified with respect to a new apparent density value from outside the values that the model would 'know'.

3.5.2. Accuracy of Outputs, Overfitting

Accuracy and overfitting are two sides of the same coin, and the criteria to assess them may be formulated analogously. In the approach used, accuracy would be assessed in the first-step evaluation and overfitting in the second-step evaluation.
In the first case, the mean absolute relative error calculated for the network testing stage in the given approach was chosen as the measure used for the formulation of the assessment criterion [92]. The criterion reads: the minimal value of all mean absolute relative errors obtained for all architectures and all approaches from the test stage indicates the 'best' network. In other words, it indicates the particular model with $s^{\{1\}}_{\mathrm{best}}$ neurons in the hidden layer, trained in the particular $approach_{\mathrm{best}}$, for which the condition is fulfilled. The main criterion formulated in this way is symbolically expressed as in Formula (16):
$$Crit1^1 = \min\left\{(MARE_{\mathrm{Test}})_{s^{\{1\}}, approach}\right\},$$
where:
  • $Crit1^1$—the value of the measure assumed for Criterion 1 used in the first-step evaluation;
  • $s^{\{1\}}$—the given number of neurons in the hidden layer;
  • $approach$—the given repetition of the network learning for the given network architecture;
  • $(MARE_{\mathrm{Test}})_{s^{\{1\}}, approach}$—the mean absolute relative error obtained for the testing stage, according to Formula (17):

$$(MARE_{\mathrm{Test}})_{s^{\{1\}}, approach} = \mathrm{mean}\left\{\left(\left|\frac{t_i^{\mathrm{Test}} - o_i^{\mathrm{Test}}}{t_i^{\mathrm{Test}}}\right|\right)_{s^{\{1\}}, approach}\right\},$$
where:
  • $t_i^{\mathrm{Test}}$—the $i$-th target for the network in the testing stage;
  • $o_i^{\mathrm{Test}}$—the $i$-th output of the network in the testing stage;
  • $i$—the individual data index, which should exhaust all data.
The above criterion for the best accuracy should be complemented by an additional criterion to prevent the choice of an overfitted model, that is, a situation in which the chosen best network memorized the data instead of working out the relations hidden in the data. Such a network would not be capable of prognosis, so it could not be used as a general model. For this reason, the second-step evaluation was proposed, in which the results of the network's verification against data previously unknown to it were analyzed. Again, the mean absolute relative error was chosen as the criterion measure. The criterion takes the form symbolically written in Formula (18):
$$Crit1^2 = (MARE_{\mathrm{Verif}})_{s^{\{1\}}_{\mathrm{best}}, approach_{\mathrm{best}}} \le Crit1^{2.\mathrm{threshold}},$$
where:
  • $Crit1^2$—the value of the measure assumed for Criterion 1 used in the second-step evaluation;
  • $(MARE_{\mathrm{Verif}})_{s^{\{1\}}_{\mathrm{best}}, approach_{\mathrm{best}}}$—the mean absolute relative error from the verification of the network with the given $s^{\{1\}}_{\mathrm{best}}$, taught in the given $approach_{\mathrm{best}}$, against external data;
  • $Crit1^{2.\mathrm{threshold}}$—the threshold for Criterion 1 used in the second-step evaluation.
In cases where accuracy is particularly important, one may demand that:

$$Crit1^{2.\mathrm{threshold}} \le Crit1^1.$$
In the event of considerable overfitting, it would not be possible to fulfil Expression (19). One would then iteratively verify the networks corresponding to the next consecutive local minima among the set $\{(MARE_{\mathrm{Test}})_{s^{\{1\}}, approach}\}$ until Condition (19) is met.
There might be more detailed demands imposed on the outputs, e.g., that the outputs are equally credible over the whole mapping range or that none of the absolute relative errors exceed a certain value. In such cases one might choose other or additional measures as auxiliaries in the evaluation criteria. Such a measure could be, for example, the maximum absolute relative error obtained for the network with the given number of neurons in the hidden layer and in the given approach, $(MaxARE)_{s^{\{1\}}, approach}$, as in Formula (20):
$$(MaxARE)_{s^{\{1\}}, approach} = \max\left\{\left(\left|\frac{t_i - o_i}{t_i}\right|\right)_{s^{\{1\}}, approach}\right\},$$
where all symbols on the right side of the equation are defined as in Relations (14) and (17).
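For reference, all the scalar measures used in this study can be computed in a few lines; the sketch assumes t and o are row vectors of targets and denormalized outputs for one stage (test, validation or verification).

```matlab
% Evaluation measures assumed in the research.
e      = t - o;              % errors, Formula (15)
MSE    = mean(e.^2);         % performance function, Formula (14)
SAE    = sum(abs(e));        % sum of absolute errors, Formula (24)
MARE   = mean(abs(e ./ t));  % mean absolute relative error, Formula (17)
MaxARE = max(abs(e ./ t));   % maximum absolute relative error, Formula (20)
```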

3.5.3. Speed of Calculations

When balancing model complexity and computation time is needed, one should note that there are three main circumstances that affect the speed of network operation: the computation capacity of the computing machine, the precision of significant numbers (also depending on the data format) and the complexity of calculations resulting from the size of the network structure. So, from the viewpoint of network design, the crucial factor is the minimum network complexity that assures satisfactory outputs. The criterion for finding the minimum sufficient number of neurons $s^{\{1\}}_{\min}$ (and the assigned $approach_{\min}$) might be to demand that the assumed measure, let it be the mean absolute relative error from the test stage, does not exceed a certain level. Mathematically, in the first-step evaluation, the criterion reads as in Expression (21) below:
$$Crit2^1 = (MARE_{\mathrm{Test}})_{s^{\{1\}}, approach} \le Crit2^{1.\mathrm{threshold}},$$
where:
  • $Crit2^1$—the value of the measure assumed for Criterion 2 used in the first-step evaluation;
  • $Crit2^{1.\mathrm{threshold}}$—the threshold for Criterion 2 used in the first-step evaluation;
  • $s^{\{1\}}$, $approach$ and $(MARE_{\mathrm{Test}})_{s^{\{1\}}, approach}$—defined as in Formulas (16) and (17).
For the second-step evaluation this could be formulated in Condition (22) below:
$$Crit2^2 = (MARE_{\mathrm{Verif}})_{s^{\{1\}}_{\min}, approach_{\min}} \le Crit2^{2.\mathrm{threshold}},$$
where:
  • $Crit2^2$—the value of the measure assumed for Criterion 2 used in the second-step evaluation;
  • $(MARE_{\mathrm{Verif}})_{s^{\{1\}}_{\min}, approach_{\min}}$—the mean absolute relative error from the verification of the network with the given $s^{\{1\}}_{\min}$, taught in the given $approach_{\min}$, against external data;
  • $Crit2^{2.\mathrm{threshold}}$—the threshold for Criterion 2 used in the second-step evaluation.
In case Condition (22) is not satisfied, one should verify networks with an increased number of neurons in the hidden layer, or agree to lower the accuracy demand by increasing the threshold until Expression (22) is met.
The assumption of the mean absolute relative error as the criterion measure is not the only possibility. In case the designer or user is more interested in obtaining results which have the same reliability over the whole interval, or in having no individual errors (not only the mean) exceed the assumed threshold, they could select another measure, such as the maximum absolute relative error of Definition (20).

3.5.4. Robustness

Balancing accuracy and computation speed does not exhaust the network evaluation problem. The obtained networks should also be verified with regard to robustness, which can be understood as insensitivity to the randomness of the initial bias and weight assumptions. One searches for a number of neurons in the hidden layer for which, regardless of the random parameters, repeatability of the results is ensured [93]. The analysis of robustness may indicate more than one specific network complexity, $s^{\{1\}}_{\mathrm{robust}} = \{s^{\{1\}}_{\mathrm{robust},j}\}$, complying with the demand of repeatability.
In the present research, the sum of absolute errors from validation is assumed as the measure for the robustness criterion [94]. The criterion is expressed in Formula (23) below:
$$Crit3 = \left|(SAE_{\mathrm{Valid}})_{s^{\{1\}}, approach} - \frac{1}{k}\sum_{approach=1}^{k}(SAE_{\mathrm{Valid}})_{s^{\{1\}}, approach}\right| \le Crit3^{\mathrm{threshold}}, \quad \text{for all } approach \in \langle 1; k \rangle,$$
where:
  • $Crit3$—the value of the measure assumed for Criterion 3;
  • $k$—the total number of approaches for the given $s^{\{1\}}$;
  • $Crit3^{\mathrm{threshold}}$—the threshold for Criterion 3;
  • $(SAE_{\mathrm{Valid}})_{s^{\{1\}}, approach}$—as in Formula (24):

$$SAE = \sum_{i=1}^{n} |t_i - o_i| = \sum_{i=1}^{n} |e_i|,$$

where the remaining symbols are denoted as in (14) and (15).
The strict demand would be to assume a threshold near or equal to zero:

$$Crit3^{\mathrm{threshold}} \approx 0.$$
However, a demand as strong as (25) is not necessary in all cases. Moreover, there are instances for which a lighter condition is justified: Condition (23) with a bounded threshold, and not necessarily (25). This includes cases in which 'only' a considerable majority of approaches for the given $s^{\{1\}}_{\mathrm{robust},j}$ meet (23) while a negligible number of networks do not; in some cases this is also an acceptable solution. Let it be expressed as Alternative Criterion 3 in Formula (26) below:
$$AltCrit3 = \left|(SAE_{\mathrm{Valid}})_{s^{\{1\}}, approach} - \frac{1}{k}\sum_{approach=1}^{k}(SAE_{\mathrm{Valid}})_{s^{\{1\}}, approach}\right| \le AltCrit3^{\mathrm{threshold}}, \quad \text{for } M \text{ of all } approach \in \langle 1; k \rangle,$$
where:
  • $AltCrit3$—the value of the measure assumed for Alternative Criterion 3;
  • $AltCrit3^{\mathrm{threshold}}$—the threshold for Alternative Criterion 3, which also need not necessarily be assumed as 0;
  • $M$—the number or percentage of total approaches that must comply with Condition (26);
  • other symbols—as defined in (23).
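A short sketch of how Criterion 3 and Alternative Criterion 3 can be checked numerically, assuming SAEValid is a 50 × 10 matrix of validation-stage SAE values (rows: hidden layer sizes, columns: approaches); the threshold value is purely illustrative.

```matlab
% Deviation of each approach's SAE from the average over all approaches.
dev      = abs(SAEValid - mean(SAEValid, 2));
thr      = 0.5;                      % assumed (illustrative) threshold
robust   = all(dev <= thr, 2);       % Criterion 3: every approach complies
M        = 8;                        % Alternative Criterion 3 with M = 8 of 10
robustM8 = sum(dev <= thr, 2) >= M;  % at least M approaches comply
```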

4. Results and Discussion

As was said in Section 3.1.2 and Section 3.5.1, the experimental data from the compression of 12 specimens were divided into two sets: the data of 11 specimens were devoted to building particular models and their first-step evaluation, and the data of one sample were left aside for external verification in the second-step evaluation. Sample Z_14_p was selected for the external verification. In consequence, the results are presented as follows:
  • Section 4.1 will give results from the validation stage from the training of networks (11 sample data set).
  • Section 4.2 will be devoted to choosing the most adequate network according to criteria of the first- and second-step evaluation and thus will show results from the test stage of teaching networks (11 sample data set) as well as from the verification of networks against external data (specimen Z_14_p).
  • Section 4.3 will present detailed results for the final chosen networks.
Please note that the colour and notation convention is common to all figures presented in Section 4 and Appendix D. The convention is explained in detail in the description of the mean square error results in Section 4.1 and is applied analogously later on.

4.1. Internal Network Evaluation and Robustness

It was assumed that the performance function for the analyzed networks was the mean square error MSE, Definition (14). The goal for this function was set as 0. Figure 7 below presents the results obtained for the performance function at the validation stage for all networks. Individual mean square errors are depicted as hollow blue dots. Additionally, solid orange dots denoted av_MSE refer to the arithmetic average of the MSEs obtained for all approaches for the given number of neurons in the hidden layer $s^{\{1\}}$. A trend line for the magnitude av_MSE is also shown (dashed orange line).
The general conclusion drawn from these results might be that the modelling of the compression of closed-cell aluminium with networks of increasing hidden layer complexity is not a chaotic but an ordered phenomenon. The relation between complexity and convergence, understood as nearing the achievement of the performance function's goal, can be very well described by a power law (the coefficient of determination for such a relation was $R^2 = 0.9852$).
The problem of robustness will now be discussed. Looking at Figure 8 and taking into account Criterion 3 (23) and Demand (25), one can observe that the difference between the $SAE_{\mathrm{Valid}}$ values from individual approaches and the average over all approaches for the given $s^{\{1\}}$ tends to zero, for all instances of the approach, for $s^{\{1\}}_{\mathrm{robust}} \ge 18$. We can also slightly relax the robustness condition and utilize Alternative Criterion 3 (26) with a threshold near 0, assuming $M9 = 9$ of 10 approaches or $M8 = 8$ of 10 approaches. In these cases, we read from Figure 8 that $s^{\{1\}}_{\mathrm{robust.M9}} \ge 14$ and $s^{\{1\}}_{\mathrm{robust.M8}} \ge 11$.
The results are in agreement with the pre-assumption made at the stage of the choice of learning parameters. At that phase, the influence of the learning rate and momentum on the training process was evaluated for networks with $s^{\{1\}} = 12$ neurons in the hidden layer. Such a choice fits Alternative Criterion 3 (26) with the parameter $M8 = 8$.

4.2. Choice of the Most Appropriate Network Specifications

In general, we are looking for the most appropriate number of neurons in the hidden layer that would guarantee the desired output quality with regard to the assumed criteria. The total interval of $s^{\{1\}}$ in the studied algorithm was $s^{\{1\}} \in \langle 1; 50 \rangle$. In Section 4.1 it was already stated that this interval should be narrowed to an $s^{\{1\}}$ of no less than 18, 14 or 11 neurons, depending on the desired level of repeatability of the results. These boundaries should be taken into account in the evaluation of the models. Both the first- and second-step evaluations are referred to in Figure 9 and Figure 10 below. Figure 9 shows the mean absolute relative error from the test stage of network training, $MARE_{\mathrm{Test}}$, with respect to the size of the hidden layer. Figure 10 gives the mean absolute relative error from the verification of the trained networks (particular models) against external data, $MARE_{\mathrm{Verif}}$, with respect to the size of the hidden layer. Please note that the vertical axis in Figure 10 was scaled. Due to this graphic processing, some of the results could not fit in the plots. This was performed in order to present the results clearly and legibly in the range of hidden layer sizes important for the discussion. The omission of some of the results in these plots did not affect the reasoning or conclusions. In Appendix D we include the respective graph (Figure A4), which gives all obtained results without vertical axis scaling.

4.2.1. Most Accurate Outputs, Overfitting

In the first-step evaluation, according to Criterion (16) for accuracy, the minimum value of $MARE_{\mathrm{Test}}$ and the network structure for which it occurred (identified by $s^{\{1\}}_{\mathrm{best}}, approach_{\mathrm{best}}$) are sought. Table 3 presents the results of this search. The complexity of the 'best' network was 48 neurons in the hidden layer, which is far beyond the boundaries set in the analysis of robustness. The application of the second-step evaluation shows that for this network overfitting was at an unsatisfactory level (the last column of Table 3), and though the particular model was the best in terms of accuracy, it cannot be used for prognosis. One could iteratively search for consecutive minima of $Crit1^1$, but judging from the lack of diversity in the results up to the limit of robustness, this path would be too inefficient to follow. Instead, we proceed to the alternative approach described in Section 3.5.2.

4.2.2. Outputs in Terms of Increasing Speed of Calculations

In this analysis, Criteria (21) and (22) were used with the mean absolute relative errors $MARE_{\mathrm{Test}}$ and $MARE_{\mathrm{Verif}}$ chosen as the measures in the first- and second-step evaluation, respectively. It was decided that several threshold values would be assumed in Criterion (21) in the first-step evaluation, so multiple indications of $s^{\{1\}}_{\min}$ and the respective $approach_{\min}$ were obtained. Then Criterion (22) was applied in the second-step evaluation to the indicated models. The results of this evaluation are summarized in Table 4.
Looking at Figure 9, one notices that all particular models with $s^{\{1\}} \ge 5$ (except one) produced a mean absolute relative error at a good engineering accuracy level: $MARE_{\mathrm{Test}} \le 5\%$. The 5% threshold was first reached by a particular model with four neurons in the hidden layer. The simplest network, $[s^{\{1\}}_{\min}, approach_{\min}] = [4, 2]$, provides a relatively good particular model but gives an almost two-times-greater mean absolute relative error when it comes to prognosis. Distinctively good results were obtained for the network $[s^{\{1\}}_{\min}, approach_{\min}] = [6, 6]$; the errors obtained in the verification of the prognosis capability of the model were even better than in the particular model itself. It should be noted that all the remaining networks listed in Table 4 exhibited $MARE_{\mathrm{Verif}} \le 5\%$, which is a good engineering accuracy level for prognosis capability. Lastly, none of the analyzed particular models achieved a better accuracy in prognosis than $MARE_{\mathrm{Verif.min}} = 1.661\%$ (the network $[s^{\{1\}}_{\min}, approach_{\min}] = [14, 5]$). This could mean that it is extremely difficult to obtain a model capable of more accurate prognosis without changing the structure or learning parameter assumptions and that at least such a value of error is inevitable. One more observation should be noted: Figure 10 shows that for networks with 13 neurons, instances of approaches with considerable overfitting already start to occur.
The networks with four and six neurons in the hidden layer, distinguished in the previous paragraph, do not fall into the intervals which assure robustness ($s^{\{1\}}_{\mathrm{robust}} \ge 18$, $s^{\{1\}}_{\mathrm{robust.M9}} \ge 14$, $s^{\{1\}}_{\mathrm{robust.M8}} \ge 11$). This condition is fulfilled by the model $[s^{\{1\}}_{\min}, approach_{\min}] = [11, 4]$, for which $MARE_{\mathrm{Test}} \approx 2\%$ and $MARE_{\mathrm{Verif}} \approx 3\%$. These two values confirm that the particular model here is very good in terms of accuracy, and it also has the ability to provide high-quality prognoses. The probability of obtaining a model of similar quality when repeating network training is sufficient ($s^{\{1\}}_{\mathrm{robust.M8}} \ge 11$).
In Section 4.1, the analysis of robustness was presented. Quite strict demands, including $Crit3^{\mathrm{threshold}} \approx 0$ and $AltCrit3^{\mathrm{threshold}} \approx 0$ in Conditions (23) and (26), respectively, were assumed. Nevertheless, one does not have to be that rigorous. Let us now complement the analysis of repeatability for the stages of the first- and second-step evaluation. This requires the introduction of another measure, av_MARE, the arithmetic average of the MAREs obtained for all approaches for the given number of neurons in the hidden layer $s^{\{1\}}$:
$$av\_MARE = \frac{1}{k}\sum_{approach=1}^{k}\left(MARE_{s^{\{1\}}, approach}\right),$$
where:
  • $s^{\{1\}}$—the given fixed number of neurons in the hidden layer;
  • $k$—the total number of approaches for the given $s^{\{1\}}$; here $k = 10$.
In the first-step evaluation, the measure defined in Formula (27) is the average mean absolute relative error for the test stage in the training of the given networks ($av\_MARE_{\mathrm{Test}}$), and in the second-step evaluation it is the average mean absolute relative error from the verification of the given networks against external data ($av\_MARE_{\mathrm{Verif}}$).
By applying Measure (27) in Criterion (21), it was possible to indicate the numbers of neurons in the hidden layer, $s^{\{1\}}_{\mathrm{min.av.M.T}}$, for which the new measure complied with the assumed thresholds. The results are summarized in the first three columns of Table 5. They show that, if we regard the average results of all approaches for a given $s^{\{1\}}$, $av\_MARE_{\mathrm{Test}} \le 5\%$ is fulfilled already for five neurons in the hidden layer. We also found confirmation that networks with at least 11 neurons in the hidden layer have very good accuracy on average: $av\_MARE_{\mathrm{Test}} \le 2.5\%$.
The average mean absolute relative error was also substituted in Criterion (22) in the second-step evaluation. This time, however, the criterion thresholds had to be elevated. They are listed in Table 5 together with the corresponding minimum numbers of neurons in the hidden layer, $s^{\{1\}}_{\mathrm{min.av.M.V}}$, and the obtained values of the criterion measure in columns 4–6. The first thing that catches attention is that none of the networks achieved $av\_MARE_{\mathrm{Verif}} \le 4\%$, and only the result for $s^{\{1\}}_{\mathrm{min.av.M.V}} = 11$ is slightly above this limit. It turns out that the criterion measure value obtained for networks with 11 neurons in the hidden layer was the global minimum for $s^{\{1\}} \in \langle 1; 50 \rangle$. This once again shows that such model complexity produces good-quality outputs with relatively low errors both in the particular model and in prognosis. Networks of lower complexity do not satisfy the engineering precision threshold of 5%.

4.3. Results for Optimal Networks

The considerations presented in Section 4.1 and Section 4.2 allowed for choosing particular networks to be shown, finally, as examples of how the structure and learning parameters of neural network models influence the quality of the description of the phenomenon of the compression of aluminium foams. We selected the following networks:
  • The network $[s^{\{1\}}_{\min}, approach_{\min}] = [4, 2]$ is the least complex structure but still provides acceptable accuracy in itself, $(MARE_{\mathrm{Test}})_{4,2} = 4.455\%$, and for prognosis, $(MARE_{\mathrm{Verif}})_{4,2} = 8.688\%$; however, four neurons do not guarantee robustness.
  • The network $[s^{\{1\}}_{\min}, approach_{\min}] = [6, 6]$ is still a relatively simple structure but assures good accuracy in itself, $(MARE_{\mathrm{Test}})_{6,6} = 3.572\%$, and for prognosis, $(MARE_{\mathrm{Verif}})_{6,6} = 2.689\%$; but six neurons do not guarantee robustness.
  • The network $[s^{\{1\}}_{\min}, approach_{\min}] = [11, 4]$ is a relatively complex structure; however, it shows very good accuracy on many levels, including $(MARE_{\mathrm{Test}})_{11,4} = 1.959\%$ and, for prognosis, $(MARE_{\mathrm{Verif}})_{11,4} = 2.976\%$; also, 11 neurons are within the boundary of 80% robustness.
  • The network $[s^{\{1\}}_{\mathrm{best}}, approach_{\mathrm{best}}] = [48, 10]$ is a very complex structure, showing extremely good particular accuracy, $(MARE_{\mathrm{Test}})_{48,10} = 0.507\%$, and very adverse overfitting in prognosis, $(MARE_{\mathrm{Verif}})_{48,10} = 35.049\%$; 48 neurons are very safe in terms of robustness.
The course of the performance function for the above networks is presented in Figure 11. The minimum of the mean square error (MSE) and the epoch at which it was attained are given at the top of each plot and in Table 6.
It can be observed that the structure with 11 neurons needs about twice as many training epochs as the structure with four neurons, and the very complex structure with 48 neurons needs about five times as many as the simplest network. On the other hand, the minimum of the mean square error is about five times smaller for the network [11, 4] than for [4, 2], and a hundred times smaller for the most complex structure with respect to the simplest one. Taking these results into consideration, one notices that 11 neurons in the hidden layer still provide satisfactory performance at a relatively low cost of calculation time.
Figure 12 below presents regression plots for the joined stages of network training (training + validation + test). Values of the Pearson coefficient are shown at the top of each graph, and the equations of the linear regression lines are given on the left side of each graph. These plots are supplemented by the regression of each training stage separately; those graphs were moved to Appendix E (Figure A5, Figure A6, Figure A7 and Figure A8). One can observe that Pearson's coefficient increases with network complexity. However, all results satisfy R ≥ 0.99, which means that all particular models provide very good correlation between outputs and targets.
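The quantities reported in these regression plots can be recovered from raw outputs and targets as sketched below. This is an illustrative Python snippet with synthetic stand-in data, not the authors' MATLAB code.

```python
import numpy as np

def regression_report(targets, outputs):
    """Pearson correlation coefficient R between outputs and targets, and the
    least-squares line output = a * target + b shown in the regression plots."""
    t = np.asarray(targets, dtype=float)
    y = np.asarray(outputs, dtype=float)
    r = float(np.corrcoef(t, y)[0, 1])
    a, b = np.polyfit(t, y, deg=1)   # slope and intercept of the fitted line
    return r, float(a), float(b)

# Synthetic stand-in for one network's joined training-stage results.
rng = np.random.default_rng(1)
targets = rng.uniform(0.0, 25.0, size=500)           # target stresses, MPa
outputs = targets + rng.normal(0.0, 0.2, size=500)   # outputs with small error
r, a, b = regression_report(targets, outputs)
print(f"R = {r:.4f}; fit: output = {a:.3f} * target {b:+.3f}")
```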
Figure 13, Figure 14, Figure 15 and Figure 16 depict the four chosen particular models (red dots), errors (light blue) and targets (dark blue). These plots allow one to see how all individual outputs and targets relate. For the networks with 4, 6 and 11 neurons in the hidden layer we observed satisfactory quality, while the particular network with 48 neurons in the hidden layer maps the targets almost perfectly.
After the graphical presentation of the performance function, regression and accuracy of the particular models, the prognosis capability of the chosen networks is now presented. First, Figure 17 shows the regression for the verification against the external specimen. Values of the Pearson coefficient are included at the top of each graph, and the equations of the linear regression lines on the left side. One can observe that for the networks with 4, 6 and 11 neurons in the hidden layer the correlation between outputs and targets is at a very good level, R ≥ 0.997. On the other hand, the lack of correlation for the network with 48 neurons confirms considerable overfitting in this case (R = 0.5992).
Lastly, Figure 18, Figure 19, Figure 20 and Figure 21 present the detailed results of the prognosis of the four chosen models (red dots), errors (light blue) and targets (dark blue). In Figure 18 (network [4, 2]) the prognosis lies above the actual stress–strain plot. Figure 19 and Figure 20 show that for 6 and 11 neurons in the hidden layer large regions of the actual stress–strain plot overlap with the prognosed outputs. Figure 21 depicts how inaccurate the prognosis from the network with 48 neurons in the hidden layer is; such a result is a consequence of the considerable overfitting in this network.

5. Conclusions

The presented research aimed to verify the possibility of describing the phenomenon of the compression of closed-cell aluminium by means of neural networks. Additionally, it was expected that specifications for a good-quality model would be found.
The starting point was the assumption of the general relationship between strength measures and apparent density for cellular materials: σ = f(ε, ρ) (Equation (2)). Data from compression experiments were used to train neural networks of varying structure. The verification of the obtained models was a two-step procedure: 1. particular models built using an 11-specimen data set were verified against the training data, and 2. completely new data were introduced to the networks and the prognosis was verified. A series of criteria, (16)–(26), was proposed and used for the evaluation of accuracy, over- and underfitting and robustness. The study was performed in the following domains: strain ε ∈ ⟨0; 69%⟩ and apparent density ρ ∈ ⟨0.200; 0.297⟩ g/cm3; the specimen used for the second-step evaluation had an apparent density ρ = 0.236 g/cm3.
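To make the two-step procedure concrete, the sketch below shows one possible way to organize the data, with inputs (ε, ρ) and target σ per sample. The densities are those of Table 1, while the stress curves here are explicit placeholders, since the real targets come from the compression experiments; the helper name and array shapes are illustrative assumptions.

```python
import numpy as np

def specimen_samples(eps, rho, sigma):
    """Stack one specimen's curve into inputs X = [strain, density], targets T."""
    X = np.column_stack([eps, np.full_like(eps, rho)])
    return X, np.asarray(sigma, dtype=float)

def placeholder_sigma(eps, rho):
    """Stand-in stress curve in MPa; real targets come from the experiments."""
    return 30.0 * rho * (1.0 + eps)

eps = np.linspace(0.0, 0.69, 300)   # strain domain 0..69%
train_rhos = [0.297, 0.278, 0.285, 0.230, 0.214, 0.217,
              0.224, 0.245, 0.225, 0.233, 0.200]   # 11 training specimens (Table 1)
verif_rho = 0.236                                  # external specimen Z_14_p (step 2)

# Step 1 data: the curves of the 11 specimens joined into one training set.
X_train = np.vstack([specimen_samples(eps, r, placeholder_sigma(eps, r))[0]
                     for r in train_rhos])
# Step 2 data: the held-out specimen, never shown to the network during training.
X_verif, T_verif = specimen_samples(eps, verif_rho, placeholder_sigma(eps, verif_rho))
print(X_train.shape, X_verif.shape)   # (3300, 2) (300, 2)
```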
The obtained results support the hypothesis that neural networks are appropriate tools for building models of the phenomenon of the compression of aluminium foams. Additionally, the results enabled the identification of specifications for computations with artificial intelligence which allow one to build good-quality models. These two general conclusions are now described in detail to list the specific contributions of our research:
  • The following neural network architecture specifications can be successfully used to model the addressed phenomenon: a two-layer feedforward NN with one hidden layer and one output layer. As for the activation functions, one may use the hyperbolic tangent sigmoid function in the hidden layer and the linear activation function in the output layer. As for the training algorithm, the Levenberg–Marquardt procedure was verified positively. For this procedure, the mean square error was used as the performance function with 0 as its goal. The learning rate and momentum should be calibrated; however, for the given experimental data and 12 neurons in the hidden layer (near the optimum), the results show that the influence of these two parameters was not a deciding factor. The applied and recommended values of the momentum, learning rate, number of epochs to train, gradient and maximum validation failures are given in Table 2. A minimal code sketch of this configuration is provided after this list.
  • Regarding the number of neurons in the hidden layer, the interval s{1} ∈ ⟨1; 50⟩ was investigated. It was shown that even a relatively low complexity of four neurons can provide a satisfactory particular model and acceptable accuracy of prognosis ((MARE_Test)_{4,2} = 4.5%, (MARE_Verif)_{4,2} = 8.7%); nevertheless, the probability of obtaining such results in the first approach of training a model is low. Increasing the complexity by two neurons, up to six, considerably improves the accuracy of the particular model itself and of the prognosis ((MARE_Test)_{6,6} = 3.6%, (MARE_Verif)_{6,6} = 2.7%); however, such networks do not satisfy robustness. If one requires insensitivity to the random initial assumption of weights and biases, networks with 11 neurons in the hidden layer provide robustness with a probability of 0.8 and, at the same time, a very good accuracy level ((MARE_Test)_{11,4} = 2.0%, (MARE_Verif)_{11,4} = 3.0%). A greater number of neurons in the hidden layer (>11) also gives accurate results, but the accuracy does not increase substantially, and the overfitting risk is higher with 13 neurons or more.
  • In order to choose the model which most appropriately prognoses the mechanical characteristics of the studied materials, it is necessary to consider certain statistical measures for the assessment of the obtained results. In particular, evaluation parameters which indicate the occurrence of single instances of significant deviation between a mapped value and the respective target (e.g., MaxARE) should be introduced. Such individual considerable errors might disqualify a given model even if the overall mean error (e.g., MARE, MSE) is at a satisfactory level.
  • A series of criteria, (16)–(26), is proposed to evaluate the obtained models in a two-step evaluation. The idea of the two-step verification allows one to assess the fitting of a particular model to the data with which it was trained and to assess whether this particular model is capable of prognosing. Based on the presented research, it is recommended that the two-step model evaluation be performed with regard to the following qualities and measures explained in Section 3.5: accuracy (MARE_Test, MARE_Verif), under- and overfitting (MARE_Test, MARE_Verif, av_MARE_Test, av_MARE_Verif) and robustness (SAE_Valid).
  • The relationship between the number of neurons in the hidden layer and convergence (understood as approaching MSE_Valid = 0) can be very well described by a power law, which indicates that the modelling of closed-cell aluminium during compression is not a chaotic but an ordered phenomenon. At the same time, however, the results show that for networks with 13 neurons or more, instances burdened with considerable overfitting start to occur. These two facts may suggest that, in the pursuit of better accuracy, instead of increasing the number of neurons in the hidden layer s{1}, one may choose to lower it while adding another hidden layer. The multilayer network approach was, however, beyond the scope of the presented work and is planned as further research.
  • None of the analyzed particular models had an accuracy of prognosis better than MARE_Verif,min = 1.661%. This threshold, below which even the most complex networks were unable to perform, supports the idea that, when using the tool of artificial intelligence, one has to balance the demanded accuracy, the network complexity and the amount of experimental data used for model training. The more data obtained from experiments, the better the accuracy, but also the larger the computational time and the costs of data harvesting. Conversely, if one accepts some inevitable threshold of prognosis quality, a successful model may still be obtained at a lower investment of time and cost.
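To close the list, the architecture specifications above translate into a short script; the sketch below uses scikit-learn for illustration only. Two caveats: scikit-learn offers no Levenberg–Marquardt optimizer, so 'lbfgs' is used as a plainly named stand-in, and the training data are synthetic placeholders within the studied strain and density domains.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Recommended architecture: one hidden layer, tanh activation, linear output
# (MLPRegressor uses an identity output activation for regression by design).
net = MLPRegressor(hidden_layer_sizes=(11,),  # 11 hidden neurons: robust and accurate
                   activation='tanh',         # hyperbolic tangent sigmoid
                   solver='lbfgs',            # stand-in for Levenberg-Marquardt
                   max_iter=100_000,          # generous iteration budget, cf. Table 2
                   tol=1e-10,                 # tight convergence tolerance
                   random_state=0)

# Placeholder data in the studied domains: strain 0..0.69, density 0.200..0.297.
rng = np.random.default_rng(2)
X = np.column_stack([rng.uniform(0.0, 0.69, 2000),
                     rng.uniform(0.200, 0.297, 2000)])
T = 30.0 * X[:, 1] * (1.0 + X[:, 0]) + rng.normal(0.0, 0.05, 2000)
net.fit(X, T)
print("training R^2:", round(net.score(X, T), 4))
```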
As for the potential for further development, the following ideas seem interesting. One could assume another form of the initial relation (2), for example, by also incorporating morphological data of the material (e.g., cell wall thickness, average cell size) into the equation. Additionally, in the present research the verification against external data was performed for a specimen from the middle of the density interval. One could extend this procedure to cross-validation and see how capable the models are of extrapolation. Other approaches could include using multilayer perceptrons or different quality assessment criteria. Finally, one could investigate how neural networks model certain characteristics of closed-cell aluminium which are important from the material design or application engineering points of view; such research has already been started by the authors, and the promising results will be published soon [95].

Author Contributions

Conceptualization, A.M.S., M.D. and T.M.; methodology, A.M.S., M.D. and T.M.; software, M.D., A.M.S. and T.M.; validation, A.M.S., M.D. and T.M.; formal analysis, A.M.S., M.D. and T.M.; investigation, A.M.S. and T.M.; resources, A.M.S.; data curation, A.M.S. and T.M.; writing—original draft preparation, A.M.S. and T.M.; writing—review and editing, A.M.S. and T.M.; visualization, A.M.S., M.D. and T.M.; funding acquisition, A.M.S. and T.M. All authors have read and agreed to the published version of the manuscript.

Funding

The research was co-funded by the Cracow University of Technology, Cracow, Poland (Research Grant of the Dean of the Civil Engineering Faculty 2020–2021) and the AGH University of Science and Technology, Cracow, Poland.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

The authors would like to kindly acknowledge Roman Gieleta for his help in the experimental part of the present article and publication [28].

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Figure A1. Specimens of metal foam: (a) Z_01, (b) Z_02, (c) Z_03, (d) Z_05, (e) Z_06_p, (f) Z_09_p, (g) Z_12_p, (h) Z_14_p, (i) X_Z_01_p, (j) X_Z_02, (k) X_Z_06_p, and (l) X_Z_08_p.

Appendix B

The examples in Figure A2 demonstrate the effect of the smoothing parameter p, applied to the stress–strain experimental data as described in Section 3.1.1, on the resulting NN-input dataset.
Figure A2. Comparison of the quality of smoothing of experimental data for exemplary values of parameter p (specimen Z_14_p).
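For orientation, this smoothing step can be reproduced with the third-party Python csaps package, which mirrors the MATLAB csaps routine cited in [80]. The noisy curve below is a synthetic stand-in for a specimen's raw record; p = 0.01 is the value used for Figure 3, and the other values of p are illustrative.

```python
import numpy as np
from csaps import csaps  # third-party package mirroring MATLAB's csaps

# Synthetic noisy stress-strain record standing in for raw experimental data.
rng = np.random.default_rng(3)
eps = np.linspace(0.0, 0.69, 800)                         # strain
sigma_raw = 5.0 + 8.0 * eps + rng.normal(0.0, 0.3, 800)   # noisy stress, MPa

# The smoothing parameter p trades fidelity for smoothness: p -> 0 approaches
# the least-squares straight line, while p -> 1 reproduces the raw data.
for p in (0.0001, 0.01, 0.9):
    sigma_smooth = csaps(eps, sigma_raw, eps, smooth=p)
    residual = float(np.mean(np.abs(sigma_smooth - sigma_raw)))
    print(f"p = {p}: mean |residual| = {residual:.4f} MPa")
```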

Appendix C

In the calibration of the learning parameters described in Section 3.4, the momentum and learning rate were examined in relation to the performance function MSE. An additional remark is supplied to that analysis here: the vast majority of the results fall into the interval MSE_majority ∈ ⟨0.0015; 0.002⟩ MPa², and even the maximum instance is not greater by an order of magnitude. This may be interpreted as a negligible influence of the choice of learning rate and momentum on the NN training process within the investigated ranges.
Figure A3. Histogram of MSE in calibration of learning parameters.
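The calibration described above can be emulated with a simple grid search; the sketch below is illustrative Python, where scikit-learn's 'sgd' solver exposes both a learning rate and a momentum term (the study itself was performed in MATLAB, the grid values are arbitrary, and all data are placeholders).

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Grid over (learning rate, momentum) for a 12-neuron network; record final MSE.
rng = np.random.default_rng(4)
X = np.column_stack([rng.uniform(0.0, 0.69, 500),
                     rng.uniform(0.200, 0.297, 500)])
T = 30.0 * X[:, 1] * (1.0 + X[:, 0])   # placeholder targets, MPa

mses = []
for lr in (0.05, 0.2, 0.5):
    for mc in (0.1, 0.5, 0.9):         # sklearn momentum must lie in [0, 1]
        net = MLPRegressor(hidden_layer_sizes=(12,), activation='tanh',
                           solver='sgd', learning_rate_init=lr, momentum=mc,
                           max_iter=2000, random_state=0)
        net.fit(X, T)
        mses.append(float(np.mean((net.predict(X) - T) ** 2)))
print(f"MSE spread over the grid: {min(mses):.5f} .. {max(mses):.5f} MPa^2")
```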

Appendix D

Figure 10 in Section 4 was scaled in order to present the results clearly and legibly. Due to this scaling, some of the results could not fit in the drawing. The omission of these results from the plot in the main body of the article did not affect the conclusions or reasoning. Here, the plot corresponding to Figure 10 is presented with all obtained results, so that the reader has the complete picture (Figure A4).
Figure A4. Mean absolute relative error from verification of external data.

Appendix E

Provided here are figures complementary to Figure 12 from Section 4.3. Figure A5, Figure A6, Figure A7 and Figure A8 present regression plots for all stages of network training (training, validation, test) and for the three stages cumulatively for the networks [s{1}min, approach_min] = [4, 2], [6, 6] and [11, 4] and the network [s{1}best, approach_best] = [48, 10]. Values of the Pearson coefficient are shown at the top of each graph. The equation of the linear regression line is given on the left side of each graph.
Figure A5. Regression for all stages of network training (training, validation, test) and for the three stages cumulatively for the network [s{1}min, approach_min] = [4, 2].
Figure A6. Regression for all stages of network training (training, validation, test) and for the three stages cumulatively for the network [s{1}min, approach_min] = [6, 6].
Figure A7. Regression for all stages of network training (training, validation, test) and for the three stages cumulatively for the network [s{1}min, approach_min] = [11, 4].
Figure A8. Regression for all stages of network training (training, validation, test) and for the three stages cumulatively for the network [s{1}best, approach_best] = [48, 10].

References

  1. Chen, Y.; Das, R.; Battley, M. Effects of cell size and cell wall thickness variations on the strength of closed-cell foams. Int. J. Eng. Sci. 2017, 120, 220–240. [Google Scholar] [CrossRef]
  2. Idris, M.I.; Vodenitcharova, T.; Hoffman, M. Mechanical behaviour and energy absorption of closed-cell aluminium foam panels in uniaxial compression. Mater. Sci. Eng. A 2009, 517, 37–45. [Google Scholar] [CrossRef]
  3. Koza, E.; Leonowicz, M.; Wojciechowski, S.; Simancik, F. Compressive strength of aluminum foams. Mater. Lett. 2004, 58, 132–135. [Google Scholar] [CrossRef]
  4. Nammi, S.K.; Edwards, G.; Shirvani, H. Effect of cell-size on the energy absorption features of closed-cell aluminium foams. Acta Astronaut. 2016, 128, 243–250. [Google Scholar] [CrossRef]
  5. Nosko, M.; Simančík, F.; Florek, R.; Tobolka, P.; Jerz, J.; Mináriková, N.; Kováčik, J. Sound absorption ability of aluminium foams. Met. Foam. 2017, 1, 15–41. [Google Scholar] [CrossRef] [Green Version]
  6. Lu, T.; Hess, A.; Ashby, M. Sound absorption in metallic foams. J. Appl. Phys. 1999, 85, 7528–7539. [Google Scholar] [CrossRef]
  7. Catarinucci, L.; Monti, G.; Tarricone, L. Metal foams for electromagnetics: Experimental, numerical and analytical characterization. Prog. Electromagn. Res. B 2012, 45, 1–18. [Google Scholar] [CrossRef] [Green Version]
  8. Xu, Z.; Hao, H. Electromagnetic interference shielding effectiveness of aluminum foams with different porosity. J. Alloy. Compd. 2014, 617, 207–213. [Google Scholar] [CrossRef]
  9. Albertelli, P.; Esposito, S.; Mussi, V.; Goletti, M.; Monno, M. Effect of metal foam on vibration damping and its modelling. Int. J. Adv. Manuf. Technol. 2021, 117, 2349–2358. [Google Scholar] [CrossRef]
  10. Gopinathan, A.; Jerz, J.; Kováčik, J.; Dvorák, T. Investigation of the relationship between morphology and thermal conductivity of powder metallurgically prepared aluminium foams. Materials 2021, 14, 3623. [Google Scholar] [CrossRef]
  11. Hu, Y.; Fang, Q.-Z.; Yu, H.; Hu, Q. Numerical simulation on thermal properties of closed-cell metal foams with different cell size distributions and cell shapes. Mater. Today Commun. 2020, 24, 100968. [Google Scholar] [CrossRef]
  12. Degischer, H.-P.; Kriszt, B. Handbook of Cellular Metals: Production, Processing, Applications, 1st ed.; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, Germany, 2001; pp. 12–17. [Google Scholar]
  13. Stöbener, K.; Rausch, G. Aluminium foam–polymer composites: Processing and characteristics. J. Mater. Sci. 2009, 44, 1506–1511. [Google Scholar] [CrossRef]
  14. Duarte, I.; Vesenjak, M.; Krstulović-Opara, L.; Anžel, I.; Ferreira, J.M.F. Manufacturing and bending behaviour of in situ foam-filled aluminium alloy tubes. Mater. Des. 2015, 66, 532–544. [Google Scholar] [CrossRef]
  15. Birman, V.; Kardomatea, G. Review of current trends in research and applications of sandwich structures. Compos. Part B Eng. 2018, 142, 221–240. [Google Scholar] [CrossRef]
  16. Banhart, J. Manufacture, characterisation and application of cellular metals and metal foams. Prog. Mater. Sci. 2001, 46, 559–632. [Google Scholar] [CrossRef]
  17. Garcia-Moreno, F. Commercial applications of metal foams: Their properties and production. Materials 2016, 9, 85. [Google Scholar] [CrossRef]
  18. Singh, S.; Bhatnagar, N. A survey of fabrication and application of metallic foams (1925–2017). J. Porous. Mater. 2018, 25, 537–554. [Google Scholar] [CrossRef]
  19. Atwater, M.; Guevara, L.; Darling, K.; Tschopp, M. Solid state porous metal production: A review of the capabilities, characteristics, and challenges. Adv. Eng. Mater. 2018, 20, 1700766. [Google Scholar] [CrossRef] [Green Version]
  20. Baumeister, J.; Weise, J.; Hirtz, E.; Höhne, K.; Hohe, J. Applications of Aluminum Hybrid Foam Sandwiches in Battery Housings for Electric Vehicles. Proced. Mater. Sci. 2014, 4, 317–321. [Google Scholar] [CrossRef]
  21. Simančík, F. Metallic foams–Ultra light materials for structural applications. Inżynieria Mater. 2001, 5, 823–828. [Google Scholar]
  22. Banhart, J.; Seeliger, H.-W. Recent trends in aluminum foam sandwich technology. Adv. Eng. Mater. 2012, 14, 1082–1087. [Google Scholar] [CrossRef]
  23. Chalco Aluminium Corporation. Aluminium Foams for Architecture Décor and Design. Available online: http://www.aluminum-foam.com/application/aluminum_foam_for_architecure_decor_and_design.html (accessed on 30 November 2021).
  24. Cymat Technologies Ltd. ALUSION™ an Extraordinary Surface Solution. Available online: https://www.alusion.com/index.php/products/alusion-architectural-applications (accessed on 30 November 2021).
  25. Miyoshi, T.; Itoh, M.; Akiyama, S.; Kitahara, A. ALPORAS aluminum foam: Production process, properties, and applications. Adv. Eng. Mater. 2000, 2, 179–183. [Google Scholar] [CrossRef]
  26. Wang, L.B.; See, K.Y.; Ling, Y.; Koh, W.J. Study of metal foams for architectural electromagnetic shielding. J. Mater. Civil Eng. 2012, 24, 488–493. [Google Scholar] [CrossRef]
  27. Chalco Aluminium Corporation. Aluminium Foams for Sound Absorption. Available online: http://www.aluminum-foam.com/application/aluminum_form_for_Sound_absorption.html (accessed on 30 November 2021).
  28. Stręk, A.M.; Lasowicz, N.; Kwiecień, A.; Zając, B.; Jankowski, R. Highly dissipative materials for damage protection against earthquake-induced structural pounding. Materials 2021, 14, 3231. [Google Scholar] [CrossRef] [PubMed]
  29. Jang, W.-Y.; Hsieh, W.-Y.; Miao, C.-C.; Yen, Y.-C. Microstructure and mechanical properties of ALPORAS closed-cell aluminium foam. Mater. Charact. 2015, 107, 228–238. [Google Scholar] [CrossRef]
  30. Maire, E.; Adrien, J.; Petit, C. Structural characterization of solid foams. Comptes Rendus Phys. 2014, 15, 674–682. [Google Scholar] [CrossRef]
  31. Neu, T.R.; Kamm, P.H.; von der Eltz, N.; Seeliger, H.-W.; Banhart, J.; García-Moreno, F. Correlation between foam structure and mechanical performance of aluminium foam sandwich panels. Mater. Sci. Eng. A 2021, 800, 140260. [Google Scholar] [CrossRef]
  32. Stręk, A.M. Ocena Właściwości Wytrzymałościowych i Funkcjonalnych Materiałów Komórkowych. (English Title: Assessment of Strength and Functional Properties of Cellular Materials). Ph.D. Thesis, AGH University, Kraków, Poland, 2017. [Google Scholar]
  33. Stręk, A.M. Methods of production of metallic foams. Przegląd Mechaniczny 2012, 12, 36–39. [Google Scholar]
  34. Stręk, A.M. Methodology for experimental investigations of metal foams and their mechanical properties. Mech. Control 2012, 31, 90. [Google Scholar] [CrossRef] [Green Version]
  35. Stręk, A.M. Determination of material characteristics in the quasi-static compression test of cellular metal materials. In Wybrane Problem Geotechniki i Wytrzymałości Materiałów dla Potrzeb Nowoczesnego Budownictwa, 1st ed.; Tatara, T., Pilecka, E., Eds.; Wydawnictwo Politechniki Krakowskiej: Kraków, Poland, 2020. (In Polish) [Google Scholar]
  36. DIN 50134:2008-10 Prüfung von Metallischen Werkstoffen—Druckversuch an Metallischen Zellularen Werkstoffen. Available online: https://www.beuth.de/en/standard/din-50134/108978639 (accessed on 10 June 2021).
  37. ISO 13314:2011 Mechanical Testing of metals—Ductility Testing—Compression Test for Porous and Cellular Metals. Available online: https://www.iso.org/standard/53669.html (accessed on 10 June 2021).
  38. Ashby, M.F.; Evans, A.; Fleck, N.; Gibson, L.J.; Hutchinson, J.W.; Wadley, H.N. Metal Foams: A Design Guide; Elsevier Science: Burlington, MA, USA, 2000. [Google Scholar]
  39. Daxner, T.; Bohm, H.J.; Seitzberger, M.; Rammerstorfer, F.G. Modelling of cellular metals. In Handbook of Cellular Metals; Degischer, H.-P., Kriszt, B., Eds.; Wiley-VCH: Weinheim, Germany, 2002; pp. 245–280. [Google Scholar]
  40. Gibson, L.J.; Ashby, M.F. Cellular Solids, 1st ed.; Pergamon Press: Oxford, UK, 1988. [Google Scholar]
  41. Jung, A.; Diebels, S. Modelling of metal foams by a modified elastic law. Mech. Mater. 2016, 101, 61–70. [Google Scholar] [CrossRef]
  42. Beckmann, C.; Hohe, J. A probabilistic constitutive model for closed-cell foams. Mech. Mater. 2016, 96, 96–105. [Google Scholar] [CrossRef]
  43. Hanssen, A.G.; Hopperstad, O.S.; Langseth, M.; Ilstad, H. Validation of constitutive models applicable to aluminium foams. Int. J. Mech. Sci. 2002, 44, 359–406. [Google Scholar] [CrossRef]
  44. De Giorgi, M.; Carofalo, A.; Dattoma, V.; Nobile, R.; Palano, F. Aluminium foams structural modelling. Comput. Struct. 2010, 88, 25–35. [Google Scholar] [CrossRef]
  45. Miedzińska, D.; Niezgoda, T.; Gieleta, R. Numerical and experimental aluminum foam microstructure testing with the use of computed tomography. Comput. Mater. Sci. 2012, 64, 90–95. [Google Scholar] [CrossRef]
  46. Nowak, M. Application of periodic unit cell for modeling of porous materials. In Proceedings of the 8th Workshop on Dynamic Behaviour of Materials and Its Applications in Industrial Processes, Warszawa, Poland, 25–27 June 2014; pp. 47–48. [Google Scholar]
  47. Raj, S.V. Microstructural characterization of metal foams: An examination of the applicability of the theoretical models for modeling foams. Mater. Sci. Eng. A 2011, 528, 5289–5295. [Google Scholar] [CrossRef] [Green Version]
  48. Raj, S.V. Corrigendum to Microstructural characterization of metal foams: An examination of the applicability of the theoretical models for modeling foams. Mater. Sci. Eng. A 2011, 528, 8041. [Google Scholar] [CrossRef]
  49. Dudzik, M.; Stręk, A.M. ANN architecture specifications for modelling of open-cell aluminum under compression. Math. Probl. Eng. 2020, 2020, 26. [Google Scholar] [CrossRef] [Green Version]
  50. Dudzik, M.; Stręk, A.M. ANN model of stress-strain relationship for aluminium sponge in uniaxial compression. J. Theor. Appl. Mech. 2020, 58, 385–390. [Google Scholar] [CrossRef]
  51. Stręk, A.M.; Dudzik, M.; Kwiecień, A.; Wańczyk, K.; Lipowska, B. Verification of application of ANN modelling for compressive behaviour of metal sponges. Eng. Trans. 2019, 67, 271–288. [Google Scholar]
  52. Settgast, C.; Abendroth, M.; Kuna, M. Constitutive modeling of plastic deformation behavior of open-cell foam structures using neural networks. Mech. Mat. 2019, 131, 1–10. [Google Scholar] [CrossRef]
  53. Rodríguez-Sánchez, A.E.; Plascencia-Mora, H. A machine learning approach to estimate the strain energy absorption in expanded polystyrene foams. J. Cell. Plast. 2021, 29. [Google Scholar] [CrossRef]
  54. Baiocco, G.; Tagliaferri, V.; Ucciardello, N. Neural Networks implementation for analysis and control of heat exchange process in a metal foam prototypal device. Procedia CIRP 2017, 62, 518–522. [Google Scholar] [CrossRef]
  55. Calati, M.; Righetti, G.; Doretti, L.; Zilio, C.; Longo, G.A.; Hooman, K.; Mancin, S. Water pool boiling in metal foams: From experimental results to a generalized model based on artificial neural network. Int. J. Heat Mass Trans. 2021, 176, 121451. [Google Scholar] [CrossRef]
  56. Ojha, V.K.; Abraham, A.; Snášel, V. Metaheuristic design of feedforward neural networks: A review of two decades of research. Eng. Appl. Artif. Intel. 2017, 60, 97–116. [Google Scholar] [CrossRef] [Green Version]
  57. Bashiri, M.; Farshbaf Geranmayeh, A. Tuning the parameters of an artificial neural network using central composite design and genetic algorithm. Sci. Iran. 2011, 18, 1600–1608. [Google Scholar] [CrossRef] [Green Version]
  58. La Rocca, M.; Perna, C. Model selection for neural network models: A statistical perspective. In Computational Network Theory: Theoretical Foundations and Applications, 1st ed.; Dehmer, M., Emmert-Streib, F., Pickl, S., Eds.; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, Germany, 2015. [Google Scholar]
  59. Notton, G.; Voyant, C.; Fouilloy, A.; Duchaud, J.L.; Nivet, M.L. Some applications of ANN to solar radiation estimation and forecasting for energy applications. Appl. Sci. 2019, 9, 209. [Google Scholar] [CrossRef] [Green Version]
  60. Anders, U.; Korn, O. Model selection in neural networks. Neural Netw. 1999, 12, 309–323. [Google Scholar] [CrossRef] [Green Version]
  61. Oken, A. An Introduction to and Applications of Neural Networks. Available online: https://www.whitman.edu/Documents/Academics/Mathematics/2017/Oken.pdf (accessed on 21 February 2019).
  62. Mareš, T.; Janouchová, E.; Kučerová, A. Artificial neural networks in the calibration of nonlinear mechanical models. Adv. Eng. Softw. 2016, 95, 68–81. [Google Scholar] [CrossRef] [Green Version]
  63. Rafiq, M.Y.; Bugmann, G.; Easterbrook, D.J. Neural network design for engineering applications. Comput. Struct. 2001, 79, 1541–1552. [Google Scholar] [CrossRef]
  64. Kurzyński, M. Metody Sztucznej Inteligencji dla Inżynierów; (English Title: Methods of Artificial Intelligence for Engineers); Państwowa Wyższa Szkoła Zawodowa im. Witelona: Legnica, Poland; Stowarzyszenie “Wspólnota Akademicka”: Legnica, Poland, 2008. [Google Scholar]
  65. Lefik, M. Zastosowanie Sztucznych Sieci Neuronowych w Mechanice i w Inżynierii; (English Title: Application of Artificial Neural Networks in Mechanics and Engineering); Wydawnictwo Politechniki Łódzkiej: Łódź, Poland, 2005. [Google Scholar]
  66. Jakubek, M. Zastosowanie Sztucznych Sieci Neuronowych w Wybranych Zagadnieniach Eksperymentalnej Mechaniki Materiałów i Konstrukcji. (English Title: Application of Artificial Neural Networks in Selected Problems of Experimental Mechanics and Structural Engineering). Ph.D. Thesis, Politechnika Krakowska (Cracow University of Technology), Kraków, Poland, 2007. [Google Scholar]
  67. Waszczyszyn, Z.; Ziemianski, L. Neural networks in the identification analysis of structural mechanics problems. In Parameter Identification of Materials and Structures; Mróz, Z., Stavroulakis, G.E., Eds.; Springer: Wien, Austria; New York, NY, USA, 2005; pp. 265–340. [Google Scholar]
  68. Flood, I.; Kartam, N. Neural network in civil engineering I: Principles and understandings. ASCE J. Comput. Civ. Eng. 1994, 8, 131–148. [Google Scholar] [CrossRef]
  69. Flood, I.; Kartam, N. Neural network in civil engineering II: Systems and application. ASCE J. Comput. Civ. Eng. 1994, 8, 149–162. [Google Scholar] [CrossRef]
  70. Pineda, P.; Rubio, J.N. Efficient Structural Design with ANNs. Encyclopedia Topic Review. Available online: https://encyclopedia.pub/item/revision/68a3c44a440d84b08f5e35e634fc4892 (accessed on 30 November 2021).
  71. Ray, R.; Kumar, D.; Samui, P.; Roy, L.B.; Goh, A.T.C.; Zhang, W. Application of soft computing techniques for shallow foundation reliability in geotechnical engineering. Geosci. Front. 2021, 12, 375–383. [Google Scholar] [CrossRef]
  72. Sumelka, W.; Łodygowski, T. Reduction of the number of material parameters by ANN approximation. Comput. Mech. 2013, 52, 287–300. [Google Scholar] [CrossRef] [Green Version]
  73. Chaabene, W.B.; Flah, M.; Nehdi, M.L. Machine learning prediction of mechanical properties of concrete: Critical review. Construct. Build. Mater. 2020, 260, 119889. [Google Scholar] [CrossRef]
  74. Asteris, P.G.; Apostolopoulou, M.; Skentou, A.D.; Moropoulou, A. Application of artificial neural networks for the prediction of the compressive strength of cement-based mortars. Comput. Concr. 2019, 24, 329–345. [Google Scholar]
  75. Kardani, N.; Bardhan, A.; Gupta, S.; Samui, P.; Nazem, M.; Zhang, Y.; Zhou, A. Predicting permeability of tight carbonates using a hybrid machine learning approach of modified equilibrium optimizer and extreme learning machine. Acta Geotech. 2021, 17. [Google Scholar] [CrossRef]
  76. Diamantopoulou, M.; Karathanasopoulos, N.; Mohr, D. Stress-strain response of polymers made through two-photon lithography: Micro-scale experiments and neural network modeling. Addit. Manuf. 2021, 47, 102266. [Google Scholar] [CrossRef]
  77. Standard PN-EN ISO 1923; Tworzywa Sztuczne Porowate i Gumy–Oznaczanie Wymiarów Liniowych. Polish Committee for Standardization: Warsaw, Poland, 1999. (In Polish)
  78. Reinsch, C.H. Smoothing by spline functions. Numer. Math. 1967, 10, 177–184. [Google Scholar] [CrossRef]
  79. Champion, R.; Lenard, C.T.; Mills, T.M. An introduction to abstract splines. Math. Sci. 1996, 21, 8–26. [Google Scholar]
  80. Mathworks Documentation: Csaps. Available online: https://www.mathworks.com/help/curvefit/csaps.html (accessed on 21 November 2021).
  81. Ripley, B.D.; Hjort, N.L. Pattern Recognition and Neural Networks; Cambridge University Press: New York, NY, USA, 1995. [Google Scholar]
  82. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013. [Google Scholar]
  83. Russell, S.J. Artificial Intelligence: A Modern Approach; Prentice Hall: Hoboken, NJ, USA, 2010. [Google Scholar]
  84. Demuth, H.; Beale, M.; Hagan, M. Neural Network Toolbox 6 User’s Guide; The MathWorks Inc.: Natick, MA, USA, 2009. [Google Scholar]
  85. Mathworks Documentation: Mapminmax. Available online: https://www.mathworks.com/help/deeplearning/ref/mapminmax.html (accessed on 21 February 2019).
  86. Matlab and Automatic Target Normalization: Mapminmax. Don’t Trust Your Matlab Framework! Available online: https://neuralsniffer.wordpress.com/2010/10/17/matlab-and-automatic-target-normalization-mapminmax-dont-trust-your-matlab-framework/ (accessed on 21 February 2019).
  87. Hagan, M.T.; Demuth, H.B.; Beale, M.H.; De Jesus, O. Neural Network Design, 2nd ed.; Amazon: Seattle, WA, USA, 2014. [Google Scholar]
  88. Famili, A.; Shen, W.-M.; Weber, R.; Simoudis, E. Data preprocessing and intelligent data analysis. Intell. Data Anal. 1997, 1, 3–23. [Google Scholar] [CrossRef] [Green Version]
  89. Dudzik, M. Współczesne Metody Projektowania, Weryfikacji Poprawności i Modelowania Zjawisk Trakcji Elektrycznej; (English Title: Modern Methods of Designing, Verification and Modelling of Phenomena Concerning Electric Traction); Wydawnictwo Politechniki Krakowskiej: Kraków, Poland, 2018. [Google Scholar]
  90. Hutter, F.; Hoos, H.; Leyton-Brown, K. An efficient approach for assessing hyperparameter importance. In Proceedings of Machine Learning Research, Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 21–26 June 2014; Volume 32, pp. 754–762. [Google Scholar]
  91. Madsen, K.; Nielsen, H.; Tingleff, O. Methods for Non-Linear Least Squares Problems, 2nd ed.; Technical University of Denmark: Kongens Lyngby, Denmark, 2004. [Google Scholar]
  92. Layer, E.; Tomczyk, K. Determination of non-standard input signal maximizing the absolute error. Metrol. Meas. Syst. 2009, 17, 199–208. [Google Scholar]
  93. Tomczyk, K.; Piekarczyk, M.; Sokal, G. Radial basis functions intended to determine the upper bound of absolute dynamic error at the output of voltage-mode accelerometers. Sensors 2019, 19, 4154. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  94. Dudzik, M. Towards characterization of indoor environment in smart buildings: Modelling PMV index using neural network with one hidden layer. Sustainability 2020, 12, 6749. [Google Scholar] [CrossRef]
  95. Stręk, A.M.; Machniewicz, T.; Dudzik, M. ANN Model and Characteristics of Closed-Cell Aluminium in Compression. 2022; in preparation. [Google Scholar]
Figure 1. Experiment of uniaxial compression: (a) an aluminium foam cubic specimen; (b) experimental set with one of the samples between the presses ready for the compression test.
Figure 2. Stress–strain relationships from compression of aluminium foam specimens. Following the sample’s name is its density given in [g/cm3].
Figure 3. Exemplary comparison of raw stress–strain data with data smoothed using cubic smoothing splines with the smoothing parameter p = 0.01 (specimen Z_14_p).
Figure 4. The structure of neural networks used in the study. All symbols are explained in the text (Section 3.2).
Figure 5. Values of the performance function (MSE) with reference to momentum and learning rate obtained in calibration of these parameters for networks with 12 neurons in the hidden layer.
Figure 6. The algorithm for identification of the best possible value of s{1} in the feedforward neural network with one hidden layer model. On the left side: the parent procedure P1; on the right side: the nested procedure P2.
Figure 7. Mean square error from the validation stage.
Figure 8. Sum of square errors from the validation stage.
Figure 9. Mean absolute relative error from the test stage.
Figure 10. Mean absolute relative error from verification of external data.
Figure 11. Performance function's course: (a) network [s{1}min, approach_min] = [4, 2]; (b) network [s{1}min, approach_min] = [6, 6]; (c) network [s{1}min, approach_min] = [11, 4]; and (d) network [s{1}best, approach_best] = [48, 10].
Figure 12. Regression for all training stages: (a) network [s{1}min, approach_min] = [4, 2]; (b) network [s{1}min, approach_min] = [6, 6]; (c) network [s{1}min, approach_min] = [11, 4]; and (d) network [s{1}best, approach_best] = [48, 10].
Figure 13. Particular model [s{1}min, approach_min] = [4, 2] (red dots). Additionally, errors (light blue) and targets (dark blue) are given.
Figure 14. Particular model [s{1}min, approach_min] = [6, 6] (red dots). Additionally, errors (light blue) and targets (dark blue) are given.
Figure 15. Particular model [s{1}min, approach_min] = [11, 4] (red dots). Additionally, errors (light blue) and targets (dark blue) are given.
Figure 16. Particular model [s{1}best, approach_best] = [48, 10] (red dots). Additionally, errors (light blue) and targets (dark blue) are given.
Figure 17. Regression for verification of external specimen data: (a) network [s{1}min, approach_min] = [4, 2]; (b) network [s{1}min, approach_min] = [6, 6]; (c) network [s{1}min, approach_min] = [11, 4]; and (d) network [s{1}best, approach_best] = [48, 10].
Figure 18. Prognosis of model [s{1}min, approach_min] = [4, 2] (red dots). Additionally, errors (light blue) and targets (dark blue) are given.
Figure 19. Prognosis of model [s{1}min, approach_min] = [6, 6] (red dots). Additionally, errors (light blue) and targets (dark blue) are given.
Figure 20. Prognosis of model [s{1}min, approach_min] = [11, 4] (red dots). Additionally, errors (light blue) and targets (dark blue) are given.
Figure 21. Prognosis of model [s{1}best, approach_best] = [48, 10] (red dots). Additionally, errors (light blue) and targets (dark blue) are given.
Table 1. Characteristics of the samples.

Sample ID   V (mm3)      m (g)   ρ (g/cm3)
X_Z_02      122,902.09   36.55   0.297
Z_01        127,739.18   35.56   0.278
Z_02        126,160.72   35.90   0.285
Z_03        122,854.76   28.27   0.230
Z_05        124,804.39   26.72   0.214
X_Z_01_p    120,565.13   26.11   0.217
X_Z_06_p    110,950.83   24.83   0.224
X_Z_08_p    113,904.18   27.92   0.245
Z_06_p      125,270.04   28.13   0.225
Z_09_p      125,154.28   29.15   0.233
Z_12_p      122,038.14   24.36   0.200
Z_14_p      124,430.57   29.35   0.236
Table 2. Learning parameters for each approach.

Learning Parameter                  Value
performance function goal           0
minimum performance gradient        10^-10
maximum number of epochs to train   100,000
maximum validation failures         12
maximum time to train in seconds    infinity
learning rate                       0.50
momentum                            2.0
Table 3. Criterial measures and results from evaluation with criteria for accuracy (16) and overfitting (18).

Crit_1^(1)   s{1}best   approach_best   Crit_1^(2)
0.507%       48         10              35.049%
Table 4. Criterial measures and results from evaluation with Criteria (21) and (22).

Crit_2^(1) threshold   Crit_2^(1)   s{1}min   approach_min   Crit_2^(2)   |Crit_2^(1) − Crit_2^(2)| / Crit_2^(1) threshold
5%                     4.455%       4         2              8.688%       85%
4%                     3.572%       6         6              2.689%       22%
3%                     2.767%       7         3              4.731%       65%
2.5%                   2.313%       8         2              3.881%       63%
2%                     1.959%       11        4              2.976%       51%
1.5%                   1.497%       17        4              2.521%       68%
1%                     0.997%       24        8              4.187%       319%
Table 5. Number of neurons in the hidden layer for which thresholds in Criteria (21) and (22) are fulfilled with av_MARE as the criterion measure.

Crit_2^(1) threshold   s{1}min.av.M.T.   Crit_2^(1)   Crit_2^(2) threshold   s{1}min.av.M.V.   Crit_2^(2)
5%                     5                 4.775%       10%                    4                 9.701%
4%                     8                 3.850%       9%                     5                 8.579%
3%                     11                2.419%       8%                     6                 7.106%
2.5%                   11                2.419%       7%                     8                 6.014%
2%                     15                1.771%       6%                     9                 5.519%
1.5%                   21                1.406%       5%                     11                4.051%
1%                     33                0.967%       4%                     —                 —
Table 6. Performance of chosen networks.

[s{1}, approach]   MSE_min      MSE_min / MSE_min,[4,2]   epoch   epoch / epoch_[4,2]
[4, 2]             0.0096888    1.00                      1139    1.00
[6, 6]             0.005076     0.52                      1882    1.65
[11, 4]            0.0016933    0.17                      2539    2.23
[48, 10]           0.00010296   0.01                      5959    5.23