1. Introduction
The world is experiencing the fourth industrial revolution where digital technologies such as artificial intelligence, robotics, and the Internet of Things are used to improve the productivity, efficiency and sustainability of manufacturing processes [
1,
2]. This transformation is underpinned by the enhanced collection and use of data, and therefore sensors are one of the most important technologies in Industry 4.0 [
3]. Although sensors exist for basic measurements such as temperature and pressure, there is a need for more advanced techniques that can monitor materials and processes. Mixing is one of the most common manufacturing processes. It is used not only for combining materials, but also for increasing heat and mass transfer, providing aeration, and suspending solids. In the pharmaceutical industry, correct dosing of the active ingredient is critical for patient safety and treatment effectiveness, and effective mixing is essential to achieve it. In food manufacturing, mixing provides uniform heating and modifies material structure. In material manufacturing, such as the polymer, cement, and rubber industries, final product qualities are determined by the level of homogeneity [
4]. Sensors that provide automatic, real-time data acquisition capabilities are required to monitor critical processes such as mixing. These sensors are termed in- or on-line, where in-line methods directly measure the process material with no sample removal, and on-line methods automatically take samples to be analysed without stopping the process [
5]. Sensors able to characterise whether a mixture is non-mixed or fully mixed offer the benefits of fewer off-specification products, early identification of process upset conditions, and reduced resource consumption from overmixing. Furthermore, techniques able to predict the time remaining until mixing completion would improve batch scheduling and therefore process productivity.
There are numerous in-line and on-line techniques available to monitor industrial mixing processes, with the major categories of techniques being point property measurements, tomographic (e.g., electrical resistance tomography), and spectroscopic (e.g., Near Infrared Spectroscopy (NIRS)). Discussion of the aptitude of each technique to different mixing applications is provided in [
4]. Active acoustic techniques introduce sound waves into a material or system by converting electrical signal pulses into pressure waves using piezoelectric transducers. Either a single transducer sends the sound wave and receives it after reflection from an interface (pulse–echo mode), or a second transducer receives the sound wave after it has been transmitted through the material (pitch–catch mode) [
6]. Low power, high frequency sound waves in the ultrasonic frequency range are used for material characterisation, and do not affect the structure of the material [
6]. Typical ultrasonic parameters measured to characterise a system include the speed of sound, sound wave attenuation, and the material’s acoustic impedance. The speed of sound through the material is dependent on its density and compressibility, and is calculated by measuring the time of flight of the sound wave. The attenuation of the sound wave can be measured as a decrease in the signal amplitude, and is caused by sound wave scattering, reflection, or energy dissipation. The acoustic impedance is dependent on the speed of sound and density of the material, and the proportion of reflected sound wave from a material boundary is dependent on the magnitude of the acoustic impedance mismatch between the neighbouring materials [
7]. Ultrasound sensors are low-cost, real-time, in-line, and capable of operating in opaque systems. However, the large change in acoustic impedance when transmitting from a liquid or solid to a gas causes strong reflection of the sound wave, making transmission difficult in the presence of gas bubbles. Furthermore, the speed of sound in a material is strongly dependent on temperature [
7]. Ultrasound has found application for material characterisation in industries such as food, chemicals, pharmaceuticals, and biotechnology [
6,
7,
8,
9].
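The relationships between time of flight, speed of sound, acoustic impedance, and reflection described above can be illustrated with a minimal sketch. It uses the standard normal-incidence relations (impedance as the product of density and speed of sound, and the reflection coefficient as the ratio of the impedance difference to the impedance sum). The function names and the approximate property values for water and air are assumptions chosen for illustration and are not taken from this work.

```python
def speed_of_sound(path_length_m, time_of_flight_s):
    """Speed of sound from the distance travelled and the measured time of flight.

    In pulse-echo mode the path length is twice the distance to the reflector.
    """
    return path_length_m / time_of_flight_s


def acoustic_impedance(density_kg_m3, speed_m_s):
    """Characteristic acoustic impedance Z = density * speed of sound."""
    return density_kg_m3 * speed_m_s


def reflection_coefficient(z1, z2):
    """Pressure reflection coefficient at normal incidence for a wave
    travelling from medium 1 (impedance z1) into medium 2 (impedance z2)."""
    return (z2 - z1) / (z2 + z1)


# Illustrative, approximate room-temperature values (assumptions).
z_water = acoustic_impedance(1000.0, 1480.0)
z_air = acoustic_impedance(1.2, 343.0)

r = reflection_coefficient(z_water, z_air)
energy_reflected = r ** 2  # fraction of the incident energy that is reflected

print(f"Fraction of energy reflected at a water-air boundary: {energy_reflected:.4f}")
```

With these assumed values almost all of the incident energy is reflected at the liquid-gas boundary, which is consistent with the difficulty of transmitting ultrasound through mixtures containing gas bubbles.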
Several studies have used ultrasonic measurements to monitor mixing. However, many of these require transmission of the sound wave through the mixture in order to measure the speed of sound or attenuation. Stolojanu and Prakash [
10] used two invasive transducers in the pitch–catch mode to characterise glass bead suspensions up to concentrations of 45 wt % in a laboratory scale mixing system. The ultrasonic velocity, attenuation, and peak frequency shift were used to determine particle concentration and size. Both Ribeiro et al. [
11] and Yucel and Coupland [
12] used two non-invasive transducers in the pitch–catch mode to characterise laboratory scale systems. However, transmission-based measurements cannot be used for most mixing systems at the industrial scale. Firstly, the increased distance that the sound wave must travel increases the attenuation of the signal. Secondly, industrial mixtures are typically more complex than the simple model systems tested at laboratory scale. The number of materials being mixed in industrial mixers creates more heterogeneities, which scatter and reflect the sound, and the presence of gas bubbles causes strong reflection of the sound wave. Both effects increase the attenuation of the signal, so transmission becomes more difficult without high-power, high-cost transducers.
Bamberger and Greenwood [
13] mounted pitch–catch mode transducer pairs to a probe to monitor solids suspension in an industrial slurry mixing tank. However, this technique was invasive and the attenuation correlation with solids concentration was only possible over the short sound wave propagation distance. Sun et al. [
14] monitored the dispersion homogeneity of calcium carbonate in polypropylene during extrusion. Two transducers in the pitch–catch mode measured the ultrasound attenuation. Again, this was invasive and transmission was only possible due to the short sound wave propagation distance. Fox et al. [
15] and Salazar et al. [
16] used invasive pulse–echo mode ultrasound probes to monitor air incorporation into aerated batters during mixing. Due to the strong reflection of sound waves caused by gas bubbles, transmission was not possible. The acoustic impedance of the probe-batter interface was measured to determine the optimal mixing time. Hunter et al. [
17] and Bux et al. [
18] used intrusive pulse–echo transducers to monitor particle suspension. Acoustic backscatter techniques were used to measure the speed of sound and attenuation, where the sound wave reflected from the particles was measured as opposed to the wave transmitted through the suspension. Invasive techniques suffer from problems such as probe fouling, probe breakage, and difficulty in installation, thereby limiting their appeal in industrial settings. Ultrasound is suited to non-invasive measurement because the sound wave can be transmitted through the wall of the vessel. Therefore, the current work uses a non-invasive, pulse–echo ultrasound technique to monitor mixing, which requires no sound wave transmission through the mixture being characterised. The only examples of non-invasive, no-transmission ultrasonic sensors for mixing processes are those used to monitor particle suspension. Buurman et al. [
19] used non-invasive ultrasonic Doppler velocimetry to monitor particle suspension by detecting whether particles at the bottom of an opaque mixing vessel were suspended. Zhan et al. [
20] used a non-invasive pulse–echo transducer attached to the base of the vessel to monitor particle suspension by measuring the acoustic impedance of the base-suspension interface.
In this study, two laboratory-scale mixing systems are monitored: honey-water mixing and flour-water batter mixing. These two model systems were selected to show the application of ultrasonic sensors to monitor different mixing processes. As honey is completely miscible in water, this system is representative of the development of homogeneity in liquid–liquid blending. Flour-water batter was used in this study to monitor structural changes as the gluten proteins in the flour become hydrated and aligned into a network, as opposed to air incorporation as investigated in Fox et al. [
15] and Salazar et al. [
16]. Therefore, this flour-water batter system is similar to dough mixing, only with a higher water content. Batter was chosen rather than dough because, during dough mixing at atmospheric pressure, the dough pulls away from the mixer sides and the resulting air gap prevents measurement using low-power ultrasound. However, industrial dough mixing is typically performed at reduced pressure or under vacuum, where the dough remains in contact with the mixer sides. Furthermore, batter mixing has been shown to follow the same physical and chemical changes as dough during mixing, and is therefore representative of industrial dough mixing [
21].
For in-line industrial process monitoring, suitable signal processing and interpretation are required for automatic process diagnosis. Supervised Machine Learning (ML) learns a mapping from input data to output classes (classification) or values (regression) during training, so that it can then predict outputs from new input data. The advantage of ML is the ability to fit functions to input–output relationships without the need to define the often complex underlying physical models. The success of ML models is dependent on the input feature variables used to make predictions. A received ultrasonic waveform consists of an amplitude at each sample point in time. From this waveform, useful features are typically manually engineered, e.g., selecting the maximum waveform amplitude, or monitoring the speed of the sound wave. This approach of using manually engineered features is termed shallow ML. Ultrasonic measurements have been combined with shallow ML algorithms such as Artificial Neural Networks (ANNs) [
22,
23,
24,
25,
26,
27,
28,
29] and Support Vector Machines (SVMs) [
23,
25,
30,
31], using waveform features from the time domain [
23,
25,
27,
31,
32] and frequency domain [
24,
27,
31,
32] after analyses such as wavelet transforms [
22,
24]. These have been used for applications such as predicting sugar concentration during fermentation [
33], measuring particle concentration in multicomponent suspensions [
34], and classification of heat exchanger fouling in the dairy industry [
23,
25]. There are no examples of using ultrasonic measurements and ML to follow a mixing process; however, El-Hagrasy et al. [
35] used the Soft Independent Modelling of Class Analogies (SIMCA) and Principal Component Modified Bootstrap Error-adjusted Single-sample Technique (PC-MBEST) algorithms to analyse NIRS spectra during pharmaceutical solids blending. Typically, shallow ML requires some expertise in the sensor signal to engineer useful features from the raw data. In contrast, Convolutional Neural Networks (CNNs) utilise representation learning, which requires no manual feature engineering: the raw data are transformed into progressively higher, more abstract levels of representation from which features are extracted automatically [
36]. CNNs use convolutional filters to capture the spatial relationships between data values and have found application in image recognition tasks [
37,
38]. CNNs have also been used to improve ML prediction from ultrasonic signals in both the time [
26] and frequency domain after the wavelet transform [
39]. The focus of this study is to compare different feature engineering methods and ML algorithms to classify the mixture state and predict the time remaining until mixing completion for two model mixing systems. Shallow ML algorithms, namely ANNs, SVMs, and Long Short-Term Memory (LSTM) neural networks, are compared with CNNs. The wavelet transform will also be investigated to provide the frequency content of the waveforms as inputs to the ML models. The sensors used in the current work only characterise material close to the vessel wall and therefore the potential for non-representative readings must be investigated. This is achieved by comparing the results from multiple low-cost sensors distributed around the vessel along with data fusion between the sensors. Multisensor data fusion is the combination of measurements from multiple sensors to produce improved analysis over that which could be achieved by using the data from each sensor independently.
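As an illustration of manually engineered features and feature-level data fusion, the following minimal sketch computes two features per waveform and concatenates them across sensors. The definition of waveform energy as the sum of squared amplitudes, the function names, and the short example waveforms are assumptions made for illustration; they are not a description of the processing used in this study.

```python
def waveform_energy(waveform):
    """Waveform energy, assumed here to be the sum of squared amplitudes."""
    return sum(a * a for a in waveform)


def peak_amplitude(waveform):
    """Maximum absolute amplitude in the waveform."""
    return max(abs(a) for a in waveform)


def fused_features(waveforms_by_sensor):
    """Feature-level fusion: concatenate the features from each sensor
    into a single input vector for an ML model."""
    features = []
    for waveform in waveforms_by_sensor:
        features.extend([waveform_energy(waveform), peak_amplitude(waveform)])
    return features


# Two hypothetical sensors, each with a very short synthetic waveform.
sensor_a = [0.0, 1.0, -2.0]
sensor_b = [0.5, -0.5]
print(fused_features([sensor_a, sensor_b]))
```

A CNN would instead take the raw amplitude at every sample point as input, so no step of this kind would be needed.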
4. Discussion
Although ML algorithms were able to achieve similar regression accuracy for both the honey-water blending and flour-water batter mixing, the classification accuracy was lower for the flour-water batter mixing. This is because, despite the waveform energies of both processes changing by a similar proportion throughout the mixing processes (Figure 5 and Figure 7), the honey-water blending waveform energy profile has a sharper change around the time of mixing completion. The waveform energy increased as the honey was removed from the measurement area of the central sensor, giving greater resolution of this sensor around the end of the mixing process.
Different ML approaches performed best on each prediction task. To classify the honey-water mixture state, predict the honey-water mixing time remaining, and predict the flour-water batter mixing time remaining, the use of previous time-steps as features improved prediction accuracy. However, LSTMs, which can represent all previous time-states in the internal network, and time domain input CNNs, which used the previous 10 s of acquired waveforms, performed better than the ANNs, which used fixed feature gradient lengths. In contrast, to classify the mixture state of flour-water batter, no previous time-steps were required. Instead, decomposition of the time domain waveform by the wavelet transform was needed to monitor a state change signature in the frequency domain. Time domain input CNNs were the best performing algorithm to predict the mixing time remaining of both the honey-water blending and flour-water batter mixing. This suggests that using the amplitude at every sample point in the waveform was better suited to predicting the mixing time remaining than using the waveform energy, SAA, or PCs. However, the time domain input CNNs began to overfit when classifying the state of the honey-water mixture. Therefore, the ANN and LSTM prediction accuracy may only sometimes be improved by using the amplitude of all sample points in a waveform. The use of only one acquired waveform for prediction hindered the ability of the CWT input CNNs to predict the mixing time remaining for both systems and to classify the state of the honey-water blending. Therefore, the addition of an LSTM layer would aid the prediction performance of the CNNs by storing representations of previous time-step data. The only ML task that required combining two sensor outputs was predicting the mixing time remaining for the honey-water blending. This is because the different sensor positions gave increased resolution at different stages of the mixing process.
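The difference between a fixed-length feature gradient and a window of previous time-steps can be sketched as follows. The function names, the lookback lengths, and the synthetic waveform energy series are hypothetical; the sketch only illustrates the general idea of using earlier measurements as model inputs and does not reproduce the feature definitions used in this work.

```python
def feature_gradient(series, index, lookback):
    """Average change per time-step of a feature over a fixed lookback.

    Returns None when fewer than `lookback` earlier values are available.
    """
    if index < lookback:
        return None
    return (series[index] - series[index - lookback]) / lookback


def sliding_window(series, index, length):
    """The most recent `length` values up to and including `index`,
    as might be passed to a sequence model. Returns None if too short."""
    if index + 1 < length:
        return None
    return series[index - length + 1 : index + 1]


# Synthetic waveform energy values at successive time-steps (illustrative).
energy = [1.0, 2.0, 4.0, 7.0, 11.0]

print(feature_gradient(energy, 4, 2))  # single summary value for the model
print(sliding_window(energy, 4, 3))    # full recent history for the model
```

The gradient compresses the recent history into one number whose lookback must be chosen in advance, whereas a window (or the internal state of an LSTM) leaves the model free to weight earlier time-steps itself.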
SVMs performed the worst for all prediction tasks. This is likely due to overfitting, which caused low prediction accuracy on test data outside the parameter bounds of the training and validation data. This is because SVMs have convex optimisation functions that converge to a global minimum. In comparison, ANNs only converge to local minima, which may have aided their ability to generalise to test data outside the parameter space of training.
The application of the combined sensor and ML techniques to monitor processes relies on attaining ground truth data to label the outputs of all sensor signals. In industrial settings, product quality evaluations are typically conducted off-line and require considerable time, expense, or manual operations. This can mean ground truth values to produce labelled data are difficult to obtain, and therefore only a small set of labelled data is available for ML model development. In this case, additional techniques must be considered. For example, semisupervised learning can be used to first perform unsupervised learning on the combined set of labelled and unlabelled data to extract features. Supervised learning models using the labelled data can then be used to predict the class or value of the unlabelled data [
73]. Subsequently, active learning can be employed to automatically select the data that would be most useful to model development if labelled, rather than annotating random samples [
74]. Examples include data points close to classification boundaries or those that expand the model training space. Transfer learning is another technique that can help overcome the limitation of small labelled data sets. It has found particular application for transferring pretrained CNNs for image recognition tasks or for NIR spectroscopy calibration transfer across spectrometers [
75,
76]. A model trained on another system, for example a laboratory or pilot scale model system, can be used to aid in the prediction of the state of the target system. For example, the optimised signal processing, network weights, or ML hyperparameter values from the first system can be used as initial training values for the target system. Alternatively, the outputs of the previously trained model applied to the target system may be used as inputs to a second model [
75].
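The active learning step described above, in which data points close to a classification boundary are prioritised for labelling, can be sketched with simple uncertainty sampling. The sketch assumes a binary classifier (for example, non-mixed versus fully mixed) that outputs a probability for each unlabelled sample; the function name and the probability values are hypothetical.

```python
def select_most_uncertain(probabilities, k):
    """Return the indices of the k unlabelled samples whose predicted
    class probability is closest to 0.5, i.e. closest to the boundary."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical predicted probabilities of the "fully mixed" class.
predicted = [0.95, 0.52, 0.10, 0.45, 0.70]

to_label = select_most_uncertain(predicted, 2)
print(to_label)  # indices of the samples to send for off-line analysis
```

Only the selected samples would then be sent for the time-consuming off-line quality evaluation, instead of a random subset.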