1. Introduction
Most petrochemical plants use proportional-integral-derivative (PID) controllers for feedback control [1]. However, PID systems are gradually being extended into advanced process controller (APC) systems, which integrate feedforward and feedback control and predict external influences.
The representative control method of classical control theory is PID control [2,3]. As shown in Figure 1, the transfer function of the PID controller is composed of three terms: proportional, integral, and derivative. $K_p$, $K_i$, and $K_d$ are the proportional, integral, and derivative gains, respectively. Increasing $K_p$ increases the overshoot but decreases the rise time, so the output approaches the target value faster and the steady-state error is reduced. However, this requires considerable control effort, which can strain the system. The settling time is unaffected. As $K_i$ increases, the overshoot increases owing to the large change in the control output accumulated from the residual deviation from the steady state. Since the output changes gradually, the rise time decreases only slightly, the settling time increases, and the steady-state error is eliminated, which is the goal of integral control. Increasing $K_d$ reduces the overshoot, rise time, and settling time because the error is corrected rapidly. However, the steady-state error remains unaffected. The inverse Laplace transform of the transfer function $G(s)$ given by Equation (1) yields the output equation in the time domain, as shown in Equation (2).
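For reference, the textbook parallel form of a PID controller is sketched below. The symbols $G(s)$, $E(s)$, $U(s)$, $e(t)$, and $u(t)$ (controller transfer function, error, and controller output) are assumed here for illustration and may differ from the notation used in Equations (1) and (2).

```latex
\begin{align}
G(s) &= \frac{U(s)}{E(s)} = K_p + \frac{K_i}{s} + K_d\, s \\
u(t) &= K_p\, e(t) + K_i \int_0^{t} e(\theta)\, d\theta + K_d\, \frac{d e(t)}{dt}
\end{align}
```

Each term maps onto one of the behaviors described above: the proportional term reacts to the present error, the integral term accumulates past error and therefore removes the steady-state offset, and the derivative term reacts to the rate of change of the error.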
PID systems are characterized by 1:1 control between one process variable and one manipulative variable. They are feedback control systems that compensate for the difference between predicted and actual values in real time, and their control performance is determined by the PID tuning values.
Unlike PID systems, APC systems can be applied to processes with long time delays, unstable processes, and multi-variable processes. APC is characterized by three features: N:N control, in which multiple process variables are controlled with multiple manipulative variables; feedback control, which compensates for the difference between the predicted and actual values in real time; and feedforward control, which accounts for the influence of external disturbance variables in advance and acts before the disturbance affects the process [4].
Figure 2 shows a schematic of an APC system.
In general, an APC system used in industrial applications is modeled as a first-order time-delay system, as shown in Equation (3), where K = gain, D = delay, τ = time constant, t = elapsed time, x = manipulative variable, and y = process variable [5]. Applying the inverse Laplace transform yields the output equation of the first-order time-delay model in the time domain, as shown in Equation (4).
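As an illustration, the step response of a first-order time-delay model can be sketched in a few lines of Python. The sketch assumes the standard form $G(s) = K e^{-Ds} / (\tau s + 1)$ and a step of size `x_step` applied at t = 0; the function name and the numerical values are hypothetical and are not taken from the plants studied here.

```python
import math

def fopdt_step_response(K, tau, D, t, x_step=1.0):
    """Output y(t) of a first-order time-delay model for a step in x at t = 0."""
    if t < D:
        return 0.0  # the output does not move until the delay has elapsed
    return K * x_step * (1.0 - math.exp(-(t - D) / tau))

# Hypothetical parameters: gain 2, time constant 5 min, delay 3 min.
K, tau, D = 2.0, 5.0, 3.0
for t in (0.0, 3.0, 8.0, 60.0):
    print(t, fopdt_step_response(K, tau, D, t))
```

One time constant after the delay (t = D + τ), the output has covered about 63% of its final change K · x_step, which is a convenient sanity check when reading K, D, and τ off a trend.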
A typical APC system consists of a system model, constraints, disturbance model, cost function, optimization method, and control range, all of which can affect its performance [6,7,8,9,10]. Previous studies have demonstrated that APC systems can save energy compared with PID systems, so replacing a PID system with an APC system can ultimately save energy in a plant [11,12,13,14,15]. The model parameters of a typical APC system are estimated from dynamic-interval data obtained from plant tests, and the performance of the APC system is determined by these parameter values. APC systems require regular updates of the model parameters because of changes in the plant’s production grade, equipment obsolescence and replacement, and similar factors. Programs that help APC engineers obtain model parameters easily (MATLAB, Model-ID, etc.) are commercially available. However, when on-site plant tests are conducted and APC model parameters are calculated, the plant test results and the resulting parameters differ depending on the proficiency of the APC engineer. Therefore, to minimize the influence of engineer proficiency and to calculate consistent APC model parameters, a technique is needed to obtain dynamic-interval data without a plant test.
To estimate APC model parameters, we need to know the correlation between process variables (PVs) and manipulative variables (MVs) in the data of dynamic intervals. Disturbance variables (DVs) are non-manipulable variables that originate outside the process; however, for the purpose of this study, which is to estimate APC model parameters, they are treated in the same way as MVs. It is often difficult to determine the dynamic intervals because of the complexity of process dynamics [16]. In this study, we used time-series data from a real petrochemical plant and extracted the dynamic intervals using various statistical techniques for change-point detection (CPD) [17]. To estimate the APC model parameters accurately, CPD methods based on pruned exact linear time (PELT), a linear kernel, and a radial basis function (RBF) kernel were compared to determine the hyperparameters that yield the dynamic intervals with the smallest mean absolute error (MAE) [18]. The APC model parameters were then estimated using the Levenberg–Marquardt algorithm in the dynamic intervals obtained with the fixed hyperparameters. Finally, the estimated APC model parameters were applied to the APC program, and the fitting rate was compared in three randomly selected intervals of the evaluation data to verify the accuracy of the APC model parameters in the dynamic intervals obtained through CPD.
Comparing the fitting rate of the estimated APC model parameters in three random intervals of the evaluation data, we obtained average fitting rates of 86.09% and 79.94% for Plants A and B, respectively. This final verification confirmed that both the identification of dynamic intervals through CPD and the estimation of APC model parameters were accurate. It demonstrates that APC model parameters can be estimated from dynamic intervals identified through CPD without a plant test, which requires engineers to perturb the process manually. Various prior studies have confirmed that the identification of dynamic intervals through CPD is quite accurate [19,20,21,22]. However, no study has estimated APC model parameters from dynamic intervals of data identified by CPD. In this study, we identify dynamic intervals through CPD in time-series data from the petrochemical process industry. Then, we estimate the APC model parameters within the data of the dynamic intervals, and finally, we verify the accuracy of the dynamic intervals and the APC model parameters through the model fitting rate.
2. Background and Methodology
2.1. APC Model Design Flow
The APC system was introduced because PID operation alone cannot achieve optimized operation given the inherent characteristics of the process, such as time delay, mutual interference, inverse response, and process constraints. The following steps are necessary to carry out an APC project:
(1) Functional Design
By analyzing the flow of the target process and the user’s operational purpose, the PID loop performance of the MVs used by the APC is assessed, and the control strategy, including the selection of the PV, MV, and DV, is set for optimal operation. During the pre-test period, the performance of each instrument must be understood to prevent schedule delays.
(2) Plant Test
This is the process of developing the model of the APC controller to be applied in the actual APC. To develop the model, a plant test is conducted in which the candidate MVs and DVs are changed with the desired amplitude for an appropriate period and the movement of the PV is monitored. Because the actual process is deliberately perturbed during the plant test, operators must be sufficiently trained, the production team must be consulted in advance, and care is needed to prevent process problems.
Figure 3 shows an example of a plant test.
(3) Detailed Design
In detailed design, plant test data are used to finalize the process variables selected in functional design, and a dynamic process model is built between the finalized variables. Before the dynamic process model is applied to the actual field APC system, an offline test is performed to verify the control structure and performance in a virtual simulation environment. The offline test verifies that the PV is properly controlled by the MV and DV in response to changes in the PV’s set point. The dynamic process model of the PV is designed using an APC simulator, which is available from most APC vendors, as shown in Figure 4. After the dynamic process model has been validated in the APC simulator, it can be built online into the actual field APC system, which reduces the implementation error of the APC system (Figure 5).
(4) Commissioning and Performance Analytics
The purpose of applying an APC system to a process is to stabilize the process and generate profits by increasing production efficiency. Therefore, the actual profits must be measured quantitatively by comparing operation before and after the APC system is applied. Commissioning is the stage in which the actual APC is built online and operated optimally. When the APC model parameter tuning is completed, the actual profit is estimated by comparing the performance of optimized operation before and after the application of the APC, as shown in Figure 6.
2.2. Change-Point Detection
CPD is a statistical technique that searches for points of trend change in time-series data; that is, it locates the points where time-series characteristics such as the mean, standard deviation, and slope change rapidly. Figure 7 presents an example of CPD in which the vertical blue dashed lines represent the points where trend changes are detected [23,24,25].
CPD is performed by dividing the time-series data into intervals and minimizing the sum of costs per interval. Thus, CPD can be viewed as a type of partial time-series clustering problem that groups intervals of time-series data with similar characteristics [26,27,28,29]. The starting point of each interval is called the change-point. The change-detection problem for the time series $y = (y_1, y_2, \ldots, y_T)$ can be formally defined as follows:
$$S^{*} = \underset{S}{\arg\min} \sum_{s \in S} C(y_s)$$
Here, $S$, $C$, and $S^{*}$ denote the set of intervals, the cost function for the intervals, and the optimal set of intervals, respectively, and $y_s$ denotes the samples of $y$ that fall in interval $s$.
(1) Pruned Exact Linear Time
The PELT algorithm is a change-detection algorithm that determines the optimal set of intervals in linear time when the number of change-points is unknown. It consists of the following steps [21,22]:
(a) Input: time series $y = (y_1, \ldots, y_T)$, cost function $C$, penalty $\beta$.
(b) Step 1: Initialize
- Initialize $Z$ as an empty array of size $T + 1$
- Initialize with $Z[0] = -\beta$
- Initialize with $L[0] = \emptyset$
- Initialize with $\chi = \{0\}$
- Initialize with $t = 1$
(c) Step 2: Update $Z[t]$, $L[t]$, and $\chi$ as follows:
- $\hat{t} \leftarrow \arg\min_{\tau \in \chi} \left( Z[\tau] + C(y_{\tau+1:t}) + \beta \right)$
- $Z[t] \leftarrow Z[\hat{t}] + C(y_{\hat{t}+1:t}) + \beta$
- $L[t] \leftarrow L[\hat{t}] \cup \{\hat{t}\}$
- $\chi \leftarrow \{\tau \in \chi : Z[\tau] + C(y_{\tau+1:t}) \le Z[t]\} \cup \{t\}$
(d) Step 3: Terminate the algorithm if $t = T$; otherwise, increment $t$ by 1 and return to Step 2
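The steps above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the experiments: it assumes a squared-error (mean-shift) cost for $C$, stores only the start of the last segment instead of the full set $L[t]$, and omits the leading index 0 from the returned change-points. The function name `pelt_mean_shift` and the test signal are hypothetical.

```python
from itertools import accumulate

def pelt_mean_shift(y, beta):
    """Return change-point indices found by PELT with a squared-error segment cost."""
    T = len(y)
    # Prefix sums let us evaluate any segment cost in O(1).
    s1 = [0.0] + list(accumulate(y))
    s2 = [0.0] + list(accumulate(v * v for v in y))

    def cost(a, b):
        # Sum of squared deviations from the mean over samples y[a:b].
        n = b - a
        seg = s1[b] - s1[a]
        return (s2[b] - s2[a]) - seg * seg / n

    Z = [0.0] * (T + 1)   # Z[t]: best penalized cost of y[0:t]
    Z[0] = -beta
    last = [0] * (T + 1)  # last[t]: start of the final segment ending at t
    chi = [0]             # candidate positions for the last change-point
    for t in range(1, T + 1):
        base = [Z[tau] + cost(tau, t) for tau in chi]
        best = min(range(len(chi)), key=base.__getitem__)
        Z[t] = base[best] + beta
        last[t] = chi[best]
        # Pruning step: drop candidates that can never become optimal again.
        chi = [tau for tau, value in zip(chi, base) if value <= Z[t]]
        chi.append(t)

    # Backtrack through the stored segment starts to recover the change-points.
    change_points = []
    t = T
    while last[t] > 0:
        change_points.append(last[t])
        t = last[t]
    return sorted(change_points)

# Hypothetical signal: the mean shifts from 0 to 5 at index 50.
signal = [0.0] * 50 + [5.0] * 50
print(pelt_mean_shift(signal, beta=10.0))
```

The penalty `beta` controls how many change-points are reported: a larger value makes each additional interval more expensive and therefore yields fewer, longer intervals.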
(2) Kernel Change-Point Detection
Kernel change-point (KCP) detection is a method of dividing intervals based on the change in the mean of each interval [22]. In KCP, data are projected onto a high-dimensional space through a measurable function known as the kernel, and change-points are detected by comparing the homogeneity of each sequence [30,31]. Individual points are mapped using a mapping function ϕ, and the cost for a set of intervals S is defined as:
$$C(S) = \sum_{s \in S} \sum_{t \in s} \left\| \phi(y_t) - \bar{\mu}_s \right\|^2$$
Here, $\bar{\mu}_s$ is the average of the mapped values $\phi(y_t)$ over every element $t$ in the interval $s$. During the mapping process, we can use the following kernel function:
$$K(y_i, y_j) = \langle \phi(y_i), \phi(y_j) \rangle$$
where K represents a kernel function. The most commonly used kernel functions include linear kernels and RBF kernels.
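As a sketch of how such a cost can be evaluated without computing ϕ explicitly, the squared distance to the interval mean in feature space can be expanded with the kernel trick, so that the cost of one interval becomes the sum of the diagonal kernel values minus the sum of all pairwise kernel values divided by the interval length. The function names and the RBF bandwidth `gamma` below are assumptions made for illustration.

```python
import math

def linear_kernel(a, b):
    """Linear kernel for scalar samples."""
    return a * b

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel for scalar samples; gamma is an assumed bandwidth parameter."""
    return math.exp(-gamma * (a - b) ** 2)

def kernel_segment_cost(segment, kernel):
    """Sum of squared feature-space distances to the segment mean, via the kernel trick."""
    n = len(segment)
    diagonal = sum(kernel(v, v) for v in segment)
    gram_sum = sum(kernel(a, b) for a in segment for b in segment)
    return diagonal - gram_sum / n

# Hypothetical interval that mixes two regimes, and its two homogeneous halves.
left, right = [0.0] * 5, [5.0] * 5
mixed = left + right
for name, kernel in (("linear", linear_kernel), ("rbf", rbf_kernel)):
    whole = kernel_segment_cost(mixed, kernel)
    split = kernel_segment_cost(left, kernel) + kernel_segment_cost(right, kernel)
    print(name, whole, split)
```

With the linear kernel this cost reduces to the ordinary sum of squared deviations from the interval mean, whereas the RBF kernel also responds to changes in the distribution that leave the mean unchanged. In both cases, splitting the mixed interval at the true change lowers the total cost.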
4. Experimental Results
4.1. Experimental Environments
To ensure the reproducibility of the experiment, we specify the experimental environment of the study:
Hardware platform architecture: GPU-enabled laptop
Laptop configuration: CPU Core i5-8250, quad-core processor, 8 GB RAM.
Operating system: Windows 10.
For a fast and collaborative development environment, we used the cloud-based Google Colaboratory, which provides a free Jupyter Notebook environment that runs in the cloud without installation. Google Colaboratory provides shared computing resources that support development, and it can process large amounts of data.
4.2. Design of the Experimental Datasets
The primary objective of the experiment was to identify the data in the dynamic intervals needed to estimate correct APC model parameters, and the final objective was to train the model with data from only the dynamic intervals and estimate APC model parameters with good performance. In this study, we used time-series data obtained from two different factories, Plants A and B, each at two different times; the resulting datasets are referred to as A-1, A-2, B-1, and B-2. The sampling interval was 1 min, and each dataset was collected over 5 days. The time range of each dataset was as follows:
A-1: 23 April 2023 00:00:00 to 27 April 2023 23:59:00
B-1: 26 March 2023 00:00:00 to 30 March 2023 23:59:00
A-2: 30 April 2023 00:00:00 to 04 May 2023 23:59:00
B-2: 19 April 2023 00:00:00 to 23 April 2023 23:59:00
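As a quick check on the dataset sizes, the number of samples implied by each range can be computed from the start and end timestamps. The sketch below assumes that no samples are missing and that both endpoints are included; the helper name `sample_count` is hypothetical.

```python
from datetime import datetime, timedelta

def sample_count(start, end, step=timedelta(minutes=1)):
    """Number of samples from start to end (inclusive) at a fixed sampling interval."""
    return (end - start) // step + 1

ranges = {
    "A-1": (datetime(2023, 4, 23), datetime(2023, 4, 27, 23, 59)),
    "B-1": (datetime(2023, 3, 26), datetime(2023, 3, 30, 23, 59)),
    "A-2": (datetime(2023, 4, 30), datetime(2023, 5, 4, 23, 59)),
    "B-2": (datetime(2023, 4, 19), datetime(2023, 4, 23, 23, 59)),
}
counts = {name: sample_count(start, end) for name, (start, end) in ranges.items()}
print(counts)
```

Under these assumptions each dataset contains 5 × 24 × 60 = 7200 samples per variable.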
A-1 and B-1 were used as the training data, and A-2 and B-2 as the respective test data. All four datasets consisted of one process variable (the PV) and two manipulative variables (the MV and the DV). The change-detection techniques used in the experiment and their hyperparameter grids are shown in Table 1.
Otherwise, the hyper parameter grid evaluates the following sections:
The specific process for using the dataset is described below.
[Step 1]
To evaluate the proposed method objectively on the Plant A and Plant B data, we divided each plant’s dataset into training and test datasets. Specifically, the first 5 days of data (50%) for each plant were used as training data, and the last 5 days of data (50%) were used for testing.
[Step 2]
The hyperparameters were divided into D and MS, and dynamic intervals were detected and identified using PELT, the linear kernel-based technique, and the RBF kernel-based technique. We used these three methods because they are the most commonly used methods for anomaly detection in time-series data in previous studies on CPD [32,33,34,35].
[Step 3]
The dynamic intervals detected by the PELT, linear kernel, and RBF kernel-based methods were evaluated for accuracy using the MAE metric. The hyperparameters of the algorithm with the smallest MAE were then fixed, and the APC model parameters were estimated.
[Step 4]
The APC model parameters were estimated by applying the following equation to the Levenberg–Marquardt algorithm [36,37]:
The variables in this expression are defined as follows:
[Step 5]
The accuracy of the control performance was verified by comparing the fitting rates between the predicted and actual values obtained with the APC model parameter estimates from Steps 3 and 4.
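Steps 4 and 5 can be illustrated with a small, self-contained sketch. It makes several assumptions that are not stated in the text: the model is the unit-step response of a first-order time-delay system, the delay D is held fixed while K and τ are estimated, the damping uses the simple identity-scaled (Levenberg) form, and the fitting rate is computed as 100 · (1 − ‖y − ŷ‖ / ‖y − mean(y)‖), a definition commonly used in system identification. All function names and the synthetic data are hypothetical.

```python
import math

def fopdt(t, K, tau, D):
    """Unit-step response of a first-order time-delay model at time t."""
    return 0.0 if t < D else K * (1.0 - math.exp(-(t - D) / tau))

def fit_fopdt_lm(ts, ys, D, K0=1.0, tau0=1.0, max_iter=200):
    """Estimate K and tau by damped least squares; the delay D is held fixed."""
    def sse(K, tau):
        return sum((y - fopdt(t, K, tau, D)) ** 2 for t, y in zip(ts, ys))

    K, tau, lam = K0, tau0, 1e-3
    err = sse(K, tau)
    for _ in range(max_iter):
        a = b = c = g1 = g2 = 0.0  # entries of J^T J and J^T r
        for t, y in zip(ts, ys):
            u = t - D
            if u < 0:
                continue  # before the delay the model is 0 with zero gradient
            e = math.exp(-u / tau)
            r = y - K * (1.0 - e)          # residual
            jk = 1.0 - e                   # d(model)/dK
            jt = -K * e * u / (tau * tau)  # d(model)/dtau
            a += jk * jk
            b += jk * jt
            c += jt * jt
            g1 += jk * r
            g2 += jt * r
        # Solve (J^T J + lam * I) delta = J^T r by Cramer's rule.
        a_d, c_d = a + lam, c + lam
        det = a_d * c_d - b * b
        dK = (g1 * c_d - b * g2) / det
        dtau = (a_d * g2 - b * g1) / det
        K_new, tau_new = K + dK, tau + dtau
        new_err = sse(K_new, tau_new) if tau_new > 1e-9 else math.inf
        if new_err < err:
            K, tau, err = K_new, tau_new, new_err  # accept and trust the model more
            lam = max(lam / 10.0, 1e-12)
        else:
            lam *= 10.0                            # reject and damp more strongly
            if lam > 1e12:
                break
    return K, tau

def fitting_rate(actual, predicted):
    """Fit percentage 100 * (1 - ||y - y_hat|| / ||y - mean(y)||) (assumed definition)."""
    mean = sum(actual) / len(actual)
    num = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)))
    den = math.sqrt(sum((a - mean) ** 2 for a in actual))
    return 100.0 * (1.0 - num / den)

# Synthetic, noise-free data generated with K = 2, tau = 5, D = 3 (hypothetical values).
ts = list(range(60))
ys = [fopdt(t, 2.0, 5.0, 3.0) for t in ts]
K_hat, tau_hat = fit_fopdt_lm(ts, ys, D=3.0)
rate = fitting_rate(ys, [fopdt(t, K_hat, tau_hat, 3.0) for t in ts])
print(K_hat, tau_hat, rate)
```

In practice, a library routine for nonlinear least squares would normally replace the hand-written loop; the explicit version is shown only to make the damping and the accept/reject logic visible.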
4.3. Results
(1) Experiment 1 (results for Plant A): For the MV, the 10 models and hyperparameters with the smallest MAE are shown in Table 2.
For the DV, the 10 models and hyperparameters with the smallest MAE are shown in Table 3.
Table 2 and Table 3 indicate that kernel-based detection with a linear kernel performs satisfactorily. The average MAEs for D, MS, and ε are listed in Table 4, Table 5 and Table 6, respectively. The distribution of performance across hyperparameters varies significantly depending on the manipulative variable. For the MV, larger hyperparameter values give better performance, whereas for the DV, smaller values give better performance. This underscores the importance of tuning the hyperparameters according to the data.
As shown in Table 7, the linear kernel-based change detection algorithm performs well for both the MV and DV. Based on the best-performing linear kernel-based hyperparameters for the MV (D = 10, MS = 10, and ε = 0.05), the Levenberg–Marquardt algorithm estimated the model parameters of the MV for the PV as K = 15.3188 and τ = 0.3221. Similarly, based on the best-performing linear kernel-based hyperparameters for the DV (D = 5, MS = 10, and ε = 0.01), the estimated model parameters of the DV for the PV were K = 22.85 and τ = 0.0309. The APC model parameters estimated with the best-performing models for the manipulative variables, MV and DV, are shown graphically, together with the fitting rates measured using the APC model program, in Figure 10, Figure 11 and Figure 12. Three intervals of approximately 200 min in length were randomly selected from the evaluation data of Plant A. In Figure 10, Figure 11 and Figure 12, the x-axis represents time, and the y-axis represents the range of the PV.
We measured the fitting rate between the predicted and actual values obtained with the estimated APC model parameters using the APC model program and obtained the following results:
(Section 1) PV fitting rate with the estimated APC model parameters (MV+DV): 81.9%
(Section 2) PV fitting rate with the estimated APC model parameters (MV+DV): 81.04%
(Section 3) PV fitting rate with the estimated APC model parameters (MV+DV): 95.35%
(2) Experiment 2 (results for Plant B): For the MV, the 10 models and hyperparameters with the smallest MAE are shown in Table 8.
For the DV, the 10 models and hyperparameters with the smallest MAE are shown in Table 9.
The averages of the MAE for D, MS, and ε are presented in Table 10, Table 11 and Table 12, respectively. The performance distributions across hyperparameters differ significantly depending on the manipulative variable.
A comparison of the average MAE for various algorithms is presented in Table 13.
As shown in Table 13, the most appropriate algorithm depends on the manipulative variable. In addition, the difference in performance across the manipulative variables is nearly an order of magnitude. This suggests that the performance difference can be large, depending on the variable used to predict the control variable. Based on the linear kernel-based hyperparameters that performed best for the MV (D = 0, MS = 5, and ε = 0.05), the Levenberg–Marquardt algorithm estimated the model parameters of the MV for the PV as K = 2.644706 and τ = 0.038512. Based on the PELT-based hyperparameters that performed best for the DV (D = 0, MS = 15, and ε = 0.1), the estimated model parameters of the DV for the PV were K = 0.9556 and τ = 0.0337. The APC model parameters estimated with the best-performing models for the MV and DV are shown graphically, together with the fitting rates measured by the APC program, in Figure 13, Figure 14 and Figure 15. Three intervals of approximately 200 min in length were randomly selected from the evaluation data of Plant B. In Figure 13, Figure 14 and Figure 15, the x-axis represents time, and the y-axis represents the range of the PV.
We measured the fitting rate between the predicted and actual values obtained with the estimated APC model parameters using the APC program and obtained the following results:
(Section 1) PV fitting rate with the estimated APC model parameters (MV+DV): 95.04%
(Section 2) PV fitting rate with the estimated APC model parameters (MV+DV): 74.42%
(Section 3) PV fitting rate with the estimated APC model parameters (MV+DV): 70.36%
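The per-plant averages follow directly from the three section values reported for each plant. The short check below uses only those reported numbers; the variable names are illustrative.

```python
from statistics import mean

# Fitting rates (%) reported for Sections 1 to 3 of each plant.
plant_a = [81.9, 81.04, 95.35]
plant_b = [95.04, 74.42, 70.36]

avg_a = mean(plant_a)
avg_b = mean(plant_b)
print(f"Plant A: {avg_a:.4f}%  Plant B: {avg_b:.4f}%")
```

The Plant A mean is 258.29 / 3 ≈ 86.0967%, which matches the reported 86.09% when truncated to two decimals, and the Plant B mean is 239.82 / 3 = 79.94%.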
5. Conclusions
APC model parameters play a crucial role in APC. Many studies have applied CPD in various fields with excellent results. However, no study has estimated APC model parameters from dynamic intervals of data identified by CPD. In this study, we identified dynamic intervals by applying CPD to time-series data from the petrochemical process industry. We then validated the fitting rate of the APC model parameters estimated from the dynamic intervals, which allowed us to verify the accuracy of the model. PELT, linear kernel-based, and RBF kernel-based techniques were applied to CPD to evaluate the MAE of the dynamic intervals, as described in Section 3. The results show that the linear kernel-based method yields the best results for the MV and DV of Plant A, the RBF kernel-based method is the best for the MV of Plant B, and the PELT method is the best for the DV of Plant B. Because the variables in petrochemical processes can be set points or valve values for flow, pressure, temperature, and so on, the performance of the models may differ considerably depending on the variables used to predict the control variables. The experimental results in Section 4 show that the PV control method that considers both the MV and DV, rather than the MV or DV alone, has the highest fitting rate. By selecting the hyperparameters that give the dynamic intervals with the minimum MAE, we estimated the APC model parameters and measured the fitting rate between the predicted and actual values using the APC program. The average fitting rates were 86.09% and 79.94% for Plants A and B, respectively. The final fitting rate validation confirmed the accuracy of the dynamic interval identification and APC model parameter estimation with CPD. This shows that it is possible to estimate APC model parameters from dynamic intervals determined using CPD without a plant test, the outcome of which can depend on engineer skill.
In the future, extended experiments with more process data are needed to increase the reliability of the results. Further research is also needed to determine whether the three CPD techniques used in this study are the most suitable methods; therefore, the MAE of dynamic intervals should be compared using CPD methods other than the PELT, linear kernel, and RBF kernel methods. In addition, since the APC model parameters estimated in this study are theoretical values, further research should verify whether the APC system operates normally when they are applied to a real process and whether the resulting model can reduce process deviation compared with the PID system.