1. Introduction
Ensuring the quality of produced metal parts in terms of surface integrity and geometrical accuracy while simultaneously achieving a long tool life is vital for the future of subtractive manufacturing; thus, a more comprehensive understanding of cutting mechanics is necessary to meet these evolving demands [1,2]. Consequently, Finite Element Method (FEM) simulation of machining processes has received much attention from numerous researchers in recent decades [3]. This method has been successfully implemented to improve the understanding of crater and flank tool wear in turning [4], to determine residual stresses [5], and to analyze interactions in the chip-forming process with internal lubrication in sawing [6]. The underlying idea of FEM is the discretization of the tool and the workpiece and the employment of a stiffness structure [7]. Eulerian, Lagrangian, as well as arbitrary Lagrangian-Eulerian (ALE) mesh formulations are employed.
Despite these promising applications of FEM cutting simulations, no widespread application of such simulations to real-time cutting processes in industry is known, owing to the required computational resources and time. Even on powerful computers, 3D cutting simulations can take multiple hours to reach the force steady state. Although simplified simulation approaches such as 2D cutting simulations exist, this extensive basis of knowledge cannot yet be exploited in situ.
Potential applications of the presented approach include the detection of anomalies by comparing the forces predicted by FEM cutting simulations with the forces measured in the cutting process and monitoring for deviations. The presented approach is also suitable for shortening tool development cycles as well as for improving material property estimation using cutting processes.
Key to a successful FEM cutting simulation is the application of correct models, such as friction models [8] and material models. The Johnson–Cook (J-C) model is the most widespread model for machining operations [9] and relates the flow stress to strain, strain rate, and temperature [10].
The J-C model is stated in Equation (1), with the material-dependent parameters A, B, C, n, and m, where A is the initial yield stress at the reference strain rate and temperature, B is the hardening modulus, C is the strain rate dependency coefficient, n is the work hardening exponent, and m is the thermal softening component [11]:

$\sigma = \left(A + B\,\varepsilon^{n}\right)\left(1 + C\,\ln\frac{\dot{\varepsilon}}{\dot{\varepsilon}_{0}}\right)\left[1 - \left(\frac{T - T_{\mathrm{ref}}}{T_{\mathrm{melt}} - T_{\mathrm{ref}}}\right)^{m}\right]$ (1)

where $\sigma$ is the flow stress, $\varepsilon$ the equivalent plastic strain, $\dot{\varepsilon}$ the strain rate, $\dot{\varepsilon}_{0}$ the reference strain rate, $T$ the current temperature, $T_{\mathrm{ref}}$ the reference temperature, and $T_{\mathrm{melt}}$ the melting temperature.
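To make the roles of the parameters tangible, a minimal numerical sketch of Equation (1) follows. The function and the example values are illustrative assumptions: the coefficients are merely in the range commonly reported for AISI 1045 in the literature, not the exact values used in this study.

```python
import numpy as np

def jc_flow_stress(strain, strain_rate, T, A, B, C, n, m,
                   strain_rate_ref=1.0, T_ref=20.0, T_melt=1460.0):
    """Evaluate the Johnson-Cook flow stress of Equation (1).

    strain: equivalent plastic strain [-], strain_rate: [1/s], T: [deg C];
    A, B in MPa; the reference conditions are illustrative assumptions.
    """
    hardening = A + B * strain**n                            # strain hardening term
    rate = 1.0 + C * np.log(strain_rate / strain_rate_ref)   # strain rate term
    T_hom = (T - T_ref) / (T_melt - T_ref)                   # homologous temperature
    softening = 1.0 - T_hom**m                               # thermal softening term
    return hardening * rate * softening                      # flow stress in MPa

# Coefficients roughly in the range reported for AISI 1045 (hypothetical values):
sigma = jc_flow_stress(strain=0.5, strain_rate=1e4, T=600.0,
                       A=553.0, B=601.0, C=0.0134, n=0.234, m=1.0)
```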
Regarding the determination of the J-C parameters, two approaches are commonly reported in the literature.
Material testing experiments can be conducted via standardized material tests. The J-C parameters can be determined by means of quasi-static and dynamic uniaxial tests for high tensile strength tendon steel [12]. The J-C constitutive model can describe the deformation of low-carbon steel after determination of the parameters with compression and tensile tests [13]. In other works, the J-C parameters A, B, and n were determined by means of quasi-static tests, C was determined via Split Hopkinson Pressure Bar (SHPB) test fitting, and m was obtained from high-temperature compression tests [14]. The downside of this direct approach to determining the J-C parameters is the need for access to expensive material testing devices. Furthermore, the process conditions occurring in machining are hard to reach in material testing devices, which can lead to the determination of a material model that does not fit the process [15]. During machining, high strain rates of up to 10⁶ s⁻¹ and temperatures of up to 1200 °C occur; these are difficult to obtain through conventional tensile or compression tests [16]. For high cutting speeds, strain rates of up to 10⁷ s⁻¹ can occur. This is why special test setups such as the SHPB are utilized, with which strain rates of 10⁵–10⁶ s⁻¹ can be reached [17].
The second approach to determine the J-C parameters is by means of cutting experiments or material testing combined with FEM simulations. An initial set of parameters can be obtained from comparable values in the literature to tune the FEM model by repeatedly varying the J-C parameters to fit the simulations [18]. Simple compression tests combined with cutting tests were applied to determine the J-C parameters for AISI 1045 heat-treatable steel and the Ti10V2Fe3Al (Ti-1023) titanium alloy [19].
To address the issue of saving computational time when determining constitutive model parameters, only a few studies have been conducted. In [20], a Particle Swarm Optimization (PSO) approach was presented to optimize the determination of the J-C parameters, matching cutting forces simulated on the basis of Oxley's machining theory with experimentally recorded cutting forces. It was demonstrated that this approach led to good agreement of cutting forces in turning while significantly decreasing computational expense. In a subsequent study, PSO was utilized for the inverse re-identification of an initial parameter set, and a deviation of approximately 1% from the original values was found [21]. The Downhill Simplex algorithm was applied to data collected from cutting tests on AISI 1045 to determine the J-C material model parameters, delivering simulation results in good agreement with process observables such as temperature, cutting force, and chip form [22,23]. The Efficient Global Optimization (EGO) algorithm was applied to jointly determine the J-C parameters and the Coulomb friction coefficients by minimizing the error between numerical and experimental results for the Ti6Al4V Grade 5 alloy [24].
Besides these optimization approaches, some machine learning approaches in combination with FEM cutting simulations are reported in the literature. A neural network has been implemented to predict cutting force, power, and temperature for aluminum 6061-T6 depending on a given set of J-C parameters and a cutting depth [25]. A hybrid approach for the prediction of cutting forces by deep learning, trained on experimental and simulation data, was presented in [26]. However, such machine learning models represent black boxes, which do not offer any insight into their decision making. By contrast, decision trees are known to additionally offer insight into how they arrive at their conclusions. Therefore, additional machine learning algorithms, among them decision trees, need to be investigated, which is the scope of this work.
2. Materials and Methods
To enable FEM cutting simulations as a tool for in situ analysis of cutting conditions in real industrial cutting processes, the complexity of a cutting simulation needs to be encapsulated within a machine learning model in order to make predictions in fractions of a second. To achieve this, the trained machine learning model aims to reduce the time needed to solve for the cutting forces in a cutting simulation, limited solely to their dependence on the J-C plasticity model parameters. The J-C damage model and its coefficients were therefore not investigated and were omitted, and all other parameters, such as the boundary conditions, contact, elastomechanical and thermal properties, and cutting conditions, were held constant, so that only the dependence on the J-C model was trained. This was done to test the applicability of machine learning models for encapsulating FEM cutting simulation results. Since a certain number of parameter variations and subsequent simulations were needed, keeping the other parameters constant was also necessary in this first investigation to limit the number of simulations required. The goal was to predict the cutting and thrust forces (Fc, Ft) of the cutting simulation without performing the simulation: instead, the trained model was given the J-C material model coefficients and returned the forces. Multiple machine learning approaches were investigated in order to find the most suitable model with regard to prediction accuracy and explainability. The materials and methods applied in this study are described in the following subsections. As a first step, the fundamentals of the applied FE modeling technique are described. Subsequently, the design of a full factorial simulation plan with varying J-C parameters is introduced. The resulting dataset serves as the basis for the machine learning models applied for cutting force prediction, as described in Section 2.2.
2.1. FEM Modeling
To generate a database, quasi-2D cutting simulations were carried out in Abaqus. It has been shown that cutting simulations can be carried out in 2D if the following geometrical assumptions are met in the cutting process [3]: free machining is ensured, i.e., the tool nose does not participate in the cutting process; the width of the workpiece is smaller than the cutting tool width; the uncut chip thickness is at least five times smaller than the width of cut; and the cutting direction is perpendicular to the cutting edge.
The chosen method for this approach was the coupled Eulerian-Lagrangian (CEL) method, which combines a Eulerian formulation for the workpiece with a Lagrangian formulation for the tool. The CEL method stems from fluid–structure interaction simulation and is especially suitable for modeling processes with a high degree of deformation. This is achieved through the Eulerian mesh, which is fixed in space and allows the material to flow through it, in contrast to a Lagrangian mesh, which deforms with the material over time. This circumvents numerical instabilities resulting from severely distorted meshes. The approach also allows for an indefinitely long simulation of the orthogonal cut, since material is resupplied where the boundaries of the Eulerian mesh and the cutting velocity condition intersect. This is important, since variations in the synthetic J-C models may result in chip morphologies that would otherwise be numerically unstable.
It is shown in [27] that CEL enables good predictions of chip morphology and cutting forces while ensuring better computational results than ALE. In addition, CEL does not encounter mesh distortion problems that could lead to a failed simulation with both ALE and Lagrangian simulation approaches [28].
Since 2D simulations cannot be modeled with the CEL approach in Abaqus [28], the widely adopted quasi-2D approach was utilized, as shown in Figure 1. In this approach, 2D orthogonal cutting conditions were enforced through rigid clamping conditions perpendicular to the cutting plane. Through the definition of a workpiece geometry in the Eulerian mesh, the initial Eulerian volume fraction could be defined. However, it should be noted that the CEL approach is known to be computationally expensive, which is partly rooted in its advection processes, where the material flow in the mesh is corrected per iteration. The structure of the model can be seen in Figure 2.
The elastomechanical and thermal properties of the model are listed in Table 1. In contrast to the J-C material coefficients, these parameters were never varied, in order to limit the number of simulations needed to train a model with sufficient accuracy. The melting temperature is a key parameter for the calculation of the flow stress with the J-C model, but it was not considered in this first iteration of the concept. The simulation employed a Coulomb friction model with a coefficient of friction µ = 0.3.
Meshing of the parts was focused on the cutting region while keeping the model as a whole as small as possible in order to run as many simulations as possible in less time. The mesh consisted of 8100 Eulerian elements with thermal coupling and reduced integration (EC3D8RT) for the workpiece and 1032 Lagrangian elements (C3D8RT), also with thermal coupling and reduced integration, for the tool.
The design of experiments (DOE) was built by systematically varying the J-C parameters around a base model for AISI 1045 [29], as can be seen in Table 2. Steel was chosen as the initial material due to its widespread application across various sectors. Through a Python script, the submission of the input jobs as well as the evaluation of the mean values of Fc and Ft were automated. The DOE was conducted fully factorially, which resulted in a total of 1024 simulations. One simulation took 35 min and 24 s on a workstation, giving a total simulation time of approx. 587 h for the DOE. All simulations were run until an in-simulation total time of 1.5 ms was reached, which was sufficient to reach a steady state of the cutting forces.
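As a sketch of how such a full factorial plan can be generated programmatically: if each of the five J-C parameters is varied over four levels, the full factorial yields 4⁵ = 1024 combinations, matching the number of simulations reported above. The level values below are placeholders, not the levels of Table 2.

```python
import itertools
import pandas as pd

# Placeholder levels; the actual levels are given in Table 2.
levels = {
    "A": [450.0, 550.0, 650.0, 750.0],   # MPa
    "B": [500.0, 600.0, 700.0, 800.0],   # MPa
    "C": [0.005, 0.010, 0.015, 0.020],   # -
    "n": [0.10, 0.20, 0.30, 0.40],       # -
    "m": [0.8, 1.0, 1.2, 1.4],           # -
}
doe = pd.DataFrame(list(itertools.product(*levels.values())),
                   columns=list(levels.keys()))
assert len(doe) == 4**5 == 1024  # one row per Abaqus input job
```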
2.2. Machine Learning Approaches
Multiple approaches were compared with regard to their performance on the test set. Since the dataset was comparatively small, special care was taken to use common regularization techniques for each approach to avoid overfitting (i.e., the training data are learned well, but generalization to unseen data fails [30]). The individual approaches were tested and the best hyperparameters were selected. Finally, all approaches were compared in terms of their Mean Squared Error (MSE) and Mean Absolute Error (MAE) on the test set. The approaches were selected for their prominence in regression tasks or for their training performance improvements, such as LightGBM.
2.2.1. Data Preparation
The data were tabularized and prepared for training. The training environment was built in Python (v. 3.11). The following common libraries were used for data preparation and machine learning: NumPy, Pandas, scikit-learn, TensorFlow, and Keras. The data were split into training and test data at an 80–20% ratio (80% for training and validation, split 70–30%, and 20% for testing). For all approaches, the training data were standardized using

$z = \frac{x - \mu}{\sigma}, \quad \mu = \frac{1}{n}\sum_{i=1}^{n} x_{i}, \quad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_{i} - \mu\right)^{2}}$

with n being the number of samples, x being an observation of the data, µ being the mean of all observations, and σ being the standard deviation. Standardization is beneficial in machine learning for faster convergence. It was also necessary here since the input parameters lie in different ranges and their importance would otherwise be over- or underestimated during training.
For all tree-based approaches, the machine learning framework offers the option to initialize the random state at a defined value for reproducibility. Using this option, hyperparameter tuning was performed.
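The following scikit-learn sketch summarizes the data preparation described above. The file name and column names are hypothetical placeholders for the tabularized simulation results, and random_state = 42 is an arbitrary choice for reproducibility; the subsequent model sketches reuse these arrays.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical results table: one row per simulation, J-C coefficients as
# features and the mean simulated forces as targets.
df = pd.read_csv("doe_results.csv")
X = df[["A", "B", "C", "n", "m"]].values
y = df[["Fc", "Ft"]].values

# 80/20 split into training+validation and test, then 70/30 into training
# and validation; the fixed random state makes the splits reproducible.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.3, random_state=42)

# Standardization z = (x - mu) / sigma, fitted on the training data only.
scaler = StandardScaler().fit(X_train)
X_train, X_val, X_test = (scaler.transform(a) for a in (X_train, X_val, X_test))
```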
2.2.2. Random Forest for Regression
The random forest algorithm is a supervised machine learning approach that uses an ensemble of decision trees and combines them into a strong learner. It does so by creating multiple trees and splitting the nodes with a random selection of features on a random selection of data points (known as bootstrap sampling). Preventing the trees from growing indefinitely prevents overfitting. For classification, the trees vote for the most popular class [31]. The random forest algorithm for regression and classification is explained in detail in [32].
Manual hyperparameter tuning was performed. It was found that, with 20 estimators ("n_estimators") and a maximum depth of 15, the MAE and MSE on the test dataset could no longer be lowered. It has to be noted that setting the maximum depth too high can lead to overfitting.
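A minimal sketch of this configuration, reusing the arrays from the data preparation sketch above; training one regressor per force component mirrors the two separate models discussed in Section 4.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hyperparameters found by the manual tuning described above.
rf_fc = RandomForestRegressor(n_estimators=20, max_depth=15, random_state=42)
rf_fc.fit(X_train, y_train[:, 0])            # cutting force Fc; Ft analogous

pred = rf_fc.predict(X_test)
print(mean_absolute_error(y_test[:, 0], pred),
      mean_squared_error(y_test[:, 0], pred))
```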
2.2.3. Support Vector Regression (SVR)
Support vector machines (SVMs) are trained to separate data through a set of hyperplanes in a high-dimensional parameter space. They can be used for classification and regression. Coupled with a soft-margin hinge loss, data that are not linearly separable can still be fitted by allowing some flexibility and a degree of error in classification or regression. Support vector regression (SVR) is an extension of SVMs that can handle continuous targets. This is done by expanding the hyperplane to a tube of radius ε and defining the ε-insensitive loss function [33]:

$L_{\varepsilon}\left(y, f(x)\right) = \max\left(0, \left|y - f(x)\right| - \varepsilon\right)$
This penalizes every point outside the ε-tube and allows for the fitting of non-linear, multidimensional regression tasks. In the work presented here, the default hyperparameters were used.
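A correspondingly minimal SVR sketch, reusing the arrays from Section 2.2.1; scikit-learn's defaults imply an RBF kernel with C = 1.0 and ε = 0.1, and, since SVR is single-output, one regressor is fitted per force component (a hedged reading of the setup, not a confirmed detail of the study).

```python
from sklearn.svm import SVR

# Default hyperparameters (RBF kernel, C=1.0, epsilon=0.1 in scikit-learn).
svr_fc = SVR().fit(X_train, y_train[:, 0])   # cutting force Fc
svr_ft = SVR().fit(X_train, y_train[:, 1])   # thrust force Ft
```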
2.2.4. Feed Forward Neural Network (FFNN)
Neural networks have gained widespread recognition in the field of image recognition; however, their use in regression tasks is also well established. Depending on the task at hand, different kinds of layers can be introduced to build a specific neural network. For example, dropout layers are a popular regularization scheme to avoid overfitting [34]. The main components of FFNNs are perceptrons, which are connected to each other. These connections have weights, which are adjusted during training [30]. The output is sent through an activation function, which enables the network to capture non-linear behavior [35]. A prominent activation function for non-linear behavior is the rectified linear unit (ReLU).
As displayed in Figure 3, three hidden layers were used, two of them followed by dropout layers. Manual hyperparameter tuning was performed: different learning rates in the range 0.01–0.001 as well as the dropout probabilities of both dropout layers (0.3, 0.5, 0.7) were systematically varied. The lowest MAE and MSE scores were taken for comparison.
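A hedged Keras sketch of the topology in Figure 3 follows: three hidden ReLU layers, the first two followed by dropout layers. The hidden-layer widths, epoch count, and batch size are placeholders of ours, while the dropout probability and learning rate come from the ranges stated above.

```python
from tensorflow import keras

def build_ffnn(dropout=0.5, learning_rate=0.001):
    # Widths (64/64/32) are assumptions; only the layer count follows Figure 3.
    model = keras.Sequential([
        keras.Input(shape=(5,)),               # the five J-C parameters
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dropout(dropout),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dropout(dropout),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1),                 # Fc or Ft
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
                  loss="mse", metrics=["mae"])
    return model

model = build_ffnn(dropout=0.5, learning_rate=0.001)
model.fit(X_train, y_train[:, 0], validation_data=(X_val, y_val[:, 0]),
          epochs=200, batch_size=32, verbose=0)
```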
2.2.5. Extreme Gradient Boosting (XGBoost)
XGBoost is a scalable tree boosting system known for its performance across many different kinds of problems [36]. The most defining aspect contributing to its success is its scalability, which derives from algorithmic advances for handling sparse data as well as parallel computing approaches. However, since gradient-boosted decision trees have to scan all of the available data, they can be computationally intensive, especially for large datasets, which may limit their practicality in real-time or resource-constrained environments.
The model was implemented using the XGBoost package (v. 1.4.2). The defining hyperparameters of the XGBoost model (n_estimators, max_depth, and learning_rate) were also varied. Again, the lowest MAE and MSE scores were taken for further consideration.
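A sketch of such a manual grid search over the three hyperparameters named above, reusing the arrays from Section 2.2.1; the candidate values are illustrative, not the grid of the study, and validation MAE is used for model selection.

```python
from itertools import product
from sklearn.metrics import mean_absolute_error
from xgboost import XGBRegressor

best_model, best_mae = None, float("inf")
# Illustrative candidate values for the three hyperparameters.
for n_est, depth, lr in product([50, 100, 200], [3, 6, 9], [0.01, 0.1, 0.3]):
    model = XGBRegressor(n_estimators=n_est, max_depth=depth,
                         learning_rate=lr, random_state=42)
    model.fit(X_train, y_train[:, 0])                  # cutting force Fc
    mae = mean_absolute_error(y_val[:, 0], model.predict(X_val))
    if mae < best_mae:
        best_model, best_mae = model, mae
```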
2.2.6. Light Gradient Boosting Machine (LightGBM)
LightGBM (v. 4.3.0) improves upon the training performance of gradient-boosted trees by implementing two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) [37]. GOSS prioritizes the observations in the data with a larger information gain and removes a portion of the observations with relatively low gain from the training. Through GOSS, the observations and targets that are difficult to match receive more attention more efficiently. LightGBM also employs leaf-wise growth of the trees, which works well with high-dimensional data.
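A minimal LightGBM sketch, again one regressor per force component and reusing the arrays from Section 2.2.1; the hyperparameters are left at their defaults here, and the feature names are ours. Per-feature importances obtained this way underlie feature maps such as the one discussed in Section 4 (Figure 6).

```python
from lightgbm import LGBMRegressor

features = ["A", "B", "C", "n", "m"]
lgbm_fc = LGBMRegressor(random_state=42)
lgbm_fc.fit(X_train, y_train[:, 0], feature_name=features)  # cutting force Fc

# Split-based feature importances, usable for a feature map as in Figure 6.
importance = dict(zip(features, lgbm_fc.feature_importances_))
```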
4. Discussion
The final models chosen were two separate models for the cutting force and the thrust force, which were able to infer accurate force predictions. Their validity was proven on the test set of the initial dataset as well as on separate data points that were computed to prove validity outside the synthetic data variation. A total of 819 data points were used for training and 205 + 2 for testing. It has to be noted that this number could be lower for future models, since the accuracy of the models may not decrease significantly. However, other aspects of cutting simulations, such as thermomechanical field outputs, chip morphology and contact length, friction and contact, and different workpiece and tool geometries, may increase this value. Using LightGBM, which is a form of decision tree ensemble, the resulting feature map (Figure 6) could be used to infer conclusions about the optimal material properties for cutting processes (for the cutting parameter set described in Section 2.1), since the model accurately captured the behavior of the cutting forces within the training parameter space.
Grey Box Modeling for J-C Parameters and the Cutting Process
Grey box modeling is the fusion of machine learning algorithms (black box models) with analytical or empirical models for increased explainability of the results [39]. In this work, the grey box model was made up of an initial white box model (the FE cutting simulation) for the determination of the cutting forces to generate a dataset. This was succeeded by a black box model, which aimed to capture the non-linear behavior of the simulations in order to make meaningful predictions for material parameter combinations not yet simulated, eliminating the need for further numerical simulations.
An analysis of the obtained dataset yielded insight into the dependency of the simulated cutting forces on the material parameters. This basis of knowledge was further extended by considering the feature importance of the trained model. The initial yield stress at a quasi-static strain rate (parameter A) was generally deemed of low importance for the orthogonal cutting process with a positive rake angle, since both models evaluated this feature importance to be rather low. However, its importance for Ft was twice as high as for Fc, which could be explained by the chip morphology: combined with the 8° rake angle, the chip acts as a spring on the tool. This effect could be exacerbated by higher values of A and B, which can also be seen in the higher feature importance of B for Ft in comparison to Fc.
By contrast, the work hardening exponent n and the thermal softening coefficient m had a greater impact on the cutting force than on the thrust force. This could be traced back to thermal softening and subsequently higher strains and strain rates in the shear plane, which is more closely aligned with the cutting force direction. From the feature importance investigation, it could be concluded that the thermal softening coefficient and the work hardening exponent of the machined material are crucial for a decrease in cutting force.
5. Conclusions
It could be shown that, through machine learning, single aspects of FE modeling can be predicted with good precision. Among the compared models (random forest, SVR, FFNN, XGBoost, and LightGBM), the latter showed the best regression performance. This offers multiple opportunities for speeding up tool development as well as for process-parallel simulation for process monitoring, since model inference is much faster than numerical simulation. Further research has to be conducted on how more parameters (such as friction coefficients and mechanical properties) and different tool geometries can be incorporated. One way to incorporate the geometric dimensions of tools could be to introduce a standardized mesh on which field outputs are sampled and predicted. To limit the number of simulations needed to build a sufficient database, sensitivity analyses with regard to the training of tree-based models have to be conducted. These sensitivity analyses will also include the dependence of the forces on the coarseness of the mesh. The explainability of tree-based models offers new possibilities for understanding the main key features of different cutting tool geometries paired with different tools. This has an immediate impact on tool development cycles and offers opportunities for fast, process-individual tool development. Furthermore, by incorporating chip thickness, contact length, and temperature into the presented models, the inverse parameter identification of the J-C coefficients can be improved, and results can be determined from only a few cutting tests, eliminating the need for additional simulations. In future studies, the authors plan to increase the dataset and introduce more input parameters to the model. In order to decrease the number of input parameters of the regression models, and therefore reduce the dataset required for efficient training, an autoencoder will be utilized to reduce the multidimensional input data to a latent space. In this way, more input parameters can be compressed efficiently.