Article

QSPR Modeling and Experimental Determination of the Antioxidant Activity of Some Polycyclic Compounds in the Radical-Chain Oxidation Reaction of Organic Substrates

1 Faculty of Chemistry, Bashkir State University, 450076 Ufa, Russia
2 Institute of Petrochemistry and Catalysis of the Ufa Federal Research Center of the Russian Academy of Sciences, 450075 Ufa, Russia
* Author to whom correspondence should be addressed.
Molecules 2022, 27(19), 6511; https://doi.org/10.3390/molecules27196511
Submission received: 24 August 2022 / Revised: 28 September 2022 / Accepted: 28 September 2022 / Published: 2 October 2022

Abstract

The present work addresses the quantitative structure–antioxidant activity relationship in a series of 148 sulfur-containing alkylphenols, natural phenols, chromane, betulonic and betulinic acids, and 20-hydroxyecdysone using GUSAR2019 software. Statistically significant valid models were constructed to predict the parameter logk7, where k7 is the rate constant for the oxidation chain termination by the antioxidant molecule. These results can be used to search for new potentially effective antioxidants in virtual libraries and databases and to adequately predict logk7 for test samples. A combination of MNA- and QNA-descriptors with three whole-molecule descriptors (topological length, topological volume, and lipophilicity) was used to develop six statistically significant valid consensus QSPR models, which have a satisfactory accuracy in predicting logk7 for training and test set structures: R2TR > 0.6; Q2TR > 0.5; R2TS > 0.5. Our theoretical prediction of logk7 for antioxidants AO1 and AO2, based on the consensus models, agrees well with the experimental values measured in this work. Thus, the descriptor calculation algorithms implemented in the GUSAR2019 software allowed us to model the kinetic parameters of the reactions underlying the liquid-phase oxidation of organic hydrocarbons.

1. Introduction

In the course of long evolution, people have surrounded themselves with a huge variety of chemical compounds that are used in everyday life. A significant portion of these substances are organic compounds. During use, their properties change under the influence of external conditions (temperature, solar radiation, and many others), which degrades their performance. One of the most important processes leading to the deterioration of performance characteristics is oxidation by atmospheric oxygen. This process, which follows a radical chain mechanism, can proceed as auto-oxidation, where an oxygen molecule abstracts a hydrogen atom from the weakest C–H bond and forms a primary radical, or as initiated oxidation, where initiators present in the reaction medium readily start the oxidation process [1,2,3,4,5,6,7]. This creates a chain process in which labile, highly reactive intermediates, such as radicals of different nature and peroxides, are formed [8,9,10]. As a result, the organic substrate undergoes thermal-oxidative degradation and loses its functional characteristics.
To inhibit unwanted free-radical oxidation processes, small amounts of additives called antioxidants (AOs) are widely used. These are organic compounds of various classes that can react with radicals and peroxidation products and thus either inhibit or significantly slow down the development of free-radical processes in various systems capable of oxidation [1]. For the effective and targeted action of AOs, it is necessary to determine the quantitative characteristics in the form of rate constants for the steps of the mechanism responsible for the inhibition effect. The mechanism of the radical chain oxidation of organic compounds in the presence of AOs is very complex. Therefore, determining these rate constants requires considerable time and the use of unique and expensive physico-chemical experimental methods. Meanwhile, there are known methods of mathematical modeling that can be used to determine the necessary rate constants and do not require the investigation of the reaction mechanism. One such method is quantitative structure–activity relationship (QSAR) / quantitative structure–property relationship (QSPR) modeling, which is widely used to identify lead structures among biologically active compounds, including drugs.
This approach is based on the assumption that the properties of chemical compounds are determined by their structure. The essence of the QSAR/QSPR methods is to describe the structures using correctly chosen descriptors and apply these descriptors in combination with mathematical and statistical methods to build valid QSAR/QSPR models focused on the reliable quantitative prediction of various types of biological activities and physico-chemical properties of organic compounds, respectively.
One of the important advantages of the QSAR methods is that the physico-chemical properties and biological activities can be modeled on the basis of a relatively small number of training set structures (30 compounds).
There are quite a few monographs and papers describing the ideology of the QSAR/QSPR methods and software packages for their practical implementation, as well as the application of the QSAR/QSPR methodology to search for potential drugs and for the safety assessment of chemicals [11,12,13,14,15,16,17,18,19,20,21,22,23]. Numerous studies have shown that the inclusion of QSAR approaches in the development of new hit and lead compounds can significantly reduce the time and material resources required and enable a targeted synthesis of compounds possessing the required set of properties. Several classifications of the QSAR/QSPR methods in demand among researchers have been described in the scientific literature. These classifications are based on the choice of a set of descriptors and of machine learning methods used to construct mathematical equations [24,25,26,27,28,29,30,31,32,33,34,35,36,37]. Detailed discussions of these classifications can be found, for example, in [24,29,38,39,40,41,42,43]. The most widespread of all known QSAR methods are those based on the structural formulas of chemical compounds (2D QSAR), as well as the methods that use a spatial description of the chemical structures (3D QSAR) [24,25,26,38,40]. The use of 3D-QSAR methods is justified for the quantitative analysis of the relationship between the structure and the enzymatic specificity of biologically active compounds. When modeling the quantitative relationship between the structure and the physico-chemical characteristics, the use of 2D QSAR methods is fully adequate.
In the last two decades, research on antioxidants has been concerned, apart from the widely known AOs such as the sterically hindered phenols and aromatic amines, with the synthesis and experimental study of hybrid molecules, such as chromanol conjugates with lupanoic acids, tetrahydroquinoline, and analogues of ecdysteroids with oxygen-containing heteroatoms in the steroidal backbone. Owing to the presence of pharmacophore moieties, these compounds are promising as potential biologically active additives with a wide spectrum of biological activity. In addition, various research groups are currently synthesizing structural analogues of these compounds. The rational design and synthesis of these compounds using modern virtual screening techniques, including QSAR/QSPR modeling, pharmacophore search, and molecular docking, occupies a crucial place among the strategies for selecting the directions for the chemical modification of biologically active substances. The choice of one of these methods depends mainly on the goals of the study. If the ultimate goal of the synthesis is to obtain biologically active substances for further in vitro studies without involving enzymatic systems, then one of the 2D-QSAR/QSPR methods would be the preferable choice.
The name “2D-QSAR/QSPR” stands for the development of QSAR/QSPR models using 2D descriptors. Two-dimensional descriptors are widely employed in QSAR/QSPR modeling owing to the relatively simple calculation algorithms, based on mathematical equations. This method possesses a reproducible operability and does not require large amounts of time and computational resources. Furthermore, 2D descriptors make a huge contribution to the extraction of chemical attributes, and they can also represent, to some extent, three-dimensional molecular features. However, they should by no means be considered as final, since they often suffer from mutual correlation problems, insufficient chemical information, and the lack of interpretation [44].
The benefits of these methods for modeling the reactivity of oxidation inhibitors (antioxidants) in liquid-phase reactions are obvious both for the above-mentioned rational design and from the ideological standpoint, considering kinetic experiment techniques. In particular, reactant molecules and intermediate radicals formed from the reactants are uniformly distributed throughout the reaction system; the reaction system is a homogeneous solution; the liquid-phase oxidation reaction of organic substrates takes place virtually in the whole reaction system; the reaction rate in a homogeneous solution (reaction mixture) is actually determined by the frequency of collisions of the reactants with one another, the solvent nature, and the oxidation substrate; in the reaction medium, the substrate is oxidized by the atmospheric oxygen; the reaction is started by the initiators in the absence of enzymes.
In this connection, before carrying out a synthesis and experimental studies of the antioxidant activity (AOA) of chromanol conjugates with steroidal and nonsteroidal compounds in non-enzymatic model systems, it is advisable to quantify their AOA using the QSPR methodology. Inclusion of these and other hybrid molecules in the training sets in QSPR modeling will expand the range of applicability of QSPR models focused on the prediction of the logk7 parameter in the series of phenolic antioxidants [45,46,47,48,49,50,51,52,53].
One of the programs used to calculate the physico-chemical and structural descriptors, to select the most significant of them, and to build consensus QSPR models based on them is the GUSAR2019 (General Unrestricted Structure Activity Relationships) program and its earlier versions GUSAR2013 and GUSAR2011 [11,12,17,54,55,56,57,58,59,60]. This program has proven efficient in the modeling of various types of biological activities for some heterogeneous organic compounds [54,55,56,57,61,62]. In our pioneering work [62,63,64,65,66], we reported the use of an earlier version of the program, GUSAR2013, for the QSPR modeling of antioxidants in the series of some phenols, amines, uracils, benzopyrans, and benzofurans. In doing so, the statistically significant valid models were constructed to predict the oxidation chain termination rate constants of logk7, in order to search for new potentially effective antioxidants in virtual libraries and databases. However, the training sets used in our previous models for predicting logk7 for phenolic antioxidants did not contain the structures of the chromanol conjugates with lupanoic acids, 20-hydroxyecdysone, and for this reason, they could not be used for the quantitative AOA prediction for the structural analogues of these organic compounds.
The main goal of the present study is to develop statistically significant valid QSPR models for the prediction of the parameter logk7 for biologically active phenolic antioxidants with general formulas I–VIII (Figure 1) (k7 is the rate constant for chain termination by the antioxidant molecule and is actually an objective quantitative characteristic of AOA) and to predict and experimentally determine this quantitative AOA parameter for two promising antioxidants, chromane derivatives. The practical significance of this study should be the applicability of these QSPR models to predict logk7 for the biologically active phenolic derivatives and, hence, an objective selection of such compounds as oxidation inhibitors from virtual and synthetic libraries and databases.

2. Results and Discussion

2.1. Prediction of the Numerical Values of the Parameter k7 Using the GUSAR2019 Program

According to the consensus approach implemented in the GUSAR2019 program, six consensus QSPR models M1–M6 were built to predict the numerical values of logk7 for the phenolic type antioxidants, namely the sulfur-containing alkylphenols, the natural phenols, the chromane derivatives, the betulonic and betulinic acids, and 20-hydroxyecdysone. These models differ in the type of descriptors they contain and the number of partial regression relationships. The descriptive power characteristics of the M1–M6 consensus models, calculated automatically in the GUSAR2019 program by comparing the experimental logk7 values with those predicted by these six models are presented in Table 1. Note that the determination coefficients, the standard deviations, and Fisher’s criterion values presented in Table 1 are the average values obtained taking into account all partial regression models included in the consensus model Mi (i = 1–6).
For a QSPR model to be adequate, that is, to be acceptable for use within its reach, its results must correctly describe and predict the target property. According to the recommendations of the QSAR/QSPR modeling experts, the validation is the most important concept in the development and application of the QSPR models. The validation of the developed QSPR models based on the external test set structures is the “gold standard” that validates the reliability of these models along with the acceptability of each step during their development: the assessment of the input data quality, the diversity of the data sets, the predictivity, the domain of applicability, and interpretability.
In this regard, to objectively characterize the descriptive and predictive powers of the M1–M6 consensus models, we predicted the logk7 values for the antioxidant structures contained in the training sets TR1 and TR2 and the test sets TS1 and TS2. As in our previous studies [67,68,69,70], in addition to the parameters calculated in the GUSAR2019 program (average R2, average Q2, average F), we used metrics based on the R2 determination coefficients (R2, R20, R2’0, average R2m, ΔR2m, Q2F1, Q2F2, CCC) and metrics designed to estimate the prediction errors of the logk7 values (RMSE, MAE, SD) [34,35,36,37]. The metrics based on the prediction error estimates were used to determine the true prediction quality index for the parameter logk7 for the compounds of both test sets. Their calculation was performed using the Xternal Validation Plus 1.2 program [71]. The same program was used to check the models for systematic errors.
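To make the listed metrics concrete, the following minimal sketch computes several of them (R2, Q2F1, Q2F2, CCC, RMSE, MAE, SD) from paired experimental and predicted logk7 values. It is not the Xternal Validation Plus implementation: the function name, the dictionary keys, and the example numbers are hypothetical, the formulas are the commonly used textbook definitions, and SD is assumed here to be the sample standard deviation of the absolute errors.

```python
import math


def external_validation_metrics(y_obs, y_pred, y_train_mean):
    """Common external-validation metrics for paired observed/predicted values.

    y_train_mean is the mean observed response of the training set; it is
    needed only for Q2F1. All names here are illustrative assumptions.
    """
    n = len(y_obs)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_obs = sum(y_obs) / n
    mean_pred = sum(y_pred) / n
    residuals = [o - p for o, p in zip(y_obs, y_pred)]

    press = sum(r * r for r in residuals)                 # sum of squared errors
    ss_obs = sum((o - mean_obs) ** 2 for o in y_obs)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(y_obs, y_pred))

    abs_err = [abs(r) for r in residuals]
    mae = sum(abs_err) / n
    # Assumption: SD is the sample standard deviation of the absolute errors.
    sd = math.sqrt(sum((a - mae) ** 2 for a in abs_err) / (n - 1))

    return {
        "R2": cov ** 2 / (ss_obs * ss_pred),              # squared Pearson correlation
        "Q2F1": 1.0 - press / sum((o - y_train_mean) ** 2 for o in y_obs),
        "Q2F2": 1.0 - press / ss_obs,
        "CCC": 2.0 * cov / (ss_obs + ss_pred + n * (mean_obs - mean_pred) ** 2),
        "RMSE": math.sqrt(press / n),
        "MAE": mae,
        "SD": sd,
    }


if __name__ == "__main__":
    # Hypothetical logk7 values, for illustration only.
    observed = [3.9, 4.4, 4.8, 5.3, 6.1]
    predicted = [4.1, 4.3, 5.0, 5.1, 5.8]
    for name, value in external_validation_metrics(observed, predicted, 4.9).items():
        print(f"{name}: {value:.4f}")
```

Error-based metrics (RMSE, MAE, SD) are reported in logk7 units and are therefore easier to compare with the range of the modeled property than the dimensionless determination coefficients.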
The statistical criteria measuring the descriptive and predictive powers of the M1–M6 QSPR models, which were estimated for 95% of the structures of the training sets TR1, TR2 and test sets TS1, TS2, are presented in Tables 2 and 3. Tables S2–S8 (Supplementary Material) present a complete set of the statistical parameters calculated using the Xternal Validation Plus 1.2 software for the training and test sets TR1, TR2 and TS1, TS2, taking into account both 100% and 95% of the antioxidant structures they contain.
The analysis of the statistical characteristics of the M1–M6 consensus models, summarized in Tables 2 and 3, showed that almost all of the models successfully reproduced the experimental data contained in both training sets (the condition is satisfied for 100% and 95% of the data). Thus, the values of the determination coefficients R2, R20, R2’0, average R2m, and CCC, evaluated by comparing the values of logk7pred and logk7exp, fully met all of the requirements listed in part 2.3 for models with a high descriptive power. The M6 model (100% and 95% of the data) had the highest descriptive power in a number of determination coefficients (R2, R20, CCC). At the same time, other criteria indicated the best reproducibility of the experimental data of test sets TS1 and TS2 (100% and 95% data) using the models M3 (R2’0, ∆R2m), M4 (average R2m), and M5 (R2’0, average R2m). The M3 and M6 models were characterized by the lowest values of the prediction errors of the logk7 value for the structures of both training sets (RMSE, MAE, SD) at 100% and 95% of the data contained in them. At the same time, the best characteristics were in the M3 model (Table 2). The minimum SD value at 100% of the data in both training sets was shown by the M4 model. In the case of 95% of the data in these training sets, the best result was observed for the M6 model. Since the numerical values of the MAE for all of the models were in the range of 0.0599–0.0679, which is significantly lower than 0.706 (10% of the range of the simulated logk7 values), and the numerical values of the MAE+3SD criterion were also significantly smaller than 0.706, we can conclude that almost all of the models had a high descriptive power.
However, the same models were characterized by the rather low values of the different determination coefficients for the comparison of the experimental and predicted logk7 values for the 100% antioxidant structures contained in test sets TS1 and TS2. Thus, the coefficient of determination R2 and its analogs (R20, R2’0) were in the range of 0.4500–0.6882, the CCC criterion ranged from 0.6483 to 0.8086, which allowed us to characterize the prognostic ability of these models as low. At the same time, the most successful predictions, if we focus on these criteria, were observed for the structures of test set TS2. Meanwhile, a more reliable estimate of the predictive power of the M1–M6 models, taking into account 100% of the data in the test sets, can be obtained by analyzing the criteria based on the logk7 prediction errors for the same antioxidant structures. Specifically, the MAE and MAE+3SD criteria ranged from 0.3472 (M6, TS2) to 0.4696 (M5, TS1) and from 1.4586 (M4, TS2) to 2.1112 (M5, TS1). According to these criteria, the models with moderate predictive powers are M1 (TS1), M3 (TS1), and M4–M6 (TS2). Thus, the analysis of prediction errors for antioxidant structures contained in test sets TS1 (100% data) and TS2 (100% data) did not remove the uncertainty factor in assessing the predictive power of the M1–M6 models.
The removal of 5% of the structures from both test sets led to a significant increase in the numerical values of the various types of determination coefficients and a decrease in the logk7 prediction errors for the structures contained in TS1 and TS2.
The numerical value of the R2 criterion increased approximately by 30% and ranged from 0.7289 to 0.8204. The coefficient of determination R20 increased in parallel and was almost in the same range: 0.7263–0.8115. The maximum values of these criteria were found in both cases when the M1 model was used for the prediction tasks in the series of antioxidants contained in test set TS1 (95% of the data). According to the criteria mentioned in Part 2.3, the M5 and M6 models were insignificantly inferior in their predictive power. This fact was established in the prediction of logk7 for the antioxidants included in test set TS2 (95% data). From the analysis of the numerical values of all of the other types of determination coefficients, which are presented in Table 3, we can conclude that in some cases, the M5 model demonstrated the greatest prognostic ability. We reached this conclusion by analyzing CCC, R2’0, and average R2m values for the compounds of test set TS2 (95% data). The highest values of the Q2F1 and Q2F2 criteria were obtained when logk7 was predicted for the structures of the same test set using the M6 model (Table 3). When evaluating the prognostic ability of the M1–M6 models, taking into account the prediction errors of the logk7 values for 95% of the data in test sets TS1, TS2, the most successful predictions were also observed for the test set TS2 structures. The M6 model showed the lowest values of the RMSEP error, the SD standard deviation, and the MAE+3SD criterion. On the same dataset, the M5 model showed the minimum MAE error.
Thus, relying on the set of criteria summarized in Table 3, we can conclude that all of the models had a moderate predictive power in predicting the logk7 values for the antioxidant structures contained in test sets TS1 and TS2. An obvious proof of this fact is the plot depicted in Figure 2, which shows a satisfactory correlation between the experimental and predicted values of logk7 for the structures of test sets TS1 and TS2 (95% data).
The insignificant difference between the numerical values of the various types of determination coefficients, in combination with the acceptable values of the MAE and MAE+3SD parameters summarized in Tables 2 and 3, indicates that valid QSPR models focused on predicting logk7 values for the antioxidants can be constructed using either one particular type of descriptor (QNA or MNA descriptors) or a combination of the descriptors in a consensus approach.
Subsequently, the M1–M6 consensus models were used to predict the numerical values of logk7 for the antioxidants AO1 and AO2. The results of these calculations are summarized in Table 4.
The approximate 95% confidence interval for predicting future data is ±2RMSE if the model is correct and the errors are normally distributed.

2.2. Experimental Determination of the Inhibition Rate Constants k7 for Compounds AO1 and AO2. Methods of the Kinetic Experiment to Determine the Antioxidant Activity of Compounds AO1 and AO2

The synthesis, the physico-chemical properties, and the antioxidant assays of compounds AO1 and AO2 (Figure 3) were reported previously [72]. In the present study, we describe the kinetics of the radical chain oxidation of an organic compound in the presence of additives AO1 and AO2.
The experimental logk7 values for compounds AO1 and AO2 were determined by the manometric method, measuring the absorption of atmospheric oxygen in a model liquid-phase oxidation of 1,4-dioxane initiated by azobis(isobutyronitrile) (AIBN). The experiments were performed according to the standard technique described earlier [72,73,74,75,76,77,78]. The model reaction was carried out in a thermostatically controlled glass reactor loaded with solutions of the initiator (AIBN) and the studied substances in 1,4-dioxane. The temperature of the reaction mixture was 348 K. The reaction mixture was maintained in the thermostat for 5 min. The kinetic curves were measured using a universal manometric differential unit, the design of which was reported earlier [75,76,77,78]. Subsequently, the initial rates of the oxidation of 1,4-dioxane were calculated by the least-squares method from the initial sections of the kinetic curves recorded in the absence and in the presence of compounds AO1 and AO2. The numerical values of the effective inhibition rate constants for compounds AO1 and AO2 were calculated from the degree of the decrease in the initial oxygen uptake rate during the oxidation of 1,4-dioxane. The initiation rate of the oxidative process was constant, Vi = 1 × 10^−7 mol·L^−1·s^−1. It was determined using the equation Vi = 2ekp[AIBN], where kp is the rate constant of the AIBN decay and e is the probability of radical escape into the bulk. For kp, the value measured in cyclohexanol was taken [79]:
$$\log k_p = 17.70 - \frac{35}{4.575\,T \times 10^{-3}}, \quad e = 0.5 \qquad (1)$$
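As a worked illustration of the initiation-rate expression, the sketch below evaluates kp at the experimental temperature of 348 K and applies Vi = 2ekp[AIBN]. The function names are hypothetical, and the AIBN concentration is not given in the text: the value computed here is only the concentration that would be consistent with the stated Vi = 1 × 10^−7 mol·L^−1·s^−1 under these formulas, not a reported experimental quantity.

```python
def aibn_decay_rate_constant(temperature_k):
    """Rate constant of AIBN decay, s^-1, from log kp = 17.70 - 35/(4.575*T*1e-3)."""
    log_kp = 17.70 - 35.0 / (4.575 * temperature_k * 1e-3)
    return 10.0 ** log_kp


def initiation_rate(kp, aibn_conc, escape_probability=0.5):
    """Initiation rate Vi = 2 * e * kp * [AIBN], mol L^-1 s^-1."""
    return 2.0 * escape_probability * kp * aibn_conc


def aibn_concentration_for_rate(target_vi, kp, escape_probability=0.5):
    """Invert Vi = 2 * e * kp * [AIBN] for the initiator concentration, mol/L."""
    return target_vi / (2.0 * escape_probability * kp)


if __name__ == "__main__":
    kp_348 = aibn_decay_rate_constant(348.0)
    # Assumption: this concentration is back-calculated for illustration only.
    implied_aibn = aibn_concentration_for_rate(1e-7, kp_348)
    print(f"kp(348 K) = {kp_348:.3e} s^-1")
    print(f"[AIBN] consistent with Vi = 1e-7: {implied_aibn:.3e} mol/L")
```

Because kp depends exponentially on temperature, keeping the reactor thermostatted is what allows Vi to be treated as constant across runs.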
Since the reaction was performed according to the standard technique [73,74,75,76,77,78], we assumed that the initiated oxidation of 1,4-dioxane proceeded by the radical chain mechanism, which we schematically show in Figure 3 [1,2,3,4,5,6,7].
The antioxidant properties of AO1 and AO2 were studied in the AIBN-initiated radical chain oxidation of 1,4-dioxane in the kinetic regime at 348 K. The typical kinetic curves of the oxygen uptake in the presence of additives of AO1 and AO2 at different concentrations are shown in Figure 4 and Figure 5. In the absence of compounds AO1 and AO2, the kinetic curves of the oxygen uptake in the oxidation of 1,4-dioxane were straight lines, i.e., the reaction order with respect to oxygen was zero. Consequently, the oxidation of 1,4-dioxane proceeded in the kinetic regime. In this case, the chain propagation and termination reactions were run by peroxyl radicals.
As can be seen in Figure 4 and Figure 5, the introduction of the additives of AO1 and AO2 brings about a clear induction period in the kinetic curves of the oxygen uptake, indicating a pronounced antioxidant effect of the studied substances.
Using the Excel 2016 spreadsheet program, we calculated the initial oxidation rates of the model substrate at different concentrations of the added substances. The resulting numerical values are presented in Table 5. As can be seen, the introduction of AO1 or AO2, each taken separately in the concentration range of (0.44–3.13) × 10^−6 mol/L, into 1,4-dioxane being oxidized led to a decrease in the initial oxidation rate. Thus, the qualitative analysis allows us to conclude that both compounds effectively inhibit the oxidation of the model substrate (Figure 4 and Figure 5, Table 5).
The numerical values of the effective inhibition rate constants fk7 for each of the antioxidants were calculated using Equation (2). The condition for the applicability of this equation is a linear dependence of the inhibition parameter F on the concentration of the antioxidant. As can be seen from Figure 6, in the chain oxidation regime in the (0.44–3.13) × 10^−6 mol/L concentration range of AO1 and AO2, the inhibition parameter F, calculated from the initial rates of the inhibited oxidation of 1,4-dioxane by Equation (2), indeed followed a linear dependence on the AO1 and AO2 concentrations:
$$F = \frac{V_0}{V} - \frac{V}{V_0} = \frac{f k_7 [\mathrm{InH}]}{\sqrt{2 k_6 V_i}}, \qquad (2)$$
where V0 and V are the initial rates of oxygen uptake during the oxidation of 1,4-dioxane in the absence and in the presence of each antioxidant taken separately, respectively; [InH] is the concentration of the added antioxidant; k7 and 2k6 are the rate constants of oxidation chain termination by the antioxidant and of quadratic chain termination via peroxyl radicals of the substrate, respectively [1,2,3,4,5,6,7]; [RH] is the concentration of 1,4-dioxane ([RH] = 11.75 mol/L); and k2 is the rate constant of chain propagation for the oxidation of the model substrate (k2 = 7.9 L·mol^−1·s^−1 [2]). When calculating these values, we used the quadratic chain termination rate constant 2k6 = 6.67 × 10^7 L·mol^−1·s^−1 known from the literature [2]. The errors in determining the fk7 and f values were calculated using the Excel 2016 spreadsheet program (Regression tab).
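The way fk7 follows from the slope of F against the antioxidant concentration can be sketched as below. This is a minimal illustration rather than the authors' spreadsheet procedure: it assumes the standard form F = V0/V − V/V0 = fk7[InH]/(2k6·Vi)^(1/2), uses a least-squares line forced through the origin (the text does not say whether an intercept was fitted), and the rates in the usage example are hypothetical numbers, not the data of Table 5.

```python
import math


def inhibition_parameter(v0, v):
    """Inhibition parameter F = V0/V - V/V0 from uninhibited and inhibited rates."""
    return v0 / v - v / v0


def fit_fk7(concentrations, rates, v0, two_k6, vi):
    """Estimate fk7 (L mol^-1 s^-1) from initial rates at several [InH] values.

    Fits F = slope * [InH] through the origin (an assumption), then converts
    the slope using fk7 = slope * sqrt(2k6 * Vi).
    """
    f_values = [inhibition_parameter(v0, v) for v in rates]
    sxy = sum(c * f for c, f in zip(concentrations, f_values))
    sxx = sum(c * c for c in concentrations)
    slope = sxy / sxx
    return slope * math.sqrt(two_k6 * vi)


if __name__ == "__main__":
    # Hypothetical initial rates (mol L^-1 s^-1), for illustration only.
    v0 = 1.0e-6
    concentrations = [0.44e-6, 1.0e-6, 2.0e-6, 3.13e-6]   # mol/L
    rates = [0.90e-6, 0.78e-6, 0.62e-6, 0.50e-6]
    fk7 = fit_fk7(concentrations, rates, v0, two_k6=6.67e7, vi=1e-7)
    print(f"fk7 = {fk7:.3e} L mol^-1 s^-1")
```

With 2k6 = 6.67 × 10^7 L·mol^−1·s^−1 and Vi = 1 × 10^−7 mol·L^−1·s^−1, the conversion factor (2k6·Vi)^(1/2) is about 2.58 s^−1, so the slope of the F plot and fk7 are of the same order of magnitude.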
Fitting the experimental data in the coordinates of Equation (2) gave the effective inhibition rate constants fk7 = (1.32 ± 0.3) × 10^6 M^−1·s^−1 and fk7 = (1.08 ± 0.2) × 10^6 M^−1·s^−1 for compounds AO1 and AO2, respectively. In addition, to determine the numerical value of the stoichiometric inhibition coefficient, we studied the dependence of the induction period, which appeared on the kinetic curves of the oxygen uptake, on the concentrations of AO1 and AO2. As can be seen from Figure 7, the dependence of the induction period τ on the concentrations of AO1 and AO2 is linear. In this case, it is correct to use Equation (3) to determine the stoichiometric inhibition coefficient f:
$$\tau = \frac{f [\mathrm{InH}]}{V_i}, \qquad (3)$$
where τ is the induction period on the kinetic curves of the oxygen uptake during the oxidation of 1,4-dioxane inhibited by AO1 and AO2; Vi is the initiation rate of the oxidation. Conversion of the experimental data in the coordinates of Equation (3) gave the stoichiometric inhibition factors f for the antioxidants AO1 and AO2 to be 30 ± 4 and 40 ± 2, respectively.
The inhibition rate constant k7exp for AO1 and AO2 was calculated by formula (4):
$$k_7^{\mathrm{exp}} = \frac{(f k_7)^{\mathrm{exp}}}{f}. \qquad (4)$$
The numerical values of k7exp for the antioxidants AO1 and AO2 were k7exp = (4.3 ± 1.0) × 10^4 M^−1·s^−1 and k7exp = (2.7 ± 0.5) × 10^4 M^−1·s^−1, respectively.
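The arithmetic behind formula (4) can be checked with a short sketch. It takes the effective constants obtained from Equation (2) to be the products fk7 (1.32 × 10^6 and 1.08 × 10^6 M^−1·s^−1) and the stoichiometric coefficients f from Equation (3) (30 ± 4 and 40 ± 2). The text does not state how the uncertainties were combined; adding relative errors in quadrature is an assumption made here, and the function name is hypothetical.

```python
import math


def k7_from_fit(fk7, fk7_err, f, f_err):
    """Return (k7, k7_err) for k7 = fk7 / f.

    Assumption: independent errors, so relative errors add in quadrature.
    """
    k7 = fk7 / f
    relative_err = math.sqrt((fk7_err / fk7) ** 2 + (f_err / f) ** 2)
    return k7, k7 * relative_err


if __name__ == "__main__":
    inputs = {
        "AO1": (1.32e6, 0.3e6, 30.0, 4.0),
        "AO2": (1.08e6, 0.2e6, 40.0, 2.0),
    }
    for name, args in inputs.items():
        k7, err = k7_from_fit(*args)
        print(f"{name}: k7 = {k7:.2e} +/- {err:.1e} M^-1 s^-1, "
              f"logk7 = {math.log10(k7):.2f}")
```

Under these assumptions the quotients come out at roughly 4.4 × 10^4 and 2.7 × 10^4 M^−1·s^−1, in line with the reported values within rounding.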
The comparative analysis of the calculated logk7pred and the experimental logk7exp values for compounds AO1 and AO2 (Table 4) suggests that the M1–M6 QSPR consensus models have a moderate predictive ability and can be applied to the search and development of new antioxidants. The difference between the predicted and experimentally determined logk7 values for these antioxidants does not exceed the 2RMSEP range.
Thus, all M1–M6 QSPR consensus models are characterized by a high descriptive and moderate predictive power for comparing the experimental and predicted logk7 values for training set structures TR1 and TR2, the external and internal test set structures TS1 and TS2, and compounds AO1 and AO2. These models can be used for the screening of virtual libraries and databases in order to search for new antioxidants in the series of some sulfur-containing alkylphenols, natural phenols, chromane and lupanoic acids, betulonic and betulinic acids, and 20-hydroxyecdysone.
In general, the approach implemented in the GUSAR2019 program, which was previously used only for modeling the biological activity of low-molecular-weight compounds, allows a high degree of reliability in modeling the kinetic characteristics of antioxidants expressed as the k7 parameter. Thus, this program can be recommended as an additional tool in the search for new antioxidants.

3. Research Methods

The simulation procedure was performed for the compounds whose formulas are shown in Figure 1.

3.1. The Methodology of the Computational Experiment

The QSPR modeling of the derivatives of the sulfur-containing alkylphenols, natural phenols, chromane and lupane acids, betulonic and betulinic acids, and 20-hydroxyecdysone with general structural formulas I–VIII (Figure 1) was performed using the GUSAR2019 (General Unrestricted Structure Activity Relationships) software [54,55,56,57,58,59,60,61,62].
The QSPR models were built in several stages, schematically presented in Figure 8.

3.2. Formation of the Training and Test Sets

The training sets TR1, TR2 and test sets TS1, TS2 were based on the array of S1 structures in accordance with the procedure described in our earlier studies [67,68,69,70,76,77,78,79,80,81,82,83,84]. This procedure reflects the rational separation strategy and is presented in Figure 9. The array of the set S1 structures included 148 sulfur-containing alkylphenols, natural phenols, hybrid molecules (conjugates of chromane and lupanoic acids, betulonic and betulinic acids, 20-hydroxyecdysone) with their corresponding logk7 values.
The parameter logk7 was obtained by taking the logarithm of the inhibition rate constant k7 of each simulated antioxidant; the k7 values were measured experimentally and reported in the literature [67,68,69,70,76,77,78,79,80,81,82,83,84]. The inhibition rate constant k7, which we chose as the modeled parameter, reflects the specific rate at which the antioxidants inhibit the liquid-phase oxidation of organic substrates of similar oxidative capacity. In the modeling, it was assumed that the oxidation of the organic substrates in the presence of antioxidants proceeds through the following key steps, which have been studied in detail and described in the literature [1,2,3,4,5,6,7] (Figure 10).
Reactions 1 and 2 are the elementary steps of the oxidation chain propagation, whereas reactions 6 and 7 are chain termination steps that proceed via the recombination of the peroxyl radicals RO2• and via the antioxidant molecule, respectively. The antioxidants included in the S1 data array act by reacting with the peroxyl radical RO2• of the oxidation substrate. As a result, the peroxyl radical, which is active in chain propagation, is replaced by an inactive antioxidant radical. This is the mechanism of the antioxidant activity (AOA) of the simulated compounds. Consequently, the higher the value of the inhibition rate constant k7, the more pronounced the antioxidant properties of the organic compound.
The M1–M3 QSPR models were constructed based on the training set TR1, which included 123 antioxidant structures with their corresponding logk7 values. To test the predictive power of the M1–M3 models, we used the test set TS1, which contained 25 antioxidant structures with their corresponding logk7 values. To form these sets, all structures of the data array S1 were first ranked in ascending order of logk7; every sixth compound was then transferred from S1 to TS1 (a 5:1 split), and the remaining 123 structures formed the training set TR1.
The training set TR2 included 103 antioxidants with their respective logk7 values and was designed to build the M4–M6 QSPR models. The validity of the M4–M6 QSPR models was tested using test set TS2. Both TR2 and TS2 sets were formed on the basis of the training set TR1. In this case, the TR1 set was subjected to a 5:1 split, with the transfer of every sixth compound from TR1 to TS2. The characteristics of the training sets TR1 and TR2 and test sets TS1 and TS2 are presented in Table 6 and Table 7, respectively. The data of these tables indicate that the compounds of the training and test sets are fairly evenly distributed over the entire range of the logk7 variability. At the same time, the AOA of the compounds of the TR1 and TR2 sets varies over a wide range (∆logk7 = 7.06). The range of variability of logk7 for the compounds of the test sets does not go beyond the range Δlogk7 = 7.06. In addition, as can be seen from Figure 1, the training sets are characterized by a high degree of molecular diversity. These conditions are important for building high-quality QSPR models and the correct forecasts based on them [34].
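The two-stage split described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function name `rational_split`, the placeholder compound identifiers, and the synthetic logk7 values are all hypothetical. The text does not say at which position the selection of every sixth compound starts, so `offset` is an assumed parameter; a zero-based offset of 3 is used here only because it reproduces the reported set sizes (123/25 and 103/20).

```python
def rational_split(records, step=6, offset=3):
    """Rank (compound_id, logk7) pairs by logk7 and move every `step`-th one to a test set.

    `offset` (zero-based start position) is an assumption; the source only
    states that every sixth compound of the ranked list is transferred.
    """
    ranked = sorted(records, key=lambda rec: rec[1])      # ascending logk7
    test_idx = set(range(offset, len(ranked), step))      # every sixth position
    train = [rec for i, rec in enumerate(ranked) if i not in test_idx]
    test = [rec for i, rec in enumerate(ranked) if i in test_idx]
    return train, test


# Synthetic stand-in for the S1 array: 148 placeholder compounds, made-up logk7 values.
s1 = [(f"cmpd_{i:03d}", 0.05 * i) for i in range(148)]

tr1, ts1 = rational_split(s1)    # S1  -> TR1 + TS1
tr2, ts2 = rational_split(tr1)   # TR1 -> TR2 + TS2

print(len(tr1), len(ts1), len(tr2), len(ts2))  # 123 25 103 20
```

Because the compounds are ranked before the transfer, both the training and the test sets sample the whole logk7 range, which is the property the tables cited above are meant to demonstrate.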
The structures of the compounds in the training and test sets TR1, TR2 and TS1, TS2 were built using the MarvinSketch 17.22.0 software [85] and then converted into the SDF format using Discovery Studio Visualizer [86].

3.3. Building QSPR Models

The M1–M6 QSPR models were built based on two types of substructural descriptors of atomic neighborhoods: QNA (quantitative neighbourhoods of atoms) and MNA (multilevel neighbourhoods of atoms) [11,12,17,54,55,56,57,58,59,60,61,62]. GUSAR2019 calculates both types of descriptors automatically from the structural formulas of the chemical compounds, taking into account the valence and partial charges of all atoms. Bond types were not taken into account in the calculations. The principles behind the calculation of the QNA and MNA descriptors are described in detail in the Supplementary Material and in previous publications [11,12,17,54,55,56,57,58,59,60,61,62]. Because of the way they are calculated, the QNA descriptors cannot be interpreted physically, and the program therefore does not display them explicitly.
The MNA descriptors are calculated as in the PASS (Prediction of Activity Spectra for Substances) algorithm [17,60], which predicts approximately 6400 “biological activities” with an average prediction accuracy of at least 95%. These descriptors are generated from the structural formulas of the chemical compounds without using any pre-compiled list of structural fragments [11,17,60,87]. They are defined recursively:
  • The zero-level MNA descriptor of each atom is the label A of the atom itself;
  • Any next-level MNA descriptor of the atom is the substructure notation A(D1D2…Di…), where Di is the previous-level MNA descriptor of the i-th immediate neighbor of the atom A.
The neighbor descriptors D1D2…Di… are arranged in a unique order, for example, lexicographically. The MNA descriptors are generated iteratively, which yields structural descriptors covering the first, second, and further neighborhoods of each atom. The atom label carries not only the atom type but also additional information, such as whether the atom belongs to a cyclic or acyclic system.
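The recursive definition above can be illustrated with a short Python sketch. It is deliberately simplified and is not the GUSAR2019 or PASS implementation: the function name `mna_descriptors` and the input format (a list of atom labels plus a neighbor list) are assumptions, the labels carry only the element symbol (no ring or chain marks), and hydrogens are omitted from the example for brevity.

```python
def mna_descriptors(labels, neighbors, level):
    """Return one MNA-style string per atom at the requested level.

    labels    -- atom labels, e.g. ["C", "C", "O"]
    neighbors -- neighbors[i] lists the indices of atoms bonded to atom i
    level     -- number of recursion steps (0 returns the labels themselves)
    """
    current = list(labels)                      # level 0: the atom label itself
    for _ in range(level):
        previous = current
        # Next level: A(D1D2...), with neighbor descriptors sorted lexicographically
        current = [
            labels[atom] + "(" + "".join(sorted(previous[n] for n in neighbors[atom])) + ")"
            for atom in range(len(labels))
        ]
    return current


# Heavy-atom skeleton of ethanol, C-C-O (illustrative input only)
labels = ["C", "C", "O"]
neighbors = [[1], [0, 2], [1]]

level1 = mna_descriptors(labels, neighbors, 1)
level2 = mna_descriptors(labels, neighbors, 2)
print(level1)  # ['C(C)', 'C(CO)', 'O(C)']
print(level2)  # ['C(C(CO))', 'C(C(C)O(C))', 'O(C(CO))']
```

Sorting the neighbor descriptors before concatenation is what makes the resulting string independent of the order in which the atoms happen to be numbered.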
The QSPR model additionally included three descriptors of the whole molecule (topological length, topological volume, and lipophilicity), which were also calculated automatically in the selected program.
To reduce the descriptor space and select the most significant descriptors, we used the approach referred to as Both in the GUSAR2019 program. This new approach, proposed by the developers of GUSAR2019, applies simultaneously two methods of descriptor space reduction previously proposed by the same authors: self-consistent regression (SCR) and its combination with radial basis functions (RBF-SCR). A detailed description of this method can be found in the Supplementary Material and in the relevant publication [60].
The developers of the GUSAR2019 program recommend using the RBF-SCR method to select the descriptors when the training set contains structurally heterogeneous compounds.
The stability of the constructed models was tested by cross-validation with a 20-fold random exclusion of 20% of the compounds from the training sets TR1 and TR2. Both procedures are implemented automatically in GUSAR2019 [11,12,17,54,55,56,57,58,59,60,61,62].
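The idea of a 20-fold random exclusion of 20% of the training compounds can be sketched as follows. This is only an illustration under stated assumptions: a one-descriptor least-squares line stands in for the regression actually used in GUSAR2019, the pooled definition of Q2 (one minus the ratio of summed squared prediction errors to summed squared deviations from the training mean) is assumed, and the names `fit_line` and `q2_lmo` as well as the data are hypothetical.

```python
import random
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def q2_lmo(xs, ys, n_rounds=20, frac_out=0.2, seed=0):
    """Leave-many-out Q2: n_rounds random exclusions of frac_out of the data."""
    rng = random.Random(seed)
    n = len(xs)
    k = max(1, round(frac_out * n))              # compounds excluded per round
    press = ss = 0.0
    for _ in range(n_rounds):
        left_out = set(rng.sample(range(n), k))
        tr_x = [xs[i] for i in range(n) if i not in left_out]
        tr_y = [ys[i] for i in range(n) if i not in left_out]
        slope, intercept = fit_line(tr_x, tr_y)  # refit without the excluded compounds
        mean_tr = fmean(tr_y)
        for i in left_out:
            press += (ys[i] - (slope * xs[i] + intercept)) ** 2
            ss += (ys[i] - mean_tr) ** 2
    return 1.0 - press / ss


# Made-up descriptor and activity values with a small deterministic perturbation.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.3 if i % 2 else -0.3) for i, x in enumerate(xs)]
q2_demo = q2_lmo(xs, ys)
print(round(q2_demo, 3))
```

A Q2 value close to the R2 of the full training set indicates that the model does not depend strongly on any particular 20% of the compounds.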
The four final QSPR models, M1, M2, M4, and M5, were constructed using a consensus approach and included 20 partial regression equations. Several regression equations were combined into one consensus model on the condition of their general similarity. The M1 and M4 models were based on the QNA descriptors and three whole-molecule descriptors reflecting the topological length, topological volume, and lipophilicity of the simulated antioxidant structures. The M2 and M5 models were constructed in the same way, but were based on the MNA descriptors with the automatic addition of the same three whole-molecule descriptors. The M3 and M6 models were also constructed in the same way, but each of them included 320 partial regression equations. Each of these 320 single models was constructed independently of the others, based on the three whole-molecule descriptors with the addition of either the QNA or the MNA descriptors. As noted above, the specific features of their calculation mean that the QNA and MNA descriptors do not lend themselves to an unambiguous physical interpretation, so the regression equations based on them are not displayed explicitly in the GUSAR2019 program. The final predicted logk7 value of a particular compound for a given consensus model was obtained by averaging the logk7 values predicted by the single regression models included in that consensus model.
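The averaging step of the consensus approach can be written compactly. The sketch below assumes a plain, unweighted arithmetic mean (the text does not state whether GUSAR2019 applies weights), and the three partial models and the descriptor vector are invented purely for illustration.

```python
from statistics import fmean


def consensus_prediction(partial_models, descriptors):
    """Average the logk7 predictions of the partial models for one compound."""
    return fmean([model(descriptors) for model in partial_models])


# Three hypothetical partial regression models over a made-up descriptor vector.
partial_models = [
    lambda d: 1.0 + 0.5 * d[0],
    lambda d: 2.0 + 0.5 * d[1],
    lambda d: 0.5 + 1.5 * d[2],
]
descriptors = [4.0, 4.0, 3.0]

print(consensus_prediction(partial_models, descriptors))  # 4.0
```

In a real consensus model the list would hold 20 or 320 fitted regression equations rather than three hand-written functions.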

3.4. Assessment of the Descriptive and Predictive Powers of the QSPR Models

To ensure the consistency of the results, the same standard parameters were chosen to assess the descriptive and predictive powers of the M1–M6 consensus models. The descriptive power of the M1–M6 models was evaluated using metrics based on the determination coefficient R2 (R2, R20, R2’0, average R2m, CCC) and metrics that evaluate the prediction errors of the logk7 values: root mean square error (RMSE), mean absolute error (MAE), and standard deviation (SD) [34,35,36,37]. These statistical parameters were calculated using Xternal Validation Plus 1.2 for 100% and 95% of the data (to account for outliers) in the training and test sets [88]. The formulas by which this program calculates the criteria are given in the Supplementary Material. The internal validation of the M1–M6 models was performed using leave-many-out (LMO) cross-validation (Q2LMO) with a 20-fold exclusion of 20% of the compounds from the training sets.
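For orientation, several of the metrics named above can be computed with the standard textbook formulas, as in the following sketch. It is not a reimplementation of Xternal Validation Plus: the function names are hypothetical, the data are invented, and the use of the sample standard deviation of the absolute errors for SD is an assumed convention. The formulas actually used by the program are those given in the Supplementary Material.

```python
from math import sqrt
from statistics import fmean, stdev


def error_metrics(y_obs, y_pred):
    """Return (MAE, RMSE, SD of absolute errors) for paired values."""
    abs_err = [abs(o - p) for o, p in zip(y_obs, y_pred)]
    mae = fmean(abs_err)
    rmse = sqrt(fmean([e * e for e in abs_err]))
    sd = stdev(abs_err)            # sample SD of absolute errors (assumed convention)
    return mae, rmse, sd


def ccc(y_obs, y_pred):
    """Concordance correlation coefficient."""
    mo, mp = fmean(y_obs), fmean(y_pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(y_obs, y_pred))
    so = sum((o - mo) ** 2 for o in y_obs)
    sp = sum((p - mp) ** 2 for p in y_pred)
    return 2.0 * cov / (so + sp + len(y_obs) * (mo - mp) ** 2)


def q2_external(y_obs, y_pred, reference_mean):
    """Q2F1 if reference_mean is the training-set mean, Q2F2 if it is the test-set mean."""
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss = sum((o - reference_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss


# Invented test-set values and an invented training-set mean.
y_test = [1.0, 2.0, 3.0]
y_hat = [1.5, 2.0, 2.5]
train_mean = 3.0

mae, rmse, sd = error_metrics(y_test, y_hat)
q2f1 = q2_external(y_test, y_hat, train_mean)
q2f2 = q2_external(y_test, y_hat, fmean(y_test))
print(mae, rmse, sd, ccc(y_test, y_hat), q2f1, q2f2)
```

The only difference between Q2F1 and Q2F2 is the reference mean in the denominator, which is why the two values diverge when the test set is centered far from the training set.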
Additionally, the predictive power of the consensus QSPR models was evaluated by comparing their predicted logk7 values with the experimental values of the same parameter for the new promising antioxidants AO1 and AO2, which were not included in the S1 data set (Figure 10).
The threshold values of the validation criteria for models with high descriptive and predictive power were as follows:
  • For 95% of the data of the training set TRi, the numerical values of the determination coefficients R2, R20, R2’0, and the CCC criterion should be close to each other and approach unity;
  • The numerical value of the criterion R2m should exceed 0.85, with ΔR2m < 0.15;
  • The mean absolute error (MAE) should not exceed 10% of the activity range Δlogk7 of the training set TRi;
  • The value of the MAE+3SD parameter (where SD is the standard deviation) should not exceed 10% of the activity range Δlogk7 of the training set TRi;
  • The numerical values of the determination coefficients Q2F1 and Q2F2 (calculated for the test sets) should be close to each other and approach unity.
The quality of the QSPR models was considered low if they met the following criteria:
  • For 95% of the data of the training set TRi, the numerical values of the determination coefficients R2, R20, R2’0, and the CCC criterion did not exceed the threshold value of 0.6;
  • The numerical value of R2m ≤ 0.5 with ΔR2m ≤ 0.2;
  • The mean absolute error (MAE) exceeded 20% of the activity range Δlogk7 of the training set TRi;
  • The value of the MAE+3SD parameter exceeded 25% of the activity range Δlogk7 of the training set TRi;
  • The numerical values of the determination coefficients Q2F1 and Q2F2 (calculated for the test sets) were less than 0.70.
In all other cases, the descriptive and predictive powers of the models were evaluated as moderate, according to the criteria described above.
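The error-based part of these criteria can be expressed as a small decision function. The sketch covers only the MAE and MAE+3SD thresholds quoted in the lists above (10% and 10% of the activity range for high quality, 20% and 25% for low quality). How the individual criteria are combined is not stated in the text, so requiring both conditions for "high" and either condition for "low" is an assumption, as is the function name `classify_by_mae`.

```python
def classify_by_mae(mae, sd, activity_range):
    """Grade prediction quality from MAE and MAE+3SD relative to the activity range."""
    spread = mae + 3.0 * sd
    # High quality: both error measures within 10% of the range (assumed: both required).
    if mae <= 0.10 * activity_range and spread <= 0.10 * activity_range:
        return "high"
    # Low quality: MAE above 20% or MAE+3SD above 25% of the range (assumed: either suffices).
    if mae > 0.20 * activity_range or spread > 0.25 * activity_range:
        return "low"
    return "moderate"


# Activity range of the training sets reported in the text; the error values are invented.
DELTA_LOGK7 = 7.06
print(classify_by_mae(0.2, 0.1, DELTA_LOGK7))   # high
print(classify_by_mae(0.8, 0.2, DELTA_LOGK7))   # moderate
print(classify_by_mae(1.5, 0.2, DELTA_LOGK7))   # low
```

With Δlogk7 = 7.06, the 10%, 20%, and 25% thresholds correspond to 0.706, 1.412, and 1.765 logk7 units, respectively.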

4. Conclusions

The QSPR strategy implemented in the GUSAR2019 program was used to establish a quantitative structure–antioxidant activity relationship for a series of 148 sulfur-containing alkylphenols, natural phenols, chromane, betulonic and betulinic acids, and 20-hydroxyecdysone with the general structural formulas I–VIII. Six statistically significant valid QSPR consensus models were built. The models predicted the parameter logk7 for the training and test set structures with satisfactory accuracy: R2TR > 0.6; Q2TR > 0.5; R2TS > 0.5. All models showed a high performance, as they reproduced the known experimental data for the training sets with a high degree of accuracy. The cross-validation with a 20-fold exclusion of 20% of the training set data also showed good results. The validation of the logk7 predictions for the compounds of the two test sets and for two compounds that were subsequently studied experimentally demonstrated a moderate predictive power of the M1–M6 QSPR models. Although all of the models showed high performance and satisfactory external validation results, we recommend using the M3 and M6 QSPR models for the virtual screening and search for new antioxidants. The M3 and M6 models are based on a combination of different types of descriptors, which ensures the most objective prognostic estimates of logk7.
The satisfactory agreement between the theoretically calculated logk7pred values and the experimentally determined logk7exp values for the compounds of the test sets TS1 and TS2 and for the antioxidants AO1 and AO2 supports the conclusion that the algorithms for calculating and selecting the descriptors, the algorithms for generating the regression equations, and their consensus combination, as implemented in the GUSAR2019 program, allow the correct modeling of the kinetic parameter logk7, which is determined experimentally in model liquid-phase oxidation reactions of organic hydrocarbons.

5. Patents

This work was supported by grant No. 19-73-20073 of the Russian Science Foundation.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/molecules27196511/s1, Supplementary file contains Tables S1–S8. Table S1: The equations for assessing the descriptive and predictive potentials of the QSAR models based on the R2 and MAE metrics; Table S2: The validation parameters of the QSPR models estimated using the Xternal Validation Plus 1.2 program based on the experimental and predicted lgk7 values of the compounds from internal training sets TR1 and TR2; Table S3: The validation parameters of the QSPR models estimated using the Xternal Validation Plus 1.2 program based on the experimental and predicted lgk7 values of the compounds from test sets TS1 and TS2; Table S4: Prediction of the lgk7 values for the TR1 compounds using models M1-M3; Table S5: Prediction of the lgk7 values for the TR2 compounds using models M4-M6; Table S6: Prediction of the lgk7 values for the TS1 compounds using models M1-M3; Table S7: Prediction of the lgk7 values for the TS1 compounds using models M4-M6; Table S8: Prediction of the lgk7 values for the TS2 compounds using models M4-M6.

Author Contributions

Conceptualization, V.K. and I.S.; methodology, V.K.; software, Y.M.; validation, V.K., I.S. and A.G.; formal analysis, R.L. and G.S.; investigation, V.K. and R.S.; resources, V.K. and R.S.; data curation, V.K.; writing: original draft preparation, V.K. and Y.M.; writing: review and editing, A.G. and I.S.; visualization, G.S.; supervision, V.K., R.L. and R.S.; project administration, A.G.; funding acquisition, V.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Russian Science Foundation, grant number 19-73-20073.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare that they have no conflict of interest.

Sample Availability

The structures of the compounds presented in this study are available upon request from the corresponding author.

References

  1. Denisov, E.T.; Denisova, T.G. Handbook of Antioxidants: Bond Dissociation Energies, Rate Constants, Activation Energies, and Enthalpies of Reactions. In Chemistry/Thermodynamics; Denisov, E.T., Denisova, T.G., Eds.; CRC Press: Boca Raton, FL, USA, 1999; p. 312. [Google Scholar]
  2. Denisov, E.T.; Afanas’ev, I.B. Oxidation and Antioxidants in Organic Chemistry and Biology. In Chemistry/Organic Chemistry; Denisov, E.T., Afanas’ev, I.B., Eds.; CRC Press: Boca Raton, FL, USA, 2005; p. 992. [Google Scholar]
  3. Alam, M.N.; Bristi, N.J.; Rafiquzzaman, M. Review on in vivo and in vitro methods evaluation of antioxidant activity. Saudi Pharm. J. 2013, 21, 143–152. [Google Scholar] [CrossRef] [Green Version]
  4. White, P.; Oliveira, R.; Oliveira, A.; Serafini, M.; Araújo, A.; Gelain, D.; Moreira, J.; Almeida, J.; Quintans, J.; Quintans-Junior, L.; et al. Antioxidant Activity and Mechanisms of Action of Natural Compounds Isolated from Lichens: A Systematic Review. Molecules 2014, 19, 14496–14527. [Google Scholar] [CrossRef]
  5. Kahl, R.; Hildebrandt, A.G. Methodology for studying antioxidant activity and mechanisms of action of antioxidants. J. Food Chem. Toxic. 1986, 24, 1007–1014. [Google Scholar] [CrossRef]
  6. Nimse, S.B.; Pal, D. Free radicals, natural antioxidants, and their reaction mechanisms. J. RSC Adv. 2015, 5, 27986–28006. [Google Scholar] [CrossRef] [Green Version]
  7. Apak, R.; Özyürek, M.; Guclu, K.; Capanoglu, E. Antioxidant Activity/Capacity Measurement. 1. Classification, Physicochemical Principles, Mechanisms, and Electron Transfer (ET)-Based Assays. J. Agric. Food. Chem. 2016, 64, 997–1027. [Google Scholar] [CrossRef]
  8. Sorokina, I.V.; Krysin, A.P.; Khlebnikova, T.B.; Kobrin, V.S.; Popova, L.N. Analiticheskij obzor. In Rol’ Fenol’nyh Antioksidantov v Povyshenii Ustojchivosti Organicheskih Sistem k Svobodno-Radikal’nomu Okisleniyu; Sorokina, I.V., Krysin, A.P., Khlebnikova, T.B., Kobrin, V.S., Popova, L.N., Eds.; State Public Science and Technology Library of the Siberian Branch of the Russian Academy of Sciences: Novosibirsk, Russia, 1997; Volume 46, p. 68. [Google Scholar]
  9. Li, Y. Antioxidants in biology and medicine: Essentials, advances, and clinical applications. In Library of Congress Cataloging-in-Publication Data; Li, Y., Ed.; Nova Science Publishers: New York, NY, USA, 2011; p. 431. [Google Scholar]
  10. Rodrigo, R. Oxidative Stress and Antioxidants: Their Role in Human Disease. In Library of Congress Cataloging-in-Publication Data; Rodrigo, R., Ed.; Nova Science Publishers: New York, NY, USA, 2009; p. 374. [Google Scholar]
  11. Lagunin, A.A.; Romanova, M.A.; Zadorozhny, A.D.; Kurilenko, N.S.; Shilov, B.V.; Pogodin, P.V.; Ivanov, S.M.; Filimonov, D.A.; Poroikov, V.V. Comparison of Quantitative and Qualitative (Q)SAR Models Created for the Prediction of Ki and IC50 Values of Antitarget Inhibitors. J. Front. Pharmacol. 2018, 9, 1136. [Google Scholar] [CrossRef]
  12. Taipov, I.A.; Khayrullina, V.R.; Khoma, V.K.; Gerchikov, A.J.; Zarudiy, F.S.; Bege, K. Virtual screening in the row of effective inhibitor of catalytic activity-A4-hydrolase. J. Vestn. Bashkir. Univ. 2012, 17, 886–891. [Google Scholar]
  13. Tarasov, G.P.; Khayrullina, V.R.; Gertchikov, A.J.; Kirlan, S.A.; Zarudiy, P.S. Derivatives of 4-amino-n-[2-(dietilamino) ethyl] benzamids as potentially low-toxic substances with expressed antiarrhythmic action. J. Vestn. Bashkir. Univ. 2012, 17, 1242–1246. [Google Scholar]
  14. Khayrullina, V.R.; Kirlan, S.A.; Gerchikov, A.J.; Zarudiy, F.S.; Dimoglo, A.S.; Kantor, E.A. Modeling of structures of anti-inflammatory heterocyclic compounds with their toxicity. J. Baskir. Khim. Zh. 2010, 17, 76–79. [Google Scholar]
  15. Liu, J.; Pan, D.; Tseng, Y.; Hopfinger, A.J. 4D-QSAR analysis of a series of antifungal p450 inhibitors and 3D-pharmacophore comparisons as a function of alignment. J. Chem. Inf. Comput. Sci. 2003, 43, 2170–2179. [Google Scholar] [CrossRef]
  16. Scior, T.; Medina-Franco, J.L.; Do, Q.-T.; Martínez-Mayorga, K.; Yunes Rojas, J.A.; Bernard, P. How to recognize and workaround pitfalls in QSAR studies: A critical review. J. Curr. Med. Chem. 2009, 16, 4297–4313. [Google Scholar] [CrossRef] [PubMed]
  17. Lagunin, A.A.; Geronikaki, A.; Eleftheriou, P.; Pogodin, P.V. Rational Use of Heterogeneous Data in Quantitative Structure-Activity Relationship (QSAR) Modeling of Cyclooxygenase/Lipoxygenase Inhibitors. J. Chem. Inf. Model. 2019, 59, 713–730. [Google Scholar] [CrossRef] [PubMed]
  18. Roy, K.; Kar, S.; Narayan Das, R. Fundamental Concepts. In A Primer on QSAR/QSPR Modeling; Roy, K., Kar, S., Narayan Das, R., Eds.; Springer: New York, NY, USA, 2015; p. 129. [Google Scholar] [CrossRef]
  19. Dastmalchi, S.; Hamzeh-Mivehroud, M.; Sokouti, B. A Practical Approach. In Quantitative Structure–Activity Relationship; Dastmalchi, S., Hamzeh-Mivehroud, M., Sokouti, B., Eds.; CRC Press: Boca Raton, FL, USA, 2018; p. 115. [Google Scholar] [CrossRef]
  20. Roy, K. Applications in Pharmaceutical, Chemical, Food, Agricultural and Environmental Sciences. In Advances in QSAR Modeling; Roy, K., Ed.; Springer: Jackson, MS, USA, 2017; Volume 24, p. 555. [Google Scholar] [CrossRef]
  21. Jeremić, S.; Radenković, S.; Filipović, M.; Antić, M.; Amić, A.; Marković, Z. Importance of hydrogen bonding and aromaticity indices in QSAR modeling of the antioxidative capacity of selected (poly)phenolic antioxidants. J. Mol. Graph. Model. 2017, 72, 240–245. [Google Scholar] [CrossRef] [PubMed]
  22. Marković, Z.; Filipović, M.; Manojlović, N.; Amić, A.; Jeremić, S.; Milenković, D. QSAR of the free radical scavenging potency of selected hydroxyanthraquinones. Chem. Papers. 2018, 72, 2785–2793. [Google Scholar] [CrossRef]
  23. Kazachenko, A.S.; Akman, F.; Vasilieva, N.Y.; Issaoui, N.; Malyar, Y.N.; Kondrasenko, A.A.; Borovkova, V.S.; Miroshnikova, A.V.; Kazachenko, A.S.; Al-Dossary, O.; et al. Catalytic Sulfation of Betulin with Sulfamic Acid: Experiment and DFT Calculation. Int. J. Mol. Sci. 2022, 23, 1602. [Google Scholar] [CrossRef]
  24. Verma, J.; Khedkar, V.M.; Coutinho, E.C. 3D-QSAR in drug design-a review. J. Curr. Top. Med. Chem. 2010, 10, 95–115. [Google Scholar] [CrossRef]
  25. Kubinyi, H. Theory Methods and Applications. In QSAR in Drug Design; Kubinyi, H., Ed.; Kluwer/Escom: Dordrecht, The Netherlands, 1993; Volume 1, p. 759. [Google Scholar]
  26. Kubinyi, H. QSAR and 3D QSAR in drug design Part 1: Methodology. Drug Discov. Today 1997, 2, 457–467. [Google Scholar] [CrossRef]
  27. Nantasenamat, C.; Prachayasittikul, V.; Isarankura-Na-Ayudhya, C.; Naenna, T. A Practical Overview of Quantitative Structure-Activity Relationship. EXCLI J. 2009, 8, 74–88. [Google Scholar] [CrossRef]
  28. Kubinyi, H. QSAR: Hansch analysis and related approaches. In Methods and Principles in Medicinal Chemistry; Kubinyi, H., Mannhold, R., Krogsgaard-Larsen, P., Timmerman, H., Eds.; Wiley-VCH: Weinheim, Germany, 2008; Volume 1, p. 993. [Google Scholar]
  29. Baskin, I.I. Modeli rovanie «struktura-svojstvo». In Vvedenie v hemoinformatiku; Baskin, I.I., Madzhidov, T.I., Antipin, I.S., Varnek, A.A., Eds.; Nauchno-izdatel’skij centr “Akademiya estestvoznaniya”: Kazan, Russia, 2015; Volume 3, p. 304. [Google Scholar]
  30. Veselovsky, A.V.; Ivanov, S. Strategy of computer-aided drug design. Curr. Drug Targets-Infect. Disord. 2003, 3, 33–40. [Google Scholar] [CrossRef]
  31. Damale, M.G.; Harke, S.N.; Kalam Khan, F.A.; Shinde, D.B.; Sangshetti, J.N. Recent advances in multidimensional QSAR (4D-6D): A critical review. J. Mini-Rev. Med. Chem. 2014, 14, 35–55. [Google Scholar] [CrossRef]
  32. Alexander, D.L.J.; Tropsha, A.; Winkler, D.A. Beware of R2: Simple, Unambiguous Assessment of the Prediction Accuracy of QSAR and QSPR Models. J. Chem. Inf. Model. 2015, 55, 1316–1322. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  33. Ambure, P.; Gajewicz-Skretna, A.; Cordeiro, M.N.D.S.; Roy, K. New Workflow for QSAR Model Development from Small Data Sets: Small Dataset Curator and Small Dataset Modeler. Integration of Data Curation, Exhaustive Double Cross-Validation, and a Set of Optimal Model Selection Techniques. J. Chem. Inf. Model. 2019, 59, 4070–4076. [Google Scholar] [CrossRef] [PubMed]
  34. Dearden, J.C.; Cronin, M.T.D.; Kaiser, K.L.E. How not to develop a quantitative structure-activity or structure-property relationship (QSAR/QSPR). J. SAR QSAR Environ. Res. 2009, 20, 241–266. [Google Scholar] [CrossRef] [PubMed]
  35. Roy, P.P.; Paul, S.; Mitra, I.; Roy, K. On Two Novel Parameters for Validation of Predictive QSAR Models. Molecules 2009, 14, 1660–1701. [Google Scholar] [CrossRef]
  36. Roy, K.; Ambure, P.; Kar, S. How Precise Are Our Quantitative Structure−Activity Relationship Derived Predictions for New Query Chemicals? J. ACS Omega 2018, 3, 11392–11406. [Google Scholar] [CrossRef] [Green Version]
  37. Roy, K.; Kar, S.; Ambure, P. On a simple approach for determining applicability domain of QSAR models. J. Chemom. Intell. Lab. Syst. 2015, 145, 22–29. [Google Scholar] [CrossRef]
  38. Verma, J.; Malde, A.; Khedkar, S.; Iyer, R.; Coutinho, E. Local indices for similarity analysis (LISA)-a 3D-QSAR formalism based on local molecular similarity. J. Chem. Inf. Model. 2009, 49, 2695–2707. [Google Scholar] [CrossRef]
  39. Yanmaz, E.; Sarıpınar, E.; Şahin, K.; Geçen, N.; Çopur, F. 4D-QSAR analysis and pharmacophore modeling: Electron conformational-genetic algorithm approach for penicillins. J. Bioorg. Med. Chem. 2011, 19, 2199–2210. [Google Scholar] [CrossRef]
  40. Hopfinger, A.; Wang, S.; Tokarski, J.; Jin, B.; Albuquerque, M.G.; Madhav, P.J.; Duraiswami, C. Construction of 3D-QSAR Models Using the 4D-QSAR Analysis Formalism. J. Am. Chem. Soc. 1997, 119, 10509–10524. [Google Scholar] [CrossRef]
  41. Lill, M.A. Multi-dimensional QSAR in drug discovery. J. Drug Discov. Today 2007, 12, 1013–1017. [Google Scholar] [CrossRef]
  42. Polanski, J. Receptor Dependent Multidimensional QSAR for Modeling Drug-Receptor Interactions. J. Curr. Med. Chem. 2009, 16, 3243–3257. [Google Scholar] [CrossRef]
  43. Santos-Filho, O.A.; Hopfinger, A.J. Structure-based QSAR analysis of a set of 4-hydroxy-5,6-dihydropyrones as inhibitors of HIV-1 protease: An application of the receptor-dependent (RD) 4D-QSAR formalism. J. Chem. Inf. Model. 2006, 46, 345–354. [Google Scholar] [CrossRef]
  44. Roy, K.; Das, R.N. A review on principles, theory and practices of 2D-QSAR. Curr. Drug Metab. 2014, 15, 346–379. [Google Scholar] [CrossRef] [PubMed]
  45. Voigt, B.; Porzel, A.; Bruhn, C.; Wagner, C.; Merzweiler, K.; Adam, G. Synthesis of 24-epicathasterone and related brassinosteroids with modified side chain. J. Tetrahedron. 1997, 53, 17039–17054. [Google Scholar] [CrossRef]
  46. Nakagawa, Y.; Shimizu, B.; Oikawa, N.; Akamatsu, M.; Nishimura, K.; Kurihara, N.; Ueno, T.; Fujita, T. (Eds.) Classical and Three-dimensional QSAR in agrochemistry. In ACS Symposium Series; American Chemical Society: Washington, DC, USA, 1995; pp. 288–301. [Google Scholar]
  47. Watanabe, B.; Nakagawa, Y.; Miyagawa, H. Synthesis of castasterone/ponasterone hybrid compound and evaluation of its molting hormone-like activity. J. Pestic. Sci. 2003, 28, 188–193. [Google Scholar] [CrossRef] [Green Version]
  48. Watanabe, B.; Nakagawa, Y.; Ogura, T.; Miyagawa, H. Stereoselective synthesis of (22R)- and (22S)-castasterone/ponasterone A hybrid compounds and evaluation of their molting hormone activity. J. Steroids 2004, 69, 483–493. [Google Scholar] [CrossRef]
  49. Savchenko, R.G.; Urmanova, Y.R.; Shafikov, R.V.; Afon’kina, S.R.; Khalilov, L.; Odinokov, V. Regio- and stereodirected transformation of 20-hydroxyecdysone to 2-dehydro-3-epi-20-hydroxyecdysone under ozonization in pyridine. J. Mendeleev Commun. 2008, 18, 191–192. [Google Scholar] [CrossRef]
  50. Savchenko, R.G.; Urasaeva, Y.R.; Galyautdinov, I.V.; Afonkina, S.R.; Khalilov, L.M.; Dolgushin, F.M.; Odinokov, V.N. Synthesis of 7,8α-dihydro-14α-deoxyecdysteroids. J. Steroids 2011, 76, 603–606. [Google Scholar] [CrossRef]
  51. Yingyongnarongkul, B.; Suksamrarn, A. Asymmetric dihydroxylation of stachysterone C: Stereoselective synthesis of 24-epi-abutasterone. J. Tetrahedron 1998, 54, 2795–2800. [Google Scholar] [CrossRef]
  52. Siddall, J.B.; Horn, D.H.S.; Middleton, E.J. Synthetic studies on insect hormones. The synthesis of a possible metabolite of crustecdysone (20-hydroxyecdysone). J. Chem. Commun. 1967, 17, 899–900. [Google Scholar] [CrossRef]
  53. Galbraith, M.N.; Horn, D.H.S.; Middleton, E.J.; Thomson, J.A.; Siddall, J.B.; Hafferl, W. Catabolism of crustecdysone in the blowfly Calliphora stygia. J. Chem. Soc. Chem. Commun. 1969, 19, 1134–1135. [Google Scholar] [CrossRef]
  54. Khayrullina, V.R.; Gerchikov, A.Y.; Lagunin, A.A.; Zarudii, F.S. Quantitative Analysis of Structure−Activity Relationships of Tetrahydro-2H-isoindole Cyclooxygenase-2 Inhibitors. J. Biokhimiya 2015, 80, 74–86. [Google Scholar] [CrossRef] [PubMed]
  55. Khairullina, V.R.; Akbasheva, Y.Z.; Gimadieva, A.R.; Mustafin, A.G. Analysis of the relationship «structure-activity» in the series of certain 5-ethyluridine derivatives with pronounced anti-herpetic activity. J. Vestn. Bashk. Univ. 2017, 22, 960–965. [Google Scholar]
  56. Khairullina, V.R.; Gerchikov, A.Y.; Lagunin, A.A.; Zarudii, F.S. QSAR modeling of thymidilate synthase inhibitors in a series of quinazoline derivatives. J. Pharm. Chem. 2018, 51, 884–888. [Google Scholar] [CrossRef]
  57. Khairullina, V.R.; Gimadieva, A.R.; Gerchikov, A.J.; Mustafin, A.G.; Zaarudii, F.S. Quantitative structure–activity relationship of the thymidylate synthase inhibitors of Mus musculus in the series of quinazolin-4-one and quinazolin-4-imine derivatives. J. Mol. Graph. Modell. 2018, 85, 198–211. [Google Scholar] [CrossRef]
  58. Zakharov, A.V.; Lagunin, A.A.; Filimonov, D.A.; Poroikov, V.V. Quantitative prediction of antitarget interaction profiles for chemical compounds. J. Chem. Res. Toxicol. 2012, 25, 2378–2385. [Google Scholar] [CrossRef] [Green Version]
  59. Filimonov, D.A.; Zakharov, A.V.; Lagunin, A.A.; Poroikov, V.V. QNA based “Star Track” QSAR approach. J. SAR QSAR Environ. Res. 2009, 20, 679–709. [Google Scholar] [CrossRef]
  60. Zakharov, A.V.; Peach, M.L.; Sitzmann, M.; Nicklaus, M.C. A New Approach to Radial basis function approximation and Its application to QSAR. J. Chem. Inf. Model. 2014, 54, 713–719. [Google Scholar] [CrossRef]
  61. Martynova, Y.Z.; Khairullina, V.R.; Biglova, Y.N.; Mustafin, A.G. Quantitative structure-property relationship modeling of the C60 fullerene derivatives as electron acceptors of polymer solar cells: Elucidating the functional groups critical for device performance. J. Mol. Graph. Model. 2019, 88, 49–61. [Google Scholar] [CrossRef]
  62. Martynova, Y.Z.; Khairullina, V.R.; Gimadieva, A.R.; Mustafin, A.G. QSAR-Modeling of desoxyuridine triphosphatase inhibitors in a series of some derivatives of uracil. J. Biomed. Chem. 2019, 65, 103–113. [Google Scholar] [CrossRef] [Green Version]
  63. Martynova, Y.Z.; Khairullina, V.R.; Nasretdinova, R.N.; Garifullina, G.G.; Mitsukova, D.S.; Gerchikov, A.Y.; Mustafin, A.G. Determination of the chain termination rate constants of the radical chain oxidation of organic compounds on antioxidant molecules by the QSPR method. J. Russ. Chem. Bull. Int. Ed. 2020, 69, 1679–1691.
  64. Khairullina, V.; Safarova, I.; Sharipova, G.; Martynova, Y.; Gerchikov, A. QSAR Assessing the Efficiency of Antioxidants in the Termination of Radical-Chain Oxidation Processes of Organic Compounds. Molecules 2021, 26, 421.
  65. Martynova, Y.Z.; Khairullina, V.R.; Garifullina, G.G.; Mitsukova, D.S.; Zarudiy, F.S.; Mustafin, A.G. QSAR-modeling of the relationship “structure-antioxidative activity” in a series of some benzopirane and benzofurane derivatives. J. Vestn. Bashk. Univ. 2019, 24, 573–580.
  66. Martynova, Y.Z.; Khairullina, V.R.; Gerchikov, A.Y.; Zarudiy, F.S.; Mustafin, A.G. QSPR-modeling of antioxidant activity of potential and industrial used stabilizers from the class of substituted alkylphenols. J. Vestn. Bashk. Univ. 2020, 25, 723–730.
  67. Khomchenko, A.S. Cerosoderzhashchie proizvodnye na osnove 3-(4-gidroksi(metoksi)aril)-1-galogenpropanovi 2,6-dimetilfenola: Sintez i antiokislitel’naya aktivnost’. Ph.D. Thesis, Novosibirsk State Pedagogical University, Novosibirsk, Russia, 2010.
  68. Boiko, M.A. Vzaimosvyaz’ elektrohimicheskoj aktivnosti alkil- i tio(amino)alkilzameshchennyh fenolov s ih stroeniem, kislotnymi i protivookislitel’nymi svojstvami. Ph.D. Thesis, Novosibirsk State Pedagogical University, Novosibirsk, Russia, 2006.
  69. Boiko, M.A.; Terakh, E.I.; Prosenko, A.E. Interrelation between the Electrochemical Activity of Alkyl- and Thioalkylphenols and Their Antioxidant Action. Russ. J. Phys. Chem. 2006, 80, 1225–1230.
  70. Kandalintseva, N.V. Gidrofil’nye hal’kogensoderzhashchie proizvodnye alkilirovannyh fenolov: Sintez, svojstva, antiokislitel’naya i biologicheskaya aktivnost’. Doctor of Science Thesis, Novosibirsk State Pedagogical University, Novosibirsk, Russia, 2020.
  71. Xternal Validation Plus. Available online: https://sites.google.com/site/dtclabxvplus (accessed on 22 August 2022).
  72. Sharipova, G.M.; Safarova, I.V.; Khairullina, V.R.; Gerchikov, A.Y.; Zimin, Y.S.; Savchenko, R.G.; Limantseva, R.M. Kinetics and mechanism of antioxidant action of polysubstituted tetrahydroquinolines in liquid-phase oxidation reactions of organic compounds by oxygen. Int. J. Chem. Kin. 2022, 54, 1–9.
  73. Roginskij, V.A. Fenol’nye antioksidanty: Reaktsionnaya sposobnost’ i effektivnost’. In Institut himicheskoj fiziki AN SSSR; Roginskij, V.A., Ed.; Nauka: Moscow, Russia, 1988; p. 246.
  74. Roginsky, V.; Lissi, E.A. Review of methods to determine chainbreaking antioxidant activity in food. J. Food Chem. 2005, 92, 235–254.
  75. Khayrullina, V.R.; Gerchikov, A.J.; Ilina, E.A.; Drevko, J.B.; Isaeva, A.Y.; Drevko, B.I. Antioxidant properties of some 7,8-benzo-5,6-dihydro(4H)selenochromene derivatives. J. Kinet. Catal. 2013, 54, 14–17.
  76. Khairullina, V.R.; Gerchikov, A.Y.; Urazaeva, Y.R.; Savchenko, R.G.; Odinokov, V.N. Antioxidant Properties of Conjugates of 20-Hydroxyecdysone Derivatives with a Polysubstituted Chromanylaldehyde. J. Kin. Kat. 2010, 51, 502–506.
  77. Khairullina, V.R.; Gerchikov, A.Y.; Safarova, A.B.; Khalitova, R.R.; Spivak, A.Y.; Shakurova, E.R.; Odinokov, V.N. Antioxidant Properties of Conjugates of Triterpenic Acids with Amido Derivatives of Trolox. J. Kin. Kat. 2011, 52, 186–191.
  78. Denisov, E.T.; Denisova, T.G. The reactivity of natural phenols. J. Russ. Chem. Rev. 2009, 78, 1047–1073.
  79. Garifullina, G.G.; Sakhautdinova, G.F.; Malikova, R.N.; Sattarova, A.F.; Shalashova, A.V.; Nasretdinova, R.N. Antioxidant Activity of Some Terpenoids in The Model Reaction of Ethylbenzene Oxidation. J. Vestn. Bashkir. Univ. 2019, 24, 835–841.
  80. Khairullina, V.R.; Gerchikov, A.Y.; Denisova, S.B. Comparative Study of the Antioxidant Properties of Selected Flavonols and Flavanones. J. Kin. Kat. 2010, 51, 234–239.
  81. Dyubchenko, O.I. Sintez, svojstva i antiokislitel’naya aktivnost’ gidroksiarilalkilaminov i ih proizvodnyh. Ph.D. Thesis, Novosibirsk State Pedagogical University, Novosibirsk, Russia, 2005.
  82. Boiko, M.A.; Terakh, E.I.; Prosenko, A.E. Relationship between the Electrochemical and Antioxidant Activities of Alkyl-Substituted Phenols. J. Kin. Kat. 2006, 47, 677–681.
  83. Prosenko, A.E.; Dyubchenko, O.I.; Terakh, E.I.; Markov, A.F.; Gorokh, E.A.; Boiko, M.A. Synthesis and Investigation of Antioxidant Properties of Alkylated Hydroxybenzyl Dodecyl Sulfides. J. Pet. Chem. 2006, 46, 283–288.
  84. Prosenko, A.E.; Markov, A.F.; Khomchenko, A.S.; Boiko, M.A.; Terakh, E.I.; Kandalintseva, N.V. Synthesis and Antioxidant Activity of Alkyl 3-(4-Hydroxyaryl)propyl Sulfides. J. Pet. Chem. 2006, 46, 442–446.
  85. MarvinSketch. Available online: https://chemaxon.com/download/marvin-suite (accessed on 22 August 2022).
  86. DiscoveryStudioVisualiser. Available online: https://www.3ds.com (accessed on 22 August 2022).
  87. Lagunin, A.; Zakharov, A.; Filimonov, D.; Poroikov, V. QSAR Modelling of Rat Acute Toxicity on the Basis of PASS Prediction. J. Mol. Inform. 2011, 30, 241–250.
  88. Roy, K.; Das, R.N.; Ambure, P.; Aher, R.B. Be aware of error measures. Further studies on validation of predictive QSAR models. J. Chemom. Intell. Lab. Syst. 2016, 152, 18–33.
Figure 1. General structural formulas of the modeled antioxidant inhibitors (Pht = CH2[CH2CH2CH(CH3)CH2]3H): I, VIII (phenol derivatives), II–V (chromone derivatives), VI (20-hydroxyecdysone derivatives with a chroman-2-yl moiety), VII (triterpenoid derivatives with a chroman-2-yl moiety).
Figure 2. Plot of the predicted vs. the experimental activities based on the M3 and M6 models.
Figure 3. Structures of the compounds designated by AO1 and AO2.
Figure 4. Typical kinetic curves of the oxygen uptake during the oxidation of 1,4-dioxane in the absence (1) and in the presence of AO1 at the following concentrations, mol/L: 0.44 × 10⁻⁶ (2); 1.24 × 10⁻⁶ (3); 1.88 × 10⁻⁶ (4); 2.50 × 10⁻⁶ (5); 3.13 × 10⁻⁶ (6). T = 348 K, Vi = 1 × 10⁻⁷ M/s.
Figure 5. Typical kinetic curves of the oxygen uptake during the oxidation of 1,4-dioxane in the absence (1) and in the presence of AO2 at the following concentrations, mol/L: 0.44 × 10⁻⁶ (2); 0.94 × 10⁻⁶ (3); 1.25 × 10⁻⁶ (4); 1.88 × 10⁻⁶ (5); 3.13 × 10⁻⁶ (6). T = 348 K, Vi = 1 × 10⁻⁷ M/s.
Figure 6. Dependence of the inhibition efficiency parameter on the concentration of AO1 and AO2. Vi = 1 × 10⁻⁷ M/s, T = 348 K.
Figure 7. Dependence of the induction period on the initial concentration of the injected inhibitor. T = 348 K, Vi = 1 × 10⁻⁷ M/s.
Figure 8. Schematic representation of the GUSAR algorithm.
Figure 9. Construction of the training and test sets for the M1–M6 models in the design of the QSPR consensus models (S is set, TR and TS are training and test sets, M is the model, N is the number of compounds included in the corresponding sets and arrays). Designations: (1) S1 is the overall data set; (2) S2 is the training set TR1 for the M1–M3 models; (3) S3 is the external test set TS1 for the M1–M6 models; (4) S4 is the training set TR2 for the M4–M6 models; (5) S5 is the internal test set TS2 for the M4–M6 models.
Figure 10. Mechanism of the inhibited radical-chain oxidation of organic compounds. I is the initiator of the oxidation process; r is the radical formed upon the decay of the initiator I; RH is the oxidation substrate; R is the radical formed when the initiator radical r abstracts a hydrogen atom from the substrate molecule; RO2 is the peroxyl radical formed by the reaction of the substrate radical R with an oxygen molecule; InH is the antioxidant (inhibitor); In is the radical formed when the substrate peroxyl radical RO2 abstracts a hydrogen atom from the antioxidant molecule.
Table 1. Statistical parameters and the accuracy of the predicted logk7 values of the compounds included in the training sets TR1, TR2 within the M1–M6 consensus models (using Both). ∆logk7(TR1) = ∆logk7(TR2) = 7.057 ¹.

| Descriptors | Training Set | Model | N | NPM | Mean R² | Mean F | Mean SD | Mean Q² | V |
|---|---|---|---|---|---|---|---|---|---|
| QNA | TR1 | M1 | 123 | 20 | 0.968 | 7.675 | 0.548 | 0.760 | 29 |
| QNA | TR2 | M4 | 103 | 20 | 0.962 | 7.337 | 0.587 | 0.740 | 24 |
| MNA | TR1 | M2 | 123 | 20 | 0.968 | 7.008 | 0.550 | 0.763 | 29 |
| MNA | TR2 | M5 | 103 | 20 | 0.964 | 7.891 | 0.578 | 0.756 | 22 |
| QNA and MNA | TR1 | M3 | 123 | 320 | 0.976 | 8.708 | 0.512 | 0.802 | 28 |
| QNA and MNA | TR2 | M6 | 103 | 320 | 0.973 | 8.057 | 0.551 | 0.787 | 23 |

¹ N is the number of structures in the training set; NPM is the number of regression equations used for the consensus model; mean R² is the determination coefficient calculated for the compounds of TRi; mean Q² is the correlation coefficient calculated for the training set by leave-one-out cross-validation; mean F is Fisher’s criterion; mean SD is the standard deviation; V is the number of variables in the final regression equation.
Table 2. Validation parameters of the QSPR models estimated using the Xternal Validation Plus 1.2 program based on the experimental and predicted logk7 values of the compounds of the internal training sets TR1 and TR2. ∆logk7(TR1) = ∆logk7(TR2) = 7.057 ¹.

| Comments | Prediction Parameters | M1 (TR1) | M2 (TR1) | M3 (TR1) | M4 (TR2) | M5 (TR2) | M6 (TR2) |
|---|---|---|---|---|---|---|---|
| Classical metrics (after removing 5% of the data with high residuals) | R² | 0.9868 | 0.9849 | 0.9896 | 0.9887 | 0.9850 | 0.9925 |
| | R²₀ | 0.9845 | 0.9837 | 0.9876 | 0.9870 | 0.9839 | 0.9903 |
| | R′²₀ | 0.9236 | 0.9338 | 0.9317 | 0.9353 | 0.9366 | 0.9364 |
| | Mean R²m | 0.9384 | 0.9496 | 0.9454 | 0.9419 | 0.9532 | 0.9414 |
| | ∆R²m | 0.0141 | 0.0149 | 0.0113 | 0.0132 | 0.0146 | 0.0099 |
| | CCC | 0.9916 | 0.9912 | 0.9932 | 0.9932 | 0.9912 | 0.9947 |
| Mean absolute error and standard deviation for the test set (after removing 5% of the data with high residuals) | RMSE | 0.1090 | 0.1107 | 0.0975 | 0.1128 | 0.1098 | 0.0976 |
| | MAE | 0.0855 | 0.0894 | 0.0765 | 0.0924 | 0.0879 | 0.0773 |
| | SD | 0.0679 | 0.0656 | 0.0607 | 0.0650 | 0.0661 | 0.0599 |
| | MAE + 3SD | 0.2892 | 0.2862 | 0.2586 | 0.2873 | 0.2861 | 0.2570 |
| - | Prediction quality | Good | Good | Good | Good | Good | Good |
| - | Presence of systematic errors | Absent | Absent | Absent | Absent | Absent | Absent |

¹ R², R²₀, and R′²₀ are the determination coefficients calculated with and without taking into account the origin; mean R²m is the averaged determination coefficient of the regression function calculated using the values of determination coefficients on the ordinate axis (R²m) and on the abscissa axis (R′²m), respectively; ∆R²m is the difference between R²m and R′²m; CCC is the concordance correlation coefficient; MAE is the mean absolute error; SD is the standard deviation.
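The error and agreement measures named in the footnote above can be computed from paired experimental and predicted values. The following Python sketch is illustrative only: the function name `validation_metrics` and the sample data are hypothetical, R² is taken here as the squared Pearson correlation, SD as the sample standard deviation of the absolute errors, and CCC as Lin's concordance correlation coefficient. The exact conventions of the Xternal Validation Plus program, including the removal of 5% of the data with high residuals, are not reproduced.

```python
import math
import statistics


def validation_metrics(y_obs, y_pred):
    """Return R2, RMSE, MAE, SD, MAE+3SD and CCC for paired values."""
    n = len(y_obs)
    mean_obs = statistics.fmean(y_obs)
    mean_pred = statistics.fmean(y_pred)
    # Sums of squares and cross-products around the two means
    s_oo = sum((o - mean_obs) ** 2 for o in y_obs)
    s_pp = sum((p - mean_pred) ** 2 for p in y_pred)
    s_op = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(y_obs, y_pred))
    abs_err = [abs(o - p) for o, p in zip(y_obs, y_pred)]
    mae = statistics.fmean(abs_err)
    sd = statistics.stdev(abs_err)  # assumption: sample (n - 1) standard deviation
    return {
        "R2": s_op ** 2 / (s_oo * s_pp),
        "RMSE": math.sqrt(sum(e ** 2 for e in abs_err) / n),
        "MAE": mae,
        "SD": sd,
        "MAE+3SD": mae + 3 * sd,
        "CCC": 2 * s_op / (s_oo + s_pp + n * (mean_obs - mean_pred) ** 2),
    }


# Hypothetical logk7 values, not taken from the data set of the paper
observed = [3.2, 4.1, 4.8, 5.5, 6.3]
predicted = [3.4, 4.0, 4.9, 5.3, 6.4]
metrics = validation_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

As a consistency check on the tabulated values, for M1 in Table 2 the sum MAE + 3SD is 0.0855 + 3 × 0.0679 = 0.2892, which matches the reported entry.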
Table 3. Validation parameters of the QSPR models estimated using the Xternal Validation Plus 1.2 program based on the experimental and predicted logk7 values of the compounds of test sets TS1 and TS2. ∆logk7(TR1) = ∆logk7(TR2) = 7.057; ∆logk7(TS1) = 4.009; ∆logk7(TS2) = 3.148 ¹.

| Comments | Prediction Parameters | M1 (TS1) | M2 (TS1) | M3 (TS1) | M4 (TS1) | M5 (TS1) | M6 (TS1) | M4 (TS2) | M5 (TS2) | M6 (TS2) |
|---|---|---|---|---|---|---|---|---|---|---|
| Classical metrics (after removing 5% of the data with high residuals) | R² | 0.8204 | 0.7364 | 0.7715 | 0.7696 | 0.7289 | 0.7807 | 0.7765 | 0.8125 | 0.8071 |
| | R²₀ | 0.8115 | 0.7342 | 0.7701 | 0.7621 | 0.7263 | 0.7741 | 0.7739 | 0.7936 | 0.8013 |
| | R′²₀ | 0.5555 | 0.4652 | 0.5346 | 0.4750 | 0.4466 | 0.5005 | 0.6304 | 0.7650 | 0.7064 |
| | Q²F1 | 0.9538 | 0.9390 | 0.9525 | 0.4750 | 0.9367 | 0.9468 | 0.9567 | 0.9608 | 0.9621 |
| | Q²F2 | 0.7966 | 0.7312 | 0.7627 | 0.7473 | 0.7212 | 0.7656 | 0.7679 | 0.7896 | 0.7969 |
| | Mean R²m | 0.6798 | 0.6010 | 0.6726 | 0.6191 | 0.5892 | 0.6332 | 0.6934 | 0.7434 | 0.7353 |
| | ∆R²m | 0.1673 | 0.2191 | 0.1803 | 0.2046 | 0.2249 | 0.1964 | 0.1316 | 0.0319 | 0.0653 |
| | CCC | 0.8763 | 0.8371 | 0.8600 | 0.8433 | 0.8293 | 0.8563 | 0.8775 | 0.8998 | 0.8970 |
| Mean absolute error and standard deviation for the test set (after removing 5% of the data with high residuals) | RMSE | 0.4133 | 0.4750 | 0.4186 | 0.4606 | 0.4838 | 0.4436 | 0.3870 | 0.3685 | 0.3620 |
| | MAE | 0.3146 | 0.3309 | 0.2986 | 0.3296 | 0.3442 | 0.3129 | 0.2945 | 0.2719 | 0.2740 |
| | SD | 0.2740 | 0.3485 | 0.3000 | 0.3289 | 0.3476 | 0.3215 | 0.2580 | 0.2555 | 0.2431 |
| | MAE + 3SD | 1.1367 | 1.3763 | 1.1985 | 1.3164 | 1.3871 | 1.2773 | 1.0684 | 1.0383 | 1.0032 |
| - | Prediction quality | Good | Good | Good | Good | Good | Good | Good | Good | Good |
| - | Presence of systematic errors | Absent | Absent | Absent | Absent | Absent | Absent | Absent | Absent | Absent |

¹ R², R²₀, and R′²₀ are the determination coefficients calculated with and without taking into account the origin; mean R²m is the averaged determination coefficient of the regression function calculated using the determination coefficients on the ordinate axis (R²m) and on the abscissa axis (R′²m), respectively; ∆R²m is the difference between R²m and R′²m; CCC is the concordance correlation coefficient; MAE is the mean absolute error; SD is the standard deviation.
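The footnote does not define the Q²F1 and Q²F2 entries of Table 3. A minimal Python sketch of the definitions commonly used in the QSAR validation literature is given below, on the assumption that these are the quantities the program reports: both subtract the ratio of the prediction error sum of squares to a reference sum of squares from one, with Q²F1 referenced to the training-set mean and Q²F2 to the test-set mean. The function names and the four test values are hypothetical; only the value 3.529 is taken from Table 6 and is assumed here to be the training-set mean.

```python
def q2_f1(y_test, y_pred, train_mean):
    """External Q2 referenced to the mean response of the training set."""
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_pred))
    ss_ref = sum((o - train_mean) ** 2 for o in y_test)
    return 1.0 - press / ss_ref


def q2_f2(y_test, y_pred):
    """External Q2 referenced to the mean response of the test set."""
    test_mean = sum(y_test) / len(y_test)
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_pred))
    ss_ref = sum((o - test_mean) ** 2 for o in y_test)
    return 1.0 - press / ss_ref


# Hypothetical test-set values; the training mean is assumed from Table 6
y_test = [4.6, 5.0, 5.2, 5.6]
y_pred = [4.8, 4.9, 5.4, 5.5]
f1 = q2_f1(y_test, y_pred, train_mean=3.529)
f2 = q2_f2(y_test, y_pred)
print(f"Q2F1 = {f1:.4f}, Q2F2 = {f2:.4f}")
```

Under these definitions Q²F1 exceeds Q²F2 whenever the training mean lies far from the test mean, because the reference sum of squares grows. This is consistent with the pattern in Table 3, given the training value of 3.529 in Table 6 and the test means of 5.106 and 5.117 in Table 7.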
Table 4. Prediction of logk7 for antioxidants AO1 and AO2, based on the M1–M6 models.

| Model | Applicability (AD) | logk7pred, AO1 | logk7pred, AO2 | logk7exp ¹, AO1 | logk7exp ¹, AO2 | ∆logk7 ², AO1 | ∆logk7 ², AO2 | 2RMSEP (95%) ³ |
|---|---|---|---|---|---|---|---|---|
| M1 | in AD | 5.21 | 5.10 | 4.64 | 4.43 | 0.57 | 0.67 | 0.83 |
| M2 | in AD | 4.79 | 5.32 | 4.64 | 4.43 | 0.15 | 0.89 | 0.95 |
| M3 | in AD | 5.17 | 5.21 | 4.64 | 4.43 | 0.53 | 0.78 | 0.84 |
| M4 | in AD | 5.25 | 5.23 | 4.64 | 4.43 | 0.61 | 0.80 | 0.92 |
| M5 | in AD | 5.07 | 5.20 | 4.64 | 4.43 | 0.43 | 0.77 | 0.97 |
| M6 | in AD | 5.19 | 5.15 | 4.64 | 4.43 | 0.55 | 0.72 | 0.89 |

¹ The experimental determination of logk7 for compounds AO1 and AO2 is described in Section 3; ² ∆logk7 = logk7pred − logk7exp; ³ The maximum values of the RMSEP were taken; multiplying this criterion by two gives the confidence interval with 95% probability (relative to the predicted value of logk7, if the model is correct and the errors are normally distributed, which was observed in our computational experiments) [32].
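The arithmetic behind Table 4 can be checked directly. The short Python sketch below recomputes ∆logk7 = logk7pred − logk7exp from the tabulated values and tests whether each deviation falls within the 2RMSEP interval of the corresponding model. The names `EXPERIMENTAL`, `PREDICTED`, and `check_predictions` are hypothetical; the numbers are those of Table 4.

```python
# Experimental logk7 values from Table 4
EXPERIMENTAL = {"AO1": 4.64, "AO2": 4.43}

# Model: (predicted logk7 for AO1, predicted logk7 for AO2, 2RMSEP), from Table 4
PREDICTED = {
    "M1": (5.21, 5.10, 0.83),
    "M2": (4.79, 5.32, 0.95),
    "M3": (5.17, 5.21, 0.84),
    "M4": (5.25, 5.23, 0.92),
    "M5": (5.07, 5.20, 0.97),
    "M6": (5.19, 5.15, 0.89),
}


def check_predictions(predicted, experimental):
    """Return (model, compound, delta, within_interval) for every prediction."""
    rows = []
    for model, (ao1, ao2, two_rmsep) in predicted.items():
        for compound, value in (("AO1", ao1), ("AO2", ao2)):
            delta = round(value - experimental[compound], 2)
            rows.append((model, compound, delta, abs(delta) <= two_rmsep))
    return rows


results = check_predictions(PREDICTED, EXPERIMENTAL)
for model, compound, delta, within in results:
    print(f"{model} {compound}: delta = {delta:.2f}, within 2RMSEP: {within}")
```

With the tabulated inputs, every recomputed deviation reproduces the ∆logk7 column and lies inside the 2RMSEP interval of its model.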
Table 5. Dependence of the initial oxidation rate of ethylbenzene on the concentration of AO1 and AO2; Vi = 1 × 10⁻⁷ M/s, T = 348 K.

| [AO1]·10⁶, mol/L | V0·10⁶, M/s | [AO2]·10⁶, mol/L | V0·10⁶, M/s |
|---|---|---|---|
| 0.00 | 2.30 | 0.00 | 2.36 |
| 0.44 | 1.86 | 0.44 | 1.95 |
| 1.24 | 1.66 | 0.94 | 1.89 |
| 1.88 | 1.53 | 1.25 | 1.77 |
| 2.50 | 1.20 | 1.88 | 1.53 |
| 3.13 | 1.13 | 3.13 | 1.44 |
Table 6. Statistical characteristics of the training sets TR1, TR2.

| Designation of TRi | TR1 | TR2 |
|---|---|---|
| N | 123 | 103 |
| logk7 | 3.529 | 3.529 |
| ∆logk7 | 7.057 | 7.057 |
| Thresholds used to evaluate the model’s forecast | | |
| 0.10 × ∆logk7 | 0.706 | 0.706 |
| 0.15 × ∆logk7 | 1.059 | 1.059 |
| 0.20 × ∆logk7 | 1.411 | 1.411 |
| 0.25 × ∆logk7 | 1.764 | 1.764 |
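The thresholds in Table 6 are fixed fractions of the training-set range ∆logk7 = 7.057. The Python sketch below reproduces them and shows one way they could be applied, assuming the MAE-based criteria of Roy et al. [88]: a prediction is rated good if MAE ≤ 0.10 × range and MAE + 3SD ≤ 0.20 × range, bad if MAE > 0.15 × range or MAE + 3SD > 0.25 × range, and moderate otherwise. The function names are hypothetical, and this is a sketch of those criteria rather than the code of the Xternal Validation Plus program.

```python
TRAINING_RANGE = 7.057  # Delta logk7 of TR1 and TR2 (Table 6)


def thresholds(training_range):
    """Return the four range-based thresholds, rounded as in Table 6."""
    return {f: round(f * training_range, 3) for f in (0.10, 0.15, 0.20, 0.25)}


def prediction_quality(mae, mae_plus_3sd, training_range):
    """Classify predictions with the assumed MAE-based criteria."""
    if mae <= 0.10 * training_range and mae_plus_3sd <= 0.20 * training_range:
        return "Good"
    if mae > 0.15 * training_range or mae_plus_3sd > 0.25 * training_range:
        return "Bad"
    return "Moderate"


limits = thresholds(TRAINING_RANGE)
print(limits)

# MAE and MAE + 3SD of model M5 on test set TS1 (Table 3)
quality_m5_ts1 = prediction_quality(0.3442, 1.3871, TRAINING_RANGE)
print(quality_m5_ts1)
```

M5 on TS1 has the largest MAE + 3SD in Table 3 (1.3871), which is still below 0.20 × ∆logk7 = 1.411, in agreement with the "Good" rating reported there.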
Table 7. Statistical characteristics of test sets TS1, TS2.

| Designation of TSi | TS1 | TS2 |
|---|---|---|
| N | 25 | 20 |
| Mean lgk7 | 5.106 | 5.117 |
| ∆logk7 | 4.009 | 3.148 |
| Distribution of the observed response values of test sets TSi around the test mean (in %) | | |
| Mean lgk7 ± 0.5, % | 32.000 | 35.000 |
| Mean lgk7 ± 1.0, % | 64.000 | 70.000 |
| Mean lgk7 ± 1.5, % | 88.000 | 95.000 |
| Mean lgk7 ± 2.0, % | 96.000 | 100.000 |
| Distribution of the observed response values of test sets TSi around the training mean (in %) | | |
| Mean lgk7 ± 0.5, % | 8.000 | 10.000 |
| Mean lgk7 ± 1.0, % | 32.000 | 30.000 |
| Mean lgk7 ± 1.5, % | 44.000 | 45.000 |
| Mean lgk7 ± 2.0, % | 68.000 | 70.000 |
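The distribution rows of Table 7 give the share of test-set responses lying within a fixed distance of a reference mean. A minimal Python sketch of that count is shown below. The function name `percent_within` and the four sample values are hypothetical, the interval is assumed to include its end points, and 3.529 is the training-set value from Table 6.

```python
def percent_within(values, center, half_width):
    """Percentage of values lying in [center - half_width, center + half_width]."""
    inside = sum(1 for v in values if abs(v - center) <= half_width)
    return 100.0 * inside / len(values)


# Hypothetical test-set logk7 values, not taken from the paper
sample = [4.2, 5.0, 5.4, 6.8]
sample_mean = sum(sample) / len(sample)

for half_width in (0.5, 1.0, 1.5, 2.0):
    around_test = percent_within(sample, sample_mean, half_width)
    around_train = percent_within(sample, 3.529, half_width)
    print(f"±{half_width}: {around_test:.1f}% (test mean), {around_train:.1f}% (training mean)")
```

Because TS1 contains 25 compounds, each compound contributes 4%, so the entry 32.000% corresponds to 8 compounds; for TS2, with 20 compounds, 35.000% corresponds to 7 compounds.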
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Khairullina, V.; Martynova, Y.; Safarova, I.; Sharipova, G.; Gerchikov, A.; Limantseva, R.; Savchenko, R. QSPR Modeling and Experimental Determination of the Antioxidant Activity of Some Polycyclic Compounds in the Radical-Chain Oxidation Reaction of Organic Substrates. Molecules 2022, 27, 6511. https://doi.org/10.3390/molecules27196511