2.1. First Round of LASSO Regression
The results of the first round of LASSO regression (Table 1, Table 2 and Table 3) are presented in the paragraphs below.
For the percentage of C14:0 in total FAs (C14:0) (Table 1), its ratio to C16:0 (C14:0/C16:0), and its percentage in the biomass in terms of ash free dry weight (AFDW) (C14:0%AFDW), the selected variables were potassium concentration (cK), phosphate concentration (cPO4), CO2 partial pressure (pCO2), 4-day average CO2 partial pressure (pCO2_4DaysAv), temperature (T), 2-day average aeration rate in volumes of air per working volume per minute, or vvm (VVM_2DaysAv), 4-day average dissolved CO2 concentration (CO2aq_4DaysAv), light intensity (LI), and 4-day average photon flux per volume (LV_4DaysAv). All of these variables had a positive effect, except for VVM_2DaysAv and LV_4DaysAv.
For the percentage of C16:0 in total FAs (C16:0) and its percentage in the biomass in terms of ash free dry weight (C16:0%AFDW), the selected variables were boron concentration (cBOH3), cobalt concentration (cCo), cK, nitrate as a nitrogen source (NsourceNO3), 3-day average nitrate concentration (cNO3_3DaysAv), 2-day average urea concentration (cUrea_2DaysAv), sodium concentration (cNa), 4-day average phosphate concentration (cPO4_4DaysAv), sulfate concentration (cSO4), 3-day average CO2 partial pressure (pCO2_3DaysAv), T, 3-day average temperature (T_3DaysAv), the presence of aeration (Aeration), VVM_2DaysAv, pH (pH), 4-day average pH (pH_4DaysAv), CO2aq_4DaysAv, LI, light period (LP), and 4-day average biomass concentration in terms of ash free dry weight (DW_4DaysAv). cBOH3, cCo, cNO3_3DaysAv, cUrea_2DaysAv, cNa, cPO4_4DaysAv, cSO4, and VVM_2DaysAv had a negative effect, while the rest of the variables displayed a positive effect.
For the percentage of C16:1 in total FAs (C16:1), its ratio to C16:0 (C16:1/C16:0), and its percentage in the biomass in terms of ash free dry weight (C16:1%AFDW), the selected variables were iron concentration (cFe), cK, manganese concentration (cMn), magnesium concentration (cMg), cNO3_3DaysAv, 4-day average total nitrogen concentration (totalN_4DaysAv), cPO4_4DaysAv, cSO4, silicon concentration (cSi), pCO2_4DaysAv, T, T_3DaysAv, Aeration, pH_4DaysAv, CO2aq_4DaysAv, 2-day average dissolved bicarbonate concentration (HCO3_2DaysAv), LI, photon flux per volume (LV), LP, and DW_4DaysAv. All of these variables had a positive effect, except for cK, T_3DaysAv, and HCO3_2DaysAv.
For the percentage of C18:1 in total FAs (C18:1), its ratio to C16:0 (C18:1/C16:0), and its percentage in the biomass in terms of ash free dry weight (C18:1%AFDW), the selected variables were copper concentration (cCu), cFe, cMn, 2-day average nitrate concentration (cNO3_2DaysAv), 4-day average nitrate concentration (cNO3_4DaysAv), cUrea_2DaysAv, 3-day average sodium concentration (cNa_3DaysAv), 4-day average phosphate concentration (cPO4_4DaysAv), 2-day average CO2 partial pressure (pCO2_2DaysAv), Aeration, 2-day average dissolved CO2 concentration (CO2aq_2DaysAv), LI, 3-day average photon flux per volume (LV_3DaysAv), LV_4DaysAv, LP, and DW_4DaysAv. cFe, cNO3_2DaysAv, cNO3_4DaysAv, cUrea_2DaysAv, and cPO4_4DaysAv had a negative effect, while the rest of the variables displayed a positive effect.
For the percentage of C18:2 in total FAs (C18:2), its ratio to C16:0 (C18:2/C16:0), and its percentage in the biomass in terms of ash free dry weight (C18:2%AFDW), the selected variables were calcium concentration (cCa), cCu, cMn, NsourceNO3, urea as a nitrogen source (NsourceUrea), nitrate concentration (cNO3), 4-day average urea concentration (cUrea_4DaysAv), cPO4_4DaysAv, 4-day average temperature (T_4DaysAv), Aeration, 2-day average pH (pH_2DaysAv), LV, LV_4DaysAv, and 2-day average biomass concentration in terms of ash free dry weight (DW_2DaysAv). All of these variables had a positive effect, except for cCa, NsourceNO3, Aeration, and pH_2DaysAv, which had a negative effect.
For the percentage of C20:5, n-3 in total FAs (C20:5n3), its ratio to C16:0 (C20:5n3/C16:0), and its percentage in the biomass in terms of ash free dry weight (C20:5n3%AFDW), the selected variables were cBOH3, cNO3_3DaysAv, 4-day average sodium concentration (cNa_4DaysAv), cPO4_4DaysAv, cSi, salinity (S), T, Aeration, CO2aq_4DaysAv, LI, and LP. T, Aeration, CO2aq_4DaysAv, LI, and LP had a negative effect, while the rest of the variables showed a positive effect.
In summary, the first round of LASSO regression removed most of the initial variables. The remaining variables were related either to well-established growth parameters, such as temperature, pH, CO2 partial pressure, light intensity, or nutrient concentration, or to less studied parameters, such as metal concentration and light period. One variable that emerged as important is the presence of aeration, which exerted a negative effect on both C18:2 and C20:5, n-3. The full dataset of results of the first LASSO regression round is provided in the Supplementary Materials (S2).
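To illustrate how a LASSO penalty removes variables and leaves the sign of each retained effect readable, the following is a minimal sketch in plain Python. It is not the implementation used in this study: the data are invented, the column named "unrelated" is a hypothetical placeholder, the penalty value of 0.3 is arbitrary, and the solver is a textbook coordinate descent with soft-thresholding.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def standardize(column):
    """Center a column and scale it to unit (population) variance."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / sd for v in column]


def lasso_coordinate_descent(columns, y, penalty, sweeps=50):
    """Minimise (1/2n)*||y - Xb||^2 + penalty*||b||_1 for standardized columns."""
    n = len(y)
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]  # residual while all coefficients are zero
    coefs = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Covariance of column j with the residual that excludes column j.
            rho = sum(c * (r + coefs[j] * c) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, penalty)  # unit-variance columns: no rescaling
            delta = new - coefs[j]
            if delta != 0.0:
                residual = [r - delta * c for c, r in zip(col, residual)]
                coefs[j] = new
    return coefs


# Invented two-level design (assumption): eight runs, three candidate variables.
names = ["T", "VVM_2DaysAv", "unrelated"]
raw = [
    [-1, -1, -1, -1, 1, 1, 1, 1],
    [-1, -1, 1, 1, -1, -1, 1, 1],
    [-1, 1, -1, 1, -1, 1, -1, 1],
]
noise = [-0.1, 0.1, 0.1, -0.1, 0.1, -0.1, -0.1, 0.1]
# Invented response: rises with T, falls with VVM_2DaysAv, ignores "unrelated".
y = [10 + 2.0 * t - 1.5 * v + e for t, v, e in zip(raw[0], raw[1], noise)]

columns = [standardize(col) for col in raw]
coefs = lasso_coordinate_descent(columns, y, penalty=0.3)

for name, b in zip(names, coefs):
    if b == 0.0:
        print(name, "removed")
    else:
        print(name, "kept,", "positive" if b > 0 else "negative", "effect")
```

In this toy design the columns are mutually orthogonal, so each LASSO coefficient is simply the least-squares coefficient shrunk toward zero by the penalty, and a variable with no relation to the response is set exactly to zero.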
2.3. Second Rounds of LASSO and Stepwise Regression
The second round of LASSO regression included the main effects and interactions present in the models resulting from the first round of stepwise regression. This resulted in a reduced number of main effects and interactions. The main effects that were present in the models, either individually or as part of an interaction, were included in the second and final round of stepwise regression. However, only the most notable interactions (as described in the Materials and Methods section) were included to prevent overfitting. The second round of stepwise regression further reduced the number of terms, resulting in sparse, interpretable models. Only the results for the final models (Table 4, Table 5, Table 6, Table 7, Table 8 and Table 9) are presented below (the full dataset is provided in Supplementary Materials S3), while the results of the second round of LASSO are provided in the Supplementary Materials (S2).
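The rule that a main effect is carried forward when it appears either on its own or inside an interaction can be sketched as follows. This is only an illustration: the term list is hypothetical, and writing an interaction as two variable names joined by ":" is an assumed notation, not one taken from the models reported here.

```python
def main_effects_from_terms(terms, sep=":"):
    """Collect every variable that appears alone or inside an interaction."""
    effects = []
    for term in terms:
        for name in term.split(sep):
            if name not in effects:  # keep first-seen order, drop repeats
                effects.append(name)
    return effects


# Hypothetical terms surviving a second LASSO round (assumption).
surviving_terms = ["T", "T:cPO4", "cK:cPO4"]
candidates = main_effects_from_terms(surviving_terms)
print(candidates)  # ['T', 'cPO4', 'cK']
```

Here cPO4 and cK enter the candidate list even though neither survived as a standalone term, which keeps each interaction interpretable alongside its parent main effects.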
C14:0 was notably influenced by T, exhibiting a positive effect (mean p-value < 0.05). A secondary factor (trimmed p-value < 0.05) was the positive interaction of T with cPO4. cPO4 also contributed to the model with a positive coefficient, albeit with a p-value exceeding 0.05.
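The distinction between a mean p-value and a trimmed mean p-value follows the definitions given in the Materials and Methods section. As a rough sketch only, assuming that p-values for a term are collected over repeated model fits and that the trimmed mean discards a fixed fraction of the smallest and largest values, the two summaries can diverge as shown below. The p-values and the 10% trimming fraction are invented for illustration.

```python
def trimmed_mean(values, trim_fraction):
    """Mean after dropping trim_fraction of the values from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number dropped per tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


# Hypothetical p-values of one term across ten repeated fits (assumption).
p_values = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.04, 0.04, 0.05, 0.60]

mean_p = sum(p_values) / len(p_values)
trimmed_p = trimmed_mean(p_values, trim_fraction=0.10)

print("mean p-value below 0.05:", mean_p < 0.05)
print("trimmed mean p-value below 0.05:", trimmed_p < 0.05)
```

In this invented case a single outlying fit pushes the plain mean above 0.05, while the trimmed mean stays below it, which is the pattern described in the text as a secondary factor.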
Conversely, cPO4 was absent from the model for C14:0/C16:0, but its interaction with cK was notable (p-value < 0.05) with a positive coefficient. The only other term in this model, besides the intercept, was cK, which also had a positive coefficient.
The model for C14:0%AFDW was primarily influenced by variables related to aeration rate and CO2 supply. Both pCO2_4DaysAv and VVM_2DaysAv exhibited notable negative effects (mean p-value < 0.05), while their interaction had a notable positive effect. CO2aq_4DaysAv and its interaction with pCO2_4DaysAv also had a negative effect (p-value >> 0.05).
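When two main effects are negative but their interaction is positive, the net effect of either variable depends on the level of the other. A generic two-variable form makes this explicit. The symbols below are placeholders (for instance, x_1 for pCO2_4DaysAv and x_2 for VVM_2DaysAv) and are not the fitted coefficients of this study:

```latex
\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12}\, x_1 x_2,
\qquad
\frac{\partial \hat{y}}{\partial x_1} = \beta_1 + \beta_{12}\, x_2
```

With \beta_1 < 0 and \beta_{12} > 0, the marginal effect of x_1 becomes less negative as x_2 increases and would change sign at x_2 = -\beta_1 / \beta_{12}, provided that value lies within the range of x_2 covered by the data.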
Parameters with positive mean coefficients in the model for C16:0 included cK, NsourceNO3 and its interaction with T_3DaysAv, NO3_3DaysAv, Aeration, LP, the interactions between pH and both T (p-value < 0.05) and LI, as well as the interactions of Aeration with both LI and LP (trimmed mean p-value < 0.05). Negative influences were observed for cNa, cPO4_3DaysAv, VVM_2DaysAv (trimmed mean p-value < 0.05), LI, DW_4DaysAv, and the interactions of NO3_3DaysAv with both cNa (trimmed mean p-value < 0.05) and VVM_2DaysAv.
For C16:0%AFDW, the positive effect of cK, the negative effect of cPO4_3DaysAv, the negative effect of VVM_2DaysAv, and the positive effect of the interaction between Aeration and LI were notable (mean p-value < 0.05). A secondary factor (trimmed mean p-value < 0.05) was the positive effect of the interaction between pCO2_3DaysAv and VVM_2DaysAv.
The model for C16:1 contained, apart from an intercept, only NO3_3DaysAv and LV, both of which had a negative effect (p >> 0.05), as well as their interaction, which had a notable negative effect (mean p-value < 0.05).
Conversely, the model for C16:1/C16:0 contained the largest number of terms of all the models presented in this article, the most notable of which were the negative interaction between cMg and LV (p-value < 0.05), the interaction between T and pH_4DaysAv (trimmed mean p-value < 0.05), which presented a negative effect, and cSi (trimmed mean p-value < 0.05), which had a positive effect. Other terms with a positive effect (p > 0.05) were cMn and cMg, cPO4_4DaysAv and its interaction with NO3_3DaysAv, pCO2_4DaysAv, T_3DaysAv and its interaction with pH_4DaysAv, pH_4DaysAv, HCO3_2DaysAv, and the interaction between cFe and NO3_3DaysAv. The remaining terms had a negative effect (p > 0.05) and included totalN_4DaysAv as well as its interactions with both HCO3_2DaysAv and LV, T and its interaction with pH_4DaysAv, CO2aq_4DaysAv, and the interaction between Aeration and LP.
C16:1%AFDW was influenced by many of the same parameters as C16:1/C16:0, such as cMg and cPO4_4DaysAv, which, in this case, had negative effects (p-value > 0.05). cSi also appears with the opposite (negative) effect, which, as in the case of C16:1/C16:0, is moderately important (trimmed mean p-value < 0.05). NO3_3DaysAv and its interaction with pH_4DaysAv presented moderately important effects (trimmed mean p-value < 0.05), negative and positive, respectively. pH_4DaysAv had a notable negative effect (mean p-value < 0.05), while the interactions between cSi and LI, and between pCO2_4DaysAv and DW_4DaysAv, had notable positive effects (mean p-value < 0.05). LI and DW_4DaysAv showed negative effects (p-value > 0.05).
The most important (mean p-value < 0.05) terms for C18:1 were NO3_3DaysAv, pCO2_2DaysAv, and the interaction between Aeration and LP, all of which had a positive effect. The interaction between Aeration and LV_4DaysAv also had a positive effect (trimmed mean p-value < 0.05), while CO2aq_2DaysAv had a negative effect (trimmed mean p-value < 0.05). Other negative influences (p-value > 0.05) were those of NO3_2DaysAv and NO3_3DaysAv, CO2aq_2DaysAv, LI, and LV_3DaysAv. The remaining terms had positive effects (p-value > 0.05) and included cPO4_4DaysAv, DW_4DaysAv, and the interaction between cNa_3DaysAv and LV_3DaysAv.
Aeration dominated the model of C18:1/C16:0 with its positive interactions (mean p-value < 0.05) with both LV_4DaysAv and LP, while cNa_3DaysAv also had a positive effect (trimmed mean p-value < 0.05). The interaction between cNa_3DaysAv and LP had a positive effect (p-value > 0.05), while negative influences (p-value > 0.05) were observed for CO2aq_2DaysAv and the interactions of cNO3_2DaysAv and cNO3_4DaysAv with LI.
Aeration was also important for C18:1%AFDW with a notable positive effect (mean p-value < 0.05), while the interaction of pCO2_2DaysAv with LV_3DaysAv was also notable. The interaction between LV_4DaysAv and DW_4DaysAv was moderately important (trimmed mean p-value < 0.05) with a positive effect. Other terms included in the model were cNO3_2DaysAv and cNO3_4DaysAv, both with a negative effect, LV_4DaysAv and DW_4DaysAv (both with negative effect), pCO2_3DaysAv (positive effect), CO2aq_2DaysAv (negative effect), LI (positive effect), and the positive interactions of DW_4DaysAv with both LV_3DaysAv and LV_4DaysAv.
The model of C18:2 was primarily influenced by cCa, Aeration, and the interaction between cMn and T_4DaysAv, with the first two having a significant (mean p-value << 0.05) negative effect and the third showing a positive effect (mean p-value << 0.05). cPO4_4DaysAv was also included in the model and had a positive effect (p-value > 0.05).
The negative effects of cCa (mean p-value << 0.05) and the interaction between Aeration and pH_2DaysAv (mean p-value < 0.05), as well as the positive effect of cNO3 (mean p-value << 0.05), were the most important effects in the model for C18:2/C16:0, while cMn had a moderately important positive effect (trimmed mean p-value < 0.05). cPO4_4DaysAv and the interaction between cMn and T_4DaysAv had positive effects, while Aeration had a negative effect, all with p-values larger than 0.05.
The interactions between cMn and T_4DaysAv (mean p-value << 0.05) and between LV and DW_2DaysAv (mean p-value < 0.05), both of which were positive, had the most notable effects on C18:2%AFDW. DW_2DaysAv was included in the model with a moderately important positive effect (trimmed mean p-value < 0.05), while its interactions with both LV_4DaysAv and cMn also had a positive effect (p-value > 0.05). LV, LV_4DaysAv, and the interaction between cCa and pH_2DaysAv had negative effects (p-value > 0.05).
Similar to the case of C18:2, Aeration was a notable term for C20:5n3, with a negative effect (mean p-value < 0.05). The other notable term was T, also with a negative effect. LP had a moderately important negative effect (trimmed mean p-value < 0.05), while the interaction between NO3_3DaysAv and S had a positive effect (trimmed mean p-value < 0.05). NO3_3DaysAv and S had positive effects (p-value > 0.05), while cNa_4DaysAv, LP and its interaction with T, and the interaction between Aeration and LI all had negative effects (p-value > 0.05).
Aeration and T had the most notable effects on C20:5n3/C16:0, both negative (mean p-value < 0.05), similarly to C20:5n3, while LP and the interaction between NO3_3DaysAv and cNa_4DaysAv had moderately important negative effects (trimmed mean p-value < 0.05). cBOH3, the interaction between NO3_3DaysAv and S, and the interaction between T and LP all had positive effects (p-value > 0.05). On the other hand, NO3_3DaysAv and the interactions of Aeration with LI and LP all had negative effects (p-value > 0.05).
The second rounds of LASSO and stepwise regression, which included the main effects and interactions derived from the initial models, yielded succinct and interpretable results. Aeration emerged as a pivotal factor consistently influencing the fatty acid composition across the various fatty acid species. It exhibited diverse effects, forming positive interactions with specific variables, such as LV_4DaysAv and LP, while also displaying negative interactions with others. This underscores the importance of aeration control in manipulating fatty acid profiles.
Mineral ions, notably calcium, magnesium, and potassium, played a discernible role in determining fatty acid composition. Their effects were evident through main effects as well as interactions, further highlighting their significance in lipid metabolism regulation. Temperature exhibited significant interactions with several parameters, often leading to shifts in fatty acid profiles. This suggests that temperature management could be a valuable strategy for manipulating lipid production in N. oculata.
Nitrogen, especially nitrate, emerged as an influential factor affecting fatty acid profiles. Its interactions with other variables, such as LV and pH_4DaysAv, demonstrated the intricate involvement of nitrogen in lipid synthesis pathways. Additionally, CO2-related variables contributed to the models, indicating the relevance of CO2 supply in lipid metabolism. The presence of both positive and negative effects underscores the complexity of CO2’s role in fatty acid production.
In conclusion, the refined models resulting from the second rounds of LASSO and stepwise regression emphasized the consistent significance of aeration, ion concentrations, temperature, nitrogen sources, and CO2-related variables in shaping the fatty acid composition of N. oculata. These findings provide valuable insights into the potential manipulation of lipid profiles for various applications, from biodiesel production to nutritional supplementation. The ensuing discussion will delve into the mechanistic underpinnings of these observed effects, connecting them to broader metabolic pathways and potential implications for bioprocess optimization.