Multi-Omics Techniques for Soybean Molecular Breeding

Cao, Pan; Zhao, Ying; Wu, Fengjiao; Xin, Dawei; Liu, Chunyan; Wu, Xiaoxia; Lv, Jian; Chen, Qingshan; Qi, Zhaoming

doi:10.3390/ijms23094994

Open AccessReview

Multi-Omics Techniques for Soybean Molecular Breeding

by

Pan Cao

^1,†,

Ying Zhao

^1,†,

Fengjiao Wu

^1,†,

Dawei Xin

¹

,

Chunyan Liu

¹,

Xiaoxia Wu

^1,*,

Jian Lv

^2,*,

Qingshan Chen

^1,* and

Zhaoming Qi

^1,*

¹

College of Agriculture, Northeast Agricultural University, Harbin 150030, China

²

Department of Innovation, Syngenta Biotechnology China, Beijing 102206, China

^*

Authors to whom correspondence should be addressed.

^†

These authors have contributed equally to this work.

Int. J. Mol. Sci. 2022, 23(9), 4994; https://doi.org/10.3390/ijms23094994

Submission received: 4 April 2022 / Revised: 22 April 2022 / Accepted: 28 April 2022 / Published: 30 April 2022

(This article belongs to the Special Issue Genetics and Novel Techniques for Soybean Yield Enhancement)

Download

Browse Figures

Versions Notes

Abstract

:

Soybean is a major crop that provides essential protein and oil for food and feed. Since its origin in China over 5000 years ago, soybean has spread throughout the world, becoming the second most important vegetable oil crop and the primary source of plant protein for global consumption. From early domestication and artificial selection through hybridization and ultimately molecular breeding, the history of soybean breeding parallels major advances in plant science throughout the centuries. Now, rapid progress in plant omics is ushering in a new era of precision design breeding, exemplified by the engineering of elite soybean varieties with specific oil compositions to meet various end-use targets. The assembly of soybean reference genomes, made possible by the development of genome sequencing technology and bioinformatics over the past 20 years, was a great step forward in soybean research. It facilitated advances in soybean transcriptomics, proteomics, metabolomics, and phenomics, all of which paved the way for an integrated approach to molecular breeding in soybean. In this review, we summarize the latest progress in omics research, highlight novel findings made possible by omics techniques, note current drawbacks and areas for further research, and suggest that an efficient multi-omics approach may accelerate soybean breeding in the future. This review will be of interest not only to soybean breeders but also to researchers interested in the use of cutting-edge omics technologies for crop research and improvement.

Keywords:

soybean; multi-omics; molecular breeding

1. Introduction

Soybean [Glycine max (L.) Merr.] originated in China over 5000 years ago; China currently ranks third in soybean production worldwide, and total production has increased to meet market demands [1]. In 1830, soybean travelled the great distance to Europe by the “ancient silk road”, and from there it spread throughout the world to North America, South America, India, and elsewhere [2].

At that time, soybean breeders were mainly farmers, brewers, or suppliers, and they selected soybeans based on their experience and market demands [3]. Records show that artificial hybridization was performed in the early 1900s [4], and breeders continued to advance soybean breeding through cooperation with cell biologists and molecular geneticists in the late twentieth century [5].

Soybean is a major oil crop that also provides plant protein to the food industry. Through genetic modification of its fatty acid composition, soybean oil has been tailored to meet end-user needs more successfully than other conventional oils. Three modified oils are already commercially available. Oils with a linolenic acid (18:3) content from 1% to 8% can reduce or eliminate the need for chemical hydrogenation to achieve the stability and shelf life required for certain food applications. The elimination of fatty acids produced by chemical hydrogenation and trans-hydrogenation is important for human health. An increase in oleic acid (18:1) from 25% to >80% also increases oil stability and shelf life. Reduction in palmitic acid (16:0) from 11% to <4% produces low levels of saturated fatty acids and is very beneficial for cardiovascular health [6]. Soybean meal extracted from high-oleic-acid soybeans is rich in protein and inhibits the growth of colon, liver, and lung cancer cells [7].

In response to global demands for different soybean product profiles, quantitative trait locus (QTL) mapping laid the foundation for breeding and selection of major traits. The SoyBase database was first established in the 1990s as the USDA Soybean Genetics Database, which collected genetic resources for soybean, including genetic maps and information on Mendelian genetics (www.soybase.org (accessed on 22 January 2022)). With efforts from many groups and researchers, the first soybean reference genome (Williams 82) was released in 2010, marking a new era in soybean omics research [8]. Studies of genomics, transcriptomics, proteomics, metabolomics, and phenomics have increased dramatically in recent years (Figure 1). The regulatory network of soybean is extremely complex; although single-omics approaches can reveal or explain specific biological phenomena, their results may be difficult to apply directly to soybean breeding. Additionally, now, given the huge amounts of omics data available, significant thought must be given to the best means of using them efficiently. In this review, we comprehensively describe the latest progress in crop omics techniques and discuss ways that soybean breeding may be further improved using multi-omics information in the future.

2. Multi-Omics Research Progress

2.1. Soybean Genomics Research Progress

After the release of the soybean reference genome, Lam et al. (2010) sequenced 31 wild and cultivated soybean genomes and identified genetic diversity patterns and signatures of selection [9]. The genome sequence of Glycine soja (G. soja var. IT182932) was also subsequently reported.

Because of at least two whole genome duplication (WGD) events in the soybean genome over the last 60 million years, it is quite common for genes to be present in two or four copies [10]. A more accurate reference genome can accelerate functional genomics research, and in 2018, Shen et al. [11] reported the de novo assembly of the Chinese soybean “Zhonghuang 13” genome based on advanced sequencing technologies, including single-molecule real-time (SMRT) sequencing, Hi-C chromosome conformation capture, and optical mapping. This genome exhibited more than 250,000 structural variations compared with Williams 82. Subsequently, the genome of the wild soybean accession W05 was assembled, with a size of 1013.2 Mb and a contig N50 of 3.3 Mb [12]. The Williams 82 genome was reassembled and reannotated (version Wm82.a4.v1), integrating ~1.6 million ESTs and 1.5 billion paired-end Illumina RNA-seq reads with homology-based gene predictions [13].

A pan-genome from seven wild soybean relatives was reported in 2014, facilitating the construction of a core genome and enabling genomic comparisons to identify lineage-specific genes and genes with copy number variations or large-effect mutations [14]. Recently, a graph-based pan-genome was constructed from 26 representative wild and cultivated soybeans based on long-read sequencing, enabling the detection of numerous genetic variations that could not be identified from short-sequence reads alone [15]. The pan-genome not only provides complete information on the entire soybean genome, but also serves as a platform for investigating the evolution and functional genomics of soybean [16,17,18].

Recent studies have discovered a number of genes with wide applicability to soybean production. In 2016, Zhao et al. [19] performed genetic and molecular studies of flowering genes by crossing early maturing soybean varieties. The hybrid offspring segregated for two maturity loci, E1 and E9, and detailed molecular analysis of the E9 locus was performed to identify the causal gene. Fine mapping, sequencing, and expression analysis indicated that E9 was FT2a, a homolog of Arabidopsis FLOWERING LOCUS T. The recessive allele of E9/FT2a delays flowering through reduced transcript abundance due to allele-specific transcriptional inhibition, associated with the sore-1 insertion. Therefore, FT2a transcript abundance is directly related to changes in soybean flowering time. The E9 allele can maintain vegetative growth in an early-flowering genetic background and can also be used as a long-juvenile allele, delaying flowering under short-day conditions at lower latitudes. Soybean is the main leguminous crop in temperate regions, and the photoperiod response is a key factor in its latitude adaptation. The varieties introduced into low-latitude, temperate regions mature earlier and have very low yields. The introduction of long-juvenile (LJ) traits extends the vegetative stage and increases yield under short-day conditions, thereby enabling the expansion of tropical planting. In 2017, Lu et al. [20] used natural variation in the soybean J locus to improve adaptability to tropical areas and increase yield.

The composition and content of fatty acids in soybean seeds are very important for the quality of soybean oil. Fatty acids are the main products of oil biosynthesis and substrates of oil catabolism, and they are an important energy source for organisms. In a 2020 study aimed at increasing soybean oil content, Wang et al. [21]. reported strong selection on GmSWEET10a during soybean domestication based on resequencing data from more than 800 genotypes. Selection on GmSWEET10a not only increased soybean seed size and oil content but also reduced protein content. These results were validated using near-isogenic lines with haplotype substitution and transgenic studies. Wang et al. also found that GmSWEET10b and its homolog GmSWEET10a are functionally redundant and are under selection in breeding, making the GmSWEET10b allele a good target for soybean breeding. Research on GmSWEET10a and GmSWEET10b has shown that the transport of sucrose and hexose contributes to sugar distribution from seed coat to embryo, thereby determining oil and protein content and seed size in soybeans. Selection on the GmSWEET10a allele is thought to have promoted the initial domestication of a variety of soybean seed traits, and targeted selection of the superior allele GmSWEET10b may further improve the yield and seed quality of modern soybean varieties.

In summary, although only a few genes suitable for use in soybean breeding have been identified through genomics analysis to date, the J locus and flowering-time-related genes show promise for moving soybean planting areas into new latitude locations [19,20]. Seed oil and protein content genes could also be used to improve soybean seed composition [21]. As the number of sequenced wild and cultivated soybean germplasm resources increases, there is an increasing need for a worldwide data integration platform that combines a series of reference genomes, a pan-genome database, and basic analysis platforms, expanding the scope of use of a wide range of available soybean germplasm resources.

2.2. Soybean Transcriptomics Research Progress

With the development of sequencing technology, transcriptomics is rapidly becoming a mainstay of plant science research [22,23,24]. Transcriptome analysis with next-generation sequencing, also known as RNA sequencing (RNA-seq), enables unbiased, high-throughput detection of all expressed transcripts. This technology offers new insights into molecular profiles and signaling pathways at the level of systems biology and can identify useful gene markers for the efficient breeding of soybean. Recent RNA-seq data provide an overall picture of the metabolic activities of storage compounds during soybean seed development and enable modeling of the gene network associated with seed lipid and protein deposition. During soybean seed filling, the embryo develops as carbohydrates, oils, and proteins are stored in the cotyledons [25,26,27,28]. Severin et al. [29] (2010) performed transcriptome sequencing of different tissues and different developmental stages of Williams 82 soybean, then constructed an RNA-seq atlas of hierarchically clustered gene expression profiles; highly expressed genes and legume-specific genes were also identified. Subsequently, similar research documented transcript patterns during soybean seed development and identified seed-specific genes and expression patterns [30,31].

Genes encoding storage proteins (e.g., beta-conglycinin and glycinin) and enzymes of lipid or starch synthesis were most highly expressed at the embryonic stage when fresh weight was highest, suggesting that these storage compounds are immediately deposited into seeds before initiation of the desiccation process. Other genes expressed at the dry seed stages have been annotated as water-deficit-associated hydrophilic proteins [32], including dehydrins and late-embryogenesis-abundant (LEA) proteins that facilitate the preservation of nutrients and cellular structures during seed desiccation. More significantly, several transcription factors (TFs) have been identified as major regulators of seed development, including the APETALA2 (AP2), VIVIPAROUS1/ABI3-LIKE (VAL), FERTILIZATION INDEPENDENT ENDOSPERM (FIE) [33], GLABRA2 (GL2), PICKLE (PKL) [34], and DNA-binding-with-one-finger (DOF4) transcription factors [35]. PKL is an important activator of embryonic development, and FIE inhibits premature endosperm development. VAL1 and VAL2 [36] regulate the transition from embryonic development to germination. Seed filling development is also regulated by ABSCISIC ACID (ABA)-INSENSITIVE 3 (ABI3), LEAFY COTYLEDON 2 (LEC2), FUSCA3 (FUS3), and WRINKLED1 (WRI1) TFs [26,37]. These TFs may influence the deposition of carbohydrates, oils, and proteins during seed filling. The interactions among TFs during embryonic development and germination are relatively complex and involve several TFs at each stage. Identification and characterization of transcript polymorphisms in soybean lines of different oil compositions provided evidence that mutation of FAD2-1A and FAB2C influenced oleic acid and stearic acid levels, respectively, in elite soybean lines [38]. Jang et al. [39] (2015) analyzed the gene expression patterns of seed protein and oil synthesis during early soybean seed development. Luet al. [40] (2016) built gene co-expression networks based on 40 transcriptome datasets from developing seeds of cultivated and wild soybean accessions, and they identified the two hub genes GA20OX and NFYA. Qi et al. [41] (2018) performed RNA-seq of four chromosomal segment substitution lines (CSSLs) that differed in seed storage composition, identifying seven hub genes through the integration of meta-analysis and RNA-seq co-expression networks. Yang et al. [42] (2018) screened out the three hub genes GmABI3b, GmNFYA, and GmFAD2-1B through dynamic transcriptome analysis of developing soybean seeds. Recently, as RNA sequencing technology has advanced, additional hub genes such as LEC2, ABI3, and SWEET10a have been identified by RNA-seq and co-expression analysis [43]. These datasets and hub genes provide additional resources and candidate gene lists for functional validation.

Transcriptome analysis has also provided an in-depth understanding of the molecular and genetic responses that underlie soybean adaptation to environmental stresses [44,45,46,47]. Transcriptomic data can be extremely valuable for examining differences in gene expression between stress-tolerant and stress-sensitive genotypes, facilitating the development of stress-tolerant genotypes, which is one of the primary objectives of soybean breeding. Comparative transcriptome analysis of soybean exposed to different stresses led to the identification of functional and regulatory genes that act in response to individual and combined stresses, enabling their use in breeding for combined stress tolerance. These genes comprise protein kinases, phosphatases, and a number of TFs from the basic helix–loop–helix (bHLH), ethylene response factor (ERF), myeloblastosis (MYB), no apical meristem (NAC), and WRKY families [48]. Changes in the expression of genes encoding osmolyte-regulating enzymes, aquaporins, LEA proteins, chaperone proteins, and reactive oxygen species (ROS) scavengers, which maintain ionic balance by active transport and protect cell membrane integrity, were also found to be associated with various stress responses in soybean. Transcriptomic studies permit the comparative genomic analysis of cultivated crops and their wild relatives, enabling the identification of additional target genes that are crucial for the improvement of breeding processes. For example, RNA-seq has been used to examine the expression profiles of wild soybean species under alkaline stress [49], providing insight into the functions of alkaline-stress-responsive genes and the molecular basis of wild soybean alkalinity tolerance [50].

Transcriptomics research is a high-throughput approach that uses large-scale datasets to study the overall transcript levels of multiple genes that function in a biological process. With rapid development of high-throughput sequencing technology, further reductions in sequencing costs, and continuous improvements in large-scale data processing capabilities, transcriptomics has become a common experimental approach for solving biological problems through the discovery of new transcripts, development of detailed transcriptional atlases, and accurate identification of metabolic pathways [51]. In addition, with the continuous development of sequencing technology, the current analysis methods and basic assumptions need to be reevaluated and adjusted in order to better cope with the large amount of omics data in the future [24]. At the same time, cell- and tissue-specific transcriptomics technology has been used in specific biological applications, such as plant transformation [52]. Single-cell transcriptomics is continuing to mature, providing precise spatial and temporal insights into biological processes [53,54]. To date, transcriptomics research has identified only a few genes that can be used in soybean breeding, but continued integration of transcriptomics with genomics and other omics approaches could help breeders to optimize soybean regulatory networks and refine hub gene candidates for further soybean breeding.

2.3. Soybean Proteomics Research Progress

Proteins are the source of biological phenomena, the executors of physiological functions, and the direct embodiment of biological activities. Proteomics, the study of protein expression and function, has developed rapidly since its inception, and ever-increasing amounts of proteomic information, together with rapidly expanding plant genome resources and EST sequence libraries, have provided help for the identification of proteins. Proteomics provides the necessary basis for the study of soybean protein expression [55]. In recent years, it has been widely used for research on various aspects of soybean biology, such as growth and development, stress, and root–nodule interactions, and the resulting studies have deepened our understanding of changes in protein expression during the soybean life cycle [56,57,58,59,60,61].

Before 2012, soybean proteome studies relied heavily on 2D gel electrophoresis [62], but advances in liquid chromatography with tandem mass spectrometry (LC–MS/MS) have made high-throughput proteomics possible, promoting greater efficiency and accuracy in proteome research. Hajduchet al. [63] (2005) constructed high-resolution proteome reference maps of seed filling in soybean. Subsequently, the integration of two-dimensional gel electrophoresis, semicontinuous multidimensional protein identification technology (Sec-MudPIT), and LC–MS has improved our understanding of the metabolic processes that occur during seed filling in soybean [64].

Afroz et al. [56] (2011) analyzed the proteome profiles of leaves, hypocotyls, and roots of young soybean seedlings and detected tissue-specific proteins. Since then, proteomics has increasingly been used to analyze enzyme expression and regulatory mechanisms involved in accumulation during seed storage [57]. Root hairs and developing seeds have been analyzed using isobaric tags for relative and absolute quantitation (iTRAQ), and specific proteins related to root hair and seed development have been identified [58,59].

Seed oils and seed storage proteins are the main seed storage reserves in soybean. Xu et al. [60] (2015) used proteomics to analyze differences in global protein expression profiles and oil synthesis between a high-oil soybean cultivar (Jiyu 73, JY73) and its parents. Proteomics has also been widely used in stress biology to identify key proteins. For example, Xu et al. [61] analyzed GmDGAT1-2 transgenic soybeans with high oil content using quantitative proteomics and lipidomics. They showed that GmDGAT1-2 overexpression induces downregulation of lipoxygenase and upregulation of oleosin, thereby significantly altering total fatty acid composition. A proteomics approach also revealed flood and drought response mechanisms in soybean. Wang et al. [65] (2018) identified sensitive tissues of stressed soybean at different developmental stages based on protein profiles, documenting the stress responses of young plants and seedlings exposed to combined stresses in a tissue-specific manner. In 2017, Wang et al. [66] performed tissue-specific proteomics studies of soybean seedlings under flooded conditions, and Wang et al. [67] (2021) demonstrated the dual effect of calcium on soybean radicle protrusion using quantitative proteomics.

Recently, Islam et al. [68] (2019) performed quantitative proteomic analysis of low-linolenic-acid transgenic and control soybean seeds. They revealed perturbations in proteins related to fatty acid metabolic pathways, including a lower abundance of proteins associated with FA initiation, elongation, and desaturation processes and with β-oxidation of α-linolenic acids. Wei et al. [69] (2020) combined quantitative proteomics with physiological data and revealed the effects of temperature and humidity stress on cotyledon, embryo, leaf, and pod vigor in soybean.

Although the breadth of proteome research in soybean is still lower than that in other crops, it nonetheless provides a starting point for functional genomics studies of natural product biosynthesis mechanisms and biotic and abiotic stresses in soybean [55,56,57,58,59,60,61]. The information obtained from proteomics can help to identify novel proteins, determine the expression patterns of their corresponding genes, and enable their molecular cloning. Soybean research could be further advanced by the construction of a soybean proteome reference map. A combination of proteomics, genomics, and transcriptomics could enable the screening of elite alleles and the development of molecular markers, providing new possibilities for soybean molecular breeding. Recent advances in protoplast and sequencing technologies have enabled single-cell transcriptomic studies in plants, but single-cell proteomics will need to develop much further to achieve a comparable throughput. A major problem with single-cell proteomics is its inherently low sample volume that challenges traditional sample preparation protocols and the sensitivity of current liquid chromatography–mass spectrometry (LC-MS) systems [70]. Nonetheless, protein data arguably reflect the execution and control of most cellular processes more closely than transcriptome data, and efficient, high-throughput techniques for analyzing protein expression, interactions, and modifications will be essential for understanding the molecular mechanisms that underlie plant phenotypes [71].

2.4. Soybean Metabolomics Research Progress

Metabolomics is a new approach to the qualitative and quantitative analysis of small metabolites with relative molecular weights <1000 in a given tissue or cell. It is an important aspect of systems biology, and its development will have implications for future soybean research. Metabolomics analyses can reveal specific metabolic signaling pathways, providing key resources for gene discovery, metabolic engineering, and the elucidation of regulatory mechanisms. Quantitative metabolomics techniques for the detection of plant metabolites include liquid chromatography–electrochemistry–mass spectrometry (LC–EC–MS), gas/liquid chromatography–mass spectrometry (GC/LC–MS), thin-layer chromatography (TLC), Fourier transform infrared (FT–IR) spectroscopy, NMR, direct infusion mass spectrometry (DIMS), and capillary electrophoresis–LC–MS [72,73,74,75,76]. The LC–MS, GC–MS, NMR, and capillary electrophoresis MS techniques are most commonly used in plant metabolomics [77,78,79]. Compared with genomics, transcriptomics, and proteomics, the results of plant metabolomics techniques are more directly related to the plant phenotype. The identified metabolites have the dual effect of influencing or regulating both gene transcription and protein expression. Subtle changes that occur at other levels of regulation can be further amplified at the metabolome level [80]. The general process of metabolomics studies involves plant sample collection, metabolite isolation, and the detection of metabolite type, content, and status using assay techniques to construct a metabolic fingerprint [81]. This is combined with bioinformatics analysis to mine relevant information and integrate metabolic pathways for a comprehensive understanding of plant metabolic processes and metabolite changes [82].

Single-cell mass spectrometry provides information on metabolite abundance in cell populations, enabling the identification of hidden phenotypes, metabolic states, and rare cells. Comparing gene expression, protein function, and metabolite levels in individual cells could provide a comprehensive understanding of cellular physiology. In recent years, the combination of MS with novel sampling and ionization techniques has emerged as an important tool for single-cell metabolomics [83]. MS-based single-cell metabolomics enables the simultaneous detection of multiple metabolites from a single cell without preselection or labeling, thus mapping phenotypes at the single-cell level. Although this approach is still relatively new, it has been adopted by a growing number of active research groups who are developing cell sampling and ionization techniques, data analysis tools, and applications to answer important biomedical and environmental questions [84].

Characterizing mechanisms of metabolic regulation is crucial for modifying soybean seed composition. Using a nontargeted metabolomics approach, 169 metabolites were identified from mature seeds of 29 representative soybean cultivars and then mapped onto a metabolic network. These metabolites were mainly involved in key pathways of seed development, such as the tricarboxylic acid cycle, glycolysis, amino acid biosynthesis and catabolism, nitrogen utilization, antioxidant utilization, lipid oxidation, and secondary metabolite accumulation [85]. Significant variations in metabolite abundance and clear metabolite–metabolite correlations among different soybean cultivars were also demonstrated. The isoflavone profiles of soybean germplasms highlighted the diverse varieties of isoflavones present in soybean [86], and several aglycones were associated with different levels of shade tolerance at the seedling stage [87]. The effects of seed dry weight, seed coat color, and maturity on metabolite abundance have also been determined [88,89]. For instance, black soybean seeds were used to investigate the effect of maturity on metabolite abundance at different maturity stages. Several metabolites showed different responses to seed maturation, and the isoflavone content was markedly related to seed maturity. In addition, plant metabolites that change through specific pathways can enhance the nutritional value of genetically modified soybean by promoting the accumulation of isoflavones in developing seeds.

The use of metabolomics techniques was evaluated in a comprehensive study to explore the results of abiotic stress on soybean metabolites. The results showed that exposure of soybean plants to abiotic stress increased the biosynthesis of secondary metabolites, among which glycine and proline act as major osmoprotectants, playing important roles in the reduction in osmotic damage induced by abiotic stress. Levels of different polyphenols (hydroxycinnamates and flavonoids), phenylpropanoids, alkaloid caffeate, and phytochemicals (daidzin, daidzein, syringic acid, formononetin, genistin, and genistein) also increased [90], all of which are known to respond to plant drought stress. A metabolomics approach was recently used to discover metabolic markers applicable to crop improvement. This concept was first introduced by an earlier study that identified metabolites with high predictive values as biomarkers for plant biomass accumulation and plant breeding improvement. The role of these markers in crop enhancement was verified by recent studies [91], and their potential for use in soybean selection was also evaluated. Recent studies compared changes in metabolites between tobacco and soybean after exposure to drought, and levels of 4-hydroxy-2-oxoglutaric acid in tobacco roots and pinitol and coumestrol in soybean roots were markedly increased, suggesting that these may be useful markers for differentiating lines grown under well-watered conditions from those exposed to drought stress [92].

In the last decade, metabolomics studies have opened up new horizons for the elucidation of plant metabolic pathways and genetic architecture. Metabolomics studies have resolved the isoflavone profile of soybean [84], identified metabolites associated with dry weight and maturity of soybean grains [87], and greatly accelerated the differentiation of disease-resistant from disease-susceptible soybean varieties under stress conditions [93]. However, because plants have complex metabolic pathways and diverse mechanisms of product synthesis, metabolomics is still in its infancy, and gaps remain between metabolomics findings and practical breeding applications. There is no single metabolomic analysis method that can cover all metabolites [94]. Methods of metabolite isolation, identification, and data analysis used for different research objectives are also different and have strict requirements for sample processing techniques. The research performed in soybean to date has focused mainly on known metabolites, and the large number of unknown metabolites obtained without appropriate structural information and identification impedes further studies and applications. Metabolomics also lacks deep integration with other approaches, making it a relatively narrow area of research. It remains difficult to effectively obtain comprehensive information on plant metabolites, and resolving these methodological issues will be a breakthrough for future metabolomics research.

2.5. Soybean Phenomics Research Progress

Given the rapid development of sequencing technology and the sheer number of plant materials to be tested, the collection of phenotypic information by appropriate high-throughput phenotyping technologies is particularly important for plant breeding. However, plant phenotypes are complex and dynamic, and they are easily affected by the environment. Manual investigation of plant phenotypes is characterized by low efficiency and large errors [95]. As an approach to solve these practical problems, plant phenomics has begun to receive significant attention [96]. At present, there are only 132 published articles related to high-throughput phenotyping in soybean, and 120 have been published in the past decade. The number of studies is increasing year by year, showing that breeders are becoming increasingly aware of the importance of accelerating soybean breeding through phenotyping studies.

Plant growth and development follow a strict growth regime, but harvested plants differ in their cotyledon size, resistance to stress, and metabolic capacity. Small developmental differences can result in dramatic changes in both physical traits and internal characteristics of plants [97]. Plant phenomics leverages high-throughput, high-resolution phenotyping technologies and platforms to acquire phenotype data during and after plant production [98]. It is characterized by the large amounts of trait data acquired and the ability to divide the same trait into multiple smaller traits for testing. Data acquisition with fixed, quantitative, and uniform acquisition standards facilitates high-throughput automated analysis, enhancing the accuracy of crop phenotype identification and further promoting the efficiency of plant breeding and cultivation management [99].

For example, professional unmanned aerial vehicles (UAVs) can be equipped with high-definition dual-camera multi-spectral equipment and can obtain accurate soybean yield estimates and efficient pod maturity classifications by reconstructing time course multispectral high-throughput image data [100]. Fusion of high-spatial-resolution RGB, multispectral, and thermal data from UAV systems has improved the estimation accuracy of soybean physiological, biochemical (e.g., chlorophyll content, nitrogen concentration), and biophysical parameters (e.g., leaf area index, aboveground fresh and dry biomass) [101]. The color and texture features of early-season RGB images of the soybean canopy have also been used to predict soybean yield, maturity, and seed size [102]. Professional UAVs equipped with expensive multispectral, hyperspectral, and thermal imaging equipment are increasingly used for high-precision sampling of soybean canopy traits, such as height, area, temperature, and leaf wilting [103,104,105]. However, it is difficult to obtain higher-dimensional phenotypic traits from 2D images, and some estimates of some morphological traits still require calibration. To this end, researchers have performed three-dimensional reconstruction of plant morphology from two-dimensional image sequences using fully open-source structure from motion (SFM) and multi-view stereo (MVS) approaches [96]. For example, plant height and growth phenotype data can be obtained by establishing high-density 3D point clouds from plant image data [106]. In 2020, researchers used 3D reconstruction technology to analyze the “phenotypic fingerprint” and growth pattern of soybean plants throughout the growth period [107]. This cost-effective 3D reconstruction method can replace expensive laser scanners, with the potential to automate some procedures.

The use of high-throughput phenotyping to record plant responses to stress can enable the identification of resistant plants and the discovery of new genes [108]. When studying the biological effects of stress, a combination of high-throughput phenotyping ground vehicles, unmanned aerial systems, and digital images taken with smartphones can be used to automatically assess the severity of iron deficiency chlorosis (IDC) in real time [109], enabling field screening for large-scale soybean IDC tolerance [110]. Zhou et al. developed an automated plant phenotyping system in an established greenhouse and used it to analyze chlorophyll content and salt tolerance by continuously photographing and extracting image features with digital cameras (red, green, and blue), thereby demonstrating its feasibility [111]. Soybean is sensitive to flooding stress. Researchers used five-band multispectral and infrared thermal imaging cameras to extract canopy image features from three flight altitudes and then used deep learning to estimate soybean flooding damage scores [105]. By measuring such phenotypes, QTLs or genes related to abiotic stress can be located, and new soybean varieties with strong stress resistance can be cultivated.

With the development of remote sensing, robotics, visualization, and artificial intelligence, plant phenomics research has entered a stage of rapid growth. However, the huge amounts of data and numbers of pictures also present unprecedented challenges. Machine learning, the foundation of artificial intelligence whereby algorithms make predictions and learn from data without explicit programming, has been used in attempts to solve this problem [112]. Deep learning, a new field within machine learning, was originally proposed in 2006 [113]; it now includes approaches such as convolutional neural networks (CNNs), multilayer perceptrons (MLPs), and recurrent neural networks (RNNs) [114]. CNNs have been most widely applied to plant phenotyping, and commonly used CNN deep learning frameworks include TensorFlow [115], PyTorch [116], and Caffe [117]. At the same time, the development of deep learning has been supported by advances in cloud computing and GPU parallel computing. At present, there is a high demand for modern, high-frequency, multi-site, standardized phenotype acquisition, and the standardization and storage of phenotype data is a current subject of concern. Nonetheless, deep learning and low-cost sensors have been applied to a number of image-based tasks with impressive results [114].

For example, machine learning, multi-modal data fusion, and deep learning have been applied to soybean multi-sensor data from UAV systems to enable accurate prediction of soybean yield [118,119]. Riera et al. developed a multi-view-image-based fusion architecture for monitoring soybean pods and estimating yield through deep learning and demonstrated its effectiveness [120]. In another study, a novel high-throughput image analysis method was developed to rapidly determine and analyze the morphology and color of 39,065 soybean seeds from 400 lines [121]. Algorithms such as random forests and deep CNNs are well suited to many vision-based computer problems. For example, a system based on an image processing algorithm could detect materials other than grain (MOGs) in harvested soybeans on a large scale [122]. UVA-based multispectral images combined with a random forest approach enabled estimation of soybean maturity in different lines [106]. Compared with traditional image analysis methods, CNN is simultaneously trained end-to-end without an image feature description and extraction process, and it has been used to efficiently segment single soybean seeds and calculate their morphological parameters [123,124]. Soybean phenotype data include not only aboveground information but also belowground root phenotype data, and the fully automated soybean nodule acquisition pipeline (SNAP) combines RetinaNet and UNet deep learning architectures for nodule detection and segmentation. Compared with traditional methods, SNAP reduces the labor and inconsistency associated with nodule calculations and enables earlier assessment of the effects of genetic and environmental factors and their interactions on nodules [125].

These methods enable timely, efficient, and accurate prediction of soybean phenotypes in different regions and scales, revealing the regional differentiation and evolution of soybean phenotypic traits and assisting with soybean breeding and cultivation decisions.

The main soybean phenotypic traits currently investigated by phenomics include yield [126], maturity [102], leaf area index [101], plant height [106], cotyledon size, color [121], and response to abiotic stresses [105,109,111]. Image analysis techniques include fluorescence imaging [127], thermal imaging [128], two- or three-dimensional color imaging [129], and infrared spectral imaging [130]. Despite recent advances in plant phenomics research, the study of plant phenotypes has been limited mainly to the description of external physical traits and has largely failed to address internal and biochemical characteristics, hindering its application to practical breeding. In addition, there are many issues that require further research. Phenotyping equipment is costly and expensive, and although it greatly reduces labor, it still requires personnel with a specific biological and technical background who can follow a standardized process. Some of the equipment can be operated only under strict environmental and weather conditions, and large-scale field data collection can be subject to large deviations due to weather conditions, particularly for spectral equipment, which requires high light levels. Some cutting-edge technologies for crop phenotyping, such as artificial intelligence techniques and CT imaging, have rarely been used in soybean breeding. In the context of the big data era, the information obtained by machine learning is a huge multidimensional matrix. It will be important to help breeders filter useful information from such massive datasets and integrate this information with other biological data in order to ultimately perform deep data mining for the selection of new plant varieties.

3. Molecular Breeding in Soybean

Food security is among the most important topics for human society. Plant breeding, as a major approach to increasing the food supply, is one of the oldest agricultural practices in human civilization. To date, there have been three major innovations in plant breeding: artificial selection, cross breeding, and molecular breeding. A fourth innovation, optimization and precision design breeding, is also underway [95]. Molecular breeding refers to the application of molecular biology techniques to breeding, i.e., breeding at the molecular level, and it is the most widely used breeding approach at present.

Long ago, our ancestors began to domesticate wild plants. They selected desired phenotypes and individuals from among the plants they cultivated and began to intentionally control plant reproduction, ushering in the first stage of plant breeding, artificial selection. Breeding for selected phenotypes continued for thousands of years through the 19th century, when Mendel’s law was first proposed in 1865 [131]. After that, pedigree breeding was developed on the basis of segregation, and plant breeding entered the second stage, hybridization. The structure of DNA was discovered in 1953, and life science entered the molecular era [132]; on this basis, molecular biology techniques were developed to perform breeding at the molecular level. These techniques currently include marker-assisted selection [133] and transgenic breeding [134]. Molecular marker-assisted breeding (marker-associated breeding) is based on the close linkage between molecular markers and genes that determine target traits. By detecting specific molecular markers, the presence of target genes can be determined, enabling the selection of target traits. This approach is rapid, accurate, and free of interference from environmental conditions. Transgenic breeding uses genetic engineering to produce new varieties with desired characteristics through the introduction of specific genes. These methods have transformed plant breeding from pure phenotypic selection to a combination of genotypic and phenotypic selection, enabling the production of new varieties that meet human requirements.

Since the assembly of the first plant genome in Arabidopsis thaliana 20 years ago, developments in high-throughput sequencing technology have provided reference genomes for more than 800 land plants. High quality reference genomes and large amounts of genotyping data provide great convenience for quantitative genetic analysis of complex traits.

Experiments with the obtained genes have increased the oil content of soybeans, increased the content of soybean oleic acid, and reduced the content of linolenic acid. In 2010, Pham [135] et al. found that three polymorphisms in the FAD2-1B allele of two soybean lines resulted in missense mutations. The mutant FAD2-1B allele was associated with an increase in oleic acid content. Pham [136] et al. also found that high-oleic-acid soybeans could be obtained by combining the mutant FAD2-1A and FAD2-1B genes. However, despite their high oleic acid content, these soybeans still contained 4–6% linolenic acid, which may be enough to cause oxidation instability in the oil. Therefore, one or two mutated FAD3 genes were added into a high-oleic-acid background to further reduce linolenic acid content [137]. Recently, a high-oleic-, low-linolenic-acid soybean variety with elevated vitamin E content was developed by molecular-assisted breeding [136]. Oleic acid levels affect human health, and producing high quality soybeans with high contents of oleic acid and other health-promoting components should continue to be the focus of further research. Molecular-assisted breeding has also been used to improve carbohydrate profiles [136], pod shatter [138,139], yield and latitude adaptation [19,20,140], and seed oil and protein content [21] in soybean breeding.

Soybean molecular breeding involves the integration of multiple disciplines. With progress in science and technology, simple integration of theories from individual disciplines is no longer sufficient for the continued development of soybean molecular breeding. Although only a few genes identified from omics approaches have been used in soybean breeding to date, the integration of one to four genes into a new variety has been achieved with the assistance of genomics tools. Nonetheless, continued improvements in soybean breeding will be dependent on advances in multiple omics. In addition to the traditional omics methods mentioned above, new omics techniques have also emerged, including metagenomics, single-cell omics, and various types of epigenetic omics. These new omics methods provide more data for the characterization of complex traits. Nonetheless, it remains a challenge to integrate different omics approaches into an overall omics strategy to dissect the detailed connections among different omics datasets and further our understanding of complex trait regulation.

Crop production is increasingly difficult owing to water shortages, climate change, and extreme weather events caused by the current global environment. Therefore, new innovations in precision design breeding (i.e., Stage 4.0 breeding) will be required to feed the growing world population. The development of new technologies, especially the development of cross-disciplines based on economics, has brought Stage 4.0 breeding to the forefront. A prerequisite for precision design breeding is the accurate association of genotype with phenotype. Therefore, it is necessary to determine the genetic anatomy of agronomic traits and identify the corresponding genotypic variations. In the past 40 years, multiple revolutions in DNA sequencing technology have significantly improved sequencing throughput and quality, and sequencing costs have continued to decline, greatly promoting functional genomics research [95,141]. Nonetheless, the challenge of integrating large, dissimilar datasets remains.

4. Further Perspectives

With the growth of the human population, continuous improvements in living standards, and the potential threat posed by global environmental change to food production, it is necessary to perform intensive research into the molecular mechanisms of high and stable crop yields under adverse conditions. The excellent alleles and germplasm resources of China have been established independently. A design innovation system for environmental adaptive breeding has been established in China and combined with big-data climate modeling and accurate predictions to produce high-quality, stable crop yields and provide important guarantees for national food security and people’s life and health [95,142].

Here, we propose a conceptual workflow for the use of multiple omics datasets to identify key factors that regulate soybean yield, seed quality, stress biology, and other factors. First, wild and cultivated soybean reference genomes, pan-genome databases, and analysis platforms can be expanded to include a wider range of soybean germplasm resources in the future. The integrative analysis of multi-omics data can be used to construct the regulatory networks of individual traits, analyze the synergistic relationships among traits, and improve our overall understanding of molecular regulatory networks. The comprehensive use of genetics, multi-omics, molecular biology, and other technical approaches can identify key regulatory genes involved in relevant pathways and clarify their effects. To obtain further details, genome-wide association studies (GWAS), transcriptome-wide association studies (TWAS), metabolite-GWAS (mGWAS), and population genetics can enable fine mapping of a hub gene in natural germplasm populations or segregation populations. The hub gene can then be cloned to create overexpression or knock-out plants for phenotype validation. Multiple omics strategies can accelerate the identification of the hub gene and can provide a better understanding of its regulatory network. During the hub gene screening process, identification of excellent alleles and further utilization of genomic resources from wild soybean are also important future directions. Germplasm with excellent alleles can be selected for crossing with the main cultivated variety. With the assistance of marker development for the hub gene(s) and a high-throughput phenotyping system, important hub genes can be integrated into the molecular design of a new soybean variety by high-throughput molecular-assisted breeding selection (Figure 2). The sharing of information and resources from different groups should be strengthened worldwide, and hub genes and their excellent alleles can be used to accelerate the development of soybean breeding [142,143].

5. Conclusions

Multiple omics approaches show promise for the efficient improvement of soybean breeding in future research. The genome is the basic foundation of soybean germplasm, whereas the transcriptome, proteome, metabolome, and phenome are the upper layers. In soybean breeding practices, genomics, transcriptomics, proteomics, metabolomics, and high-throughput phenotyping will need to be better integrated to construct the regulatory networks of complex traits and efficiently identify hub genes. Genome editing of specific hub genes will also help with their functional validation. After marker development for the hub gene(s), important or specific hub genes can be integrated to design a new soybean variety. Further development of integrated multi-omics resources can promote the efficient and accurate discovery of excellent alleles, providing more possibilities for soybean breeding.

6. Supplementary Research Methodology

Keywords for each omics technique were used for literature searches at PubMed (https://pubmed.ncbi.nlm.nih.gov/ (accessed on 22 January 2022)); search terms included soybean genome, soybean transcriptome, soybean proteome, soybean metabolome, soybean high-throughput phenotype, and soybean multi-omics. All literature on each technique was collected as completely as possible, and we focused on milestone literature on soybean and literature published in the most recent 3–5 years. We developed the present review from these collected references, summarizing the research questions, important data, research methods, and major results of each paper.

Author Contributions

Conceptualization, Z.Q. and Q.C.; methodology, P.C. and Z.Q.; investigation, P.C., Y.Z., F.W., C.L., D.X., X.W. and J.L.; data curation, P.C., Y.Z. and F.W.; writing—original draft preparation, Z.Q., P.C., Y.Z. and F.W.; writing—review and editing, Z.Q., P.C., J.L. and F.W.; visualization, Z.Q., P.C., J.L. and F.W.; supervision, Q.C. and Z.Q.; funding acquisition, Z.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This review was financially supported by the Natural Science Foundation of Heilongjiang-Outstanding Youth Foundation (YQ2021C011).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Hymowitz, T.; Shurtleff, W.R. Debunking soybean myths and legends in the historical and popular literature. Crop. Sci. 2005, 45, 473–476. [Google Scholar] [CrossRef]
Hymowitz, T.; Harlan, J.R. Introduction of soybean to North America by Samuel Bowen in 1765. Econ. Bot. 1983, 37, 371–379. [Google Scholar] [CrossRef]
Marra, M.C.; Piggott, N.E.; Carlson, G.A. The Net Benefits, Including convenience of roundup ready soybeans: Results from a national survey. Tech. Bull. 2004, 3. Available online: https://www.researchgate.net/publication/237717600 (accessed on 22 January 2022).
Bradshaw, J.E. Plant breeding: Past, present and future. Euphytica 2017, 213, 60. [Google Scholar] [CrossRef]
Orf, J.H. Breeding, Genetics, and Production of Soybeans. In Soybeans: Chemistry, Production, Processing, and Utilization; AOCS Press: St. Paul, MN, USA, 2008; pp. 33–65. [Google Scholar] [CrossRef]
Fehr, W.R. Breeding for modified fatty acid composition in soybean. Crop. Sci. 2007, 47, S-72–S-87. [Google Scholar] [CrossRef]
Rayaprolu, S.J.; Hettiarachchy, N.S.; Chen, P.; Kannan, A.; Mauromostakos, A. Peptides derived from high oleic acid soybean meals inhibit colon, liver and lung cancer cell growth. Food Res. Int. 2013, 50, 282–288. [Google Scholar] [CrossRef]
Schmutz, J.; Cannon, S.B.; Schlueter, J.; Ma, J.; Mitros, T.; Nelson, W.; Hyten, D.L.; Song, Q.; Thelen, J.J.; Cheng, J.; et al. Genome sequence of the palaeopolyploid soybean. Nature 2010, 463, 178–183. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Lam, H.M.; Xu, X.; Liu, X.; Chen, W.; Yang, G.; Wong, F.L.; Li, M.W.; He, W.; Qin, N.; Wang, B.; et al. Resequencing of 31 wild and cultivated soybean genomes identifies patterns of genetic diversity and selection. Nat. Genet. 2010, 42, 1053–1059. [Google Scholar] [CrossRef] [PubMed]
Kim, M.Y.; Lee, S.; Van, K.; Kim, T.H.; Jeong, S.C.; Choi, I.Y.; Kim, D.S.; Lee, Y.S.; Park, D.; Ma, J.; et al. Whole-genome sequencing and intensive analysis of the undomesticated soybean (Glycine soja Sieb. and Zucc.) genome. Proc. Natl. Acad. Sci. USA 2010, 107, 22032–22037. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Shen, Y.; Liu, J.; Geng, H.; Zhang, J.; Liu, Y.; Zhang, H.; Xing, S.; Du, J.; Ma, S.; Tian, Z. De novo assembly of a Chinese soybean genome. Sci. China Life Sci. 2018, 61, 871–884. [Google Scholar] [CrossRef]
Xie, M.; Chung, C.Y.; Li, M.W.; Wong, F.L.; Wang, X.; Liu, A.; Wang, Z.; Leung, A.K.; Wong, T.H.; Tong, S.W.; et al. A reference-grade wild soybean genome. Nat. Commun. 2019, 10, 1216. [Google Scholar] [CrossRef] [Green Version]
Valliyodan, B.; Cannon, S.B.; Bayer, P.E.; Shu, S.; Brown, A.V.; Ren, L.; Jenkins, J.; Chung, C.Y.; Chan, T.F.; Daum, C.G.; et al. Construction and comparison of three reference-quality genome assemblies for soybean. Plant J. 2019, 100, 1066–1082. [Google Scholar] [CrossRef] [PubMed]
Li, Y.; Zhou, G.; Ma, J.; Jiang, W.; Jin, L.G.; Zhang, Z.; Guo, Y.; Zhang, J.; Sui, Y.; Zheng, L.; et al. De novo assembly of soybean wild relatives for pan-genome analysis of diversity and agronomic traits. Nat. Biotechnol. 2014, 32, 1045–1052. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Liu, Y.; Du, H.; Li, P.; Shen, Y.; Peng, H.; Liu, S.; Zhou, G.A.; Zhang, H.; Liu, Z.; Shi, M.; et al. Pan-genome of wild and cultivated soybeans. Cell 2020, 182, 162–176. [Google Scholar] [CrossRef] [PubMed]
Liu, Y.; Tian, Z. From one linear genome to a graph-based pan-genome: A new era for genomics. Sci. China Life Sci. 2020, 63, 1938–1941. [Google Scholar] [CrossRef] [PubMed]
Zhang, H.; Jiang, H.; Hu, Z.; Song, Q.; An, Y.C. Development of a versatile resource for post-genomic research through consolidating and characterizing 1500 diverse wild and cultivated soybean genomes. BMC Genom. 2022, 23, 250. [Google Scholar] [CrossRef] [PubMed]
Tao, Y.; Jordan, D.R.; Mace, E.S. A graph-based pan-genome guides biological discovery. Mol. Plant 2020, 13, 1247–1249. [Google Scholar] [CrossRef]
Zhao, C.; Takeshima, R.; Zhu, J.; Xu, M.; Sato, M.; Watanabe, S.; Kanazawa, A.; Liu, B.; Kong, F.; Yamada, T.; et al. A recessive allele for delayed flowering at the soybean maturity locus E9 is a leaky allele of FT2a, a FLOWERING LOCUS T ortholog. BMC Plant Biol. 2016, 16, 20. [Google Scholar] [CrossRef] [Green Version]
Lu, S.; Zhao, X.; Hu, Y.; Liu, S.; Nan, H.; Li, X.; Fang, C.; Cao, D.; Shi, X.; Kong, L.; et al. Natural variation at the soybean J locus improves adaptation to the tropics and enhances yield. Nat. Gene 2017, 49, 773–779. [Google Scholar] [CrossRef]
Wang, S.; Liu, S.; Wang, J.; Yokosho, K.; Zhou, B.; Yu, Y.C.; Liu, Z.; Frommer, W.B.; Ma, J.F.; Chen, L.Q.; et al. Simultaneous changes in seed size, oil content and protein content driven by selection of SWEET homologues during soybean domestication. Natl. Sci. Rev. 2020, 7, 1776–1786. [Google Scholar] [CrossRef]
Moses, L.; Pachter, L. Museum of spatial transcriptomics. Nat. Methods 2022, 87, 1–13. [Google Scholar] [CrossRef]
Ozsolak, F.; Milos, P.M. RNA sequencing: Advances, challenges and opportunities. Nat. Rev. Genet. 2011, 12, 87–98. [Google Scholar] [CrossRef] [PubMed]
Zeng, Z.; Li, Y.; Li, Y.; Luo, Y. Statistical and machine learning methods for spatially resolved transcriptomics data analysis. Genome Biol. 2022, 23, 83. [Google Scholar] [CrossRef] [PubMed]
Wang, W.; Xie, Y.; Liu, L.; King, G.J.; White, P.; Ding, G.; Wang, S.; Cai, H.; Wang, C.; Xu, F.; et al. Genetic Control of Seed Phytate Accumulation and the Development of Low-Phytate Crops: A Review and Perspective. J. Agric. Food Chem. 2022, 70, 3375–3390. [Google Scholar] [CrossRef] [PubMed]
Verdier, J.; Thompson, R.D. Transcriptional regulation of storage protein synthesis during dicotyledon seed filling. Plant Cell Physiol. 2008, 49, 1263–1271. [Google Scholar] [CrossRef] [Green Version]
Verdier, J.; Kakar, K.; Gallardo, K.; Le Signor, C.; Aubert, G.; Schlereth, A.; Town, C.D.; Udvardi, M.K.; Thompson, R.D. Gene expression profiling of M. truncatula transcription factors identifies putative regulators of grain legume seed filling. Plant Mol. Biol. 2008, 67, 567–580. [Google Scholar] [CrossRef]
Hajduch, M.; Hearne, L.B.; Miernyk, J.A.; Casteel, J.E.; Joshi, T.; Agrawal, G.K.; Song, Z.; Zhou, M.; Xu, D.; Thelen, J.J. Systems analysis of seed filling in Arabidopsis: Using general linear modeling to assess concordance of transcript and protein expression. Plant Physiol. 2010, 152, 2078–2087. [Google Scholar] [CrossRef] [Green Version]
Severin, A.J.; Woody, J.L.; Bolon, Y.T.; Joseph, B.; Diers, B.W.; Farmer, A.D.; Muehlbauer, G.J.; Nelson, R.T.; Grant, D.; Specht, J.E.; et al. RNA-Seq Atlas of Glycine max: A guide to the soybean transcriptome. BMC Plant Biol. 2010, 10, 160. [Google Scholar] [CrossRef] [Green Version]
Jones, S.I.; Gonzalez, D.O.; Vodkin, L.O. Flux of transcript patterns during soybean seed development. BMC Genom. 2010, 11, 136. [Google Scholar] [CrossRef] [Green Version]
Jones, S.I.; Vodkin, L.O. Using RNA-Seq to profile soybean seed development from fertilization to maturity. PLoS ONE 2013, 8, e59270. [Google Scholar] [CrossRef] [Green Version]
Crouch, M.L.; Sussex, I.M. Development and storage-protein synthesis in Brassica napus L. embryos in vivo and in vitro. Planta 1981, 153, 64–74. [Google Scholar] [CrossRef]
Mosquna, A.; Katz, A.; Shochat, S.; Grafi, G.; Ohad, N. Interaction of FIE, a Polycomb protein, with pRb: A possible mechanism regulating endosperm development. Mol. Genet. Genom. 2004, 271, 651–657. [Google Scholar] [CrossRef] [PubMed]
Shen, B.; Sinkevicius, K.W.; Selinger, D.A.; Tarczynski, M.C. The homeobox gene GLABRA2 affects seed oil content in Arabidopsis. Plant Mol. Biol. 2006, 60, 377–387. [Google Scholar] [CrossRef] [PubMed]
O’Rourke, J.A.; Bolon, Y.T.; Bucciarelli, B.; Vance, C.P. Legume genomics: Understanding biology through DNA and RNA sequencing. Ann. Bot. 2014, 113, 1107–1120. [Google Scholar] [CrossRef] [PubMed]
Wang, F.; Perry, S.E. Identification of direct targets of FUSCA3, a key regulator of Arabidopsis seed development. Plant. Physiol. 2013, 161, 1251–1264. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Pelletier, J.M.; Kwong, R.W.; Park, S.; Le, B.H.; Baden, R.; Cagliari, A.; Hashimoto, M.; Munoz, M.D.; Fischer, R.L.; Goldberg, R.B.; et al. LEC1 sequentially regulates the transcription of genes involved in diverse developmental processes during seed development. Proc. Natl. Acad. Sci. USA 2017, 114, E6710–E6719. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Goettel, W.; Liu, Z.; Xia, J.; Zhang, W.; Zhao, P.X.; An, Y.Q. Systems and evolutionary characterization of microRNAs and their underlying regulatory networks in soybean cotyledons. PLoS ONE 2014, 9, e86153. [Google Scholar] [CrossRef] [PubMed]
Jang, Y.E.; Kim, M.Y.; Shim, S.; Lee, J.; Lee, S.H. Gene expression profiling for seed protein and oil synthesis during early seed development in soybean. Genes Genom. 2015, 37, 409–418. [Google Scholar] [CrossRef]
Lu, X.; Li, Q.T.; Xiong, Q.; Li, W.; Bi, Y.D.; Lai, Y.C.; Liu, X.L.; Man, W.Q.; Zhang, W.K.; Ma, B.; et al. The transcriptomic signature of developing soybean seeds reveals the genetic basis of seed trait adaptation during domestication. Plant J. 2016, 86, 530–544. [Google Scholar] [CrossRef] [Green Version]
Qi, Z.; Zhang, Z.; Wang, Z.; Yu, J.; Qin, H.; Mao, X.; Jiang, H.; Xin, D.; Yin, Z.; Zhu, R.; et al. Meta-analysis and transcriptome profiling reveal hub genes for soybean seed storage composition during seed development. Plant Cell Environ. 2018, 41, 2109–2127. [Google Scholar] [CrossRef]
Wang, Y.; Ye, W.; Wang, Y. Genome-wide identification of long non-coding RNAs suggests a potential association with effector gene transcription in Phytophthora sojae. Mol. Plant Pathol. 2018, 19, 2177–2186. [Google Scholar] [CrossRef] [Green Version]
Peng, L.; Qian, L.; Wang, M.; Liu, W.; Song, X.; Cheng, H.; Yuan, F.; Zhao, M. Comparative transcriptome analysis during seeds development between two soybean cultivars. Peer J. 2021, 9, e10772. [Google Scholar] [CrossRef] [PubMed]
Matsui, A.; Ishida, J.; Morosawa, T.; Mochizuki, Y.; Kaminuma, E.; Endo, T.A.; Okamoto, M.; Nambara, E.; Nakajima, M.; Kawashima, M.; et al. Arabidopsis transcriptome analysis under drought, cold, high-salinity and ABA treatment conditions using a tiling array. Plant Cell Physiol. 2008, 8, 1135–1149. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Liu, W.; Tai, H.; Li, S.; Gao, W.; Zhao, M.; Xie, C.; Li, W.X. bHLH122is important for drought and osmotic stress resistance in Arabidopsis and in the repression of ABA catabolism. New. Phytol. 2014, 201, 1192–1204. [Google Scholar] [CrossRef] [PubMed]
Rasheed, S.; Bashir, K.; Matsui, A.; Tanaka, M.; Seki, M. Transcriptomic analysis of soil-grown Arabidopsis thaliana roots and shoots in response to a drought stress. Front. Plant Sci. 2016, 7, 180. [Google Scholar] [CrossRef] [PubMed]
Rasheed, S.; Bashir, K.; Nakaminami, K.; Hanada, K.; Matsui, A.; Seki, M. Drought stress differentially regulates the expression of small open reading frames (sORFs) in Arabidopsis roots and shoots. Plant Signal. Behav. 2016, 11, e1215792. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Chen, W.; Yao, Q.; Patil, G.B.; Agarwal, G.; Deshmukh, R.K.; Lin, L.; Wang, B.; Wang, Y.; Prince, S.J.; Song, L.; et al. Identification and comparative analysis of differential gene expression in soybean leaf tissue under drought and flooding stress revealed by RNA-Seq. Front. Plant Sci. 2016, 7, 1044. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Ge, Y.; Li, Y.; Zhu, Y.M.; Bai, X.; Lv, D.K.; Guo, D.; Ji, W.; Cai, H. Global transcriptome profiling of wild soybean (Glycine soja) roots under NaHCO 3 treatment. BMC Plant Biol. 2010, 10, 153. [Google Scholar] [CrossRef] [Green Version]
Zhang, J.; Wang, J.; Wei, J.; Liu, J.; Yang, S.; Gai, J.; Li, Y. Identification and analysis of NaHCO3 stress responsive genes in wild soybean (Glycine soja) Roots by RNA-seq. Front. Plant 2016, 7, 1842. [Google Scholar] [CrossRef] [Green Version]
Hedlund, E.; Deng, Q. Single-cell RNA sequencing: Technical advancements and biological applications. Mol. Asp. Med. 2018, 59, 36–46. [Google Scholar] [CrossRef]
Hurgobin, B.; Lewsey, M.G. Applications of cell- and tissue-specific ’omics to improve plant productivity. Emerg. Top. Life Sci. 2022, 6, 163–173. [Google Scholar] [CrossRef]
Saliba, A.E.; Westermann, A.J.; Gorski, S.A.; Vogel, J. Single-cell RNA-seq: Advances and future challenges. Nucleic Acids Res. 2014, 42, 8845–8860. [Google Scholar] [CrossRef] [PubMed]
Guo, J.; Huang, Z.; Sun, J.; Cui, X.; Liu, Y. Research progress and future development trends in medicinal plant transcriptomics. Front. Plant Sci. 2021, 12, 691838. [Google Scholar] [CrossRef] [PubMed]
Schumacher-Schuh, A.; Bieger, A.; Borelli, W.V.; Portley, M.K.; Awad, P.S.; Bandres-Ciga, S. Advances in proteomic and metabolomic profiling of neurodegenerative diseases. Front. Neurol. 2022, 12, 792227. [Google Scholar] [CrossRef] [PubMed]
Afroz, A.; Hashiguchi, A.; Khan, M.R.; Komatsu, S. Analyses of the proteomes of the leaf, hypocotyl, and root of young soybean seedlings. Protein Pept. Lett. 2010, 17, 319–331. [Google Scholar] [CrossRef]
Hajduch, M.; Matusova, R.; Houston, N.L.; Thelen, J.J. Comparative proteomics of seed maturation in oilseeds reveals differences in intermediary metabolism. Proteomics 2011, 11, 1619–1629. [Google Scholar] [CrossRef]
Nguyen, T.H.; Brechenmacher, L.; Aldrich, J.T.; Clauss, T.R.; Gritsenko, M.A.; Hixson, K.K.; Libault, M.; Tanaka, K.; Yang, F.; Yao, Q.; et al. Quantitative phosphoproteomic analysis of soybean root hairs inoculated with Bradyrhizobium japonicum. Mol. Cell Proteom. 2012, 11, 1140–1155. [Google Scholar] [CrossRef] [Green Version]
Qin, J.; Gu, F.; Liu, D.; Yin, C.; Zhao, S.; Chen, H.; Zhang, J.; Yang, C.; Zhan, X.; Zhang, M. Proteomic analysis of elite soybean Jidou17 and its parents using iTRAQ-based quantitative approaches. Proteome Sci. 2013, 11, 12. [Google Scholar] [CrossRef] [Green Version]
Xu, X.P.; Liu, H.; Tian, L.; Dong, X.B.; Shen, S.H.; Qu, L.Q. Integrated and comparative proteomics of high-oil and high-protein soybean seeds. Food Chem. 2015, 172, 105–116. [Google Scholar] [CrossRef]
Xu, Y.; Yan, F.; Liu, Y.; Wang, Y.; Gao, H.; Zhao, S.; Zhu, Y.; Wang, Q.; Li, J. Quantitative proteomic and lipidomics analyses of high oil content GmDGAT1-2 transgenic soybean illustrate the regulatory mechanism of lipoxygenase and oleosin. Plant Cell Rep. 2021, 40, 2303–2323. [Google Scholar] [CrossRef]
Jorrin-Novo, J.V.; Komatsu, S.; Sanchez-Lucas, R.; Rodríguez de Francisco, L.E. Gel electrophoresis-based plant proteomics: Past, present, and future. Happy 10th anniversary Journal of Proteomics! J. Proteom. 2019, 198, 1–10. [Google Scholar] [CrossRef]
Hajduch, M.; Ganapathy, A.; Stein, J.W.; Thelen, J.J. A systematic proteomic study of seed filling in soybean. Establishment of high-resolution two-dimensional reference maps, expression profiles, and an interactive proteome database. Plant Physiol. 2005, 137, 1397–1419. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Agrawal, G.K.; Hajduch, M.; Graham, K.; Thelen, J.J. In-depth investigation of the soybean seed-filling proteome and comparison with a parallel study of rapeseed. Plant Phys. 2008, 148, 504–518. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Wang, X.; Komatsu, S. Proteomic approaches to uncover the flooding and drought stress response mechanisms in soybean. J. Proteom. 2018, 17, 201–215. [Google Scholar] [CrossRef]
Wang, X.; Khodadadi, E.; Fakheri, B.; Komatsu, S. Organ-specific proteomics of soybean seedli under flooding and drought stresses. J. Proteom. 2017, 162, 62–72. [Google Scholar] [CrossRef] [PubMed]
Wang, X.; Hu, H.; Li, F.; Yang, B.; Komatsu, S.; Zhou, S. Quantitative proteomics reveals dual effects of calcium on radicle protrusion in soybean. J. Proteom. 2021, 230, 103999. [Google Scholar] [CrossRef] [PubMed]
Islam, N.; Bates, P.D.; Maria John, K.M.; Krishnan, H.B.; Zhang, Z.J.; Luthria, D.L.; Natarajan, S.S. Quantitative proteomic analysis of low linolenic acid transgenic soybean reveals perturbations of fatty acid metabolic pathways. Proteomics 2019, 19, 1800379. [Google Scholar] [CrossRef] [PubMed]
Wei, J.; Liu, X.; Li, L.; Zhao, H.; Liu, S.; Yu, X.; Shen, Y.; Zhou, Y.; Zhu, Y.; Shu, Y.; et al. Quantitative proteomic, physiological and biochemical analysis of cotyledon, embryo, leaf and pod reveals the effects of high temperature and humidity stress on seed vigor formation in soybean. BMC Plant Biol. 2020, 20, 127. [Google Scholar] [CrossRef] [PubMed]
Clark, N.M.; Elmore, J.M.; Walley, J.W. To the proteome and beyond: Advances in single-cell omics profiling for plant systems. Plant Physiol. 2022, 188, 726–737. [Google Scholar] [CrossRef] [PubMed]
Mergner, J.; Kuster, B. Plant Proteome Dynamics. Annu. Rev. Plant Biol. 2022, 73. [Google Scholar] [CrossRef]
Fiehn, O.; Kopka, J.; Dörmann, P.; Altmann, T.; Trethewey, R.N.; Willmitzer, L. Metabolite profiling for plant functional genomics. Nat. Biotechnol. 2000, 18, 1157–1161. [Google Scholar] [CrossRef]
Weckwerth, W. Metabolomics in systems biology. Annu. Rev. Plant Biol. 2003, 54, 669. [Google Scholar] [CrossRef] [PubMed]
Moco, S.; Bino, R.J.; Vos, R.; Vervoort, J. Metabolomics technologies and metabolite identification. TrAC-Trend Anal. Chem. 2007, 26, 855–866. [Google Scholar] [CrossRef]
Allwood, J.W.; Goodacre, R. An introduction to liquid chromatography-mass spectrometry instrumentation applied in plant metabolomic analyses. Phytochem. Anal. 2010, 21, 33–47. [Google Scholar] [CrossRef] [PubMed]
Jogaiah, S.; Govind, S.R.; Tran, L.S. Systems biology-based approaches toward understanding drought tolerance in food crops. Crit. Rev. Biotechnol. 2013, 33, 23–39. [Google Scholar] [CrossRef] [PubMed]
Fiehn, O. Metabolomics–the link between genotypes and phenotypes. Plant Mol. Biol. 2002, 48, 155–171. [Google Scholar] [CrossRef] [PubMed]
Kikuchi, J.; Hirayama, T. Practical aspects of uniform stable isotope labeling of higher plants for heteronuclear NMR-based metabolomics. Methods Mol. Biol. 2007, 358, 273–286. [Google Scholar] [CrossRef] [PubMed]
Kim, H.K.; Choi, Y.H.; Verpoorte, R. NMR-based plant metabolomics: Where do we stand, where do we go? Trends Biotechnol. 2011, 29, 267–275. [Google Scholar] [CrossRef] [PubMed]
Taylor, J.; King, R.D.; Altmann, T.; Fiehn, O. Application of metabolomics to plant genotype discrimination using statistics and machine learning. Bioinformatics 2002, 18, 241–248. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Xiao, Q.; Mu, X.; Liu, J.; Li, B.; Liu, H.; Zhang, B.; Xiao, P. Plant metabolomics: A new strategy and tool for quality evaluation of Chinese medicinal materials. Chin. Med. 2022, 17, 45. [Google Scholar] [CrossRef]
Nicholson, J.K.; Lindon, J.C. Metabonomics. Nature 2008, 455, 1054–1056. [Google Scholar] [CrossRef]
Dolatmoradi, M.; Samarah, L.Z.; Vertes, A. Single-Cell Metabolomics by Mass Spectrometry: Opportunities and Challenges. Anal. Sens. 2022, 2, e202100032. [Google Scholar] [CrossRef]
Lanekoff, I.; Sharma, V.V.; Marques, C. Single-cell metabolomics: Where are we and where are we going? Curr. Opin. Biotechnol. 2022, 75, 102693. [Google Scholar] [CrossRef] [PubMed]
Lin, H.; Rao, J.; Shi, J.; Hu, C.; Cheng, F.; Wilson, Z.A.; Zhang, D.; Quan, S. Seed metabolomic study reveals significant metabolite variations and correlations among different soybean cultivars. J. Integr. Plant Biol. 2014, 56, 826–836. [Google Scholar] [CrossRef] [PubMed]
Kim, J.K.; Kim, E.H.; Park, I.; Yu, B.R.; Lim, J.D.; Lee, Y.S.; Lee, J.H.; Kim, S.H.; Chung, I.M. Isoflavones profiling of soybean [Glycine max (L.) Merrill] germplasms and their correlations with metabolic pathways. Food Chem. 2014, 153, 258–264. [Google Scholar] [CrossRef]
Liu, J.; Hu, B.; Liu, W.; Qin, W.; Wu, H.; Zhang, J.; Yang, C.; Deng, J.; Shu, K.; Du, J.; et al. Metabolomic tool to identify soybean [Glycine max (L.) Merrill] germplasms with a high level of shade tolerance at the seedling stage. Sci. Rep. 2017, 7, 42478. [Google Scholar] [CrossRef] [Green Version]
Lee, J.; Hwang, Y.S.; Chang, W.S.; Moon, J.K.; Choung, M.G. Seed maturity differentially mediates metabolic responses in black soybean. Food Chem. 2013, 141, 2052–2059. [Google Scholar] [CrossRef]
Wilcox, J.R.; Shibles, R.M. Interrelationships among seed quality attributes in soybean. Crop. Sci. 2001, 41, 11–14. [Google Scholar] [CrossRef]
Feng, Z.; Ding, C.; Li, W.; Wang, D.; Cui, D. Applications of metabolomics in the research of soybean plant under abiotic stress. Food Chem. 2020, 310, 125914. [Google Scholar] [CrossRef]
Meyer, R.C.; Steinfath, M.; Lisec, J.; Becher, M.; Witucka-Wall, H.; Törjék, O.; Fiehn, O.; Eckardt, A.; Willmitzer, L.; Selbig, J.; et al. The metabolic signature related to high plant growth rate in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 2007, 104, 4759–4764. [Google Scholar] [CrossRef] [Green Version]
Tripathi, P.; Rabara, R.C.; Reese, R.N.; Miller, M.A.; Rohila, J.S.; Subramanian, S.; Shen, Q.J.; Morandi, D.; Bücking, H.; Shulaev, V.; et al. A toolbox of genes, proteins, metabolites and promoters for improving drought tolerance in soybean includes the metabolite coumestrol and stomatal development genes. BMC Genom. 2016, 17, 102. [Google Scholar] [CrossRef] [Green Version]
Scandiani, M.M.; Luque, A.G.; Razori, M.V.; Ciancio Casalini, L.; Aoki, T.; O’Donnell, K.; Cervigni, G.D.; Spampinato, C.P. Metabolic profiles of soybean roots during early stages of Fusarium tucumaniae infection. J. Exp. Bot. 2015, 66, 391–402. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Hall, R.D.; D’Auria, J.C.; Silva-Ferreira, A.C.; Gibon, Y.; Kruszka, D.; Mishra, P.; Zedde, R. High-throughput plant phenotyping: A role for metabolomics? Trends Plant Sci. 2022, 1360–1385, 30–39. [Google Scholar] [CrossRef] [PubMed]
Shen, Y.; Zhou, G.; Liang, C.; Tian, Z. Omics-based interdisciplinarity is accelerating plant breeding. Curr. Opin. Plant Biol. 2022, 66, 102167. [Google Scholar] [CrossRef] [PubMed]
Zhao, C.; Zhang, Y.; Du, J.; Guo, X.; Wen, W.; Gu, S.; Wang, J.; Fan, J. Crop phenomics: Current status and perspectives. Front. Plant Sci. 2019, 10, 714. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Lube, V.; Noyan, M.A.; Przybysz, A.; Salama, K.; Blilou, I. MultipleXLab: A high-throughput portable live-imaging root phenotyping platform using deep learning and computer vision. Plant Methods 2022, 18, 38. [Google Scholar] [CrossRef] [PubMed]
Andrade-Sanchez, P.; Gore, M.A.; Heun, J.T.; Thorp, K.R.; Carmo-Silva, A.E.; French, A.N.; Salvucci, M.E.; White, J.W. Development and evaluation of a field-based high-throughput phenotyping platform. Funct. Plant Biol. 2013, 41, 68–79. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Fiorani, F.; Schurr, U. Future scenarios for plant phenotyping. Annu. Rev. Plant. Biol. 2013, 64, 267–291. [Google Scholar] [CrossRef] [Green Version]
Yu, N.; Li, L.; Schmitz, N.; Tian, L.F.; Greenberg, J.A.; Diers, B.W. Development of methods to improve soybean yield estimation and predict plant maturity with an unmanned aerial vehicle based platform. Remote Sens. Environ. 2016, 187, 91–101. [Google Scholar] [CrossRef]
Maimaitijiang, M.; Ghulam, A.; Sidike, P.; Hartling, S.; Peterson, K.; Fritschi, F. Unmanned Aerial System (UAS)-based phenotyping of soybean using multi-sensor data fusion and extreme learning machine. ISPRS J. Photogramm. 2017, 134, 43–58. [Google Scholar] [CrossRef]
Yuan, W.; Wijewardane, N.K.; Jenkins, S.; Bai, G.; Ge, Y.; Graef, G.L. Early Prediction of Soybean Traits through Color and Texture Features of Canopy RGB Imagery. Sci. Rep. 2019, 9, 14089. [Google Scholar] [CrossRef] [Green Version]
Zhou, J.; Zhou, J.; Ye, H.; Ali, M.L.; Nguyen, H.T.; Chen, P. Classification of soybean leaf wilting due to drought stress using UAV-based imagery. Comput. Electron. Agric. 2020, 175, 105576. [Google Scholar] [CrossRef]
Toda, Y.; Kaga, A.; Kajiya-Kanegae, H.; Hattori, T.; Yamaoka, S.; Okamoto, M.; Tsujimoto, H.; Iwata, H. Genomic prediction modeling of soybean biomass using UAV-based remote sensing and longitudinal model parameters. Plant Genome 2021, 14, e20157. [Google Scholar] [CrossRef] [PubMed]
Zhou, J.; Mou, H.; Zhou, J.; Ali, M.L.; Ye, H.; Chen, P.; Nguyen, H.T. Qualification of soybean responses to flooding stress using UAV-Based imagery and deep learning. Plant Phenom. 2021, 2021, 9892570. [Google Scholar] [CrossRef] [PubMed]
Zhou, J.; Fu, X.; Zhou, S.; Zhou, J.; Ye, H.; Nguyen, H.T. Automated segmentation of soybean plants from 3D point cloud using machine learning. Comput. Electron. Agric. 2019, 162, 143–153. [Google Scholar] [CrossRef]
Zhu, R.; Sun, K.; Yan, Z.; Yan, X.; Yu, J.; Shi, J.; Hu, Z.; Jiang, H.; Xin, D.; Zhang, Z.; et al. Analysing the phenotype development of soybean plants using low-cost 3D reconstruction. Sci. Rep. 2020, 10, 7055. [Google Scholar] [CrossRef]
Finkel, E. Imaging. With ‘phenomics,’ Plant scientists hope to shift breeding into overdrive. Science 2009, 325, 380–381. [Google Scholar] [CrossRef]
Naik, H.S.; Zhang, J.; Lofquist, A.; Assefa, T.; Sarkar, S.; Ackerman, D.; Singh, A.; Singh, A.K.; Ganapathysubramanian, B. A real-time phenotyping framework using machine learning for plant stress severity rating in soybean. Plant Met. 2017, 13, 23. [Google Scholar] [CrossRef] [Green Version]
Dobbels, A.A.; Lorenz, A.J. Soybean iron deficiency chlorosis high throughput phenotyping using an unmanned aircraft system. Plant Met. 2019, 15, 97. [Google Scholar] [CrossRef]
Zhou, J.; Chen, H.; Zhou, J.; Fu, X.; Ye, H.; Nguyen, H.T. Development of an automated phenotyping platform for quantifying soybean dynamic responses to salinity stress in greenhouse environment. Comput. Electron. Agric. 2018, 151, 319–330. [Google Scholar] [CrossRef]
Urbina, F.; Ekins, S. The Commoditization of AI for Molecule Design. Artif. Intell. Life Sci. 2022, 2, 100031. [Google Scholar] [CrossRef]
Ranzato, M.A.; Hinton, G.; LeCun, Y. Guest editorial: Deep learning. Int. J. Comput. Vis. 2015, 113, 1–2. [Google Scholar] [CrossRef] [Green Version]
Arya, S.; Sandhu, K.S.; Singh, J. Deep learning: As the new frontier in high-throughput plant phenotyping. Euphytica 2022, 218, 47. [Google Scholar] [CrossRef]
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25. [Google Scholar] [CrossRef]
Fey, M.; Lenssen, J.E. Fast graph representation learning with PyTorch Geometric. arXiv 2019, arXiv:1903.02428. [Google Scholar]
Jia, Y.; Shelhamer, E.; Donahue, J.; Karayev, S.; Long, J.; Girshick, R.; Darrell, T. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM International Conference on Multimedia, 2014; MM ’14: Proceedings of the 22nd ACM international conference on Multimedia, Orlando, FL, USA, 3–7 November 2014; pp. 675–678. [Google Scholar] [CrossRef]
Herrero-Huerta, M.; Rodriguez-Gonzalvez, P.; Rainey, K.M. Yield prediction by machine learning from UAS-based mulit-sensor data fusion in soybean. Plant Met. 2020, 16, 78. [Google Scholar] [CrossRef] [PubMed]
Maimaitijiang, M.; Sagan, V.; Sidike, P.; Hartling, S.; Esposito, F.; Fritschi, F.B. Soybean yield prediction from UAV using multimodal data fusion and deep learning. Remote Sens. Environ. 2020, 237, 111599. [Google Scholar] [CrossRef]
Riera, L.G.; Carroll, M.E.; Zhang, Z.; Shook, J.M.; Ghosal, S.; Gao, T.; Singh, A.; Bhattacharya, S.; Ganapathysubramanian, B.; Singh, A.K.; et al. Deep multiview image fusion for soybean yield estimation in breeding applications. Plant Phenom. 2021, 2021, 9846470. [Google Scholar] [CrossRef]
Baek, J.H.; Lee, E.; Kim, N.; Kim, S.L.; Choi, I.; Ji, H.; Chung, Y.S.; Choi, M.S.; Moon, J.K.; Kim, K.H. High throughput phenotyping for various traits on soybean seeds using image analysis. Sensors 2020, 20, 248. [Google Scholar] [CrossRef] [Green Version]
Momin, M.A.; Yamamoto, K.; Miyamoto, M.; Kondo, N.; Grift, T. Machine vision based soybean quality evaluation. Comput. Electron. Agric. 2017, 140, 452–460. [Google Scholar] [CrossRef]
Li, Y.; Jia, J.; Zhang, L.; Khattak, A.M.; Gao, W.; Sun, S.; Wang, M. Soybean seed counting based on pod image using two-column convolution neural network. IEEE Access. 2019, 7, 64177–64185. [Google Scholar] [CrossRef]
Yang, S.; Zheng, L.; He, P.; Wu, T.; Sun, S.; Wang, M. High-throughput soybean seeds phenotyping with convolutional neural networks and transfer learning. Plant Met. 2021, 17, 50. [Google Scholar] [CrossRef] [PubMed]
Jubery, T.Z.; Carley, C.N.; Singh, A.; Sarkar, S.; Ganapathysubramanian, B.; Singh, A.K. Using Machine learning to develop a fully automated soybean nodule acquisition pipeline (SNAP). Plant Phenom. 2021, 2021, 9834746. [Google Scholar] [CrossRef] [PubMed]
Santana, D.C.; de-Oliveira-Cunha, M.P.; Dos-Santos, R.G.; Cotrim, M.F.; Teodoro, L.P.R.; da-Silva-Junior, C.A.; Baio, F.H.R.; Teodoro, P.E. High-throughput phenotyping allows the selection of soybean genotypes for earliness and high grain yield. Plant Methods 2022, 18, 13. [Google Scholar] [CrossRef] [PubMed]
Rousseau, C.; Belin, E.; Bove, E.; Rousseau, D.; Fabre, F.; Berruyer, R.; Guillaumes, J.; Manceau, C.; Jacques, M.A.; Boureau, T. High throughput quantitative phenotyping of plant resistance using chlorophyll fluorescence image analysis. Plant Methods 2013, 9, 17. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Feher-Juhasz, E.; Majer, P.; Sass, L.; Lantos, C.; Csiszar, J.; Turoczy, Z.; Mihaly, R.; Mai, A.; Horvath, G.V.; Vass, I.; et al. Phenotyping shows improved physiological traits and seed yield of transgenic wheat plants expressing the alfalfa aldose reductase under permanent drought stress. Acta Physiol. Plant 2014, 36, 663–673. [Google Scholar] [CrossRef] [Green Version]
Hu, H.; Zhang, J.Z.; Sun, X.Y.; Zhang, X.M. Estimation of leaf chlorophyll content of rice using image color analysis. Can J. Remote Sens. 2013, 39, 185–190. [Google Scholar] [CrossRef]
Matsuda, O.; Tanaka, A.; Fujita, T.; Iba, K. Hyperspectral imaging techniques for rapid identification of Arabidopsis mutants with altered leaf pigment status. Plant Cell Physiol. 2012, 53, 1154–1170. [Google Scholar] [CrossRef] [Green Version]
Orel, V.; Wood, R. Early developments in artificial selection as a background to Mendel’s research. Hist. Phil. Life Sci. 1981, 3, 145–170. Available online: https://www.jstor.org/stable/23328309 (accessed on 22 January 2022).
Watson, J.D.; Crick, F.H. Molecular structure of nucleic acids: A structure for deoxyribose nucleic acid. Nature 1953, 171, 737–738. [Google Scholar] [CrossRef]
Hasan, N.; Choudhary, S.; Naaz, N.; Sharma, N.; Laskar, R.A. Recent advancements in molecular marker-assisted selection and applications in plant breeding programmes. J. Genet. Eng. Biotechnol. 2021, 19, 128. [Google Scholar] [CrossRef]
Hartung, F.; Schiemann, J. Precise plant breeding using new genome editing techniques: Opportunities, safety and regulation in the EU. Plant J. 2014, 78, 742–752. [Google Scholar] [CrossRef] [PubMed]
Pham, A.T.; Lee, J.D.; Shannon, J.G.; Bilyeu, K.D. Mutant alleles of FAD2-1A and FAD2-1B combine to produce soybeans with the high oleic acid seed oil trait. BMC Plant Biol. 2010, 10, 195. [Google Scholar] [CrossRef] [Green Version]
Hagely, K.; Konda, A.R.; Kim, J.H.; Cahoon, E.B.; Bilyeu, K. Molecular-assisted breeding for soybean with high oleic/low linolenic acid and elevated vitamin E in the seed oil. Mol. Breed. 2021, 41, 3. [Google Scholar] [CrossRef]
Pham, A.T.; Shannon, J.G.; Bilyeu, K.D. Combinations of mutant FAD2 and FAD3 genes to produce high oleic acid and low linolenic acid soybean oil. Theor. Appl. Genet. 2012, 125, 503–515. [Google Scholar] [CrossRef] [PubMed]
Hagely, K.B.; Jo, H.; Kim, J.H.; Hudson, K.A.; Bilyeu, K. Molecular-assisted breeding for improved carbohydrate profiles in soybean seed. Theor. Appl. Genet. 2020, 133, 1189–1200. [Google Scholar] [CrossRef] [PubMed]
Miranda, C.; Culp, C.; Škrabišová, M.; Joshi, T.; Belzile, F.; Grant, D.M.; Bilyeu, K. Molecular tools for detecting Pdh1 can improve soybean breeding efficiency by reducing yield losses due to pod shatter. Mol. Breed. 2019, 39, 27. [Google Scholar] [CrossRef]
Kou, K.; Yang, H.; Li, H.; Fang, C.; Chen, L.; Yue, L.; Nan, H.; Kong, L.; Li, X.; Wang, F.; et al. A functionally divergent SOC1 homolog improves soybean yield and latitudinal adaptation. Curr. Biol. 2022, 32, 1728–1742.e6. [Google Scholar] [CrossRef]
Jing, H.C.; Tian, Z.X.; Chong, K.; Li, J.Y. Progress and perspective of molecular design breeding. Sci. China Life Sci. 2021, 51, 1356. [Google Scholar] [CrossRef]
Wang, L.; Guo, Y.; Yang, S.H. Designed breeding for adaptation of crops to environmental abiotic stresses. Sci. China Life Sci. 2021, 51, 1424–1434. [Google Scholar] [CrossRef]
OuYang, Y.D.; Chen, L.T. Fertility Regulation and Molecular Design Hybrid Breeding in Crops. Sci. China Life Sci. 2021, 51, 1385. [Google Scholar] [CrossRef]

Figure 1. Numbers of PubMed references for different types of omics techniques.

Figure 2. Multi-omics approaches for soybean molecular breeding.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Cao, P.; Zhao, Y.; Wu, F.; Xin, D.; Liu, C.; Wu, X.; Lv, J.; Chen, Q.; Qi, Z. Multi-Omics Techniques for Soybean Molecular Breeding. Int. J. Mol. Sci. 2022, 23, 4994. https://doi.org/10.3390/ijms23094994

AMA Style

Cao P, Zhao Y, Wu F, Xin D, Liu C, Wu X, Lv J, Chen Q, Qi Z. Multi-Omics Techniques for Soybean Molecular Breeding. International Journal of Molecular Sciences. 2022; 23(9):4994. https://doi.org/10.3390/ijms23094994

Chicago/Turabian Style

Cao, Pan, Ying Zhao, Fengjiao Wu, Dawei Xin, Chunyan Liu, Xiaoxia Wu, Jian Lv, Qingshan Chen, and Zhaoming Qi. 2022. "Multi-Omics Techniques for Soybean Molecular Breeding" International Journal of Molecular Sciences 23, no. 9: 4994. https://doi.org/10.3390/ijms23094994

APA Style

Cao, P., Zhao, Y., Wu, F., Xin, D., Liu, C., Wu, X., Lv, J., Chen, Q., & Qi, Z. (2022). Multi-Omics Techniques for Soybean Molecular Breeding. International Journal of Molecular Sciences, 23(9), 4994. https://doi.org/10.3390/ijms23094994

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Multi-Omics Techniques for Soybean Molecular Breeding

Abstract

1. Introduction

2. Multi-Omics Research Progress

2.1. Soybean Genomics Research Progress

2.2. Soybean Transcriptomics Research Progress

2.3. Soybean Proteomics Research Progress

2.4. Soybean Metabolomics Research Progress

2.5. Soybean Phenomics Research Progress

3. Molecular Breeding in Soybean

4. Further Perspectives

5. Conclusions

6. Supplementary Research Methodology

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI