Article

Impact of Diffraction Data Volume on Data Quality in Serial Crystallography

College of General Education, Kookmin University, Seoul 02707, Republic of Korea
Crystals 2025, 15(2), 104; https://doi.org/10.3390/cryst15020104
Submission received: 31 December 2024 / Revised: 17 January 2025 / Accepted: 19 January 2025 / Published: 21 January 2025
(This article belongs to the Section Biomolecular Crystals)

Abstract

Serial crystallography (SX) enables macromolecular structure determination at biologically relevant temperatures while minimizing radiation damage. This technique relies on processing numerous diffraction images from multiple crystals to construct a complete dataset for three-dimensional structure determination. Although increasing the volume of SX diffraction data improves data quality, excessive data collection reduces beamtime efficiency. Therefore, understanding the relationship between data volume and data quality is crucial for the efficient use of SX beamtime. In this study, serial synchrotron crystallography datasets from lysozyme and glucose isomerase were analyzed to assess the impact of varying diffraction data volumes on processing statistics and structural determination outcomes. Data processing statistics and structure refinement metrics improved as the volume of integrated diffraction data increased; however, the rate of improvement in data quality was not proportional to the number of integrated diffraction patterns. Furthermore, the rate of improvement in data processing statistics decreased beyond a certain threshold volume. These findings expand our understanding of SX data processing and provide insights into optimizing the efficiency of data processing.

1. Introduction

Serial crystallography (SX), utilizing X-ray free-electron lasers (XFELs) or synchrotron X-rays, enables the structural determination of biological and chemical molecules with minimal radiation damage [1,2,3,4]. In serial femtosecond crystallography (SFX) using XFEL, the crystal sample is exposed to the XFEL beam for tens of femtoseconds, whereas in serial synchrotron crystallography (SSX) using synchrotron X-rays, the crystal is exposed to radiation for timescales on the order of nanoseconds to microseconds [5,6]. Consequently, the radiation damage to the crystal is significantly lower compared to traditional macromolecular crystallography (MX) [7]. This indicates that SX techniques can provide more detailed structural information while preserving radiation-sensitive elements of the structure [7]. Additionally, typical data collection in both SFX and SSX experiments is performed at room temperature or in a controlled environment at the desired temperature [8]. The room-temperature structures of biomolecules provide more biologically relevant molecular flexibility compared to the structures of biomolecules obtained at cryogenic temperatures [6]. Therefore, this approach provides better biologically relevant structural information than conventional MX or cryo-electron microscopy techniques [9,10,11]. Furthermore, SX combined with optical lasers facilitates pump-probe experiments to elucidate light-induced structural changes in photoactive proteins [12,13,14]. This contributes to understanding the structural changes of photoactive proteins on fast timescales, such as femtoseconds to nanoseconds [15]. Additionally, the use of the chemical mix-and-inject method in SX offers structural insights into reaction mechanisms during protein–substrate interactions [16,17]. This method offers valuable information on time-dependent structural changes on timescales such as milliseconds, including events like substrate binding or inhibitor binding at the active site [18,19].
In conventional MX, diffraction data for determining three-dimensional (3D) structures are collected by rotating a single crystal under X-ray exposure [20,21,22]. When the crystal’s space group is known, the required degree of rotation can be calculated to achieve a complete dataset [23,24,25]. Although specific requirements can vary depending on the space group and diffraction intensity, rotating the crystal 360° typically provides a complete diffraction dataset [24].
By contrast, in SFX and SSX experiments, each crystal is exposed only once, to a single X-ray pulse or a single short X-ray exposure, respectively, and therefore yields only partial diffraction information [1,3,26,27]. A complete diffraction dataset is typically assembled by processing thousands to hundreds of thousands of diffraction images [28,29] collected using sample delivery methods, such as injectors or fixed-target scanning [30,31,32,33,34,35,36,37,38], which deliver a large number of crystals to the X-ray interaction point. The complete dataset is constructed through Monte Carlo integration [7,39]. Unlike MX, SX has no standardized number of required images for 3D structure determination. In general, processing larger numbers of diffraction images in SX improves data quality metrics, such as signal-to-noise ratio (SNR) and correlation coefficient (CC). Accordingly, SX experiments usually aim to collect as many diffraction images as possible to enhance data quality. However, beamtime at XFEL or synchrotron facilities is often limited [40,41], necessitating efficient data collection strategies to acquire sufficient amounts of data. Despite its importance, no systematic studies have analyzed the relationship between data volume and data quality in SX experiments.
This study examines the effects of integrated diffraction data volume on data processing statistics and crystal structure determination using SSX datasets of lysozyme and glucose isomerase (GI). The findings broaden our understanding of the role of data volume in SX and contribute to developing data collection strategies for experiments with limited SX beamtime.

2. Materials and Methods

2.1. Data Preparation

Hen egg white lysozyme (HEWL, Cat No. L6876) was purchased from Sigma-Aldrich (St. Louis, MO, USA). Crystallization for HEWL was performed following a previously described method [42]. A solution of lysozyme (30 mg/mL) in a buffer containing 10 mM Tris-HCl (pH 8.0) and 200 mM NaCl was mixed with a crystallization solution composed of 100 mM sodium acetate (pH 4.0), 8% (w/v) PEG 8000, and 2 M NaCl in a 1.5 mL microcentrifuge tube. The mixture was vortexed for 30 s and incubated at 20 °C overnight. The resulting crystals had sizes of approximately 20–30 μm.
The lysozyme crystal suspension (40 µL) was mixed with monoolein (60 µL) using a dual-syringe setup. The syringe containing the lysozyme crystals embedded in lipidic cubic phase (LCP) medium was connected with a needle of 170 μm internal diameter. To prevent solution evaporation, the needle tip was sealed with Parafilm and stored at 20 °C until data collection.

2.2. Data Collection

SSX experiments were conducted at beamline 11C of the Pohang Accelerator Laboratory, Republic of Korea [43]. The X-ray wavelength was 0.9774 Å. The X-ray beam size at the sample position was 4.5 μm (vertical) × 8 μm (horizontal) (full-width at half maximum). The photon flux was approximately 5 × 10^11 photons/s. The detector-to-crystal distance was 300 mm. The syringe containing the LCP medium was mounted on a syringe pump in the experimental hutch [44]. The LCP medium was extruded from the syringe by the syringe pump at a flow rate of 200 nL/min. X-ray diffraction data were collected at 25 °C using a Pilatus 6M detector (DECTRIS, Baden-Dättwil, Switzerland). The X-ray exposure time for each acquired image was 100 ms.

2.3. Data Processing

For the lysozyme dataset, diffraction images containing Bragg peaks were identified and filtered using the Cheetah [45] program. The hit images were then processed using the CrystFEL [46] program with the XGANDALF [47] indexing algorithm. During indexing, the detector geometry was refined using the optimiser [48]. Scaling of the diffraction images was performed with varying numbers of diffraction patterns. For glucose isomerase, previously collected SSX diffraction data [49] were retrieved from the Coherent X-ray Imaging Data Bank [50]. These datasets were processed using the same procedures as for lysozyme datasets.

2.4. Structure Determination

Molecular replacement (MR) was performed using Phase-MR in PHENIX [51], with the crystal structures of lysozyme (PDB code 7DTF) [52] and GI (6IRK) [53] used as search models. The MR solutions were refined using phenix.refine in PHENIX [51], with automatic addition of water molecules under default parameters. The model structures of lysozyme were built using the COOT program [54]. The structure refinement parameters were identical for all datasets. Final structures were validated using MolProbity [55]. Structure and electron density maps were visualized with PyMOL (www.pymol.org; accessed on 22 November 2024).

3. Results

3.1. Processing of Lysozyme SSX Data Using Varying Volumes of Diffraction Images

To understand the effect of diffraction data volume on data quality, SSX data were collected using lysozyme as a model sample. A total of 50,000 images were collected, with 21,724 hit images containing Bragg peaks, yielding a hit rate of 43.44%. These hit images were further indexed, resulting in 20,765 indexed images and 26,432 diffraction patterns, with a multi-crystal hit rate of 127.26%. Using criteria of SNR > 1 and CC1/2 > 0.5 at the highest resolution [56], the entire dataset (referred to as Lys-All) was processed to a resolution of 1.45 Å. The data processing statistics for Lys-All showed an overall SNR of 9.6 (high-resolution shell: 1.33), CC1/2 of 0.9939 (0.5428), and Rsplit of 6.59% (80.09%) (Table 1). To investigate the effect of diffraction data volume on data quality, the Lys-All dataset was divided into subsets containing 2000, 3000, 5000, 10,000, 15,000, 20,000, and 25,000 diffraction patterns (referred to as Lys-N, where N is the number of diffraction patterns) (Table 1). The diffraction patterns for each subset were taken sequentially from the processed diffraction patterns of Lys-All. Therefore, each larger subset contains all the diffraction patterns of the smaller subsets.
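The sequential subsetting described above can be sketched as follows. This is a minimal illustration, not the author's processing script: the pattern identifiers and the function name `nested_subsets` are hypothetical, and only the subset sizes and the total of 26,432 patterns come from the text.

```python
from typing import Dict, List, Sequence

# Subset sizes quoted in the text for the lysozyme data.
LYS_SUBSET_SIZES = [2000, 3000, 5000, 10_000, 15_000, 20_000, 25_000]


def nested_subsets(patterns: Sequence[str], sizes: Sequence[int]) -> Dict[str, List[str]]:
    """Return prefix subsets: the first n patterns for each requested n.

    Because every subset is a prefix of the same ordered list, each larger
    subset contains all patterns of every smaller one.
    """
    subsets: Dict[str, List[str]] = {}
    for n in sizes:
        if n > len(patterns):
            raise ValueError(f"requested {n} patterns, only {len(patterns)} available")
        subsets[f"Lys-{n}"] = list(patterns[:n])
    return subsets


# Hypothetical identifiers standing in for the 26,432 indexed patterns.
all_patterns = [f"pattern_{i:05d}" for i in range(26_432)]
subsets = nested_subsets(all_patterns, LYS_SUBSET_SIZES)
```

Taking prefixes keeps the comparison between subsets simple, but it also means the subsets are not statistically independent of one another.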
The overall data processing results demonstrated that increasing the volume of integrated diffraction data improved redundancy, SNR, CC1/2, and CC* values while reducing the Rsplit value (Figure 1). This indicates that a larger diffraction data volume enhances data statistics. However, when results processed in increments of 5000 diffraction patterns were compared, the rate of improvement was not strictly proportional to the increase in diffraction pattern volume.
Both the overall and highest-resolution shell redundancy values increased with diffraction data volume (Figure 1A). The increases in overall redundancy for Lys-10000, Lys-15000, Lys-20000, and Lys-25000 compared to the preceding dataset (5000 fewer diffraction patterns) were 106.11%, 58.83%, 45.43%, and 16.99%, respectively. The increases in highest-resolution shell redundancy for Lys-10000, Lys-15000, Lys-20000, and Lys-25000 were 106.33%, 59.14%, 45.46%, and 16.72%, respectively. This indicates that the rates of increase in redundancy values at the overall and highest-resolution shells were similar. However, the rate of increase in redundancy diminished as the integrated diffraction data volume increased.
The overall SNR values for Lys-5000, Lys-10000, Lys-15000, Lys-20000, and Lys-25000 were 4.72, 6.56, 7.55, 8.37, and 9.36, respectively (Figure 1B). The rates of increase in SNR for Lys-10000, Lys-15000, Lys-20000, and Lys-25000, compared to the preceding dataset (5000 fewer diffraction patterns), were 38.98%, 15.09%, 10.86%, and 11.59%, respectively, showing a decreasing trend in SNR improvement as the data volume increases. Meanwhile, the highest-resolution SNR values for Lys-5000, Lys-10000, Lys-15000, Lys-20000, and Lys-25000 were 0.89, 1.20, 1.22, 1.18, and 1.29, respectively, showing no or minimal improvement beyond Lys-10000.
CC1/2 and CC* values improved with increasing diffraction data volume in both overall and highest-resolution shells (Figure 1C,D). Although the data quality metrics improved significantly at the highest resolution, the rates of improvement were not directly proportional to the additional integrated diffraction data.
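The text does not define CC*. In the commonly used definition (Karplus and Diederichs), it is derived from CC1/2 as CC* = sqrt(2 CC1/2 / (1 + CC1/2)). The sketch below applies that standard formula to the Lys-All CC1/2 values quoted in Section 3.1; the function name is illustrative, and values reported by a processing program may differ slightly because of how they are binned and rounded.

```python
import math


def cc_star(cc_half: float) -> float:
    """Estimate CC* from CC1/2 using CC* = sqrt(2*CC1/2 / (1 + CC1/2))."""
    if not 0.0 <= cc_half <= 1.0:
        raise ValueError("this sketch only handles CC1/2 in the range [0, 1]")
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


# CC1/2 values for Lys-All quoted in the text: overall and highest-resolution shell.
overall_cc_star = cc_star(0.9939)
high_shell_cc_star = cc_star(0.5428)
print(f"overall CC* ~ {overall_cc_star:.4f}, highest shell CC* ~ {high_shell_cc_star:.4f}")
```

Because the mapping is monotonic, CC* rises whenever CC1/2 rises, which is consistent with the two metrics improving together.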
The overall Rsplit values for Lys-5000, Lys-10000, Lys-15000, Lys-20000, and Lys-25000 were 14.86, 10.74, 8.82, 7.50, and 6.79, respectively, showing a decreasing trend with increasing data volume (Figure 1E). The reduction rates for Lys-10000, Lys-15000, Lys-20000, and Lys-25000, compared to the preceding dataset with 5000 fewer diffraction patterns, were 27.72%, 17.87%, 14.96%, and 9.46%, respectively. For the highest-resolution shell, the Rsplit values for Lys-5000, Lys-10000, Lys-15000, Lys-20000, and Lys-25000 were 87.70, 87.05, 91.03, 82.69, and 80.09, respectively—also exhibiting a general decreasing trend.
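The step-to-step percentages quoted above can be approximately reproduced from the listed statistics. The following is a small sketch using the overall SNR and Rsplit values given in the text; the helper name `step_changes` is an assumption, and results may differ slightly from the quoted percentages, which were presumably derived from unrounded statistics.

```python
from typing import List, Sequence


def step_changes(values: Sequence[float]) -> List[float]:
    """Percent change of each value relative to the value before it."""
    return [100.0 * (b - a) / a for a, b in zip(values, values[1:])]


# Overall statistics for Lys-5000 through Lys-25000, as listed in the text.
snr = [4.72, 6.56, 7.55, 8.37, 9.36]
rsplit = [14.86, 10.74, 8.82, 7.50, 6.79]

snr_gain = step_changes(snr)                      # positive = SNR improved
rsplit_drop = [-c for c in step_changes(rsplit)]  # positive = Rsplit reduced

for label, gain, drop in zip(["10000", "15000", "20000", "25000"], snr_gain, rsplit_drop):
    print(f"Lys-{label}: SNR +{gain:.2f}%, Rsplit -{drop:.2f}%")
```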
The Wilson B-factor analysis showed a gradual increase in the B-factor as the processed data volume increased (Figure 1F). The overall data processing statistics improved with an increase in the integrated data volume. However, the quality of the data did not increase proportionally to the data volume.

3.2. Data Processing Statistics for Glucose Isomerase Datasets

The relationship between data quality and the number of diffraction images can vary depending on the sample. The impact of data volume on data quality was therefore further assessed using a previously reported GI diffraction dataset obtained via fixed-target SSX experiments [49]. From this dataset, 25,099 diffraction images were used for data processing, which produced 39,657 diffraction patterns with a multi-crystal hit rate of approximately 158%. After integrating the entire diffraction dataset (referred to as GI-All), the GI data were processed to 1.60 Å resolution based on the criteria of SNR > 1 and CC1/2 > 0.5 at the highest resolution. Data processing statistics for GI-All showed an SNR of 3.12 (highest shell: 1.52), CC1/2 of 0.9060 (0.6385), and Rsplit of 26.09% (64.76%) (Table 2). The GI-All dataset was further divided into subsets containing 5000, 10,000, 15,000, 20,000, 25,000, 30,000, and 35,000 diffraction patterns (referred to as GI-N, where N is the number of diffraction patterns) to investigate the effect of diffraction data volume on data quality. Similar to the lysozyme dataset, processing statistics such as redundancy, SNR, CC1/2, CC*, and Rsplit improved with increasing data volume (Figure 2 and Table 2).
The increases in overall (high-resolution shell) redundancy for the GI-10000, GI-15000, GI-20000, GI-25000, GI-30000, GI-35000, and GI-All datasets, compared with GI-5000, were 274% (277%), 421% (425%), 656% (664%), 789% (797%), 958% (967%), 1167% (1167%), and 1337% (1180%), respectively (Figure 2A). These results indicate that both the overall and highest-resolution shell redundancy values gradually increased as the diffraction data volumes grew.
The overall SNR values for GI-5000, GI-10000, GI-15000, GI-20000, GI-25000, GI-30000, GI-35000, and GI-All were 1.35, 1.75, 2.12, 2.61, 2.77, 2.90, 3.02, and 3.12, respectively. The rates of increase in overall SNR for GI-10000, GI-15000, GI-20000, GI-25000, GI-30000, GI-35000, and GI-All, compared with the preceding dataset, were 29.62%, 21.14%, 23.11%, 6.13%, 4.69%, 4.13%, and 3.31%, respectively. This indicates that the overall SNR values did not increase significantly beyond the GI-20000 dataset. Meanwhile, the highest-resolution SNR values for GI-5000, GI-10000, GI-15000, GI-20000, GI-25000, GI-30000, GI-35000, and GI-All were 0.50, 0.76, 1.03, 1.42, 1.48, 1.50, 1.50, and 1.52, respectively, with no significant improvement in SNR beyond the GI-20000 dataset. The overall CC1/2 and CC* values improved with increasing volume of the diffraction data (Figure 2C,D). Meanwhile, the CC1/2 and CC* values at high resolution increased up to the GI-20000 dataset, after which no further improvement was observed (Figure 2C,D).
The overall Rsplit values of the GI datasets decreased as more diffraction patterns were integrated. However, the Rsplit values at high-resolution shell decreased up to the GI-20000 dataset, after which no further improvement was observed (Figure 2E). Meanwhile, the Wilson B-factor values of the GI datasets decreased up to the GI-20000 dataset but increased beyond this dataset (Figure 2F).

3.3. Structure Determination of Lysozyme Datasets

To understand how the different volumes of diffraction data affect structure determination, the crystal structures of lysozyme and GI were determined using MR. For lysozyme, MR was performed at 1.45 Å resolution, and all datasets successfully yielded MR solutions. The top log-likelihood gain (LLG)/translation function Z-score (TFZ) values for Lys-2000, Lys-3000, Lys-5000, Lys-10000, Lys-15000, Lys-20000, Lys-25000, and Lys-All were 8101/64.0, 8805/66.3, 9519/67.7, 10,329/69.7, 10,411/69.4, 10,442/69.3, 10,588/69.6, and 10,632/69.7, respectively (Table 3).
The results show a significant increase in LLG and TFZ scores for datasets from Lys-2000 to Lys-10000, with increasing volume of diffraction data. Beyond Lys-10000, the rate of increase in these scores diminished (Figure 3A). Structure refinement results revealed that the overall Rwork and Rfree values for all lysozyme datasets were <0.2060 and <0.2281, respectively (Figure 3B), which lie within the typical range for reliable macromolecular crystallography structures. Both values showed a decreasing trend with increasing volume of diffraction data (Figure 3B). Meanwhile, at the high-resolution shell, the Rwork values decreased with increasing data volume, whereas the Rfree values showed no consistent decreasing trend (Figure 3C). During refinement, water molecules were automatically added to the lysozyme structure, and the number of defined water molecules ranged from 89 to 91, with no observable correlation to diffraction data volume (Table 3). The average B-factor values for protein and water molecules revealed no significant trends with respect to dataset volumes (Figure 3D).
After structure refinement, electron density maps for each lysozyme dataset were compared to assess whether diffraction data volume affected the electron density map quality. The results showed that, while the data processing and structure refinement statistics differed with diffraction image volume, no significant differences were observed in the quality of the electron density maps for the lysozyme datasets (Figure 4).

3.4. Structure Determination of Glucose Isomerase Datasets

For GI datasets, MR was performed at a resolution of 1.60 Å, and all datasets successfully yielded MR solutions. The top LLG/TFZ values for GI-5000, GI-10000, GI-15000, GI-20000, GI-25000, GI-30000, GI-35000, and GI-All were 11,961/64.8, 15,521/70.1, 18,945/73.5, 22,264/75.5, 23,582/75.8, 24,522/75.9, 25,337/76.1, and 26,028/76.3, respectively (Table 4).
GI-5000, GI-10000, and GI-15000 showed consistent increases in LLG and TFZ scores with increasing number of diffraction images, whereas beyond GI-20000, these scores exhibited lower rates of increase (Figure 5A). Structure refinement results revealed that the overall Rwork values for GI-5000 and GI-10000 were > 0.25 (Figure 5B), falling outside the reliable structural range for R-values in macromolecular crystallography at this resolution. In contrast, for datasets beyond GI-15000, the Rwork and Rfree values were <0.2457 and <0.2591, respectively. The overall Rwork and Rfree values decreased as the volume of diffraction data increased. However, at the high-resolution shell, no significant improvement in R-values was observed after GI-15000, despite the increase in the volume of diffraction data (Figure 5C). The numbers of defined water molecules for GI-5000, GI-10000, GI-15000, GI-20000, GI-25000, GI-30000, GI-35000, and GI-All were 292, 297, 296, 303, 300, 315, 333, and 335, respectively. This indicates that the improvement in data quality with increasing diffraction data volume enabled the definition of more water molecules.
The B-factor analysis revealed no significant trends in the B-factors of protein and water molecules with increasing data volume (Figure 5D). GI contains two metal ion-binding sites in its active site, involved in substrate binding and isomerization reactions. The B-factor values of Mg2+ at the metal-binding sites of GI also revealed no significant trends with the increasing volume of diffraction data (Figure 5D).
After structure refinement, the electron density maps for the GI datasets were compared. While the data processing and structure refinement statistics differed with diffraction image volume, no significant differences were observed in the quality of the electron density maps across datasets (Figure 6).

4. Discussion

In SX, integrating a large number of diffraction images is crucial for accurately determining 3D structures. Theoretically, increasing the number of integrated images enhances data redundancy, leading to improvements in completeness, SNR, CC1/2, and Rsplit [57,58]. This may contribute to high-resolution structures with enhanced statistical reliability, resulting in more accurate structure determination. However, beamtime at XFEL or synchrotron facilities required for SX experiments is extremely limited [40,41]. This is particularly true for XFEL facilities with low X-ray repetition rates or synchrotron sources where crystals are exposed to X-rays on the millisecond scale due to lower photon flux. In such cases, efficient data collection strategies are essential to maximize the use of restricted beamtime. Collecting excessive data without proper planning can result in inefficient beamtime use. It is thus crucial to perform real-time data processing during SX data collection to ensure the collection of only the amount of data needed to achieve the desired resolution and quality.
This study analyzed SSX datasets of lysozyme and GI to evaluate the impact of varying diffraction data volumes on data quality. The analysis revealed that increasing the diffraction data volume improved overall data processing metrics such as redundancy, SNR, CC1/2, and Rsplit. Notably, for lysozyme data volumes smaller than Lys-10000 and GI data volumes smaller than GI-20000, SNR, CC1/2, and Rsplit values gradually improved with increasing volumes of diffraction data. However, beyond these thresholds, the rates of improvement decreased. These findings suggest that continuously accumulating and integrating more diffraction data does not result in improvements in quality proportional to the increase in data volume. This enables the identification of a point in data quantity beyond which quality improvement slows, allowing the establishment of an efficient data collection strategy.
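One way to put the diminishing returns in context is to compare the observed SNR gains with a simple statistical reference. The sketch below assumes, as an idealization that is not stated in this study, that merging N independent measurements reduces random error in proportion to 1/sqrt(N), so that SNR would grow as sqrt(N). The observed values are the overall lysozyme SNR values quoted in Section 3.1; the function name is illustrative.

```python
import math


def sqrt_n_gain(n_prev: int, n_next: int) -> float:
    """Percent SNR gain expected if SNR scaled as sqrt(N) (idealized assumption)."""
    return 100.0 * (math.sqrt(n_next / n_prev) - 1.0)


# Pattern counts and overall SNR for Lys-5000 through Lys-25000, from the text.
sizes = [5000, 10_000, 15_000, 20_000, 25_000]
snr = [4.72, 6.56, 7.55, 8.37, 9.36]

for i in range(1, len(sizes)):
    expected = sqrt_n_gain(sizes[i - 1], sizes[i])
    observed = 100.0 * (snr[i] - snr[i - 1]) / snr[i - 1]
    print(f"{sizes[i - 1]} -> {sizes[i]}: sqrt(N) model +{expected:.1f}%, observed +{observed:.1f}%")
```

Under this idealization, each additional block of 5000 patterns is expected to contribute a smaller relative gain than the previous one, so a declining rate of improvement is not in itself a sign of a problem with the data. Real SX data need not follow the model closely, since systematic errors and crystal heterogeneity do not average out in the same way.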
The general macromolecular crystallographic criteria for Rwork and Rfree values of <25% and <30%, respectively, are considered reliable for structural determination [59], although lower R-values are generally preferred. While direct comparisons are difficult due to differences in data collection conditions and processing methods, the R-value in SX is typically higher than in MX. This is because SX usually involves data collection from multiple crystals at room temperature, which can result in relatively higher R-values due to crystal heterogeneity and molecular flexibility.
Based on these criteria, diffraction data volumes up to Lys-10000 and GI-20000 may be sufficient to yield reliable structural information, as they satisfy the criteria for Rwork and Rfree values. For instance, the SSX experiment conducted here using lysozyme crystals collected 26,432 diffraction patterns. If real-time data processing had been implemented during data collection, and 10,000 diffraction patterns had been judged sufficient to solve the structure, the beamtime required for data collection could have been reduced by more than half.
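The beamtime estimate above can be made concrete with a rough calculation. This sketch assumes that the yield of diffraction patterns per collected image stays constant throughout the run and that exposure time dominates collection time (detector readout and sample changes are ignored). The image and pattern counts and the 100 ms exposure come from Sections 2.2 and 3.1; the function name is hypothetical.

```python
import math

TOTAL_IMAGES = 50_000    # images collected for lysozyme
TOTAL_PATTERNS = 26_432  # diffraction patterns obtained from those images
EXPOSURE_S = 0.1         # 100 ms exposure per image


def images_needed(target_patterns: int, total_images: int, total_patterns: int) -> int:
    """Images required to reach a target pattern count at a constant yield."""
    return math.ceil(target_patterns * total_images / total_patterns)


needed = images_needed(10_000, TOTAL_IMAGES, TOTAL_PATTERNS)
full_minutes = TOTAL_IMAGES * EXPOSURE_S / 60.0
needed_minutes = needed * EXPOSURE_S / 60.0
saving = 1.0 - needed / TOTAL_IMAGES

print(f"images needed: {needed}")
print(f"exposure time: {needed_minutes:.1f} min instead of {full_minutes:.1f} min")
print(f"fraction of collection saved: {saving:.0%}")
```

Under these assumptions the saving is a little over 60%, consistent with the statement that the required beamtime could have been reduced by more than half.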
Meanwhile, lysozyme and GI were used as model samples, both exhibiting high diffraction intensity. If the diffraction data have lower intensity, obtaining complete high-resolution data may require merging a larger number of images compared to lysozyme and GI, which could result in different profiles for data statistics depending on the number of images. Therefore, understanding the trends in data quality as the number of images increases is more important than focusing solely on the absolute number of diffraction images used for lysozyme and GI in this study.
The Wilson B-factor reflects molecular flexibility or data quality [60,61]. Generally, as data quality improves, the Wilson B-factor tends to decrease. For lysozyme, the Wilson B-factor increased with increasing volume of diffraction data, while the GI dataset did not show a consistent trend. This suggests that, unlike SNR and CC1/2, which improve with increasing diffraction data in SX, the trend of Wilson B-factor values can vary, indicating that these values are not reliable as a criterion for investigating the impact of data volume on data quality.
All datasets for both lysozyme and GI yielded successful MR solutions, showing high LLG and TFZ scores. These high values were attributed to the strong similarity between the experimental data and the search models. Therefore, understanding the trends in the changes in LLG and TFZ scores is more important than focusing on their absolute values. Meanwhile, the trend of increasing LLG scores with greater diffraction data volume shows a similarity to the trends of improvements in SNR, CC1/2, and Rsplit with increasing volume of diffraction data.
For lysozyme data, there was no significant difference in the number of automatically added water molecules as the data volume increased. In contrast, for GI, an increasing trend in the number of water molecules was observed with the merging of more diffraction data. This suggests that in cases where accurate information on water molecules is critical, it is important to verify that sufficient diffraction data has been collected.
Meanwhile, the amount of diffraction data required in SX studies can vary based on experimental objectives, desired resolution, and specific statistics. In addition, the required amount of diffraction patterns is affected by factors such as the crystal’s diffraction intensity, space group, and the randomness of crystal orientations during X-ray exposure, resulting in different data processing statistics, including completeness. Therefore, rather than focusing on the volumes of diffraction data mentioned for lysozyme and GI, it is important to understand the trends in statistics for data processing and structural refinement based on the amount of integrated diffraction data.
This study provides an understanding of the relationship between data quantity and quality in SX, offering valuable insights into optimizing experimental design and the efficient use of beamtime.

Funding

This work was funded by the National Research Foundation of Korea (NRF) (NRF-2021R1I1A1A01050838).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The structure factor and coordinates have been deposited in the Protein Data Bank under the accession code 9L9R (lysozyme-All). Other structure factors and coordinates have been deposited in Zenodo (DOI: 10.5281/zenodo.14580403).

Acknowledgments

The author thanks the beamline staff at the 11C beamline at the Pohang Accelerator Laboratory for their assistance with data collection, and the Global Science experimental Data hub Center (GSDC) at the Korea Institute of Science and Technology Information (KISTI) for providing computing resources and technical support.

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SX: serial crystallography
SSX: serial synchrotron crystallography
SFX: serial femtosecond crystallography
MX: macromolecular crystallography
GI: glucose isomerase
SNR: signal-to-noise ratio
CC: correlation coefficient

References

  1. Chapman, H.N.; Fromme, P.; Barty, A.; White, T.A.; Kirian, R.A.; Aquila, A.; Hunter, M.S.; Schulz, J.; DePonte, D.P.; Weierstall, U.; et al. Femtosecond X-ray protein nanocrystallography. Nature 2011, 470, 73–77.
  2. Chapman, H.N.; Caleman, C.; Timneanu, N. Diffraction before destruction. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2014, 369, 20130313.
  3. Schriber, E.A.; Paley, D.W.; Bolotovsky, R.; Rosenberg, D.J.; Sierra, R.G.; Aquila, A.; Mendez, D.; Poitevin, F.; Blaschke, J.P.; Bhowmick, A.; et al. Chemical crystallography by serial femtosecond X-ray diffraction. Nature 2022, 601, 360–365.
  4. Nass, K.; Gorel, A.; Abdullah, M.M.; Martin, A.V.; Kloos, M.; Marinelli, A.; Aquila, A.; Barends, T.R.M.; Decker, F.-J.; Bruce Doak, R.; et al. Structural dynamics in proteins induced by and probed with X-ray free-electron laser pulses. Nat. Commun. 2020, 11, 1814.
  5. Boutet, S.; Lomb, L.; Williams, G.J.; Barends, T.R.M.; Aquila, A.; Doak, R.B.; Weierstall, U.; DePonte, D.P.; Steinbrener, J.; Shoeman, R.L.; et al. High-Resolution Protein Structure Determination by Serial Femtosecond Crystallography. Science 2012, 337, 362–364.
  6. Weinert, T.; Olieric, N.; Cheng, R.; Brunle, S.; James, D.; Ozerov, D.; Gashi, D.; Vera, L.; Marsh, M.; Jaeger, K.; et al. Serial millisecond crystallography for routine room-temperature structure determination at synchrotrons. Nat. Commun. 2017, 8, 542.
  7. Barends, T.R.M.; Stauch, B.; Cherezov, V.; Schlichting, I. Serial femtosecond crystallography. Nat. Rev. Methods Primers 2022, 2, 59.
  8. Martin-Garcia, J.M.; Conrad, C.E.; Coe, J.; Roy-Chowdhury, S.; Fromme, P. Serial femtosecond crystallography: A revolution in structural biology. Arch. Biochem. Biophys. 2016, 602, 32–47.
  9. Nam, K.H. Comparative Analysis of Room Temperature Structures Determined by Macromolecular and Serial Crystallography. Crystals 2024, 14, 276.
  10. Nam, K.H. Guide to serial synchrotron crystallography. Curr. Res. Struct. Biol. 2024, 7, 100131.
  11. Pegg, D.E. Principles of Cryopreservation. In Cryopreservation and Freeze-Drying Protocols; Methods in Molecular Biology; Springer: Humana Totowa, NJ, USA, 2007; pp. 39–57.
  12. Tenboer, J.; Basu, S.; Zatsepin, N.; Pande, K.; Milathianaki, D.; Frank, M.; Hunter, M.; Boutet, S.; Williams, G.J.; Koglin, J.E.; et al. Time-resolved serial crystallography captures high-resolution intermediates of photoactive yellow protein. Science 2014, 346, 1242–1246.
  13. Kupitz, C.; Basu, S.; Grotjohann, I.; Fromme, R.; Zatsepin, N.A.; Rendek, K.N.; Hunter, M.S.; Shoeman, R.L.; White, T.A.; Wang, D.; et al. Serial time-resolved crystallography of photosystem II using a femtosecond X-ray laser. Nature 2014, 513, 261–265.
  14. Hekstra, D.R. Emerging Time-Resolved X-Ray Diffraction Approaches for Protein Dynamics. Annu. Rev. Biophys. 2023, 52, 255–274.
  15. Westenhoff, S.; Meszaros, P.; Schmidt, M. Protein motions visualized by femtosecond time-resolved crystallography: The case of photosensory vs photosynthetic proteins. Curr. Opin. Struct. Biol. 2022, 77, 102481.
  16. Stagno, J.R.; Liu, Y.; Bhandari, Y.R.; Conrad, C.E.; Panja, S.; Swain, M.; Fan, L.; Nelson, G.; Li, C.; Wendel, D.R.; et al. Structures of riboswitch RNA reaction states by mix-and-inject XFEL serial crystallography. Nature 2016, 541, 242–246.
  17. Olmos, J.L., Jr.; Pandey, S.; Martin-Garcia, J.M.; Calvey, G.; Katz, A.; Knoska, J.; Kupitz, C.; Hunter, M.S.; Liang, M.; Oberthuer, D.; et al. Enzyme intermediates captured “on the fly” by mix-and-inject serial crystallography. BMC Biol. 2018, 16, 59.
  18. Park, J.; Nam, K.H. Recent chemical mixing devices for time-resolved serial femtosecond crystallography. TrAC Trends Anal. Chem. 2024, 172, 117554.
  19. Park, J.; Nam, K.H. Experimental approaches for time-resolved serial femtosecond crystallography at PAL-XFEL. In Time-Resolved Methods in Structural Biology; Methods in Enzymology; Academic Press: Cambridge, MA, USA, 2024; pp. 131–160.
  20. Powell, H.R. A beginner’s guide to X-ray data processing. Biochemist 2021, 43, 46–50.
  21. Maveyraud, L.; Mourey, L. Protein X-ray Crystallography and Drug Discovery. Molecules 2020, 25, 1030.
  22. Pusey, M.L.; Liu, Z.-J.; Tempel, W.; Praissman, J.; Lin, D.; Wang, B.-C.; Gavira, J.A.; Ng, J.D. Life in the fast lane for protein crystallization and X-ray crystallography. Prog. Biophys. Mol. Biol. 2005, 88, 359–386.
  23. Otwinowski, Z.; Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 1997, 276, 307–326.
  24. Dauter, Z. Data-collection strategies. Acta Crystallogr. D Biol. 1999, 55, 1703–1717.
  25. Evans, P.R. An introduction to data reduction: Space-group determination, scaling and intensity statistics. Acta Crystallogr. D Biol. 2011, 67, 282–292.
  26. White, T.A.; Kirian, R.A.; Martin, A.V.; Aquila, A.; Nass, K.; Barty, A.; Chapman, H.N. CrystFEL: A software suite for snapshot serial crystallography. J. Appl. Crystallogr. 2012, 45, 335–341. [Google Scholar] [CrossRef]
  27. Brehm, W.; Diederichs, K. Breaking the indexing ambiguity in serial crystallography. Acta Crystallogr. D Biol. 2013, 70, 101–109. [Google Scholar] [CrossRef]
  28. Chavas, L.M.G.; Gumprecht, L.; Chapman, H.N. Possibilities for serial femtosecond crystallography sample delivery at future light sources. Struct. Dyn. 2015, 2, 041709. [Google Scholar] [CrossRef]
  29. Nam, K.H. Hit and Indexing Rate in Serial Crystallography: Incomparable Statistics. Front. Mol. Biosci. 2022, 9, 858815. [Google Scholar] [CrossRef]
  30. DePonte, D.P.; Weierstall, U.; Schmidt, K.; Warner, J.; Starodub, D.; Spence, J.C.H.; Doak, R.B. Gas dynamic virtual nozzle for generation of microscopic droplet streams. J. Phys. D 2008, 41, 195505. [Google Scholar] [CrossRef]
  31. Weierstall, U.; Spence, J.C.H.; Doak, R.B. Injector for scattering measurements on fully solvated biospecies. Rev. Sci. Instrum. 2012, 83, 035108. [Google Scholar] [CrossRef]
  32. Weierstall, U.; James, D.; Wang, C.; White, T.A.; Wang, D.; Liu, W.; Spence, J.C.; Bruce Doak, R.; Nelson, G.; Fromme, P.; et al. Lipidic cubic phase injector facilitates membrane protein serial femtosecond crystallography. Nat. Commun. 2014, 5, 3309. [Google Scholar] [CrossRef]
  33. Hunter, M.S.; Segelke, B.; Messerschmidt, M.; Williams, G.J.; Zatsepin, N.A.; Barty, A.; Benner, W.H.; Carlson, D.B.; Coleman, M.; Graf, A.; et al. Fixed-target protein serial microcrystallography with an x-ray free electron laser. Sci. Rep. 2014, 4, 6026. [Google Scholar] [CrossRef] [PubMed]
  34. Grünbein, M.L.; Nass Kovacs, G. Sample delivery for serial crystallography at free-electron lasers and synchrotrons. Acta Crystallogr. D Biol. Crystallogr. 2019, 75, 178–191. [Google Scholar] [CrossRef] [PubMed]
  35. Zhao, F.Z.; Zhang, B.; Yan, E.K.; Sun, B.; Wang, Z.J.; He, J.H.; Yin, D.C. A guide to sample delivery systems for serial crystallography. FEBS J. 2019, 286, 4402–4417. [Google Scholar] [CrossRef] [PubMed]
  36. Martiel, I.; Muller-Werkmeister, H.M.; Cohen, A.E. Strategies for sample delivery for femtosecond crystallography. Acta Crystallogr. D Struct. Biol. 2019, 75, 160–177. [Google Scholar] [CrossRef] [PubMed]
  37. Fuller, F.D.; Gul, S.; Chatterjee, R.; Burgie, E.S.; Young, I.D.; Lebrette, H.; Srinivas, V.; Brewster, A.S.; Michels-Clark, T.; Clinger, J.A.; et al. Drop-on-demand sample delivery for studying biocatalysts in action at X-ray free-electron lasers. Nat. Methods 2017, 14, 4461. [Google Scholar] [CrossRef]
  38. Calvey, G.D.; Katz, A.M.; Zielinski, K.A.; Dzikovski, B.; Pollack, L. Characterizing Enzyme Reactions in Microcrystals for Effective Mix-and-Inject Experiments using X-ray Free-Electron Lasers. Anal. Chem. 2020, 92, 13864–13870. [Google Scholar] [CrossRef]
  39. Kirian, R.A.; Wang, X.; Weierstall, U.; Schmidt, K.E.; Spence, J.C.H.; Hunter, M.; Fromme, P.; White, T.; Chapman, H.N.; Holton, J. Femtosecond protein nanocrystallography—Data analysis methods. Optics Express 2010, 18, 5713–5723. [Google Scholar] [CrossRef]
  40. Johansson, L.C.; Stauch, B.; Ishchenko, A.; Cherezov, V. A Bright Future for Serial Femtosecond Crystallography with XFELs. Trends Biochem. Sci. 2017, 42, 749–762. [Google Scholar] [CrossRef]
  41. Lee, K.; Lee, D.; Park, J.; Lee, J.-L.; Chung, W.K.; Cho, Y.; Nam, K.H. Upgraded Combined Inject-and-Transfer System for Serial Femtosecond Crystallography. Appl. Sci. 2022, 12, 9125. [Google Scholar] [CrossRef]
  42. Nam, K.H. Real-time monitoring of large-scale crystal growth using batch crystallization for serial crystallography. J. Cryst. Growth 2023, 614, 127219. [Google Scholar] [CrossRef]
  43. Park, S.Y.; Ha, S.C.; Kim, Y.G. The Protein Crystallography Beamlines at the Pohang Light Source II. Biodesign 2017, 5, 30–34. [Google Scholar]
  44. Park, S.Y.; Nam, K.H. Sample delivery using viscous media, a syringe and a syringe pump for serial crystallography. J. Synchrotron Radiat. 2019, 26, 1815–1819. [Google Scholar] [CrossRef] [PubMed]
  45. Barty, A.; Kirian, R.A.; Maia, F.R.; Hantke, M.; Yoon, C.H.; White, T.A.; Chapman, H. Cheetah: Software for high-throughput reduction and analysis of serial femtosecond X-ray diffraction data. J. Appl. Crystallogr. 2014, 47, 1118–1131. [Google Scholar] [CrossRef] [PubMed]
  46. White, T.A.; Mariani, V.; Brehm, W.; Yefanov, O.; Barty, A.; Beyerlein, K.R.; Chervinskii, F.; Galli, L.; Gati, C.; Nakane, T.; et al. Recent developments in CrystFEL. J. Appl. Crystallogr. 2016, 49, 680–689. [Google Scholar] [CrossRef]
  47. Gevorkov, Y.; Yefanov, O.; Barty, A.; White, T.A.; Mariani, V.; Brehm, W.; Tolstikova, A.; Grigat, R.R.; Chapman, H.N. XGANDALF—Extended gradient descent algorithm for lattice finding. Acta Crystallogr. A Found. Adv. 2019, 75, 694–704. [Google Scholar] [CrossRef]
  48. Yefanov, O.; Mariani, V.; Gati, C.; White, T.A.; Chapman, H.N.; Barty, A. Accurate determination of segmented X-ray detector geometry. Opt. Express 2015, 23, 28459–28470. [Google Scholar] [CrossRef]
  49. Park, S.Y.; Choi, H.; Eo, C.; Cho, Y.; Nam, K.H. Fixed-target serial synchrotron crystallography using nylon mesh and enclosed film-based sample holder. Crystals 2020, 10, 803. [Google Scholar] [CrossRef]
  50. Maia, F.R.N.C. The Coherent X-ray Imaging Data Bank. Nat. Methods 2012, 9, 854–855. [Google Scholar] [CrossRef]
  51. Liebschner, D.; Afonine, P.V.; Baker, M.L.; Bunkoczi, G.; Chen, V.B.; Croll, T.I.; Hintze, B.; Hung, L.W.; Jain, S.; McCoy, A.J.; et al. Macromolecular structure determination using X-rays, neutrons and electrons: Recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 2019, 75, 861–877. [Google Scholar] [CrossRef]
  52. Nam, K.H.; Cho, Y. Stable sample delivery in a viscous medium via a polyimide-based single-channel microfluidic chip for serial crystallography. J. Appl. Crystallogr. 2021, 54, 1081–1087. [Google Scholar] [CrossRef]
  53. Lee, D.; Baek, S.; Park, J.; Lee, K.; Kim, J.; Lee, S.J.; Chung, W.K.; Lee, J.L.; Cho, Y.; Nam, K.H. Nylon mesh-based sample holder for fixed-target serial femtosecond crystallography. Sci. Rep. 2019, 9, 6971. [Google Scholar] [CrossRef] [PubMed]
  54. Emsley, P.; Cowtan, K. Coot: Model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 2004, D60, 2126–2132. [Google Scholar] [CrossRef] [PubMed]
  55. Williams, C.J.; Headd, J.J.; Moriarty, N.W.; Prisant, M.G.; Videau, L.L.; Deis, L.N.; Verma, V.; Keedy, D.A.; Hintze, B.J.; Chen, V.B.; et al. MolProbity: More and better reference data for improved all-atom structure validation. Protein Sci. 2018, 27, 293–315. [Google Scholar] [CrossRef] [PubMed]
  56. Evans, P. Resolving Some Old Problems in Protein Crystallography. Science 2012, 336, 986–987. [Google Scholar] [CrossRef]
  57. Xu, H.; Lebrette, H.; Yang, T.; Srinivas, V.; Hovmöller, S.; Högbom, M.; Zou, X. A Rare Lysozyme Crystal Form Solved Using Highly Redundant Multiple Electron Diffraction Datasets from Micron-Sized Crystals. Structure 2018, 26, 667–675.e663. [Google Scholar] [CrossRef]
  58. Wlodawer, A.; Minor, W.; Dauter, Z.; Jaskolski, M. Protein crystallography for aspiring crystallographers or how to avoid pitfalls and traps in macromolecular structure determination. FEBS J. 2013, 280, 5705–5736. [Google Scholar] [CrossRef]
  59. Rondeau, J.-M.; Schreuder, H. Protein Crystallography and Drug Discovery. In The Practice of Medicinal Chemistry; Academic Press: Cambridge, MA, USA, 2015; pp. 511–537. [Google Scholar]
  60. de Sá Ribeiro, F.; Lima, L.M.T.R. Linking B-factor and temperature-induced conformational transition. Biophys. Chem. 2023, 298, 107027. [Google Scholar] [CrossRef]
  61. Mlynek, G.; Djinović-Carugo, K.; Carugo, O. B-Factor Rescaling for Protein Crystal Structure Analyses. Crystals 2024, 14, 443. [Google Scholar] [CrossRef]
Figure 1. The profile of data processing statistics for lysozyme SSX data based on varying diffraction data volumes: (A) redundancy, (B) SNR, (C) CC1/2, (D) CC*, (E) Rsplit, and (F) Wilson B-factor. The overall (80.0–1.45 Å) and highest (1.50–1.45 Å) resolution shells are indicated by blue and red lines, respectively.
Figure 2. The profile of data processing statistics for GI SSX data based on varying diffraction data volumes: (A) redundancy, (B) SNR, (C) CC1/2, (D) CC*, (E) Rsplit, and (F) Wilson B-factor. The overall (72.46–1.60 Å) and highest (1.65–1.60 Å) resolution shells are indicated by blue and red lines, respectively.
Figure 3. The profile of MR and structure determination of lysozyme at various sample volumes. (A) The MR score for top LLG and TFZ for lysozyme datasets. Rwork/Rfree values of lysozyme datasets at (B) the overall resolution range (55.85–1.45 Å) and (C) the highest-resolution range (1.49–1.45 Å). (D) B-factor values for protein and water molecules for lysozyme datasets.
Figure 4. The 2Fo-Fc (blue mesh, 1σ) and Fo-Fc (green mesh, 3σ; red mesh, −3σ) electron density maps of lysozyme datasets.
Figure 5. The profile of MR and structure determination of GI at various sample volumes. (A) The MR score for top LLG and TFZ for GI datasets. Rwork/Rfree values of GI datasets at (B) the overall resolution range (72.46–1.60 Å) and (C) the highest-resolution range (1.64–1.60 Å). (D) B-factor values for protein, water, and Mg2+ for GI datasets.
Figure 6. The 2Fo-Fc (blue mesh, 1σ) and Fo-Fc (green mesh, 3σ; red mesh, −3σ) electron density maps of GI datasets depending on the volume of diffraction data.
Table 1. Data processing statistics for lysozyme datasets.

| Data Name | Lys-2000 | Lys-3000 | Lys-5000 | Lys-10000 | Lys-15000 | Lys-20000 | Lys-25000 | Lys-All |
|---|---|---|---|---|---|---|---|---|
| No. of images | 2000 | 3000 | 5000 | 10,000 | 15,000 | 20,000 | 25,000 | 26,432 |
| Resolution (Å) | 80–1.45 (1.50–1.45) | 80–1.45 (1.50–1.45) | 80–1.45 (1.50–1.45) | 80–1.45 (1.50–1.45) | 80–1.45 (1.50–1.45) | 80–1.45 (1.50–1.45) | 80–1.45 (1.50–1.45) | 80–1.45 (1.50–1.45) |
| Unit cell (Å) | | | | | | | | |
| a | 78.98 | 78.98 | 78.98 | 78.98 | 78.98 | 78.98 | 78.98 | 78.98 |
| b | 78.98 | 78.98 | 78.98 | 78.98 | 78.98 | 78.98 | 78.98 | 78.98 |
| c | 38.24 | 38.24 | 38.24 | 38.24 | 38.24 | 38.24 | 38.24 | 38.24 |
| Number of reflections | 1,048,602 (30,142) | 1,620,575 (46,465) | 2,721,401 (78,165) | 5,610,234 (161,211) | 8,912,635 (256,692) | 12,960,884 (373,287) | 15,163,328 (435,665) | 15,752,048 (452,449) |
| Redundancy | 47.3 (14.0) | 73.1 (21.6) | 122.7 (36.3) | 252.9 (74.9) | 401.7 (119.2) | 584.2 (173.4) | 683.5 (202.4) | 710.0 (210.1) |
| Number of unique reflections | 22,177 (2148) | 22,182 (2151) | 22,185 (2153) | 22,185 (2153) | 22,185 (2153) | 22,185 (2153) | 22,185 (2153) | 22,185 (2153) |
| Completeness (%) | 99.96 (99.77) | 99.99 (99.91) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) |
| SNR | 3.12 (0.59) | 3.72 (0.69) | 4.72 (0.89) | 6.56 (1.20) | 7.55 (1.22) | 8.37 (1.18) | 9.34 (1.29) | 9.6 (1.33) |
| Rsplit (%) a | 23.86 (199.67) | 19.79 (162.79) | 14.86 (124.95) | 10.74 (87.7) | 8.82 (87.05) | 7.50 (91.03) | 6.79 (82.69) | 6.59 (80.09) |
| CC1/2 | 0.9142 (0.1824) | 0.9422 (0.2304) | 0.9669 (0.3422) | 0.9818 (0.4757) | 0.9884 (0.4961) | 0.9922 (0.4841) | 0.9935 (0.5284) | 0.9939 (0.5428) |
| CC* | 0.9773 (0.5555) | 0.9850 (0.6120) | 0.9915 (0.7141) | 0.9954 (0.8029) | 0.9970 (0.8143) | 0.9980 (0.8077) | 0.9983 (0.8315) | 0.9984 (0.8388) |
| Wilson B-factor (Å²) | 24.66 | 24.71 | 24.65 | 24.90 | 25.35 | 25.86 | 25.99 | 26.00 |

Highest resolution shell is shown in parentheses.
a $R_{\mathrm{split}} = \frac{1}{\sqrt{2}} \cdot \frac{\sum_{hkl} \left| I_{hkl}^{\mathrm{even}} - I_{hkl}^{\mathrm{odd}} \right|}{\frac{1}{2} \sum_{hkl} \left( I_{hkl}^{\mathrm{even}} + I_{hkl}^{\mathrm{odd}} \right)}$.
b Rwork = Σ||Fobs| − |Fcalc||/Σ|Fobs|, where Fobs and Fcalc are the observed and calculated structure-factor amplitudes, respectively.
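The Rsplit definition in footnote a can be illustrated with a short Python sketch. It assumes the two half-dataset intensities are already merged and listed in the same reflection order; the function names and toy intensities are hypothetical and are not taken from CrystFEL. The `cc_star` helper uses the standard relation CC* = sqrt(2·CC1/2 / (1 + CC1/2)), which is not written out in the table but reproduces the tabulated CC* values from the CC1/2 values.

```python
import math


def r_split(i_even, i_odd):
    """Rsplit between two half-dataset intensity lists (same reflection order)."""
    if len(i_even) != len(i_odd):
        raise ValueError("half-datasets must contain the same reflections")
    # Numerator: summed absolute difference between the two half-datasets.
    numerator = sum(abs(a - b) for a, b in zip(i_even, i_odd))
    # Denominator: half of the summed intensities, i.e. the summed mean intensity.
    denominator = 0.5 * sum(a + b for a, b in zip(i_even, i_odd))
    return numerator / (math.sqrt(2) * denominator)


def cc_star(cc_half):
    """Estimate CC* from CC1/2 (assumed standard relation, see lead-in)."""
    return math.sqrt(2 * cc_half / (1 + cc_half))


# Toy intensities (hypothetical), followed by a check against Table 1 (Lys-2000).
print(f"Rsplit = {100 * r_split([12.0, 8.0], [8.0, 12.0]):.2f} %")
print(f"CC* for CC1/2 = 0.9142: {cc_star(0.9142):.4f}")
```

For the Lys-2000 overall CC1/2 of 0.9142, `cc_star` gives a value that rounds to the 0.9773 listed in Table 1.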
Table 2. Data processing statistics of GI.

| Data Name | GI-5000 | GI-10000 | GI-15000 | GI-20000 | GI-25000 | GI-30000 | GI-35000 | GI-All |
|---|---|---|---|---|---|---|---|---|
| Diffraction images | 5000 | 10,000 | 15,000 | 20,000 | 25,000 | 30,000 | 35,000 | 39,657 |
| Resolution (Å) | 72.46–1.60 (1.65–1.60) | 72.46–1.60 (1.65–1.60) | 72.46–1.60 (1.65–1.60) | 72.46–1.60 (1.65–1.60) | 72.46–1.60 (1.65–1.60) | 72.46–1.60 (1.65–1.60) | 72.46–1.60 (1.65–1.60) | 72.46–1.60 (1.65–1.60) |
| Unit cell (Å) | | | | | | | | |
| a | 64.45 | 64.45 | 64.45 | 64.45 | 64.45 | 64.45 | 64.45 | 64.45 |
| b | 100.26 | 100.26 | 100.26 | 100.26 | 100.26 | 100.26 | 100.26 | 100.26 |
| c | 103.54 | 103.54 | 103.54 | 103.54 | 103.54 | 103.54 | 103.54 | 103.54 |
| Number of reflections | 8,677,505 (603,600) | 23,815,611 (1,672,804) | 36,543,752 (2,570,291) | 56,945,902 (4,010,249) | 68,498,647 (4,815,125) | 83,098,370 (5,837,510) | 101,251,911 (7,126,384) | 116,012,239 (8,167,934) |
| Redundancy | 133.6 (93.7) | 366.8 (259.7) | 562.8 (399.1) | 877.1 (622.6) | 1055.0 (747.6) | 1279.9 (906.3) | 1559.4 (1106.4) | 1786.8 (1268.1) |
| Number of unique reflections | 64,928 (6441) | 64,928 (6441) | 64,928 (6441) | 64,928 (6441) | 64,928 (6441) | 64,928 (6441) | 64,928 (6441) | 64,928 (6441) |
| Completeness (%) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) | 100.00 (100.00) |
| SNR | 1.35 (0.50) | 1.75 (0.76) | 2.12 (1.03) | 2.61 (1.42) | 2.77 (1.48) | 2.90 (1.50) | 3.02 (1.50) | 3.12 (1.52) |
| Rsplit (%) | 59.2 (190.43) | 45.36 (125.81) | 38.88 (92.67) | 32.31 (66.07) | 29.89 (63.24) | 28.47 (63.59) | 27.07 (65.09) | 26.09 (64.76) |
| CC1/2 | 0.5877 (0.3153) | 0.7567 (0.4467) | 0.8136 (0.5422) | 0.8543 (0.6584) | 0.8741 (0.6737) | 0.8854 (0.6649) | 0.8979 (0.6384) | 0.9060 (0.6385) |
| CC* | 0.8604 (0.6924) | 0.9281 (0.7858) | 0.9472 (0.8385) | 0.9599 (0.8910) | 0.9658 (0.8972) | 0.9691 (0.8937) | 0.9727 (0.8828) | 0.9750 (0.8828) |
| Wilson B-factor (Å²) | 20.03 | 19.55 | 19.46 | 19.18 | 19.34 | 19.45 | 19.58 | 19.63 |

The highest resolution shell is shown in parentheses.
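The relationship between data volume and the tabulated statistics can be checked with a small Python sketch. It uses only the overall values transcribed from Table 2. It assumes that the multiplicity (redundancy) of each dataset equals the total number of reflections divided by the 64,928 unique reflections, and it expresses the change in Rsplit per 1000 additional images between consecutive datasets. The function and variable names are illustrative and are not part of any processing software.

```python
# Overall (all-resolution) values transcribed from Table 2 for the GI datasets.
IMAGES = [5000, 10000, 15000, 20000, 25000, 30000, 35000, 39657]
TOTAL_REFLECTIONS = [8677505, 23815611, 36543752, 56945902,
                     68498647, 83098370, 101251911, 116012239]
UNIQUE_REFLECTIONS = 64928
R_SPLIT = [59.2, 45.36, 38.88, 32.31, 29.89, 28.47, 27.07, 26.09]


def redundancy(total, unique):
    """Average number of measurements per unique reflection."""
    return total / unique


def change_per_1000_images(images, values):
    """Change in a statistic per 1000 extra images between consecutive datasets."""
    changes = []
    for i in range(1, len(images)):
        delta_images = images[i] - images[i - 1]
        changes.append((values[i] - values[i - 1]) / delta_images * 1000)
    return changes


redundancies = [redundancy(n, UNIQUE_REFLECTIONS) for n in TOTAL_REFLECTIONS]
rsplit_changes = change_per_1000_images(IMAGES, R_SPLIT)

for n_images, red in zip(IMAGES, redundancies):
    print(f"{n_images:>6} images: redundancy {red:.1f}")
for n_images, change in zip(IMAGES[1:], rsplit_changes):
    print(f"up to {n_images:>6} images: Rsplit change {change:+.2f} % per 1000 images")
```

With these values, Rsplit falls by roughly 2.8 percentage points per 1000 images between GI-5000 and GI-10000, but by only about 0.2 between GI-35000 and GI-All, which is one way to express the diminishing improvement with increasing data volume.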
Table 3. Structure refinement statistics for lysozyme.

| Dataset | Lys-2000 | Lys-3000 | Lys-5000 | Lys-10000 | Lys-15000 | Lys-20000 | Lys-25000 | Lys-All |
|---|---|---|---|---|---|---|---|---|
| MR phasing | | | | | | | | |
| Top LLG | 8101.55 | 8805.23 | 9519.74 | 10,329.20 | 10,411.67 | 10,442.43 | 10,588.95 | 10,632.07 |
| Top TFZ | 64 | 66.3 | 67.7 | 69.7 | 69.4 | 69.3 | 69.6 | 69.7 |
| Structure refinement | | | | | | | | |
| Resolution (Å) | 55.85–1.45 (1.49–1.45) | 55.85–1.45 (1.49–1.45) | 55.85–1.45 (1.49–1.45) | 55.85–1.45 (1.49–1.45) | 55.85–1.45 (1.49–1.45) | 55.85–1.45 (1.49–1.45) | 55.85–1.45 (1.49–1.45) | 55.85–1.45 (1.49–1.45) |
| Rwork | 0.2060 (0.4060) | 0.1970 (0.3958) | 0.1901 (0.3782) | 0.1828 (0.3476) | 0.1809 (0.3471) | 0.1806 (0.3549) | 0.1782 (0.3394) | 0.1789 (0.3415) |
| Rfree | 0.2281 (0.4358) | 0.2152 (0.4616) | 0.2102 (0.3677) | 0.2012 (0.3462) | 0.1911 (0.3908) | 0.1991 (0.3719) | 0.1936 (0.3456) | 0.1939 (0.3756) |
| No. of atoms | | | | | | | | |
| Protein | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 |
| Water | 89 | 89 | 91 | 91 | 88 | 81 | 93 | 89 |
| RMSD | | | | | | | | |
| Bond (Å) | 0.006 | 0.006 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 |
| Angle (°) | 0.901 | 0.863 | 0.845 | 0.854 | 0.867 | 0.840 | 0.851 | 0.811 |
| B-factor (Å²) | | | | | | | | |
| Protein | 25.77 | 25.78 | 25.60 | 25.60 | 26.12 | 26.39 | 26.59 | 26.35 |
| Water | 35.63 | 36.26 | 35.53 | 36.28 | 35.42 | 36.36 | 37.36 | 36.68 |
| Ramachandran plot (%) | | | | | | | | |
| Favored | 99.21 | 98.43 | 98.43 | 98.43 | 98.43 | 99.21 | 99.21 | 98.43 |
| Allowed | 0.79 | 1.57 | 1.57 | 1.57 | 1.57 | 0.79 | 0.79 | 1.57 |
| Outlier | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Highest resolution shell is shown in parentheses. Rfree was calculated as Rwork using a randomly selected subset of unique reflections not used for structure refinement.
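The R-factor definition used in these tables (Rwork = Σ||Fobs| − |Fcalc||/Σ|Fobs|, with Rfree evaluated in the same way on reflections excluded from refinement) can be sketched in Python as follows. This is a minimal illustration and not the Phenix implementation: the function names, the free-flag list, and the toy amplitudes are hypothetical, and no scale factor between Fobs and Fcalc is applied.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags):
    """Split reflections by free flag and return (Rwork, Rfree)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Toy structure-factor amplitudes (hypothetical values, for illustration only).
f_obs = [100.0, 50.0, 25.0, 10.0]
f_calc = [90.0, 55.0, 25.0, 12.0]
free_flags = [False, False, True, True]  # True marks the held-out (free) set

r_work, r_free = r_work_and_free(f_obs, f_calc, free_flags)
print(f"Rwork = {r_work:.4f}, Rfree = {r_free:.4f}")
```

In practice the free set is a small, randomly selected fraction of the unique reflections, as stated in the footnote to Table 3.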
Table 4. Structure refinement statistics for GI.

| Dataset | GI-5000 | GI-10000 | GI-15000 | GI-20000 | GI-25000 | GI-30000 | GI-35000 | GI-All |
|---|---|---|---|---|---|---|---|---|
| MR phasing | | | | | | | | |
| Top LLG | 11,961.22 | 15,521.82 | 18,945.61 | 22,264.00 | 23,582.31 | 24,522.71 | 25,337.26 | 26,028.86 |
| Top TFZ | 64.8 | 70.1 | 73.5 | 75.5 | 75.8 | 75.9 | 76.1 | 76.3 |
| Structure refinement | | | | | | | | |
| Resolution (Å) | 72.03–1.60 (1.64–1.60) | 72.03–1.60 (1.64–1.60) | 72.03–1.60 (1.64–1.60) | 72.03–1.60 (1.64–1.60) | 72.03–1.60 (1.64–1.60) | 72.03–1.60 (1.64–1.60) | 72.03–1.60 (1.64–1.60) | 72.03–1.60 (1.64–1.60) |
| Rwork | 0.2625 (0.3578) | 0.2554 (0.3434) | 0.2457 (0.3361) | 0.2437 (0.3056) | 0.2351 (0.3151) | 0.227 (0.3189) | 0.2214 (0.3106) | 0.2161 (0.3122) |
| Rfree | 0.2944 (0.3848) | 0.2755 (0.4028) | 0.2591 (0.3633) | 0.2587 (0.3138) | 0.2603 (0.3687) | 0.2501 (0.4023) | 0.2442 (0.3376) | 0.2396 (0.3525) |
| No. of atoms | | | | | | | | |
| Protein | 3041 | 3041 | 3041 | 3041 | 3041 | 3041 | 3041 | 3041 |
| Water | 292 | 297 | 296 | 303 | 300 | 315 | 333 | 335 |
| RMSD | | | | | | | | |
| Bond (Å) | 0.008 | 0.008 | 0.008 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 |
| Angle (°) | 0.990 | 0.983 | 0.980 | 0.933 | 0.932 | 0.922 | 0.932 | 0.923 |
| B-factor (Å²) | | | | | | | | |
| Protein | 17.7 | 16.97 | 16.44 | 16.75 | 16.60 | 17.17 | 17.12 | 16.53 |
| Water | 26.40 | 25.63 | 25.50 | 25.96 | 26.65 | 27.81 | 28.46 | 27.98 |
| Mg2+ | 12.89 | 11.92 | 15.18 | 12.71 | 13.23 | 13.26 | 13.71 | 13.15 |
| Ramachandran plot (%) | | | | | | | | |
| Favored | 96.34 | 96.34 | 95.82 | 95.82 | 95.82 | 95.56 | 96.08 | 96.08 |
| Allowed | 3.39 | 3.39 | 3.92 | 3.92 | 3.92 | 4.18 | 3.66 | 3.66 |
| Outlier | 0.26 | 0.26 | 0.26 | 0.26 | 0.26 | 0.26 | 0.26 | 0.26 |

Highest resolution shell is shown in parentheses.
Nam, K.H. Impact of Diffraction Data Volume on Data Quality in Serial Crystallography. Crystals 2025, 15, 104. https://doi.org/10.3390/cryst15020104