Article

Temporal Evolution of Bradford Curves in Academic Library Contexts

Information and Intelligence Department, University Town Library of Shenzhen, 2239 Lishui Road, Nanshan District, Shenzhen 518055, China
Publications 2024, 12(4), 36; https://doi.org/10.3390/publications12040036
Submission received: 12 July 2024 / Revised: 12 September 2024 / Accepted: 18 September 2024 / Published: 15 October 2024

Abstract

Bradford’s law of bibliographic scattering is a fundamental principle in bibliometrics, offering valuable guidance for academic libraries in literature search and procurement. However, Bradford curves can exhibit various shapes over time, and predicting these shapes remains a challenge due to a lack of causal explanation. This paper attributes the deviations from the theoretical J-shape to integer constraints on the number of journals and articles, extending Leimkuhler’s function to encompass highly productive core journals, where the theoretical journal number falls below one. Using the Simon–Yule model, key parameters of the extended formulas are identified and analyzed. The paper explains the reasons for the Groos droop and examines the critical points for shape changes. The proposed formulas are validated with empirical data from the literature, demonstrating that this method can effectively predict the evolution of Bradford curves, providing academic libraries with a valuable tool for evaluating journal coverage, optimizing resource allocation, and refining Collection Development Policies (CDP).

1. Introduction

1.1. Background and Significance

Bradford’s law, a foundational principle in bibliometrics, holds significant theoretical value in modeling and simulating the dynamics of scientific knowledge production [1,2,3,4,5,6,7,8]. By integrating it into the framework of science dynamics, this research reveals how micro-level behaviors influence macro-level trends in knowledge growth and shifts in core literature. However, generating Bradford curves can be labor-intensive, especially for journals with few relevant papers. Additionally, because the scientific literature in a given discipline often grows exponentially or passes through various developmental stages [9], a Bradford curve created at one point in time may become outdated without adjustments. Some mathematical models predict a J-shaped curve, but in practice Bradford curves can take on at least six different shapes, including the S-shaped curve with the so-called Groos droop [10]. Causal explanations for these variations are lacking, and comprehensive empirical studies are limited [11]. As such, predicting the evolution of Bradford curves remains an open question requiring further investigation.
This paper attributes the different shapes of Bradford curves to the integer constraints on journal and paper numbers. If the journal productivity $n$ is high enough that the corresponding theoretical journal number $f_t(n) = C/n^{\alpha}$ falls below one, the actual journal number $f_e(n)$ can only be zero or one. This discrete nature causes deviations in the core zone from the theoretical predictions of models like Lotka or Simon–Yule. To address this, the paper proposes two distinct formulas for the core and the normal zones, which are analyzed through theoretical methods and Monte Carlo simulations of the Simon–Yule model. The causes of the Groos droop are explained, and the critical points for shape changes are identified. Finally, the proposed formulas are validated using empirical data, demonstrating that this method can predict the evolution of Bradford curves.
While Bradford’s law may not directly guide acquisition strategies in modern academic libraries, it still provides a valuable framework for evaluating journal coverage and understanding the distribution of core literature within specific fields [12,13,14,15,16]. This suggests its potential for use in ensuring comprehensive and balanced collections, contributing to a more strategic approach to library resource management.

1.2. Literature Review

Bradford’s law was first proposed by Bradford in 1934 [17] but did not gain wide recognition until Vickery further developed the theory in 1948 [18]. According to Bradford’s law, if journals are arranged in descending order of productivity and divided into $p$ groups with the same number of papers, the number of journals in each group $n_i$ follows the ratio $n_1 : n_2 : \cdots : n_p = 1 : k : \cdots : k^{p-1}$, where $k$ is the Bradford multiplier. Besides this verbal form, Bradford’s law can also be depicted as a J-shaped curve by plotting the cumulative productivity $R(r)$ of the first $r$ journals against the natural logarithm of the journal rank $r$. Leimkuhler proposed the mathematical formula for the Bradford curve in 1967 [19], and Egghe developed a method for determining the parameters of this formula in 1990 [20]. In Leimkuhler’s function $R(r) = a \log(1 + br)$, the key parameters $a$ and $b$ can be calculated from the article number $A$, the journal number $T$, and the productivity $y_m$ of the most productive journal. Although Leimkuhler’s function matches many bibliographies well, it corresponds to a J-shaped curve and therefore deviates from bibliographies that exhibit a Groos droop [20].
Incomplete bibliographies were initially believed to cause the Groos droop, but further research refuted this hypothesis [21]. Egghe demonstrated that if the rank $r$ of each journal is transformed into $r' = r + r_0$ by adding a large constant $r_0 > 1/b$, then the new curve becomes concave downwards, showing a Groos droop [22]. The merging of different bibliographies, each with a different maximum journal productivity $y_{m,i}$, could explain the large constant $r_0$ [22]. However, it is also likely that the large core regions (regions with the most productive journals, where $f_t(n_i) < 1$) of some bibliographies contribute to the large $r_0$ [23]. Essentially, $y_m$ in Leimkuhler’s function denotes the journal productivity at which $f_t(y_m) = C/y_m^{\alpha} \approx 1$ [24], rather than the maximum yield $X_1$ of a journal as claimed by Egghe himself. Thus, if the total number of these journals $T_0$ exceeds the critical value $r_0 = 1/b$, a Groos droop will emerge. This paper adopts this explanation and extends Leimkuhler’s function to predict the evolution of Bradford curves.
In the 1990s, research interest in Bradford’s law shifted from the static presentation of data at a particular time to its dynamic and evolutionary aspects [25]. Oluić-Vuković studied how the increase in productivity of core journals affected the shape of the distribution curve over time [26]. By analyzing the research output of Croatian scholars in different subjects, she concluded that the Groos droop or S-shaped curve is caused by an increase in the concentration/dispersal disparity, reflected by the rise in the core/periphery ratio [27]. The dynamic evolution of Bradford curves and the emergence of the Groos droop were presented in her 1992 study [28], and other similar empirical studies partitioning bibliographies over time were conducted by Garg [29], Wagner-Döbler [11] and Sen [30].
Meanwhile, stochastic models like the Simon–Yule model have increasingly been used to study the dynamic characteristics of bibliometric laws [25,31]. Initially introduced by Yule in 1924 for studying the distribution of biological genera by species number, the Simon–Yule model gained recognition when Simon expanded it in 1955 to analyze the frequency distributions of words in writing samples [32]. Besides theoretical methods that solve the model exactly for a constant entry rate $\alpha$ of new sources [32], Monte Carlo simulations have been used to explore more complex scenarios, such as declining entry rates $\alpha(t)$ [33] and autocorrelated growth rates $\gamma$ of established journals (also referred to as the aging or obsolescence rate) [34]. Chen et al. [35,36,37] first used the Simon–Yule model to study the evolution of Lotka’s and Bradford’s laws over time. They found that the entry rate $\alpha(t)$ and the autocorrelated growth rate $\gamma$ have significant yet opposite effects on the Bradford curves, offering an explanation for the various types of Bradford curves [36].
Later, Oluić-Vuković also explored the dynamics of Bradford distributions using the Simon–Yule model but found that its steady-state solution was too restrictive to handle time variations, limiting its applicability [25,31]. This paper also utilizes the Simon–Yule model to examine the effects of different scenarios on the key parameters of the extended Leimkuhler’s function (e.g., the journal number $T_0$, the article number $A_0$, and the maximum productivity $X_1$ of the core region). However, it is not used directly to forecast the evolution of Bradford curves or to compare them with empirical data. Instead, key parameters are estimated from past empirical data to improve predictions of Bradford curve evolution in the future.

2. Theoretical Study

2.1. Simon-Yule Model

Simon’s generating mechanism for the Bradford distribution is based on the following two assumptions, where $f(n, t)$ denotes the number of journals that have published exactly $n$ papers among the first $t$ published papers:
  • There is a constant probability $\alpha$ that the $(t+1)$-th paper is published in a new journal, that is, a journal that has not published any of the first $t$ papers.
  • The probability that the $(t+1)$-th paper is published in a journal that has published $n$ papers is proportional to $n f(n, t)$, that is, to the total number of papers of all journals that have published exactly $n$ papers.
Therefore, if there are $A$ papers at a given time, then the corresponding journal number $T$ is approximately $T = \alpha A$. Based on Simon’s two assumptions, the steady-state solution of the Bradford distribution can be written as [37]:
$$f_t(n) = \rho B(n, \rho+1) \approx \frac{\rho\,\Gamma(\rho+1)}{n^{\rho+1}} \tag{1}$$
where $B$ is the beta function, $\Gamma$ is the gamma function, and $\rho$ is a function of the entry rate of new journals $\alpha$, defined as $\rho = 1/(1-\alpha)$. Equation (1) suggests that the analytical outcome of the Simon–Yule model aligns with Lotka’s law when $\rho \approx 1$.
In addition to the analytical solutions, Monte Carlo simulations are conducted for $\alpha = 0.15$, and the results are compared with the theoretical results of Equation (1), as shown in Figure 1. Detailed procedures for these simulations can be found in [33], and the data and MATLAB code used are available in the Supplementary Materials. To reduce inherent randomness, each case is simulated $N = 10^4$ times, and only the medians of these simulations are used as the final outputs.
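The two assumptions of Section 2.1 can be illustrated with a minimal Python sketch. This is not the MATLAB code from the Supplementary Materials; the function names `simulate_simon_yule` and `frequency_table` are hypothetical, and a single run is shown rather than the median over $N$ runs.

```python
import random
from collections import Counter

def simulate_simon_yule(total_papers, alpha, seed=None):
    """Return the paper count of every journal after `total_papers` papers.

    Hypothetical helper: one Monte Carlo run of the constant-entry-rate model.
    """
    rng = random.Random(seed)
    owners = [0]      # owners[i] is the journal index that published paper i
    n_journals = 1    # the first paper necessarily founds the first journal
    for _ in range(total_papers - 1):
        if rng.random() < alpha:
            # Assumption 1: a new journal enters with probability alpha.
            owners.append(n_journals)
            n_journals += 1
        else:
            # Assumption 2: reusing the journal of a uniformly chosen earlier
            # paper selects each journal with probability proportional to
            # its current paper count.
            owners.append(owners[rng.randrange(len(owners))])
    counts = Counter(owners)
    return [counts[j] for j in range(n_journals)]

def frequency_table(productivities):
    """Map productivity n to the number of journals with exactly n papers."""
    return dict(Counter(productivities))
```

Repeating the run with different seeds and taking medians of the resulting frequency tables would mirror the aggregation described above.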
Figure 1 illustrates distinct zones in the simulation results: a normal zone (blue circles) and a core zone (red squares). This distinction arises because actual journal numbers $f_e(n)$ must be integers and cannot take fractional values below one. Thus, when the journal productivity $n$ is high enough for the theoretical journal number $f_t(n)$ to drop below one, the actual journal numbers $f_e(n)$ are constrained to zero or one, deviating from theoretical predictions as shown by the red squares and black dots in Figure 1. Moreover, despite the smaller number of journals in the core region, their contribution to the paper count is significant, as shown in Figure 1b. Hence, accurate prediction of the paper count of each journal $X_r$ in the core zone is crucial for depicting the Bradford curve faithfully.
Estimating the journal productivity $X_r$ of the core region involves first determining the journal count $T_0$ and the paper count $A_0$. Since Figure 1 shows a close match between the journal number $f(n)$ and the paper number $n f(n)$ in the normal region, the total journal count $T_1$ and the total paper count $A_1$ of the normal region can be directly obtained by summing all journal and paper numbers. These are calculated as $T_1 = \sum_{n=1}^{y_m} f(n)$ and $A_1 = \sum_{n=1}^{y_m} n f(n)$, where $y_m$ is the journal productivity at which $f_t(y_m) \approx 1$. In this context, $y_m$ can be seen as the productivity of the most productive journal in the normal region, as indicated by the black diamond in Figure 1. According to Equation (1), the analytical expression for $y_m$ can be derived as:
$$y_m = \left[A(\rho-1)\,\Gamma(\rho+1)\right]^{\frac{1}{\rho+1}} \tag{2}$$
Once $y_m$ is calculated, the total journal count $T_0$ and the paper count $A_0$ of the core region can be calculated as $T_0 = T - T_1$ and $A_0 = A - A_1$. Alternatively, they can be directly calculated as:
$$T_0 \approx \int_{y_m}^{+\infty} T f(n)\,dn = \frac{y_m}{\rho} \tag{3}$$
$$A_0 \approx \int_{y_m}^{+\infty} T\,n f(n)\,dn = \frac{y_m^2}{\rho-1} \tag{4}$$
where $y_m$ is calculated using Equation (2).
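As a worked illustration of Equations (2)–(4), the following Python sketch evaluates $\rho$, $y_m$, $T_0$ and $A_0$ for a given paper count and entry rate. The function name `core_parameters` is hypothetical, and the outputs are real numbers that would need rounding before being treated as counts.

```python
import math

def core_parameters(total_papers, alpha):
    """Evaluate rho, y_m (Eq. 2), T_0 (Eq. 3) and A_0 (Eq. 4).

    Hypothetical helper; assumes a constant entry rate 0 < alpha < 1.
    """
    rho = 1.0 / (1.0 - alpha)
    # Equation (2): productivity at which the theoretical journal number is about one.
    y_m = (total_papers * (rho - 1.0) * math.gamma(rho + 1.0)) ** (1.0 / (rho + 1.0))
    t0 = y_m / rho                 # Equation (3): journals in the core region
    a0 = y_m ** 2 / (rho - 1.0)    # Equation (4): papers in the core region
    return rho, y_m, t0, a0
```

For example, `core_parameters(10**4, 0.15)` corresponds to the setting used in Figure 1.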
In the Simon–Yule model with constant entry rate $\alpha$, the maximum number of papers one journal can have, $X_1$, can be estimated using Gumbel’s $r$-th characteristic extreme theory [38,39,40]:
$$G(X_r) \approx \int_{X_r}^{+\infty} f(i)\,di = \frac{r}{A} \tag{5}$$
By solving this equation, it can be derived that the productivity of the most productive journal, $X_1$, can be expressed as:
$$X_1 = \left[A\,\Gamma(\rho+1)\right]^{\frac{1}{\rho}} = (\rho-1)^{-\frac{1}{\rho}}\, y_m^{\frac{\rho+1}{\rho}} \tag{6}$$
The productivity of the $r$-th most productive journal, $X_r$, is related to $X_1$ by $X_r = X_1 r^{-1/\rho}$. Figure 2 compares Gumbel’s $r$-th characteristic extreme values with the medians of the simulation results. While Gumbel’s theory effectively predicts the largest paper number $X_1$, it falls short in estimating the other paper numbers in the core zone, $X_r$ for $r = 2, 3, \ldots, T_0$. Thus, an alternative relation is adopted in this paper, in which all other $X_r$, for $r = 2, 3, \ldots, T_0$, are related to the largest paper number $X_1$ through the equation:
$$\frac{X_1}{X_r} = k(r-1) + 1 \tag{7}$$
where $k$ is the only parameter to be determined. The validity of this equation is supported by Figure 2b, where the blue circles represent the simulation results, and the blue dashed lines show the linear fitting results. Hence, the productivity of the $r$-th most productive journal can be derived from Equation (7), and the cumulative productivity of the first $r$ most productive journals can be written as:
$$R_c(r) = \sum_{i=1}^{r} \frac{X_1}{k(i-1)+1} \tag{8}$$
If there are $T_0$ journals with $A_0$ papers in the core region and the values of $T_0$ and $A_0$ are known, then the parameter $k$ can be calculated from the equation $R_c(T_0) = A_0$. Then, Equation (8) can be used to predict the evolution of the core regions ($r \le T_0$) of Bradford curves.
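The condition $R_c(T_0) = A_0$ has no closed-form solution for $k$, so a numerical root search is one way to obtain it. The sketch below uses bisection, which is an assumed choice rather than the method used in the paper; the function names are hypothetical and $T_0$ is assumed to have been rounded to an integer.

```python
import math

def x1_constant_entry(total_papers, alpha):
    """Equation (6): largest journal productivity for a constant entry rate."""
    rho = 1.0 / (1.0 - alpha)
    return (total_papers * math.gamma(rho + 1.0)) ** (1.0 / rho)

def core_cumulative(r, x1, k):
    """Equation (8): cumulative productivity R_c(r) for integer rank r >= 1."""
    return sum(x1 / (k * (i - 1) + 1.0) for i in range(1, r + 1))

def solve_k(x1, t0, a0, tol=1e-10):
    """Find k >= 0 such that R_c(t0) = a0, using bisection (assumed method)."""
    # R_c(t0) falls from t0 * x1 (k = 0) towards x1 (k -> infinity).
    if not (x1 < a0 <= t0 * x1):
        raise ValueError("a solution with k >= 0 requires x1 < a0 <= t0 * x1")
    lo, hi = 0.0, 1.0
    while core_cumulative(t0, x1, hi) > a0:   # widen the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if core_cumulative(t0, x1, mid) > a0:
            lo = mid    # curve still too high, so k must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because $R_c(T_0)$ decreases monotonically in $k$, the bracket always contains exactly one root when the stated condition on $A_0$ holds.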

2.2. Leimkuhler’s Function

After removing the $T_0$ journals and the $A_0$ papers of the core region, the remaining $T_1$ journals and $A_1$ papers align well with the theoretical results predicted by Equation (1). Consequently, they follow Lotka’s law, and their Bradford curve can be predicted using the revised Leimkuhler’s function [20]:
$$R(r_1) = a \log(1 + b r_1) \tag{9}$$
where the key parameters $a$ and $b$ are defined as:
$$a = \frac{A_1}{\log\left(e^{\gamma} y_m\right)} \tag{10}$$
$$b = \frac{e^{\gamma} y_m - 1}{T_1} \tag{11}$$
where $\gamma$ is the Euler–Mascheroni constant, $\gamma \approx 0.5772$, and $y_m$ is the journal productivity at which the corresponding theoretical journal number $f_t(y_m) \approx 1$. While $y_m$ can be directly calculated from Equation (2), it can also be estimated using the following equation if the values of $X_1$, $T_0$ and $A_0$ are known:
$$y_m \approx \frac{X_1}{k(T_0-1)+1} \tag{12}$$
Since the core region’s journals are more productive than those of the normal region, they occupy the first $T_0$ ranks. Consequently, the Bradford curve for the normal region starts at the point $(T_0, A_0)$. Each rank $r_1$ in the normal region should be transformed to $r = r_1 + T_0$, and the cumulative productivity $R(r_1)$ should be transformed into $R_n(r) = R(r_1) + A_0$. The revised Leimkuhler’s function for the normal region is then written as:
$$R_n(r) = R(r - T_0) + A_0 = a \log\left(1 + b(r - T_0)\right) + A_0 \tag{13}$$
Equation (13) can be used to predict the dynamic evolution of the normal regions ($T_0 < r < T$) of the Bradford curve. Therefore, Equations (8) and (13) together can be used to predict the dynamic evolution of the Bradford curves.
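How Equations (8) and (10)–(13) combine into a single piecewise curve can be sketched as follows. The function names are hypothetical, the logarithm is taken as natural, and the core parameters are assumed to satisfy $R_c(T_0) = A_0$ so that the two pieces meet at $(T_0, A_0)$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def leimkuhler_parameters(a1, t1, y_m):
    """Equations (10) and (11): parameters a and b for the normal region."""
    c = math.exp(EULER_GAMMA) * y_m
    a = a1 / math.log(c)
    b = (c - 1.0) / t1
    return a, b

def bradford_curve(r, x1, k, t0, a0, a, b):
    """Cumulative productivity of the first r journals (integer r >= 1)."""
    if r <= t0:
        # Core region, Equation (8).
        return sum(x1 / (k * (i - 1) + 1.0) for i in range(1, r + 1))
    # Normal region, Equation (13): shifted Leimkuhler function.
    return a * math.log(1.0 + b * (r - t0)) + a0
```

With these definitions, the curve evaluated at $r = T_0 + T_1$ returns $A_0 + A_1$, because $a \log(1 + b T_1) = A_1$ by construction of $a$ and $b$.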
Figure 3a displays the Bradford curve. The blue circles represent the normal zone, while the red squares indicate the core zone. The blue dashed lines show the prediction results of Equation (13), and the red dotted lines show the prediction results of Equation (8). The black upper triangle, the diamond, and the lower triangle represent the points $(1, X_1)$, $(T_0, A_0)$, and $(T, A)$, respectively. From the discussion above, it is evident that these three points and two lines are crucial for predicting the evolution of the Bradford curve.

2.3. Groos Droop

Groos was the first to observe that in some datasets, when the journal productivity is low, the Bradford curve tends to bend downwards [10]. Egghe [22] explained the cause of the Groos droop as merging datasets. However, this section demonstrates that the core region’s existence causes the Groos droop in the normal region.
The first and second derivatives of $R_c(r)$ for the core region can be derived from Equation (8):
$$\frac{\partial R_c(r)}{\partial \log r} = \frac{X_1 r}{k(r-1)+1} \tag{14}$$
$$\frac{\partial^2 R_c(r)}{\partial (\log r)^2} = \frac{X_1 (1-k)\, r}{\left[k(r-1)+1\right]^2} \tag{15}$$
From Equation (15), it can be noted that when $k > 1$, $\partial^2 R_c(r)/\partial (\log r)^2 < 0$ and the Bradford curve for the core region is concave downwards. Conversely, when $k < 1$, $\partial^2 R_c(r)/\partial (\log r)^2 > 0$ and the curve is concave upwards. As the entry rate of new journals $\alpha$ increases, the number of journals $T$ rises, and the distribution of articles becomes more dispersed, leading to a decrease in the largest journal productivity $X_1$. From Equation (8) and $R_c(T_0) = A_0$, a lower $X_1$ results in a lower $k$ if $A_0$ is relatively constant. Therefore, as $\alpha$ increases, the Bradford curve for the core region will gradually become concave upwards, as shown in Figure 3.
Similarly, the first and second derivatives of $R_n(r)$ for the normal region can be derived from Equation (13):
$$\frac{\partial R_n(r)}{\partial \log r} = \frac{a b r}{b(r - T_0) + 1} \tag{16}$$
$$\frac{\partial^2 R_n(r)}{\partial (\log r)^2} = \frac{a b (1 - b T_0)\, r}{\left[b(r - T_0) + 1\right]^2} \tag{17}$$
From Equation (17), it can be noted that when $T_0 > 1/b$, $\partial^2 R_n(r)/\partial (\log r)^2 < 0$ and the Bradford curve for the normal region will be concave downwards, showing a Groos droop. When $T_0 < 1/b$, $\partial^2 R_n(r)/\partial (\log r)^2 > 0$ and the curve will be concave upwards, forming a J-shaped curve. As the entry rate of new journals $\alpha$ increases, Figure 3a shows that $T_0$ will eventually fall below $1/b$, causing the Bradford curve for the normal region to become concave upwards, similar to the core region.
Figure 3b illustrates the variation of the key parameters $T_0$, $1/b$ and $k$ with the entry rate $\alpha$. When $A = 10^4$, the normal region will start to become concave upwards at the critical point $\alpha_n \approx 0.2$, while the core region will do so at $\alpha_c \approx 0.3$. Therefore, when $\alpha < 0.2$, the entire Bradford curve will be concave downwards; when $\alpha > 0.3$, it will be concave upwards. For $0.2 < \alpha < 0.3$, the Bradford curve will exhibit a reversed S-shape, with the core region concave downwards and the normal region concave upwards. Figure 3a shows the three shapes of Bradford curves. In this specific case, since $\alpha_n < \alpha_c$, there is no S-shaped Bradford curve. This is because the aging of journals is not considered here, making the largest journal productivity $X_1$ relatively large. When aging effects are considered, which will be discussed in detail in Section 3.2, $X_1$ decreases significantly, which results in a lower $k$ and thus a much lower $\alpha_c$. If $\alpha_c < \alpha_n$, an S-shaped Bradford curve will appear for $\alpha_c < \alpha < \alpha_n$, with the core region concave upwards and the normal region concave downwards.
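The sign conditions of Equations (15) and (17) amount to a simple decision rule, sketched below in Python. The function name `classify_shape` and the returned labels are hypothetical; the boundary cases $k = 1$ and $T_0 = 1/b$ are arbitrarily treated as not concave downwards.

```python
def classify_shape(k, t0, b):
    """Classify the Bradford curve from the signs in Equations (15) and (17)."""
    core_down = k > 1.0           # Equation (15): the sign follows (1 - k)
    normal_down = t0 > 1.0 / b    # Equation (17): the sign follows (1 - b * T_0)
    if core_down and normal_down:
        return "concave downwards throughout"
    if core_down:
        return "reversed S-shape"       # core downwards, normal upwards
    if normal_down:
        return "S-shape (Groos droop)"  # core upwards, normal downwards
    return "J-shape (concave upwards throughout)"
```

For instance, `classify_shape(0.5, 30, 0.1)` returns the S-shape label, since $k < 1$ and $T_0 = 30 > 1/b = 10$.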

2.4. Bradford Dynamics

Given the analytical expressions of $T_0$, $A_0$ and $X_1$ (Equations (3), (4) and (6)), the core region of the Bradford curve at any time can be predicted using Equation (8). The parameters for the normal region can be derived from these factors through $T_1 = T - T_0$, $A_1 = A - A_0$ and Equation (12), allowing the normal region of the Bradford curve to be predicted using Equation (13).
The Bradford curves for the constant entry rate scenario are shown in Figure 4a. Here, the red squares and blue circles represent the simulation results for the core and the normal regions, respectively, while the red dashed lines and the blue dotted lines represent the theoretical results of Equations (8) and (13). The key points $(1, X_1)$, $(T_0, A_0)$, and $(T, A)$ are also shown as the black upper triangle, the diamond and the lower triangle in Figure 4a. It is notable that although the core region contains far fewer journals than the normal region, its representation is significant due to the x-axis’s log scale. Consequently, journals with lower ranks are better represented in Figure 4a.
Figure 3 shows that when $\alpha = 0.15$, the entire Bradford curve is concave downwards for $10^3 < A < 10^4$. Figure 4a depicts the evolution of Bradford curves as the paper number $A$ increases from $10^3$ to $10^4$, aligning well with theoretical predictions.
Figure 4b presents the simulation and the analytical results for $T_0$, $A_0$ and $X_1$. Hollow symbols represent the simulation results, while solid symbols denote the analytical ones. It can be observed that, on log-log axes, these three key factors are linear functions of the paper number $A$, as indicated by Equation (18):
$$\log Y = a_\rho + b_\rho \log A \tag{18}$$
where $Y$ stands for any of the three key factors and the constants $a_\rho$ and $b_\rho$ are functions of $\rho$. Since $\rho$ is approximately one, $a_\rho$ and $b_\rho$ can be considered constants. The close match between the analytical and the numerical results confirms the validity of Equations (3), (4) and (6). The analytical result for $T_0$ is slightly lower than the numerical ones because, in the numerical results, all journals with productivity $n$ that satisfy $f(n) = 1$ are considered part of the core region, whereas, in theory, some of them belong to the normal region.
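Equation (18) can be fitted by ordinary least squares on the logarithms of the observations. The sketch below is a minimal stdlib version with hypothetical function names; natural logarithms are assumed, which affects the intercept $a_\rho$ but not the slope $b_\rho$.

```python
import math

def fit_log_log(a_values, y_values):
    """Least-squares fit of log Y = a_rho + b_rho * log A (Equation (18))."""
    xs = [math.log(a) for a in a_values]
    ys = [math.log(y) for y in y_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # b_rho
    intercept = mean_y - slope * mean_x    # a_rho
    return intercept, slope

def predict_from_fit(intercept, slope, a):
    """Predict a key factor (T_0, A_0 or X_1) at paper count a."""
    return math.exp(intercept + slope * math.log(a))
```

One fit would be performed per key factor, each using the same list of paper counts $A$.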

3. Numerical Study

3.1. Decreasing Entry Rate

Assume the probability of adding a new journal decreases linearly over time:
$$\alpha(t) = \alpha_s - k t \tag{19}$$
where $\alpha_s$ is the initial entry rate and $k$ is a constant, $k = (\alpha_s - \alpha_f)/A_f$, with $\alpha_f$ and $A_f$ being the entry rate of new journals and the total number of articles in the final state, respectively. The accumulated number of journals can then be expressed as:
$$T = \sum_{t=1}^{A} \alpha(t) \approx \alpha_s A - \frac{1}{2} k A^2 \tag{20}$$
Using Equation (20), a quadratic fit of $T$ against $A$ allows us to determine the values of $\alpha_s$ and $\alpha_f$. Once these are known, the average entry rate $\bar{\alpha} = (\alpha_s + \alpha_f)/2$ can be used to calculate the analytical results.
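The quadratic fit of Equation (20) can be written as a two-parameter least-squares problem without an intercept. In the sketch below the function name is hypothetical, and taking $A_f$ as the largest observed paper count is an assumption; the paper counts are rescaled by $A_f$ only to keep the normal equations well conditioned.

```python
def fit_linear_entry_decline(a_values, t_values):
    """Fit T = alpha_s * A - 0.5 * k * A**2 and return (alpha_s, alpha_f, k).

    Assumption: the final state A_f is the largest observed paper count.
    """
    a_f = max(a_values)
    us = [a / a_f for a in a_values]      # rescaled paper counts in (0, 1]
    # Normal equations for T = c1 * u + c2 * u**2.
    s11 = sum(u ** 2 for u in us)
    s12 = sum(u ** 3 for u in us)
    s22 = sum(u ** 4 for u in us)
    r1 = sum(u * t for u, t in zip(us, t_values))
    r2 = sum(u * u * t for u, t in zip(us, t_values))
    det = s11 * s22 - s12 * s12
    c1 = (r1 * s22 - r2 * s12) / det
    c2 = (s11 * r2 - s12 * r1) / det
    alpha_s = c1 / a_f
    k = -2.0 * c2 / a_f ** 2
    alpha_f = alpha_s - k * a_f           # Equation (19) evaluated at t = A_f
    return alpha_s, alpha_f, k
```

The average entry rate then follows as `(alpha_s + alpha_f) / 2`.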
Figure 5 shows the dynamic evolution of Bradford curves and the variations of key parameters when the entry rate decreases linearly from 0.2 to 0.1. The proposed method effectively predicts these variations, with analytical results using $\bar{\alpha}$ matching well with the simulation results. The numerical results for $A_0$ and $X_1$ are slightly lower than the analytical ones, which suggests that a decreasing entry rate slightly dampens their growth. However, the effect on the overall shape of the Bradford curve and on the key parameters is relatively insignificant, indicating that the analytical results for a constant entry rate $\bar{\alpha}$ can still be used to predict key parameters without significant errors.

3.2. Aging Rate of Journals

Simon assumes that only one paper gets published in each time period and models the probability of a journal increasing in paper number as proportional to a weighted sum of its past increments. These increments are weighted by a factor that decreases geometrically over time, with the rate of decrease denoted as γ .
Let $y_j(k)$ represent the change in the paper number of the $j$-th journal during the $k$-th time interval, where $y_j(k)$ is either 1 (indicating a unit increment) or 0 (indicating no change). The paper number of the $j$-th journal at the end of the $k$-th interval is given by $\sum_{\tau=1}^{k} y_j(\tau)$. The expected increment in the paper number during the $(k+1)$-th interval is:
$$p\left[y_j(k+1) = 1\right] = \frac{1}{W(k)} \sum_{\tau=1}^{k} y_j(\tau)\, \gamma^{k-\tau} \tag{21}$$
where $W(k)$ is a time-dependent function consistent across all journals, defined as $W(k) = \sum_{j=1}^{T} w_j(k)$ with $w_j(k) = \sum_{\tau=1}^{k} y_j(\tau)\, \gamma^{k-\tau}$. The parameter $\gamma$ determines how quickly the influence of past growth is diminished and is thus referred to as the aging rate of journals in this paper.
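The weighting in Equation (21) can be simulated incrementally, since $w_j(k) = \gamma\, w_j(k-1) + y_j(k)$. The following Python sketch combines this with the entry probability $\alpha$ of Section 2.1, which is an assumption about how the two mechanisms are coupled; the function name is hypothetical and the sketch is not the paper's MATLAB code.

```python
import random

def simulate_with_aging(total_papers, alpha, gamma, seed=None):
    """Return per-journal paper counts under geometrically decaying weights.

    Assumptions: one paper per time step, a new journal with probability
    alpha, otherwise an existing journal chosen with probability w_j / W.
    """
    rng = random.Random(seed)
    counts = [1]       # the first paper founds the first journal
    weights = [1.0]    # w_j(k): past increments discounted by gamma
    for _ in range(total_papers - 1):
        if rng.random() < alpha:
            chosen = len(counts)          # index of the newly founded journal
            counts.append(0)
            weights.append(0.0)
        else:
            # Equation (21): selection probability proportional to w_j(k).
            chosen = rng.choices(range(len(counts)), weights=weights)[0]
        # Age every past increment by one step, then record the new paper.
        weights = [w * gamma for w in weights]
        weights[chosen] += 1.0
        counts[chosen] += 1
    return counts
```

Setting `gamma=1.0` removes the discounting, so each weight equals the journal's paper count and the procedure reduces to the model of Section 2.1.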
Figure 6 illustrates the impact of the aging rate of journals $\gamma$ on the dynamics of Bradford curves and key parameters. It shows that the aging factor increases $T_0$, making the normal region more concave downwards. The aging effect also markedly reduces $X_1$ by weakening the Matthew effect, where successful journals attract more papers. As older journals lose their appeal, this “success breeds success” effect diminishes, leading to a substantial decrease in $X_1$, as shown in Figure 6a. Consequently, while the number of articles $A_0$ in the core zone remains relatively unchanged, $T_0$ must increase to offset the reduction in $X_1$. This expansion of the core region increases its share of the Bradford curve, shaping it into a J-shape due to the reduction of $k$ and $X_1$, as discussed in Section 2.3. Additionally, with the increase in $T_0$, it is more likely that $T_0$ will exceed $1/b$, further contributing to the concave downwards shape of the normal region. In summary, the aging of journals makes it easier for the Bradford curve to adopt an S-shape.

3.3. Varying Entry and Aging Rates

Figure 7 shows the impact of varying entry and aging rates on Bradford curves. In real-world scenarios, both the entry rate and the aging rate often change steadily. For example, the entry rate might decrease linearly from 0.2 to 0.1 while the aging rate increases linearly from 0.95 to 1.0; the corresponding simulation results are shown in Figure 7. Comparing Figures 5, 6 and 7 reveals that the effects of a decreasing entry rate and an increasing aging rate are similar to those observed with constant entry and aging rates (Figure 6). However, in Figure 7b, both the article number $A_0$ and the journal number $T_0$ are even lower than in Figure 6b, indicating that the decreasing entry rate further exacerbates their decrease. Consequently, the normal region of the Bradford curve becomes less concave downwards, and its starting point on the y-axis is notably lower. Importantly, all three key factors continue to show linear relationships with the article number on log-log axes, suggesting that Equation (18) remains useful for predicting them.

4. Empirical Study

4.1. Dataset of Croatian Chemistry Research

Oluić-Vuković used the research output in chemistry by authors from Croatia to prepare full bibliographic references for a ten-year period [28]. This dataset includes only articles published in journals, comprising 2543 papers across 416 journals over a decade. The productivity of the top few (fewer than 10) most prolific journals was taken directly from Figure 1 in [28], while the productivity of other journals was taken from Tables 4 and 6 in [25].
In these tables, the journal productivity $n$ and the number of journals $f(n)$ are recorded for nine cumulative time periods. For journals with higher productivity (where $n$ corresponds to $f(n) = 1$), Oluić-Vuković grouped the data into ranges (e.g., $n \ge 77$), which limits the precision of the distribution for high-producing journals. To accurately represent this higher productivity range, we relied on the detailed Bradford curves provided in [28], adjusting the data to match the total number of journals $T$ and articles $A$ as indicated in Table 2 of [28].
As an example, using the data from the ninth interval (also found in Table 1 of reference [27]), we identified the highest productivity level $X_1 = 301$ by observing the last value in the sequence of $n$. The dividing point between the core and the normal zones was set at $n = 24$ based on the fact that for $n < 24$, most values of $f(n)$ were greater than 1 and the $n$ values were continuous, while for $n > 24$, $f(n)$ showed irregular gaps with isolated high $n$ values and lower average journal counts.
The journal count $T_0$ and the article count $A_0$ for the core zone were calculated by summing the respective $f(n)$ and $n \cdot f(n)$ for $n \ge 24$. Similarly, the total number of journals $T$ and articles $A$ were obtained by summing the values across all $n$, ensuring consistency with the overall dataset.
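The summations described above can be sketched in a few lines of Python, given a frequency table that maps each productivity $n$ to the number of journals $f(n)$. The function name `split_core_normal` is hypothetical, and treating the dividing point as inclusive (core zone for $n$ at or above the threshold) is an assumption.

```python
def split_core_normal(freq, n_split):
    """Summarize a frequency table {n: f(n)} into the key Bradford quantities.

    Assumption: journals with productivity n >= n_split form the core zone.
    """
    t0 = sum(f for n, f in freq.items() if n >= n_split)       # core journals
    a0 = sum(n * f for n, f in freq.items() if n >= n_split)   # core papers
    t_total = sum(freq.values())                               # all journals
    a_total = sum(n * f for n, f in freq.items())              # all papers
    x1 = max(freq)                                             # highest productivity
    return {"T0": t0, "A0": a0, "T": t_total, "A": a_total, "X1": x1}
```

The returned totals can be checked against the published values of $T$ and $A$ for the same period before the core quantities are used.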
To predict the dynamics of the Bradford curve, the process begins by predicting the variation of the total article number $A(t)$ over time $t$. Logistic regression analysis [41], implemented in MATLAB, was applied to the empirical data to model the growth of the article output and to predict the total number of articles $A(t)$ at any given time, as shown in Figure 8a.
Next, the total journal number $T$ and the entry rate of new journals $\alpha$ were estimated by plotting the total journal number $T$ against the total article number $A$ and applying a linear fit of $T = \alpha A$, as shown in Figure 8b. The linear fit, with an R-squared value of 0.98, confirmed a strong relationship between $T$ and $A$, providing insights into how new journals enter the system as article output increases.
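For a line constrained to pass through the origin, the least-squares slope has the closed form $\alpha = \sum A_i T_i / \sum A_i^2$. The short Python sketch below applies it; the function name is hypothetical, and whether the original fit was constrained through the origin is an assumption based on the stated form $T = \alpha A$.

```python
def fit_constant_entry_rate(a_values, t_values):
    """Least-squares slope of T = alpha * A for a line through the origin."""
    numerator = sum(a * t for a, t in zip(a_values, t_values))
    denominator = sum(a * a for a in a_values)
    return numerator / denominator
```

The inputs are the cumulative article and journal counts of the successive periods.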
Once the point $(T, A)$ is determined for any time, linear regression of Equation (18) is used to determine the three key parameters $T_0$, $A_0$, and $X_1$ on log-log axes, with the fitting results shown in Figure 9a. These parameters reveal the structure of the core and the normal regions of the Bradford distribution.
Finally, based on the three key points $(T_0, A_0)$, $(1, X_1)$, and $(T, A)$, Equations (8) and (13) were used to model the Bradford curve for the core region and the normal region, respectively. The results, illustrated as red dashed lines and blue dotted lines in Figure 9b, show the gradual transition in the curve’s shape. This approach allows librarians and practitioners to model journal productivity trends, helping to optimize collection development strategies and predict journal coverage effectively.
The Bradford curves shown in Figure 9b align closely with the empirical data, demonstrating a strong match between the predicted and the observed outcomes. Notably, the Bradford curve transitions gradually from a J-shape to an S-shape, a transformation that is accurately captured by the analytical predictions.

4.2. Dataset of Solar Power Research

The bibliographies on solar power research for the years 1971, 1974, 1977, 1980, 1983, and 1986 were compiled by Garg et al. [29], encompassing papers published in journals from the Engineering Index. The data for this analysis were directly extracted from Tables 1–7 of the referenced study. Unlike the Croatian Chemistry dataset, Garg’s dataset is clearer, as it records every journal’s exact productivity n without grouping high-productivity journals.
The dataset is divided into six cumulative periods: 1971, 1971 + 1974, 1971 + 1974 + 1977, 1971 + 1974 + 1977 + 1980, 1971 + 1974 + 1977 + 1980 + 1983, and 1971 + 1974 + 1977 + 1980 + 1983 + 1986. Although the data skips certain years (e.g., 1972 and 1973), this does not affect the Bradford curve calculation. However, when predicting article growth, care must be taken to sum the predicted outputs for each year to match the cumulative totals.
The data processing followed the same four-step method as the Croatian Chemistry dataset. First, logistic regression was applied to predict the cumulative article numbers A t over time t , with data from Table 7 used to estimate the article numbers for each desired interval, as shown in Figure 10a. This step involves fitting the empirical data and adjusting for the cumulative nature of the dataset, ensuring accurate predictions for the individual years within each period.
Next, quadratic fitting (Equation (20)) was applied to the journal and article pairs to estimate the total number of journals T and the entry rate of new journals α , as shown in Figure 10b. Logistic fitting was applied to better capture the effect of a linearly decreasing journal entry rate, providing a more accurate representation of how new journals enter the system as article output increases.
The key parameters T 0 , A 0 , and X 1 were then predicted using linear fitting on the log-log axis, based on the empirical data from the earlier steps. This method, depicted in Figure 11a, was used to determine these parameters at each cumulative time point, allowing the segmentation of the core and the normal regions of the Bradford curve.
Finally, using Equations (8) and (13), Bradford curves for the core and the normal regions were plotted, as shown in Figure 11b. Unlike with the Croatian Chemistry dataset, determining the highest productivity level $X_1$, the core zone journal count $T_0$, and the core zone article count $A_0$ was straightforward in Garg’s dataset because every journal’s productivity is recorded individually. In the Croatian dataset, by contrast, the grouping of high-productivity journals added complexity.
Figure 9b and Figure 11b show that the proposed method captures the general trend of the Bradford curves but carries inherent errors. One source is the stochastic nature of Bradford’s law itself: numerical studies reveal that the article numbers at each journal rank have a large standard deviation, making it practically impossible to predict the precise shape of the Bradford curve. The various fitting procedures introduce further errors. The method can therefore predict the general trend of Bradford dynamics, but not the article number of each journal at a given time.
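The spread of article numbers at a given rank can be illustrated with a small Monte Carlo run of the Simon–Yule process. This is a minimal sketch with a constant entry rate and no aging, not the author's MATLAB simulation; the function names, run counts, and seed are hypothetical choices.

```python
import random


def simon_yule(n_articles, alpha, rng):
    """Ranked journal productivities from a Simon-Yule process with constant alpha."""
    owner = [0]    # owner[i] is the journal of article i; article 0 founds journal 0
    counts = [1]   # counts[j] is the number of articles in journal j
    for _ in range(n_articles - 1):
        if rng.random() < alpha:
            # The new article founds a new journal.
            counts.append(1)
            owner.append(len(counts) - 1)
        else:
            # Copying the journal of a random earlier article selects
            # an existing journal with probability proportional to its size.
            j = owner[rng.randrange(len(owner))]
            counts[j] += 1
            owner.append(j)
    return sorted(counts, reverse=True)


def top_rank_spread(n_runs, n_articles, alpha, seed=0):
    """Mean and sample standard deviation of the largest productivity over runs."""
    rng = random.Random(seed)
    tops = [simon_yule(n_articles, alpha, rng)[0] for _ in range(n_runs)]
    mean = sum(tops) / n_runs
    var = sum((x - mean) ** 2 for x in tops) / (n_runs - 1)
    return mean, var ** 0.5


mean_top, sd_top = top_rank_spread(n_runs=20, n_articles=2000, alpha=0.15)
```

Comparing `sd_top` with `mean_top` gives a rough sense of how much the top-ranked journal varies between runs with identical parameters.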
Another issue with this method is that the first derivatives of the core region (Equation (14)) and the normal region (Equation (16)) differ at the point $(T_0, A_0)$, resulting in a non-smooth analytical curve at the intersection point. In contrast, the numerical simulation results are smooth throughout. This problem could be addressed by proposing more complex formulas for the normal region, but this would complicate the overall method. Given the difficulty in accurately predicting Bradford curve dynamics, this aspect is not explored further in this paper.

5. Discussion

5.1. Theoretical Contributions

This study makes contributions in two areas: advancing bibliometric modeling and simulation and addressing extreme value phenomena. Both are critical to understanding the dynamics of scientific knowledge production.
  • Modeling and Simulation of Science or Publications
Modeling and simulations in bibliometrics help uncover micro-level behaviors that drive broader trends, such as publication growth and shifts in core literature [42]. These methods offer controlled environments to reduce data biases and explore extreme scenarios, providing valuable insights for decision-making. While fields like network science, agent-based modeling (ABM), and diffusion models have been widely used in science modeling and simulation [43], the contributions of bibliometric research in these areas have not been fully explored. This study bridges that gap by incorporating models like Bradford’s law into the broader science dynamics framework, showing their value in understanding knowledge production over time.
  • Extreme Value and Non-Lotkaian Informetrics
This study addresses extreme value phenomena (highly productive authors, highly cited papers, and influential journals) that traditional models like Lotkaian informetrics struggle to capture [44,45]. These extreme cases deviate from typical productivity patterns, necessitating new theoretical approaches. By applying extreme value theory, order statistics, and the Simon–Yule model, this research provides a more accurate model for the core zone of Bradford curves. These rare but influential entities shape scientific output in ways that Lotkaian informetrics cannot fully explain. To address this, the study introduces a new framework, non-Lotkaian informetrics, which better accounts for extreme values and extends beyond bibliometrics to areas like patent analysis and network studies.

5.2. Practical Applications

Despite bundled licensing agreements, Bradford’s law helps academic libraries evaluate their coverage rate in specific disciplines, offering a quantitative framework to estimate the relationship between acquired and available journals for informed decision-making.
  • Journal Coverage Evaluation and Resource Optimization
Bradford’s law is a key tool for evaluating journal collection coverage and optimizing resources in academic libraries. By using the Bradford curve, libraries can assess how well their subscriptions cover a specific discipline, focusing on high-impact journals like those in SCI or Scopus while deprioritizing less essential ones such as predatory journals. This method helps libraries understand their coverage rate in particular fields, ensuring comprehensive collections by focusing on core journals and guiding effective weeding strategies to retire low-usage ones [46,47,48].
Additionally, Bradford’s law supports resource optimization by helping libraries periodically evaluate journal and database subscriptions, even when subscriptions appear stable year-to-year. This assessment not only aids in making informed decisions about renewals and expansions, ensuring continued coverage of core journals as disciplines evolve, but also provides valuable insights for developing and refining Collection Development Policies (CDP). By using Bradford’s law as a reference, libraries can better balance budget efficiency with evolving research needs, ensuring that their collections remain comprehensive and aligned with institutional priorities [49,50].
  • Broader Applications in Academic Libraries
Bradford’s law provides a versatile framework for assessing academic impact by classifying papers based on citation counts and authors by their h-index. It ranks papers into zones of influence, similar to how top-tier journals are grouped in JCR rankings, helping researchers identify the most influential papers in their field [51]. The same principle applies to authors, where those with higher h-indices are placed in the core zone, offering insights into both productivity and influence [52]. This broader application not only enhances support for impactful research but also strengthens the academic ecosystem by providing valuable insights into scholarly influence.
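The journal coverage evaluation described above can be sketched in a few lines. The example is illustrative only: the journal names, article counts, and function names are hypothetical, and coverage is measured here as the share of a discipline's articles that appear in subscribed journals.

```python
def coverage_rate(productivity, subscribed):
    """Share of all articles that appear in subscribed journals."""
    total = sum(productivity.values())
    covered = sum(n for journal, n in productivity.items() if journal in subscribed)
    return covered / total


def journals_needed(productivity, target):
    """Smallest number of top-ranked journals reaching the target article share."""
    total = sum(productivity.values())
    running = 0
    ranked = sorted(productivity.values(), reverse=True)
    for count, n in enumerate(ranked, start=1):
        running += n
        if running >= target * total:
            return count
    return len(ranked)


# Hypothetical discipline with five journals and 25 articles in total.
productivity = {"J1": 12, "J2": 5, "J3": 5, "J4": 2, "J5": 1}
rate = coverage_rate(productivity, {"J1", "J3"})
needed = journals_needed(productivity, target=0.8)
```

In this toy case, subscribing to J1 and J3 covers 17 of 25 articles, and the three most productive journals are enough to reach an 80% share.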

6. Conclusions

This paper examines how integer constraints on the number of journals T and articles A affect the shape of the Bradford curve, dividing it into two distinct zones, the core zone and the normal zone, based on the significance of these integer effects. Using the Simon–Yule model, we derive analytical results for key parameters and distributions under a constant entry rate. Theoretical formulas for each zone are developed, and the reasons behind the various shapes of Bradford curves are analyzed. Monte Carlo simulations are employed to study the impact of decreasing entry rates of new journals and aging rates of journals on the shape of the Bradford curve and the key parameters. Finally, we validate our proposed method using empirical data from the Croatian Chemistry and the Solar Power research datasets. The main conclusions are:
  • Bradford curves should be divided into two separate zones based on the significance of integer constraints on journal and article numbers. Different formulas for each zone should be derived separately;
  • Bradford curves can exhibit four different shapes, determined by the second derivatives of the core and the normal zones;
  • The largest productivity $X_1$, the number of journals $T_0$, and the number of articles $A_0$ are key parameters influencing the shapes of Bradford curves. Decreasing entry rates and aging rates of journals affect these parameters;
  • The proposed four-step method can predict general trends in Bradford curves despite some errors;
  • Bradford’s law provides a valuable framework for academic libraries to evaluate journal coverage, optimize resource allocation, and refine Collection Development Policies (CDP), ensuring comprehensive and well-balanced collections as research needs evolve.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/publications12040036/s1, file: Data and Code.

Funding

This research was supported by two projects: the 2024 Shenzhen Library and Information Science Research Project, “Temporal Evolution of Bradford’s Law in the Context of Library Professionalization” (Project No. SWTQ2024261), and the 2023 Guangdong Provincial Library Key Research Project, “Joint Analysis and Data Governance of Papers and Patents in the Context of Smart Libraries” (Project No. GDTK23004).

Data Availability Statement

All relevant data and code used in this study are provided in the Supplementary Materials, organized under the “Data and Code” folder. The code, developed in MATLAB, is available to support the reproducibility of the analysis. Due to the complexity and organization of the code, we recommend that researchers utilize tools such as AI-assisted analyzers (e.g., ChatGPT) to facilitate the interpretation of the code. We encourage other researchers to replicate the findings, and for any further inquiries or clarification, the corresponding author can be contacted directly.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Singh, C.K.; Barme, E.; Ward, R.; Tupikina, L.; Santolini, M. Quantifying the rise and fall of scientific fields. PLoS ONE 2022, 17, e0270131.
  2. Golosovsky, M. Citation Analysis and Dynamics of Citation Networks; Springer: Cham, Switzerland, 2019.
  3. Sun, X.; Kaur, J.; Milojević, S.; Flammini, A.; Menczer, F. Social dynamics of science. Sci. Rep. 2013, 3, 1069.
  4. Leydesdorff, L. The Evolutionary Dynamics of Discursive Knowledge: Communication-Theoretical Perspectives on An Empirical Philosophy of Science; Springer Nature: Cham, Switzerland, 2021.
  5. Sjögårde, P. Mapping the Structure of Science through Clustering in Citation Networks: Granularity, Labeling and Visualization; Karolinska Institutet: Stockholm, Sweden, 2023.
  6. Raimbault, J. Exploration of an interdisciplinary scientific landscape. Scientometrics 2019, 119, 617–641.
  7. Schneider, J.W.; Costas, R. Identifying potential “breakthrough” publications using refined citation analyses: Three related explorative approaches. J. Assoc. Inf. Sci. Technol. 2017, 68, 709–723.
  8. Manolopoulos, Y.; Vergoulis, T. Predicting the Dynamics of Research Impact; Springer: Cham, Switzerland, 2021.
  9. Larivière, V.; Archambault, É.; Gingras, Y. Long-term variations in the aging of scientific literature: From exponential growth to steady-state science (1900–2004). J. Am. Soc. Inf. Sci. Technol. 2008, 59, 288–296.
  10. Groos, O.V. Brief Communications Bradford’s Law and the Keenan-Atherton Data. Am. Doc. (Pre-1986) 1967, 18, 46.
  11. Wagner-Döbler, R. Time dependencies of Bradford distributions: Structures of journal output in 20th-century logic and 19th-century mathematics. Scientometrics 1997, 39, 231–252.
  12. Chisaba Pereira, C.A. Data-driven model (DDM) for collection development and management: From library data to institutional value generation. Qual. Quant. Methods Libr. 2021, 10, 177–187.
  13. Crawford, L.S.; Condrey, C.; Avery, E.F.; Enoch, T. Implementing a just-in-time collection development model in an academic library. J. Acad. Librariansh. 2020, 46, 102101.
  14. Voorbij, H.; Lemmen, A. Coverage of periodicals in national deposit libraries. Ser. Rev. 2007, 33, 40–44.
  15. Zaghmouri, L. The First of Its Kind: Collection Development Techniques for the Vasche Library’s Modern Assyrian Heritage Collection. Collect. Manag. 2023, 48, 48–55.
  16. Mwilongo, K.J.; Luambano, I.; Lwehabura, M.J.F. Collection development practices in academic libraries in Tanzania. J. Librariansh. Inf. Sci. 2020, 52, 1152–1168.
  17. Bradford, S.C. Sources of information on specific subjects. Engineering 1934, 137, 85–86.
  18. Vickery, B.C. Bradford’s law of scattering. J. Doc. 1948, 4, 198–203.
  19. Leimkuhler, F.F. The Bradford distribution. J. Doc. 1967, 23, 197–207.
  20. Egghe, L. Applications of the theory of Bradford’s law to the calculation of Leimkuhler’s law and to the completion of bibliographies. J. Am. Soc. Inf. Sci. 1990, 41, 469–492.
  21. Qiu, L.; Tague, J. Complete or incomplete data sets. The Groos droop investigated. Scientometrics 1990, 19, 223–237.
  22. Egghe, L.; Rousseau, R. Reflections on a deflection: A note on different causes of the Groos droop. Scientometrics 1988, 14, 493–511.
  23. Chen, Y.-S.; Leimkuhler, F.F. Bradford’s law: An index approach. Scientometrics 1987, 11, 183–198.
  24. Egghe, L. Consequences of Lotka’s Law for the Law of Bradford. J. Doc. 1985, 41, 173–189.
  25. Oluić-Vuković, V. Simon’s generating mechanism: Consequences and their correspondence to empirical facts. J. Am. Soc. Inf. Sci. 1998, 49, 867–880.
  26. Oluić-Vuković, V. Impact of productivity increase on the distribution pattern of journals. Scientometrics 1989, 17, 97–109.
  27. Oluić-Vuković, V. The shape of the distribution curve: An indication of changes in the journal productivity distribution pattern. J. Inf. Sci. 1991, 17, 281–290.
  28. Oluić-Vuković, V. Journal productivity distribution: Quantitative study of dynamic behavior. J. Am. Soc. Inf. Sci. 1992, 43, 412–421.
  29. Garg, K.; Sharma, P.; Sharma, L. Bradford’s law in relation to the evolution of a field. A case study of solar power research. Scientometrics 1993, 27, 145–156.
  30. Sen, S.; Chatterjee, S. Bibliographic scattering and time: An empirical study through temporal partitioning of bibliographies. Scientometrics 1998, 41, 135–154.
  31. Oluić-Vuković, V. Bradford’s distribution: From the classical bibliometric “law” to the more general stochastic models. J. Am. Soc. Inf. Sci. 1997, 48, 833–842.
  32. Simon, H.A. On a class of skew distribution functions. Biometrika 1955, 42, 425–440.
  33. Simon, H.A.; Van Wormer, T.A. Some Monte Carlo estimates of the Yule distribution. Behav. Sci. 1963, 8, 203–210.
  34. Ijiri, Y.; Simon, H.A. Skew Distributions and the Sizes of Business Firms; North-Holland: Amsterdam, The Netherlands, 1977; Volume 24.
  35. Chen, Y.-S.; Chong, P.P.; Tong, M.Y. The Simon-Yule approach to bibliometric modeling. Inf. Process. Manag. 1994, 30, 535–556.
  36. Chen, Y.S.; Chong, P.P.; Tong, M.Y. Dynamic behavior of Bradford’s law. J. Am. Soc. Inf. Sci. 1995, 46, 370–383.
  37. Chen, Y.-S. Analysis of Lotka’s law: The Simon-Yule approach. Inf. Process. Manag. 1989, 25, 527–544.
  38. Glänzel, W. High-end performance or outlier? Evaluating the tail of scientometric distributions. Scientometrics 2013, 97, 13–23.
  39. Glänzel, W. The role of the h-index and the characteristic scores and scales in testing the tail properties of scientometric distributions. Scientometrics 2010, 83, 697–709.
  40. Gumbel, E.J. Statistics of Extremes; Echo Point Books & Media: Brattleboro, VT, USA, 1958.
  41. Verhulst, P. Notice on the law that the population follows in its growth. Corresp. Math. Phys. 1838, 10, 113–126.
  42. Gilbert, N. A simulation of the structure of academic science. Sociol. Res. Online 1997, 2, 91–105.
  43. Scharnhorst, A.; Börner, K.; Van den Besselaar, P. Models of Science Dynamics: Encounters between Complexity Theory and Information Sciences; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
  44. Egghe, L. Power Laws in the Information Production Process: Lotkaian Informetrics; Academic Press: Cambridge, MA, USA, 2005.
  45. Burrell, Q.L. Extending Lotkaian informetrics. Inf. Process. Manag. 2008, 44, 1794–1807.
  46. Goffman, W.; Morris, T.G. Bradford’s law and library acquisitions. Nature 1970, 226, 922–923.
  47. Alabi, G. Bradford’s law and its application. Int. Libr. Rev. 1979, 11, 151–158.
  48. Drott, M.C.; Mancall, J.C.; Griffith, B.C. Bradford’s law and libraries: Present applications—Potential promise. Aslib Proc. 1979, 31, 296–304.
  49. Tague, J. What’s the Use of Bibliometrics? 1988. Available online: https://documentserver.uhasselt.be/bitstream/1942/846/1/tague271.pdf (accessed on 12 October 2024).
  50. Xu, F. A standard procedure for Bradford analysis and its application to the periodical literature in systems librarianship. Libr. Hi Tech 2011, 29, 751–763.
  51. Cline, G.S. Application of Bradford’s Law to citation data. Coll. Res. Libr. 1981, 42, 53–61.
  52. Bornmann, L. How are excellent (highly cited) papers defined in bibliometrics? A quantitative analysis of the literature. Res. Eval. 2014, 23, 166–173.
Figure 1. Comparisons of the theoretical and numerical results: (a) number of journals $f_n$ with productivity n; (b) number of papers $n f_n$ produced by journals with productivity n.
Figure 2. Journal productivity in the core region $X_r$ as a function of journal rank r: (a) journal productivity $X_r$ as a function of r; (b) journal productivity ratio $X_1 / X_r$ as a function of r.
Figure 3. The evolution of the Bradford curves and the cause of the Groos droop: (a) the evolution of the Bradford curves; (b) the cause of the Groos droop.
Figure 4. The dynamics of the Bradford curves and the variation of key parameters when α = 0.15: (a) the dynamics of the Bradford curves; (b) the variation of key parameters.
Figure 5. The dynamics of the Bradford curves and the variation of key parameters when α decreases linearly from 0.2 to 0.1: (a) the dynamics of the Bradford curves; (b) the variation of key parameters.
Figure 6. The dynamics of the Bradford curves and the variation of key parameters when α = 0.15 and γ = 0.95: (a) the dynamics of the Bradford curves; (b) the variation of key parameters.
Figure 7. The dynamics of the Bradford curves and the variation of key parameters when α decreases linearly from 0.2 to 0.1 and γ increases linearly from 0.95 to 1.0: (a) the dynamics of the Bradford curves; (b) the variation of key parameters.
Figure 8. The process of determining the point (T, A) for any given time: (a) the total article number $A_t$ as a function of time t; (b) the total journal number T as a function of the article number A.
Figure 9. The procedures for predicting the evolution of the Bradford curves: (a) the variation of key parameters $T_0$, $A_0$, and $X_1$ with the article number A; (b) the dynamics of the Bradford curves.
Figure 10. The process of determining the point (T, A) for any given time: (a) the total article number $A_t$ as a function of time t; (b) the total journal number T as a function of the article number A.
Figure 11. The procedures for predicting the evolution of the Bradford curves: (a) the variation of key parameters $T_0$, $A_0$, and $X_1$ with the article number A; (b) the dynamics of the Bradford curves.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xue, H. Temporal Evolution of Bradford Curves in Academic Library Contexts. Publications 2024, 12, 36. https://doi.org/10.3390/publications12040036

AMA Style

Xue H. Temporal Evolution of Bradford Curves in Academic Library Contexts. Publications. 2024; 12(4):36. https://doi.org/10.3390/publications12040036

Chicago/Turabian Style

Xue, Haobai. 2024. "Temporal Evolution of Bradford Curves in Academic Library Contexts" Publications 12, no. 4: 36. https://doi.org/10.3390/publications12040036

APA Style

Xue, H. (2024). Temporal Evolution of Bradford Curves in Academic Library Contexts. Publications, 12(4), 36. https://doi.org/10.3390/publications12040036

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
