1. Introduction
Prime numbers were a subject of mathematical inquiry long before the time of Pythagoras (570–500 BCE), who first attributed to numbers in general and their relationships a unique—almost mystical—power to express deeper truths about the universe and natural phenomena [
1,
2,
3]. Although Pythagoras and his school did not systematically study primes, we refer to primes that can be expressed as sum of two squares, as “Pythagorean Primes”, e.g.,
,
. The Greek philosopher Heraclitus (540–480 BCE) also accepted the existence of “powerful hidden harmonies” in nature and worked them into his philosophical theory about the cosmos [
4]. The systematic study of primes starts with Euclid (fl. 300 BCE), who famously proved that primes “measure” all other numbers and that there is an infinitude of them [
5,
6]. Eratosthenes (276–194 BCE) invented a sieve as a practical way to find primes [
7] by observing that composite numbers follow repeating patterns. Goldbach (1690–1764), Euler (1707–1783) and Riemann (1826–1866)—among other mathematicians—have grappled with important questions related to prime numbers. One of those questions, Goldbach’s conjecture, states that every even number ≥ 4 can be expressed as a sum of two primes. It remains unproven despite “twenty-six decades of effort by some of the best minds on the planet” ([
7], p. 90) as one of those “conjectural theorems of which no proof is known, although empirical evidence makes their truth seem highly likely” [
8].
The increasing rise of digital communications in recent years, along with the concomitant need for enhanced privacy and secure authentication, has rekindled interest in prime numbers. Several encryption methods employ large primes in the construction of secure cryptographic keys, since the prime factorization of large numbers is a computationally tedious process. Primes therefore help secure the communication of digital information over public networks such as the internet [
9].
Life patterns based on primes also occur in nature. Examples are the cicadas’ 13- and 17-year birth cycles, giving them an advantage over predators by making it rarer to coincide with them. If a predator emerges on a 5-year cycle, it will coincide with the cicadas every 65 and 85 years respectively, giving them an evolutionary advantage. Prime numbers can help explain the phyllotaxis spirals, characteristic spiral arrangement of leaves that maximize a plant’s sunlight exposure.
Scientists continue to study the numerical, algebraic, geometric, distributional, asymptotic and other properties of prime numbers—proved or conjectured—and explore their implications on the nature of our universe.
In this paper, we show that the set P ∪ {1} has the smallest cardinality and values for optimally encoding all integers through product or sum operations. We refer to this unique dual irreducibility property or primes as a quantum type encoding optimality. Finally, we show how these encoding properties can be extended to non-integers.
2. ILP Formulation–Additive Optimality of Prime Factors
Euclid, in his Elements, first proved that any integer, larger than or equal to 2, that is not a prime can be expressed as a product of primes called prime factors ([
5]
Book VII Prop. 30–32, pp. 218–219;
Book IX Prop. 14, p. 265). This result is of such profound importance in mathematics that it has been called the “
Fundamental Theorem of Arithmetic”.
The uniqueness of prime factorization implies that an integer cannot be expressed factorially using a smaller number of prime factors. Primes can therefore be thought of as the “universal factorial quanta” of ℤ≥2, and, in that sense, P is irreducible in a quantum sense; i.e., no subset of ℤ with smaller cardinality or member values exists having the same universally optimal factorial encoding capability in ℤ.
Irreducibility is a fundamental property in science; for mathematics in particular, irreducibility arises in problems related to computability, incompleteness, decidability, algorithmic information theory and the limits of computation [
10]. Generally, when all members of a set A can be expressed as functions of set B, and no set with smaller cardinality than B exists with the same property, then B is called
irreducible. If B⊂A and B is irreducible in cardinality
and values, i.e., if B contains the fewest
and smallest numbers possible, capable of encoding every member of A, we refer to the subset B of A as
optimal in a quantum encoding sense.
Prime factors are also optimal in a factorial-additive sense, as we show below.
Definition 1. For s ∈ ℤ≥2, the Prime Counting Function π(s) is equal to the number of primes less than or equal to s.
Notation. The symbols ℝ, ℤ and ℤ≥2 are used in this paper as representations of the set of real numbers, integers and integers greater or equal to 2, respectively. The nth prime number, in ascending order, is denoted as ; Pn is the set of the first n primes, P of all primes, , is the set of odd primes, and . The integer part of x ∈ ℝ, i.e., the largest integer less than or equal to x, is denoted by ⌊x⌋ and the discrete delta function, taking the values of 1 for x = 0 and 0 for x ≠ 0, is denoted by δ(x). The natural logarithm of x is log(x).
Example 1. π(12) = 5;π(13) =π(14) =π(15) =π(16) = 6 where = {2,3,5,7,11}, =∪ {13}.
Consider an odd
s ∈ ℤ
≥3 with
π(
s) =
n for some
n ∈ ℤ
≥1; then
s may be expressed by its prime factorization
where
∈
,
∈ ℤ
≥0 for
i ∈ [1,
n]. By taking the natural logarithm on both sides of (1)
and since
s ≥
>
n, the above may be written as
or, equivalently, in vector form
where
By applying (2) on each element of
we obtain
where
and
is the
augmented log coefficient matrix of
s. The following example shows how (3) can be used to compactly express the prime factorizations of all odd numbers in [3,
s].
Example 2. Consider s = 9. Using (3), the logarithm of each odd i ∈ [3, 9] can be expressed through prime factorization as a linear product ofas follows: In the above equation, log(9) is expressed through the prime factorization of
, or equivalently
and not via the identity
which would result in
. This is an important distinction, in terms of the additive optimality of prime factors, as we will see below.
We can rewrite (3) as
where I is the (
s − 1)/2 × (
s − 1)/2 identity matrix and
the (
s − 1)/2 × 1 null vector. Equation (4) implies that the vector
is in the null space of matrix
. As we saw in the previous example, this does not give us immediate insight into the optimality of
, since
trivially satisfies (4). To establish the
additive optimality of prime factors we need the following proposition.
Proposition 1. Letsatisfy (4). Ifcorresponds to the prime factorization of, then every element of the (s − 1)/2 × 1 vectorattains its minimum value.
Proof. Since the minimality applies element-wise for vector , we may use (2) to show that the proposition holds for each vector element. Consider any non-prime factorization of a composite, i.e., non-prime, number s > 1 and let q be any composite factor. If s is odd, 2 cannot be one of its factors, and it follows that there is a prime p > 2 and some odd integer q > 2 such that . Using basic algebra, we can show that , i.e., the sum, p + q, of two factors is always strictly less than their product iff which, in this case, is true. Since primes are the smallest factors possible, the proposition is proved. For s even, the same rationale applies with the exception that, in this case, we have p ≥ 2 and q ≥2 so that . Therefore, in this case, the attained additive minimum may not be unique, except for the special case when s/2 is odd. □
Remark 1. The minimality condition of Proposition 1 does not extend to the logarithms of composite factors, since Remark 2. The reason for losing the strict inequality in the proof of the above proposition for s even—and thus, the uniqueness of the minimum—is that the number 2 has the unique property, which means that any factorization havingas a factor, with n ≥ 2, results in different factorizations with equal factor sums. For example, consider, which can be factored either asor, both with a factor sum of 6. Even numbers not divisible by, for n ≥ 2, are not subject to this relaxation. In such case, s/2 is odd, since the multiplicity of 2 in the prime factorization of s is equal to one. As a result, the conditions for the strict inequality in the proof of Proposition 1 hold.
Example 3. Consider s = 450 with prime factorizationand prime factor sum equal to. From the previous discussion, we expect non-prime factorizations to have factor sums larger than that of prime factors:
This example demonstrates how the sum of factors gradually decreases as each factorization includes more prime terms, attaining its unique minimum value (=18) when all factors are prime numbers.
The additive minimality of prime factors, established by Proposition 1, gives us additional insight into how efficiently primes encode integers in a factorial-additive sense.
Example 4. Consider the previous example and suppose we are asked to encode s = 450, using multiplication only, in a way that minimizes the “additive cost” of the factors used to generate it. We are told that each factor has a cost equal to its value and are asked to allocate our factor-cost budget in such a way that 450 is generated factorially at minimum cost. The unique solution to this encoding optimization problem is to generate 450 via its prime factorization at a total additive-cost of 18. The additive cost of every alternate factorization is higher than that of the prime factors, ranging from 19 to 227, as shown in the previous example.
In summary, the subset ⊂ ℤ uniquely encodes all members of ℤ factorially, has the lowest cardinality, the smallest possible numbers as members, with each factorial encoding also optimal in an additive sense.
From (4) and Proposition 1, it follows that prime factorization is the solution to the ILP-type problem
where
.
Equation (5) describes a minimization problem with a linear objective function, linear constraints, and integer decision variables. Here, the decision variables are the elements of the augmented log coefficient matrix .
Remark 3. The last row ofcorresponds to the prime factorization of s, expressed in log form, as in (2).
Observe that the value of element
of matrix
can be constrained by
to further reduce the size of the ILP’s decision space.
Through the ILP formulation described by (5) and (6), we can employ a rich array of computational optimization methods to solve large-scale prime factorization problems.
Remark 4. The above formulation can be used to determine prime factors larger than a given number n. One way to accomplish this is to construct a number s that cannot have any prime factor ≤ n and then solve the corresponding prime factorization ILP. A candidate that meets this condition is where
is the product of all
odd integers up to
n and
If Mn, given by (7), had any prime factors ≤ n, it would follow that would have at least one of those as prime factor; false, since the only prime factor of is 2.
Example 4. To illustrate this point, consider the cases m = 5, m = 13: For much larger n, finding the prime factors requires more sophisticated search and optimization techniques.
Remark 5. Instead of multiplying all odd numbers up to n, as in the previous example, we can multiply the number 3 with all integers not exceeding n, of the form 6·k ± 1 with k ≥ 1, since all primes > 3 may be expressed that way. This results in a smaller candidate compared to . If we know the first m primes and want to search for a prime larger than , we can opt for an even smaller product given by , where i = 2, 3, …, m. We explore additional search techniques in the next section, where we discuss partitions and the summative optimality of primes.
Remark 6. Given that any number s∈ℤ≥2may be written aswhere ∈ℤ≥0 for i∈ [1, n] are the factor exponents and since, by definition, all≥ 2which implies that is an (elementwise) upper bound on the factor-cost vector given by (5). Remark 7. The prime counting function π(·) may be expressed in algebraic form. By using such an expression, we can express Goldbach’s Conjecture (GC) algebraically. In Appendix A, we derive algebraic expressions for π(·) and GC and discuss their properties. 3. Summative Optimality of Primes
Of the many remarkable properties prime numbers possess, perhaps the most intriguing is a “mirroring” property inherent in their distribution on the number line. To visualize this, consider an even number s ≥ 6, and write the odd numbers in [3, s/2], in ascending order, on the left and those in [s/2, s − 3] in reverse order on the right. By construction, the sum of both elements in any row is equal to s. In this otherwise unremarkable arrangement, at least one of the pairs is comprised of two prime numbers, suggesting that any even number ≥ 6 may be expressed as a sum of two odd primes.
Considering that the density of primes π(s)/s decreases as s is increased, the above conjectured “mirroring” property becomes even more surprising: that for any even number s, there is always an odd prime in [s/2, s] that “aligns” itself with another odd prime in [3, s/2], in such a way as to partition s. We will explore the implications of this later in the paper.
Example 5. Let s = 40 and consider all partition pairs (, ) wherewith, odd > 1: (3,37)*, (5,35), (7,33), (9,31), (11,29)*, (13,27), (15,25), (17,23)*, (19,21). There are 3 Po× Po pairs, marked with *.
Remarkably, after removing all repeating number patterns, such as the prime factors of s and their multiples, there is always an odd prime in [s/2, s − 3] that aligns itself with another odd prime in [3, s/2] to form a partition of s. This persistent pattern seems to be a fitting example of an Heraclitean “hidden universal harmony”.
An interesting ramification of the above, is that given an even
s > 6 with
m prime factors, at least one of the
π(
s/2) −
m numbers given by
, where
is any of the primes in [3,
s/2] that are
not factors of
s, is prime. This facilitates finding large primes, as it dramatically reduces looking for prime candidates in [
s/2,
s – 3]. Since a prime in [
s/2,
s − 3] is always paired with a prime number in [3,
s/2] which is
not a prime factor of s, if we want to find a prime larger than
s/2
, we need only perform at most
π(s/2) −
m primality tests. When we have no a priori knowledge of the primes in [1,
s/2] we may still use the algebraic properties of pairs in a partition to make the search for
Po ×
Po pairs more efficient. We discuss this in
Appendix B. Note that a consequence of the prime density decreasing with
s is that—even for large values of
s—primes are not evenly distributed in [1,
s/2] and [
s/2,
s]; this is discussed in
Appendix C.
This prime “mirroring” (or “pairing”) property is a result of the “Goldbach conjecture” (GC), named after the German mathematician Christian Goldbach (1690–1764) who mentioned it in a 1742 letter to the eminent Swiss mathematician, physicist, astronomer, geographer, logician and engineer Leonhard Euler, to which Euler responded: “That every even integer is a sum of two primes, I regard as a completely certain theorem, although I cannot prove it.” [
11]. While Goldbach considered {1} as prime, the statement “
every even number > 2 can be expressed as the sum of two primes” is now the proper formulation of the GC, commensurate with our current definition of primes.
The GC is easily seen to imply that “every odd number > 5 is the sum of three primes”, since every odd number > 5 can be expressed as sum of an odd prime and some even number, which in turn is a sum of two primes. This is often referred to as the “weak” or “odd” Goldbach Conjecture (oGC).
In the two and a half centuries since the GC was formulated, and while considerable efforts have been made by mathematicians towards a proof, the conjecture remains open. The oGC is considered “relatively easier” and has been proved for numbers larger than
[
12]. The GC has been computationally verified for numbers up to
[
13].
Rather than considering the GC or oGC as binary questions, it is helpful to think of them, respectively, as
encoding efficiency limits, in the sense that, in general, an even number cannot be written as a sum of
fewer than two odd primes and an odd number cannot be written as a sum of
fewer than three primes. In that sense, significant progress has been made towards attaining these encoding limits. Vinogradov (1937) showed that every sufficiently large odd integer can be written as the sum of at most three primes, and so every sufficiently large integer is the sum of at most four primes. One result of Vinogradov’s work is that we know GC holds for almost all even integers. In 1966, Chen proved that every sufficiently large even integer is the sum of a prime plus a number with no more than two prime factors. The GC has also been linked to the Riemann Hypothesis (RH), in the sense that if we accept the RH as true, then every odd integer is a sum of three primes, which implies that every even integer is a sum of four primes. The interested reader can find more information in [
14].
Ultimately, this might be the first instance where an important proof relies on showing a mathematical proposition to be true for a range of numbers larger than some lower bound L, while computationally confirming its validity for all numbers up to L.
In this paper, our purpose is to explore the universal optimality property of primes at its limit; we do so by assuming that the optimal encoding efficiency implied by the GC is true.
Note that, even at suboptimal encoding efficiencies, i.e., assuming that even (odd) integers can be expressed as sums of
more than two (three) primes, the set of primes P remains the subset of ℤ with the fewest and smallest members that can most efficiently encode all integers. For example, consider the—suboptimal to the GC—summative encoding efficiency resulting from Bertrand’s postulate ([
8], p. 455), which asserts that there is always a prime
p ∈ [
n, 2
n] where
n ≥ 1. Therefore, there is always a prime
p ≥ ⌊
s/2⌋ in the interval [2,
s] for
s ≥ 3. The postulate implies that this prime spans at least half of
s. By successively applying this process we establish the existence of a sequence of decreasing primes
with
s as its sum. For example, let
s = 100 and let
be the first prime in the sequence between [
s/2,
s]. By applying the postulate to the remainder
and selecting
∈ [⌊39/2⌋, 39] as the second term in the sequence, the new remainder becomes
. Since the new remainder is small (<20), we can easily express it as a sum of two primes, e.g., 13 + 3. Generalizing, we can show that any number
s ≥ 3 can be written as a sum of
at most ⌊log(
s)/log(2)⌋ + 1 primes. Given the sparsity of primes and, hence, the low value of the ratio
, as
n increases, i.e., the ratio of the cardinality
n of
to its largest member
, this demonstrates the remarkable summative encoding capability of primes.
We therefore accept as true that every even integer > 4 is a sum of two odd primes. Equivalently, we accept as true that every integer > 0 may be expressed as a sum of up to three numbers in . From the above discussion, it follows that by accepting the validity of the GC, we establish the significance of and its subsets as the most efficient—in an encoding sense—“building block” of all integers, including primes themselves.
Note that, since P ⊂ ℤ, we may apply the oGC to generate a “summative decomposition” of primes. Since each prime ≥ 5 can be written as a sum of three primes, it follows that ultimately, smaller primes could be used as the summative building blocks of larger primes. For example: 5 = 2 + 3, 7 = 2 + 2 + 3, 11 = 3 + 3 + 5, 13 = 3 + 5 + 5, 17 = 5 + 5 + 7, 19 = 5 + 7 + 7, 23 = 5 + 7 + 11 etc. We can therefore use the first 5 primes {2,3,5,7,11} to summatively optimally encode the next 4 primes {13,17,19,23}, for an intra-prime summative encoding efficiency factor of 1.8 [=9/5].
As with the factorial case, the optimal summative encoding implied by the GC, can be represented as the solution of an optimization problem. In this case however, the formulation is nonlinear, i.e., gives rise to a nonlinear programming (NLP) formulation. To see this, consider how a set of even numbers in [6, 2m], for some integer m ≥ 3, can be generated by using one addition operation over the smallest possible subset of odds within [3, 2m − 3]. The solution is the summative optimal encoding implied by the GC, i.e., that every even number ≥ 6, is the sum of two primes.
Note that additional constraints may be introduced to refine this solution, when multiple summative encodings of an even number
e exist. One approach is to select the one with the smallest (or largest) deviation from the midpoint
e/2, as discussed in
Appendix B. For example, in the case of
e = 16, this would result in
be more (or less) preferable than
, in the sense that in the first encoding instance the distance from the midpoint is 5, while in the second it is equal to 3.
Let
be the vector of even numbers in [6, 2
m], for some integer
m ≥ 3, and
the vector of odds in [3, 2
m − 3]. Thus, the NLP may be formulated as follows:
where
A is the
m ×
m decision-variable matrix, with elements
for
, specifying the elements of
to be added in order to optimally generate the output vector
. From the GC, we know that the prime elements of
are necessary and sufficient for an optimal NLP solution to exist, i.e., to satisfy the objective function and the constraints in (9).
Ongoing research is focused on alternate formulations, e.g., based on an objective function that minimizes the rank of the decision matrix, as in the NLP below: