1. Introduction
The
k-satisfiability (abbreviated) problem refers to a Boolean formula
k given a conjunctive normal form (CNF), in which each basic clause is a disjunction of
k words to determine whether it exists. A set of truth assignments makes the value of
true and finds all assignments that can satisfy
. The theory and application of the SAT problem is an important topic for scholars of computer and mathematical logic [
1]. The
k-satisfiability (
k-SAT) problem is also the first problem to be proven to be non-deterministic polynomial(NP)-complete.
Since the discovery of phase transition phenomenon [
2,
3,
4] in solving the random 3-SAT formula in the early 1990s, researchers have found that the random SAT formula is almost always satisfiable or unsatisfiable, depending on the size of the constraint density
. The change point is called the satisfiable phase change point
. In the random 3-SAT problem, although the exact value of
has not been studied, studies have shown that
is at least 3.52 and at most 4.506. Before
reaches
, the solution space of the 3-SAT formula undergoes several phase transitions and evolutions.
In recent years, scholars have conducted extensive research on the
k-SAT problem. The most important progress includes the prediction of the phase transition point, the search for the strict lower bound of the satisfiability threshold, etc., and a wealth of research results have been obtained. Among them, literature [
5,
6] uses the symmetric cavity method (1RSB) in statistical physics to give the satisfiability threshold
of the 3-SAT formula as a function of
k, and calculate the approximate value
. Literature [
7,
8,
9] used the 1RSB entropy cavity method [
10,
11] and found that before
reaches
, the solution space undergoes a cluster phase transition at
. The solution space before the phase transition is mainly dominated by a large solution cluster, in which there are rich internal structures and multi-dimensional communities. As
increases, when
reaches
, the number of macroscopic states in the solution space will suddenly increase, that is, the solution space will begin to evolve and split into exponentially multi-level declusters, and solutions of different declusters cannot be reversed by a single variable. From one cluster to another cluster, and its similarity is much lower than the internal solutions of the same cluster, it is more difficult to use the local search algorithm to solve. After the clustering phase transition occurs,
continues to increase, and the constrained density
undergoes a condensed phase transition at
. At this time, the exponential multilevel clusters decrease sharply, but the solution space still has exponential multilevel declustering at this time, and the declustering. The separation is obvious. In the 3-SAT problem
, literature [
12,
13] used long-range frustration theory to predict the threshold of interference transition
in the
k-SAT formula. When
reaches this threshold, it indicates freezing. The argument begins to strongly dominate each solution community in the declustering, and increasing the
declustering will cause the community to disappear. This threshold can also be used as the lower bound of the satisfiability phase transition point
, but it is shown through experiments that the prediction is accurate when
. When
,
, which is quite different from the predicted result.
In 2016, Angsheng Li proposed the structural entropy theory [
14]. The structural entropy principle of graph is essentially a measure, which is defined as determining the minimum total number of digits required for node code that can be accessed from the random walk in graph. This principle can completely or maximally detect the two-dimensional and
k-dimensional structure composed of the rules and order of the graph against the random changes in the graph. They use random walk to capture the information interaction between nodes in the graph to completely distinguish the structural noise in the undirected graph, and use the structural entropy of the graph to study the information interaction between nodes and the natural clustering of nodes, and achieve some new results [
15,
16,
17].
As is known to all, the internal structure of the solution space of SAT problem exists rich and multistage community structure, the view of the evolution and phase transformation of solution space, the existing studies are all narrow discussions of
with different constraint densities, lacking a unified generalized model. In addition, the existing research lacks a high summary of multidimensional structure information in solution space.
K-SAT problems can be transformed into 3-SAT problems through the method of specification [
18], so 3-SAT problems are of great research significance. Inspired by structural information theory, in this paper, we study the structural properties of the solution space of 3-SAT problem by using the structural entropy of graphs.
This paper studies the 3-SAT problem and the community structure changes in the problem solution space. Based on structural entropy theory, this paper proposes a two-dimensional structural entropy measurement model and a
k-dimensional solution space measurement model to measure and analyze the evolution and phase transition of the 3-SAT solution space. We introduce the
model to randomly generate instances of the same scale with different
. Furthermore, we propose and use the belief propagation (BP)-based planting strategy algorithm and the planting strategy-based label propagation algorithm (LPA) to generate the 3-SAT problem and the solution space, and study its structural information. Combined with the phase transition of the solution space structural entropy, the phase transition of the solution space and the change in community structure dependent on the change in
are analyzed; the structural entropy conditions when the solution space changes in the community are obtained, which was experimentally verified [
12,
13]. As result, the accuracy of the predicted interference transition threshold
could be determined.
2. Preliminaries
2.1. Bp Iterative Equation
Let us suppose the CNF is , the set of arguments is denoted by , and is used to represent . Formula can construct a bipartite graph , termed a factor graph. The edges in figure are divided into real edges and imaginary edges. Among these edges, is called the argument set and is called the clause set. The conjunction of clauses constitutes the conjunctive normal form formula, which is denoted as and , for short.
As shown in
Figure 1, two types of information are defined on the edge of the factor graph, namely the information
(
) transmitted by the argument to the clause and the information
sent by the clause to the argument [
19,
20].
indicates whether the argument
is in or not, the probability that the value is
under the constraint of clause
, and the probability that the variable
takes the value of
without the constraint of clause
.
represents the probability that the variable
takes the value of
to satisfy clause
. According to the cavity theory in statistical physics, the following BP iterative equation can be obtained [
21]:
Among these formulas, and are normalization factors, and , respectively, represent the set of clauses connected to the argument to remove and the set of argument nodes connected to the constraint to remove . If the BP iteration can converge, the marginal probability of argument can be calculated according to the information , and the value of the argument can be obtained from the marginal probability.
2.2. Solution Space of 3-SAT Problem
In the 1RSB theory, a solution cluster in the solution space of a random 3-SAT problem is regarded as a macroscopic state of the system. When the constraint density of the random 3-CNF reaches a certain critical value , the structure of the solution space of the random k-CNF formula will change qualitatively, and the number of macro-states in the solution space will suddenly increase, that is, the solution space is split into a large number of subspaces, each subspace contains a certain number of solutions, but the statistical properties of different subspaces are different, and the similarity of the solutions of different subspaces is much lower than that of the same subspace. This abrupt phenomenon is called the cluster phase transition under the thermodynamic limit , and is called the confinement density of the cluster phase transition.
Take the change of solution space of
k-SAT formula as an example, as shown in
Figure 2. When the constraint density
gradually increases from zero and is close to
, although the solution space has only one macroscopic state under the meaning of statistical significance, more and more community structures and more and more community structures have been accumulated in this macroscopic state. Strong point-to-set association. The point-to-collection feature correlation length tends to be infinite as
becomes smaller. In the random
k-SAT problem, when
exceeds
until it reaches another critical value point
, the solution space of the random
k-CNF formula will be condensed. Regarding [
22], it signifies that the statistical properties of the solution space begin to be determined by those subspaces that contain the largest number of microscopic configurations but a small number. In the random 3-SAT problem,
, that is, the cluster phase transition point and the condensed phase transition point coincide. When
, the number of subspaces that determine the statistical properties of the solution space changes from the order of
to the order of
(where
and
are constants).
2.3. Two-Dimensional Structural Entropy of Undirected Graphs
The solution space is expressed as an undirected graph, which is convenient for us to study the structural characteristics of the solution space. The real-world network is a highly connected graph, which develops dynamically. We use random walks to obtain the interactive information of nodes in the network, and give the concept of entropy reflecting the dynamic complexity of the network. This is the one-dimensional structural information of the network. Here we first recall the Shannon entropy, and then introduce the definition of structural entropy [
14].
Definition 1. For the probability vectorand, then the entropy function of the probability vectoris defined as: We interpret as “’s self-information”, which also means that is the amount of information required by a certain code. Therefore, is the average amount of information in the code that determines the probability distribution.
Definition 2. Letbe an undirected connected graph withnodes andedges. For each node,
is the degree of nodein graph G, let, and then describe the fixed distribution of random walk through probability vector, then the definition of one-dimensional structural entropy is as follows: The one-dimensional structure information of the connected graph measures the amount of information needed to determine the one-dimensional code of the node. The one-dimensional code can be obtained from a random walk in G and has a stable distribution.
Definition 3. For the undirected graph, assuming thatis the community of the node setof, the structural entropy of graphwith respect to the partitionis:whererepresents the number of nodes in community;is the volume of community; that is, the sum of degrees in community;represents the number of communities in partition;is the degree of the-th node in; andis community, which connects degrees outside the community. If each node in graphis regarded as a code pair, the code of the community-containing nodeis, and the code of this node in the community is. According to the information theory proposed by Shannon, the first term ofis the minimum number of coding bits required to determine the position of nodein community, that is, the node that graphcan visit through one-step random walk is. The second item is to determine the minimum number of coding bits required for community. Therefore, the two-dimensional structural entropy is defined as: Definition 4. For the two-dimensional structural information of a discontinuous graph, given a graph,
suppose it is an inductive subgraph of all connected components of,
and there is a two-dimensional structural entropy formula:
whereis the volume of the graph,
is defined according to the two-dimensional structural entropy, andis the weighted average of the two-dimensional structural entropy of all subgraphs. Specifically, if there is a subgraph, such as, with only one isolated node, the structural information ofis:. Therefore, for any graph, if thenode has no edges, then.
Information entropy reflects the overall amount of information in the undirected graph, that is, the uncertainty of the undirected graph network. Based on this, the two-dimensional structural entropy follows the principle of minimizing uncertainty to remove structural noise and data redundancy. The degree of association between structure and node structure can effectively measure the structural information within a community and the structural complexity between different communities.
4. Structural Metric Model of Solution Space of 3-SAT Formula
4.1. Lpa Based on Planting Strategy
The definition of structural entropy involves the concept of community division. This means a complex network is divided into several non-overlapping communities according to a certain method; the structural information within and between communities is then calculated through classification and integration, and the structure of the entire network is finally obtained. Entropy is a manifestation of important structural information, such as the structural complexity, dynamic interaction, and uncertainty of the network.
There are no specific requirements for community division in the definition of structural entropy, but according to Formula (5), improving the accuracy of community division is the only way to minimize network uncertainty. Currently, in broad sense, there is no efficient community discovery algorithm that can satisfy any community at any same time. Algorithms that perform well in individual networks may not perform satisfactorily in community discovery in other networks. Therefore, designing an efficient community discovery algorithm for the solution space of the 3-SAT problem can greatly improve the accuracy of the structural entropy measurement method of the solution space in this problem.
The LPA is a semi-supervised machine-learning method based on graphs. It has the advantage of being suitable for large-scale undirected graphs; its community discovery is fast and the number of communities does not need to be known before each iteration.
When the node is initialized, the original LPA assigns a unique label to each node in the network, which randomly arranges the nodes to obtain the node sequence and uses this as the label update sequence. In the calculation, if the highest number of labels of neighboring nodes is the same, then a neighbor is randomly selected to update the label. When using the LPA algorithm for community discovery, the update order of tags is very sensitive; thus, the update order of these tags affects the magnitude of community discovery. The above two random processes will gradually increase the impact of relatively important nodes on the community as the iteration progresses, while ignoring the differences in importance between nodes, resulting in a “backflow” of labels, and may cause problems that did not originally belong to the community. Nodes are included in the community, which ultimately lead to too much difference between the results discovered by the community and the real community.
In the solution space generated by Algorithm 1 in the previous section, an LPA based on the planting solution strategy is proposed to optimize the belief propagation algorithm to solve the community division of the solution space in the 3-SAT formula.
In the process of selecting the label update, the initially labeled node selects the label with the highest occurrence in the neighbor nodes to update the label, and selects the label to be updated according to Equation (11):
where
represents the label to be updated,
represents the set of neighbor nodes of node
, and
represents the label of node
’s neighbor node
.
In the solution space calculated based on the planting strategy, the greater the number of common neighbors between nodes, the greater the number the nodes that are affected by the same or several frozen variables. At the same time, each node has a parameter of the number of flips. The smaller the number of flips, the more the node is affected by the frozen variables. As gradually increases, the influence of the frozen variable on the solution space community gradually increases, so the community must be divided according to the influence of the frozen variable on the solution. Dividing the community structure in this way is also more accurate than the traditional LPA and can greatly reduce the uncertainty of the solution space structural information.
The LPA based on the planting solution strategy proposed in this paper uses the influence of the above frozen variables to determine the update order. It also uses the similarity between nodes to measure the more common neighbors of the two nodes: the greater the similarity of the nodes, the greater the probability of being in the same community, thus the result of the division is more in line with the original community distribution. The degree of a node in an undirected graph is defined as the number of edges directly connected to the node. The connected nodes structurally indicate that there is only one variable value between them. The similarity between nodes is calculated by the following formula:
Among them,
is the first-order neighbor set of nodes
and
is the degree of node
. In this algorithm (Algorithm 2), the node update sequence is traversed, and the similarity calculation is performed. If the current node and the node before the update sequence are adjacent nodes, the similarity index is less than the threshold. If the label of the node is null, then the node before the sequence is selected as the seed node; otherwise, it is not selected as the seed node.
Algorithm 2 LPA based on planting solution strategy. |
Input: Output: Community findings
If connect: go to Step 2, flag = 1; else: Backtrack and traverse all connected subgraphs; go to Step 2, flag = 0. According to the walkSAT flip result, the ascending sequence of the number of nodes set flips is formed, and the seed node set is generate. for , let . , ; if : go to Step 6. , if : go to Step 4.
- (1)
, go to Step 5. - (2)
If: go to Step 5. - (3)
If , , , , and : go to Step 5. - (4)
If: go to Step 5. - (5)
Go to Step 5
If there are multiple labels for node in the network, the label with the most occurrences in its neighborhood is selected as the label for . If there are multiple seed nodes that meet the conditions, the one with the most frequent occurrences of the label is selected as its own label. If node does not have an adjacent seed node that meets the conditions, keep the label as empty . without change.
|
In the above algorithm, Steps 1–7 use BP iteration to calculate the satisfiable probability of each variable and divide the variable nodes into two categories, namely, frozen variables and non-frozen variables. In Step 8, the non-frozen variables are randomly flipped by the walkSAT algorithm to form a solution space with the frozen variables as the core, and the number of flips for generating each solution walkSAT is recorded. Steps 9–10 calculate the Hamming distance between the solutions and generate the solution space adjacency matrix to construct the solution space.
The number of flips means the influence of the frozen variable on the solution. The more the number of flips, the greater the influence, and the more likely it is to be in the same community. The solution space obtained by Algorithm 1 through the planting strategy records the number of flips of each solution, which can be used as the basis for the next solution space community division.
4.2. Two-Dimensional Structural Entropy of Solution Space of 3-SAT Formula
In order to obtain the structural information of the solution space of the 3-SAT formula, the solution is represented by nodes, and the connection between nodes means that only one argument between the two solutions has a different value. This can be completed without any loss of the relationship between the solutions and the relationship between the solutions and the information of each node. The solution generated by randomly flipping the non-frozen variables on the basis of the frozen variables by walkSAT in Step 8 of Algorithm 1 is abstracted as an undirected graph. Therefore, through the research and analysis of the solution space undirected graph, the structure information of the solution space of the 3-SAT problem can be obtained more directly. The community structure of the undirected graph divided by Algorithm 2 is shown in the
Figure 4.
The nodes in
Figure 3 are divided into three communities, namely {1,2,3,4,5}, {7,8}, {6,9}. The division of the community structure not only depends on the inherent relationship between nodes and more importantly, divide the nodes with similar influence distribution of the frozen variables on the solution community into the same community, and always follows the principle of minimizing structural entropy in the process of radiating other nodes. Therefore, the community structure can be maximized. It satisfies the information minimization principle in the definition of structural entropy, that is, it satisfies
in the definition of two-dimensional structure entropy and
in the definition of
K-dimensional structure entropy.
If the undirected graph
G has no self-loops and polygons, then the probability of a random walk in
G is set
, the static distribution of the random walk can be expressed as a probability vector
, where
. The amount of information about the distribution can be obtained:
According to the theory of information entropy, in a limited set of mutually exclusive and joint exhaustive events, entropy is the average amount of information, so the local structural entropy (average amount of information in the network structure) within the community is expressed as:
In information theory,
represents the unit of entropy as “bit”, that is, the entropy value represents the number of bits required for encoding. Therefore,
represents the average amount of information required to determine the number of coded bits of
by the probability
, where
is used as the number of binary representations, indicating that
can be represented as
, so
is to determine the amount of information needed to encode
. The local structural entropy between communities is expressed as:
The community structure is divided by the LPA algorithm based on the planting strategy, and the sum of Formulas (14) and (15) are used as the two-dimensional structural entropy of the proposition formula, which reflects the interactivity between nodes within the community. In addition, due to the consideration of the influence of frozen variables, these influences are used to form communities. According to the probability of random walk and the random flip of walkSAT, the understanding nodes are obtained by radiation expansion. At the same time, it is changed to Algorithm 2 and it can extract structural noise and data redundancy effectively, which satisfies the principle of minimizing the uncertainty in the undirected graph community discovery to the greatest extent.
4.3. K-Dimensional Structure Entropy of Solution Space of 3-Sat Formula
The previous section has analyzed that the LPA based on planting strategy follows the principle of minimizing structural entropy when dividing communities, and satisfies in the definition of k-dimensional structural entropy. This section introduces the construction process of the K-dimensional structural entropy community segmentation tree.
As shown in
Figure 5, using Algorithm 2 iterative community division, the solution space undirected graph can be divided into a multi-level community structure. As shown in
Figure 6, thereby constructing a split tree with a set of nodes in the graph as tree nodes. The tree structure can be constructed according to the following regulations:
- (1)
Define the set , is the root node.
- (2)
Let be the node on the split tree. The immediate successor node of is and the nodes are sorted from left to right in ascending order.
- (3)
Let be the node on the split tree. indicates that is the initial character segment of . When is not the root node, the longest initial character segment of is .
- (4)
For each node, is a division of ; thus ( is the height of , the height of the root node is 0, and the height of other nodes increases according to the tree).
- (5)
For each node, if is the parent node of , .
After initializing the split tree, the undirected graph is divided into communities to obtain multiple overlapping communities , and these overlapping communities are used as the second-level nodes of the split tree, where the parent nodes are the root nodes (including all nodes in the undirected graph). Each community contains one or more nodes, and . If the number of nodes in () is greater than one, continue to divide , and use the sub-community as its child nodes. If the sub-community still contains multiple nodes, repeat the above steps to continue dividing the community until it is in the community. When only a single node is included, the division of the community is stopped.
For a given undirected graph
G, according to the principle of information minimization, the
K-dimensional structure information of
G is to determine the total number of bits of
K-dimensional coding of
G. Therefore, the
K-dimensional structural entropy also provides the dynamic complexity of the quantized undirected graph
G of structural information. On the one hand, as the constraint density α reaches the satisfiability threshold
, a single solution cluster in the solution space of the 3-SAT formula undergoes a cluster phase transition [
7,
8,
9], and the cluster phase transition before the phase transition has rich internal structure, and most of the solutions in the solution space are dominated by a large solution cluster. On the other hand, it is defined by the
K-dimensional structure entropy. The
K-dimensional knot structure entropy is suitable for the multi-level structure analysis of undirected connected graphs. Therefore, The
K-dimensional structure entropy is suitable for measuring the multi-dimensional structure information of large-scale declustering before the cluster phase transition occurs.
It can be seen from Formula (8) that all the communities of the connected undirected graph are included in the K-dimensional structure entropy calculation based on the structure of the partition tree, and the rules and order of the K-dimensional structure graph can be detected to the greatest extent. The order is completely extracted from the undirected graph, and the order and disorder are separated from the structured data, and the cumulative sum of the entropy of the tree nodes is calculated, which conforms to the additivity of entropy. Therefore, the K-dimensional structure entropy takes into account the community segmentation of all nodes, and fully and effectively analyzes and understands the dynamic complexity and uncertainty of spatial undirected graphs and other important properties. The partition tree constructed by the iterative division of Algorithm 2 can effectively remove the noise in the undirected graph, and minimize the information loss from the construction of the partition tree to the calculation process of structural entropy. The final calculation result is a highly abstract and summary of the K-dimensional information of declustering.
5. Result Analysis
In order to ensure the accuracy of the 3-SAT formula solution space measurement model, this section first verifies the performance of the LPA algorithm based on the planting strategy. For the communities divided by the BP algorithm (Algorithm 1) based on the planting strategy, the modularity standard is used to analyze the quality of the community division. The basic principle is to regard the network as a zero-model network, and discover the difference between the modularity of the community and the modularity of the zero model through comparison. The greater the difference, the better the quality of community discovery. The definition of modularity
is:
where
represents the number of edges in the network,
represents the adjacency matrix of the network,
represents the degree of node
, and the function
represents if nodes
and
are classified as the same community. Because other community discovery algorithms (including LPA) cannot meet the pertinence of the model, a large number of experiments are of little significance, and Algorithm 2 has similar properties to LPA, so only the modularity of Algorithm 2 and LPA is compared. The comparison results of the 3-SAT formula under different constraint densities
are shown in
Table 1.
Table 1 shows that in the solution space of the 3-SAT formula with different constraint densities, the modularity of the LPA based on the planting strategy is significantly improved than that of the LPA, and because Algorithm 2 is specific to the solution space community division, it shows that Algorithm 2 is more effective to solve the division of space community.
The hardware environment for the experiment is Intel Core i7 9750H + RTX 2060 + 16 GB RAM, and the software environment is Windows 10 64 + ubuntu18.04 + xshell + Python3.7 + Matlab. Let us use the model to generate the 3-SAT formula, where is the number of arguments and is the number of clauses. We let , then generate a CNF formula with a length of 100, 125, 150, and increase the value of the constraint density by adding randomly generated clauses one by one. In the solution space undirected graph, Algorithm 2 is used to divide the communities, and the two-dimensional structural entropy of the solution space of the 3-SAT formula is calculated. The result is as follows:
Figure 7 shows the change of the entropy of the two-dimensional structure when the variable scale is 100 and the constraint density increases from 3.5 to 4.3. As the constraint density increases, the number of solutions and the complexity of the solution space begin to change, and the entropy of the two-dimensional structure decreases as
increases. When the constraint density is between 3.50 and 3.74, a single decluster has a complex community structure and the decrease in the entropy of the two-dimensional structure
is not stable enough. As the number of solutions continues to decrease, the value of
generally shows a downward trend. As the constraint density increases, the constraint density undergoes a clustering phase transition at 3.73, and the declustering undergoes a heterogeneous transformation. A single decluster begins to decompose into multiple declusters, but the solution space is still dominated by one main decluster, here. In the process, the complexity of the internal declustering community continues to decrease, and the decrease in the constraint density around 3.92 tends to be flat. As the value of
increases to about 4.17, the solution space changes again, and the number of solutions continues to decrease. The frozen variable increases the impact on the solution community, the single cluster is sharply separated into smaller clusters, and the solution space begins. This solution space tends to be stable; and after increasing
, the solutions in the solution space begin to disappear in clusters. It can be seen from
Figure 7 that the two-dimensional structural entropy can effectively measure the overall structural information of the solution space.
The literature uses symmetric cavity theory (1RSB) [
5,
6] to provide a 3-SAT formula cluster phase transition point
around 3.859, which is lower than the first phase transition result in
Figure 7. When
is near the first phase transition point, the complexity of solution space increases sharply, which leads to a sharp increase in structural information and a decrease in accuracy of Algorithm 2, so
,
. This leads to the experimental results having lower upper bounds than those in the literature [
5,
6]. When CNF is more easily divided, or more accurate algorithms are used, the upper bound of experimental results will be improved, but it cannot give a definite upper bound. These studies suggest that the condensed phase transition point in 3-SAT is
, that is, the few largest solution clusters that contain the most solutions in the solution space of the 3-SAT formula cannot determine the statistical properties of the solution space.
Figure 7 shows that the second transition occurs when the constraint density is about 4.17, which is close to the use of remote frustration theory to predict the threshold
in the literature [
12,
13].
After the transition, the entropy of the two-dimensional structure tends to be segmented and stable. If the value of the frozen argument is large, it cannot satisfy the newly added random clause. As increases, the declustering of the solution space begins to cluster and disappear, and the entropy of the two-dimensional structure also decreases in segments. This means that at this time the frozen variables have greatly affected every solution community, so it can be considered that this transition point is the interference transition point predicted by the remote frustration theory.
The above results show that the two-dimensional structural entropy accurately reflects the overall structural information of the solution space of the 3-SAT formula. The phase transition and the transition point of the two-dimensional structural entropy illustrate the validity of this measurement.
Before the clustering phase transition occurs, each solution cluster in the solution space is relatively large, and the number of solutions has not decreased sharply. It contains complex community structures and rich multidimensional communities. The entropy of the two-dimensional structure cannot be accurately determined. The solution space structure is measured, that is, the decrease in the entropy of the
two-dimensional structure shown in
Figure 7 is unstable. Therefore, this section uses the
k-dimensional structural entropy to measure the
k-dimensional structural information of a single decluster of the solution space. The solution space undirected graph data constructed in the process of calculating the entropy of the two-dimensional structure are used continuously, and Algorithm 2 is used to iteratively partition the community, divide the partition tree, and then calculate the
k-dimensional structure entropy. The result is shown in
Figure 8.
Figure 8 shows that before the cluster phase transition occurs, the internal
k-dimensional structural information of the solution space continues to decrease, and the
k-dimensional structural entropy decreases steadily. At
, the decline decreases and gradually stabilizes. At
, a clustering phase change occurs, and the solution space that was originally a single declustering is decomposed into multiple declusterings. This phenomenon shows that the
k-dimensional structural entropy performs well when measuring the solution space of single clustering, and is suitable for clustering with rich internal structure.
The above experimental results show that the 3-SAT formula can be used to solve the spatial clustering phase transition point , the condensed phase transition point , the threshold value of the interference transition point, and the corresponding structural entropy threshold value. At , if , then there is no cluster phase transition in the solution space. If , the solution space does undergo a cluster phase transition: the cluster phase transition point is around , the condensed phase transition point , the two-dimensional structural entropy has an interference transition at 24.2, and the interference transition point . The above structural entropy analysis can obtain the structural entropy thresholds of the 3-SAT formula clustering phase transition points, the condensed phase transition points, and the interference transition points at different scales. It can also estimate the clustering phase transition point, the condensed phase transition point, and the interference transition point of the k-SAT problem under different values.
As a comparison of the entropy of the two-dimensional structure,
Figure 9 record the changes in the number of communities that understand the space. In order to more easily observe the relationship between the changes in the number of communities in the 3-SAT problem solution space and the changes in the entropy of the two-dimensional structure, the experimental data are normalized. That is, when
is fixed, the ratio of the number of each community to the maximum number of communities is calculated, and the ratio of the two-dimensional structure entropy to the maximum two-dimensional structure entropy is also calculated in the same way.
Figure 9 depict the change curve of the number of communities in the solution space of the random 3-SAT problem with
, respectively. From the experimental results, it can be seen that with the increase of
n, the solution space of the random 3-SAT problem becomes more complicated. The curve of the number of communities in the solution space reaches the highest point at
. At this time, the community structure of the solution space is at the most complex state, and the transition position of the condensed phase transition point moves from left to right. The three sets of experiments all show that the solution space community structure is the most complicated near the transition point. When
, the number of solution space communities increases exponentially.
When the solution space of SAT problem is studied by using structural entropy theory, there is no need for harsh judgment conditions, and there is no need to discuss the specific CNF in a narrow sense, only according to the optimal partition of pairs of graphs. The structural entropy theory is widely used, the judgment conditions are simple, and it can extract the multidimensional structural information of the graph model to the maximum extent, which is more effective than the methods in literature [
5,
6,
12,
13]. On the other hand, the method in this paper cannot provide a definite upper bound or lower bound, which is determined by the properties of structural entropy. Although the judgment conditions of the method in literature [
5,
6,
12,
13] are more demanding, it can provide an exact upper and lower bound.