3.2.1. Maximize Edge Coverage with k Bicliques
We prove the following result:
Theorem 1. There exists a time algorithm that given in input a bipartite graph and a parameter k, where G is a yes-instance of the -biclique cover problem, it outputs a set of at most k bicliques in G covering at least edges of G.
We first provide an overview of the main ideas on which Algorithm 1, which maximizes the number of edges covered with k bicliques, is based. The algorithm has four steps. First, we use a kernelization method to obtain a smaller graph which has a -biclique cover iff G has a -biclique cover. Then, we exploit the properties of the -biclique covers for the kernelized graph to identify a set of all large enough bicliques that are part of any cover of . Then, we will show how to enumerate the set containing all possible small bicliques of any cover of not in . We will keep a mapping from bicliques in the smaller graph to bicliques in the original graph G. Then, we cast the problem of selecting bicliques as a monotone submodular maximization problem hence showing that we can cover at least edges of the original graph G with k bicliques.
We now prove formally Theorem 1, showing an algorithm that outputs a set of at most k bicliques in G covering at least edges of G. We describe each step of Algorithm 1 in details.
We apply the kernelization technique introduced by Fleischner et al. [
2] for showing the parametrized complexity of biclique cover, which we now recall.
Given an instance
of biclique cover, the kernelization obtains an instance
where
is constructed by applying the following two rules to
G: (1) remove all nodes from
G with no neighbors; (2) for each set of vertices that have the same neighborhood in
G, keep only one of them. We call
the kernelized graph of
G if
is constructed from
G by applying the two rules. Fleischner et al. [
2] show that
G admits a
k-biclique cover iff
admits a
k-biclique cover. We now show that using the same kernelization,
G admits a
-biclique cover iff
admits a
-biclique cover.
Lemma 2. is a yes-instance of the -biclique cover problem iff the kernelized graph of G is a yes-instance of the -biclique cover problem.
Proof. The lemma is a corollary of the results for general kernelization of biclique cover [
2,
7].
Notice that nodes with no neighbors can be ignored without complications. We show that one application of the rule that removes one node u if another node v has the same neighborhood transforms yes-instances into yes-instances and no-instances to no-instance. The result will follow then by induction (over the number of application of the rule). Let be the graph obtained by removing from G when there is a s.t. (the case with is similar and omitted).
Consider a -biclique cover of G. Removing a node v from G and from all the bicliques involving v gives a graph that can be covered with bicliques each covering at most nodes on the left side and on the right side.
Consider a -biclique cover of . As node u has the same neighbors of v, we can extend each of the (at most bicliques) covering u to include v on the left side. This preserves the fact that they are bicliques. Also v is covered by at most bicliques, and in total we use k bicliques. The nodes on the right side are still covered by the same number of bicliques. ☐
The algorithm first applies the kernelization to obtain the bipartite graph . This can be done in time . We define . We will store a mapping that records for each node in the set of nodes in V that had the same neighborhood and of which v is the only node left in – i.e., is an equivalence class of the nodes with the same neighborhood. Notice that for each node , contains at least one node and that all nodes in G with non-zero degree are in a unique set.
Algorithm 1 ApproxMaxCover() |
Assumes G is a yes-instance for -biclique cover. — Step 1 — Let be the graph obtained applying the Fleischner et al. [2] kernelization. Store mapping . Let if then report no -biclique cover exists for G and quit. end if — Step 2 — Let . For all such that or .
Let and Let Let Let and add biclique to the set
if then report no -biclique cover exists for G and quit. end if — Step 3 — Let be the set containing all bicliques of of the form:
All the bicliques , for or for All the bicliques (, ) for or for And all the in . — Step 4 — Let . Let . Let . while do Select s.t. the number of edges covered in G by the union of the bicliques in is maximized. Add C to . end while return. |
First, notice that given a biclique of , we can construct a biclique of G using the mapping : and . It is possible to observe that is a biclique in G if is a biclique in . Given a biclique in , we write for the biclique B in obtained in this way.
The following properties of the kernelized graph of G will be exploited in the rest of the paper.
Lemma 3. Let be a yes-instance of the -biclique cover problem and let be the kernelized version of the graph G. Then the following statements are true:
admits a -biclique cover.
.
In any -biclique cover of there are no two nodes on the same side of the graph covered by the exact same set of bicliques.
Proof. The first statement is a corollary of Lemma 2. Notice that all nodes on the same side of , by construction, have distinct neighborhoods. Hence, they cannot be covered by the same set of 2 bicliques in any -biclique cover. Finally, since each node in one side is covered by distinct subsets of at most 2 bicliques out of k, there are at most nodes in . ☐
We call biclique any biclique that has exactly l nodes in the left side of the bipartite graph and exactly r nodes in the right side. For a graph G and a set of vertices A, we call the induced subgraph containing all nodes A and the edges between them.
We will also use the following notation. Given a -biclique cover of the graph G, where is a set of bicliques covering all the edges of G, we define as the set of bicliques covering the node u in the cover . We have . We will omit from when it is clear which cover we are discussing.
We now make a series of important observations for the structure of the -biclique cover of the kernelized graph of a -biclique cover yes-instance G.
Lemma 4. Let G be a yes-instance for the -biclique cover problem and let be the kernelized graph of G. Suppose (resp. ) is such that . Let be a -biclique cover of . If there is a single biclique that covers all the nodes in A, then, in the cover , all nodes () connected to all nodes in A are covered by the same biclique B.
Proof. Fix a solution of the -biclique cover of . Let B be the biclique that covers all nodes in A in the solution . We prove the case for , the other case is similar.
First, fix three distinct nodes in A. All the nodes are covered by biclique B. Notice that by the third item of Lemma 3, each node in A is covered by a different set of (at most 2) bicliques in . Hence, there is no other single biclique that covers two nodes in A (otherwise there will be two nodes both covered by the pair of bicliques).
Now consider a node connected to all nodes in A. If v is not part of biclique B, in the biclique cover , all the edges for are covered by a distinct biclique. This is a contradiction, as v is covered by at most 2 bicliques in . ☐
We use the previous lemma to prove the following useful result.
Lemma 5. Let be the kernelized graph of a -biclique cover yes-instance G. Suppose that is such that is a biclique (or a biclique). Let and . Let and let . Then the biclique is part of any-biclique cover of and it is the unique biclique that covers all nodes in A.
Proof. Suppose to get a contradiction that in a given -biclique cover of there is no such unique biclique covering all nodes in A. From now on in this proof, when we use , we refer to , i.e., the set of bicliques covering u in the cover .
W.l.o.g., we assume that and . The proof follows similarly in the opposite case.
Notice that if any set of at least 3 nodes in either side of A is covered by the same biclique then, B covers all nodes in A. This is because, by Lemma 4, if three nodes in side in A (resp. ) are covered by B then all nodes in side in A (resp. ) are covered by B, and this implies that the all nodes in side (resp. ) are covered by B.
Consider any node . If , than there is a contradiction. In fact, all neighbors of u are covered by a single biclique and hence we get that more than 3 on one side are covered by the same biclique. By the previous argument than there is a biclique covering all nodes in A. So we assume for the rest of the proof that for all nodes . Fix a node u in and let be the two distinct bicliques covering the node u in . All nodes in are covered by either or .
We now show that there is no node such that . In fact, if such a node exists, then there are no assignments of and to nodes in such that all nodes in belong to one of the two bicliques but no bicliques covers 3 nodes in (recall that if a biclique covers 3 nodes in we get a contradiction because then all nodes in are covered by the same biclique and hence the fourth node in is covered as well by that biclique). Also similarly, this implies that and both cover exactly two distinct nodes . Let . W.l.o.g., we assume that and , and where by kernelization we have and .
Now, we have that all nodes in do not belong to both and . For sake of contradiction, let, w.l.o.g., be a node that belongs to . We know that v does not belong to (by kernelization). So , as it is connected to and hence , giving a contradiction.
This implies, for any node to be connected to all nodes in , that the following conditions are met: and . By kernelization, this gives a contradiction, as two distinct nodes in belong to the same set of bicliques.
We have shown that in any -biclique cover of there is at least one biclique covering all nodes in A. Now notice that there can be only one biclique covering all nodes in A, as otherwise there will be two nodes with the same set of bicliques in the cover on the same side of the graph. Hence, is the unique biclique covering all nodes in A in any -biclique cover. Finally, by applying Lemma 4 to all nodes connected to each node in (resp. ), we know that the unique biclique is , as in the statement of the lemma. ☐
The previous lemma means that if A is a (or ) of then the biclique obtained as in the statement is part of any optimal solution. Also notice that clearly if in a given -biclique cover of there is a biclique of size at least 3 in the smallest side, and at least 4 in the largest side, this biclique induces a (or ) in .
Hence, by enumerating all and in in polynomial time we can find a set of bicliques in such that: contains all large enough bicliques—i.e., with nodes (resp. ) in the smaller (resp. larger) side that are part of all -covers of ; and no biclique not in which is large enough (again it contains a or a ) is part of any -biclique cover. This is done in Step 2 of Algorithm 1.
We know that all the bicliques necessary to cover the remaining edges of (besides the one in ) need to involve at most 3 nodes in the smaller side. We can then enumerate a set of bicliques, in polynomial time, containing a super-graph of each biclique not in that is part of any -biclique cover. It is possible to verify that at the end of Step 3 of the algorithm, contains at least one biclique that covers all edges of any biclique with at most 3 nodes in the smallest side of the biclique. Step 3 runs in polynomial time.
Finally, we have a set , which contains bicliques that are necessary part of any optimal solution of and a set of bicliques that are sufficient to cover all edges of not covered by bicliques in .
Let be the set of bicliques of G obtained by (recall the definition of the mapping). Similarly we obtain the set from .
The algorithm will output all the bicliques in . For the algorithm will obtain the remaining bicliques in the following way. First, let . The algorithm will add greedily a single biclique from to for times choosing each time the biclique that maximizes the number of newly covered edges in . It is easy to see that, since this function is monotone submodular and additional bicliques are sufficient to cover all edges not covered by , at least of the edges of G are covered at the end of the process. We can now complete the proof of Theorem 1.
Proof of Theorem 1. Notice that by the previous considerations, the set at the end of Algorithm 1 contains at most k bicliques, and these bicliques covers edges of G.
The algorithm requires time for the kernelization. Then, the remaining graph has at most nodes and at most edges, by Lemma 3.
The most expensive operation is enumerating all the , of the graph which can be done in time. Hence, Step 2 takes time.
Step 3 takes at most time.
Finally, Step 4 does at most k iterations over a set of size each step of the iteration taking at most time.
The total is hence , and the result follows. ☐
3.2.2. Covering All Edges with the Minimum Number of Bicliques
In this section we prove Theorem 2, the existence of an algorithm that outputs a set of at most bicliques covering all edges of the graph.
Theorem 2. There exists a time algorithm that given in input a bipartite graph and a parameter k, where G is a yes-instance of the -biclique cover problem, outputs: a set of at most bicliques in G covering all edges of G.
Here, is the i-th harmonic number and .
For this problem, we use similar techniques described before. We outline the main differences from Algorithm 1. As before, we will construct the set , which contains large bicliques that are part of any optimal solution. In this set, besides the bicliques including a (or ), we will identify as well certain bicliques with size for and for (and the cases , ).
This will ensure that all remaining bicliques necessary for covering all edges in will have at most 9 edges in (i.e., they will be , , for and , for .)
Then, we cast the problem of selecting the minimum number of bicliques in that cover all remaining edges of (besides the one in ) as a set cover problem with sets of size at most 9. For this problem, it is known that the greedy algorithm gives a approximation factor showing that all remaining edges in are covered with additional bicliques plus the in for a total of bicliques.
Finally, from the biclique cover of , we can reconstruct the biclique cover in G with the same techniques of the previous section, outputting a cover with at most bicliques.
Properties of the Kernelized Graph of a -Biclique Cover Yes-Instance
We show some additional properties of the kernelized graph of a -biclique cover yes-instance that will be exploited by the algorithm. In this subsection, we always assume is the kernelized graph of -biclique cover yes-instance G.
Lemma 6. Let be a biclique in such that is not included in any biclique in . Then, in any -biclique cover , there is always a unique biclique that covers all the nodes in and no other node in the left side. (A similar result holds for bicliques.)
Proof. We first show that in any biclique , all nodes in the small side need to belong to the same biclique in any -biclique cover.
Fix a -biclique cover . refers to the set of bicliques covering u in the cover .
Let . Suppose to get a contradiction that , then each node in contains one biclique from and one from . It is possible to verify that at most distinct sets of size at most 2 can be formed by taking one element from and one from . Hence, there are two nodes in that are covered by the same set of bicliques, giving a contradiction as the graph is kernelized.
So we know that there is a biclique . Consider a node such that . It is easy to see that there can be only one such node, as this node must be covered by the other biclique of and the other biclique of not equal to A. Again, for the kernelization, only one such node can exist.
So at least 4 nodes in belong to a unique biclique. Finally, notice that if there is another node in the side of in the same biclique, that would form a biclique. ☐
Lemma 7. Let be a biclique in such that is not in any biclique including the node in in the left part of the biclique. Then, in any optimal solution, there is a single biclique that covers the node in and no other node on that side.
Proof. Fix a -biclique cover . refers to the set of bicliques covering u in the cover .
Let . If , then all node in belong to A and we can apply Lemma 4. Let , we have that all nodes in belong either to A or to B or both. W.l.o.g., let A be the more common of the two. We have that A appears at least 5 times. So, if any other node in u side belongs to A, that would form a biclique. ☐
Algorithm
We now state how the algorithm for this problem constructs the set . First, we add to all bicliques including a , as done in Step 2 of Algorithm 1. Then, we add to all bicliques of the form (and similar form ) for such that are part of a and not of a . Finally, we add bicliques (and the form ) such that has at least 10 edges not covered by the bicliques added before.
We claim the following property for .
Lemma 8. There is a -biclique cover of which contains all bicliques in and only other bicliques with at most 9 edges.
Proof. The lemma follows by combining Lemmas 5 and 6. By Lemma 5, we know that there is no other biclique in any -biclique cover with at least 3 nodes on the smaller side and at least 4 on the largest side.
Suppose there is a -biclique cover of containing all bicliques of and another biclique of the form for . If nodes in are part of a biclique, we get a contradiction, as we have two nodes in that are both covered by the biclique B and the other biclique in identified by the algorithm (by kernelization this is not possible). So we know that is not part of a biclique. However, in this case, by construction of the algorithm, there is another biclique covering , so again we get a contradiction. (The same proof works for the case for ).
Finally notice that, by construction, at most 9 edges of each node are left uncovered by bicliques in so there is a -biclique cover that includes all bicliques in and other bicliques with at most 9 edges each. ☐
Finally, notice that we can enumerate the set of all bicliques with at most 9 edges necessary to cover the edges left uncovered by the bicliques in in time , where n is the number of nodes in the graph .
The proof of Theorem 2 follows by the properties of set cover.