In this section, we propose the MultiSplit packet classification framework, which comprises grouping rules by equivalent priority, pre-partitioning the rule set, and constructing decision tree classifiers with the MultiSplit algorithm. Finally, the packet classification process and related optimizations are introduced.
3.2. Equivalent Priority Grouping
During packet classification involving multiple trees/tuples, each packet must identify all applicable matching rules and select the one with the highest priority as the final match. For two rules to match the same packet, they must overlap in every field. This indicates an inherent relationship between the coverage and the priorities of the rule set. Inspired by [35], a special pattern within the rule set warrants further analysis: rules with non-overlapping fields can be termed mutually exclusive. Among such rules, there is no need to determine which one better matches a packet, as they cannot match the same packet. Therefore, identifying this pattern within the rule set is crucial to minimizing lookup queries across multiple trees/tuples for the highest-priority matching rule.
Taking the five-tuple rule as an example, each field can be represented as a prefix, range, or exact value. This means that a rule can be viewed as a hypercube in five-dimensional space. Similarly, packet headers can be represented by a five-tuple, although the fields in this case contain specific values. Consequently, a packet header can be represented as a point in a five-dimensional hyperspace. If a packet ($p$) matches both rules $R_i$ and $R_j$, it can be deduced that there is an overlap in each field of rules $R_i$ and $R_j$. Conversely, if at least one field does not overlap between two rules, they are considered non-overlapping, indicating that no packet can match both rules simultaneously. The necessary and sufficient condition for rules $R_i$ and $R_j$ to overlap is shown in (1):

$$\mathrm{Overlap}(R_i, R_j) \iff \forall f:\ R_i^f \cap R_j^f \neq \varnothing \qquad (1)$$

where $R_i^f \cap R_j^f$ represents the intersection of rules $R_i$ and $R_j$ in dimension $f$, and $R_i^f \cap R_j^f \neq \varnothing$ holds if and only if $\max(lo_i^f, lo_j^f) \leq \min(hi_i^f, hi_j^f)$, where $lo_i^f$ and $hi_i^f$ represent the lower and upper bounds of rule $R_i$ in dimension $f$, respectively.
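To make the condition in (1) concrete, the following is a minimal Python sketch of the overlap test; the Rule representation (inclusive [lo, hi] ranges per field) is an assumption of this sketch, not the paper's implementation.

from dataclasses import dataclass
from typing import Tuple

# Hypothetical representation: each field is an inclusive [lo, hi] range.
Field = Tuple[int, int]

@dataclass
class Rule:
    fields: Tuple[Field, ...]  # e.g., (SIP, DIP, SPORT, DPORT, PROTOCOL)
    priority: int

def rules_overlap(a: Rule, b: Rule) -> bool:
    """Condition (1): two rules overlap iff their ranges intersect in every field."""
    for (alo, ahi), (blo, bhi) in zip(a.fields, b.fields):
        # The intersection [max(lo), min(hi)] is empty when max(lo) > min(hi).
        if max(alo, blo) > min(ahi, bhi):
            return False  # Disjoint in at least one field: mutually exclusive.
    return True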
Based on the above, it can be established that not all rules have the potential to match the same packet. The key factor here is determining the overlap between them. Consequently, to minimize the lookups during packet classification, it is necessary to separate rules that do not overlap with each other from the rule set. This process is referred to as equivalent priority grouping.
Following this, a new attribute named equivalent priority can be abstracted from the grouping results. The grouping process (i.e., the process of calculating equivalent priorities), shown in Algorithm 1, consists of multiple rounds of iteration. Each iteration selects the highest-priority, non-overlapping set of rules from the current rule set, removes it from the rule set, and treats the remaining rules as a new set for the next iteration (lines 1–14). For example, the first iteration determines which rules belong to group 1: all rules in group 1 are non-overlapping with each other, and their priority is higher than that of any rule outside the group. At the start of each iteration, the first rule in the set can be directly assigned to the current group (line 2). Each subsequent rule is checked for overlap against all preceding rules and is assigned to the current group only if it overlaps none of them (lines 3–12). After the grouping process, the equivalent priority of each rule is determined from its group number; that is, the equivalent priority is the complement of the group number (line 19). For instance, the group number for the rules with the highest equivalent priority is 1, the next highest is 2, and so on, with 0 indicating an uncalculated equivalent priority. However, it is unnecessary to compute the equivalent group for all rules, even though the number of such groups in certain rule sets is not large: a predetermined number of equivalent groups ($M$; usually 8) is set, and if any rules remain in the set after the iterations, they are all assigned to the last equivalent group (lines 15–17).
The results of the equivalent priority grouping for the rule set in Table 1 are illustrated in Figure 4. In the first iteration, the first rule is placed directly in group $G_1$; among the remaining rules, the three rules that do not overlap with any of the preceding rules are also placed in $G_1$. In the second iteration, considering only the remaining rules, the first of them is placed in group $G_2$; subsequently, only one other rule does not overlap with any preceding rule and is also placed in $G_2$. In the third iteration, with only a single rule remaining, it is placed directly in group $G_3$. Finally, each rule is assigned an equivalent priority based on its group number: the rules in $G_1$ are given an equivalent priority of 2, the rules in $G_2$ an equivalent priority of 1, and the rule in $G_3$ an equivalent priority of 0.
Algorithm 1 Calculating equivalent priorities.

Input: A rule set $R$ containing $N$ rules, the number of groups $M$
Output: A rule set containing $N$ grouped rules
1:  for $i = 1$ to $M - 1$ do
2:      Put the first rule of the current set into $G_i$ ▹ $G_i$: the i-th group
3:      for $j = 2$ to $n$ do ▹ Current rule set contains $n$ rules
4:          for $k = 1$ to $j - 1$ do
5:              if CheckOverlap($R_j$, $R_k$) == TRUE then ▹ Call the function on line 22
6:                  break
7:              end if
8:              if $k = j - 1$ then ▹ $R_j$ overlaps none of the preceding rules
9:                  Put the rule $R_j$ into $G_i$
10:             end if
11:         end for
12:     end for
13:     $R \leftarrow R \setminus G_i$ ▹ Exclude the rules of $G_i$ from the rule set and update $n$
14: end for
15: if $n > 0$ then
16:     Put the remaining rules into $G_M$
17: end if
18: for $j = 1$ to $N$ do
19:     $EP_j \leftarrow$ complement of Group($R_j$) ▹ $EP_j$: equivalent priority of $R_j$; Group($R$): find the group to which rule $R$ belongs
20: end for
21: return $R$
22: function CheckOverlap($R_a$, $R_b$)
23:     for $f$ = SIP to PROTOCOL do ▹ Judge each field
24:         if $R_a^f \cap R_b^f = \varnothing$ then ▹ According to (1)
25:             return FALSE
26:         end if
27:     end for
28:     return TRUE
29: end function
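For illustration, the following Python sketch implements the grouping of Algorithm 1, reusing the rules_overlap check from the earlier sketch; the attribute name equivalent_priority and the in-place annotation of rules are assumptions of this sketch.

def assign_equivalent_priorities(rules, M=8):
    """Algorithm 1 sketch: iteratively peel off the highest-priority set of
    pairwise non-overlapping rules; the equivalent priority is then the
    complement of the group number."""
    remaining = list(rules)  # assumed sorted by descending original priority
    groups = []
    while remaining and len(groups) < M - 1:
        group = [remaining[0]]  # the first rule always opens the group
        for j in range(1, len(remaining)):
            # A rule joins the group only if it overlaps none of the rules
            # preceding it in the current (higher-priority) rule set.
            if not any(rules_overlap(remaining[j], remaining[k]) for k in range(j)):
                group.append(remaining[j])
        groups.append(group)
        picked = {id(r) for r in group}
        remaining = [r for r in remaining if id(r) not in picked]
    if remaining:
        groups.append(remaining)  # all leftover rules form the last group
    for gid, grp in enumerate(groups, start=1):
        for rule in grp:
            # Complement of the group number, e.g., 3 groups -> 2, 1, 0.
            rule.equivalent_priority = len(groups) - gid
    return groups

Applied to the Table 1 example, this would yield three groups with equivalent priorities 2, 1, and 0, matching Figure 4.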
Compared with the original priority, the equivalent priority exhibits two new properties. First, different rules can share the same equivalent priority. Second, the total number of equivalent priorities is significantly reduced. The resulting rule proportions under different groups after applying equivalent priority grouping to rule sets generated by ClassBench [
29], represented by the Cumulative Distribution Function (CDF), are shown in
Figure 5a. The figure shows that in both ACL and IPC rules, over 60% belong to group 1, with group 1 accounting for as much as 99.4% in the ACL_100k rule set. Furthermore, by applying equivalent priority grouping to the OpenFlow rule sets generated by ClassBench-ng [
36], the proportion of rules under different groups is obtained, as shown in
Figure 5b. It can be seen that in the OF2 rule sets, over 80% of the rules belong to group 1. Grouping with equivalent priority can significantly reduce redundant lookups during the packet classification stage, which is elaborated on in
Section 3.5.
Some existing works also utilize rule overlap, such as SAX-PAC [
37]; however, this differs from the equivalent priority grouping proposed in this paper. In terms of implementation purpose, SAX-PAC uses rule overlap to construct the
order-independent classifier, aiming to optimize the encoding and representation of rules to reduce TCAM space requirements. In contrast, this paper aims to reduce the number of lookups during packet classification to decrease memory access. Regarding implementation principles, SAX-PAC constructs an order-independent classifier by finding the largest subset of the original rule set in which no pair of rules intersects. In contrast, this paper groups the rule set based on the original priority order and overlap, with all groups having a priority sequence. In terms of results, the largest order-independent classifier constructed by SAX-PAC accounts for 99.8%, 91.4%, and 96.9% of the ACL1_50k, FW1_50k, and IPC1_50k rule sets, respectively. This significantly differs from the results presented in
Figure 5a of this paper.
3.3. Partitioning for Rule Sets
A key issue with constructing classifiers using decision tree algorithms is rule replication [
16,
18,
25]. The primary cause of rule replication is the unavoidable cutting of large-range rules to separate small-range ones. This leads to some large-range rules being cut internally, resulting in these rules being replicated into different child nodes. The impact of rule replication is more detrimental in shallower nodes, as the replicated rules in these nodes are further duplicated in deeper nodes, causing the rule replication factor [
18] to increase exponentially.
The distribution of rule sets has specific features that can be utilized to mitigate the impacts of rule replication. For instance, the prefix length of prefix-type rules determines their length in the rule space. By cutting only rules with similar lengths, the likelihood of rules being cut internally can be reduced. Therefore, similar prefix lengths are used to partition the rule set [
14,
17,
18,
25] to effectively reduce rule replication in this paper. For
$N$-dimensional rules, a threshold array $T = \{T_1, T_2, \ldots, T_N\}$ is defined. For each field $f$, if the field's range length exceeds $T_f$, it is considered a large field; otherwise, it is a small field. By categorizing rules based on various combinations of large and small fields, the rule set can be partitioned into several subsets, each with similar field sizes.
More specifically, since the SIP/DIP fields offer greater distinctiveness in rule differentiation, and excessive partitioning into subsets can actually increase classification complexity, partitioning is only performed on the SIP/DIP fields. After partitioning the rule set, a maximum of $2^2 = 4$ subsets of rules is created. Decision tree classifiers are constructed for each rule subset to minimize rule replication as much as possible.
Figure 6 illustrates the prefix distribution and partition scheme of the IPC_10k rule set. Under the thresholds shown in Figure 6, the IPC_10k rule set (914 rules in total) is partitioned into 4 subsets: the first subset contains 37 rules, the second 153 rules, the third 132 rules, and the fourth 592 rules, each corresponding to one combination of large and small SIP and DIP fields. Additionally, each subset is identified by the highest equivalent priority of the rules within it to facilitate subsequent pruning.
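A minimal Python sketch of this partitioning step is given below; it assumes the Rule representation from the earlier sketches, with SIP and DIP stored as [lo, hi] ranges and equivalent priorities already computed, and the threshold parameters are illustrative.

def partition_rules(rules, sip_threshold, dip_threshold):
    """Partition a rule set into up to 2^2 = 4 subsets keyed by the
    large/small classification of the SIP and DIP fields. A field whose
    range length exceeds its threshold is 'large' (i.e., a short prefix)."""
    subsets = {}
    for rule in rules:
        (sip_lo, sip_hi), (dip_lo, dip_hi) = rule.fields[0], rule.fields[1]
        sip_large = (sip_hi - sip_lo) > sip_threshold
        dip_large = (dip_hi - dip_lo) > dip_threshold
        subsets.setdefault((sip_large, dip_large), []).append(rule)
    # Tag each subset with the highest equivalent priority of its rules,
    # which later enables pruning whole trees during lookup.
    tree_priority = {
        key: max(r.equivalent_priority for r in subset)
        for key, subset in subsets.items()
    }
    return subsets, tree_priority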
3.4. Decision Tree Construction
As described in
Section 2, cut and split are two mainstream methods for decomposing the rule space in decision trees, with the split method opting to separate rules at their boundaries. This somewhat alleviates the rule replication issue encountered by the cut method; however, its binary mechanism leads to low efficiency in rule separation. More specifically, the hierarchical index of the decision tree constructed using the cut method is shown in (2):

$$I_{cut} = \left\lfloor \frac{h^f - low^f}{step^f} \right\rfloor \qquad (2)$$

where $h^f$ denotes the value of a packet in the $f$-th field, $low^f$ indicates the lower bound of the rule space of a node in dimension $f$, and $step^f$ denotes the cutting step size within the node. The hierarchical index of the decision tree constructed using the split method is shown in (3):

$$I_{split} = \begin{cases} 0, & h^f \leq key^f \\ 1, & h^f > key^f \end{cases} \qquad (3)$$

where $key^f$ represents the comparison key of a node in dimension $f$. Both of these hierarchical indexing operations require only limited computations or comparisons, with a complexity of $O(1)$. In comparison, the former enables indexing across multiple child nodes in a single hierarchical indexing stage, while the latter is confined to indexing within just two child nodes. This suggests that the conventional split method's low efficiency in separating rules can lead to unnecessarily complex tree structures, such as those with excessive depth, ultimately resulting in significant memory consumption and even memory explosion.
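As a concrete contrast between (2) and (3), the following minimal Python sketch computes both hierarchical indices; the variable names simply mirror the symbols in the equations.

def index_cut(h_f, low_f, step_f):
    # Equation (2): equal-size cutting indexes one of several children in O(1).
    return (h_f - low_f) // step_f

def index_split(h_f, key_f):
    # Equation (3): a single comparison selects one of only two children.
    return 0 if h_f <= key_f else 1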
Based on this, MultiSplit introduces multiple splits in each iteration, creating multiple child nodes in rule space decomposition. The whole process of decision tree construction by MultiSplit is shown in Algorithm 2. For a given rule space, MultiSplit executes a fixed decomposition strategy to construct a single level of the decision tree and then iteratively applies the same process to the resulting subspaces for further decomposition. The decomposition strategy is as follows. The $N$ rules $R = \{R_1, \ldots, R_N\}$ in the rule space $S$ are projected onto dimension $f$ and mapped to $N$ segments. By extracting the end points (up to $2N$) of these segments and merging overlapping end points, each pair of adjacent end points forms a new segment, resulting in a total of $M$ segments (lines 4–6). Then, a three-step heuristic is executed: first, a splitting dimension is selected (lines 3–12); second, the number of decomposed subspaces is determined (line 13); third, split positions are chosen at the merged end points so that each subspace covers a roughly equal number of rules, adjusting the number of subspaces by the step factor until a satisfactory decomposition is reached (lines 14–29).
Algorithm 2 MultiSplit decomposition strategy

Input: A rule space $S$ containing $N$ rules, leaf node rule number threshold binth, decomposition factor, and step factor
Output: Decomposed rule spaces or the current rule space
1:  if $N >$ binth then
2:      $M_{max} \leftarrow 0$
3:      for $f$ = SIP to PROTOCOL do
4:          $Seg \leftarrow$ Project($R$, $f$) ▹ $Seg$: one-dimensional segments of rules
5:          $P \leftarrow$ Extract($Seg$) ▹ $P$: the endpoints of the segments (may overlap)
6:          ($s$, $p$) $\leftarrow$ Merge($P$) and Sort($P$) ▹ Get $M$ segments $s$ and points $p$
7:          $cnt_j \leftarrow$ Count($s_j$) ▹ $cnt_j$: number of rules covering $s_j$
8:          if $M > M_{max}$ then
9:              $dim \leftarrow f$ ▹ Select $f$ as the splitting dimension
10:             $M_{max} \leftarrow M$
11:         end if
12:     end for
13:     $K \leftarrow$ initial value given by the decomposition factor ▹ $K$: the number of decomposed subspaces
14:     while true do
15:         for $i = 1$ to $K - 1$ do
16:             for $j = 1$ to $M$ do
17:                 if $\sum_{t=1}^{j} cnt_t \geq i \cdot (\sum_{t=1}^{M} cnt_t)/K$ then
18:                     $pos_i \leftarrow p_j$ ▹ Choose $p_j$ as the position of the i-th split
19:                     break
20:                 end if
21:             end for
22:             $S_i \leftarrow$ Split($S$, $pos_i$) ▹ $S_i$: the i-th subspace
23:         end for
24:         if the resulting decomposition is unsatisfactory then
25:             adjust $K$ by the step factor
26:         else
27:             break
28:         end if
29:     end while
30:     return $C$ ▹ $C$: all decomposed subspaces
31: else
32:     Terminate decomposition and designate the current rule space as a leaf node
33:     return $S$
34: end if
The above strategy ensures that each subspace covers a roughly equal number of rules while minimizing the impact of rule replication. Constructing the decision tree with MultiSplit is an iterative process, so the algorithm must explicitly define the condition for terminating the decomposition of a rule space: when the number of rules in a node does not exceed a threshold called binth, the splitting process stops and the node becomes a leaf (lines 31–33).
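For concreteness, the following is a simplified Python sketch of a single decomposition round, under the same assumed Rule representation as the earlier sketches; the equal-share placement of split positions mirrors lines 14–23 of Algorithm 2, while dimension selection and the step-factor adjustment of the subspace count are omitted for brevity.

def decompose(rules, dim, k):
    """One MultiSplit round (simplified): project the rules onto dimension
    `dim`, merge segment endpoints, then place k-1 split positions so that
    each of the k subspaces covers a roughly equal number of rules."""
    # Project rules to one-dimensional segments and collect unique endpoints.
    segs = [rule.fields[dim] for rule in rules]
    points = sorted({p for lo, hi in segs for p in (lo, hi + 1)})
    # Count how many rules cover each merged segment [points[j], points[j+1]).
    cover = [sum(1 for lo, hi in segs if lo < points[j + 1] and hi >= points[j])
             for j in range(len(points) - 1)]
    total = sum(cover)
    # Place a split each time the cumulative coverage passes an i/k share.
    splits, acc, i = [], 0, 1
    for j, c in enumerate(cover):
        acc += c
        if i < k and j < len(cover) - 1 and acc >= i * total / k:
            splits.append(points[j + 1])
            i += 1
    # Distribute rules into subspaces; a rule spanning a split is replicated.
    bounds = [points[0]] + splits + [points[-1]]
    children = [[] for _ in range(len(bounds) - 1)]
    for rule, (lo, hi) in zip(rules, segs):
        for c in range(len(bounds) - 1):
            if lo < bounds[c + 1] and hi >= bounds[c]:
                children[c].append(rule)
    return splits, children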
3.5. Packet Classification
Each rule subset after partitioning generates a decision tree constructed by the building algorithm. A schematic of the data structure for MultiSplit tree nodes and the lookup process in a single node are shown in
Figure 7. Looking up in a decision tree starts at the root node and indexes downward level by level. For non-leaf nodes, the packet header and the range segments stored in the node are used to find the index of the next level’s child node. Since most non-leaf nodes contain numerous range segments, MultiSplit uses a binary search to index the matching child node. For leaf nodes, a linear search of the node’s rule space is conducted to find the matching rule. Among all the matching rules, the one with the highest priority is selected as the final match. Algorithm 3 shows the detailed process of lookup using the MultiSplit decision tree. To avoid unnecessary lookups, the following two optimizations are implemented in MultiSplit.
1. A tree priority is introduced for each tree, set to the highest equivalent priority of the rules within it. During the classification process, if a matching rule has already been found and its equivalent priority is higher than the priority of the next decision tree to be traversed, that tree is skipped.
2. During the classification process, if a matching rule with the highest equivalent priority (i.e., from group 1) is found, it is immediately selected as the final matching rule, ending the lookup. Since over half of the rules in the rule set have the highest equivalent priority, this optimization significantly speeds up the packet classification process.
Algorithm 3 MultiSplit decision tree lookup algorithm

Input: A packet header $h$, the root nodes of $K$ decision trees
Output: Final matching rule $R_{match}$
1:  for $i = 1$ to $K$ do
2:      if equivalent priority of matching rule $\geq$ priority of current decision tree then
3:          continue
4:      end if
5:      $N \leftarrow root_i$ ▹ Take the root node as the current processing node
6:      while $N$ is a non-leaf node do
7:          $f \leftarrow$ the specified dimension in node $N$
8:          $m \leftarrow$ IndexChild($N$, $h$, $f$) ▹ Calculate the child node index
9:          $N \leftarrow N.child[m]$ ▹ Take the child node as the current processing node
10:         if $N$ is an empty node then
11:             break ▹ Skip the current tree
12:         end if
13:     end while
14:     if $N$ is not a null node then
15:         Linearly search for matching rules in node $N$
16:     end if
17:     if the matching rule has the highest equivalent priority then
18:         return $R_{match}$
19:     end if
20: end for
21: return $R_{match}$
22: function IndexChild($N$, $h$, $f$)
23:     $l \leftarrow 0$
24:     $r \leftarrow N.segnum - 1$ ▹ $N.segnum$: the number of range segments in node $N$
25:     while $l \leq r$ do
26:         $m \leftarrow \lfloor (l + r)/2 \rfloor$
27:         if $h^f \in s_m$ then ▹ $s_m$: the m-th range segment of node $N$
28:             break
29:         else if $h^f < s_m.low$ then
30:             $r \leftarrow m - 1$
31:         else
32:             $l \leftarrow m + 1$
33:         end if
34:     end while
35:     return $m$
36: end function
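The following Python sketch mirrors Algorithm 3's outer loop together with the two optimizations above; the node layout (dim, seg, children, rules), the tree attributes (root, priority), and the matches helper are illustrative assumptions rather than the paper's data structures.

def matches(header, rule):
    # Assumed helper: a header matches a rule when every field value falls
    # inside the rule's corresponding [lo, hi] range.
    return all(lo <= v <= hi for v, (lo, hi) in zip(header, rule.fields))

def index_child(node, header):
    # Binary search over the node's sorted range segments (IndexChild in
    # Algorithm 3); the segments are assumed to cover the node's space.
    v = header[node.dim]
    l, r = 0, len(node.seg) - 1
    while l <= r:
        m = (l + r) // 2
        s_lo, s_hi = node.seg[m]
        if v < s_lo:
            r = m - 1
        elif v > s_hi:
            l = m + 1
        else:
            return m  # v falls inside segment m
    raise ValueError("node segments are assumed to cover the search space")

def classify(header, trees, highest_ep):
    # Multi-tree lookup; highest_ep is the equivalent priority of group 1.
    best = None
    for tree in trees:
        # Optimization 1: skip trees that cannot yield a better match.
        if best is not None and best.equivalent_priority >= tree.priority:
            continue
        node = tree.root
        while node is not None and not node.is_leaf:
            node = node.children[index_child(node, header)]  # may be empty (None)
        if node is not None:
            for rule in node.rules:  # linear search within the leaf
                if matches(header, rule) and (best is None or
                                              rule.priority > best.priority):
                    best = rule
        # Optimization 2: a match from group 1 is provably final.
        if best is not None and best.equivalent_priority == highest_ep:
            return best
    return best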
The second optimization can significantly accelerate classification without affecting the correctness of the results. This can be demonstrated as follows: suppose a packet ($P$) matches rule $R_a$, and $R_a$ has the highest equivalent priority. In this case, $R_a$ must be the final matching rule. If there existed another rule ($R_b$) that also matched packet $P$ and had a higher priority than $R_a$, it would imply that $R_a$ and $R_b$ overlap. Consequently, $R_a$ and $R_b$ would not be grouped together during equivalent priority grouping, and the group containing $R_b$ would have a higher equivalent priority than the group containing $R_a$. Therefore, $R_a$ could not have the highest equivalent priority, which contradicts the given condition; that is, $R_b$ does not exist. In fact, our experimental results confirm that the classification results are entirely correct after the rule set is grouped by equivalent priority.