RuLer is an inductive rule learning algorithm designed in the context of live coding for automatic synthesizer programming [3]. It takes as input a set of labeled presets, from which a set of IF-THEN rules generalizing them is obtained. Examples of labels could be: “intro”, if the preset is intended to be used during the intro of a piece, or “harsh”, a linguistic label describing the produced sound. The generalization process is based on the patterns found through the iterative comparison of the presets. To compare the presets, a dissimilarity function receives a pair of them and returns True whenever they are similar enough according to the specific form of the function and a given threshold. The dissimilarity threshold (d) is established by the user. The algorithm works as follows:
2.1. Dissimilarity Function
The dissimilarity function receives two rules (r1, r2) together with a threshold and returns True if the rules have the same category and dissimilarity(r1, r2) ≤ d. It returns False otherwise. The threshold d is an input parameter of the algorithm.
The dissimilarity function currently implemented in the RuLer algorithm counts the number of empty intersections between the sets of the corresponding entries in the rules. For example, if r1 = [{1}, {3,5}, intro] and r2 = [{1,3}, {7,11}, intro], dissimilarity(r1, r2) = 1. If r1 = [{1}, {3,5,7}, intro] and r2 = [{1,3}, {7,11}, intro], dissimilarity(r1, r2) = 0.
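As a concrete illustration, here is a minimal Python sketch of this function. It assumes rules are represented, as in the examples above, as lists of sets followed by a label; the function names are ours, not taken from the reference implementation.

```python
def dissimilarity(r1, r2):
    """Count empty intersections between corresponding set entries.

    Rules are lists of sets followed by a label, e.g. [{1}, {3, 5}, 'intro'].
    """
    return sum(1 for a, b in zip(r1[:-1], r2[:-1]) if not (a & b))

def similar(r1, r2, d):
    """True if both rules share the same label and are within threshold d."""
    return r1[-1] == r2[-1] and dissimilarity(r1, r2) <= d

# The worked examples from the text:
assert dissimilarity([{1}, {3, 5}, 'intro'], [{1, 3}, {7, 11}, 'intro']) == 1
assert dissimilarity([{1}, {3, 5, 7}, 'intro'], [{1, 3}, {7, 11}, 'intro']) == 0
```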
2.2. Create_Rule Function
This function receives pairs of rules (r1, r2) satisfying dissimilarity(r1, r2) ≤ d. Then, it creates a new rule according to the way it is defined. The function currently used creates a new rule by taking the unions of the corresponding sets of the rules received. For example, if r1 = [{1}, {3,5,7}, intro] and r2 = [{1,3}, {7,11}, intro], then the created rule is [{1,3}, {3,5,7,11}, intro]. The candidate rule is accepted if the following conditions are met:
No contradictions (i.e., rules with the same parameter values but different label) are created during the generalization process.
From all the presets contained in the candidate rule, the percentage of them contained in the original data is greater than or equal to a ratio ∈ [0, 1]. This number is also an input parameter of the algorithm defined by the user. For instance, ratio = 1 implies that 100% of the instances contained in a candidate rule have to be present in the input data for the rule to be accepted; ratio = 0.5 requires 50% of the instances, etc. A sketch of both acceptance checks is given below.
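The following minimal sketch of create_rule and the two acceptance checks continues the representation used above. It assumes the input data consists of single-instance rules whose entries are singleton sets; all names are ours.

```python
from itertools import product

def create_rule(r1, r2):
    """Union the corresponding set entries; keep the shared label."""
    return [a | b for a, b in zip(r1[:-1], r2[:-1])] + [r1[-1]]

def accept(candidate, data, ratio):
    """Acceptance checks: no contradictions, and enough coverage by the data."""
    covered = set(product(*candidate[:-1]))  # every preset the rule contains
    label = candidate[-1]

    def values(r):
        # Data rows are assumed to hold singleton sets, e.g. [{1}, {2}, 'intro'].
        return tuple(next(iter(s)) for s in r[:-1])

    # 1) Contradiction: a data instance with the same values but another label.
    if any(values(r) in covered and r[-1] != label for r in data):
        return False
    # 2) Coverage: the fraction of contained presets present in the data.
    in_data = {values(r) for r in data if r[-1] == label}
    return len(covered & in_data) / len(covered) >= ratio
```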
2.4. RuLer Characteristics
The RuLer algorithm is designed to return all the existing patterns, expressing as rules all pairs of instances satisfying dissimilarity ≤ d, as its main intention is to offer all possibilities for creating new instances. Therefore, it is possible for a single instance, let us call it x, to be included in more than one valid rule: if x, y, and z are single rules satisfying dissimilarity(x, y) ≤ d and dissimilarity(x, z) ≤ d, then x can appear both in a rule generalizing x and y and in a rule generalizing x and z.
To illustrate this case, consider the dataset of Table 1. The RuLer algorithm with parameters d = 1 and ratio = 1 produces the rules: [{2}, {1,2}, ‘intro’] and [{1,2,3}, {2}, ‘intro’]. These rules are shown, with their possible extensions in a solid line and a dashed line respectively, at the left of Figure 1.
Notice that the combination [{2}, {2}, ‘intro’] is present in both rules. As mentioned, if this were not the case, some of the existing patterns might not be returned to the user. To illustrate this, consider the same dataset and let us use the Hamming distance as the dissimilarity function. Then, suppose that the create_rule function, whenever a pattern is found, creates a rule taking the unions of the parameters of the respective rules and eliminates the component rules after producing the new one. With these conditions, comparing [{2}, {2}, ‘intro’] and [{3}, {2}, ‘intro’] produces the rule [{2,3}, {2}, ‘intro’]. This rule will not produce another rule when compared with the remaining data: [{1}, {2}, ‘intro’] and [{2}, {1}, ‘intro’]. Therefore, the resulting rule set is: [{2,3}, {2}, ‘intro’], [{1}, {2}, ‘intro’], and [{2}, {1}, ‘intro’]. This is shown at the right of Figure 1. The resulting rule set does not express the existing patterns between [{2}, {1}, ‘intro’] and [{2}, {2}, ‘intro’], nor those between [{1}, {2}, ‘intro’] and [{2}, {2}, ‘intro’] or [{3}, {2}, ‘intro’]. To avoid this, the create_rule and dissimilarity functions were conceived to return all the patterns found in the data.
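To make the contrast concrete, here is a toy fixpoint loop in the spirit of RuLer, reusing the sketches above and keeping every accepted rule instead of eliminating components. The four-instance dataset is our reconstruction of Table 1 from the rules discussed; this is an illustration, not the authors' implementation.

```python
# Hypothetical reconstruction of the Table 1 dataset.
data = [[{1}, {2}, 'intro'], [{2}, {2}, 'intro'],
        [{3}, {2}, 'intro'], [{2}, {1}, 'intro']]

def ruler_pass(rules, data, d, ratio):
    """One exhaustive pass: all accepted generalizations of rule pairs."""
    new = []
    for i, r1 in enumerate(rules):
        for r2 in rules[i + 1:]:
            if similar(r1, r2, d):
                cand = create_rule(r1, r2)
                if accept(cand, data, ratio) and cand not in new:
                    new.append(cand)
    return new

rules = [list(r) for r in data]
while True:
    new = [c for c in ruler_pass(rules, data, d=1, ratio=1) if c not in rules]
    if not new:
        break
    rules.extend(new)

def contained(r, s):
    """r is inside s: same label and entrywise subset."""
    return r is not s and r[-1] == s[-1] and all(
        a <= b for a, b in zip(r[:-1], s[:-1]))

# Keep only maximal rules: drop any rule contained in another one.
final = [r for r in rules if not any(contained(r, s) for s in rules)]
print(final)   # [[{2}, {1, 2}, 'intro'], [{1, 2, 3}, {2}, 'intro']]
```

The two maximal rules match the ones stated for Figure 1, and the shared instance [{2}, {2}, ‘intro’] ends up covered by both.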
Regarding how d and ratio work, consider the simple set of individual rules presented in Table 2. If d = 2 and ratio = 1/3, the single rule that models the dataset is the one at the mid part of Table 2. The number of allowed empty intersections among the single rules at the top of the table is two. Then, every pair of rules can be compacted into a new rule during the process. As the ratio of single rules that have to be contained in the original data for any created rule is 1/3, the rule at the mid part can be created: it contains all the instances in the original data, which are one third of the number of single instances of the rule (nine). Note that this is true if all seen values are: for the first attribute 1, 2, and 3; for the second attribute 4, 5, and 6; for the third attribute 6.
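For the coverage arithmetic, assume the top of Table 2 holds three single rules whose seen values are exactly the ones listed above (a hypothetical reconstruction, with an arbitrary label 'a'). The mid rule then covers 3 × 3 × 1 = 9 presets, of which the data supplies 3:

```python
from itertools import product

# Hypothetical reconstruction of Table 2 (top): three labeled single rules.
data = [[{1}, {4}, {6}, 'a'], [{2}, {5}, {6}, 'a'], [{3}, {6}, {6}, 'a']]
rule = [{1, 2, 3}, {4, 5, 6}, {6}, 'a']        # mid part of Table 2

covered = set(product(*rule[:-1]))              # 3 * 3 * 1 = 9 presets
in_data = {tuple(next(iter(s)) for s in r[:-1]) for r in data}
print(len(covered & in_data) / len(covered))    # 3/9, so it needs ratio <= 1/3
```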
If d = 2 and ratio = 1, the rule model extracted by the algorithm is presented at the bottom of Table 2. Here, the ratio of single instances contained in any rule that have to be in the original data is 1, i.e., 100%. Therefore, the rule at the middle of Table 2 cannot be created.
The parameter ratio is kept constant because it defines the level of generalization that the user of the algorithm wants to explore. The ratio allows for the extension of the knowledge base to cases that have not been previously used to build the model. If the user is more conservative, the ratio should be closer to 1; if the goal is to be more exploratory, lower ratios are needed.
Finally, although no comparisons of computational time were carried out, the algorithm complexity serves to estimate its performance. Considering the dissimilarity and create_rule functions described, the complexity as a function of the input data size m is better than that of a previous version of the algorithm presented in [11].