5.1. Exploiting Relevant Context Using Rough Set Theory
We first demonstrate how rough set theory performs attribute reduction and generates decision rules in order to select relevant contexts for a video. The rough-set-based approach demonstrated below associates contextual information with collaborative filtering to eliminate irrelevance and roughness, using the example in Table 4 for attribute reduction and decision-rule generation. It involves the following steps:
Step 1 (Sets formation):
By considering the example given in Table 4, three sets can be formed, which are shown in Table 5:
Set of users: D = {U1, U2, U3, …, U20},
Conditional Attributes: {Age, Time, Day Type},
Decision Attributes: {V1, V2}.
Step 2 (Indiscernibility Relation):
Indiscernibility defines a relation among objects under which all objects in a class take identical values on the considered attribute subset; in other words, two rows of the table with identical context and identical decisions are indiscernible. Thus, on the basis of each attribute, we can form two indiscernibility relation sets.
For Age, two indiscernibility sets are:
For Time, two indiscernibility sets are:
For Day Type, two indiscernibility sets are:
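The indiscernibility classes described above can be sketched in code. The toy decision table below is hypothetical (Table 4 is not reproduced here); only the attribute names follow the text:

```python
from collections import defaultdict

# Hypothetical rows standing in for Table 4: each user has contextual
# attributes and a watched video (the decision attribute).
table = {
    "U1": {"Age": "Child", "Time": "Day",   "DayType": "Weekday", "Video": "V1"},
    "U2": {"Age": "Child", "Time": "Day",   "DayType": "Weekend", "Video": "V2"},
    "U3": {"Age": "Adult", "Time": "Night", "DayType": "Weekday", "Video": "V2"},
    "U4": {"Age": "Adult", "Time": "Day",   "DayType": "Weekday", "Video": "V1"},
}

def indiscernibility(table, attributes):
    """Partition users into classes that share identical values
    on the given attribute subset (the indiscernibility relation)."""
    classes = defaultdict(set)
    for user, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(user)
    return list(classes.values())

# On a single binary attribute such as Age, the relation yields two sets:
print(indiscernibility(table, ["Age"]))  # e.g. [{'U1', 'U2'}, {'U3', 'U4'}]
```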
Step 3 (Formal Approximation):
After computing the indiscernibility relation, we can compute the lower and upper approximations to determine the boundary region.
The lower approximation will give us two sets: in the first, users had clearly watched V1 and, in the second, users had clearly watched V2. It indicates that if a user with context {Child, Day, Weekday} had watched V1, then there will be no other user in the complete decision table with the same context but a different video preference. This context will contain only users who had clearly watched V1. Thus, the two groups of users who had watched V1 and V2, respectively, are:
The upper approximation will give us two sets: the first is the set of all users from the decision table that can possibly be classified as users who had watched V1, and the second is the set of users who had possibly watched V2:
Boundary Region (BR): consists of the objects that cannot be uniquely classified, i.e., those involved in a conflict. To better illustrate this concept, consider an example: a user with context {Adult, Day, Weekend} had watched one video, while another user with the same context had watched a different one. Then there is roughness in deciding which video is more suitable in the given context. The boundary region is computed as the upper approximation minus the lower approximation.
If BR = ∅, then the set X is crisp with respect to the videos; if BR ≠ ∅, the set is called rough with respect to the videos. That is to say, a rough set has a non-empty boundary region, in contrast to a crisp set. For the given example, BR ≠ ∅, which means that the given set is rough. The boundary region for the current example is given below:
Quality of Approximation:
Users who had clearly watched V1: 5/20 = 25%.
Users who had clearly watched V2: 6/20 = 30%.
Users who had possibly watched V1 or V2: 9/20 = 45%.
Now, for data reduction, we have to ignore the conflicting part, which is 45%.
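The approximations, boundary region, and quality figures above can be sketched as follows. The indiscernibility classes and target set are illustrative stand-ins, not the paper's actual Table 4 data:

```python
def approximations(classes, target):
    """Rough-set lower/upper approximations of a target user set
    over the given indiscernibility classes."""
    lower, upper = set(), set()
    for c in classes:
        if c <= target:   # class wholly inside the target: certain members
            lower |= c
        if c & target:    # class overlapping the target: possible members
            upper |= c
    return lower, upper

# Hypothetical indiscernibility classes over the full context
# {Age, Time, Day Type} and a hypothetical set of users who watched V1.
classes = [{"U1", "U2"}, {"U3"}, {"U4", "U5"}]
watched_v1 = {"U1", "U2", "U4"}

lower, upper = approximations(classes, watched_v1)
boundary = upper - lower            # BR = upper approximation − lower approximation
quality = len(lower) / 5            # |lower| / |U|, cf. the 5/20 = 25% figure
print(lower, upper, boundary, quality)
```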
Step 4 (Data Reduction Using the Positive Region):
Data reduction identifies equivalent information through a pairing mechanism in which users with the same indiscernibility relations are placed together. By considering the boundary region from Equation (23), the positive region from Equation (21), along with the degree of dependency, is computed in Table 6:
In Table 6, the groups represented by Gi, where i = 1, 2, …, 5, collect users that have the same degree of dependency as each other. The lists of users are obtained from the lower approximation as given in Equation (21). Lastly, the degree of dependency is used to show which users have similar dependency; the level of dependency between the members of the same group is represented by a “+” sign. This representation indicates that the users classified as members of the same group share similar contextual information. For a better understanding of this concept, we have represented each group with its decision and conditional attributes in Table 7.
By considering this grouping, a new Table 8 is formed with reduced contextual attributes with regard to the users and videos.
After obtaining the reduced-attribute table, we now compute decision rules for the selection of influential contexts for a video. The new Table 9, formed after reducing the attributes on the basis of the degree of dependency, can be used for the generation of decision rules. The process for generating decision rules is given below. There are eight possible subsets of the contextual attributes {Age, Time, Day Type} (2³ = 8). Each subset generates a new reduced set for the decision-making process; however, we ignore the empty set and the full set. Thus, the remaining six subsets, along with their reduced generated subsets, are shown below:
Table 9, Table 10 and Table 11 indicate the first three subsets from the reduced context Table 8; these tables present the individual subsets of the three pieces of contextual information. In Table 12, a new set is formed by combining age and time; through this table, we reduced the contextual attributes as given in Table 13. Similarly, Table 14 represents the subset formed by combining age and day, and the reduced table obtained from it is given in Table 15. Table 16 represents the subset formed by combining time and day, and the reduced table obtained from it is given in Table 17. Lastly, the decision rules obtained from the reduced tables are jointly given in Table 18. These rules for selecting influential contexts are given in Equation (24):
As indicated by Equation (24), the decision rules that specify which context is influential for a video to be recommended are explicitly defined below:
Rule-1 (R1): If the user’s Age = Child, Time = Day, and Day Type = Weekday, then the video for recommendation = V.
Rule-2 (R2): If the user’s Age = Adult, Time = Day, and Day Type = Weekday, then the video for recommendation = V.
Rule-3 (R3): If the user’s Age = Child, Time = Day, and Day Type = Weekend, then the video for recommendation = V.
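The three decision rules can be encoded as a simple lookup; a minimal sketch, with placeholder video labels (the rule subscripts are elided in the text, so the names below are hypothetical):

```python
# Decision rules from Equation (24) as a lookup table. The recommended
# video labels are placeholders, not the paper's actual identifiers.
rules = {
    ("Child", "Day", "Weekday"): "V_a",
    ("Adult", "Day", "Weekday"): "V_b",
    ("Child", "Day", "Weekend"): "V_c",
}

def recommend(age, time, day_type):
    """Return the recommended video if one of the rules fires, else None."""
    return rules.get((age, time, day_type))

print(recommend("Child", "Day", "Weekday"))  # V_a
```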
Our goal was to select concrete contextual features that are influential and relevant to the situation in which a video has to be recommended. If a user’s set of attributes is {Child, Day, Weekday}, then through RST we identified for which videos these contexts are most influential. The results of the demonstrated example show that applying rough sets for exploiting relevant contexts for video recommendation yields only a 27% return of decision rules, because RST works on granularity alone and lacks a parameterization property. Hence, a better approach is required to overcome this issue.
5.2. Exploiting Relevant Context Using Soft-Rough Sets
Applying soft-rough sets on CAVRS to alleviate contextual sparsity when exploiting relevant context is our proposed model. Assume the new scenario given in Table 19, which has three contextual attributes (Age, Time, Day). The possible contextual scenarios obtained from these three attributes are:
c1: Child, Day, Weekday,
c2: Child, Day, Weekend,
c3: Child, Night, Weekday,
c4: Child, Night, Weekend,
c5: Adult, Day, Weekday,
c6: Adult, Day, Weekend,
c7: Adult, Night, Weekday,
c8: Adult, Night, Weekend.
Let A = {c1, c2, c3, c4, c5, c6, c7, c8} be the universal set of contexts, and let V = {v1, v2, v3, v4, v5} be the parameter set containing the videos. A pair (F, V) represents a soft set if and only if F is a mapping from the set of videos V into the subsets of the universal set A.
Step 1 (Soft Set Formation):
In this phase, contextual attributes are mapped to the videos. A soft set (F, V) describes which videos have been watched in a specific context scenario. Considering the data from Table 20, the contextual mapping using the soft set concept becomes as follows:
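The soft-set mapping can be sketched as a dictionary from each video (parameter) to the subset of contexts in which it was watched. The memberships below are illustrative, read off the conflict listing under the assumption that its rows are c1, …, c8 and its columns v1, …, v5:

```python
# Soft set (F, V): each video maps to the subset of the universal
# context set A = {c1, ..., c8} in which it was watched.
# Memberships are assumed, not the paper's verified Table 20 data.
A = {f"c{i}" for i in range(1, 9)}
F = {
    "v1": {"c1", "c3", "c4", "c8"},
    "v2": {"c2", "c8"},
    "v3": {"c1", "c5"},
    "v4": {"c6"},
    "v5": {"c1", "c3", "c4", "c6", "c7"},
}

# Every image F(v) must be a subset of the universal set A for
# (F, V) to qualify as a soft set over A.
assert all(contexts <= A for contexts in F.values())
```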
To demonstrate conflict in the given example, we have illustrated the scenario in Table 20. The relation between the columns and rows of the table represents the conflicting scenarios between different contextual attributes; Table 20 relates each of the eight contextual factors to its conflicts with the other contexts. Note that a conflict appears when a video that has been watched in a specific context has also been watched by the user in a different context. For simplicity, the negative (-) sign in the status column represents the existence of a conflicting situation, and the corresponding context column gives the number of conflicts of the current context with other contextual scenarios.
We have illustrated the conflicting situation in the Boolean-valued information system below, in which the universal set U and the conflicting function f are demonstrated as:
Similarly,
F(c1, v1) = -, F(c1, v2) = 0, F(c1, v3) = -, F(c1, v4) = 0, F(c1, v5) = -,
F(c2, v1) = 0, F(c2, v2) = -, F(c2, v3) = 0, F(c2, v4) = 0, F(c2, v5) = 0,
F(c3, v1) = -, F(c3, v2) = 0, F(c3, v3) = 0, F(c3, v4) = 0, F(c3, v5) = -,
F(c4, v1) = -, F(c4, v2) = 0, F(c4, v3) = 0, F(c4, v4) = 0, F(c4, v5) = -,
F(c5, v1) = 0, F(c5, v2) = 0, F(c5, v3) = -, F(c5, v4) = 0, F(c5, v5) = 0,
F(c6, v1) = 0, F(c6, v2) = 0, F(c6, v3) = 0, F(c6, v4) = +, F(c6, v5) = -,
F(c7, v1) = 0, F(c7, v2) = 0, F(c7, v3) = 0, F(c7, v4) = 0, F(c7, v5) = -,
F(c8, v1) = -, F(c8, v2) = -, F(c8, v3) = 0, F(c8, v4) = 0, F(c8, v5) = 0.
Figure 3 presents the flow graph of Table 19 and the conflict situation of Table 20 in a visual representation. The flow graph illustrates each mapping between a context and a video along with its conflict status (+), (-), or (0). The negative (-) sign represents the existence of a conflicting situation, the positive (+) sign indicates that there is no conflicting situation, and 0 indicates that the corresponding video has not been watched in the current context.
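The (+), (-), (0) status assignment can be derived mechanically from a soft-set mapping; a sketch, with illustrative memberships assumed from the conflict listing (rows taken as c1, …, c8, columns as v1, …, v5):

```python
# Derive the (+, -, 0) status of the conflict table from a soft-set
# mapping of videos to the contexts in which they were watched.
# Memberships are assumptions for illustration only.
F = {
    "v1": {"c1", "c3", "c4", "c8"},
    "v2": {"c2", "c8"},
    "v3": {"c1", "c5"},
    "v4": {"c6"},
    "v5": {"c1", "c3", "c4", "c6", "c7"},
}

def status(c, v):
    """'0': v not watched in c; '+': v watched only in c (no conflict);
    '-': v watched in c and in at least one other context (conflict)."""
    if c not in F[v]:
        return "0"
    return "+" if len(F[v]) == 1 else "-"

row_c6 = [status("c6", v) for v in ("v1", "v2", "v3", "v4", "v5")]
print(row_c6)  # ['0', '0', '0', '+', '-'] — v4 is watched only in c6
```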
Step 2 (Calculating Lower Approximation):
In this phase, we compute the lower approximation on the soft set demonstrated in the previous step. This set will contain only those objects that surely belong to the set X. The lower approximation is computed by assuming a value of X as follows. Let us consider X, one of the subsets from P(V), where V represents the videos; then the lower approximation will be:
Similarly, for another subset X from P(V), the lower approximation set will become:
where v represents a video and c represents any context scenario in which video v has been watched. To better illustrate this concept, consider the scenario given in Table 19. The illustrated example demonstrates that the combination of videos taken in set X is efficient to recommend in a particular context because the videos watched in that context have maximum matching with the given set X. Hence, we can compute a relevant context scenario for each subset of video combinations. As our lower approximation set is not empty, we will also compute the upper approximation on the given soft sets.
Step 3 (Calculating Upper Approximation):
In this phase, we compute the upper approximation on the soft set created in Step 1. This set will contain those objects that possibly belong to the set X. The upper approximation is computed by assuming a value of X as follows. Considering the same subset X from P(V), where V represents the videos, the upper approximation will be:
As can be seen from the above approximations, for the given combination of videos in set X, four contextual scenarios are possible.
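A minimal sketch of the two approximations, assuming a context belongs to the lower approximation of a video set X when every video watched in it lies in X, and to the upper approximation when at least one does (the membership data is again illustrative):

```python
# Soft-rough approximations over contexts, driven by an assumed
# soft-set mapping of videos to watched-in contexts.
F = {
    "v1": {"c1", "c3", "c4", "c8"},
    "v2": {"c2", "c8"},
    "v3": {"c1", "c5"},
    "v4": {"c6"},
    "v5": {"c1", "c3", "c4", "c6", "c7"},
}

def watched(c):
    """All videos watched in context c (the inverse image of F)."""
    return {v for v, contexts in F.items() if c in contexts}

def soft_rough(X, contexts):
    lower = {c for c in contexts if watched(c) and watched(c) <= X}
    upper = {c for c in contexts if watched(c) & X}
    return lower, upper

contexts = [f"c{i}" for i in range(1, 9)]
lower, upper = soft_rough({"v1", "v5"}, contexts)
print(sorted(lower), sorted(upper))
```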
Step 4 (Degree of Dependency using Positive Region):
This phase is performed immediately after the calculation of the lower and upper approximations, to assess the degree of dependency between the attributes using the positive region POS(X).
Figure 4 presents the degree of dependency between each of the contextual scenarios obtained from Table 20. Each branch represents the link of a contextual scenario with the other context scenarios, giving a clear indication of the dependency between different context attributes. For clarity and simplicity, solid lines denote a high degree of dependency, and dotted lines represent low dependency between the attributes.
In the given circumstance, a contextual scenario c is assessed with respect to an attribute v. Suppose X is one of the subsets from P(V), where V represents the videos; then the degree of dependency of a contextual scenario for this subset can be computed using Equation (9). For this particular scenario, POS(X) will be the context c, and |O| will be the whole set of videos V = {v1, v2, v3, v4, v5}. Likewise, the calculations using Equation (9) are incorporated into Equation (11) for computing the degree of dependency, where a ∈ A. For the sake of simplicity and clarity, each calculation is given below:
Respectively, we compute the dependency of X on A as:
Finally, the degree of dependency will be computed as given below:
After associating the degree of dependency between c and all remaining contextual scenarios from the set A, we can now compute the relevant contexts through this dependency level.
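The degree-of-dependency computation reduces to a ratio of set sizes, per Equations (9) and (11); a sketch with hypothetical values for the positive region:

```python
# Degree of dependency as |POS(X)| / |O|: here POS(X) contains the
# videos certainly classified under the chosen context scenario, and
# O is the full video set. Both sets are illustrative assumptions.
O = {"v1", "v2", "v3", "v4", "v5"}
pos_x = {"v1", "v5"}                 # hypothetical positive region

gamma = len(pos_x) / len(O)          # degree of dependency
print(gamma)                         # 0.4
```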
Step 5 (Weighting Process):
We use the lower approximation while generating decision rules for exploiting relevant context. In our case, a single relevant context is obtained for the subset X given in Step 2; relevant contexts for other video combinations can be obtained by applying the same mechanism to different subsets of videos.
The key intuition behind using the soft-rough set process, rather than traditional approaches, is that high precision is achieved by making realistic estimates over imprecise data. For the weighting process, we assume that contextual attributes that are less dependent on other contextual features are more valuable for making predictions. The weighting factor w takes real values in the range [0, 1]. The weights of the contextual attributes moderate the influence of those factors by controlling their overall contribution during recommendation.
In the given circumstance, the degree of dependency can be computed following the methodology of Step 4. Figure 4 illustrates the degree of dependency between the different contextual scenarios. Notably, each contextual scenario is a combination of different contextual attributes; hence, the degree of dependency of a contextual scenario actually depicts the level of dependency of that scenario on other contextual attributes. For the purpose of assigning weights, we divided dependency into three categories: completely dependent, partially dependent, and least dependent. Table 20 reflects the degree of dependency for each contextual scenario as demonstrated below:
c1 = Highest Dependency ⇒ Child, Day, Weekday,
c2 = Least Dependency ⇒ Child, Day, Weekend,
c3 = Partial Dependency ⇒ Child, Night, Weekday,
c4 = Partial Dependency ⇒ Child, Night, Weekend,
c5 = Least Dependency ⇒ Adult, Day, Weekday,
c6 = Partial Dependency ⇒ Adult, Day, Weekend,
c7 = Partial Dependency ⇒ Adult, Night, Weekday,
c8 = Partial Dependency ⇒ Adult, Night, Weekend.
Likewise, the degree-of-dependency weight for each contextual attribute can be computed from the above assignment, where the weight for highest dependency is 0.2, the weight for partial dependency is 0.3, and the weight for least dependency is 0.5, as demonstrated below. For each contextual attribute, the total weight is the sum of the individual weights associated with its occurrences at each dependency level. For example, Child occurs in four different contextual scenarios with different dependency levels; summing the corresponding weights gives the total weight of the Child attribute.
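The per-attribute weighting described above can be sketched as follows, using the dependency levels of the eight scenarios and the weights 0.2/0.3/0.5 from the text:

```python
# Total weight per contextual attribute: sum the dependency-level
# weights of every scenario in which the attribute occurs.
weights = {"highest": 0.2, "partial": 0.3, "least": 0.5}
scenarios = {
    "c1": ({"Child", "Day", "Weekday"}, "highest"),
    "c2": ({"Child", "Day", "Weekend"}, "least"),
    "c3": ({"Child", "Night", "Weekday"}, "partial"),
    "c4": ({"Child", "Night", "Weekend"}, "partial"),
    "c5": ({"Adult", "Day", "Weekday"}, "least"),
    "c6": ({"Adult", "Day", "Weekend"}, "partial"),
    "c7": ({"Adult", "Night", "Weekday"}, "partial"),
    "c8": ({"Adult", "Night", "Weekend"}, "partial"),
}

total = {}
for attrs, level in scenarios.values():
    for a in attrs:
        total[a] = total.get(a, 0.0) + weights[level]

print(total["Child"])  # ≈ 1.3 (0.2 + 0.5 + 0.3 + 0.3)
```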
Step 6 (Selection of Relevant Contextual Attribute):
The weights computed above are used for relevant context prediction for a given set of videos, and a new minimal contextual set is chosen for the recommendation process. Supposing a combination of videos X is given, the most suitable contextual scenario in which this combination is to be recommended is the one combining the three contextual attributes {Child, Day, Weekday}. The most influential contextual factor among these three attributes is the one with the maximum weight, which for the given circumstance is Day, representing the Time attribute. Respectively, we can ignore the other contextual attributes by selecting only the minimal part of the set of contextual attributes.
The literature indicates that using irrelevant contextual features in recommender algorithms may aggravate contextual sparsity. Therefore, the selection of contextual sets with minimal contextual factors can alleviate contextual sparsity. After detecting the relevant contextual set containing minimal but influential contextual factors, the recommendation process starts. One of the most popular and successful approaches for predicting video recommendations in CAVRS is collaborative filtering (CF); in this approach, the ratings of the current user are compared with the ratings of other users with a similar history.
To confirm the genuineness of the results of the illustrative example, we extensively analyzed the given scenario, and our findings revealed that the conflicting situation is reduced when a minimal set of contextual factors is used as the contextual scenario. We can, therefore, recommend videos to users on the basis of contextual scenarios containing minimal and highly weighted contextual factors.