On Complexity of Deterministic and Nondeterministic Decision Trees for Conventional Decision Tables from Closed Classes

Ostonov, Azimkhon; Moshkov, Mikhail

doi:10.3390/e25101411

by Azimkhon Ostonov * and Mikhail Moshkov

Computer, Electrical and Mathematical Sciences & Engineering Division and Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia

* Author to whom correspondence should be addressed.

Entropy 2023, 25(10), 1411; https://doi.org/10.3390/e25101411

Submission received: 8 August 2023 / Revised: 5 September 2023 / Accepted: 10 September 2023 / Published: 3 October 2023

Abstract

In this paper, we consider classes of conventional decision tables closed relative to the removal of attributes (columns) and changing decisions assigned to rows. For tables from an arbitrary closed class, we study the dependence of the minimum complexity of deterministic and nondeterministic decision trees on the complexity of the set of attributes attached to columns. We also study the dependence of the minimum complexity of deterministic decision trees on the minimum complexity of nondeterministic decision trees. Note that a nondeterministic decision tree can be interpreted as a set of true decision rules that covers all rows of the table.

Keywords:

closed classes of decision tables; deterministic decision trees; nondeterministic decision trees

1. Introduction

Decision tables (sometimes represented as datasets or finite information systems with a distinguished decision attribute) appear in data analysis [1,2,3,4,5,6] and in such areas as combinatorial optimization, computational geometry, and fault diagnosis, where they are used to represent and explore problems [7,8].

Decision trees [1,5,6,7,9] and decision rule systems [2,3,4,8,10,11,12] are widely used as classifiers, as a means for knowledge representation, and as algorithms for solving various problems of combinatorial optimization, fault diagnosis, etc. Decision trees and rules are among the most interpretable models in data analysis [13].

In this paper, we consider classes of conventional decision tables closed under the removal of columns and changing of decisions. The most natural examples of such classes are closed classes of decision tables generated by information systems (see Section 3.4 for an explanation). We study the dependence of the minimum complexity of deterministic and nondeterministic decision trees on the complexity of the set of attributes attached to columns of the decision table. We also study the dependence of the minimum complexity of deterministic decision trees on the minimum complexity of nondeterministic decision trees. Note that nondeterministic decision trees can be considered as representations of systems of true decision rules that cover all rows of decision tables. Note also that the depth of deterministic and nondeterministic decision trees for computing Boolean functions has been studied quite intensively [14,15,16,17].

This paper continues the study of closed classes of decision tables that began with work [18] and continued with works [19,20]. To the best of our knowledge, there are no other papers that study the closed classes of decision tables.

Various classes of objects that are closed under different operations are intensively studied. Among them, in particular, are classes of Boolean functions closed under the operation of superposition [21] and minor-closed classes of graphs [22]. Decision tables represent an interesting and important mathematical object deserving of mathematical research, in particular, the study of closed classes of decision tables.

In [18], we studied the dependence of the minimum depth of deterministic decision trees and the depth of deterministic decision trees constructed by a greedy algorithm on the number of attributes (columns) for conventional decision tables from classes closed under operations of removal of columns and changing of decisions.

In [19], we considered classes of decision tables with many-valued decisions closed under operations of removal of columns, changing of decisions, permutation of columns, and duplication of columns. We studied relationships among three parameters of these tables: the complexity of a decision table (if we consider the depth of decision trees, then the complexity of a decision table is the number of columns in it), the minimum complexity of a deterministic decision tree, and the minimum complexity of a nondeterministic decision tree. We considered a rough classification of functions characterizing relationships and enumerated all possible seven types of relationships.

In [20], we considered classes of decision tables with 0–1 decisions (each row is labeled with the decision 0 or the decision 1) closed relative to the removal of attributes (columns) and changing decisions assigned to rows. For tables from an arbitrary closed class, we studied the dependence of the minimum complexity of deterministic decision trees on various parameters of the tables: the minimum complexity of a test, the complexity of the set of attributes attached to columns, and the minimum complexity of a strongly nondeterministic decision tree. We also studied the dependence of the minimum complexity of strongly nondeterministic decision trees on the complexity of the set of attributes attached to columns. Note that a strongly nondeterministic decision tree can be interpreted as a set of true decision rules that covers all rows labeled with the decision 1.

In the previous papers, we did not consider in detail conventional decision tables in which rows are labeled with arbitrary decisions. These tables differ significantly from tables with many-valued decisions and from tables with 0–1 decisions considered previously. We now describe the results obtained in the present paper. Let A be a class of conventional decision tables closed under the removal of columns and changing of decisions, and let $ψ$ be a bounded complexity measure. In this paper, we study three functions: $F_{ψ, A} (n)$, $G_{ψ, A} (n)$, and $H_{ψ, A} (n)$.

The function $F_{ψ, A} (n)$ characterizes the growth in the worst case of the minimum complexity of a deterministic decision tree for a decision table from A with the growth of the complexity of the set of attributes attached to columns of the table. We prove that the function $F_{ψ, A} (n)$ is either bounded from above by a constant, or grows as a logarithm of n, or grows almost linearly depending on n (it is bounded from above by n and is equal to n for infinitely many n). These results are generalizations of results obtained in [20] for closed classes of decision tables with 0–1 decisions.

The function $G_{ψ, A} (n)$ characterizes the growth in the worst case of the minimum complexity of a nondeterministic decision tree for a decision table from A with the growth of the complexity of the set of attributes attached to columns of the table. We prove that the function $G_{ψ, A} (n)$ is either bounded from above by a constant or grows almost linearly depending on n (it is bounded from above by n and is equal to n for infinitely many n).

The function $H_{ψ, A} (n)$ characterizes the growth in the worst case of the minimum complexity of a deterministic decision tree for a decision table from A with the growth of the minimum complexity of a nondeterministic decision tree for the table. This function is not necessarily everywhere defined. For the case when $H_{ψ, A} (n)$ is everywhere defined, we prove that this function is either bounded from above by a constant or greater than or equal to n for infinitely many n.

The novelty of the work is as follows:

For the function $F_{ψ, A}$, which characterizes the complexity of deterministic decision trees, we have obtained an exhaustive description of the types of its behavior.
For the function $G_{ψ, A}$, which characterizes the complexity of nondeterministic decision trees, we have obtained an exhaustive description of the types of its behavior.
For the function $H_{ψ, A}$, which characterizes relationships between the complexity of deterministic and nondeterministic decision trees, we have obtained a preliminary description of the types of its behavior that requires additional study.

The obtained results allow us to point out the cases when the complexity of deterministic and nondeterministic decision trees is essentially less than the complexity of the set of attributes attached to columns of the table. This may be useful in applications.

The paper consists of six sections. In Section 2, main definitions and notation are considered. In Section 3, we provide the main results. Section 4 contains auxiliary statements. In Section 5, we prove the main results. Section 6 contains short conclusions.

2. Main Definitions and Notation

Denote $ω = \{0, 1, 2, \dots\}$ and, for any $k \in ω \setminus \{0, 1\}$, denote $E_{k} = \{0, 1, \dots, k - 1\}$. Let $P = \{f_{i} : i \in ω\}$ be the set of attributes (in fact, names of attributes). Two attributes $f_{i}, f_{j} \in P$ are considered different if $i \neq j$.

2.1. Decision Tables

First, we define the notion of a decision table.

Definition 1.

Let $k \in ω \setminus \{0, 1\}$. Denote by $M_{k}$ the set of rectangular tables filled with numbers from $E_{k}$ in each of which rows are pairwise different, each row is labeled with a number from ω (decision), and columns are labeled with pairwise different attributes from P. Rows are interpreted as tuples of values of these attributes. Empty tables without rows also belong to the set $M_{k}$. We will use the same notation Λ for these tables. Tables from $M_{k}$ will be called decision tables.

Example 1.

Figure 1 shows a decision table from $M_{2}$.

Denote by $M_{k}^{C}$ the set of tables from $M_{k}$ in each of which all rows are labeled with the same decision. Let $Λ \in M_{k}^{C}$.

Let T be a nonempty table from $M_{k}$. Denote by $P (T)$ the set of attributes attached to columns of the table T. Let $f_{i_{1}}, \dots, f_{i_{m}} \in P (T)$ and $δ_{1}, \dots, δ_{m} \in E_{k}$. We denote by $T (f_{i_{1}}, δ_{1}) \dots (f_{i_{m}}, δ_{m})$ the table obtained from T by the removal of all rows that do not satisfy the following condition: in columns labeled with attributes $f_{i_{1}}, \dots, f_{i_{m}}$, the row has numbers $δ_{1}, \dots, δ_{m}$, respectively.

We now define two operations on decision tables: the removal of columns and changing of decisions. Let $T \in M_{k}$.

Definition 2.

Removal of columns. Let $D \subseteq P (T)$. We remove from T all columns labeled with the attributes from the set D. In each group of rows that are equal on the remaining columns, we keep one with the minimum decision. Denote the obtained table by $I (D, T)$. In particular, $I (\emptyset, T) = T$ and $I (P (T), T) = Λ$. It is obvious that $I (D, T) \in M_{k}$.

Definition 3.

Changing of decisions. Let $ν : E_{k}^{|P (T)|} \to ω$ (by definition, $E_{k}^{0} = \emptyset$). For each row $\bar{δ}$ of the table T, we replace the decision attached to this row with $ν (\bar{δ})$. We denote the obtained table by $J (ν, T)$. It is obvious that $J (ν, T) \in M_{k}$.

Definition 4.

Denote $[T] = \{J (ν, I (D, T)) : D \subseteq P (T), ν : E_{k}^{|P (T) \setminus D|} \to ω\}$. The set $[T]$ is the closure of the table T under the operations of removal of columns and changing of decisions.

Example 2.

Figure 2 shows the table $J (ν, I (D, T_{0}))$, where $T_{0}$ is the table shown in Figure 1, $D = \{f_{4}\}$, and $ν (x_{1}, x_{2}) = x_{1} + x_{2}$.
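The two operations and the subtable notation can be sketched in a few lines of Python. This is only an illustrative sketch: the representation of a table as a pair (attribute indices, dictionary from rows to decisions), the function names, and the sample table are assumptions made here, and the sample table is not the table of Figure 1.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[int, ...]
# Assumed representation: (attribute indices labeling the columns, {row: decision}).
Table = Tuple[Tuple[int, ...], Dict[Row, int]]


def subtable(table: Table, conditions: List[Tuple[int, int]]) -> Table:
    """T(f_i1, d1)...(f_im, dm): keep only rows with value dj in column f_ij."""
    attrs, rows = table
    pos = {a: j for j, a in enumerate(attrs)}
    kept = {r: d for r, d in rows.items()
            if all(r[pos[a]] == v for a, v in conditions)}
    return attrs, kept


def remove_columns(table: Table, removed: Iterable[int]) -> Table:
    """I(D, T): drop the columns in D; among rows that coincide, keep the minimum decision."""
    attrs, rows = table
    dropped = set(removed)
    keep = [j for j, a in enumerate(attrs) if a not in dropped]
    if not keep:
        return (), {}  # every column removed: the empty table
    merged: Dict[Row, int] = {}
    for r, d in rows.items():
        short = tuple(r[j] for j in keep)
        merged[short] = min(d, merged.get(short, d))
    return tuple(attrs[j] for j in keep), merged


def change_decisions(table: Table, nu: Callable[..., int]) -> Table:
    """J(nu, T): relabel every row with nu applied to the row."""
    attrs, rows = table
    return attrs, {r: nu(*r) for r in rows}


# Hypothetical table over E_2 with columns f_1, f_2, f_4 (not the table of Figure 1).
T = ((1, 2, 4), {(0, 0, 0): 1, (0, 0, 1): 3, (1, 0, 1): 2, (1, 1, 0): 5})
reduced = remove_columns(T, {4})
relabeled = change_decisions(reduced, lambda x1, x2: x1 + x2)
print(reduced)
print(relabeled)
```

In this sketch, removing the column $f_{4}$ merges the rows (0, 0, 0) and (0, 0, 1) and keeps the smaller decision, as Definition 2 prescribes; every table of the closure $[T]$ arises by composing `remove_columns` with `change_decisions`.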

Definition 5.

Let $A \subseteq M_{k}$ and $A \neq \emptyset$. Denote $[A] = \bigcup_{T \in A} [T]$. The set $[A]$ is the closure of the set A under the considered two operations. The class (the set) of decision tables A will be called a closed class if $[A] = A$.

Let $A_{1}$ and $A_{2}$ be closed classes of decision tables from $M_{k}$. Then, $A_{1} \cup A_{2}$ is a closed class of decision tables from $M_{k}$.

2.2. Deterministic and Nondeterministic Decision Trees

A finite tree with the root is a finite directed tree in which exactly one node called the root has no entering edges. The nodes without leaving edges are called terminal nodes.

Definition 6.

A k-decision tree is a finite tree with the root, which has at least two nodes and in which

The root and edges leaving the root are not labeled.
Each terminal node is labeled with a decision from the set ω.
Each node, which is neither the root nor a terminal node, is labeled with an attribute from the set P. Each edge leaving such a node is labeled with a number from the set $E_{k}$ .

Example 3.

Figure 3 and Figure 4 show 2-decision trees.

We denote by $T_{k}$ the set of all k-decision trees. Let $Γ \in T_{k}$. We denote by $P (Γ)$ the set of attributes attached to nodes of $Γ$ that are neither the root nor terminal nodes. A complete path of $Γ$ is a sequence $τ = v_{1}, d_{1}, \dots, v_{m}, d_{m}, v_{m + 1}$ of nodes and edges of $Γ$ in which $v_{1}$ is the root of $Γ$, $v_{m + 1}$ is a terminal node of $Γ$ and, for $j = 1, \dots, m$, the edge $d_{j}$ leaves the node $v_{j}$ and enters the node $v_{j + 1}$. Let $T \in M_{k}$. If $P (Γ) \subseteq P (T)$, then we associate with the table T and the complete path $τ$ a decision table $T (τ)$. If $m = 1$, then $T (τ) = T$. If $m > 1$ and, for $j = 2, \dots, m$, the node $v_{j}$ is labeled with the attribute $f_{i_{j}}$ and the edge $d_{j}$ is labeled with the number $δ_{j}$, then $T (τ) = T (f_{i_{2}}, δ_{2}) \dots (f_{i_{m}}, δ_{m})$.

Definition 7.

Let $T \in M_{k} \setminus \{Λ\}$. A deterministic decision tree for the table T is a k-decision tree Γ satisfying the following conditions:

Only one edge leaves the root of Γ.
For any node, which is neither the root nor a terminal node, edges leaving this node are labeled with pairwise different numbers.
$P (Γ) \subseteq P (T)$ .
For any row of T, there exists a complete path τ of Γ such that the considered row belongs to the table $T (τ)$ .
For any complete path τ of Γ, either $T (τ) = Λ$ or all rows of $T (τ)$ are labeled with the decision attached to the terminal node of τ.

Example 4.

The 2-decision tree shown in Figure 3 is a deterministic decision tree for the decision table shown in Figure 1.

Definition 8.

Let $T \in M_{k} \setminus \{Λ\}$. A nondeterministic decision tree for the table T is a k-decision tree Γ satisfying the following conditions:

$P (Γ) \subseteq P (T)$ .
For any row of T, there exists a complete path τ of Γ such that the considered row belongs to the table $T (τ)$ .
For any complete path τ of Γ, either $T (τ) = Λ$ or all rows of $T (τ)$ are labeled with the decision attached to the terminal node of τ.

Example 5.

The 2-decision tree shown in Figure 4 is a nondeterministic decision tree for the decision table shown in Figure 1.

Figure 4. A nondeterministic decision tree for the decision table shown in Figure 1.
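The conditions of Definition 8 can be checked mechanically once a tree is flattened into its complete paths, which matches the reading of a nondeterministic decision tree as a system of true decision rules covering all rows. The sketch below assumes this flattened representation (each path is a list of (attribute index, value) pairs plus a terminal decision); the function names and the sample table are hypothetical and are not taken from the figures.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, ...]
# One complete path: the (attribute index, value) pairs read along it, then the terminal decision.
Path = Tuple[List[Tuple[int, int]], int]


def row_on_path(attrs: Tuple[int, ...], row: Row, conditions: List[Tuple[int, int]]) -> bool:
    """True if the row belongs to T(tau) for the path with these conditions."""
    pos = {a: j for j, a in enumerate(attrs)}
    return all(row[pos[a]] == v for a, v in conditions)


def is_nondeterministic_tree(attrs: Tuple[int, ...], rows: Dict[Row, int], paths: List[Path]) -> bool:
    # Condition 1: only attributes of the table are used.
    if any(a not in attrs for conds, _ in paths for a, _ in conds):
        return False
    # Condition 2: every row belongs to T(tau) for at least one complete path.
    covered = all(any(row_on_path(attrs, r, conds) for conds, _ in paths) for r in rows)
    # Condition 3: every row of T(tau) carries the decision of the terminal node.
    true_rules = all(d == dec
                     for conds, dec in paths
                     for r, d in rows.items()
                     if row_on_path(attrs, r, conds))
    return covered and true_rules


def depth(paths: List[Path]) -> int:
    """h of the tree: the largest number of attribute nodes on a complete path."""
    return max(len(conds) for conds, _ in paths)


# Hypothetical table over E_2 with columns f_1, f_2 (decision = logical OR of the row).
attrs = (1, 2)
rows = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
good = [([(1, 1)], 1), ([(2, 1)], 1), ([(1, 0), (2, 0)], 0)]
bad = [([(1, 1)], 1), ([(1, 0)], 0)]  # second rule is false on the row (0, 1)
print(is_nondeterministic_tree(attrs, rows, good), depth(good))
print(is_nondeterministic_tree(attrs, rows, bad))
```

The additional requirements of Definition 7 (one edge leaving the root, pairwise different labels on edges leaving a node) concern the shape of the tree itself and are not captured by this flattened view.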

2.3. Complexity Measures

Denote by B the set of all finite words over the alphabet $P = \{f_{i} : i \in ω\}$, which contains the empty word $λ$ and on which the word concatenation operation is defined.

Definition 9.

A complexity measure is an arbitrary function $ψ : B \to ω$ that has the following properties: for any words $α_{1}, α_{2} \in B$,

$ψ (α_{1}) = 0$ if and only if $α_{1} = λ$ —positivity property.
$ψ (α_{1}) = ψ (α_{1}^{'})$ for any word $α_{1}^{'}$ obtained from $α_{1}$ by permutation of letters—commutativity property.
$ψ (α_{1}) \leq ψ (α_{1} α_{2})$ —nondecreasing property.
$ψ (α_{1} α_{2}) \leq ψ (α_{1}) + ψ (α_{2})$ —boundedness from above property.

The following functions are complexity measures:

Function h for which, for any word $α \in B$ , $h (α) = |α|$ , where $|α|$ is the length of the word $α$ .
An arbitrary function $φ : B \to ω$ such that $φ (λ) = 0$, $φ (f_{i}) > 0$ for any $f_{i} \in P$, and, for any nonempty word $f_{i_{1}} \dots f_{i_{m}} \in B$,

$φ (f_{i_{1}} \dots f_{i_{m}}) = \sum_{j = 1}^{m} φ (f_{i_{j}}) . \qquad (1)$

An arbitrary function $ρ : B \to ω$ such that $ρ (λ) = 0$, $ρ (f_{i}) > 0$ for any $f_{i} \in P$, and, for any nonempty word $f_{i_{1}} \dots f_{i_{m}} \in B$, $ρ (f_{i_{1}} \dots f_{i_{m}}) = \max \{ρ (f_{i_{j}}) : j = 1, \dots, m\}$.

Definition 10.

A bounded complexity measure is a complexity measure ψ that has the boundedness from below property: for any word $α \in B$, $ψ (α) \geq |α|$.

Any complexity measure satisfying the equality (1), in particular the function h, is a bounded complexity measure. One can show that if functions $ψ_{1}$ and $ψ_{2}$ are complexity measures, then the functions $ψ_{3}$ and $ψ_{4}$ are complexity measures, where, for any $α \in B$, $ψ_{3} (α) = ψ_{1} (α) + ψ_{2} (α)$ and $ψ_{4} (α) = \max (ψ_{1} (α), ψ_{2} (α))$. If the function $ψ_{1}$ is a bounded complexity measure, then the functions $ψ_{3}$ and $ψ_{4}$ are bounded complexity measures.
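The four properties of Definition 9 and the boundedness from below property of Definition 10 can be spot-checked on short words. The following sketch encodes a word as a tuple of attribute indices and tests the properties exhaustively over a small alphabet; the weight $i + 1$ for the attribute $f_{i}$, the helper names, and the size limits are assumptions chosen for illustration, and a passing check on short words is evidence rather than a proof.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Word = Tuple[int, ...]  # the word f_{i_1} ... f_{i_m} stored as (i_1, ..., i_m)


def h(word: Word) -> int:
    """Length of the word."""
    return len(word)


def make_phi(weight: Callable[[int], int]) -> Callable[[Word], int]:
    """Additive measure: the sum of the weights of the letters, as in equality (1)."""
    return lambda word: sum(weight(i) for i in word)


def make_rho(weight: Callable[[int], int]) -> Callable[[Word], int]:
    """Maximum of the weights of the letters, 0 for the empty word."""
    return lambda word: max((weight(i) for i in word), default=0)


def words_up_to(alphabet: Sequence[int], max_len: int) -> List[Word]:
    return [w for n in range(max_len + 1) for w in product(alphabet, repeat=n)]


def looks_like_complexity_measure(psi, alphabet=(0, 1, 2), max_len=3) -> bool:
    words = words_up_to(alphabet, max_len)
    for a in words:
        if (psi(a) == 0) != (a == ()):        # positivity
            return False
        if psi(a) != psi(tuple(sorted(a))):   # commutativity
            return False
        for b in words:
            # nondecreasing and boundedness from above
            if not psi(a) <= psi(a + b) <= psi(a) + psi(b):
                return False
    return True


def looks_bounded(psi, alphabet=(0, 1, 2), max_len=3) -> bool:
    """Boundedness from below: psi(word) >= length of the word."""
    return all(psi(w) >= len(w) for w in words_up_to(alphabet, max_len))


phi = make_phi(lambda i: i + 1)  # assumed weights: f_i costs i + 1
rho = make_rho(lambda i: i + 1)
for name, psi in [("h", h), ("phi", phi), ("rho", rho)]:
    print(name, looks_like_complexity_measure(psi), looks_bounded(psi))
```

With these weights, `rho` passes the complexity measure checks but fails the boundedness check (the word $f_{0} f_{0}$ has length 2 and value 1), which illustrates that the third example of a complexity measure need not be a bounded one.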

Definition 11.

Let ψ be a complexity measure. We extend it to the set of all finite subsets of the set P. Let D be a finite subset of the set P. If $D = \emptyset$, then $ψ (D) = 0$. Let $D = \{f_{i_{1}}, \dots, f_{i_{m}}\}$ and $m \geq 1$. Then, $ψ (D) = ψ (f_{i_{1}} \dots f_{i_{m}})$.

2.4. Parameters of Decision Trees and Tables

Let $Γ \in T_{k}$ and $τ = v_{1}, d_{1}, \dots, v_{m}, d_{m}, v_{m + 1}$ be a complete path of $Γ$. We associate with the path $τ$ a word $F (τ) \in B$: if $m = 1$, then $F (τ) = λ$, and if $m > 1$ and, for $j = 2, \dots, m$, the node $v_{j}$ is labeled with the attribute $f_{i_{j}}$, then $F (τ) = f_{i_{2}} \dots f_{i_{m}}$.

Definition 12.

Let ψ be a complexity measure. We extend the function ψ to the set $T_{k}$. Let $Γ \in T_{k}$. Then, $ψ (Γ) = \max \{ψ (F (τ))\}$, where the maximum is taken over all complete paths τ of the decision tree Γ. For a given complexity measure ψ, the value $ψ (Γ)$ will be called the complexity of the decision tree Γ. The value $h (Γ)$ will be called the depth of the decision tree Γ.

Let $ψ$ be a complexity measure. We now describe the functions $ψ^{d}$, $ψ^{a}$, $Sep$, $W_{ψ}$, $S_{ψ}$, $\hat{S}_{ψ}$, $M_{ψ}$, and N defined on the set $M_{k}$ and taking values from the set $ω$. By definition, the value of each of these functions for $Λ$ is equal to 0. Let $T \in M_{k} \setminus \{Λ\}$.

$ψ^{d} (T) = \min \{ψ (Γ)\}$, where the minimum is taken over all deterministic decision trees $Γ$ for the table T.
$ψ^{a} (T) = \min \{ψ (Γ)\}$, where the minimum is taken over all nondeterministic decision trees $Γ$ for the table T.
A set $D \subseteq P (T)$ is called a separating set for the table T if, in the set of columns labeled with attributes from D, rows of the table T are pairwise different. Then, $Sep (T)$ is the minimum cardinality of a separating set for the table T.
$W_{ψ} (T) = ψ (P (T))$. This is the complexity of the set of attributes attached to columns of the table T.
Let $\bar{δ}$ be a row of the table T. Denote $S_{ψ} (T, \bar{δ}) = \min \{ψ (D)\}$, where the minimum is taken over all subsets D of the set $P (T)$ such that in the set of columns of T labeled with attributes from D, the row $\bar{δ}$ is different from all other rows of the table T. Then, $S_{ψ} (T) = \max \{S_{ψ} (T, \bar{δ})\}$, where the maximum is taken over all rows $\bar{δ}$ of the table T.
$\hat{S}_{ψ} (T) = \max \{S_{ψ} (T^{*}) : T^{*} \in [T]\}$.
If $T \in M_{k}^{C}$, then $M_{ψ} (T) = 0$. Let $T \notin M_{k}^{C}$, $|P (T)| = n$, and columns of the table T be labeled with the attributes $f_{t_{1}}, \dots, f_{t_{n}}$. Let $\bar{δ} = (δ_{1}, \dots, δ_{n}) \in E_{k}^{n}$. Denote by $M_{ψ} (T, \bar{δ})$ the minimum number $p \in ω$ for which there exist attributes $f_{t_{i_{1}}}, \dots, f_{t_{i_{m}}} \in P (T)$ such that $T (f_{t_{i_{1}}}, δ_{i_{1}}) \dots (f_{t_{i_{m}}}, δ_{i_{m}}) \in M_{k}^{C}$ and $ψ (f_{t_{i_{1}}} \dots f_{t_{i_{m}}}) = p$. Then, $M_{ψ} (T) = \max \{M_{ψ} (T, \bar{δ}) : \bar{δ} \in E_{k}^{n}\}$.
$N (T)$ is the number of rows in the table T.

For the complexity measure h, we denote $W (T) = W_{h} (T)$, $S (T) = S_{h} (T)$, $\hat{S} (T) = \hat{S}_{h} (T)$, and $M (T) = M_{h} (T)$. Note that $W (T)$ is the number of columns in the table T.

Example 6.

We denote by $T_{0}$ the decision table shown in Figure 1. One can show that $h^{d} (T_{0}) = 2$, $h^{a} (T_{0}) = 2$, $Sep (T_{0}) = 3$, $W (T_{0}) = 3$, $N (T_{0}) = 7$, $S (T_{0}) = 3$, $\hat{S} (T_{0}) = 3$, and $M (T_{0}) = 2$.
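For small tables, several of these parameters can be computed by brute force for the depth measure h. The sketch below is an assumption-laden illustration: a table is a dictionary from rows to decisions, the function names mirror the notation of this section, `depth_d` uses the standard recursion on subtables as one way to obtain $h^{d}$, and the sample table is hypothetical (the table of Figure 1 is not reproduced here).

```python
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Row = Tuple[int, ...]


def separates(row: Row, others: List[Row], cols: Sequence[int]) -> bool:
    """True if `row` differs from every other row somewhere in the chosen columns."""
    return all(any(r[j] != row[j] for j in cols) for r in others)


def S_row(rows: Dict[Row, int], row: Row) -> int:
    """S(T, row) for h: fewest columns separating `row` from all other rows."""
    others = [r for r in rows if r != row]
    width = len(row)
    return next(size for size in range(width + 1)
                for cols in combinations(range(width), size)
                if separates(row, others, cols))


def S(rows: Dict[Row, int]) -> int:
    return max(S_row(rows, r) for r in rows)


def Sep(rows: Dict[Row, int]) -> int:
    """Minimum cardinality of a separating set."""
    width = len(next(iter(rows)))
    return next(size for size in range(width + 1)
                for cols in combinations(range(width), size)
                if len({tuple(r[j] for j in cols) for r in rows}) == len(rows))


def depth_d(rows: Dict[Row, int]) -> int:
    """h^d: minimum depth of a deterministic decision tree (exhaustive recursion)."""
    if len(set(rows.values())) <= 1:
        return 0  # all rows share one decision: a single terminal node suffices
    width = len(next(iter(rows)))
    best = width  # upper bound of Lemma 1 for the measure h
    for j in range(width):
        values = {r[j] for r in rows}
        if len(values) < 2:
            continue  # a column that is constant on the table cannot help
        worst = max(depth_d({r: d for r, d in rows.items() if r[j] == v})
                    for v in values)
        best = min(best, 1 + worst)
    return best


# Hypothetical table over E_2 with three columns and pairwise different decisions.
table = {(0, 0, 0): 0, (1, 0, 0): 1, (0, 1, 0): 2, (0, 0, 1): 3}
print(len(table), Sep(table), S(table), depth_d(table))
```

For this sample table, the row (0, 0, 0) differs from each other row in exactly one column, so all three columns are needed to separate it. The search is exponential in the number of columns and is meant only for experimenting with the definitions.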

3. Main Results

In this section, we consider results obtained for the functions $F_{ψ, A}$, $G_{ψ, A}$, and $H_{ψ, A}$ and discuss closed classes of decision tables generated by information systems.

3.1. Function $F_{ψ, A}$

Let $ψ$ be a bounded complexity measure and A be a nonempty closed class of decision tables from $M_{k}$. We now define a function $F_{ψ, A} : ω \to ω$. Let $n \in ω$. Then

$F_{ψ, A} (n) = \max \{ψ^{d} (T) : T \in A, W_{ψ} (T) \leq n\} .$

The function $F_{ψ, A}$ characterizes the growth in the worst case of the minimum complexity of a deterministic decision tree for a decision table from A with the growth of the complexity of the set of attributes attached to columns of this table.

Let $D = \{n_{i} : i \in ω\}$ be an infinite subset of the set $ω$ in which, for any $i \in ω$, $n_{i} < n_{i + 1}$. Let us define a function $H_{D} : ω \to ω$. Let $n \in ω$. If $n < n_{0}$, then $H_{D} (n) = 0$. If, for some $i \in ω$, $n_{i} \leq n < n_{i + 1}$, then $H_{D} (n) = n_{i}$.
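The step function $H_{D}$ returns the largest element of D not exceeding n, or 0 if there is none. A minimal sketch, assuming that the infinite set D is given by a membership test; the choice of the powers of two as D is purely illustrative:

```python
from typing import Callable


def H_D(n: int, in_D: Callable[[int], bool]) -> int:
    """Largest element of D that is <= n, or 0 if n is below the least element of D."""
    for m in range(n, -1, -1):
        if in_D(m):
            return m
    return 0


def is_power_of_two(m: int) -> bool:
    # Illustrative choice of D = {1, 2, 4, 8, ...}.
    return m > 0 and (m & (m - 1)) == 0


print([H_D(n, is_power_of_two) for n in range(10)])
```

By construction, $H_{D} (n) \leq n$ for every n, and $H_{D} (n) = n$ exactly when $n \in D$. This is why a function squeezed between $H_{D} (n)$ and n is equal to n for infinitely many n.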

Theorem 1.

Let ψ be a bounded complexity measure and A be a nonempty closed class of decision tables from $M_{k}$. Then, $F_{ψ, A}$ is an everywhere defined nondecreasing function such that $F_{ψ, A} (n) \leq n$ for any $n \in ω$ and $F_{ψ, A} (0) = 0$. For this function, one of the following statements holds:

(a): If the functions $S_{ψ}$ and N are bounded from above on class A, then there exists a positive constant $c_{0}$ such that $F_{ψ, A} (n) \leq c_{0}$ for any $n \in ω$ .
(b): If the function $S_{ψ}$ is bounded from above on class A and the function N is not bounded from above on class A, then there exist positive constants $c_{1}$, $c_{2}$, $c_{3}$, $c_{4}$ such that $c_{1} \log_{2} n - c_{2} \leq F_{ψ, A} (n) \leq c_{3} \log_{2} n + c_{4}$ for any $n \in ω \setminus \{0\}$.
(c): If the function $S_{ψ}$ is not bounded from above on class A, then there exists an infinite subset D of the set ω such that $H_{D} (n) \leq F_{ψ, A} (n)$ for any $n \in ω$ .

Thus, for the function $F_{ψ, A}$, we have obtained an exhaustive description of the types of its behavior. Type (a) is degenerate: the number of rows in decision tables from the closed class is bounded from above by a constant. Type (b) is of most interest to us: the complexity of deterministic decision trees behaves in the worst case as the logarithm of the complexity of the set of attributes in the table. Type (c) is not of particular interest: for infinitely many values of n, the complexity of deterministic decision trees in the worst case is the same as the complexity of the set of attributes in the table.

3.2. Function $G_{ψ, A}$

Let $ψ$ be a bounded complexity measure and A be a nonempty closed class of decision tables from $M_{k}$. We now define a function $G_{ψ, A} : ω \to ω$. Let $n \in ω$. Then

$G_{ψ, A} (n) = \max \{ψ^{a} (T) : T \in A, W_{ψ} (T) \leq n\} .$

The function $G_{ψ, A}$ characterizes the growth in the worst case of the minimum complexity of a nondeterministic decision tree for a decision table from A with the growth of the complexity of the set of attributes attached to columns of this table.

Theorem 2.

Let ψ be a bounded complexity measure and A be a nonempty closed class of decision tables from $M_{k}$. Then, $G_{ψ, A}$ is an everywhere defined nondecreasing function such that $G_{ψ, A} (n) \leq n$ for any $n \in ω$ and $G_{ψ, A} (0) = 0$. For this function, one of the following statements holds:

(a): If the function $S_{ψ}$ is bounded from above on class A, then there exists a positive constant c such that $G_{ψ, A} (n) \leq c$ for any $n \in ω$ .
(b): If the function $S_{ψ}$ is not bounded from above on the class A, then there exists an infinite subset D of the set ω such that $H_{D} (n) \leq G_{ψ, A} (n)$ for any $n \in ω$ .

Thus, for the function $G_{ψ, A}$, we have obtained an exhaustive description of the types of its behavior. Type (a) is of most interest to us: the complexity of nondeterministic decision trees is bounded from above by a constant. Type (b) is not of particular interest: for infinitely many values of n, the complexity of nondeterministic decision trees in the worst case is the same as the complexity of the set of attributes in the table.

3.3. Function $H_{ψ, A}$

Let $ψ$ be a bounded complexity measure and A be a nonempty closed class of decision tables from $M_{k}$. We now define a possibly partial function $H_{ψ, A} : ω \to ω$. Let $n \in ω$. If the set $\{ψ^{d} (T) : T \in A, ψ^{a} (T) \leq n\}$ is infinite, then the value $H_{ψ, A} (n)$ is undefined. Otherwise, $H_{ψ, A} (n) = \max \{ψ^{d} (T) : T \in A, ψ^{a} (T) \leq n\}$.

The function $H_{ψ, A}$ characterizes the growth in the worst case of the minimum complexity of a deterministic decision tree for a decision table from A with the growth of the minimum complexity of a nondeterministic decision tree for this table.

Theorem 3.

Let ψ be a bounded complexity measure and A be a nonempty closed class of decision tables from $M_{k}$. Then, $H_{ψ, A} (0) = 0$ and $H_{ψ, A}$ is a nondecreasing function in its domain.

If $H_{ψ, A}$ is not an everywhere defined function, then its domain coincides with the set $\{n : n \in ω, n \leq n_{0}\}$ for some $n_{0} \in ω$.

If the function $H_{ψ, A}$ is everywhere defined, then one of the following statements holds:

(a): If the function $ψ^{d}$ is bounded from above on the class A, then there is a nonnegative constant c such that $H_{ψ, A} (n) \leq c$ for any $n \in ω$ .
(b): If the function $ψ^{d}$ is not bounded from above on the class A, then there exists an infinite subset D of the set ω such that $H_{ψ, A} (n) \geq H_{D} (n)$ for any $n \in ω$ .

Remark 1.

From Theorem 1, it follows that the function $ψ^{d}$ is bounded from above on class A if and only if the functions $S_{ψ}$ and N are bounded from above on class A.

For the function $H_{ψ, A}$, we have obtained a preliminary description of the types of its behavior. Type (a) is degenerate: the number of rows in decision tables from the closed class is bounded from above by a constant. Type (b) is of most interest to us. However, more research is needed to understand how the function can behave within this type.

3.4. Family of Closed Classes of Decision Tables

Let U be a set and $Φ = \{f_{0}, f_{1}, \dots\}$ be a finite or countable set of functions (attributes) defined on U and taking values from $E_{k}$. The pair $(U, Φ)$ is called a k-information system. A problem over $(U, Φ)$ is an arbitrary tuple $z = (U, ν, f_{i_{1}}, \dots, f_{i_{n}})$, where $n \in ω \setminus \{0\}$, $ν : E_{k}^{n} \to ω$, and $f_{i_{1}}, \dots, f_{i_{n}}$ are functions from $Φ$ with pairwise different indices $i_{1}, \dots, i_{n}$. The problem z is to determine the value $ν (f_{i_{1}} (u), \dots, f_{i_{n}} (u))$ for a given $u \in U$. Various examples of k-information systems and problems over these systems can be found in [7].

We denote by $T (z)$ a decision table from $M_{k}$ with n columns labeled with attributes $f_{i_{1}}, \dots, f_{i_{n}}$. A row $(δ_{1}, \dots, δ_{n}) \in E_{k}^{n}$ belongs to the table $T (z)$ if and only if the system of equations $\{f_{i_{1}} (x) = δ_{1}, \dots, f_{i_{n}} (x) = δ_{n}\}$ has a solution from the set U. This row is labeled with the decision $ν (δ_{1}, \dots, δ_{n})$.

Let the algorithms for solving problem z be algorithms in which each elementary operation consists of calculating the value of some attribute from the set $\{f_{i_{1}}, \dots, f_{i_{n}}\}$ on a given element $u \in U$. Then, as a model of the problem z, we can use the decision table $T (z)$, and as models of algorithms for solving the problem z, we can use deterministic and nondeterministic decision trees for the table $T (z)$.

Denote by $Z (U, Φ)$ the set of problems over $(U, Φ)$ and $A (U, Φ) = \{T (z) : z \in Z (U, Φ)\}$. One can show that $A (U, Φ) = [A (U, Φ)]$; i.e., $A (U, Φ)$ is a closed class of decision tables from $M_{k}$ generated by the information system $(U, Φ)$.

Closed classes of decision tables generated by k-information systems are the most natural examples of closed classes. However, the notion of a closed class is essentially wider. In particular, the union $A (U_{1}, Φ_{1}) \cup A (U_{2}, Φ_{2})$, where $(U_{1}, Φ_{1})$ and $(U_{2}, Φ_{2})$ are k-information systems, is a closed class, but generally, we cannot find an information system $(U, Φ)$ such that $A (U, Φ) = A (U_{1}, Φ_{1}) \cup A (U_{2}, Φ_{2})$.

3.5. Example of Information System

Let $\mathbb{R}$ be the set of real numbers and $F = \{f_{i} : i \in ω\}$ be the set of functions defined on $\mathbb{R}$ and taking values from the set $E_{2}$ such that, for any $i \in ω$ and $a \in \mathbb{R}$,

$f_{i} (a) = \begin{cases} 0, & a < i, \\ 1, & a \geq i . \end{cases}$

Let $ψ$ be a bounded complexity measure and $A = A (\mathbb{R}, F)$. One can prove the following statements:

The function N is not bounded from above on the set A.
The function $S_{ψ}$ is bounded from above on the set A if and only if there exists a constant $c_{0} > 0$ such that $ψ (f_{i}) \leq c_{0}$ for any $i \in ω$ .
The function $ψ^{d}$ is not bounded from above on the set A.
The function $H_{ψ, A}$ is everywhere defined if and only if, for any $n \in ω$, the set $\{f_{i} : i \in ω, ψ (f_{i}) \leq n\}$ is finite.
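The first statement can be made concrete by building the table of a problem over this information system. The sketch below is an illustration under stated assumptions: the real line is replaced by a finite sample containing one representative point per realizable row, and the names `table_of_problem` and `representative_sample` are hypothetical helpers introduced here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[int, ...]


def f(i: int, a: float) -> int:
    """Threshold attribute f_i: 0 if a < i, and 1 otherwise."""
    return 0 if a < i else 1


def representative_sample(indices: Sequence[int]) -> List[float]:
    """Assumed finite stand-in for the real line: one point below each threshold, one above all."""
    return [i - 0.5 for i in indices] + [max(indices) + 0.5]


def table_of_problem(indices: Sequence[int], nu: Callable[..., int]) -> Dict[Row, int]:
    """Rows of T(z): value tuples realized by some sampled element, labeled by nu."""
    realized = {tuple(f(i, u) for i in indices) for u in representative_sample(indices)}
    return {row: nu(*row) for row in realized}


for n in (1, 3, 10):
    table = table_of_problem(tuple(range(n)), lambda *row: sum(row))
    print(n, len(table))
```

In this sketch, a problem with the n attributes $f_{0}, \dots, f_{n - 1}$ yields a table with n + 1 rows, so the number of rows grows without bound as n grows, in line with the first statement.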

4. Auxiliary Statements

This section contains auxiliary statements.

It is not difficult to prove the following upper bound on the minimum complexity of deterministic decision trees for a table.

Lemma 1.

For any complexity measure ψ and any table T from $M_{k}$,

$ψ^{d} (T) \leq W_{ψ} (T) .$

The notions of a decision table and a deterministic decision tree used in this paper are somewhat different from the corresponding notions used in [23]. Taking into account these differences, it is easy to prove the following statement, which follows almost directly from Lemma 1.3 and Theorem 2.2 from [23].

Lemma 2.

For any complexity measure ψ and any table T from $M_{k}$,

$ψ^{d} (T) \leq \begin{cases} 0, & M_{ψ} (T) = 0, \\ M_{ψ} (T) \log_{2} N (T), & M_{ψ} (T) \geq 1 . \end{cases}$
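As a rough numeric illustration, the two upper bounds can be compared on the table $T_{0}$ of Example 6 for the depth measure h, using only the values reported there ($M (T_{0}) = 2$, $N (T_{0}) = 7$, $W (T_{0}) = 3$):

```latex
h^{d} (T_{0}) \leq M (T_{0}) \log_{2} N (T_{0}) = 2 \log_{2} 7 \approx 5.61
\quad \text{(Lemma 2)}, \qquad
h^{d} (T_{0}) \leq W (T_{0}) = 3
\quad \text{(Lemma 1)} .
```

Both bounds are consistent with the value $h^{d} (T_{0}) = 2$ stated in Example 6. For such a small table, the bound of Lemma 1 is the tighter of the two; the bound of Lemma 2 becomes useful when the number of columns is large compared with $M_{ψ} (T) \log_{2} N (T)$.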

The following two statements are simple generalizations of similar results obtained in [20] for decision tables with 0–1 decisions. For the sake of completeness, we present their proofs.

Lemma 3.

For any complexity measure ψ and any table T from $M_{k}$,

$M_{ψ} (T) \leq 2 \hat{S}_{ψ} (T) .$

Proof.

Let $T \in M_{k}^{C}$. Then, $M_{ψ} (T) = 0$. Therefore, $M_{ψ} (T) \leq 2 \hat{S}_{ψ} (T)$.

Let $T \notin M_{k}^{C}$, $W (T) = n$, and $f_{t_{1}}, \dots, f_{t_{n}}$ be the attributes attached to columns of the table T. Denote $D = \{f_{i} : f_{i} \in P (T), ψ (f_{i}) \leq S_{ψ} (T)\}$ and $T^{*} = I (P (T) \setminus D, T)$. Evidently, $T^{*} \in [T]$. Taking into account that the function $ψ$ has the nondecreasing property, we obtain that any two rows of T are different in columns labeled with attributes from the set D. For definiteness, let $D = \{f_{t_{1}}, \dots, f_{t_{m}}\}$.

Let $\bar{δ} = (δ_{1}, \dots, δ_{n}) \in E_{k}^{n}$. Let $(δ_{1}, \dots, δ_{m})$ be a row of $T^{*}$. Since $T^{*} \in [T]$, there exist attributes $f_{t_{j_{1}}}, \dots, f_{t_{j_{s}}}$ of the table $T^{*}$ such that the row $(δ_{1}, \dots, δ_{m})$ is different from all other rows of $T^{*}$ in columns labeled with these attributes and $ψ (f_{t_{j_{1}}} \dots f_{t_{j_{s}}}) \leq \hat{S}_{ψ} (T)$. It is clear that $T (f_{t_{j_{1}}}, δ_{j_{1}}) \dots (f_{t_{j_{s}}}, δ_{j_{s}}) \in M_{k}^{C}$. Therefore, $M_{ψ} (T, \bar{δ}) \leq \hat{S}_{ψ} (T)$.

Let

(δ_{1}, \dots, δ_{m})

not be a row of

T^{*}

. We consider m tables

T^{*} (f_{t_{1}}, δ_{1})

,

T^{*} (f_{t_{1}}, δ_{1}) (f_{t_{2}}, δ_{2})

, ⋯,

T^{*} (f_{t_{1}}, δ_{1}) \dots (f_{t_{m}}, δ_{m})

. If

T^{*} (f_{t_{1}}, δ_{1}) = Λ

, then

T (f_{t_{1}}, δ_{1}) = Λ

. Since

f_{t_{1}} \in D

,

ψ (f_{t_{1}}) \leq S_{ψ} (T)

. Taking into account that

S_{ψ} (T) \leq {\hat{S}}_{ψ} (T)

, we obtain

M_{ψ} (T, \bar{δ}) \leq {\hat{S}}_{ψ} (T)

. Let

T^{*} (f_{t_{1}}, δ_{1}) \neq Λ

. Then, there exists

p \in {1, \dots, m - 1}

such that

T^{*} (f_{t_{1}}, δ_{1}) \dots (f_{t_{p}}, δ_{p}) \neq Λ

and

T^{*} (f_{t_{1}}, δ_{1}) \dots (f_{t_{p + 1}}, δ_{p + 1}) = Λ

. Denote

C = {f_{t_{p + 1}}, f_{t_{p + 2}}, \dots, f_{t_{n}}}

and

T^{0} = I (C, T)

. Evidently,

T^{0} \in [T]

. Therefore, in

T^{0}

, there are attributes

f_{t_{i_{1}}}, \dots, f_{t_{i_{l}}}

such that the row

(δ_{1}, \dots, δ_{p})

is different from all other rows of the table

T^{0}

in columns labeled with these attributes and

ψ (f_{t_{i_{1}}} \dots f_{t_{i_{l}}}) \leq {\hat{S}}_{ψ} (T)

. One can show that

T^{*} (f_{t_{i_{1}}}, δ_{i_{1}}) \dots (f_{t_{i_{l}}}, δ_{i_{l}}) = T^{*} (f_{t_{1}}, δ_{1}) \dots (f_{t_{p}}, δ_{p}) .

Therefore,

T^{*} (f_{t_{i_{1}}}, δ_{i_{1}}) \dots (f_{t_{i_{l}}}, δ_{i_{l}}) (f_{t_{p + 1}}, δ_{p + 1}) = Λ

. Hence,

T (f_{t_{i_{1}}}, δ_{i_{1}}) \dots (f_{t_{i_{l}}}, δ_{i_{l}}) (f_{t_{p + 1}}, δ_{p + 1}) = Λ .

Since

f_{t_{p + 1}} \in D

,

ψ (f_{t_{p + 1}}) \leq S_{ψ} (T) \leq {\hat{S}}_{ψ} (T)

. Using the boundedness from above property of the function

ψ

, we obtain

ψ (f_{t_{i_{1}}} \dots f_{t_{i_{l}}} f_{t_{p + 1}}) \leq 2 {\hat{S}}_{ψ} (T)

. Therefore,

M_{ψ} (T, \bar{δ}) \leq 2 {\hat{S}}_{ψ} (T)

.

Thus, for any

\bar{δ} \in E_{k}^{n}

,

M_{ψ} (T, \bar{δ}) \leq 2 {\hat{S}}_{ψ} (T)

. As a result, we obtain

M_{ψ} (T) \leq 2 {\hat{S}}_{ψ} (T)

. □

Lemma 4.

For any table T from $M_{k} \setminus \{Λ\}$, $N(T) \leq (k W(T))^{S(T)}$.

Proof.

If

N (T) = 1

, then

S (T) = 0

and the considered inequality holds. Let

N (T) > 1

. Then,

S (T) > 0

. Denote

m = S (T)

. Evidently, for any row

\bar{δ}

of the table T, there exist attributes

f_{i_{1}}, \dots, f_{i_{m}} \in P (T)

and numbers

σ_{1}, \dots, σ_{m} \in E_{k}

such that the table

T (f_{i_{1}}, σ_{1}) \dots (f_{i_{m}}, σ_{m})

contains only the row

\bar{δ}

. Therefore, there is a one-to-one mapping of rows of the table T onto some set G of pairs of tuples of the kind

((f_{i_{1}}, \dots, f_{i_{m}}), (σ_{1}, \dots, σ_{m}))

where

f_{i_{1}}, \dots, f_{i_{m}} \in P (T)

and

σ_{1}, \dots, σ_{m} \in E_{k}

. Evidently,

|G| \leq W {(T)}^{m} k^{m}

. Therefore,

N (T) \leq {(k W (T))}^{S (T)}

. □
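The bound of Lemma 4 can be checked numerically on a small table. The following is a minimal sketch, assuming that S(T) is the maximum, over rows, of the minimum number of columns in which a row differs from all other rows (the reading used in the proof above). The helper names `min_separating_size` and `s_param` and the example table are hypothetical and not part of the paper.

```python
from itertools import combinations


def min_separating_size(rows, target):
    """Fewest columns in which `target` differs from every other row."""
    others = [r for r in rows if r != target]
    n_cols = len(target)
    for size in range(n_cols + 1):
        for cols in combinations(range(n_cols), size):
            # `target` is isolated if each other row disagrees somewhere in `cols`.
            if all(any(r[c] != target[c] for c in cols) for r in others):
                return size
    raise ValueError("rows must be pairwise distinct")


def s_param(rows):
    """Assumed reading of S(T): worst case of the quantity above over all rows."""
    return max(min_separating_size(rows, r) for r in rows)


k = 2
rows = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # hypothetical table over E_2
n_rows, width, s = len(rows), len(rows[0]), s_param(rows)
bound = (k * width) ** s  # right-hand side of Lemma 4
print(n_rows, width, s, bound)
```

For a one-row table the search succeeds with the empty set of columns, which matches the case N(T) = 1, S(T) = 0 treated at the start of the proof.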

Lemma 5.

For any table T from $M_{k} \setminus \{Λ\}$, there exists a mapping $ν : E_{k}^{W(T)} \to ω$ such that $h^{d}(J(ν, T)) \geq \log_{k} N(T)$.

Proof.

Let

T \in M_{k} \ {Λ}

and

ν : E_{k}^{W (T)} \to ω

be a mapping for which

ν (\bar{δ}) \neq ν (\bar{σ})

for any

\bar{δ}, \bar{σ} \in E_{k}^{W (T)}

such that

\bar{δ} \neq \bar{σ}

. Denote

T^{*} = J (ν, T)

. Let

Γ

be a deterministic decision tree for the table

T^{*}

such that

h (Γ) = h^{d} (T^{*})

. Denote by

L_{t} (Γ)

the number of terminal nodes of

Γ

. Evidently,

N (T) \leq L_{t} (Γ)

. One can show that

L_{t} (Γ) \leq k^{h (Γ)}

. Therefore,

N (T) \leq k^{h (Γ)}

. Since

T \neq Λ

,

N (T) > 0

. Hence,

h (Γ) \geq {log}_{k} N (T)

. Taking into account that

h (Γ) = h^{d} (T^{*})

, we obtain

h^{d} (T^{*}) \geq {log}_{k} N (T)

. □
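The counting argument in this proof can be illustrated by brute force. The sketch below assumes the usual recursive characterization of the minimum depth of a deterministic decision tree (query one column, branch on its k values, recurse on the subtables); the function `min_depth` and the example table are hypothetical. The decisions are pairwise distinct, which plays the role of the injective mapping ν.

```python
from functools import lru_cache
from math import log


def min_depth(rows, decisions, k):
    """Minimum depth of a deterministic decision tree, found by exhaustive search."""
    n_cols = len(rows[0]) if rows else 0

    @lru_cache(maxsize=None)
    def solve(idx):
        # A subtable with at most one distinct decision needs no further queries.
        if len({decisions[i] for i in idx}) <= 1:
            return 0
        best = n_cols  # querying every column always suffices
        for c in range(n_cols):
            parts = [tuple(i for i in idx if rows[i][c] == v) for v in range(k)]
            if any(len(p) == len(idx) for p in parts):
                continue  # column c does not split this subtable
            best = min(best, 1 + max(solve(p) for p in parts))
        return best

    return solve(tuple(range(len(rows))))


k = 3
rows = [(0, 0), (0, 1), (1, 2), (2, 0), (2, 2)]  # hypothetical table over E_3
decisions = list(range(len(rows)))  # injective labeling, as in the proof
h = min_depth(rows, decisions, k)
print(h, log(len(rows), k))
```

Since one query can distinguish at most k rows, five rows with distinct decisions cannot be handled at depth 1 when k = 3, in line with the bound.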

It is not difficult to prove the following upper bound on the minimum cardinality of a separating set for a table by induction on the number of rows in the table.

Lemma 6.

For any table T from $M_{k} \setminus \{Λ\}$, $\mathit{Sep}(T) \leq N(T) - 1$.
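A small experiment shows that the bound of Lemma 6 can be attained. The sketch assumes that a separating set is a set of columns on which all rows of the table are pairwise different, and that Sep(T) is the minimum cardinality of such a set; the function `sep` and the example table are hypothetical.

```python
from itertools import combinations


def sep(rows):
    """Minimum number of columns on which all rows are pairwise different."""
    n_cols = len(rows[0])
    for size in range(n_cols + 1):
        for cols in combinations(range(n_cols), size):
            projected = {tuple(r[c] for c in cols) for r in rows}
            if len(projected) == len(rows):  # no two rows collapsed together
                return size
    raise ValueError("rows must be pairwise distinct")


# Hypothetical table: each nonzero row differs from the zero row in one column only,
# so no column can be dropped.
rows = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(sep(rows), len(rows) - 1)
```

In this example, removing any column merges the zero row with one of the other rows, so all three columns are needed for four rows.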

Lemma 7.

For any complexity measure ψ and any table T from $M_{k}$, $ψ^{a}(T) \leq ψ^{d}(T)$.

Proof.

Let

T \in M_{k}

. If

T = Λ

, then

ψ^{a} (T) = ψ^{d} (T) = 0

. Let

T \in M_{k} \ {Λ}

. It is clear that each deterministic decision tree for the table T is a nondeterministic decision tree for the table T. Therefore,

ψ^{a} (T) \leq ψ^{d} (T)

. □

Lemma 8.

For any complexity measure ψ and any table T from $M_{k}$ that contains at least two rows, there exists a table $T^{*} \in [T]$ such that $ψ^{a}(T^{*}) = ψ^{d}(T^{*}) = W_{ψ}(T^{*}) = S_{ψ}(T^{*}) = S_{ψ}(T)$.

Proof.

Let

\bar{δ}

be a row of the table T such that

S_{ψ} (T, \bar{δ}) = S_{ψ} (T)

. Let D be a subset of the set

P (T)

with the minimum cardinality such that

ψ (D) = S_{ψ} (T, \bar{δ})

and in the set of columns labeled with attributes from D, the row

\bar{δ}

is different from all other rows of the table T. Let

\bar{σ}

be the tuple obtained from the row

\bar{δ}

by the removal of all numbers that are in the intersection with columns labeled with attributes from the set

P (T) \ D

. Let

ν : E_{k}^{|D|} \to E_{2}

and, for any

\bar{γ} \in E_{k}^{|D|}

, if

\bar{γ} = \bar{σ}

, then

ν (\bar{γ}) = 1

and if

\bar{γ} \neq \bar{σ}

, then

ν (\bar{γ}) = 0

. Denote

T^{*} = J (ν, I (P (T) \ D, T))

.

From the fact that D has the minimum cardinality and from the properties of the function

ψ

, it follows that for any attribute from the set D, there exists a row of T that differs from the row

\bar{δ}

only in the column labeled with the considered attribute among attributes from D. Therefore, for any attribute of the table

T^{*}

, there exists a row of

T^{*}

, which is different from the row

\bar{σ}

only in the column labeled with this attribute. Thus,

S_{ψ} (T^{*}, \bar{σ}) = W_{ψ} (T^{*}) .

(2)

Using properties of the function

ψ

, we obtain

S_{ψ} (T^{*}) \leq W_{ψ} (T^{*})

. From this inequality and from (2), it follows that

S_{ψ} (T^{*}) = W_{ψ} (T^{*}) .

(3)

Let

Γ

be a nondeterministic decision tree for the table

T^{*}

such that

ψ (Γ) = ψ^{a} (T^{*})

,

τ

be a complete path of

Γ

such that the row

\bar{σ}

belongs to the table

T^{*} (τ)

, and

F (τ) = f_{i_{1}} \dots f_{i_{t}}

. It is clear that

\bar{σ}

is the only row of the table

T^{*} (τ)

. Therefore, in columns labeled with attributes

f_{i_{1}}, \dots, f_{i_{t}}

the row

\bar{σ}

is different from all other rows of the table

T^{*}

. Thus,

ψ (F (τ)) \geq S_{ψ} (T^{*}, \bar{σ}) = W_{ψ} (T^{*})

. Therefore,

ψ (Γ) \geq W_{ψ} (T^{*})

and

ψ^{a} (T^{*}) \geq W_{ψ} (T^{*})

. Using Lemmas 1 and 7, we obtain

ψ^{a} (T^{*}) \leq ψ^{d} (T^{*}) \leq W_{ψ} (T^{*})

. Therefore,

ψ^{a} (T^{*}) = ψ^{d} (T^{*}) = W_{ψ} (T^{*}) .

(4)

By the choice of the set D,

W_{ψ} (T^{*}) = S_{ψ} (T)

. From this equality and from (3) and (4), it follows that

ψ^{a} (T^{*}) = ψ^{d} (T^{*}) = W_{ψ} (T^{*}) = S_{ψ} (T^{*}) = S_{ψ} (T)

. □
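The construction of the table T* in this proof can be traced on a concrete example. The sketch below is written for the depth measure only (each attribute has weight 1), so a set D of minimum cardinality is simply a smallest set of columns isolating the chosen row. The helpers `min_separating_cols` and `build_t_star` and the example table are hypothetical.

```python
from itertools import combinations


def min_separating_cols(rows, target):
    """A smallest set of columns in which `target` differs from every other row."""
    others = [r for r in rows if r != target]
    for size in range(len(target) + 1):
        for cols in combinations(range(len(target)), size):
            if all(any(r[c] != target[c] for c in cols) for r in others):
                return cols
    raise ValueError("rows must be pairwise distinct")


def build_t_star(rows):
    """Keep only the columns D of a hardest row and give that row decision 1."""
    # Hardest row: the one whose smallest separating column set is largest.
    delta = max(rows, key=lambda r: len(min_separating_cols(rows, r)))
    cols = min_separating_cols(rows, delta)
    sigma = tuple(delta[c] for c in cols)
    # Removing columns may merge rows, hence the set.
    projected = sorted({tuple(r[c] for c in cols) for r in rows})
    return [(p, int(p == sigma)) for p in projected], sigma


rows = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 1)]  # hypothetical table over E_2
t_star, sigma = build_t_star(rows)
width = len(sigma)
# Every column of T* has a row that differs from sigma in that column only.
critical = all(
    any(all((p[c] != sigma[c]) == (c == col) for c in range(width)) for p, _ in t_star)
    for col in range(width)
)
print(t_star, sigma, critical)
```

The final check corresponds to the step of the proof that yields equality (2): since each column of T* is the only witness against some row, the row σ cannot be isolated with fewer than all columns of T*.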

It is not difficult to prove the following upper bounds on the minimum complexity of nondeterministic decision trees for a table.

Lemma 9.

For any complexity measure ψ and any table T from $M_{k}$, $ψ^{a}(T) \leq S_{ψ}(T) \leq W_{ψ}(T)$.
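Since a nondeterministic decision tree can be interpreted as a set of true decision rules covering all rows, the first inequality of Lemma 9 can be illustrated by building one rule per row from a smallest set of columns that isolates the row. This is a minimal sketch for the depth measure; the function `shortest_rule`, the rule encoding as (column, value) pairs, and the example table are assumptions made for illustration.

```python
from itertools import combinations


def shortest_rule(rows, decisions, i):
    """Shortest list of (column, value) conditions satisfied by row i only."""
    target = rows[i]
    for size in range(len(target) + 1):
        for cols in combinations(range(len(target)), size):
            matches = [j for j, r in enumerate(rows)
                       if all(r[c] == target[c] for c in cols)]
            if matches == [i]:
                # The rule is true: the only row it accepts has this decision.
                return [(c, target[c]) for c in cols], decisions[i]
    raise ValueError("rows must be pairwise distinct")


rows = [(0, 0, 1), (0, 1, 0), (1, 1, 1)]  # hypothetical table over E_2
decisions = [0, 1, 1]
rules = [shortest_rule(rows, decisions, i) for i in range(len(rows))]
max_len = max(len(conds) for conds, _ in rules)
print(rules, max_len)
```

Rules obtained this way are not necessarily the shortest true rules, because a true rule only needs all accepted rows to share one decision. This is why the lemma gives an upper bound rather than an equality.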

5. Proofs of Theorems 1, 2 and 3

Proof of Theorem 1.

Since A is a closed class,

Λ \in A

. By definition,

W_{ψ} (Λ) = 0

. Using this fact and Lemma 1, we obtain that

F_{ψ, A}

is an everywhere defined function and

F_{ψ, A} (n) \leq n

for any

n \in ω

. Evidently,

F_{ψ, A}

is a nondecreasing function. Let

T \in A

and

W_{ψ} (T) \leq 0

. Using the positivity property of the function

ψ

, we obtain

T = Λ

. Therefore,

F_{ψ, A} (0) = 0

.

(a) Let the functions

S_{ψ}

and N be bounded from above on the class A. Then, there are constants

a \geq 1

and

b \geq 2

such that

S_{ψ} (T) \leq a

and

N (T) \leq b

for any table

T \in A

. Let

T \in A

. Taking into account that A is a closed class, we obtain

{\hat{S}}_{ψ} (T) \leq a

. By Lemma 3,

M_{ψ} (T) \leq 2 a

. From this inequality, the inequality

N (T) \leq b

and from Lemma 2, it follows that

ψ^{d} (T) \leq 2 a {log}_{2} b

. Denote

c_{0} = 2 a {log}_{2} b

. Taking into account that T is an arbitrary table from the class A, we obtain that

F_{ψ, A} (n) \leq c_{0}

for any

n \in ω

.

(b) Let the function

S_{ψ}

be bounded from above on the class A and the function N not be bounded from above on A. Then, there exists a constant

a \geq 2

such that for any table

Q \in A

,

S_{ψ} (Q) \leq a .

(5)

Let

n \in ω \ {0}

and T be an arbitrary table from A such that

W_{ψ} (T) \leq n

. If

T \in M_{k} C

, then, evidently,

ψ^{d} (T) = 0

. Let

T \notin M_{k} C

. Using the boundedness from below property of the function

ψ

, we obtain

W (T) \leq n

and

S (T) \leq a

. From these inequalities and Lemma 4, it follows that

N (T) \leq {(k n)}^{a}

. From (5) and Lemma 3, it follows that

M_{ψ} (T) \leq 2 a

. Using the last two inequalities and Lemma 2, we obtain

ψ^{d} (T) \leq 2 a^{2} {log}_{2} n + 2 a^{2} {log}_{2} k .

(6)
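For convenience, inequality (6) can be unpacked as follows, using only Lemma 2 (the case where M_ψ(T) ≥ 1; otherwise ψ^d(T) = 0 and the bound is trivial), the bound M_ψ(T) ≤ 2a, and the bound N(T) ≤ (kn)^a obtained above:

```latex
\begin{aligned}
ψ^{d}(T) &\leq M_{ψ}(T) \log_{2} N(T) \\
         &\leq 2a \log_{2} (kn)^{a} \\
         &= 2a^{2} (\log_{2} n + \log_{2} k) \\
         &= 2a^{2} \log_{2} n + 2a^{2} \log_{2} k .
\end{aligned}
```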

Denote

c_{3} = 2 a^{2}

and

c_{4} = 2 a^{2} {log}_{2} k

. Then,

ψ^{d} (T) \leq c_{3} {log}_{2} n + c_{4}

. Taking into account that n is an arbitrary number from

ω \ {0}

and T is an arbitrary table from A such that

W_{ψ} (T) \leq n

, we obtain that for any

n \in ω \ {0}

,

F_{ψ, A} (n) \leq c_{3} {log}_{2} n + c_{4} .

(7)

Let

n \in ω \ {0}

. Denote

c_{1} = 1 / {log}_{2} k

and

c_{2} = {log}_{k} a

. We now show that

F_{ψ, A} (n) \geq c_{1} {log}_{2} n - c_{2} .

(8)

Denote

m = ⌊n / a⌋

. Taking into account that the function N is not bounded from above on the class A, we obtain that there exists a table

T \in A

such that

N (T) \geq k^{m}

. Let C be a separating set for the table T with the minimum cardinality such that

ψ (f_{i}) \leq a

for any

f_{i} \in C

. The existence of such a set follows from inequality (5) and from the commutativity and nondecreasing properties of the function

ψ

. Evidently,

|C| \geq m

. Let D be a subset of the set C such that

|D| = m

. Denote

T^{0} = I (P (T) \ D, T)

. One can show that for any attribute of

T^{0}

, there are two rows of the table

T^{0}

that differ only in the column labeled with this attribute. Therefore, D is a separating set for the table

T^{0}

with the minimum cardinality. By Lemma 6,

N (T^{0}) \geq m + 1 \geq n / a

.

Using Lemma 5, we obtain that there exists a mapping

ν : E_{k}^{m} \to ω

such that

h^{d} (J (ν, T^{0})) \geq {log}_{k} (n / a) .

Denote

T^{*} = J (ν, T^{0})

. Since

ψ

is a bounded complexity measure,

ψ^{d} (T^{*}) \geq {log}_{k} (n / a)

. Using the boundedness from above property of the function

ψ

, we obtain

W_{ψ} (T^{*}) \leq ⌊n / a⌋ a \leq n

. Hence, the inequality (8) holds. From (7) and (8), it follows that

c_{1} {log}_{2} n - c_{2} \leq F_{ψ, A} (n) \leq c_{3} {log}_{2} n + c_{4}

for any

n \in ω \ {0}

.
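The constants c_1 and c_2 are chosen so that the right-hand side of (8) is exactly the bound obtained for the table T*. Spelled out, with T* belonging to A because A is a closed class and with W_ψ(T*) ≤ n:

```latex
\begin{aligned}
c_{1} \log_{2} n - c_{2}
  &= \frac{\log_{2} n}{\log_{2} k} - \log_{k} a
   = \log_{k} n - \log_{k} a
   = \log_{k} (n / a), \\
F_{ψ, A}(n) &\geq ψ^{d}(T^{*}) \geq \log_{k} (n / a) = c_{1} \log_{2} n - c_{2} .
\end{aligned}
```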

(c) Let the function

S_{ψ}

not be bounded from above on the class A. Using Lemma 8, we obtain that the set

D = {W_{ψ} (T) : T \in A, ψ^{d} (T) = W_{ψ} (T)}

is infinite. Since the class A is closed,

Λ \in A

, and since

ψ^{d} (Λ) = W_{ψ} (Λ) = 0

,

0 \in D

. Evidently, for any

n \in D

,

F_{ψ, A} (n) \geq n

. Taking into account that

F_{ψ, A}

is a nondecreasing function, we obtain that

F_{ψ, A} (n) \geq H_{D} (n)

for any

n \in ω \ {0}

. □

Proof of Theorem 2.

Since A is a closed class,

Λ \in A

. By definition,

W_{ψ} (Λ) = 0

. Using this fact and Lemma 9, we obtain that

G_{ψ, A}

is an everywhere defined function and

G_{ψ, A} (n) \leq n

for any

n \in ω

. Evidently,

G_{ψ, A}

is a nondecreasing function. Let

T \in A

and

W_{ψ} (T) \leq 0

. Using the positivity property of the function

ψ

, we obtain

T = Λ

. Therefore,

G_{ψ, A} (0) = 0

.

(a) Let the function

S_{ψ}

be bounded from above on the class A. Then, there is a constant

c > 0

such that

S_{ψ} (T) \leq c

for any

T \in A

. By Lemma 9,

ψ^{a} (T) \leq S_{ψ} (T) \leq c

for any

T \in A

. Therefore, for any

n \in ω

,

G_{ψ, A} (n) \leq c

.

(b) Let the function

S_{ψ}

not be bounded from above on the class A. Using Lemma 8, we obtain that the set

D = {W_{ψ} (T) : T \in A, ψ^{a} (T) = W_{ψ} (T)}

is infinite. Since the class A is closed,

Λ \in A

, and since

ψ^{a} (Λ) = W_{ψ} (Λ) = 0

,

0 \in D

. Evidently, for any

n \in D

,

G_{ψ, A} (n) \geq n

. Taking into account that

G_{ψ, A}

is a nondecreasing function, we obtain that

G_{ψ, A} (n) \geq H_{D} (n)

for any

n \in ω \ {0}

. □

Proof of Theorem 3.

Let

n \in ω

and both n and

n + 1

belong to the domain of the function

H_{ψ, A}

. Immediately from the definition of this function, it follows that

H_{ψ, A} (n) \leq H_{ψ, A} (n + 1)

. Let

T \in A

and

ψ^{a} (T) \leq 0

. Using the positivity property of the function

ψ

, one can show that

T \in M_{k} C

. From Lemma 2, it follows that

ψ^{d} (T) = 0

. Therefore,

H_{ψ, A} (0) = 0

.

Let

H_{ψ, A}

not be an everywhere defined function, and let m be the minimum number from

ω

such that the value

H_{ψ, A} (m)

is not defined. It is clear that

m > 0

and the value

H_{ψ, A} (n)

is not defined for each

n \in ω

such that

n \geq m

. Denote

n_{0} = m - 1

. Then, the domain of the function

H_{ψ, A}

is equal to

{n : n \in ω, n \leq n_{0}}

.

We now consider the case when

H_{ψ, A}

is an everywhere defined function.

(a) Let there exist a nonnegative constant

c

such that

ψ^{d} (T) \leq c

for any table

T \in A

. Then, evidently,

H_{ψ, A} (n) \leq c

for any

n \in ω

.

(b) Let there be no nonnegative constant c such that

ψ^{d} (T) \leq c

for any table

T \in A

. Let us assume that there exists a nonnegative constant

d

such that

ψ^{a} (T) \leq d

for any table

T \in A

. Then, the value

H_{ψ, A} (d)

is not defined, but this is impossible. Therefore, the set

D = {ψ^{a} (T) : T \in A}

is infinite. Since

Λ \in A

,

0 \in D

. By Lemma 7,

ψ^{a} (T) \leq ψ^{d} (T)

for any table

T \in A

. Hence,

H_{ψ, A} (n) \geq n

for any

n \in D

. Taking into account that

H_{ψ, A}

is a nondecreasing function, we obtain that

H_{ψ, A} (n) \geq H_{D} (n)

for any

n \in ω

. □

6. Conclusions

In this paper, we studied the complexity of deterministic and nondeterministic decision trees for tables from closed classes of conventional decision tables. The obtained results allow us to point out the cases when the complexity of deterministic and nondeterministic decision trees is essentially less than the complexity of the set of attributes attached to columns of the table. This may be useful in applications. Future research will be devoted to a more in-depth study of relationships between the complexity of deterministic and nondeterministic decision trees for conventional decision tables from closed classes.

Author Contributions

Conceptualization, A.O. and M.M.; methodology, A.O. and M.M.; validation, A.O.; formal analysis, A.O. and M.M.; investigation, A.O.; resources, A.O. and M.M.; writing—original draft preparation, A.O. and M.M.; writing—review and editing, A.O. and M.M.; visualization, A.O.; supervision, M.M.; funding acquisition, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by King Abdullah University of Science and Technology.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

Research reported in this publication was supported by King Abdullah University of Science and Technology (KAUST). The authors are grateful to the anonymous reviewers for useful remarks and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Wadsworth and Brooks: Monterey, CA, USA, 1984.
2. Chikalov, I.; Lozin, V.V.; Lozina, I.; Moshkov, M.; Nguyen, H.S.; Skowron, A.; Zielosko, B. Three Approaches to Data Analysis—Test Theory, Rough Sets and Logical Analysis of Data; Intelligent Systems Reference Library; Springer: Berlin/Heidelberg, Germany, 2013; Volume 41.
3. Fürnkranz, J.; Gamberger, D.; Lavrac, N. Foundations of Rule Learning; Cognitive Technologies; Springer: Berlin/Heidelberg, Germany, 2012.
4. Pawlak, Z. Rough Sets—Theoretical Aspects of Reasoning about Data; Theory and Decision Library: Series D; Kluwer: Alphen aan den Rijn, The Netherlands, 1991; Volume 9.
5. Quinlan, J.R. C4.5: Programs for Machine Learning; Morgan Kaufmann: Burlington, MA, USA, 1993.
6. Rokach, L.; Maimon, O. Data Mining with Decision Trees—Theory and Applications; Series in Machine Perception and Artificial Intelligence; World Scientific: Singapore, 2007; Volume 69.
7. Moshkov, M. Time Complexity of Decision Trees. Trans. Rough Sets 2005, 3, 244–459.
8. Moshkov, M.; Zielosko, B. Combinatorial Machine Learning—A Rough Set Approach; Studies in Computational Intelligence; Springer: Berlin/Heidelberg, Germany, 2011; Volume 360.
9. Moshkov, M. Comparative Analysis of Deterministic and Nondeterministic Decision Trees; Intelligent Systems Reference Library; Springer: Cham, Switzerland, 2020; Volume 179.
10. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A. Logical analysis of numerical data. Math. Program. 1997, 79, 163–190.
11. Boros, E.; Hammer, P.L.; Ibaraki, T.; Kogan, A.; Mayoraz, E.; Muchnik, I.B. An Implementation of Logical Analysis of Data. IEEE Trans. Knowl. Data Eng. 2000, 12, 292–306.
12. Pawlak, Z.; Skowron, A. Rudiments of rough sets. Inf. Sci. 2007, 177, 3–27.
13. Molnar, C. Interpretable Machine Learning. A Guide for Making Black Box Models Explainable, 2nd ed.; Independent Publishers: Chicago, IL, USA, 2022.
14. Blum, M.; Impagliazzo, R. Generic Oracles and Oracle Classes (Extended Abstract). In Proceedings of the 28th Annual Symposium on Foundations of Computer Science, Los Angeles, CA, USA, 27–29 October 1987; IEEE Computer Society: Washington, DC, USA, 1987; pp. 118–126.
15. Buhrman, H.; de Wolf, R. Complexity measures and decision tree complexity: A survey. Theor. Comput. Sci. 2002, 288, 21–43.
16. Hartmanis, J.; Hemachandra, L.A. One-way functions, robustness, and the non-isomorphism of NP-complete sets. In Proceedings of the Second Annual Conference on Structure in Complexity Theory, Cornell University, Ithaca, NY, USA, 16–19 June 1987; IEEE Computer Society: Washington, DC, USA, 1987.
17. Tardos, G. Query complexity, or why is it difficult to separate NP^A∩coNP^A from P^A by random oracles A? Combinatorica 1989, 9, 385–392.
18. Moshkov, M. On depth of conditional tests for tables from closed classes. In Combinatorial-Algebraic and Probabilistic Methods of Discrete Analysis; Markov, A.A., Ed.; Gorky University Press: Gorky, Russia, 1989; pp. 78–86. (In Russian)
19. Ostonov, A.; Moshkov, M. Comparative analysis of deterministic and nondeterministic decision trees for decision tables from closed classes. arXiv 2023, arXiv:2304.10594.
20. Ostonov, A.; Moshkov, M. Deterministic and strongly nondeterministic decision trees for decision tables from closed classes. arXiv 2023, arXiv:2305.06093.
21. Post, E. Two-Valued Iterative Systems of Mathematical Logic; Annals of Mathematics Studies; Princeton University Press: Princeton, NJ, USA, 1941; Volume 5.
22. Robertson, N.; Seymour, P.D. Graph Minors. XX. Wagner’s conjecture. J. Comb. Theory, Ser. B 2004, 92, 325–357.
23. Moshkov, M. Conditional tests. In Problemy Kibernetiki; Yablonskii, S.V., Ed.; Nauka Publishers: Moscow, Russia, 1983; Volume 40, pp. 131–170. (In Russian)

Figure 1. Decision table from $M_{2}$.


Figure 2. Decision table obtained from the decision table shown in Figure 1 by removal of a column and changing of decisions.

Figure 3. A deterministic decision tree for the decision table shown in Figure 1.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
