1. Introduction
The problem of coded caching introduced in [
1] plays a crucial role in reducing peak hour traffic in networks. In a coded caching scheme, a part of the content is made available in local cache of users, so that traffic can be reduced at peak hours. Coded caching scheme involves two phases: a placement phase and a delivery phase. In the placement phase or the prefetching phase, which is performed during off-peak times, the entire database is made available to each user. Users fill their cache with the available data. Delivery phase is carried out once the demands are revealed by the users. During prefetching, some parts of files have to be judiciously cached at each user in such a way that the rate of transmission is reduced during the delivery phase. The prefetching can be done with or without coding. If during prefetching, no coding of parts of files is done, the prefetching scheme is referred to as uncoded prefetching [
1,
2]. If coding is done during prefetching stage, then the prefetching scheme is referred to as coded prefetching [
3,
4,
5,
6,
7].
The seminal work in [
1] shows that apart from the local caching gains obtained by placing contents at user caches before the demands are revealed, a global caching gain can be obtained by coded transmissions. This scheme is extended to decentralized scheme in [
8]. The caching problem is extended to the case of non-uniform demands [
9], online coded caching [
10], hierarchical caching [
11], and device-to-device caching [
12]. Cache management scheme which incorporates decoding information in making cache replacement decisions was studied in [
13]. A decentralized secure coded caching approach was proposed in [
14], in which nodes only transmit coded files to avoid eavesdropper wiretapping and protect the user contents.
An error correcting delivery scheme is required if the shared bottleneck link between the server and the users is error-prone. The minimum average rate and minimum peak rate of error correcting delivery schemes is characterized in [
15]. The placement phase is assumed to be error-free. This assumption can be justified as during the placement phase there is no bandwidth constraint and any number of re-transmissions can be done to make the placement error-free. A similar model in which the delivery phase takes place over a packet erasure broadcast channel was considered in [
16,
17,
18,
19].
In this paper, we consider the caching schemes considered in [
3,
4]. The prefetching scheme employed in these papers is coded, i.e., parts of files are coded and placed in the user caches. Optimal linear error correcting delivery schemes are proposed for these prefetching schemes. The optimal error correcting delivery scheme refers to an error correcting delivery scheme, which uses the minimum number of transmissions for a given placement. The main contributions of this paper are as follows.
The prefetching scheme proposed in [
3] is considered for the case where the number of users is the same as the number of files. The minimum number of transmissions required for correcting finite number of transmission errors is obtained for this case (
Section 3.1).
When the number of users is greater than the number of files, a different prefetching scheme is employed in [
3]. For this prefetching strategy, the minimum number of transmissions required for correcting finite number of transmission errors is obtained (
Section 3.2).
A linear error correcting delivery scheme for coded caching problem with coded prefetching for small buffer sizes is proposed. The expressions for average rate and peak rate of this error correcting delivery scheme are found (
Section 4).
The caching scheme in [
4] is considered for the same cache memory region. The advantage here is that the sub-packetization requirement is low. Sub-packetization level refers to number of subfiles that each file is split into. An optimal linear error correcting delivery scheme is found for the prefetching scheme employed in [
4] (
Section 5).
Parts of the content of this manuscript are present in [
20]. The additional content in this manuscript is the construction of optimal linear error correcting delivery scheme for the prefetching scheme employed in [
4] with reduced sub-packetization (
Section 5). The proofs of Theorem 1 and Theorem 2 are also not present in [
20].
A work that is closely related to this work is present in [
15] and its extended version [
21]. In [
21], error correction is considered for a particular class of uncoded prefetching, namely symmetric batch prefetching. The major distinctions of this paper with the work in [
21] are as follows. This is the first paper in the literature that considers error correction for coded caching schemes with coded prefetching. On the other hand, the work in [
21] involves uncoded prefetching. For a fixed demand, the delivery scheme of a coded caching problem with uncoded prefetching is an index coding problem [
22]. When the prefetching is coded, it corresponds to a generalized index coding problem [
23]. To derive bounds in [
21], the concepts of index coding are used and, to derive the bounds in this paper, we use techniques from generalized index coding. This paper is the first to use the concepts of generalized index coding to derive bounds for coded caching problems.
In this paper, denotes the finite field with q elements, where q is prime power, and denotes the set of all non-zero elements of . For any positive integer K, denotes the set . For a matrix , denotes its ith row. For vector spaces , denotes that is a subspace of . A linear code over is a k-dimensional subspace of with minimum Hamming distance d. The vectors that belong to the subspace are called codewords. A matrix of size whose rows are linearly independent codewords of is called a generator matrix of . Thus, a linear code can be represented using its generator matrix as, denotes the length of the shortest linear code over , which has dimension k and minimum distance d.
2. Preliminaries and Background
Results from error correction for index coding with coded side information are used to obtain error correcting delivery schemes for the caching problem. In this section we recall results from error correction for index coding with coded side information introduced in [
24]. We also review the coded caching scheme with coded prefetching proposed in [
3,
4], for which optimal error correcting delivery schemes are presented in the following sections (
Section 3,
Section 4, and
Section 5).
2.1. Generalized Index Coding Problem and Error Correction
The index coding (IC) problem with side-information was introduced by Birk and Kol [
22]. A source broadcasts messages through a noiseless shared channel to multiple receivers, each demanding certain messages and knowing some other messages a priori as side-information. The source needs to meet the demands of each receiver in a minimum number of transmissions. The minimum number of transmissions required to meet the demands of all the receivers is termed as the optimal length of the index coding problem. In [
23] and [
25], a generalization of the index coding problem was discussed, where the demands of the receivers and the side-information are linear combinations of the messages. In [
25], the authors refer to this class of problems as Generalized Index Coding (GIC) problems.
An instance,
of a GIC problem is described formally, as follows. There is a message vector
and there are
m receivers. The
ith receiver demands a linear combination of the messages
, for some
, where
is the request vector and
is the request packet of the
ith receiver. The side-information is represented by a matrix
, where
is the number of packets possessed as side-information by the
ith receiver. The packets that are known as side-information possessed by the
ith receiver are
. Let
denote the row space of
. Receiver
i can also compute the linear combination of messages
. Let
be an
matrix over
having
as its
ith row. The matrix
represents the demands of all the
m receivers. In the definition of GIC problem in [
25] the source is assumed to possess only certain linear combinations of messages. In our work, it is assumed that all of the messages are independent and the source possesses all of them.
The min-rank of an instance
of the GIC problem over
is defined as
This is motivated by the fact that any encoding matrix for a GIC problem is of the form
. It is shown in [
24] that the min-rank is the optimal length of linear generalized index code. Intuitively, min-rank can be viewed as the minimum communication load among all of the delivery schemes with linear decoding functions, where each
corresponds to the coefficients for the locally cached contents.
For each
, the set
is defined as
Thus,
is a collection of vectors that belong to the null space of
and satisfy
. The set of vector spaces
is defined as
. The set
is the collection of subspaces such that the set of non-zero vectors of any subspace in the collection is a subset of
. The maximum value of dimension among the dimensions of all the elements of
is called the generalized independence number, denoted by
. Thus, the dimension of any subspace of
in
serves as a lower bound for
. The generalized independence number can be viewed as a cut set type of argument when the message lives in a subspace where all of the cached contents are zero. It was shown in [
24] that the min-rank serves as an upper bound for the generalized independence number, i.e.,
Generalized index coding problems were classified in [
26]. In generalized index coding with coded side-information problems, the demand of every receiver is uncoded but the side-information is coded. In generalized index coding problems with coded demands, the side-information of every receiver is uncoded but the demand is coded. In our work the focus is on generalized index coding with coded side-information problems.
Error correcting index codes were introduced in [
27] and later extended for generalized index coding problems in [
24]. An Error Correcting Generalized Index Code (ECGIC) is a map that encodes the message vector
, such that each user, given its side-information and received transmissions, can decode its requested packet
, even in the presence of at most
transmission errors. The smallest possible length of such a
-error correcting index code is denoted by
. An optimal linear
-ECGIC over
is a linear
-ECGIC over
of the smallest possible length. The optimal length of a linear error correcting index code is lower bounded by
-bound and upper bounded by
-bound as
where
is the length of an optimal linear classical error-correcting code of dimension
k and minimum distance
d over
[
24,
27].
If for some index coding problem,
, then bounds in (
3) meet with equality. Thus, for such problems,
In general, the
-bound is obtained by concatenating an optimal linear classical error correcting code and an optimal linear index code. When
, as the optimal length of the ECGIC is same as
-bound, for such problems, concatenation scheme would give optimal linear error correcting index codes [
26,
28,
29,
30].
2.2. Error Correcting Coded Caching Scheme
The problem of error correcting coded caching scheme was proposed in [
15]. The server is connected to
K users through a shared link, which is error prone. The server has access to
N files
each of size
F bits. Every user has an isolated cache with memory
bits, where
. There are many ways in which users can fill their cache contents. A prefetching scheme is denoted by
and it specifies the way in which user caches are filled. Each user demands one of the
N files. Let the demand vector be
, where
is the index of the file demanded by user
i. The number of distinct files requested in
is denoted by
. The set of all possible demands is denoted by
During the delivery phase, the server informed of the demand
, transmits a function of
over a shared link. A
-error correcting coded caching scheme should be in such a way that using the cache contents and the transmitted data, each user
i needs to decode the requested file
, even if
transmissions are in error.
For a -error correcting coded caching problem, a communication rate is achievable for demand if and only if there exists a transmission of bits such that every user i is able to decode its desired file even after at most transmissions are in error. Rate is the minimum achievable rate for a given , and . The average rate is defined as the expected minimum average rate for a given and under uniformly random demand. Thus, Another quantity of interest is the peak rate, denoted by , which is defined as
2.3. Coded Caching Scheme with Coded Prefetching
An optimal coded caching scheme for small cache sizes involving coded prefetching was given in [
3] by Chen, Fan, and Letaief. We refer this scheme as the Chen Fan Letaief (CFL) scheme. The prefetching scheme here is denoted by
. For this scheme, the cache memory size of each user is
. The prefetching strategy differs depending on the values of
N and
K. Thus,
involves two types of prefetching strategies, which are as given below:
Consider the case when
and
. Each file is split into
N subfiles, i.e.,
During prefetching, the cache of user
j is filled as
, an XORed version of subfiles. It is shown in [
3] that
for
and
for
are achievable. Furthermore, if
is achievable by memory sharing.
Consider
and
Each file is split into
subfiles, i.e.,
The cache of user
i is given by
for
For the number of distinct demands
files, it is shown in [
3] that
is achievable. For
, the rate
is achievable. Furthermore, if
is achievable by memory sharing.
2.4. Coded Caching Scheme with Coded Prefetching with Low Sub-Packetization
A scheme for
and cache capacities
was proposed in [
4] by Jesus Vilardebo. We refer to this scheme as the Jesus Vilardebo (JV) scheme. The prefetching scheme of the JV scheme is denoted by
. We consider this scheme only for
. The advantage of JV scheme over CFL scheme is that the sub-packetization requirement is low. The scheme in [
4] uses a low sub-packetization level as compared to the scheme in [
3]. A low sub-packetization level is always preferred, because any practical scheme will require each of the subfiles to have some header information that allows for decoding at the end users. When there are a large number of subfiles, the header overhead may be non-negligible. The prefetching scheme,
, is described, as follows: each file
is partitioned into
K parts
for
. Afterwards, the cache content for user
i is populated as
for
. The rate achieved is same as the CFL scheme.
2.5. Index Coding and Coded Caching
For a fixed prefetching
and for a fixed demand
, the delivery phase of a coded caching problem is an index coding problem [
1]. We denote such an index coding problem as
. In fact, for fixed prefetching, a coded caching scheme consists of
parallel index coding problems one for each of the
possible user demands. Thus, finding the minimum achievable rate for a given demand
is equivalent to finding the min-rank of the equivalent index coding problem induced by the demand
. Because the generalized independence number and min-rank of
depend on the caching scheme
and demand
, we denote them as
and
, respectively.
Consider the CFL prefetching scheme . The index coding problem that is induced by the demand for the CFL prefetching is Each subfile corresponds to a message in the index coding problem. Because prefetching is coded, represents a generalized index coding with coded side-information problem. Similarly, represents a generalized index coding problem that corresponds to the JV prefetching scheme.
There have been many works in the literature that make use of the link between index coding and coded caching. Some of them use index coding concepts to derive lower bounds on the rate [
31,
32]. In [
33], multiple groupcast index coding is used to design coded caching delivery for multiple requests. A novel index coding scheme was introduced in [
34], which, when applied to caching problem, is then shown to match an outer bound under the assumption of uncoded cache prefetching. A multihop index coding technique is proposed in [
35] to code the cached contents in helpers to achieve order-optimal capacity gains. Decentralized caching schemes were proposed for two-layer networks in [
36] which exploit index coding in the delivery phase and leverages multicast opportunities. None of these assume the channel to be error prone during the delivery phase. In our work, we use the error correction aspects of index coding to design optimal error correcting coded caching schemes.
We call a caching scheme optimal if both the placement and delivery schemes are designed in such a way to achieve an optimal rate memory pair. An optimal error correcting delivery scheme refers to an error correcting delivery scheme, which uses the minimum number of transmissions for a given placement. Unless the delivery scheme is carefully designed, one may end up with non-optimal error correcting delivery scheme for a given placement. This is clear from Example 4.8 of [
27]. In that example, an index coding problem with five messages and five receivers are considered. The side information sets are given as
,
,
,
, and
. For this problem, it can be calculated that
. Additionally, for this problem, min-rank,
over binary field. From code tables in [
37], we have
and
Hence,
. Using a computer search, the authors of [
27] have found that the optimal length
. Here the optimal length of the error correcting index code lies strictly between the
-bound and the
-bound. Thus, for this problem, the construction of optimal linear error correcting index code by concatenation is not optimal. Similarly, in general for coded caching problems with an arbitrary placement and demand, the optimal error correcting delivery scheme cannot be constructed by concatenation unless we prove that the
and
bounds meet for the corresponding index coding problem. In the schemes which we consider in this paper, we explicitly prove that the
and
bounds meet for all of the demand cases for the given placement. Hence, the optimality of concatenation is guaranteed.
3. Generalized Independence Number for
In this section we find a closed form expression for the generalized independence number
of the index coding problem
. There are two different prefetching schemes employed in [
3], depending on the relationship between the number of messages and the number of receivers. For all the index coding problems corresponding to both these prefetching schemes, the generalized independence number is shown to be equal to the min-rank.
3.1. Number of Files Equal to the Number of Users ()
In the CFL prefetching scheme, each file is split into
N subfiles. Hence, the number of messages in
is
. Each user is split into
N receivers each demanding one message. Hence, there are a total of
receivers. From the expressions of the achievable rates in [
3], we get the min-rank
as
We find the generalized independence number
for
. For different demands, a generalized independence number is calculated and it is shown to be equal to the min-rank of the corresponding generalized index coding problem. The technique of obtaining
is illustrated in the following example.
Example 1. Consider a coded caching problem with , (see Figure 1). Because , the CFL scheme is used for solving the coded caching problem. Each file is split into subfiles as , and . Let denote the vector obtained by concatenating and . The cache contents of user i is for Cache contents are depicted in Figure 1. First consider that all of the demands are distinct, i.e., . Without loss of generality, we can assume that the demand is Consider the equationsLet be the subspace of , which consists of the vectors satisfying the equations and . From the rank-nullity theorem, dim. The induced generalized index coding problem has 9 messages and 9 receivers. For this case, (1) can be rewritten as Let . The generalized independence number is the maximum among the dimensions of all the subspaces of in . We claim that all the vectors of belong to the set . Thus, . From the definition of , it is clear that the all zero vectors belonging to also belong to . Any other vector in will have at least one non-zero coordinate . The vector belonging to , having belongs to the set . Thus, all of the vectors in lie in and From (4), we get . Hence, by (2), we have Finally, assume and let In addition to and , consider the following set of equations Let be the subspace of , which consists of the vectors that satisfy the set of equations We follow the similar argument as above to show that all the vectors in lie in . By definition, lies in . All of the vectors in with for are present in . By and , all the vectors in have . The condition and the set of equations force . Hence, all of the vectors in with are present in . Thus, all of the vectors in are present in . Moreover, dim Therefore From (4), . Thus, by (2), For other possible demands also, it can be verified from Table 1 that . In Example 1, the generalized independence number of the index coding problem is equal to its min-rank. For different demands, the generalized index coding problem changes and, for all those problems, min-rank and generalized independence number are shown to be equal. This can be shown for all values of N, as given in the theorem below.
Theorem 1. For and ,, where is the number of distinct demands. Proof. In the CFL prefetching scheme , each file is split into N subfiles . User caches . Let be the vector obtained by concatenation of vectors .
For a given demand , the delivery phase of the coded caching problem becomes a generalized index coding problem with messages and receivers.
First, consider that all of the demands are distinct, i.e.,
. Let the demand of the
ith user be
. Thus
. Consider the set of
N equations denoted by
, where
Let
be the subspace of
, which consists of the vectors that satisfy the set of equations
. From the rank-nullity theorem, we have dim
.
For
, from (
1) we have,
Let
. The generalized independence number is the maximum of the dimensions of all subspaces of
in
. We show that
is such a subspace. For this, we need to show that all the vectors of
lie in
. By the definition of
, the all zero vector
lies in
. Any other vector in
will have at least one non-zero coordinate. The vectors that belong to
having
belongs to the set
. Thus, all of the vectors in
lie in
. The generalized independence number
From (
4), we get
. Hence, by (
2), we have
Consider the case where
. Without a loss of generality, we can assume that the first
users have distinct demands and that the
ith user demands the file
for
. Also, without loss of generality, we can assume that the set of indices of the files that are not demanded are
. There are
files that are not demanded. In addition to
, consider the following set of equations
, for
. The number of these equations is thus
. Let
be the subspace of
, which consists of vectors that satisfy these equations. Hence, dim
By definition,
lies in
. Any vector with the coordinate
for
lies in
. The set of equations force all
for
. Moreover if
the set of equations force some
for some
. Hence any vector with
lies in some
for
. Thus, all of the vectors in
lie in
. Therefore,
Applying (
4) and (
2),
□
3.2. Number of Users More Than the Number of Files ()
In the CFL prefetching scheme for
, each file is split into
subfiles. Hence, the number of messages in
is
. Each user is split into
receivers in
, each demanding a single message. Thus, there are a total of
receivers. From the expressions for achievable rates in [
3], we obtain the min-rank
as
We find the generalized independence number for . The technique of obtaining is illustrated in the following example.
Example 2. Consider a coded caching problem with , and (see Figure 2). According to the CFL scheme each file is split into subfiles as , , and . Let denote the vector obtained by concatenating and . The cache of the ith user contains three coded packets for The cache contents are given in Figure 2. For a given demand , this problem becomes a generalized index coding problem ,having 36 messages and 48 receivers.
First consider that and Consider the nine equations given by for and . Let be the subspace of satisfying these nine equations. From the rank-nullity theorem, we get dim. For this case, (1) can be rewritten as for . Let . The generalized independence number is the maximum among the dimensions of all the subspaces of in . We claim that is such a subspace. This would mean that . For this, we need to show that all of the vectors in lie in . By the definition of , the all zero vector lies in . Any other vector in will have at least one non-zero coordinate. All of the vectors in , having belongs to . Thus, all of the vectors in lie in and From (5), we get . Hence, by (2), we have Consider now that and . In addition to the nine equations for and , consider three more equations for . Thus, we consider a set of twelve equations given by . Let be the subspace of consisting of vectors that satisfy the equations in . Hence, from the rank-nullity theorem, we have dim By definition, lies in . Any non-zero vector in with for lies in the corresponding . By , any forces some for and, hence, such vectors also lie in . Thus all of the vectors in lie in . Therefore From (5), we get . Hence, by (2), we have Finally, consider and . The files and are not demanded by any user. In addition to the equations in , here we consider a set of equations for . Thus there are 24 equations in total. Let be the subspace of which satisfy these equations. By the rank-nullity theorem, the dimension of is given by dim The next step is to show that all the vectors in lie in . The all zero vector lies in by definition. Any non-zero vector in with for lies in the corresponding . The 24 equations considered involve equations of the form for . Hence, by E, any forces for and hence such vectors also lie in . Thus, all of the vectors in lie in . Therefore, From (5), we get . Hence, by (2), we have The theorem below gives the general expression for , when .
Theorem 2. For and ,, where is the number of distinct demands. Proof. For and , the CFL prefetching scheme is as follows. Each file is split into subfiles . User caches N coded packets given by for Let be the vector obtained by the concatenation of vectors For a given demand , this problem becomes a generalized index coding problem with messages and receivers.
First consider that all of the demands are distinct, i.e., . Without loss of generality we can assume that the first N users demand distinct files, such that the ith user demands for . Thus, , such that for . Let represent a set of equations, where . We consider a subset of the equations in E of the form for . There are such equations. Let be the subspace of consisting of vectors that satisfy these equations. From the rank-nullity theorem, we have dim.
For
, (
1) can be rewritten as
for
and
. Let
. The generalized independence number is the maximum among the dimensions of all the subspaces of
in
. We show that
is such a subspace. For this, we need to show that all of the vectors of
lie in
. By the definition of
, the all zero vector
lies in
. The vectors that belong to
having
belong to the set
. Thus, all of the vectors in
lie in
and
From (
5), we get
. Hence, by (
2), we have
Consider the case where
. Let the first
demands be distinct and the
ith user demands
for
Without loss of generality we can assume that the indices of the files that are not demanded are
. There are
files that are not demanded. In addition to the
equations that are presented in
, consider the following equations
, for
and
. The number of these equations is thus
. Let
be the subspace of
which consists of the vectors satisfying these equations. By the rank-nullity theorem, dim
By definition,
lies in
. Any vector in
with the coordinate
for
lies in
. The set of equations force all
for
and
. Moreover, by the set of equations presented in
,
would mean some other
for
. Hence, any vector with
lies in some
for
. Thus, all of the vectors in
lie in
. Therefore
From (
4), we have
. Hence from (
2),
□
4. Optimal Linear Error Correcting Delivery Scheme for the CFL Prefetching Scheme
In this section we give expressions for the average rate and the worst case rate for a
-error correcting delivery scheme for the CFL prefetching scheme. Also we propose a
-error correcting delivery scheme for this case. From Theorem 1 and Theorem 2, we can conclude that for all the generalized index coding problems
induced from the CFL prefetching scheme,
Hence, the
and
bounds in (
3) meet. Using this, the optimal linear error correcting delivery scheme can be constructed for the CFL prefetching scheme and hence the average rate can be calculated, as given in the following theorem.
Theorem 3. For a coded caching problem with the CFL prefetching scheme for ,where is the number of subfiles into which each file is divided in the CFL scheme. Proof. From (
6) and (
3), we can conclude that, for any generalized index coding problem induced from the coded caching problem with CFL prefetching, the
and
bounds meet. Thus, the optimal linear error correcting delivery scheme would be the concatenation of the CFL delivery scheme with an optimal linear error correcting code. Thus, the optimal length for
error corrections in those generalized index coding problems is
and, hence, the statement of the theorem follows. □
Corollary 1. For a coded caching problem with the CFL prefetching scheme for ,where the value of is obtained from (4) and (5) when . Proof. Worst case rate is required when the number of distinct demands is maximum. This happens when □
Because the
and
bounds become equal for
, the optimal linear error correcting coded caching delivery scheme here would be the concatenation of the CFL delivery scheme with optimal classical error correcting scheme which corrects
errors. Decoding can be done by syndrome decoding for error correcting generalized index codes proposed in [
24,
27].
In the remaining part of this section, few examples of the optimal linear error correcting delivery scheme for coded caching problems with the CFL prefetching are given.
Example 3. Consider the coded caching problem considered in Example 1 depicted in Figure 1. First consider that and . We have shown that, for this case, . The transmissions in the CFL scheme are , , , , , and . If transmission error needs to be corrected, then, from [37], we have . A generator matrix that corresponds to code isThe optimal linear single error correcting delivery scheme is the concatenation of the CFL delivery scheme with the above code. Thus, single error correcting delivery scheme involves 10 transmissions. In addition to the following transmissions are required.Now, consider and . For this case also, . The transmissions in the CFL scheme are , , , , and . For single error correction, the concatenation is done with the same code. Considering the same generator matrix as before, the additional transmissions in the error correcting delivery scheme are Finally, consider and . For this case, . The CFL transmission scheme involves the following three transmissions , , and . For single error correction, we have from [37] that . A generator matrix for the code isThus, the optimal linear single error correcting delivery scheme is the concatenation of the CFL delivery scheme with the above code. The additional transmissions required apart from and areDecoding is done by syndrome decoding for generalized index codes [24,27]. Example 4. Consider the coded caching problem considered in Example 2 depicted in Figure 2. Consider that and . We have shown that for this case . The transmissions in the CFL scheme are , , , , , , , , , , , , , , , , , , , , , , , , , , and . If transmission error needs to be corrected, then from [37], we have . The optimal linear single error correcting delivery scheme involves concatenation of CFL delivery scheme with a generator matrix that corresponds to the code. 5. Optimal Linear Error Correcting Scheme for with Reduced Sub-Packetization
In this section, we consider the scheme presented in [
4], which we call the JV scheme for
and discuss the error correction. For the case
, the JV scheme is exactly same as the CFL scheme. For
, the JV scheme has an advantage in terms of sub-packetization. The JV scheme splits each file into
K subfiles where as CFL scheme required
subfiles. Here, we find a closed form expression for the generalized independence number
of the index coding problem
. For all of the index coding problems that correspond to all possible demands, we show that the generalized independence number is equal to the min-rank. Hence in this case also,
and
bounds meet and the optimal error correcting delivery scheme is obtained by concatenation scheme.
The expression for min-rank for this scheme can be obtained from the achievable rate expressions in [
3,
4], as
The calculation of generalized independence number and how it becomes equal to the min-rank are illustrated in the following example.
Example 5. Consider a coded caching problem with , and . According to the JV scheme, each file is split into subfiles as: for . Let X be the vector obtained by concatenation of , and , given by . The cache contents of user i is for . For a given demand , this problem becomes a generalized index coding problem , having 18 messages and 36 receivers.
First consider that and let . Consider the equations for . Out of these, consider the first three equations and . Let be the subspace of consisting of vectors which satisfy these three equations. Subsequently, from the rank-nullity theorem, we have dim(). For this case, (1) can be rewritten as for . Let . The generalized independence number is the maximum among the dimensions of all the subspaces of in . We claim that is such a subspace. This would mean that . For this, we need to show that all of the vectors in lie in . By the definition of , the all zero vector lies in . Any other vector in will have at least one non-zero coordinate. All of the vectors in , having belongs to . Because all files are demanded, all of the vectors in lie in and From (7), we get . Hence, by (2), we have Consider now that and . In this case, consider the six equations . Let be the subspace of consisting of vectors which satisfy these equations. Hence from the rank-nullity theorem, we have dim By definition, lies in . Any non-zero vector in with for and lies in the corresponding . Now, we have to consider the vectors that satisfy these equations and for . From the equations , any forces some for and, hence, such vectors also lie in . Thus, all of the vectors in lie in . Therefore, From (7), we get . Hence, by (2), we have Finally, consider and . The files and are not demanded by any user. Here, we consider the equation . In addition to this, consider the equations , , , , and . Thus, there are twelve equations in total. Let be the subspace of which satisfy these equations. By the rank-nullity theorem, the dimension of is given by dim The next step is to show that all of the vectors in lie in . The all zero vector lies in by definition. Any non-zero vector in with for lies in the corresponding . From the set of twelve equations, force some for and, hence, such vectors also lie in . Thus, all of the vectors in lie in . Therefore, From (7), we get . Hence, by (2), we have This can be generalized as in the theorem below.
Theorem 4. For and ,, where is the number of distinct demands. Proof. For and , the JV prefetching scheme is as follows. Each file is split into K subfiles . The cache content of user i is given as . Let be the vector obtained by the concatenation of vectors For a given demand , this problem becomes a generalized index coding problem with messages and receivers.
First, consider that all of the demands are distinct, i.e., . Without a loss of generality, we can assume that the first N users demand distinct files, such that for . Thus , such that for . Consider the set of equations for . Out of these equations, consider a subset of N equations for . Let be the subspace of consisting of vectors satisfying these equations. From the rank-nullity theorem, we have dim.
For
, (
1) can be rewritten as
for
. Let
. The generalized independence number is the maximum among the dimensions of all the subspaces of
in
. We show that
is such a subspace. For this, we need to show that all the vectors of
lie in
. By the definition of
, the all zero vector
lies in
. The vectors that belong to
having
belong to the set
. Thus, all of the vectors in
lie in
and
From (
7), we get
. Hence, by (
2), we have
Consider the case where
. Let the first
demands be distinct and without loss of generality assume that the
ith user demands
for
Thus, the indices of the files that are not demanded are
. There are
files that are not demanded. In addition to the
K equations
, consider the following equations
, for
and
. Thus, the number of these equations is
. Let
be the subspace of
which consists of the vectors satisfying these equations. By the rank-nullity theorem, dim
By definition,
lies in
. Any vector in
with the coordinate
for
lies in
. The set of equations force all
for
and
. Moreover by the same set of equations,
would mean some other
for
. Hence any vector with
lies in some
for
. Thus all the vectors in
lie in
. Therefore,
From (
7), we have
. Hence, from (
2),
□
Because, for this scheme, also,
and
bounds meet, the optimal coded caching delivery scheme here would be the concatenation of the JV delivery scheme with optimal classical error correcting scheme that corrects
errors. Decoding can be done by syndrome decoding for error correcting generalized index codes proposed in [
24,
27]. This is illustrated while using an example below.
Example 6. Consider the coded caching problem considered in Example 5. Consider that and . We have shown that for this case . The transmissions in the JV scheme are , , , , , , , , , , , , , , . If transmission error needs to be corrected, then from [37], we have . The optimal linear single error correcting delivery scheme involves a concatenation of JV delivery scheme with a generator matrix that corresponds to the code.