Article

The Strongly Asynchronous Massive Access Channel

Electrical and Computer Engineering Department, University of Illinois at Chicago, Chicago, IL 60607, USA
*
Author to whom correspondence should be addressed.
Entropy 2023, 25(1), 65; https://doi.org/10.3390/e25010065
Submission received: 30 August 2022 / Revised: 16 December 2022 / Accepted: 21 December 2022 / Published: 29 December 2022
(This article belongs to the Special Issue Wireless Networks: Information Theoretic Perspectives Ⅱ)

Abstract
This paper considers the Strongly Asynchronous, Slotted, Discrete Memoryless, Massive Access Channel (SAS-DM-MAC) in which the number of users, the number of messages, and the asynchronous window length grow exponentially with the coding blocklength with their respective exponents. A joint probability of error is enforced, ensuring that all the users’ identities and messages are correctly identified and decoded. Achievability bounds are derived for the case that different users have similar channels, the case that users’ channels can be chosen from a set which has polynomially many elements in the blocklength, and the case with no restriction on the users’ channels. A general converse bound on the capacity region and a converse bound on the maximum growth rate of the number of users are derived. It is shown that reliable transmission with an exponential number of users with an exponential asynchronous exponent with joint error probability is possible at strictly positive rates.

1. Introduction

One of the main attributes of the Internet of Things (IoT) paradigm is the wireless connectivity of a large number of distributed and uncoordinated devices. Each device is automated and is allowed to transmit random bursts of data without network or human coordination. A contention-based/grant-based network access requires multiple rounds of back-and-forth transmissions between the device and the network to acquire synchronization, identify users, and grant them allocations. This signaling overhead and power consumption are not justified for power-limited devices with sporadic traffic patterns, which motivates a grant-free network access implementation. Uncoordinated network access leads to new traffic patterns and collision-resolution challenges over the wireless network. A foundational study of such a system is needed.
We propose a novel communication and multiple-access model that captures these new network characteristics. The formal model definition is given in Section 2, but in order to highlight the differences between our proposed model and existing work in the literature, we briefly introduce our model parameters here for the Strongly Asynchronous Slotted Discrete Memoryless Massive Access Channel (SAS-DM-MAC). In a SAS-DM-MAC, the number of users $K_n := e^{n\nu}$ increases exponentially with the blocklength n with occupancy exponent $\nu \ge 0$. Moreover, the users are strongly asynchronous, meaning that they transmit in a randomly chosen time slot within a window of length $A_n := e^{n\alpha}$ slots, each slot of length n, where $\alpha \ge 0$ is the asynchronous exponent. In addition, when active, each user chooses uniformly at random one message to transmit from a set of $M_n := e^{nR}$ messages, where $R \ge 0$ is the transmission rate. All transmissions are sent to an access point, and the receiver is required to jointly decode and identify all users. The goal is to characterize the set of all achievable $(R, \alpha, \nu)$ triplets.

1.1. Past Work

Strongly asynchronous communication was first introduced in [1] for synchronization of a single user, and later extended in [2] for synchronization with positive transmission rate.
In [3], the authors of [2] made a brief remark about a "multiple access collision channel" extension of their original single-user model. In this model, any collision of users (i.e., users who happen to transmit in the same block) is assumed to result in output symbols that appear to be distributed as noise. The error metric is taken to be the per-user probability of error, which is required to be vanishing for all but a vanishing fraction of users. In this scenario, it is fairly easy to characterize the capacity region for the case that the number of users is less than the square root of the asynchronous window length (i.e., in our notation, $\nu < \alpha/2$). However, finding the capacity of the "multiple access collision channel" under a joint probability of error, as opposed to a per-user probability of error, is much more complicated and requires novel achievability schemes and novel analysis tools. This is the main subject and contribution of this paper.
Recently, motivated by emerging machine-to-machine type communications and sensor networks, a large body of work has studied "many-user" versions of classical multiuser channels, as pioneered in [4]. In [4], the number of users is allowed to grow at most linearly with the blocklength n, and a full characterization of the capacity of the synchronous Gaussian (random) many-access channel was given. In [5], the author studied the synchronous massive random access channel where the total number of users increases linearly with the blocklength n; however, the users are restricted to use the same codebook and only a per-user probability of error is enforced. In the model proposed here, the users are strongly asynchronous, the number of users grows exponentially with the blocklength, and we enforce a joint probability of error over all users.
In [6,7], a broadcast channel model was introduced in which each receiver is only interested in decoding its intended message (and not all messages). Under a per-user/per-pair-of-users probability of error criterion, a doubly exponential number of receivers was shown to be identifiable by using a randomized code. In this work, we study a multiple-access channel model with non-cooperative transmitters and a single receiver which needs to decode all messages from all users.
Training-based synchronization schemes (i.e., pilot signals) were proven to be suboptimal for bursty communications in [2]. Rather, one can utilize the users' statistics at the receiver for synchronization or user identification purposes. The identification problem (defined in [8]) is a classical problem in hypothesis testing. In this problem, a finite number of distinct sources each generates a sequence of i.i.d. samples. The problem is to find the underlying distribution of each sample sequence, given the constraint that each sequence is generated by a distinct distribution.
All studies on identification problems assume a fixed number of sequences. In [9], the authors study Logarithmically Asymptotically Optimal (LAO) testing of the identification problem for a finite number of distributions. The authors in [10] also propose a test under a generalized Neyman–Pearson-like optimality criterion to match a multiple, yet finite, number of sequences to their source distributions.
In contrast, we allow the number of users to increase exponentially with the blocklength. We assume that the users are strongly asynchronous and may transmit at a random time within a window that is exponentially large in the blocklength. We require the receiver to recover both the transmitted messages and the users' identities under a joint probability of error criterion. By allowing the number of sequences to grow exponentially with the number of samples, the number of different possibilities (or hypotheses) becomes doubly exponential in the blocklength, and the analysis of the optimal decoder becomes much more challenging than in classical identification problems (with a constant number of distributions). These differences in modeling the channel require a number of novel analytical tools.

1.2. Contribution

We consider $(R, \alpha, \nu)$, the capacity region of the SAS-DM-MAC, where we require the joint probability of error to vanish. Our contributions are as follows:
  • We propose a new achievability scheme that supports strictly positive values of $(R, \alpha, \nu)$ when the users have identical channels.
  • We define a new massive identification paradigm in which we allow the number of sequences in a classical identification problem to increase exponentially with the sequence blocklength (or sample size). We find asymptotically matching upper and lower bounds on the probability of identification error for this problem. We use this result in our SAS-DM-MAC model to recover the identities of the users.
  • We propose a new achievability scheme for the case that the channels are chosen from a set of conditional distributions whose size increases polynomially in the blocklength n. In this case, the channel statistics themselves can be used for user identification.
  • We propose a new achievability scheme without imposing any restrictive assumptions on the users' channels. We show that strictly positive $(R, \alpha, \nu)$ are possible.
  • We propose a novel converse bound for the capacity of the SAS-DM-MAC.
  • We show that for $\nu > \alpha$, not even reliable synchronization is possible.
These results were presented in part in [11,12,13].

1.3. Paper Organization

In Section 2 we introduce the notation and problem formulation for SAS-DM-MAC and the identification problem. We present our main results in Section 3. Section 4 concludes the paper with a discussion and future work. Proofs are provided in the Appendix.

2. Notations and Problem Formulation

We first introduce the notation used in the SAS-DM-MAC and then formally define the problem.

2.1. Notation

Capital letters represent random variables that take on lower-case letter values in calligraphic-letter alphabets. The notation $a_n \doteq e^{nb}$ means $\lim_{n\to\infty} \frac{\log a_n}{n} = b$. We write $[M:N]$, where $M, N \in \mathbb{Z}$, $M \le N$, to denote the set $\{M, M+1, \ldots, N\}$, and $[K] := [1:K]$. $y^n$ is used to denote $y^n := [y_1, \ldots, y_n]$. The n-fold Cartesian product of a set $\mathcal{S}$ is denoted by $\mathcal{S}^n$.
A transition probability/channel from $\mathcal{X}$ to $\mathcal{Y}$ is denoted by $Q(y|x)$ (or $Q$ for short), $(x,y) \in \mathcal{X}\times\mathcal{Y}$, and the output marginal distribution induced by $P \in \mathcal{P}_{\mathcal{X}}$ through the channel $Q$ is
$[PQ](y) := \sum_{x\in\mathcal{X}} P(x)\, Q(y|x), \quad y \in \mathcal{Y},$
where $\mathcal{P}_{\mathcal{X}}$ is the space of all distributions on $\mathcal{X}$. We define the shorthand notation
$Q_x(y) := Q(y|x), \quad y \in \mathcal{Y}.$
For a MAC $Q(y|x_1,\ldots,x_K)$, $(x_1,\ldots,x_K,y) \in \mathcal{X}_1\times\cdots\times\mathcal{X}_K\times\mathcal{Y}$, we define the shorthand notation
$Q_S(y|x_S) := Q(y \,|\, x_S, \star_{S^c}), \quad S \subseteq [K],$
to indicate that the users indexed by $S$ transmit $x_i \in \mathcal{X}_i$, $i \in S$, while the users indexed by $S^c = [K]\setminus S$ transmit their respective idle symbols $\star_j \in \mathcal{X}_j$, $j \in S^c$. When $|S| = 1$ and $|S| = 0$, we respectively use
$Q_i(y|x_i) := Q_{\{i\}}(y|x_i) = Q(y \,|\, \star_1,\ldots,\star_{i-1}, x_i, \star_{i+1},\ldots,\star_K), \qquad Q_\star(y) := Q_{\emptyset}(y) = Q(y \,|\, \star_1,\ldots,\star_K). \quad (4)$
The P-type set and the V-shell of the sequence $x^n$ are defined, respectively, as
$T(P) := \left\{ x^n : \frac{N(a|x^n)}{n} = P(a), \ \forall a \in \mathcal{X} \right\}, \quad (5)$
$T_V(x^n) := \left\{ y^n : \frac{N(a,b|x^n,y^n)}{N(a|x^n)} = V(b|a), \ \forall (a,b) \in (\mathcal{X},\mathcal{Y}) \right\},$
where $N(a,b|x^n,y^n) = \sum_{i=1}^n \mathbb{1}\{x_i = a, y_i = b\}$ is the number of joint occurrences of $(a,b)$ in the pair of sequences $(x^n,y^n)$ and where $N(a|x^n)$ denotes the number of occurrences of $a \in \mathcal{X}$ in the sequence $x^n$. The empirical distribution of the sequence $x^n$ is
$\hat{P}_{x^n}(a) := \frac{1}{n} N(a|x^n) = \frac{1}{n}\sum_{i=1}^n \mathbb{1}\{x_i = a\}, \quad a \in \mathcal{X}. \quad (7)$
If in using (7) the target sequence $x^n$ is clear from the context, we drop the subscript $x^n$ in $\hat{P}_{x^n}(\cdot)$.
We also use the notation $X^n \overset{\text{i.i.d.}}{\sim} P$ when all random variables in the random vector $X^n$ are generated i.i.d. according to distribution P.
We use $D(P_1 \| P_2)$ to denote the Kullback–Leibler divergence between distributions $P_1$ and $P_2$, and
$D(Q_1 \| Q_2 \,|\, P) := \sum_{(x,y)\in\mathcal{X}\times\mathcal{Y}} P(x)\, Q_1(y|x) \log \frac{Q_1(y|x)}{Q_2(y|x)}$
for the conditional Kullback–Leibler divergence. We let
$I(P,Q) = D(Q \| [PQ] \,|\, P)$
denote the mutual information between random variables X and Y jointly distributed according to $P_{X,Y}(x,y) = P(x)\, Q(y|x)$. As for binary functions, we use
$d(p\|q) := p \log\frac{p}{q} + (1-p)\log\frac{1-p}{1-q}, \qquad p * q := p(1-q) + (1-p)q, \qquad h(p) := -p\log(p) - (1-p)\log(1-p)$
to represent the binary divergence, convolution, and entropy functions, respectively.
The Chernoff distance and the Bhattacharyya distance between two distributions $P_1, P_2 \in \mathcal{P}_{\mathcal{X}}$ are respectively defined as
$C(P_1,P_2) := \sup_{0\le t\le 1} -\log \sum_{x\in\mathcal{X}} P_1(x)^t\, P_2(x)^{1-t},$
$B(P_1,P_2) := -\log \sum_{x\in\mathcal{X}} \sqrt{P_1(x)\, P_2(x)}. \quad (11)$
We extend this definition and introduce the quantities
$C(P_i,Q_i,P_j,Q_j) := \sup_{0\le t\le 1} \mu_{i,j}(t), \quad (12a)$
$\mu_{i,j}(t) := -\log \sum_{x_i\in\mathcal{X},\, x_j\in\mathcal{X},\, y\in\mathcal{Y}} P_i(x_i)\, P_j(x_j)\, Q_i(y|x_i)^{1-t}\, Q_j(y|x_j)^{t}, \quad (12b)$
$C(\star, Q_\star, P_j, Q_j) := \sup_{0\le t\le 1} -\log \sum_{x\in\mathcal{X},\, y\in\mathcal{Y}} P_j(x)\, Q_\star(y)^{1-t}\, Q_j(y|x)^{t}. \quad (12c)$
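To make these definitions concrete, here is a minimal numerical sketch (ours, not from the paper) of the binary divergence and the Bhattacharyya/Chernoff distances for finite-alphabet distributions given as probability vectors; the function names are illustrative and natural logarithms are assumed.

```python
# A minimal sketch (ours) of d(p||q), B(P1,P2) and C(P1,P2); natural logs.
import numpy as np
from scipy.optimize import minimize_scalar

def binary_divergence(p, q):
    """d(p||q) = p log(p/q) + (1-p) log((1-p)/(1-q))."""
    out = 0.0
    if p > 0:
        out += p * np.log(p / q)
    if p < 1:
        out += (1 - p) * np.log((1 - p) / (1 - q))
    return out

def bhattacharyya(P1, P2):
    """B(P1,P2) = -log sum_x sqrt(P1(x) P2(x))."""
    return -np.log(np.sum(np.sqrt(np.asarray(P1) * np.asarray(P2))))

def chernoff(P1, P2):
    """C(P1,P2) = sup_{0<=t<=1} -log sum_x P1(x)^t P2(x)^(1-t)."""
    P1, P2 = np.asarray(P1), np.asarray(P2)
    f = lambda t: np.log(np.sum(P1 ** t * P2 ** (1 - t)))  # minimize log-sum
    res = minimize_scalar(f, bounds=(0.0, 1.0), method="bounded")
    return -res.fun

P1, P2 = [0.9, 0.1], [0.5, 0.5]
# C >= B always, since B is the t = 1/2 point of the Chernoff supremum.
print(bhattacharyya(P1, P2), chernoff(P1, P2))
```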
We also use
$E_r(R,P,Q) := \max_{0\le\rho\le1} E_0(R,P,Q,\rho) - \rho R, \quad (13)$
$E_{sp}(R,P,Q) := \sup_{\rho > 0} E_0(R,P,Q,\rho) - \rho R \quad (14)$
to denote the random coding and the sphere packing error exponents of the channel $Q(y|x)$, with input distribution $P(x)$ and rate R, as defined in [14], where
$E_0(R,P,Q,\rho) := -\log \sum_{y\in\mathcal{Y}} \left[ \sum_{x\in\mathcal{X}} P(x)\, Q(y|x)^{\frac{1}{1+\rho}} \right]^{1+\rho}. \quad (15)$
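As a sanity check on these exponents, the following small sketch (ours; finite alphabets, natural logarithms, and a grid search over ρ are assumptions) evaluates $E_0$ and $E_r$ for a given input distribution and channel matrix.

```python
# A small sketch (ours) of Gallager's E_0 in (15) and the random-coding
# exponent E_r in (13); P is a distribution over X, Q an |X| x |Y| matrix.
import numpy as np

def E0(P, Q, rho):
    """E_0 = -log sum_y [ sum_x P(x) Q(y|x)^(1/(1+rho)) ]^(1+rho)."""
    inner = np.asarray(P) @ np.asarray(Q) ** (1.0 / (1.0 + rho))
    return -np.log(np.sum(inner ** (1.0 + rho)))

def Er(R, P, Q, grid=1001):
    """E_r(R,P,Q) = max_{0<=rho<=1} E_0(rho) - rho R, via grid search."""
    return max(E0(P, Q, rho) - rho * R for rho in np.linspace(0.0, 1.0, grid))

# Example: BSC(0.1) with a uniform input; E_r > 0 for rates below capacity.
delta = 0.1
Q = np.array([[1 - delta, delta], [delta, 1 - delta]])
print(Er(0.2, [0.5, 0.5], Q))
```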

2.2. SAS-DM-MAC Problem Formulation

Let M be the number of messages (the same for each user), A be the number of blocks, and K be the number of users.
An ( M , A , K , n , ϵ ) code for the SAS-DM-MAC consists of:
  • A message set $[M]$, for each user $i \in [K]$.
  • An encoding function $f_i : [M] \to \mathcal{X}^n$, for each user $i\in[K]$. We define
    $x_i^n(m) := f_i(m).$
  • Each user $i\in[K]$ chooses a message $m_i\in[M]$ and a block index $t_i\in[A]$ (called the 'active' block henceforth), both uniformly at random. It then transmits $[\star_i^{n(t_i-1)} \ x_i^n(m_i) \ \star_i^{n(A-t_i)}]$, where $\star_i \in \mathcal{X}_i$ is the designated idle symbol for user i.
  • A destination decoding function
    $g : \mathcal{Y}^{nA} \to ([A]\times[M])^K$
    such that its associated probability of error, $P_e^{(n)}$, satisfies $P_e^{(n)} \le \epsilon$, where
    $P_e^{(n)} := \frac{1}{(AM)^K} \sum_{(t_1,m_1),\ldots,(t_K,m_K)} P\left[ g(Y^{nA}) \ne ((t_1,m_1),\ldots,(t_K,m_K)) \,\big|\, H_{(t_1,m_1),\ldots,(t_K,m_K)} \right],$
    where the hypothesis that user $i\in[K]$ has chosen to transmit message $m_i\in[M]$ in block $t_i\in[A]$ is denoted by $H_{(t_1,m_1),\ldots,(t_K,m_K)}$.
A tuple $(R,\alpha,\nu)$ is said to be achievable if there exists a sequence of $(e^{nR}, e^{n\alpha}, e^{n\nu}, n, \epsilon_n)$ codes with $\lim_{n\to\infty}\epsilon_n = 0$. The capacity region of the SAS-DM-MAC at asynchronous exponent α, occupancy exponent ν, and rate R is the closure of all achievable $(R,\alpha,\nu)$ triplets.
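The following toy simulation sketch (ours; the binary alphabet, the idle symbol 0, and all parameter values are illustrative assumptions, not from the paper) builds the transmitted timeline of the definition above and checks for a collision.

```python
# A toy simulation sketch (ours) of the codeword placement in Section 2.2.
import numpy as np

rng = np.random.default_rng(0)
n, M, A, K = 8, 4, 32, 3          # blocklength, messages, blocks, users
IDLE = 0                          # assumed idle symbol for every user
X = rng.integers(1, 3, size=(K, M, n))   # codebooks x_i^n(m) over {1, 2}

# Each user picks a message and an active block uniformly at random,
# then sends idle symbols outside its active block.
timeline = np.full((K, A * n), IDLE)
for i in range(K):
    m_i, t_i = rng.integers(M), rng.integers(A)
    timeline[i, t_i * n:(t_i + 1) * n] = X[i, m_i]

# A collision is two users active in the same block; for nu < alpha/2
# this event has vanishing probability as n grows.
active = [set(np.flatnonzero(timeline[i] != IDLE) // n) for i in range(K)]
print("collision:", len(set().union(*active)) < K)
```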
Remark 1.
We should emphasize that we are focusing our attention on discrete memoryless channels (DMCs), where a user not transmitting/being idle is equivalent to sending that user's idle symbol in the DMC model, just as, in a Gaussian channel, a user not sending corresponds to sending the symbol '0'. We do not study continuous channels in this work.
A depiction of a timeline for our channel model is provided in Figure 1a, along with two specific cases: one in which users 1 and 2 are not simultaneously transmitting (Figure 1b) and another in which users 1 and 2 are simultaneously transmitting (Figure 1c).

2.3. Identification Problem Formulation

We first summarize the notation specifically used in this Section (and in Appendix B) in Table 1 and then introduce the problem formulation.
Assume $\mathcal{P} := \{P_1,\ldots,P_{T_n}\} \subset \mathcal{P}_{\mathcal{X}}$ consists of $T_n$ distinct distributions on $\mathcal{X}$ and let the random variable Σ be uniformly distributed over $\mathcal{S}_{T_n}$ (see the definition in Table 1). Let $\sigma\in\mathcal{S}_{T_n}$ and let $x^{nT_n} = [x_1^n,\ldots,x_{T_n}^n]$ be a sample vector of $X^{nT_n} = [X_1^n,\ldots,X_{T_n}^n]$, in which $X_i^n \overset{\text{i.i.d.}}{\sim} P_{\sigma_i}$, $i\in[T_n]$. Given $x^{nT_n}$, we would like to find the permutation σ. More specifically, we are interested in finding a permutation estimator $\hat\sigma : \mathcal{X}^{nT_n} \to \mathcal{S}_{T_n}$ for which $X_i^n \overset{\text{i.i.d.}}{\sim} P_{\hat\sigma_i}$, $i\in[T_n]$. Let $\hat\Sigma = \hat\sigma(X^{nT_n})$.
The average probability of error for the set of distributions $\mathcal{P}$ is defined by
$P_e^{(n)} = P[\hat\Sigma \ne \Sigma] = \frac{1}{(T_n)!} \sum_{\sigma\in\mathcal{S}_{T_n}} P[\hat\Sigma \ne \sigma], \quad (19)$
where in (19) the measure P over $(X_1^n,\ldots,X_{T_n}^n)$ is that of $X_i^n \overset{\text{i.i.d.}}{\sim} P_{\sigma_i}$, $i\in[T_n]$.
We say that a set of distributions $\mathcal{P}$ is identifiable if $\lim_{n\to\infty} P_e^{(n)} = 0$.
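Since the optimal decoder for this problem (the ML rule analyzed in Appendix B) maximizes $\sum_i \log P_{\sigma_i}(x_i^n)$ over permutations σ, and this objective separates across sequences, the maximization is a linear assignment problem. The sketch below (ours; it uses SciPy's Hungarian solver and illustrative distributions) computes the ML permutation without enumerating all $(T_n)!$ candidates.

```python
# A sketch (ours) of ML identification as a linear assignment problem.
import numpy as np
from scipy.optimize import linear_sum_assignment

def ml_identify(samples, dists):
    """samples: T x n array of symbol indices; dists: T x |X| distributions.
    Returns sigma_hat, with sigma_hat[i] the distribution index assigned
    to sequence i."""
    T = len(samples)
    ll = np.zeros((T, T))            # ll[i, j] = log P_j(x_i^n)
    for i in range(T):
        for j in range(T):
            ll[i, j] = np.sum(np.log(dists[j][samples[i]]))
    _, sigma_hat = linear_sum_assignment(-ll)   # maximize total log-lik
    return sigma_hat

rng = np.random.default_rng(1)
dists = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])
samples = np.stack([rng.choice(2, size=200, p=d) for d in dists])
print(ml_identify(samples, dists))   # expect [0 1 2] w.h.p.
```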

3. Main Results

We first introduce an achievable region for the case that different users have identical channels (in Theorem 1). We then derive the identification criterion (in Theorem 2) and use it in the study of a more general case of massive communications in which the users' channels belong to a set of conditional probability distributions of polynomial size in n (in Theorem 3). In this case, we use the output statistics to distinguish and identify the users. Afterwards, we remove all conditions on the users' channels and derive an achievability bound on the capacity of the SAS-DM-MAC (in Theorem 4). We also propose a converse bound on the capacity of the general SAS-DM-MAC (in Theorem 5) and a converse bound on the number of users (in Theorem 6).

3.1. Users with Identical Channels

The following theorem gives an achievable region for the SAS-DM-MAC for the case that different users have identical channels toward the base station when they are the sole active user.
Theorem 1.
For a SAS-DM-MAC with $Q_{\{i\}}(y|x) = W(y|x)$ (recall Definition (4)) for all users, the following $(R,\alpha,\nu)$ region is achievable:
$\bigcup_{P\in\mathcal{P}_{\mathcal{X}}} \bigcup_{\lambda\in[0,1]} \left\{ \begin{array}{l} \nu < \frac{\alpha}{2} \\ \nu < \min\left( D(W_\lambda \| W \,|\, P),\ E_r(R+\nu, P, W) \right) \\ \alpha + R + \nu < D(W_\lambda \| Q_\star \,|\, P) \\ R + \nu < I(P,W) \end{array} \right\}, \quad (20)$
where
$W_\lambda(y|x) := \frac{W(y|x)^\lambda\, Q_\star(y)^{1-\lambda}}{\sum_{y'\in\mathcal{Y}} W(y'|x)^\lambda\, Q_\star(y')^{1-\lambda}}, \quad (x,y)\in\mathcal{X}\times\mathcal{Y}, \quad (21)$
and where $E_r$ was defined in (13).
In the scenario where users have identical channels toward the base station, one can combine the user identification and decoding stages, i.e., one can deduce the user's identity from the decoded message. This appears as the combination $R+\nu$ in the bounds in (20).
One could also interpret the result in Theorem 1 as follows. There is one single "big codebook" with $MK$ codewords of length n. User 1 uses codewords with indices 1 through M; User 2 uses codewords with indices $M+1$ through $2M$, and so forth, concluding with User K using codewords with indices $(K-1)M+1$ through $KM$. In each active slot, the receiver decodes the "big codebook"; reliable decoding imposes $\log(MK)/n = R+\nu < I(P,W)$. However, since in our setting we need to decode K codewords (one per active slot, as opposed to just one), in addition to $R+\nu < I(P,W)$ we also need the bound $\log(K)/n = \nu < E_r(\nu+R, P, W)$. Comparing the third and fourth bounds of (20) with the per-user probability of error achievability region in [3], we can see an additional ν penalty as a result of the joint probability of error criterion. However, we do not claim (via a matching converse bound) that the rate penalty (at least in the form that we prove it) is conclusive.
We should also note that for $\nu < \frac{\alpha}{2}$ (first bound in (20)), with probability approaching one as the blocklength n goes to infinity, the users transmit in distinct blocks. Hence, regardless of the achievability scheme, the probability of error under a hypothesis where a collision has occurred is vanishing as the blocklength goes to infinity. As a result, in analyzing the joint probability of error of our achievability scheme, we need to focus on the hypotheses in which users do not collide. In our achievability scheme, we use a two-stage decoder which first synchronizes the users (i.e., finds the location of the active blocks) and then decodes the users' messages (which also identifies the users' identities). The complete proof of Theorem 1 is given in Appendix A.
Remark 2.
The existence of an error exponent bound on the occupancy exponent (i.e., ν) is intuitive (second argument in the second condition in (20)). Even if the decoder has side information about the location of noisy blocks (i.e., blocks with only idle symbols from all users), it still has to decode all the messages within the active blocks. Due to the independence of the output in different blocks, and by the fact that there cannot be coding across blocks, the probability of decoding error over the active blocks is asymptotically equal to the sum of the decoding error probabilities in each active block. This puts a restriction on ν based on the best decay rate for the decoding error in each block, i.e., the error exponent.

3.2. Users with Different Choice of Channels

We now move on to a more general setting in which we remove the restriction that all users have identical channels towards the base station (when only one is active). Theorem 3 gives an achievable region when we allow the users’ channels (when they are the sole active user toward the base station) to be chosen from a set of conditional distributions of polynomial size in the blocklength n.
In this scenario, one may use the users' statistics at the channel output to identify the set of users who share the same channel. In this regard, before introducing Theorem 3, we introduce Theorem 2, in which we characterize the relation between the number of distributions and their pairwise distances required for them to be identifiable. The proof of Theorem 2, given in Appendix B, is itself based on a novel graph-theoretic technique to analyze the optimal Maximum Likelihood (ML) decoder, which is of independent interest.
Theorem 2.
A sequence of distributions $\mathcal{P} = \{P_1,\ldots,P_{T_n}\}$ is identifiable iff
$\lim_{n\to\infty} \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} = 0, \quad (22)$
where $B(P_i,P_j)$ is the Bhattacharyya distance, which was defined in (11).
Remark 3.
From Theorem 2 one can see that when the number of distributions $T_n$ is a constant or grows polynomially with n (with the pairwise Bhattacharyya distances bounded away from zero), the sequence of distributions is always identifiable; by the problem formulation in Section 2.3, this implies that the probability of error in the identification problem decays to zero as the blocklength n goes to infinity.
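To illustrate Remark 3 numerically, the sketch below (ours; the separation value and the growth rate of $T_n$ are illustrative assumptions) bounds the pairwise sum of Theorem 2 when the minimum pairwise Bhattacharyya distance is bounded away from zero: with $T_n$ polynomial in n, the sum is at most $\binom{T_n}{2} e^{-2nB_{\min}}$, which vanishes.

```python
# A sketch (ours) of the identifiability criterion of Theorem 2: with an
# assumed minimum pairwise Bhattacharyya distance B_min > 0 and T_n = poly(n),
# sum_{i<j} e^{-2 n B(P_i,P_j)} <= T_n (T_n - 1)/2 * exp(-2 n B_min) -> 0.
import numpy as np

def pairwise_sum_bound(T, n, B_min):
    return T * (T - 1) / 2 * np.exp(-2 * n * B_min)

B_min = 0.05    # assumed minimum separation between the distributions
for n in (100, 200, 400):
    print(n, pairwise_sum_bound(T=n ** 2, n=n, B_min=B_min))  # decays to 0
```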
Using the criterion for the identifiability of a massive number of distributions in Theorem 2, we return to the SAS-DM-MAC problem. We adapt the concept of identification of distributions to identify the users' statistics at the output with an optimal ML test. We use the result of Theorem 2 in the derivation of Theorem 3.
Theorem 3.
For a SAS-DM-MAC where $Q_i(y|x) = W_{c(i)}(y|x)$ is the channel for user $i\in[K_n]$, with $\mathcal{W} := \bigcup_{i\in[K_n]} \{W_{c(i)}\}$ and $|\mathcal{W}| = \mathrm{poly}(n)$, the region
$\liminf_{n\to\infty} \mathcal{A}_n \quad (23)$
is achievable, where $\mathcal{A}_n$ is defined as
$\mathcal{A}_n := \bigcup_{\epsilon>0} \bigcup_{P_1,\ldots,P_{|\mathcal{W}|} \in \mathcal{P}_{\mathcal{X}}^{|\mathcal{W}|}} \bigcup_{\lambda_1,\ldots,\lambda_{|\mathcal{W}|} \in [0,1]^{|\mathcal{W}|}} \bigcap_{j\in[|\mathcal{W}|]} \left\{ \begin{array}{l} \nu_j + \epsilon < \frac{\alpha}{2} \\ \nu_j + \epsilon < \min\left( D([P_jW_j]_{\lambda_j} \| [P_jW_j]),\ E_r(R+\nu_j, P_j, W_j) \right) \\ \alpha + \epsilon < D([P_jW_j]_{\lambda_j} \| Q_\star) \\ 0 < \inf_{i\in[|\mathcal{W}|],\, i\ne j} B([P_iW_i], [P_jW_j]) \\ R + \nu_j + \epsilon < I(P_j, W_j) \end{array} \right\}, \quad (24)$
and where $E_r$ is defined in (13), $B(\cdot,\cdot)$ was defined in (11), and
$\nu_j := \frac{1}{n}\log(N_j), \qquad N_j := \sum_{i=1}^{K_n} \mathbb{1}\{Q_i = W_j\} \ \text{ with } \ \sum_{j=1}^{|\mathcal{W}|} N_j = K_n, \quad (25)$
$[P_jW_j]_\lambda(y) := \frac{[P_jW_j](y)^\lambda\, Q_\star(y)^{1-\lambda}}{\sum_{y'\in\mathcal{Y}} [P_jW_j](y')^\lambda\, Q_\star(y')^{1-\lambda}}. \quad (26)$
The proof of Theorem 3 is given in Appendix F.
In the proof of Theorem 3 we propose a three-stage decoder which performs the tasks of synchronization, identification, and decoding sequentially. It is also worth noting that, since the number of users is exponential in the blocklength while the number of channels is only polynomial in the blocklength, there must exist at least one channel that occupies exponentially many blocks. Hence, it is not surprising to see bounds on the occupancy level of each channel (i.e., $\nu_j$) in (24).
Remark 4.
We should note that (as is apparent in (24)), as long as the number of users' channels increases with the blocklength n, our optimization space over all users also increases with n, and hence our achievability and converse bounds depend on the exact value of n (in contrast to, e.g., (20) in Theorem 1). However, in order to compute these bounds, we only need to optimize with respect to $P_i$, $i\in[K_n]$, as opposed to $P^n$ in conventional n-letter capacity expressions.

3.3. General Case

Now we investigate a SAS-DM-MAC with no restrictions on the channels of the users toward the base station. The key ingredient in our analysis is a novel way to bound the probability of error, reminiscent of Gallager's error exponent analysis [14]. We show an achievability scheme that yields positive lower bounds on R and on ν. This proves that reliable transmission with an exponential number of users, an exponential asynchronous exponent, and a joint error probability criterion is possible. We use an ML decoder sequentially in each block to identify the active user and its message.
Theorem 4.
For a SAS-DM-MAC the following region is achievable
$\liminf_{n\to\infty} \bigcup_{\epsilon>0} \bigcup_{P_1,\ldots,P_{K_n}\in\mathcal{P}_{\mathcal{X}}^{K_n}} \bigcap_{i\in[K_n]} \left\{ \begin{array}{l} \nu < \frac{\alpha}{2} \\ \nu + R + \epsilon < C(P_i,Q_i,P_i,Q_i) \\ 2\nu + R + \epsilon < \inf_{j\in[K_n],\, j\ne i} C(P_i,Q_i,P_j,Q_j) \\ \alpha + \nu + R + \epsilon < C(\star, Q_\star, P_i, Q_i) \end{array} \right\}. \quad (27)$
The proof of Theorem 4 is given in Appendix G.
Remark 5.
Note that one can show the following (see Appendix H):
$C(P_i,Q_i,P_i,Q_i) = -\log \sum_{x,x',y} P_i(x)\, P_i(x') \sqrt{Q_i(y|x)\, Q_i(y|x')}, \quad (28a)$
$C(\star,Q_\star,P_i,Q_i) \le I(P_i,Q_i) + D([P_iQ_i] \| Q_\star), \quad (28b)$
where (28a) is the special case $E_0(R,P,Q,\rho = 1)$ of (15), and the right-hand side of (28b) is the bound on $\alpha + R$ in the single-user strongly asynchronous channel [15]. From (28) and (27) we thus see that
$\nu < E_r(R,P_i,Q_i), \qquad \alpha + R < I(P_i,Q_i) + D([P_iQ_i] \| Q_\star) - \nu,$
which shows that the second and fourth bounds in (27) are, respectively, less than the synchronous channel error exponent and the point-to-point strongly asynchronous channel capacity.

3.4. Example

Consider the SAS-DM-MAC with input–output relationship $Y = \left( \bigoplus_{i\in[K_n]} X_i \right) \oplus Z$ with $Z \sim \text{Bernoulli}(\delta)$ for some $\delta\in(0,1/2)$. In our notation,
$Q(y|x) = P[X_i \oplus Z = y \,|\, X_i = x] = P[Z = x\oplus y] = \begin{cases} 1-\delta & x\oplus y = 0 \ (\text{i.e., } x = y) \\ \delta & x\oplus y = 1 \ (\text{i.e., } x\ne y) \end{cases}.$
Assume that the input distribution used is $P = \text{Bernoulli}(p)$ for some $p\in(0,1/2)$. The achievability region in Theorem 1 is
$\bigcup_{p\in[0,\frac12]} \bigcup_{\lambda\in[0,1]} \left\{ \begin{array}{l} \nu < \alpha/2 \\ \nu < \min\left( p\cdot d(\epsilon_\lambda \| \delta),\ \min_{q\in[0,1]} d(\delta\|q) + \max\{0,\ d(q\|\tfrac12) - R - \nu\} \right) \\ \alpha + R + \nu < p\cdot d(\epsilon_\lambda \| 1-\delta) \\ R + \nu < h(p * \delta) - h(\delta) \end{array} \right\}, \quad (29)$
where $\epsilon_\lambda := \frac{\delta^\lambda (1-\delta)^{1-\lambda}}{\delta^\lambda (1-\delta)^{1-\lambda} + (1-\delta)^\lambda \delta^{1-\lambda}}$.
Moreover, by assuming $P_i = \text{Bernoulli}(p_i)$ for all $i\in[K_n]$, we can show that the optimal t in $C(P_i,Q,P_j,Q) = \sup_t \mu_{i,j}(t)$ is $t = 1/2$, and hence the achievability region in Theorem 4 is
$\liminf_{n\to\infty} \bigcup_{\epsilon>0} \bigcup_{P_1,\ldots,P_{K_n}\in\mathcal{P}_{\mathcal{X}}^{K_n}} \bigcap_{i\in[K_n]} \left\{ \begin{array}{l} \nu < \frac{\alpha}{2} \\ \nu + R + \epsilon < B(P_i,Q) = g(p_i * p_i,\, \delta) \\ 2\nu + R + \epsilon < \inf_{i\ne j} C(P_i,Q,P_j,Q) = \inf_{i\ne j} g(p_i * p_j,\, \delta) \\ \alpha + \nu + R + \epsilon < C(\star, Q_\star, P_i, Q) = g(p_i,\, \delta) \end{array} \right\}, \quad (30)$
where
$g(a,b) := -\log\left( 1 - a + 2a\sqrt{b(1-b)} \right).$
Finally, by symmetry, we can see that the optimal $p_i = \frac12$, $\forall i\in[K_n]$, and hence $g(\frac12,\delta) = -\log\left( 1/2 + \sqrt{\delta(1-\delta)} \right) > 0$. Thus, on the BSC(δ), strictly positive rates R and occupancy exponents ν are achievable. In this regard, the region in Theorem 4 reduces to
$\alpha + \nu + R < -\log\left( 1/2 + \sqrt{\delta(1-\delta)} \right).$
The achievable region (29) for $(\alpha,\nu,R)$ is shown in Figure 2a. In addition, the achievable region for $(\alpha,\nu,R)$ under the scheme of (30) is plotted in Figure 2b for comparison. Figure 2 shows that the achievable scheme in Theorem 1 indeed results in a larger achievable region than the one in Theorem 4.
The fact that the achievability region of Theorem 1 is larger than that of Theorem 4 for identical channels is not surprising. In Theorem 1 we separated the synchronization and decoding steps, whereas in Theorem 4 synchronization and codeword decoding were done at the same time, sequentially for each block. The sequential block decoding step results in a smaller achievability region in Theorem 4.
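For reference, the corner value in the reduced region above is easy to evaluate; the sketch below (ours) computes $g(a,b)$ and the bound $\alpha+\nu+R < -\log(1/2+\sqrt{\delta(1-\delta)})$ for a few crossover probabilities.

```python
# A sketch (ours) evaluating g(a,b) = -log(1 - a + 2 a sqrt(b (1 - b)))
# and the reduced Theorem 4 region alpha + nu + R < g(1/2, delta).
import numpy as np

def g(a, b):
    return -np.log(1 - a + 2 * a * np.sqrt(b * (1 - b)))

for delta in (0.01, 0.05, 0.11):
    bound = g(0.5, delta)   # = -log(1/2 + sqrt(delta (1 - delta)))
    assert np.isclose(bound, -np.log(0.5 + np.sqrt(delta * (1 - delta))))
    print(f"delta = {delta}: alpha + nu + R < {bound:.4f}")
```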

3.5. Converse on the Capacity Region of the SAS-DM-MAC

Thus far, we have provided achievable regions for the SAS-DM-MAC when different users have identical channels; when their channels belong to a set of channels with size that grows polynomially in the blocklength; and without any restriction on the users’ channels. Theorem 5 next provides a converse to the capacity region of the general SAS-DM-MAC.
Theorem 5.
For the SAS-DM-MAC with $\nu < \alpha/2$, the following region is impermissible:
$\liminf_{n\to\infty} \bigcup_{\epsilon>0} \bigcup_{P_1,\ldots,P_{K_n}\in\mathcal{P}_{\mathcal{X}}^{K_n}} \bigcup_{\lambda_1,\ldots,\lambda_{K_n}\in[0,1]^{K_n}} \left\{ \begin{array}{l} \nu > \frac{1}{K_n}\sum_{i=1}^{K_n} D(Q_i^{\lambda_i} \| Q_i \,|\, P_i) + \epsilon \\ \alpha > \frac{1}{K_n}\sum_{i=1}^{K_n} D(Q_i^{\lambda_i} \| Q_\star \,|\, P_i) - (1-\bar{r}_n)(\nu+R) + \epsilon \\ \nu > \frac{1}{K_n}\sum_{i=1}^{K_n} E_{sp}(R,P_i,Q_i) + \epsilon \\ R > I(P_i,Q_i) + \epsilon \end{array} \right\}, \quad (31)$
where $\bar{r}_n$ is the infimum probability of error, over all estimators T, in distinguishing the different hypotheses $Q_i^{\lambda_i}(y^n | x_i^n(m))$, $i\in[K_n]$, $m\in[M_n]$, i.e.,
$\bar{r}_n := \inf_T \frac{1}{K_n M_n} \sum_{i=1}^{K_n}\sum_{m=1}^{M_n} Q_i^{\lambda_i}\left( T \ne (i,m) \,\big|\, x_i^n(m) \right).$
Proof. 
The complete proof is given in Appendix I. □
The first bound in (31) corresponds to synchronization, while the second and third bounds correspond to decoding. Also, as it was noted before in Remark 2, one can again see that the value of ν is restricted by the decoding error exponent.

3.6. Converse on the Number of Users in a SAS-DM-MAC

In the previous sections, we restricted ourselves to the regime $\nu < \frac{\alpha}{2}$. However, an interesting question is how large ν can be. Theorem 6 provides a converse bound on the value of ν, showing that for $\nu > \alpha$, not even reliable synchronization is possible.
Theorem 6.
For a SAS-DM-MAC with $\nu > \alpha$, reliable synchronization is not possible, i.e., even with $M = 1$, one has $P_e^{(n)} \not\to 0$.
The complete proof can be found in Appendix J.

4. Discussion and Conclusions

In this paper we studied the Strongly Asynchronous, Slotted, Discrete Memoryless, Massive Access Channel (SAS-DM-MAC), where $K_n := e^{n\nu}$ different users transmit a randomly selected message among $M_n := e^{nR}$ ones within a strongly asynchronous window of length $A_n := e^{n\alpha}$ blocks of n channel uses each. We found inner and outer bounds on the achievable $(R,\alpha,\nu)$ tuples. Our analysis is based on a joint probability of error, in which we required all users' messages and identities to be jointly correctly decoded at the base station. Our results are focused on the regime $\nu < \frac{\alpha}{2}$, where the probability of user collision vanishes. We proved in Theorem 6 that for the regime $\nu > \alpha$, not even synchronization is possible. We now discuss some of the difficulties in analyzing the region $\frac{\alpha}{2} \le \nu \le \alpha$.
For the region $\nu < \frac{\alpha}{2}$, with probability $\frac{A_n(A_n-1)\cdots(A_n-K_n+1)}{(A_n)^{K_n}} \to 1$ as the blocklength $n\to\infty$, the users transmit in distinct blocks. Hence, in analyzing the probability of error of our achievable schemes, we only need to bound the error under the hypotheses that users do not collide. For $\frac{\alpha}{2} \le \nu \le \alpha$, we need to consider the events where collisions occur. In particular, based on Lemma 1 (proved in Appendix L), for $\frac{\alpha}{2} \le \nu \le \alpha$, the probability of every arrangement of users is itself vanishing in the blocklength.
Lemma 1.
For the region $\frac{\alpha}{2} \le \nu \le \alpha$, the non-colliding arrangement of users has the highest probability among all possible arrangements; yet, the probability of this event is also vanishing as the blocklength n goes to infinity.
As a consequence of Lemma 1, one needs to propose an achievable scheme that accounts for several arrangements (the number of which is non-trivial to characterize) and collisions of users, and drives the probability of error in these arrangements to zero. It is also worth noting that the number of possible hypotheses is doubly exponential in the blocklength. Finally, it is worth emphasizing that the authors in [16] can get to $\nu \le \alpha$ since they require the recovery of the messages of a large fraction of the users, and require the per-user probability of error to be vanishing (rather than the overall or joint probability of error, which is a much stronger condition). To prove whether or not strictly positive $(R,\alpha,\nu)$ are possible in the region $\frac{\alpha}{2} \le \nu \le \alpha$ with vanishing joint probability of error is a challenging open problem.

Author Contributions

Conceptualization, S.S., D.T. and N.D.; formal analysis, S.S.; writing—original draft preparation, S.S.; writing—review and editing, S.S., D.T. and N.D.; supervision, D.T. and N.D.; funding acquisition, D.T. and N.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by NSF under award 1900911. The contents of this article are solely the responsibility of the authors and do not necessarily represent the official views of the NSF.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Theorem 1

We first introduce the codebook generation and then introduce and analyze the decoder.

Appendix A.1. Codebook Generation

Let $K_n = e^{n\nu}$ be the number of users, $A_n = e^{n\alpha}$ be the number of blocks, and $M_n = e^{nR}$ be the number of messages. Each user $i\in[K_n]$ generates a constant composition codebook with composition P by drawing each message's codeword uniformly and independently at random from the P-type set $T(P)$ (recall the definition in (5)). The codeword of user $i\in[K_n]$ for message $m\in[M_n]$ is denoted by $x_i^n(m)$.

Appendix A.2. The Decoder and Analysis of the Probability of Error

A two-stage decoder is used, to first synchronize the users and then decode their messages (which also identifies the users' identities). The probability of error of this two-stage decoder is thus bounded by
$P[\text{Error}] \le P[\text{Error} \,|\, \text{No-collision}] + P[\text{collision}]$
$\le P[\text{Synchronization error} \,|\, \text{No-collision}] \quad (A1a)$
$+ P[\text{Decoding error} \,|\, \text{No synchronization error, No-collision}] \quad (A1b)$
$+ P[\text{collision}]. \quad (A1c)$
Synchronization Step. We perform a sequential likelihood ratio test as follows. Fix a threshold
$T \in \left[ -D(Q_\star \| W \,|\, P),\ D(W \| Q_\star \,|\, P) \right]. \quad (A2)$
For each block $j\in[A_n]$, if there exists any message $m\in[M_n]$ for any user $i\in[K_n]$ such that
$L(x_i^n(m), y_j^n) := \frac{1}{n}\log\frac{W(y_j^n \,|\, x_i^n(m))}{Q_\star(y_j^n)} \ge T, \quad (A3)$
then declare that block j is an 'active' block, and an 'idle' block otherwise. Let
$H^{(1)} := H_{(1,1),(2,1),\ldots,(K_n,1)} \quad (A4)$
be the hypothesis that user i (for each $i\in[K_n]$) is active in block i and sends message $m_i = 1$. As explained earlier (in the paragraph after Theorem 1), the probability of collision in (A1c) is vanishing, and we can focus on calculating (A1a) and (A1b). For (A1a), by the symmetry of the different no-collision hypotheses, the average probability of synchronization error (over all such hypotheses) is upper bounded by
$P[\text{Synchronization error} \,|\, H^{(1)}] \le \sum_{j=1}^{K_n} P\left[ \bigcap_{i=1}^{K_n}\bigcap_{m=1}^{M_n} \left\{ L(x_i^n(m), Y_j^n) < T \right\} \Big| H^{(1)} \right] + \sum_{j=K_n+1}^{A_n} P\left[ \bigcup_{i=1}^{K_n}\bigcup_{m=1}^{M_n} \left\{ L(x_i^n(m), Y_j^n) \ge T \right\} \Big| H^{(1)} \right]$
$\le \sum_{j=1}^{K_n} P\left[ L(x_j^n(1), Y_j^n) < T \,|\, H^{(1)} \right] + \sum_{j=K_n+1}^{A_n} \sum_{i=1}^{K_n}\sum_{m=1}^{M_n} P\left[ L(x_i^n(m), Y_j^n) \ge T \,|\, H^{(1)} \right]$
$\le e^{n\nu}\, e^{-nD(W_\lambda \| W | P)} + e^{n(\alpha+\nu+R)}\, e^{-nD(W_\lambda \| Q_\star | P)}, \quad (A5)$
where (A5) can be derived as in [17] (Chapter 11). The upper bound on the probability of synchronization error in (A5) vanishes as n goes to infinity if the second and third bounds in (20) hold.
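The following toy sketch (ours; a BSC with idle symbol 0 and all parameter values are illustrative assumptions) runs the threshold test (A3) on one active and one idle block.

```python
# A toy sketch (ours) of the threshold test (A3) on a BSC(delta) with idle
# symbol 0, so the all-idle output law is Q_star = Bernoulli(delta).
import numpy as np

rng = np.random.default_rng(2)
n, delta = 200, 0.1
W = np.array([[1 - delta, delta], [delta, 1 - delta]])  # W(y|x)
Q_star = np.array([1 - delta, delta])                   # all-idle output law

def llr(x, y):
    """(1/n) log W(y^n|x^n) / Q_star(y^n)."""
    return np.mean(np.log(W[x, y]) - np.log(Q_star[y]))

x = rng.integers(0, 2, size=n)                          # one codeword
y_active = np.where(rng.random(n) < delta, 1 - x, x)    # block carrying x
y_idle = (rng.random(n) < delta).astype(int)            # pure-noise block

T = 0.3   # any threshold strictly between the idle and active LLR means
print("active block flagged:", llr(x, y_active) >= T)   # True w.h.p.
print("idle block flagged:  ", llr(x, y_idle) >= T)     # False w.h.p.
```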
Decoding Step. In this stage we evaluate the decoding error under the no-collision, no-synchronization-error hypothesis using a maximum likelihood decoder. Since in each active block we have to distinguish among $e^{n(R+\nu)}$ messages, we can upper bound the probability of decoding error in (A1b) as
$P[\text{Decoding error} \,|\, \text{No synchronization error, No-collision}] \le \sum_{i=1}^{K_n} P[\text{Decoding error in block } i] \le e^{n\nu}\, e^{-nE_r(R+\nu, P, W)}, \quad (A6)$
where (A6) is by [14] and $E_r$ is defined in (13). This retrieves the second bound in (20). Finally, since each active block behaves as a synchronous channel in which we have to distinguish among $e^{n(R+\nu)}$ messages, we retrieve the fourth bound in (20).

Appendix B. Proof of Theorem 2

We now introduce a graph theoretic Lemma, the proof of which is given in Appendix C.
Lemma A1.
In a complete graph $K_k(w_{1,2},\ldots,w_{k,1})$, for the set of cycles of length r, $\mathcal{C}_k(r)$, we have
$\frac{1}{N_k(r)} \sum_{c\in\mathcal{C}_k(r)} G(c) \le \left( \frac{w_{1,2}^2 + \cdots + w_{k,1}^2}{\binom{k}{2}} \right)^{\frac{r}{2}},$
where $\mathcal{C}_k(r)$ and $N_k(r)$ are, respectively, the set and the number of all cycles of length r in the complete graph $K_k$.
Next, we use Lemma A1 to derive upper and lower bounds on the probability of error.

Appendix B.1. Upper Bound on the Probability of Identification Error

We use the optimal ML decoder, which minimizes the average probability of error, given by
$\hat\sigma(x_1^n,\ldots,x_{T_n}^n) := \arg\max_{\sigma\in\mathcal{S}_{T_n}} \sum_{i=1}^{T_n} \log P_{\sigma_i}(x_i^n), \quad (A7)$
where $P_{\sigma_i}(x_i^n) = \prod_{t=1}^{n} P_{\sigma_i}(x_{i,t})$. Let $\hat\Sigma = \hat\sigma(X_1^n,\ldots,X_{T_n}^n)$ be the estimate of the permutation of the distributions of $X_i^n$, $i\in[T_n]$. Assuming without loss of generality that the true permutation is the identity, the average probability of error associated with the ML decoder can also be written as
$P_e^{(n)} = P[\hat\Sigma \ne \mathrm{id}_{[T_n]}] = P\left[ \bigcup_{\hat\sigma \ne \mathrm{id}_{[T_n]}} \{\hat\Sigma = \hat\sigma\} \right] = P\left[ \bigcup_{r=2}^{T_n} \bigcup_{\hat\sigma :\, \sum_{i=1}^{T_n} \mathbb{1}\{\hat\sigma_i \ne i\} = r} \{\hat\Sigma = \hat\sigma\} \right] \quad (A8a)$
$= P\left[ \bigcup_{r=2}^{T_n} \bigcup_{\hat\sigma :\, \sum_{i=1}^{T_n} \mathbb{1}\{\hat\sigma_i \ne i\} = r} \left\{ \sum_{i=1}^{T_n} \log\frac{P_{\hat\sigma_i}(X_i^n)}{P_i(X_i^n)} \ge 0 \right\} \right], \quad (A8b)$
where P over $(X_1^n,\ldots,X_{T_n}^n)$ is the measure under which $X_i^n \overset{\text{i.i.d.}}{\sim} P_i$, $i\in[T_n]$, and where (A8a) is due to the requirement that each sequence is distributed according to a distinct distribution, so that the number of incorrect distributions ranges over $[2:T_n]$. In order to avoid counting the same error events multiple times in the union over $\hat\sigma :\, \sum_{i=1}^{T_n} \mathbb{1}\{\hat\sigma_i \ne i\} = r$, we incorporate a graph-theoretic interpretation of the event $\sum_{i=1}^{T_n} \mathbb{1}\{\hat\Sigma_i \ne i\} = r$, i.e., the event that we have identified r distributions incorrectly. It is apparent from Figure A1a that the event that r distributions are incorrectly identified forms either a single cycle of length r ($r = 4$ in Figure A1a) or multiple cycles with total length equal to r (2 cycles of length 2 in Figure A1b). Hence, it is straightforward to show that we can replace the union over $\hat\sigma :\, \sum_{i=1}^{T_n} \mathbb{1}\{\hat\sigma_i \ne i\} = r$ in (A8b) with the union over cycles of length r in the complete graph $K_{T_n}$, i.e., over $c\in\mathcal{C}_{T_n}(r)$.
Figure A1. r incorrectly identified distributions form cycles with total length r.
Hence, we can re-write the probability of error in (A8b) as
$P_e^{(n)} = P\left[ \bigcup_{r=2}^{T_n} \bigcup_{c\in\mathcal{C}_{T_n}(r)} \left\{ \sum_{i\in c(V)} \log\frac{P_{c(V)(i)}(X_i^n)}{P_i(X_i^n)} \ge 0 \right\} \right] \le \sum_{r=2}^{T_n} \sum_{c\in\mathcal{C}_{T_n}(r)} e^{-n\sum_{i\in c(V)} B(P_{c(V)(i)},\, P_i)}, \quad (A9)$
where the inequality in (A9) follows from Appendix D. Next, we define $w_{i,j} := e^{-nB(P_i,P_j)}$ to be the edge weight between vertices $(i,j)$ in the complete graph $K_{T_n}(w_{1,2},\ldots,w_{T_n,1})$ shown in Figure A2.
Figure A2. Complete graph $K_{T_n}$ with edge weight $e^{-nB(P_i,P_j)}$ for every pair of vertices $i \ne j \in [T_n]$.
With this definition of the complete graph $K_{T_n}$, it is easy to see that $G(c) = e^{-n\sum_{i\in c(V)} B(P_{c(V)(i)},\, P_i)}$ is the gain of cycle c in the complete graph $K_{T_n}$. Based on this, we re-write the probability of error in (A9) as
$P_e^{(n)} \le \sum_{r=2}^{T_n} \sum_{c\in\mathcal{C}_{T_n}(r)} G(c) \quad (A10a)$
$\le \sum_{r=2}^{T_n} N_{T_n}(r) \binom{T_n}{2}^{-\frac{r}{2}} \left( w_{1,2}^2 + \cdots + w_{T_n,1}^2 \right)^{\frac{r}{2}} \quad (A10b)$
$\le \sum_{r=2}^{T_n} 4^r \left( \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} \right)^{\frac{r}{2}} \quad (A10c)$
$\le \frac{16 \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)}}{1 - 4\sqrt{\sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)}}}, \quad (A10d)$
where r enumerates the number of incorrectly identified distributions, (A10b) is by Lemma A1, and (A10c) is by Fact 1 (see Appendix C) and
$N_{T_n}(r) \binom{T_n}{2}^{-\frac{r}{2}} = \frac{\binom{T_n}{r}(r-1)!/2}{\binom{T_n}{2}^{\frac{r}{2}}} \le 4^r.$
Finally, (A10d) goes to zero as n goes to infinity if
$\lim_{n\to\infty} \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} = 0.$
Remark A1.
As a result of Lemma A1, it can be seen from (A10c) that the sum of the probabilities that $r \ge 3$ distributions are incorrectly identified is dominated by the probability that only $r = 2$ distributions are incorrectly identified. This shows that the most probable error event is indeed an error event with two wrong distributions.

Appendix B.2. Lower Bound on the Probability of Identifiability Error

For our converse, we use the optimal ML decoder, and as a lower bound to the probability of error in (A8b) we only consider the set of error events with exactly two incorrect distributions, i.e., the set of events with $r = 2$. In this case we have
$P_e^{(n)} \ge P\left[ \bigcup_{1\le i<j\le T_n} \left\{ \log\frac{P_i(X_j^n)}{P_j(X_j^n)} + \log\frac{P_j(X_i^n)}{P_i(X_i^n)} \ge 0 \right\} \right] \ge \frac{\left( \sum_{1\le i<j\le T_n} P[\xi_{i,j}] \right)^2}{\sum_{\substack{(i,j)\ne(l,k):\\ i\ne j,\ l\ne k}} P[\xi_{i,j}, \xi_{k,l}]}, \quad (A11)$
where P is taken over $(X_1^n,\ldots,X_{T_n}^n)$ with $X_i^n \overset{\text{i.i.d.}}{\sim} P_i$, $i\in[T_n]$, where the last inequality in (A11) is by [18], and finally
$\xi_{i,j} := \left\{ \log\frac{P_i}{P_j}(X_j^n) + \log\frac{P_j}{P_i}(X_i^n) \ge 0 \right\}. \quad (A12)$
We prove in Appendix E that a lower bound on $P_e^{(n)}$ is given by
$P_e^{(n)} \ge \frac{\left( \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} \right)^2}{2\sum_{1\le i<j<k\le T_n} e^{-nB(P_i,P_j) - nB(P_i,P_k) - nB(P_k,P_j)} + \left( \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} \right)^2} \quad (A13a)$
$\ge \frac{\left( \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} \right)^2}{8\left( \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} \right)^{\frac{3}{2}} + \left( \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} \right)^2} \quad (A13b)$
$= \frac{\sqrt{\sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)}}}{8 + \sqrt{\sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)}}}, \quad (A13c)$
where (A13b) is by Lemma A1. As can be seen from (A13c), if $\lim_{n\to\infty} \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} \ne 0$, the probability of error is bounded away from zero. As a result,
$\lim_{n\to\infty} \sum_{1\le i<j\le T_n} e^{-2nB(P_i,P_j)} = 0$
must hold, which also matches our upper bound on the probability of error in (A10d).

Appendix C. Proof of Lemma A1

We first consider the case that r is an even number and prove that
$r n^{\frac{r}{2}-1} \left( G(c_1) + \cdots + G(c_N) \right) \le \frac{Nr}{n} \left( a_1^2 + \cdots + a_n^2 \right)^{\frac{r}{2}}, \quad (A14)$
where as shorthand notation we define and use $N := N_k(r)$, $n := \binom{k}{2}$, and $(w_{1,2},\ldots,w_{k,1}) = (a_1,\ldots,a_n)$.
Our goal is to expand the right hand side (RHS) of (A14) such that all elements have coefficient 1. Then, we parse these elements into N groups (details provided later) such that using the AM-GM inequality (defined in Table 1) on each group, we obtain one of the N terms on the LHS of (A14). Before stating the rigorous proof and to shed some light onto the proof strategy, we provide an example for the graph with k = 4 vertices shown in Figure A3. In this example, we consider the Lemma for r = 4 cycles (for which N = 3 ).
Figure A3. A complete graph with 4 vertices.
We may expand the RHS of (A14) as
$2\left( a_1^2 + \cdots + a_6^2 \right)^2 = \Theta_1 + \Theta_2 + \Theta_3,$
$\Theta_1 = \left\{ a_1^4 + a_2^4 + a_3^4 + a_4^4 + 2a_1^2a_3^2 + 2a_2^2a_4^2 + 4a_1^2a_2^2 + 4a_1^2a_4^2 + 4a_2^2a_3^2 + 4a_3^2a_4^2 \right\},$
$\Theta_2 = \left\{ a_1^4 + a_6^4 + a_3^4 + a_5^4 + 2a_5^2a_6^2 + 2a_1^2a_3^2 + 4a_1^2a_6^2 + 4a_1^2a_5^2 + 4a_3^2a_6^2 + 4a_3^2a_5^2 \right\},$
$\Theta_3 = \left\{ a_4^4 + a_5^4 + a_2^4 + a_6^4 + 2a_5^2a_6^2 + 2a_2^2a_4^2 + 4a_4^2a_5^2 + 4a_4^2a_6^2 + 4a_2^2a_5^2 + 4a_2^2a_6^2 \right\}$
(each $\Theta_i$ is a sum of 24 unit-coefficient elements; repeated elements are collected with their multiplicities here). If we use the AM-GM inequality on $\Theta_1$, $\Theta_2$, and $\Theta_3$, we obtain the lower bounds $24(a_1a_2a_3a_4)$, $24(a_1a_6a_3a_5)$, and $24(a_4a_5a_2a_6)$, respectively, where $r n^{\frac{r}{2}-1} = 24$, and hence (A14) holds in this example.
We proceed to prove Lemma A1 for arbitrary k and (even) $r \ge 2$. We propose the following scheme to group the elements on the RHS of (A14) and then prove that this grouping indeed leads to the claimed inequality.

Grouping Scheme

For each cycle $c_i = \{a_{i_1},\ldots,a_{i_r}\}$, we need a group of elements, $\Theta_i$, from the RHS of (A14). In this regard, we consider all possible subsets of the edges of cycle $c_i$ with 1 to $\frac{r}{2}$ elements (e.g., $\{a_{i_1}\}$, $\{a_{i_1},a_{i_2}\}$, ..., $\{a_{i_1},\ldots,a_{i_{r/2}}\}$, ...). For each one of these subsets, we collect the elements from the RHS of (A14) that are products of the members of that subset. For example, for the subset $\{a_{i_1},a_{i_2},a_{i_3}\}$, we consider elements of the form $a_{i_1}^{n_{i_1}} a_{i_2}^{n_{i_2}} a_{i_3}^{n_{i_3}}$ for all possible $n_{i_1},n_{i_2},n_{i_3} > 0$ from the RHS of (A14). However, note that we do not assign all such elements to cycle $c_i$ alone: if there are several cycles of length r that all contain $\{a_{i_1},a_{i_2},a_{i_3}\}$, we assign an equal fraction of the elements of the form $a_{i_1}^{n_{i_1}} a_{i_2}^{n_{i_2}} a_{i_3}^{n_{i_3}}$, $n_{i_1},n_{i_2},n_{i_3} > 0$, to each such cycle (note the symmetry of the strategy over cycles).
We now state some facts, which can be easily verified.
Fact 1. In a complete graph $K_k$, there are $N = N_k(r) = \binom{k}{r}\frac{(r-1)!}{2}$ cycles of length r.
Fact 2. By expanding the RHS of (A14) such that all elements have coefficient 1, we end up with $N_k(r)\, r\, n^{\frac{r}{2}-1}$ elements.
Fact 3. Expanding the RHS of (A14) such that all elements have coefficient 1, and finding their product, yields
$\left( a_1 \times \cdots \times a_n \right)^{\frac{Nr}{n}\, r\, n^{\frac{r}{2}-1}}.$
Fact 4. In the above grouping scheme each element on the RHS of (A14) is counted in exactly one group. Hence, by symmetry and Fact 2, each group is the sum of $r n^{\frac{r}{2}-1}$ elements.
Now, consider any two cycles $c_i = \{a_{i_1},\ldots,a_{i_r}\}$ and $c_j = \{a_{j_1},\ldots,a_{j_r}\}$. Assume that using the above grouping scheme we obtain the groups of elements $\Theta_i$ and $\Theta_j$ (where, by Fact 4, each one is the sum of $r n^{\frac{r}{2}-1}$ elements). If we apply the AM-GM inequality on each of the two groups, we obtain
$\Theta_i \ge r n^{\frac{r}{2}-1} \left( a_{i_1}^{n_{i_1}} \times \cdots \times a_{i_r}^{n_{i_r}} \right)^{\frac{1}{r n^{\frac{r}{2}-1}}}, \qquad \Theta_j \ge r n^{\frac{r}{2}-1} \left( a_{j_1}^{n_{j_1}} \times \cdots \times a_{j_r}^{n_{j_r}} \right)^{\frac{1}{r n^{\frac{r}{2}-1}}},$
where $\prod_{t=1}^{r} a_{i_t}^{n_{i_t}}$ is the product of the elements in $\Theta_i$. By the symmetry of the grouping scheme over different cycles, it is obvious that $n_{i_t} = n_{j_t}$, $\forall t\in[r]$. Hence $n_{i_t} = n_{j_t} = p_t$, $\forall i,j\in[N]$, i.e., we have
$\Theta_i \ge r n^{\frac{r}{2}-1} \left( a_{i_1}^{p_1} \times \cdots \times a_{i_r}^{p_r} \right)^{\frac{1}{r n^{\frac{r}{2}-1}}}. \quad (A15)$
By the symmetry of the grouping scheme over the elements of each cycle, we also get that $n_{i_k} = n_{i_l} = q_i$, $\forall k,l\in[r]$, i.e.,
$\Theta_i \ge r n^{\frac{r}{2}-1} \left( a_{i_1}^{q_i} \times \cdots \times a_{i_r}^{q_i} \right)^{\frac{1}{r n^{\frac{r}{2}-1}}}. \quad (A16)$
It can be seen from (A15) and (A16) that all elements of all groups have the same power $n_{i_t} = p$, $\forall i\in[N]$, $t\in[r]$, i.e.,
$\Theta_i \ge r n^{\frac{r}{2}-1} \left( a_{i_1}^{p} \times \cdots \times a_{i_r}^{p} \right)^{\frac{1}{r n^{\frac{r}{2}-1}}}.$
Since each element on the RHS of (A14) is assigned to one and only one group, and since $\prod_{t=1}^{r} a_{i_t}^{n_{i_t}} = \prod_{t=1}^{r} a_{i_t}^{p}$ is the product of the elements of each group $\Theta_i$, the product of all elements in $\Theta_1 + \cdots + \Theta_N$ (which is equal to the product of the elements in the expanded version of the RHS of (A14)) is $\prod_{i=1}^{N}\prod_{t=1}^{r} a_{i_t}^{p}$.
In addition, since each $a_i$ appears in exactly $\frac{Nr}{n}$ of the cycles, by Fact 3 and a double counting argument we have
$p \times \frac{Nr}{n} = \frac{Nr}{n}\, r\, n^{\frac{r}{2}-1},$
and hence $p = r n^{\frac{r}{2}-1}$. Hence, the lower bound of the AM-GM inequality on $\Theta_1 + \cdots + \Theta_N$ results in
$r n^{\frac{r}{2}-1} G(c_1) + \cdots + r n^{\frac{r}{2}-1} G(c_{N_k(r)}),$
and the Lemma is proved for even r.
For odd values of r, a problem may arise when using the grouping strategy in its current form when $r < \binom{k}{2}$: some of the terms on the RHS of (A14) may contain products of $a_i$'s that are not present in any of the $G(c_i)$'s. To overcome this, take both sides to the power of $2m$ for the smallest m such that $rm > \binom{k}{2}$. Then each term on the RHS of (A14) is the product of at most $rm$ different $a_i$'s, and on the LHS of (A14) there are $2m$ cycles of length r multiplied together. By our choice of $2m$, all possible combinations of $a_i$'s on the RHS are now present in at least one cycle product on the LHS. Hence, it is straightforward to use the same strategy as for even values of r to prove the Lemma for odd values of r.

Appendix D. Proof of (A9)

The inequality in (A9) follows from
$P\left[ \sum_{i\in c(V)} \log\frac{P_{c(V)(i)}(X_i^n)}{P_i(X_i^n)} \ge 0 \right] \le \exp\left\{ n \inf_t \log E\left[ \prod_{i\in c(V)} \left( \frac{P_{c(V)(i)}(X_i)}{P_i(X_i)} \right)^t \right] \right\} \quad (A17a)$
$\le \exp\left\{ n \sum_{i\in c(V)} \log E\left[ \left( \frac{P_{c(V)(i)}(X_i)}{P_i(X_i)} \right)^{1/2} \right] \right\} = \exp\left\{ -n \sum_{i\in c(V)} B(P_{c(V)(i)},\, P_i) \right\}, \quad (A17b)$
where (A17a) is by the Chernoff inequality. The reason we use $t = 1/2$ in (A17b), instead of finding the exact optimizing t, is that $t = 1/2$ is the optimal choice for $r = 2$ and, as explained in Remark A1, the rest of the error events are dominated by the set of only 2 incorrectly identified distributions. This can be seen as follows, for $X_1^n \overset{\text{i.i.d.}}{\sim} P_1$ and $X_2^n \overset{\text{i.i.d.}}{\sim} P_2$:
$P\left[ \log\frac{P_1(X_2^n)}{P_2(X_2^n)} + \log\frac{P_2(X_1^n)}{P_1(X_1^n)} \ge 0 \right] = \sum_{\hat{P}_1,\hat{P}_2 :\ \sum_{x\in\mathcal{X}} \hat{P}_1(x)\log\frac{P_2(x)}{P_1(x)} + \hat{P}_2(x)\log\frac{P_1(x)}{P_2(x)} \ge 0} \exp\left\{ -nD(\hat{P}_1 \| P_1) - nD(\hat{P}_2 \| P_2) \right\} \doteq e^{-nD(\tilde{P} \| P_1) - nD(\tilde{P} \| P_2)} = e^{-2nB(P_1,P_2)}, \quad (A18)$
where $\tilde{P}$ in the first equality in (A18), by using the Lagrangian method, can be shown to be equal to $\tilde{P}(x) = \frac{\sqrt{P_1(x)P_2(x)}}{\sum_{x'} \sqrt{P_1(x')P_2(x')}}$, and subsequently the second equality in (A18) is proved.
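A quick numeric check (ours; the distributions are arbitrary examples) confirms the identity behind the last step: with $\tilde{P} \propto \sqrt{P_1P_2}$, we indeed get $D(\tilde{P}\|P_1) + D(\tilde{P}\|P_2) = 2B(P_1,P_2)$.

```python
# A numeric check (ours) that P_tilde(x) ∝ sqrt(P1(x) P2(x)) satisfies
# D(P_tilde||P1) + D(P_tilde||P2) = 2 B(P1, P2).
import numpy as np

def kl(P, Q):
    return np.sum(P * np.log(P / Q))

P1 = np.array([0.7, 0.2, 0.1])
P2 = np.array([0.1, 0.3, 0.6])
Pt = np.sqrt(P1 * P2); Pt /= Pt.sum()
B = -np.log(np.sum(np.sqrt(P1 * P2)))
assert np.isclose(kl(Pt, P1) + kl(Pt, P2), 2 * B)
print(kl(Pt, P1) + kl(Pt, P2), 2 * B)
```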

Appendix E. Proof of (A13a)

We upper bound the denominator of (A11) by
P [ ξ i , j , ξ i , k ] = P log P i ( X j n ) P j ( X j n ) + log P j ( X i n ) P i ( X i n ) 0 log P i ( X k n ) P k ( X k n ) + log P k ( X i n ) P i ( X i n ) 0 P log P i ( X j n ) P j ( X j n ) + log P j ( X i n ) P i ( X i n ) + log P i ( X k n ) P k ( X k n ) + log P k ( X i n ) P i ( X i n ) 0 exp n inf t log E P i ( X j n ) P j ( X j n ) · P j ( X i n ) P i ( X i n ) · P i ( X k n ) P k ( X k n ) · P k ( X i n ) P i ( X i n ) t exp n log E P i ( X j n ) P j ( X j n ) · P j ( X i n ) P i ( X i n ) · P i ( X k n ) P k ( X k n ) · P k ( X i n ) P i ( X i n ) 1 2
= exp n B ( P i , P j ) n B ( P j , P k ) n B ( P i , P k ) .
An upper bound for P ξ i , j , ξ k , l can be derived similarly.

Appendix F. Proof of Theorem 3

We propose a three-stage achievability scheme, where we again note that, with arguments similar to those in Theorem 1, by imposing the first bound in (24) we can focus on the no-collision hypothesis. The three achievability stages perform the tasks of synchronization, identification, and decoding, respectively, and the joint probability of error can be decomposed as
$P[\text{Error}] \le P[\text{Synchronization error} \,|\, \text{No-collision}] + P[\text{Identification error} \,|\, \text{No synchronization error, No-collision}] + P[\text{Decoding error} \,|\, \text{No synchronization, No identification error, No-collision}] + P[\text{collision}]. \quad (A19)$

Appendix F.1. Codebook Generation

Let $K_n = e^{n\nu}$ be the number of users, $A_n = e^{n\alpha}$ be the number of blocks, $M_n = e^{nR}$ be the number of messages, and $|\mathcal{W}| = \mathrm{poly}(n)$ be the number of channels. Each user $i\in[K_n]$ generates a random i.i.d. codebook according to the distribution $P_{c(i)}$, where the index $c(i)\in[|\mathcal{W}|]$ is chosen based on the channel $Q_i = W_{c(i)}$. For each user $i\in[K_n]$, the codeword for message $m\in[M_n]$ is denoted by $x_i^n(m)$.

Appendix F.2. Probability of Error Analysis

A three-stage decoder is used. We now introduce the three stages and bound the probability of error for each stage.
Synchronization Step. We perform a sequential likelihood ratio test for synchronization as follows. Recall that $Q_i(\cdot|\cdot) = W_{c(i)}(\cdot|\cdot)$ for each user $i\in[K_n]$. Fix thresholds
$T_{c(i)} \in \left[ -D(Q_\star \| [P_{c(i)}W_{c(i)}]),\ D([P_{c(i)}W_{c(i)}] \| Q_\star) \right], \quad i\in[K_n]. \quad (A20)$
For each block $j\in[A_n]$, if there exists any user $i\in[K_n]$ such that
$L_{c(i)}(y_j^n) := \frac{1}{n}\log\frac{[P_{c(i)}W_{c(i)}](y_j^n)}{Q_\star(y_j^n)} \ge T_{c(i)}, \quad (A21)$
then declare that block j is an 'active' block. Else, declare that block j is an 'idle' block. Note that we were able to calculate the probabilities of error corresponding to (A3) by leveraging the constant composition construction of the codewords in Theorem 1. Here, we instead leverage the i.i.d. construction of the codewords to calculate the probability of error corresponding to (A21).
We now find an upper bound on the average probability of synchronization error for this scheme over the different hypotheses. With the same argument as the one after (A4), we have
$P[\text{Synchronization error} \,|\, H^{(1)}] \le P\left[ \bigcup_{i=1}^{K_n} \left\{ L_{c(i)}(Y_i^n) < T_{c(i)} \right\} \Big| H^{(1)} \right] + P\left[ \bigcup_{s=K_n+1}^{A_n} \bigcup_{i=1}^{K_n} \left\{ L_{c(i)}(Y_s^n) \ge T_{c(i)} \right\} \Big| H^{(1)} \right]$
$\le \sum_{i=1}^{K_n} P\left[ L_{c(i)}(Y_i^n) < T_{c(i)} \,|\, H^{(1)} \right] + e^{n\alpha} P\left[ \bigcup_{i=1}^{K_n} \left\{ L_{c(i)}(Y^n) \ge T_{c(i)} \right\} \Big| H^{(1)} \right]$
$= \sum_{i=1}^{K_n} P\left[ L_{c(i)}(Y_i^n) < T_{c(i)} \,|\, H^{(1)} \right] + e^{n\alpha} P\left[ \bigcup_{j=1}^{|\mathcal{W}|} \left\{ L_j(Y^n) \ge T_j \right\} \Big| H^{(1)} \right] \quad (A22a)$
$\le \sum_{i=1}^{K_n} e^{-nD([P_{c(i)}W_{c(i)}]_{\lambda_i} \| [P_{c(i)}W_{c(i)}])} + e^{n\alpha} \sum_{j=1}^{|\mathcal{W}|} e^{-nD([P_jW_j]_{\lambda_j} \| Q_\star)} \quad (A22b)$
$= \sum_{j=1}^{|\mathcal{W}|} \left( N_j\, e^{-nD([P_jW_j]_{\lambda_j} \| [P_jW_j])} + e^{n\alpha}\, e^{-nD([P_jW_j]_{\lambda_j} \| Q_\star)} \right), \quad (A22c)$
where (A22a) is by setting $j := c(i)$, (A22b) is by the Chernoff inequality, in which $[P_jW_j]_\lambda$ was defined in (26), and (A22c) is again by setting $j := c(i)$ and re-counting the events.
The probability of error in this stage decays to zero as the blocklength n goes to infinity if, for all $j\in[|\mathcal{W}|]$ and for an $\epsilon > 0$,
$\nu_j + \epsilon < D([P_jW_j]_{\lambda_j} \| [P_jW_j]), \quad (A23a)$
$\alpha + \epsilon < D([P_jW_j]_{\lambda_j} \| Q_\star). \quad (A23b)$
This corresponds to the second and third bounds in (24).
Identification Step. Having found the locations of the 'active' blocks, we move on to the second stage of the achievability scheme, to identify which user is active in which block. We note that, by the random codebook generation and the memoryless property of the channel, the output of the block occupied by user $i\in[K_n]$ is i.i.d. distributed according to the marginal distribution
$[P_{c(i)}W_{c(i)}](y) := \sum_{x\in\mathcal{X}} P_{c(i)}(x)\, W_{c(i)}(y|x).$
We leverage this property and adapt the result in Theorem 2 to identify the different distributions of the different users. Note that at this point we only distinguish the users with different channels from one another; users with the same channel are distinguished from each other in the decoding stage, as explained next. In Theorem 2 it was assumed that all the distributions are distinct, while here our distributions are not necessarily distinct. The only modification needed in order to use the result of Theorem 2 is the following: we consider a graph in which the edge between every two identical distributions has edge weight equal to zero (as opposed to $e^{-nB(P,P)} = e^0 = 1$). By doing so, we can easily conclude that the probability of identification error in our problem, using an ML decoder, is upper bounded by
$P[\text{Identification error} \,|\, \text{No synchronization error, No-collision}]$
$\le \sum_{1\le i<j\le |\mathcal{W}|} e^{-2nB([P_iW_i],\, [P_jW_j])}$
$\le \sum_{1\le i<j\le |\mathcal{W}|} e^{-2n \inf_{i'\ne j'} B([P_{i'}W_{i'}],\, [P_{j'}W_{j'}])},$
which goes to zero for $|\mathcal{W}| = \mathrm{poly}(n)$ if $0 < \inf_{i\ne j} B([P_iW_i],\, [P_jW_j])$. This corresponds to the fourth bound in (24).
Decoding Step. After finding the permutation of users in the active blocks, we proceed to the third stage of the achievability scheme: finding the transmitted messages of the users using a maximum likelihood decoder. In this stage, we can group each set of users (with group size $N_j$, $j\in[|\mathcal{W}|]$, defined in (25)) who share the same channel $W_j$. In each block of each group we have to distinguish among $e^{n(R+\nu_j)}$ messages, and with steps similar to the proof of Theorem 1 we have
$P[\text{Decoding error} \,|\, \text{No synchronization, No identification error, No-collision}] \le \sum_{j=1}^{|\mathcal{W}|} P[\text{Decoding error for users with channel } W_j \,|\, \text{No synchronization and identification error, No-collision}] \le \sum_{j=1}^{|\mathcal{W}|} N_j\, e^{-nE_r(R+\nu_j,\, P_j,\, W_j)} = \sum_{j=1}^{|\mathcal{W}|} e^{n\nu_j}\, e^{-nE_r(R+\nu_j,\, P_j,\, W_j)}, \quad (A26)$
which retrieves the second bound in (24). Finally, as for a synchronous channel, since we have to distinguish among $e^{n(R+\nu_j)}$ messages in each block, for some $\epsilon > 0$ we must have
$R + \nu_j + \epsilon < I(P_j, W_j), \quad j\in[|\mathcal{W}|], \quad (A27)$
which corresponds to the last bound in (24).
Finally, to conclude the proof, we note that the bounds derived in (A23), (A26) and (A27) are all achievable for each user for a large enough blocklength n (say for $n \ge n_0$). Since we need the intersection of the achievable bounds for all users, while the number of users itself increases with n, we need to express the asymptotic bound through a limit in n. Expressing the achievable set for all users as $\mathcal{A}_n$, defined in (24), using the equivalence $\liminf_{n\to\infty} \mathcal{A}_n = \bigcup_{n_0\ge1}\bigcap_{n\ge n_0} \mathcal{A}_n$, and noting that for some $n_0$ every region $\bigcap_{n\ge n_0}\mathcal{A}_n$ is achievable by the achievability scheme introduced above, we see that (23) is indeed achievable.
Remark A2.
The presence of $\epsilon > 0$ in our bounds is due to the fact that our union bounds contain a number of terms that grows with $n$, while each term decays exponentially in $n$ with an exponent that depends on the summation index. Including a constant $\epsilon > 0$ multiplying $n$ guarantees that the summation as a whole decays to zero.
Remark A3.
The achievability proof of Theorem 1 relies on constant composition codes, whereas the achievability proof of Theorem 3 relies on i.i.d. codebooks. The reason for this restriction is that in Theorem 3 we also need to distinguish different users, and in order to use the result of [13] we restricted our attention to i.i.d. codebooks.

Appendix G. Proof of Theorem 4

For the proof of Theorem 4, we use an ML decoder that performs synchronization, identification, and decoding jointly, block by block.
Proof. 
Codebook generation: Each user $i \in [K_n]$ generates an i.i.d. random codebook according to the distribution $P_i$.
Probability of error analysis: For each block $s \in [A_n]$, the decoder outputs
$$i^* = \arg\max_{i \in [0:K_n],\; m \in [M_n]} Q_i\left( y_s^n \,\middle|\, x_i^n(m) \right),$$
where $x_0^n$ denotes the all-idle input, corresponding to the hypothesis that no user occupies the block (under which the output is distributed according to $Q$).
We now find an upper bound on the probability of error of this decoder, given the hypothesis $H^{(1)}$ in (A4), as follows:
$$P_e^{(n)} \le \sum_{i \in [K_n]} \sum_{m \in [2:M_n]} P\left[ \log \frac{Q_i\left( Y_i^n \mid x_i^n(m) \right)}{Q_i\left( Y_i^n \mid x_i^n(1) \right)} > 0 \,\middle|\, H^{(1)} \right] \tag{A28a}$$
$$+ \sum_{i \in [K_n]} \sum_{\substack{j \in [0:K_n] \\ j \ne i}} \sum_{m \in [M_n]} P\left[ \log \frac{Q_j\left( Y_i^n \mid x_j^n(m) \right)}{Q_i\left( Y_i^n \mid x_i^n(1) \right)} > 0 \,\middle|\, H^{(1)} \right] \tag{A28b}$$
$$+ \sum_{s \in [K_n+1 : A_n]} \sum_{j \in [K_n]} \sum_{m \in [M_n]} P\left[ \log \frac{Q_j\left( Y_s^n \mid x_j^n(m) \right)}{Q\left( Y_s^n \right)} > 0 \,\middle|\, H^{(1)} \right] \tag{A28c}$$
$$\le \sum_{i \in [K_n]} e^{nR}\, e^{-n \sup_t \left( -\log E\left[ \left( \frac{Q_i(Y_i \mid \bar{X}_i)}{Q_i(Y_i \mid X_i)} \right)^t \right] \right)} + \sum_{i \in [K_n]} \sum_{\substack{j \in [0:K_n] \\ j \ne i}} e^{nR}\, e^{-n \sup_t \left( -\log E\left[ \left( \frac{Q_j(Y_i \mid X_j)}{Q_i(Y_i \mid X_i)} \right)^t \right] \right)} + e^{n\alpha} \sum_{j \in [K_n]} e^{nR}\, e^{-n \sup_t \left( -\log E\left[ \left( \frac{Q_j(Y_s \mid X_j)}{Q(Y_s)} \right)^t \right] \right)},$$
where $P_{X, \bar{X}}(x, \bar{x}) = P_X(x)\, P_X(\bar{x})$. The term in (A28a) is due to decoding error, (A28b) is due to identification error, and (A28c) is due to synchronization error. The last inequality is by the Chernoff bound. Requiring each term in the upper bound to vanish as $n$ grows to infinity, and including an $\epsilon > 0$ as explained in the proof of Theorem 3 (Remark A2), we obtain the conditions stated in the theorem. □
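For illustration, the following Python sketch numerically evaluates one per-letter exponent of the form $\sup_t \left( -\log E[(\cdot)^t] \right)$ appearing above, for the decoding term (A28a) in which both codewords pass through the same channel; the input distribution and channel are hypothetical, and $t$ is searched over $[0,1]$ (a valid, possibly suboptimal, restriction of the Chernoff bound).

```python
import numpy as np

def chernoff_exponent(P, Q_num, Q_den, grid=1001):
    # sup_t -log E[(Q_num(Y|X')/Q_den(Y|X))^t] with (X, X', Y) ~ P(x) P(x') Q_den(y|x):
    # the per-letter exponent of one union-bound term above.
    best = 0.0
    for t in np.linspace(0.0, 1.0, grid):
        E = sum(P[x] * P[xp] * np.sum(Q_den[x] ** (1.0 - t) * Q_num[xp] ** t)
                for x in range(len(P)) for xp in range(len(P)))
        best = max(best, -np.log(E))
    return best

P = np.array([0.5, 0.5])
Q = np.array([[0.89, 0.11], [0.11, 0.89]])  # BSC(0.11)
# Decoding term (A28a): an independent codeword against the transmitted one, same channel.
print(chernoff_exponent(P, Q, Q))  # must exceed R for the first sum to vanish
```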

Appendix H. Proof of (28)

To see (28a), we note that, due to the symmetry of
$$C(P, Q, P, Q) = \sup_t \left\{ -\log \sum_{x, \bar{x}, y} P(x)\, P(\bar{x})\, Q(y|x)^{1-t}\, Q(y|\bar{x})^{t} \right\}$$
under the exchange $t \leftrightarrow 1-t$, the supremum is achieved at the midpoint $t = \frac{1}{2}$; hence $C(P, Q, P, Q) = \mu\left(\frac{1}{2}\right)$. Moreover, we find an upper bound on $C(\cdot, Q, P_i, Q_i)$ by noting that $\mu_{0,i}(t)$, defined in (12b), is concave in $t$ with $\mu_{0,i}(1) = 0$ and
$$\frac{\partial \mu_{0,i}(t)}{\partial t} \bigg|_{t=1} = -\left( I(P_i, Q_i) + D\left( [P_i Q_i] \,\middle\|\, Q \right) \right) \le 0.$$
Hence, by concavity, $\mu_{0,i}(t)$ is always less than the tangent $\left( I(P_i, Q_i) + D([P_i Q_i] \| Q) \right)(1-t)$, and for $0 \le t \le 1$ it is always less than $I(P_i, Q_i) + D([P_i Q_i] \| Q)$.
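For a quick numeric sanity check of this tangent-line argument, the sketch below assumes (purely for illustration, since (12b) is stated earlier in the paper) the Chernoff-type form $\mu_{0,i}(t) = -\log \sum_{x,y} P_i(x)\, Q(y)^{1-t}\, Q_i(y|x)^{t}$, which satisfies $\mu_{0,i}(1) = 0$ and reproduces the derivative computed above; all distributions are hypothetical.

```python
import numpy as np

P = np.array([0.5, 0.5])
Qi = np.array([[0.89, 0.11], [0.11, 0.89]])  # hypothetical user channel
Q = np.array([0.2, 0.8])                     # hypothetical idle output distribution

def mu(t):
    # Assumed Chernoff-type form of mu_{0,i}(t); satisfies mu(1) = 0.
    return -np.log(sum(P[x] * np.sum(Q ** (1.0 - t) * Qi[x] ** t) for x in range(2)))

PQ = P @ Qi                                  # output marginal [P_i Q_i]
I = sum(P[x] * np.sum(Qi[x] * np.log(Qi[x] / PQ)) for x in range(2))  # I(P_i, Q_i)
D = np.sum(PQ * np.log(PQ / Q))              # D([P_i Q_i] || Q)

ts = np.linspace(0.0, 1.0, 101)
print(all(mu(t) <= (I + D) * (1.0 - t) + 1e-12 for t in ts))  # tangent bound holds
```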

Appendix I. Proof of Theorem 5

We first define the following special shorthand notation, used throughout this proof:
$$F_n := M_n K_n, \qquad Q_{i,m}^{n}(y^n) := Q_i\left( y^n \,\middle|\, x_i^n(m) \right),$$
$$P_{Y^n}(y^n) := \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} Q_{i,m}^{n}(y^n), \qquad Q_{i,m}^{n,\lambda_i}(y^n) := \frac{Q_{i,m}^{n}(y^n)^{\lambda_i}\, Q^n(y^n)^{1-\lambda_i}}{\sum_{\tilde{y}^n} Q_{i,m}^{n}(\tilde{y}^n)^{\lambda_i}\, Q^n(\tilde{y}^n)^{1-\lambda_i}},$$
$$P_{Y^n}^{(\lambda)}(y^n) := \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} Q_{i,m}^{n,\lambda_i}(y^n), \qquad P_{Y^n}^{t}(y^n) := \frac{P_{Y^n}(y^n)^t\, Q^n(y^n)^{1-t}}{\sum_{\tilde{y}^n} P_{Y^n}(\tilde{y}^n)^t\, Q^n(\tilde{y}^n)^{1-t}},$$
$$Q^n(y^n) := \prod_{i=1}^{n} Q(y_i).$$
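The tilted distributions $Q_{i,m}^{n,\lambda_i}$ bridge the idle output statistics ($\lambda_i = 0$) and the codeword-induced output statistics ($\lambda_i = 1$). The following per-letter Python sketch, with hypothetical distributions, illustrates the normalization:

```python
import numpy as np

def tilt(q_active, q_idle, lam):
    # Normalized tilted distribution: proportional to q_active^lam * q_idle^(1-lam).
    t = q_active ** lam * q_idle ** (1.0 - lam)
    return t / t.sum()

q_active = np.array([0.89, 0.11])  # hypothetical output under a codeword symbol
q_idle = np.array([0.2, 0.8])      # hypothetical idle output distribution Q
for lam in (0.0, 0.5, 1.0):
    print(lam, tilt(q_active, q_idle, lam))  # interpolates from q_idle to q_active
```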
We use the optimal ML decoder to find the locations of the 'active' blocks. At this stage, we are not concerned with the identities or the messages of the users. The decoder output is determined via
$$\arg\max_{\substack{(l_1, \ldots, l_{K_n}) :\; l_i \ne l_j,\; i \ne j \\ l_i \in [A_n],\; i \in [K_n]}} \; \sum_{i=1}^{K_n} \log \frac{P_{Y^n}\left( Y_{l_i}^n \right)}{Q^n\left( Y_{l_i}^n \right)}.$$
Given the hypothesis that the users are active in blocks $[K_n]$, denoted by $H^{(1)}$ in (A4), the corresponding error event for the ML decoder satisfies
$$\left\{ \text{error} \,\middle|\, H^{(1)} \right\} = \bigcup_{(l_1, \ldots, l_{K_n}) \ne (1, \ldots, K_n)} \left\{ \sum_{i=1}^{K_n} \log \frac{P_{Y^n}(Y_{l_i}^n)}{Q^n(Y_{l_i}^n)} > \sum_{i=1}^{K_n} \log \frac{P_{Y^n}(Y_i^n)}{Q^n(Y_i^n)} \right\}$$
$$\supseteq \bigcup_{i \in [K_n]} \bigcup_{j \in [K_n+1 : A_n]} \left\{ \log \frac{P_{Y^n}(Y_j^n)}{Q^n(Y_j^n)} \ge \log \frac{P_{Y^n}(Y_i^n)}{Q^n(Y_i^n)} \right\}$$
$$\supseteq \left( \bigcup_{j \in [K_n+1 : A_n]} \left\{ \log \frac{P_{Y^n}(Y_j^n)}{Q^n(Y_j^n)} \ge T \right\} \right) \cap \left( \bigcup_{i \in [K_n]} \left\{ T \ge \log \frac{P_{Y^n}(Y_i^n)}{Q^n(Y_i^n)} \right\} \right),$$
for any $T \in \mathbb{R}$. We take $T$ to be
$$T := \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \left( D\left( Q_{i,m}^{n,\lambda_i} \,\middle\|\, Q^n \right) - D\left( Q_{i,m}^{n,\lambda_i} \,\middle\|\, P_{Y^n} \right) \right),$$
for (possibly different) $\lambda_i \in [0,1]$, $i \in [K_n]$.
We also find the following lower bounds:
$$Q^n\left[ \log \frac{P_{Y^n}}{Q^n}(Y^n) \ge T \right] \ge e^{-n \left[ \frac{1}{K_n} \sum_{i=1}^{K_n} D\left( Q_i^{\lambda_i} \| Q \,\middle|\, P_i \right) - (R+\nu)(1 - \bar{r}_n) + \frac{h(\bar{r}_n)}{n} \right]}, \tag{A33a}$$
$$P_{Y^n}\left[ \log \frac{P_{Y^n}}{Q^n}(Y^n) \le T \right] \ge e^{-\frac{n}{K_n} \sum_{i=1}^{K_n} D\left( Q_i^{\lambda_i} \| Q_i \,\middle|\, P_i \right)}, \tag{A33b}$$
which are proved in Appendix K.
By using the inequalities in (A33a) and (A33b), we find a lower bound on the probability of (A31) as follows:
$$P\left[ \bigcup_{j \in [K_n+1 : A_n]} \left\{ \log \frac{P_{Y^n}(Y_j^n)}{Q^n(Y_j^n)} \ge T \right\} \cap \bigcup_{i \in [K_n]} \left\{ T \ge \log \frac{P_{Y^n}(Y_i^n)}{Q^n(Y_i^n)} \right\} \,\middle|\, H^{(1)} \right] \tag{A34a}$$
$$= P\left[ \bigcup_{j \in [K_n+1 : A_n]} \left\{ \log \frac{P_{Y^n}(Y_j^n)}{Q^n(Y_j^n)} \ge T \right\} \,\middle|\, H^{(1)} \right] \, P\left[ \bigcup_{i \in [K_n]} \left\{ T \ge \log \frac{P_{Y^n}(Y_i^n)}{Q^n(Y_i^n)} \right\} \,\middle|\, H^{(1)} \right] \tag{A34b}$$
$$=: P\left[ Z_1 \ge 1 \right] \, P\left[ Z_2 \ge 1 \right] \tag{A34c}$$
$$\ge \left( 1 - \frac{\mathrm{Var}[Z_1]}{E^2[Z_1]} \right) \left( 1 - \frac{\mathrm{Var}[Z_2]}{E^2[Z_2]} \right) \tag{A34d}$$
$$\ge \left( 1 - \frac{1}{\sum_{j=K_n+1}^{A_n} P[\xi_j = 1]} \right) \left( 1 - \frac{1}{\sum_{i=1}^{K_n} P[\zeta_i = 1]} \right) \tag{A34e}$$
$$\ge \left( 1 - e^{-n\alpha + n \left[ \frac{1}{K_n} \sum_{i=1}^{K_n} D\left( Q_i^{\lambda_i} \| Q \mid P_i \right) - (R+\nu)(1-\bar{r}_n) + \frac{h(\bar{r}_n)}{n} \right]} \right) \left( 1 - e^{-n\nu + \frac{n}{K_n} \sum_{i=1}^{K_n} D\left( Q_i^{\lambda_i} \| Q_i \mid P_i \right)} \right), \tag{A34f}$$
where (A34b) follows by the independence of $Y_i^n$ and $Y_j^n$ whenever $i \ne j$, $i, j \in [A_n]$, and the inequality in (A34d) is by Chebyshev's inequality, where we have defined
$$Z_1 := \sum_{j = K_n+1}^{A_n} \xi_j, \qquad \xi_j := \mathrm{Ber}\left( Q^n\left[ \log \frac{P_{Y^n}}{Q^n}(Y_j^n) \ge T \right] \right),$$
$$Z_2 := \sum_{i=1}^{K_n} \zeta_i, \qquad \zeta_i := \mathrm{Ber}\left( P_{Y^n}\left[ \log \frac{P_{Y^n}}{Q^n}(Y_i^n) \le T \right] \right),$$
where $\{\xi_j, \zeta_i\}$ are mutually independent. We can see from (A34f) that if, for some $\epsilon > 0$,
$$\nu > \frac{1}{K_n} \sum_{i=1}^{K_n} D\left( Q_i^{\lambda_i} \| Q_i \,\middle|\, P_i \right) + \epsilon, \qquad \alpha > \frac{1}{K_n} \sum_{i=1}^{K_n} D\left( Q_i^{\lambda_i} \| Q \,\middle|\, P_i \right) - (1 - \bar{r}_n)(\nu + R) + \epsilon, \tag{A36}$$
then the probability of error is strictly bounded away from zero; hence the region specified by (A36) (the first bound in (31)) is impermissible.
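The second-moment step used in (A34d)-(A34e) can also be checked numerically: for a sum $Z$ of independent Bernoulli random variables, $P[Z \ge 1] \ge 1 - \mathrm{Var}[Z]/E^2[Z] \ge 1 - 1/E[Z]$, since $\mathrm{Var}[Z] \le E[Z]$. A small Monte Carlo sketch with illustrative parameters:

```python
import numpy as np

# Z = sum of independent Bernoulli(p_j): P[Z >= 1] >= 1 - Var[Z]/E[Z]^2 >= 1 - 1/E[Z].
rng = np.random.default_rng(0)
p = rng.uniform(0.001, 0.01, size=2000)          # many rare independent events
E, V = p.sum(), (p * (1.0 - p)).sum()
empirical = np.mean([(rng.random(p.size) < p).any() for _ in range(2000)])
print(empirical, ">=", 1 - V / E**2, ">=", 1 - 1 / E)
```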
We now focus only on the decoding error probability, under the assumption that the active blocks have been correctly identified; the result is therefore a lower bound on the overall probability of error. We have
$$P\left[ \text{Decoding error} \,\middle|\, \text{No synchronization error} \right] = P\left[ \bigcup_{i=1}^{K_n} \left\{ \text{Decoding error in block } i \right\} \right] \tag{A37a}$$
$$\ge 1 - \frac{1}{\sum_{i=1}^{K_n} P\left[ D_i^c \right]} \tag{A37b}$$
$$\ge 1 - \frac{1}{\sum_{i=1}^{K_n} e^{-n E_{sp}(R, P_i, Q_i)}} \tag{A37c}$$
$$\ge 1 - \frac{1}{K_n \, e^{-\frac{n}{K_n} \sum_{i=1}^{K_n} E_{sp}(R, P_i, Q_i)}} \tag{A37d}$$
$$= 1 - e^{-n \left( \nu - \frac{1}{K_n} \sum_{i=1}^{K_n} E_{sp}(R, P_i, Q_i) \right)}, \tag{A37e}$$
where
$$D_i^c := \left\{ \text{Decoding error in block } i \right\},$$
and where (A37b) can be derived using the same technique used in (A34); (A37c) is by lower bounding the probability of decoding error as in [14], with $E_{sp}$ as defined in (14); and (A37d) is by Jensen's inequality. Finally, as can be seen from (A37e), if $\nu > \frac{1}{K_n} \sum_{i=1}^{K_n} E_{sp}(R, P_i, Q_i) + \epsilon$ for some $\epsilon > 0$, then the lower bound on the probability of decoding error is bounded away from zero for large $n$; this retrieves the second bound in (31). In addition, the usual converse bound on the rate of a synchronous channel also applies to any asynchronous channel, and hence the region where $R > I(P, Q) + \epsilon$ is also impermissible. This concludes the proof.

Appendix J. Proof of Theorem 6

User $i \in [K_n]$ has a codebook with $M_n = e^{nR}$ codewords of length $n$. Define for each $i \in [K_n]$ an 'extended codebook' consisting of $A_n M_n$ codewords of length $n A_n$, constructed such that for all $m_i \in [M_n]$ and $t_i \in [A_n]$
$$\tilde{X}_i^{n A_n}(m_i, t_i) := \star^{\,n(t_i - 1)} \; f_i(m_i) \; \star^{\,n(A_n - t_i)},$$
where $\star$ denotes the idle symbol and $f_i(m_i)$ is the length-$n$ codeword of user $i$ for message $m_i$, as depicted in Figure A4.
Figure A4. Extended codebook.
By using Fano’s inequality, i.e., H ( X 1 n A n , , X K n n A n | Y n A n ) n ϵ n : ϵ n 0 as n , for any codebook of length n A n we have
H ( X 1 n A n , , X K n n A n ) = H ( m 1 , t 1 , , m K n , t K n ) = n K n α + R = H ( X 1 n A n , , X K n n A n | Y n A n ) + I ( X 1 n A n , , X K n n A n ; Y n A n ) n ϵ n + n e n α | Y | ν + log 1 + 1 α K n i [ K n ] R i n α + log 1 + ϵ n e n α Y n ,
where log 1 + 1 α K n i [ K n ] R i n 0 and log ( 1 + ϵ n e n α Y ) n 0 vanish as n goes to infinity.

Appendix K. Proof of (A33a) and (A33b)

Before deriving the lower bounds (A33a) and (A33b), we note that, by the Type Counting Lemma [19], at the expense of a small decrease in the rate (which vanishes in the limit of large blocklength) we may restrict our attention to constant composition codewords. Henceforth, we assume that the composition of the codewords of user $i \in [K_n]$ is given by $P_i$. Moreover, to make the paper self-contained, we restate the following lemmas, which we use in the rest of the proof.
Lemma A2
(Compensation Identity). For arbitrary $\pi_i$ with $\sum_{i=1}^{K} \pi_i = 1$ and arbitrary probability distributions $P_i \in \mathcal{P}_X$, $i \in [K]$, define $\bar{P}(x) = \sum_{i=1}^{K} \pi_i P_i(x)$. Then, for any probability distribution $R$, we have
$$D\left( \bar{P} \,\middle\|\, R \right) + \sum_{i=1}^{K} \pi_i D\left( P_i \,\middle\|\, \bar{P} \right) = \sum_{i=1}^{K} \pi_i D\left( P_i \,\middle\|\, R \right).$$
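A quick numerical verification of the compensation identity on randomly drawn distributions (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
K, S = 4, 6
pi = rng.dirichlet(np.ones(K))                 # mixture weights pi_i
Ps = rng.dirichlet(np.ones(S), size=K)         # component distributions P_i
Rd = rng.dirichlet(np.ones(S))                 # arbitrary reference distribution R
Pbar = pi @ Ps                                 # the mixture P-bar

D = lambda p, q: float(np.sum(p * np.log(p / q)))
lhs = D(Pbar, Rd) + sum(pi[i] * D(Ps[i], Pbar) for i in range(K))
rhs = sum(pi[i] * D(Ps[i], Rd) for i in range(K))
print(np.isclose(lhs, rhs))  # True: both sides agree up to floating-point error
```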
Lemma A3
(Fano). Let $F$ be an arbitrary set of size $N$. For $\bar{P} = \frac{1}{N} \sum_{\theta \in F} P_\theta$ we have
$$\frac{1}{N} \sum_{\theta \in F} D\left( P_\theta \,\middle\|\, \bar{P} \right) \ge (1 - \bar{r}) \log\left( N (1 - \bar{r}) \right) + \bar{r} \log \frac{N \bar{r}}{N - 1},$$
where
$$\bar{r} := \inf_{T} \frac{1}{N} \sum_{\theta \in F} P_\theta\left[ T \ne \theta \right],$$
in which the infimum is taken over all possible estimators T.
We now continue with the proof of (A33a). Using the Chernoff bound, we can write
$$Q^n\left[ \log \frac{P_{Y^n}}{Q^n}(Y^n) \ge \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \left( D\left( Q_{i,m}^{n,\lambda_i} \| Q^n \right) - D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n} \right) \right) \right] \ge e^{-\sup_t A(t)}.$$
The Chernoff bound exponent $\sup_t A(t)$ is expressed and simplified as follows:
$$A(t) := \frac{t}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \left( D\left( Q_{i,m}^{n,\lambda_i} \| Q^n \right) - D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n} \right) \right) - \log E_{Q^n}\left[ \left( \frac{P_{Y^n}}{Q^n}(Y^n) \right)^t \right]$$
$$= \frac{t}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \sum_{y^n} Q_{i,m}^{n,\lambda_i}(y^n) \log \frac{P_{Y^n}(y^n)}{Q^n(y^n)} - \log \sum_{y^n} P_{Y^n}(y^n)^t \, Q^n(y^n)^{1-t}$$
$$= \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \sum_{y^n} Q_{i,m}^{n,\lambda_i}(y^n) \log \frac{P_{Y^n}(y^n)^t \, Q^n(y^n)^{1-t}}{\left( \sum_{\tilde{y}^n} P_{Y^n}(\tilde{y}^n)^t \, Q^n(\tilde{y}^n)^{1-t} \right) Q^n(y^n)}$$
$$= \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \sum_{y^n} Q_{i,m}^{n,\lambda_i}(y^n) \log \frac{P_{Y^n}^{t}(y^n)}{Q^n(y^n)}$$
$$= \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \left( D\left( Q_{i,m}^{n,\lambda_i} \| Q^n \right) - D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n}^{t} \right) \right)$$
$$= \frac{1}{K_n} \sum_{i=1}^{K_n} n D\left( Q_i^{\lambda_i} \| Q \,\middle|\, P_i \right) - \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n}^{t} \right), \tag{A42}$$
where (A42) is a result of the constant composition structure of the codewords. As a result,
$$\sup_t A(t) = \frac{1}{K_n} \sum_{i=1}^{K_n} n D\left( Q_i^{\lambda_i} \| Q \,\middle|\, P_i \right) - \inf_t \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n}^{t} \right).$$
Moreover,
$$\inf_t \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n}^{t} \right) = \inf_t \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \sum_{y^n} Q_{i,m}^{n,\lambda_i}(y^n) \log \frac{Q_{i,m}^{n,\lambda_i}(y^n)}{P_{Y^n}^{t}(y^n)}$$
$$= \inf_t \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \sum_{y^n} Q_{i,m}^{n,\lambda_i}(y^n) \log \left( \frac{Q_{i,m}^{n,\lambda_i}(y^n)}{P_{Y^n}^{(\lambda)}(y^n)} \cdot \frac{P_{Y^n}^{(\lambda)}(y^n)}{P_{Y^n}^{t}(y^n)} \right)$$
$$= \inf_t \left( \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n}^{(\lambda)} \right) + D\left( P_{Y^n}^{(\lambda)} \| P_{Y^n}^{t} \right) \right) \ge \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n}^{(\lambda)} \right).$$
Note that $P_{Y^n}^{(\lambda)}$ is the average of the $Q_{i,m}^{n,\lambda_i}$ over $m \in [M_n]$, $i \in [K_n]$; hence, based on Lemma A3, we have
$$\frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n}^{(\lambda)} \right) \ge (1 - \bar{r}_n) \log\left( F_n (1 - \bar{r}_n) \right) + \bar{r}_n \log \frac{F_n \bar{r}_n}{F_n - 1} \ge (1 - \bar{r}_n) \log F_n - h(\bar{r}_n).$$
As a result,
$$\sup_t A(t) \le \frac{1}{K_n} \sum_{i=1}^{K_n} n D\left( Q_i^{\lambda_i} \| Q \,\middle|\, P_i \right) - n (R + \nu)(1 - \bar{r}_n) + h(\bar{r}_n),$$
which establishes (A33a), since $\log F_n = n(R + \nu)$.
Now we continue with the proof of (A33b). Again using the Chernoff bound, we have
$$P_{Y^n}\left[ \log \frac{P_{Y^n}}{Q^n}(Y^n) \le T \right] = P_{Y^n}\left[ \log \frac{Q^n}{P_{Y^n}}(Y^n) \ge \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \left( D\left( Q_{i,m}^{n,\lambda_i} \| P_{Y^n} \right) - D\left( Q_{i,m}^{n,\lambda_i} \| Q^n \right) \right) \right] \ge e^{-\sup_t B(t)},$$
where
$$\sup_t B(t) := \sup_t \left\{ \frac{t}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} \sum_{y^n} Q_{i,m}^{n,\lambda_i}(y^n) \log \frac{Q^n(y^n)}{P_{Y^n}(y^n)} - \log \sum_{y^n} Q^n(y^n)^t \, P_{Y^n}(y^n)^{1-t} \right\}$$
$$= \sup_t \sum_{y^n} P_{Y^n}^{(\lambda)}(y^n) \log \frac{P_{Y^n}^{1-t}(y^n)}{P_{Y^n}(y^n)} = \sup_t \left\{ D\left( P_{Y^n}^{(\lambda)} \| P_{Y^n} \right) - D\left( P_{Y^n}^{(\lambda)} \| P_{Y^n}^{1-t} \right) \right\} \le D\left( P_{Y^n}^{(\lambda)} \| P_{Y^n} \right)$$
$$= \sum_{y^n} \left( \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} Q_{i,m}^{n,\lambda_i}(y^n) \right) \log \frac{\frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} Q_{i,m}^{n,\lambda_i}(y^n)}{\frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} Q_{i,m}^{n}(y^n)}$$
$$\le \sum_{y^n} \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} Q_{i,m}^{n,\lambda_i}(y^n) \log \frac{Q_{i,m}^{n,\lambda_i}(y^n)}{Q_{i,m}^{n}(y^n)} \tag{A44}$$
$$= \frac{1}{F_n} \sum_{i=1}^{K_n} \sum_{m=1}^{M_n} D\left( Q_{i,m}^{n,\lambda_i} \| Q_{i,m}^{n} \right) = \frac{1}{K_n} \sum_{i=1}^{K_n} n D\left( Q_i^{\lambda_i} \| Q_i \,\middle|\, P_i \right),$$
and where the inequality in (A44) is by the log-sum inequality. This establishes (A33b).
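The step in (A44) is an instance of the joint convexity of the KL divergence: the divergence between two mixtures is at most the average of the pairwise divergences. A numerical check on random distributions (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
F, S = 8, 5
A = rng.dirichlet(np.ones(S), size=F)   # stand-ins for the tilted Q^{lambda}_{i,m}
B = rng.dirichlet(np.ones(S), size=F)   # stand-ins for the original Q_{i,m}

D = lambda p, q: float(np.sum(p * np.log(p / q)))
lhs = D(A.mean(axis=0), B.mean(axis=0))           # divergence between the two mixtures
rhs = np.mean([D(A[f], B[f]) for f in range(F)])  # average pairwise divergence
print(lhs <= rhs + 1e-12)  # True by the log-sum inequality
```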

Appendix L. Proof of Lemma 1

We prove the lemma by contradiction. Define
$$t_i := \text{number of users in block } i.$$
Assume that the arrangement with the highest probability (call it $\mathcal{A}$) has at least two blocks, say blocks 1 and 2, for which $t_1 - t_2 > 1$. This assumption means that the arrangement with the highest probability is not the non-overlapping arrangement.
The probability of this arrangement, $P(\mathcal{A})$, is proportional to
$$P(\mathcal{A}) \propto \binom{K_n}{t_1} \binom{K_n - t_1}{t_2} = \frac{K_n!}{t_1!\, (K_n - t_1)!} \cdot \frac{(K_n - t_1)!}{t_2!\, (K_n - t_1 - t_2)!} = \frac{K_n!}{t_1!\, t_2!\, (K_n - t_1 - t_2)!}.$$
We now consider a new arrangement, $\mathcal{A}_{\text{new}}$, in which $t_{1,\text{new}} = t_1 - 1$ and $t_{2,\text{new}} = t_2 + 1$ while all other blocks remain unchanged. This new arrangement is also feasible, since the total number of users is unchanged. The probability of this new arrangement is proportional to
$$P(\mathcal{A}_{\text{new}}) \propto \binom{K_n}{t_1 - 1} \binom{K_n - t_1 + 1}{t_2 + 1} = \frac{K_n!}{(t_1 - 1)!\, (K_n - t_1 + 1)!} \cdot \frac{(K_n - t_1 + 1)!}{(t_2 + 1)!\, (K_n - t_1 - t_2)!} = \frac{K_n!}{(t_1 - 1)!\, (t_2 + 1)!\, (K_n - t_1 - t_2)!}.$$
Comparing $P(\mathcal{A})$ and $P(\mathcal{A}_{\text{new}})$, and noting that $t_1 > t_2 + 1$ implies $t_1!\, t_2! > (t_1 - 1)!\, (t_2 + 1)!$, we see that $P(\mathcal{A}) < P(\mathcal{A}_{\text{new}})$, which contradicts our assumption that $\mathcal{A}$ has the highest probability among all arrangements. Hence, in the arrangement with the highest probability, no two blocks differ by more than one in the number of active users.
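The counting at the heart of this exchange argument is easy to verify numerically; the sketch below compares the two (unnormalized) weights for illustrative values of $K_n$, $t_1$, $t_2$:

```python
from math import comb

def weight(K, t1, t2):
    # Number of ways to choose which users occupy blocks 1 and 2.
    return comb(K, t1) * comb(K - t1, t2)

K, t1, t2 = 20, 7, 3                                   # t1 - t2 > 1
print(weight(K, t1, t2) < weight(K, t1 - 1, t2 + 1))   # True: rebalancing helps
```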

References

  1. Chandar, V.; Tchamkerten, A.; Wornell, G. Optimal Sequential Frame Synchronization. IEEE Trans. Inf. Theory 2008, 54, 3725–3728.
  2. Tchamkerten, A.; Chandar, V.; Wornell, G.W. Asynchronous Communication: Capacity Bounds and Suboptimality of Training. IEEE Trans. Inf. Theory 2013, 59, 1227–1255.
  3. Tchamkerten, A.; Chandar, V.; Caire, G. Energy and Sampling Constrained Asynchronous Communication. IEEE Trans. Inf. Theory 2014, 60, 7686–7697.
  4. Chen, X.; Chen, T.Y.; Guo, D. Capacity of Gaussian many-access channels. IEEE Trans. Inf. Theory 2017, 63, 3516–3539.
  5. Polyanskiy, Y. A perspective on massive random-access. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 2523–2527.
  6. Ahlswede, R.; Dueck, G. Identification via channels. IEEE Trans. Inf. Theory 1989, 35, 15–29.
  7. Han, T.S.; Verdu, S. Identification Plus Transmission. In Proceedings of the 1991 IEEE International Symposium on Information Theory, Budapest, Hungary, 24–28 June 1991; p. 267.
  8. Kiefer, J.; Sobel, M. Sequential Identification and Ranking Procedures, with Special Reference to Koopman-Darmois Populations; University of Chicago Press: Chicago, IL, USA, 1968.
  9. Ahlswede, R.; Haroutunian, E. On logarithmically asymptotically optimal testing of hypotheses and identification. In General Theory of Information Transfer and Combinatorics; Springer: Berlin/Heidelberg, Germany, 2006; pp. 553–571.
  10. Unnikrishnan, J. Asymptotically optimal matching of multiple sequences to source distributions and training sequences. IEEE Trans. Inf. Theory 2015, 61, 452–468.
  11. Shahi, S.; Tuninetti, D.; Devroye, N. On the capacity of strong asynchronous multiple access channels with a large number of users. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, 10–15 July 2016; pp. 1486–1490.
  12. Shahi, S.; Tuninetti, D.; Devroye, N. On the capacity of the slotted strongly asynchronous channel with a bursty user. In Proceedings of the 2017 IEEE Information Theory Workshop (ITW), Kaohsiung, Taiwan, 6–10 November 2017; pp. 91–95.
  13. Shahi, S.; Tuninetti, D.; Devroye, N. On Identifying a Massive Number of Distributions. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Vail, CO, USA, 17–22 June 2018.
  14. Gallager, R.G. Information Theory and Reliable Communication; Springer: Berlin/Heidelberg, Germany, 1968; Volume 588.
  15. Polyanskiy, Y. Asynchronous Communication: Exact Synchronization, Universality, and Dispersion. IEEE Trans. Inf. Theory 2013, 59, 1256–1270.
  16. Chandar, V.; Tchamkerten, A. A Note on Bursty MAC. 2015. Available online: https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwiL8qDGgZ78AhXgmVYBHcCSAtoQFnoECA4QAQ&url=https%3A%2F%2Fperso.telecom-paristech.fr%2Ftchamker%2Fburstymac.pdf&usg=AOvVaw0ferJC5-OOrr-KzrA7XG1V (accessed on 29 August 2022).
  17. Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 2012.
  18. Chung, K.L.; Erdos, P. On the application of the Borel-Cantelli lemma. Trans. Am. Math. Soc. 1952, 72, 179–186.
  19. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems; Cambridge University Press: Cambridge, UK, 2011.
Figure 1. Transmitted sequence for each user, and the received sequence at colliding and non-colliding transmission times for users 1 and 2.
Figure 2. Comparison of the achievable regions in Theorems 1 and 4, for the Binary Symmetric Channel with crossover probability $\delta = 0.11$.
Table 1. Special notation for the identification problem.
$S_A$: the symmetric group, i.e., the set of all possible permutations of $A$ elements.
$\sigma_i$: the $i$-th element of the permutation $\sigma \in S_A$; $i \in [A]$.
$K_k := K_k\left( w_{\{1,2\}}, \ldots, w_{\{k,1\}} \right)$: a complete graph with $k$ nodes and edge weight $w_{\{i,j\}}$ between nodes $i$ and $j$.
$c_r(V), c_r(E)$: a cycle of length $r$, represented by the set of its vertices and of its edges, respectively.
$c_r(V)(i)$: the $i$-th vertex of the cycle $c_r(V)$.
$C_k(r)$: the set of cycles of length $r$ in the complete graph $K_k$.
$N_k(r)$: the number of cycles of length $r$ in the complete graph $K_k$.
$G(c) := \prod_{i,j} w_{i,j}$: the gain of a cycle $c$, defined as the product of the edge weights within the cycle.
AM-GM inequality: refers to the inequality $\left( \prod_{i=1}^{n} a_i \right)^{1/n} \le \frac{1}{n} \sum_{i=1}^{n} a_i$.