1. Introduction
Sixth-generation (6G) technology is envisioned to provide global and seamless coverage with space–air–ground–sea networks, which is of great importance for sparsely populated areas (e.g., old-growth forests), regions where geological disasters frequently occur, and pelagic zones [1,2]. The integration of 6G-enabled Internet of Things (IoT) will facilitate a significant advancement in the intelligence of forest ecological management and monitoring, particularly in remote areas characterized by complex terrains and extensive distributions [3]. As some of the most critical ecosystems, forests play a crucial role in maintaining biodiversity, regulating climate, and ensuring ecological balance. Efficient forest management and monitoring are vital not only for ecosystem protection but also for the early detection of risks such as forest fires, illegal logging, and pest outbreaks. Projects such as TreeTalker [4,5,6] are designed for real-time monitoring by deploying a substantial number of sensor devices in the monitored areas of a forest. These sensors measure various environmental parameters, including air quality, soil moisture, weather conditions, and tree canopy health [7,8,9]. However, as the density of sensor devices increases significantly, traditional orthogonal multiple access (OMA) schemes, including time-division multiple access (TDMA), frequency-division multiple access (FDMA), and orthogonal frequency-division multiple access (OFDMA) [10], begin to expose their limitations in terms of massive connectivity. This is because each device in an OMA system requires a dedicated radio resource for data transmission [11]. When the number of sensor devices sending data simultaneously exceeds the number of dedicated resources the system can provide, resource congestion, traffic jams, and significant communication delays will result. These issues are particularly evident in emergency situations, such as forest fires, and in multimedia sensor network systems [12,13], where there is a notable surge in data demand.
Sparse code multiple access (SCMA), a non-orthogonal multiple access (NOMA) technology [14], is regarded as a promising solution to address the aforementioned issues with improved spectral efficiency. This technique draws inspiration from the sparse matrix encoding concept used in low-density signature (LDS) technology [15], where a limited number of devices are superimposed on the same resource for data transmission. However, unlike LDS, SCMA replaces traditional modulation and spreading with sparse multi-dimensional complex-domain codebook mapping. It assigns different sparse codebooks in the sparse code domain to different devices so that the signals for different devices can be non-orthogonally superimposed on the same resources [16]. Compared to conventional OMA techniques, whose load factor is one, the load factor of devices on an independent resource in the SCMA system is greater than one [17]. That is to say, the number of simultaneous data transmissions in the SCMA system can exceed the number of available independent resources by a considerable margin. This is essential for applications that need to support multiple concurrent data transfers, particularly in scenarios where radio resources for data transmission are limited and there is a surge in data (e.g., in emergency situations). A typical example is forest management and monitoring, where ecosystem protection and the rapid identification of environmental fluctuations rely on the aggregation of real-time data collected from a substantial number of terminal devices. For SCMA systems, a critical issue that has been extensively researched is the design of codebooks [18]. Well-designed codebooks enhance the detection performance at the receiver side. This, in turn, directly influences the communication coverage and the quality of service (QoS) experienced by the terminal devices. In other words, for a predetermined QoS requirement, the codebook design affects the number of devices that a non-orthogonal system can serve. Therefore, the design of codebooks is also the focus of this paper.
The SCMA technology was first proposed by the Huawei Corporation in 2013 [16]. Since then, further research has been conducted on power-balanced SCMA (pb-SCMA) technology [19,20,21]. Power balancing means that devices sharing the same radio resource apply codebooks with an equal average power level. The lattice rotation technique was used in [19] to construct a multi-dimensional mother constellation with the desired Euclidean distance distribution. Specific operators were then applied to the mother constellation to generate SCMA codebooks. A novel SCMA codebook design scheme was proposed in [20] from the perspective of capacity. It first optimized a basic M-order pulse amplitude modulation (PAM) constellation and then rotated the constellation angles to enhance the sum rate. The resulting constellation sets were then employed to construct multi-dimensional codebooks based on Latin square principles. The authors of [21] developed a systematic construction procedure for SCMA codebooks tailored to various channel conditions. Simulations suggested that the proposed scheme approaches near-optimal symbol error performance [22]. However, the design rules were not supported by a comprehensive theoretical foundation. In recent years, several studies have explored the potential of power-imbalanced codebooks to enhance decoding performance by leveraging the “near–far effect” among devices [23,24,25]. While this method can improve overall performance, it sacrifices the performance of some devices in low-SNR regions.
Although some work has been carried out on codebook design for pb-SCMA systems, the optimal solution remains elusive. Currently, sub-optimal multi-stage optimization methods remain the main approach for codebook construction due to their low complexity [21]. For instance, preceding studies [19,20,21] all adopted this approach for codebook design. The codebooks for different devices are generated by applying operations such as angle rotation, symbol permutation, and conjugate transposition to the mother codebook. However, to the best of our knowledge, there has been no explicit investigation into the impact of these operations on the corresponding detection performance, despite the significance of such an investigation for the design of codebooks. Furthermore, the optimization objective for each stage has not been derived mathematically from a theoretical analysis perspective. Indeed, our analysis shows that the existing codebooks have both advantages and disadvantages with respect to the stage optimization objectives derived in this paper, which leaves room for improvement in the design of codebooks. Motivated by these factors, we studied the pb-SCMA system under an AWGN channel. The detection performance in terms of symbol error probability was investigated mathematically. Although, as in [21], this paper employs a multi-stage optimization methodology for codebook design, the objectives of each stage are derived theoretically. We then present a lightweight codebook design scheme with a detailed example. The main contributions of this paper are summarized as follows.
A mathematical transceiver model for an SCMA system under an AWGN channel is established. Based on this, an analysis of the symbol error probability at the receiver side is conducted. A theoretical upper bound is then obtained, which serves as a reference for codebook design.
A set of stepwise codebook design criteria is derived mathematically. These criteria include the theoretical optimization objectives for the codebook design on a single resource element (RE), for a single device, and for multiple devices on multiple REs.
Based on the derived stepwise criteria, a lightweight stepwise codebook design scheme is proposed. The codebooks designed in this paper exhibit detection performance no worse than that of the codebooks in [21], which are considered to approach near-optimal performance.
The remainder of this paper is organized as follows. Section 2 introduces the mathematical transceiver model of an SCMA system under an AWGN channel. Section 3 derives the symbol error probability based on the maximum a posteriori (MAP) detector. A theoretical upper bound is obtained for codebook design. Section 4 presents a mathematical analysis of the optimization criteria for each stage of multi-device codebook design. Section 5 describes the detailed codebook design process for the SCMA system. Section 6 presents the simulation results of the proposed codebooks. Finally, Section 7 concludes this paper.
2. Mathematical Transceiver Model
Figure 1 illustrates a downlink schematic diagram of an SCMA system. As illustrated in
Figure 1a, the base station, which may be a space station in a 6G system, concurrently transmits downlink data to
devices using
REs. In a forest scenario, these devices can be the head devices of each wireless sensor cluster.
Figure 1b delineates the data processing procedure at the transmitter side, whereby the base station employs distinct complex codebooks to encode bit streams for different devices. The codebook for device
, denoted by
, is a matrix of size
, i.e.,
. Each column of
is called a codeword for device
. Thus, a codebook consists of
M codewords. For each device, the base station maps every
bits in the corresponding data stream to a codeword according to the device’s codebook. Here, we define
as a codeword for device
. It is also a column of
, i.e.,
. In an SCMA system, the codewords in each codebook are sparse column vectors, as they possess non-zero values (i.e., valid data transmission) solely on a fixed set of REs. The REs for non-zero values vary between codebooks. For instance, the space station depicted in
Figure 1a transmits downlink data to
devices via
REs. The codeword of device 1 has non-zero values exclusively on RE 1 and RE 2. In the case of device 2, the non-zero values are mapped to RE 3 and RE 4. It follows that
and
. The load factor of devices on a single RE is
in an SCMA system, which is 1.5 in the aforementioned instance. This load factor is substantially larger than the load factor in an OMA system, which is merely 1. This improved load factor is essential for forest management and monitoring, where a substantial number of terminal devices need to be supported for the purpose of real-time data collection.
The mapping relationship between the non-zero values in each device’s codeword and the corresponding index of REs can be represented by a two-dimensional indicator matrix,
, with dimensions
and
. A row and a column of
represent an RE and a device, respectively.
means that device
engages in data transmission on RE
[19]. In
Figure 1b, each device’s codeword occupies
REs for data transmission. Each RE contains a superimposed data signal intended for
devices. The mapping matrix corresponding to
Figure 1b is
It is evident that this is a sparse matrix.
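As a minimal sketch of how such an indicator matrix can be used, the following Python snippet derives the per-RE device sets, the per-device RE sets, and the load factor. The matrix `F` below is an assumption: it is the commonly used regular 4 × 6 SCMA pattern, chosen only because it agrees with the assignments stated in the text (device 1 on REs 1 and 2, device 2 on REs 3 and 4, RE 1 shared by devices 1, 3, and 5). The helper names `devices_on_re` and `res_of_device` are illustrative.

```python
# Assumed 4x6 indicator matrix: rows are REs, columns are devices.
# It matches the assignments described in the text but is otherwise
# an assumption, not necessarily the matrix of Equation (1).
F = [
    [1, 0, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 1, 0],
]

num_res = len(F)
num_devices = len(F[0])


def devices_on_re(k):
    """1-based indices of the devices that transmit on RE k (1-based)."""
    return {j + 1 for j in range(num_devices) if F[k - 1][j] == 1}


def res_of_device(j):
    """1-based indices of the REs occupied by device j (1-based)."""
    return {k + 1 for k in range(num_res) if F[k][j - 1] == 1}


# Load factor: number of devices divided by number of REs.
load_factor = num_devices / num_res
```

Under this assumed matrix, every RE carries three devices and every device occupies two REs, which gives the load factor of 1.5 mentioned above.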
The superimposed symbols transmitted by the base station on these
REs can be expressed as
where
is the actually transmitted codeword for device
and
represents the superimposed symbol on RE
, i.e.,
In Equations (2) and (3),
is a set of devices that have data transmissions on RE
. As illustrated in
Figure 1, RE 1 encompasses data transmitted to devices 1, 3, and 5. Therefore,
. We also define
to represent the set of REs actually occupied by device
. For example,
, as device 1 occupies RE 1 and RE 2 for data transmission.
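The superposition described above can be sketched in a few lines of Python. The codeword values are made up for illustration, only three devices are shown, and the function name `superimpose` is hypothetical; the point is simply that the transmitted symbol on each RE is the sum of the sparse codeword entries of the devices mapped to that RE.

```python
# Hypothetical sparse codewords: one length-4 complex vector per device,
# non-zero only on the REs that the device occupies (values are made up).
codewords = {
    1: [0.5 + 0.5j, -0.5 + 0.0j, 0j, 0j],   # device 1 on REs 1 and 2
    3: [0.0 - 1.0j, 0j, 0.3 + 0.2j, 0j],    # device 3 on REs 1 and 3
    5: [1.0 + 0.0j, 0j, 0j, -0.7j],         # device 5 on REs 1 and 4
}


def superimpose(codewords, num_res):
    """Add the devices' codewords RE by RE."""
    total = [0j] * num_res
    for x in codewords.values():
        for k in range(num_res):
            total[k] += x[k]
    return total


s = superimpose(codewords, 4)
```

Because each codeword is zero outside its own REs, summing whole vectors is equivalent to summing only over the devices in the set associated with each RE.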
The data received by device
can be represented as
where
is the data received by device
,
denotes the channel coefficient for device
, and
is the channel gain between the base station and device
on RE
. Additionally,
is the noise at the receiver side of device
, each element of which obeys a complex Gaussian distribution, i.e.,
.
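A minimal sketch of this receive model is given below, assuming a single scalar channel coefficient `h` per device and circularly symmetric complex Gaussian noise whose total variance `noise_var` is split evenly between the real and imaginary parts. The function name and default values are illustrative assumptions, not taken from the paper.

```python
import random


def awgn_receive(s, h=1.0, noise_var=0.1, rng=None):
    """Return y_k = h * s_k + n_k, with n_k complex Gaussian of variance noise_var."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    sigma = (noise_var / 2) ** 0.5  # std of the real and imaginary parts
    return [
        h * sk + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        for sk in s
    ]


# Example: two superimposed symbols passed through the channel.
y = awgn_receive([1 + 1j, -1j], h=1.0, noise_var=0.1)
```

Setting `noise_var=0.0` returns the scaled transmitted symbols unchanged, which is a convenient sanity check.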
In this paper, we assume that each receiver decodes the data for all devices. Therefore, in the subsequent analysis, no attempt is made to highlight which device is the recipient. The subscript
in Equation (4) can then be omitted, i.e.,
The data received on RE
can be expressed as
3. Analysis of Detection Performance
Let be the matrix comprising column vectors, i.e., (). Here, is a codeword, i.e., a column vector of the matrix . Matrix is made up of the codewords actually transmitted from the base station to the devices. Set includes all possibilities of matrix . Set contains all the elements in set except , i.e., or . Given that there are codewords in a single codebook, it can be inferred that the numbers of elements in sets and are and , respectively.
By employing a maximum a posteriori (MAP) detector at the receiver side, the estimate of
can be expressed as
Here, we define
as the metric criterion corresponding to
. The estimate for
is the
that corresponds to the minimum element in the set
, i.e.,
. Let
represent the minimum element, i.e.,
. When
or
, an estimation error of
occurs with an error probability of
Equation (9) can then be rewritten as
where
is the received symbol signal-to-noise ratio (SNR) on RE
. For an AWGN channel, we can assume that
. Thus, Equation (8) becomes
Here, we define the upper bound in Equation (12) as
, i.e.,
Assume that
is the symbol (or codeword) detection error rate of the transmitted codeword,
, at the receiver side. It follows that
. In a pb-SCMA system, the basic attributes of each codebook (e.g., the Euclidean distance between constellation points) need to be identical so that the detection error rate of each device can be considered equal. Accordingly, we can assume that
at the same receiver side. Therefore, we have
which means that
, i.e.,
In the subsequent section,
is used as a reference criterion to evaluate the performance of the designed codebook. To validate the reasonableness of
, simulation results are provided in
Figure 2 with various existing codebooks [19,20,21]. Given the high complexity of the MAP algorithm, a suboptimal alternative, the message-passing algorithm (MPA), is often employed as a surrogate for MAP in practical simulations [26]. In this algorithm, the belief messages are passed between the neighboring RE nodes and device nodes in the factor graph generated according to the indicator matrix for a number of iterations until the termination criterion is met. Although it is challenging to mathematically analyze the detection process of MPA, it is generally accepted that its performance is close to that of the MAP detector [27].
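To make the complexity argument concrete, the following sketch implements an exhaustive detector that tries every combination of codewords and keeps the one whose superposition is closest to the received vector. It assumes equiprobable codewords and an AWGN channel, in which case MAP detection reduces to a minimum Euclidean distance search. The toy codebooks, the function name `map_detect`, and all numeric values are illustrative assumptions.

```python
from itertools import product


def map_detect(y, codebooks, h=1.0):
    """Exhaustive search; codebooks[j] is a list of length-K codewords."""
    best_idx, best_metric = None, float("inf")
    # One candidate per combination of codeword indices (M**J in total).
    for idx in product(*(range(len(cb)) for cb in codebooks)):
        # Superimpose the candidate codewords RE by RE.
        cand = [
            sum(cb[m][k] for cb, m in zip(codebooks, idx))
            for k in range(len(y))
        ]
        metric = sum(abs(yk - h * ck) ** 2 for yk, ck in zip(y, cand))
        if metric < best_metric:
            best_idx, best_metric = idx, metric
    return best_idx


# Toy setup (made-up values): 2 devices, 2 REs, 2 codewords each.
toy_codebooks = [
    [[1 + 0j, 0j], [-1 + 0j, 0j]],    # device A uses RE 1 only
    [[1j, 1 + 0j], [-1j, -1 + 0j]],   # device B uses REs 1 and 2
]
# Device A sends codeword 1, device B sends codeword 0, plus a small offset.
y = [toy_codebooks[0][1][k] + toy_codebooks[1][0][k] + 0.05 for k in range(2)]
detected = map_detect(y, toy_codebooks)
```

The number of candidates grows exponentially with the number of devices, which is why an iterative algorithm such as MPA is preferred in simulations.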
Figure 2 illustrates the symbol error rate of each device,
(
), and the average symbol error rate,
, which is defined as
. The theoretical upper bound,
(see Equation (13)), is also provided in
Figure 2 for reference. From
Figure 2, we can find that
, which is consistent with our previous assumption. In the low-SNR region,
and
differ significantly. As
increases, the difference between
and
gradually decreases. Especially in the high-SNR region,
is very close to
. It is noteworthy that in
Figure 2,
sometimes appears slightly higher than
. Ref. [18] has already provided an explanation of this phenomenon, stating that MPA does not consistently select
with the minimum
as
. MAP, by contrast, always selects this minimum, so MPA exhibits slightly inferior performance compared to MAP.
4. Analysis of Multi-Stage Codebook Design Criteria for an AWGN Channel
In order to minimize the upper bound,
, of the designed codebook, each summation term,
, in (13) needs to be as small as possible. That is to say,
in each term is required to be as large as possible. Therefore, the minimization problem of
can be transformed to maximize
, which is defined as
To further reduce the complexity of codebook design, we also apply the multi-stage optimization approach. Next, we will study the objective of each stage, including the codebook design on a single RE and the codebook design for one device.
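The link between distances and the upper bound can be illustrated numerically. The sketch below assumes the textbook pairwise error term Q(d / sqrt(2 * noise_var)) for complex AWGN with total noise variance `noise_var`; this form, the scalar constellation, and the function names are assumptions and are not necessarily identical to the terms in Equation (13).

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def union_bound(points, noise_var):
    """Average, over the transmitted point, of the summed pairwise terms."""
    total = 0.0
    for i, a in enumerate(points):
        for j, b in enumerate(points):
            if i != j:
                d = abs(a - b)  # Euclidean distance between two points
                total += q_function(d / math.sqrt(2 * noise_var))
    return total / len(points)


# A wider constellation gives a smaller bound at the same noise level.
tight = union_bound([-2, 2], noise_var=0.5)
loose = union_bound([-1, 1], noise_var=0.5)
```

Since Q(x) is strictly decreasing, enlarging every pairwise distance lowers every term of the sum, which is the reasoning behind turning the minimization of the bound into a distance maximization.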
4.1. Codebook Design on a Single RE
For a single RE,
(
), only
devices have data symbols on it. As mentioned in
Section 2, the codeword or symbol of device
on RE
is represented as
, where
. Let
be the vector composed of
elements, the
th (
) element of which is a codeword or symbol for device
on this RE, i.e.,
(
).
represents the codeword vector actually transmitted by the base station to these
devices on RE
. Set
includes all the
possibilities of vector
. Set
includes the remaining
vectors except
, i.e.,
or
.
Similarly, the estimated codeword,
, at the receiver side is given by
. The detection error probability of
on RE
is
Here, we define
as the upper bound of the error probability on RE
, i.e.,
There are
summation terms in Equation (18). To lower
, the distance
in each term should be as large as possible. Therefore, the problem of minimizing
can be transformed into maximizing the elements in set
, especially those with small values. Here, we define set
as
, which is the set obtained by arranging the
elements of set
in order from smallest to largest. Thus, the minimization problem of
can be formulated as maximizing
, i.e.,
which is the sum of the first
smallest elements in set
. Here,
can be set to 3% or more of the number of elements in set
.
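The single-RE criterion can be sketched as follows: enumerate all superimposed points on the RE, sort their pairwise distances, and sum the smallest ones. The sketch assumes that unordered pairs of superimposed points and plain (not squared) Euclidean distances are used, and that the number of summed terms is a fraction (3% by default) of the number of distances, rounded up; these details and the name `single_re_metric` are assumptions.

```python
import math
from itertools import combinations, product


def single_re_metric(codebooks_1d, fraction=0.03):
    """Sum of the smallest pairwise distances of the superimposed points."""
    # Each choice of one symbol per device yields one superimposed point.
    points = [sum(combo) for combo in product(*codebooks_1d)]
    dists = sorted(abs(a - b) for a, b in combinations(points, 2))
    tau = max(1, math.ceil(fraction * len(dists)))
    return sum(dists[:tau]), tau


# Toy example: two devices with two one-dimensional symbols each.
metric, tau = single_re_metric([[-1, 1], [-1j, 1j]])
```

A candidate set of one-dimensional codebooks with a larger value of this metric has its closest superimposed points further apart, which is the property the criterion rewards.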
4.2. Codebook Design for a Single Device
In this subsection, we will study the codebook design problem for a single device. Let us start with the codebook design problem for a single device on a single RE. Assume that
is the set that includes all the
codewords or symbols of device
(
) on RE
. They satisfy
.
is used to represent the codeword actually transmitted for device
on RE
. Set
denotes the remaining
codewords or symbols except
, i.e.,
. Similarly, we can derive the upper bound of the codeword or symbol error probability for device
on RE
, i.e.,
Obviously, the larger the distance
is, the smaller the Q-function in Equation (20) becomes. Minimizing
can be transformed into maximizing the smallest element,
, among the
elements in
, where
is defined as
Next, we will consider the codebook design problem for a single device on the
REs it actually occupies. Assume that
is the codeword transmitted by the base station for device
on the corresponding
REs. Set
(
) includes all the
codewords of device
on these
REs.
represents the set that contains the remaining
possible codewords. The upper bound of the codeword error probability for device
on these
REs is
To lower
, the
elements in the set
should be as large as possible, where the element
is defined as
. Therefore, we transform the minimization of
to the problem of maximizing
, which is defined as
To sum up, we have derived the criteria for each stage of the multi-stage optimization method. First, the codebook containing one-dimensional codewords for a single device on a single RE is designed to maximize the objective in (21). Second, the codebooks for the devices sharing a single RE are designed to maximize the objective in (19); at this stage, each codebook still has one-dimensional codewords. Third, the codebook for one device on all the REs it occupies is designed to maximize the objective in (23); at this stage, each codebook contains multi-dimensional codewords. Finally, the codebooks for all devices on all REs are designed to maximize the objective in (16).
5. Multi-Device Codebook Construction
Based on the design criteria obtained in
Section 4, we present a lightweight stepwise codebook construction method for an SCMA system with
,
, and
under an AWGN channel. In this example, each RE accommodates
devices, and each device actually occupies
REs. The corresponding mapping matrix,
, is the one provided in (1).
5.1. Codebook Construction for a Single Device on a Single RE
As in references [19,20,21], we also selected a PAM constellation to design the codebook for a single device on a single RE. Assume that set
includes all the
codewords or symbols of the device on RE
, where
(
). To lower the upper bound in (20), the optimal codebook for this device can be obtained as
with a constraint of
. When
, the corresponding codebook,
, for a single device on a single RE can be derived as
.
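As a minimal sketch of this step, the snippet below generates an equally spaced M-PAM constellation. The unit-average-power normalization is an assumption made for illustration; the constraint used in the paper is not reproduced here, and the function name `pam_codebook` is hypothetical.

```python
def pam_codebook(M):
    """Equally spaced M-PAM points, scaled to unit average power (assumed)."""
    raw = [2 * m - (M - 1) for m in range(M)]   # e.g. -3, -1, 1, 3 for M = 4
    scale = (sum(p * p for p in raw) / M) ** 0.5
    return [p / scale for p in raw]


c0 = pam_codebook(4)
```

Equal spacing maximizes the smallest distance between points on a line for a given average power, which is why PAM is a natural starting point for the one-dimensional codebook.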
5.2. Codebook Construction for Devices on a Single RE
In this subsection, we will construct codebooks for the
devices on a single RE, with each codebook containing
one-dimensional codewords. Thus, the codebook is also named a one-dimensional codebook at this stage. Here, angle rotation is applied to the initially obtained codebook,
, in (24) to generate different codebooks for different devices. The codebook for device
(
) on this RE can be represented as
where
is the rotation angle for the corresponding codebook,
. Here, we assume that
. To maximize
in (19),
can be derived as
Based on (26), we can determine that
and
or that
and
when
. The corresponding constellations of these
codebooks
are shown in
Figure 3a.
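The rotation step can be sketched as below. The evenly spaced angles 0, π/3, and 2π/3 for three devices are an assumption chosen for illustration and are not taken from Equation (26); the unnormalized base constellation and the function name `rotate` are likewise illustrative.

```python
import cmath
import math


def rotate(codebook, theta):
    """Rotate every constellation point by theta radians."""
    return [p * cmath.exp(1j * theta) for p in codebook]


base = [-3, -1, 1, 3]      # unnormalized 4-PAM (assumed base codebook)
d_f = 3                    # number of devices sharing one RE
# Assumed rotation angles: 0, pi/3, 2*pi/3.
angles = [j * math.pi / d_f for j in range(d_f)]
rotated = [rotate(base, th) for th in angles]

# Smallest distance between points that belong to different devices.
cross_min = min(
    abs(p - q)
    for a in range(d_f)
    for b in range(a + 1, d_f)
    for p in rotated[a]
    for q in rotated[b]
)
```

Rotation preserves the magnitude of every point, so all devices keep the same average power while their constellation points no longer coincide on the shared RE.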
5.3. Codebook Construction for a Single Device
In an SCMA system, the codeword of a device occupies multiple REs. The codebook obtained in (25) is just a one-dimensional codebook on one RE for a given device. In this subsection, we will study how the one-dimensional codebooks on the other REs are designed for that device.
In order to achieve shaping gain [14] from codebooks on different REs and maintain fairness and balance in the energy of each codeword, it is essential that the one-dimensional codebooks on different REs are not simple repetitions. A permutation operation is employed here to generate codebooks on the other REs by swapping the mapped positions of the codewords on the constellation. Set
includes all the codebooks obtained by the permutation operation on the codebook
. Among these codebooks, the optimal one can be derived as
where
is defined in (23).
Figure 3b shows the constellations of the derived codebooks,
, where
. When
, the device can select one one-dimensional codebook from each of sets
and
. These codebooks will then constitute the device’s
-dimensional codebook.
5.4. Codebook Construction for Multiple Devices
In the preceding subsections, the sets of one-dimensional codebooks for devices to select from on different REs were obtained, namely the rotated codebooks from Section 5.2 and their permuted counterparts from Section 5.3. However, the question of how each device selects these one-dimensional codebooks to construct its multi-dimensional codebook remains open.
In this subsection, we will present the principles that need to be considered for a device to select one-dimensional codebooks. Each candidate codebook is identified by an index. The first principle is that devices sharing the same RE need to choose codebooks with different indices. For instance, when a device selects the codebook with a given index on an RE, i.e., it chooses either the rotated codebook with that index or its permuted counterpart, no other device occupying this RE may choose either of these two codebooks. This requirement prevents the constellation points of different devices’ codebooks from overlapping on the same RE. Secondly, the one-dimensional codebooks on different REs for the same device must not all come from the same set. For instance, when a device selects a rotated codebook on its first occupied RE, at least one codebook must be selected from the permutation set on the other REs occupied by this device. This requirement serves to achieve shaping gain from the one-dimensional codebooks on different REs while maintaining fairness and balance in the energy of each codeword, as previously stated. There are many choices that satisfy these two principles. Nevertheless, the multi-dimensional codebooks of the devices must maximize the objective in Equation (16).
When the corresponding mapping matrix,
, is the one provided in (1), the one-dimensional codebook selection scheme for each device on its occupied REs is provided as
A row and a column in (28) also represent an RE and a device, respectively. It can be seen that device 1 selects on RE 1. On its second occupied RE, i.e., RE 2, it chooses from the set as the one-dimensional codebook. Device 2 selects on RE 3. On RE 4, it chooses from the set as its codebook. This selection scheme satisfies both of the aforementioned principles.
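The two selection principles can be checked mechanically. In the sketch below, the table `selection` is hypothetical and is not the scheme in (28): it maps each RE to the devices on it, and each device to a pair of codebook index and family, where `"orig"` stands for the rotated set and `"perm"` for the permutation set. All names are illustrative.

```python
# Hypothetical selection table: selection[re][device] = (index, family).
selection = {
    1: {1: (1, "orig"), 3: (2, "orig"), 5: (3, "orig")},
    2: {1: (2, "perm"), 4: (1, "orig"), 6: (3, "orig")},
    3: {2: (1, "orig"), 3: (3, "perm"), 6: (2, "perm")},
    4: {2: (2, "perm"), 4: (3, "perm"), 5: (1, "perm")},
}


def distinct_indices_per_re(selection):
    """Principle 1: devices sharing an RE use different codebook indices."""
    return all(
        len({idx for idx, _ in row.values()}) == len(row)
        for row in selection.values()
    )


def mixed_families_per_device(selection):
    """Principle 2: no device takes all its codebooks from one family."""
    families = {}
    for row in selection.values():
        for dev, (_, fam) in row.items():
            families.setdefault(dev, set()).add(fam)
    return all(len(f) > 1 for f in families.values())


ok = distinct_indices_per_re(selection) and mixed_families_per_device(selection)
```

Such a check only filters out invalid selections; among the valid ones, the final choice still has to be made by comparing the multi-device objective.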
Figure 4 illustrates the constellations of the finalized
-dimensional codebooks for each device with the one-dimensional codebook selection scheme provided in (28).
(
) and
(
) are used to denote the indices of the RE and the device, respectively. As can be seen in this figure, the basic characteristics of each codebook are largely analogous, which suggests that these devices will exhibit comparable performance. The detailed codebooks of each device are provided in
Appendix A.
6. Simulation Results
In this section, the detection performance of the proposed codebooks provided in
Appendix A will be illustrated. The corresponding parameters are listed in
Table 1.
Figure 5 shows the symbol error rate of each device,
(
), and the average symbol error rate,
, with the proposed codebooks in an AWGN channel. The detection performance is consistent across devices, which corroborates our previous hypothesis. In the low-SNR region, the theoretical upper bound,
, is relatively loose in comparison to the simulation results. However, as
increases,
becomes close to the simulation results, particularly in the high-SNR region. This provides further evidence that the theoretical upper bound,
, derived in this paper is a reasonable approximation and can be used as a metric for codebook design.
Figure 6 compares the performance of the codebooks proposed in this paper with those in the existing literature [19,20,21]. In the simulation, the mean value of the total power for each device’s
-dimensional codeword is normalized to one. The codebooks proposed in this paper demonstrate superior detection performance compared to the other codebooks. This can be inferred from the optimization objective of each stage listed in
Table 2. These parameters include
(defined in Equation (21)),
(defined in Equation (19)),
(defined in Equation (23)), and
(defined in Equation (16)). As can be seen from
Table 2, the codebooks designed in this paper achieve the maximum values for all four parameters, suggesting better performance. From
Table 2 and
Figure 6, it can be observed that maximizing
alone (the optimization objective on a single RE) or maximizing
alone (i.e., the optimization objective for a single device) does not yield the highest
. In other words, these factors need to be considered simultaneously. While the Huawei codebook [19] has the largest
, it achieves the smallest
, resulting in the worst performance. The codebooks of Zhang [20] and Chen [21] and the codebooks designed in this paper all employ M-PAM as the base constellation on an RE. Compared to [20], ref. [21] obtains a larger
, resulting in better performance in the high-SNR region. Overall, in the low-SNR region, the performance of the aforementioned codebooks was found to be relatively similar. In the high-SNR region, the codebook designed in this paper optimizes the corresponding parameter of each stage, thereby achieving a larger
and a relative performance improvement. For a wireless communication system, a lower SER may signify either an expansion of communication coverage or an enhancement of the QoS experienced by the terminal devices. In scenarios where the QoS requirement is specified, codebooks with a lower SER have the potential to augment the number of devices engaged in this non-orthogonal system. This is important in scenarios where a large number of devices need to be deployed, such as in forest management and monitoring.
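The power normalization used for the comparison can be sketched as follows, assuming that a codebook is stored as a list of codewords and that the mean total energy per codeword is scaled to one. The function name and the example values are illustrative.

```python
def normalize_codebook(codebook):
    """Scale the codewords so that their mean total energy equals one."""
    mean_energy = sum(
        sum(abs(x) ** 2 for x in cw) for cw in codebook
    ) / len(codebook)
    scale = mean_energy ** 0.5
    return [[x / scale for x in cw] for cw in codebook]


# Example: an unnormalized two-dimensional codebook with four codewords.
cb = normalize_codebook([[-3, 1], [-1, -3], [1, 3], [3, -1]])
mean_energy_after = sum(sum(abs(x) ** 2 for x in cw) for cw in cb) / len(cb)
```

Normalizing every codebook in this way ensures that differences in the symbol error rate stem from the constellation geometry rather than from unequal transmit power.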