1. Introduction
Measurement has been developed through the physical sciences and plays a very important role in industry, commerce, health and safety, and environmental protection [1,2,3,4,5]. A unified model of measurement systems is critical to their design and optimization. However, the existing measurement theory, which is reviewed below, is abstract; to a certain extent, this makes it difficult to form, at the outset, a clear overall understanding of measurement systems and of how information is obtained with measurement units during the measurement process. Therefore, measurement science needs a theoretical framework [2] that can intuitively describe, analyze, and evaluate measurement systems and characterize how measurement units work to obtain information about the measurand.
Numerous works on the modeling of measurement and measurement systems have been developed and published. Helmholtz and Hölder developed a theory of measurement based on the concepts of the physical sciences [1], which regarded measurement as the set of operations assigning a determinate numerical value to a physical quantity of the object [2]. Subsequently, three main modeling approaches or theories were developed and studied: the representational theory, the object-oriented method, and the probabilistic theory [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25]. As the main body of these studies, the representational theory [6,7,8,9,10,11,12,13,14,15,16] represents the mapping between the measurand and the measurement result with general symbols, from different standpoints such as semiotics [7,9], set theory [13], or domain theory [15]. The object-oriented method [17,18,19] applies object-oriented technology from computer programming to construct an object-oriented model of the measurement system, dividing the measurement elements into five classes described by their attributes, operations, or environment. The probabilistic theory [20,21,22,23,24,25] proposes a complete theory of measurement, including probability representations of different measurement scales, probabilistic descriptions of measurement systems, and measurement processes.
The abovementioned studies proposed modeling methods from different perspectives or according to relevant theories. Models of measurement, measurement processes, or measurement systems were established, and even special measurement problems, such as those in ratio, interval, ordinal, and nominal scales, were adequately considered [13,16,21,22,25]. However, the implementation of measurement relies on a series of measurement units. These theoretical models have considered measurability, the relationship between the input and output of the system, calibration, restitution, etc., but cannot describe the role of the measurement unit in the measurement process. Furthermore, one of the core problems of all measurement is uncertainty [26,27], but a model of measurement systems established directly from uncertainty has not been proposed.
Since absolute zero is impossible to achieve and the measured object always interacts with the outside world, an absolutely standard measurement environment is not possible. This causes the measurand to be essentially a random process or a random sequence [28]. If the measurand is stationary, it can be considered a random variable within a short measurement time. After an effective measurement, the uncertainty of the measurand is reduced compared with its uncertainty before the measurement. Therefore, measurement is a process of uncertainty reduction, and its essence is information acquisition. Additionally, with the development of measurement science, the viewpoint that measurement is an information process and that instruments are information machines is widely recognized [1,2,9,29,30]. Since Shannon proposed the concept of information entropy as a measure of the information and uncertainty of a random variable [31], information theory has been applied to some aspects of measurement [32,33,34]. Therefore, information entropy has high potential as a method for solving measurement problems, and it is feasible to establish a system model from the perspective of uncertainty with information entropy.
In this paper, an information entropy-based modeling method for measurement systems is proposed. The main contributions of this paper are as follows: (1) a modeling idea based on the viewpoint of information and uncertainty is presented; (2) the entropy balance equation, based on the chain rule for entropy, is proposed for system modeling; and (3) an information entropy-based model of measurement systems is established from the entropy balance equation, from the perspective of uncertainty and information acquisition.
The rest of this paper is organized as follows. Section 2 presents the preliminaries on the relations between different entropies and proposes the entropy balance equation. The information entropy-based model of the measurement system is proposed in Section 3. Section 4 analyzes three cases of typical measurement units or processes with the proposed method. Finally, conclusions are drawn in Section 5.
2. Methodology
2.1. Information Entropy and Related Concepts
Entropy is a measure of the uncertainty of a random variable. For a discrete random variable $X$ with a finite number of states $x_i$, $i = 1, 2, \ldots, n$, the probability of each state is denoted as $p(x_i) = \Pr\{X = x_i\}$. For the sake of simplicity, we use $p(x)$ to represent the probability instead of $p(x_i)$. Similarly, for a discrete random variable $Y$, its probability function is denoted as $p(y)$. The joint probability function of $X$ and $Y$ is represented by $p(x, y)$.
Definition 1. The information entropy of the discrete random variable $X$ is defined as:
$$H(X) = -\sum_{x} p(x)\log p(x). \quad (1)$$
If the log is to base 2, the unit of information entropy is bits; if the log is to base e (the natural logarithm), the unit is nats; and if the log is to base 10, the unit is hartleys. The related measures introduced later use the same units, as do the entropy and related measures of continuous random variables.
For a continuous random variable $X$ with probability density function $f(x)$, the information entropy is infinite, since the number of its states is infinite. In this case, the information entropy is the sum of the differential entropy and a constant that tends to infinity. The definition of differential entropy is given as follows:
Definition 2. The differential entropy of the continuous random variable $X$ with probability density function $f(x)$ is defined as:
$$h(X) = -\int f(x)\log f(x)\,dx. \quad (2)$$
Obviously, differential entropy cannot represent the uncertainty of a continuous random variable and does not have the connotation of information. However, when discussing mutual information, the two infinite constant terms cancel each other, so differential entropy has the same information characteristics as information entropy.
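As an illustration of this relationship, the following Python sketch (a minimal example, not from the paper) estimates the entropy of a finely binned Gaussian sample; adding $\log \Delta$ removes the divergent constant and recovers the closed-form differential entropy $h(X) = \frac{1}{2}\ln(2\pi e \sigma^2)$:

```python
import numpy as np

# Closed form: for X ~ N(0, sigma^2), h(X) = 0.5 * ln(2*pi*e*sigma^2) nats.
sigma = 2.0
h_closed = 0.5 * np.log(2 * np.pi * np.e * sigma**2)

# Histogram plug-in estimate: h(X) ~= H(binned X) + log(bin width),
# i.e., the infinite constant -log(delta) is removed, as discussed above.
rng = np.random.default_rng(0)
x = rng.normal(0.0, sigma, 1_000_000)
counts, edges = np.histogram(x, bins=1000)
delta = edges[1] - edges[0]
p = counts / counts.sum()
p = p[p > 0]
h_est = -np.sum(p * np.log(p)) + np.log(delta)

print(f"closed form: {h_closed:.4f} nats, estimate: {h_est:.4f} nats")
```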
In this paper, in order to make each item in the model established in Section 3 have the connotation of information, the uncertainty of a random variable is characterized by information entropy, whether the random variable is continuous or discrete. In addition, for continuous cases, mutual information is calculated using differential entropy.
Based on the information entropy, the related concepts and their definitions are introduced below:
Definition 3. The joint information entropy of the discrete random variables $X$ and $Y$ is defined as:
$$H(X, Y) = -\sum_{x}\sum_{y} p(x, y)\log p(x, y). \quad (3)$$
Definition 4. The conditional entropy of the discrete random variable $Y$ given $X$ is defined as:
$$H(Y \mid X) = -\sum_{x}\sum_{y} p(x, y)\log p(y \mid x). \quad (4)$$
Definition 5. The average mutual information (also referred to as mutual information) between the discrete random variables $X$ and $Y$ is defined as:
$$I(X; Y) = \sum_{x}\sum_{y} p(x, y)\log \frac{p(x, y)}{p(x)\,p(y)}. \quad (5)$$
The relationship between $H(X)$, $H(Y)$, $H(X, Y)$, $H(X \mid Y)$, $H(Y \mid X)$, and $I(X; Y)$ can be expressed by the Venn diagram shown in Figure 1. Two equations governing this are:
$$H(X, Y) = H(X) + H(Y \mid X) = H(Y) + H(X \mid Y), \quad (6)$$
$$I(X; Y) = H(X) - H(X \mid Y) = H(Y) - H(Y \mid X) = H(X) + H(Y) - H(X, Y). \quad (7)$$
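As a quick numerical check of Equations (6) and (7), the following Python sketch (the joint distribution is an arbitrary example, not from the paper) computes the entropies of a 2 × 2 joint distribution and verifies the identities:

```python
import numpy as np

# Joint distribution p(x, y) for two binary variables (rows: x, cols: y).
pxy = np.array([[0.30, 0.10],
                [0.20, 0.40]])
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

def H(p):
    """Entropy in bits of a probability array (zeros are ignored)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

Hx, Hy, Hxy = H(px), H(py), H(pxy.ravel())
Hy_given_x = Hxy - Hx          # Equation (6) rearranged
Hx_given_y = Hxy - Hy
I = Hx + Hy - Hxy              # Equation (7)

print(f"H(X)={Hx:.4f}, H(Y)={Hy:.4f}, H(X,Y)={Hxy:.4f}")
print(f"H(Y|X)={Hy_given_x:.4f}, H(X|Y)={Hx_given_y:.4f}, I(X;Y)={I:.4f}")
# Check: I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)
assert np.isclose(I, Hx - Hx_given_y) and np.isclose(I, Hy - Hy_given_x)
```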
2.2. Entropy Balance Equation
In this part, an extension of the chain rule for joint entropy, called the entropy balance equation (Equation (8)), is developed for system modeling; it is stated and proved below:
Theorem 1. Given random variables $X_1, X_2, \ldots, X_n$ which are drawn according to $p(x_1, x_2, \ldots, x_n)$, then:
$$H(X_1) + \sum_{i=2}^{n} H(X_i \mid X_{i-1}, \ldots, X_1) = H(X_n) + \sum_{i=1}^{n-1} H(X_i \mid X_{i+1}, \ldots, X_n). \quad (8)$$
Proof. By the chain rule for entropy [35], we have:
$$H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} H(X_i \mid X_{i-1}, \ldots, X_1). \quad (9)$$
Equation (9) can be readily proved with
$$p(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(x_i \mid x_{i-1}, \ldots, x_1) \quad (10)$$
and the definitions of entropy and conditional entropy. By symmetry, one can write:
$$H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} H(X_i \mid X_{i+1}, \ldots, X_n). \quad (11)$$
Based on Equations (9) and (11), one can obtain the following equality:
$$\sum_{i=1}^{n} H(X_i \mid X_{i-1}, \ldots, X_1) = \sum_{i=1}^{n} H(X_i \mid X_{i+1}, \ldots, X_n), \quad (12)$$
which is equivalent to Equation (8). □
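As a numerical illustration (a Python sketch, not from the paper; the joint distribution is randomly generated), the two chain-rule expansions in Equations (9) and (11) can be checked to telescope to the same joint entropy, which is the content of Equations (8) and (12):

```python
import numpy as np

rng = np.random.default_rng(1)
p = rng.random((2, 2, 2))      # joint pmf of binary X1, X2, X3
p /= p.sum()

def H(pm):
    """Entropy in bits of a probability array (zeros ignored)."""
    pm = pm[pm > 0]
    return -np.sum(pm * np.log2(pm))

H123 = H(p.ravel())
H1, H3 = H(p.sum(axis=(1, 2))), H(p.sum(axis=(0, 1)))
H12, H23 = H(p.sum(axis=2).ravel()), H(p.sum(axis=0).ravel())

# Conditional entropies via Equation (6): H(B|A) = H(A,B) - H(A).
lhs = H1 + (H12 - H1) + (H123 - H12)   # H(X1) + H(X2|X1) + H(X3|X1,X2)
rhs = H3 + (H23 - H3) + (H123 - H23)   # H(X3) + H(X2|X3) + H(X1|X2,X3)
assert np.isclose(lhs, rhs)            # Equation (8): both sides balance
print(lhs, rhs, H123)
```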
4. Application
To better understand the proposed model, three cases of typical measurement units or processes are discussed in this section.
4.1. Case 1: Bandpass Filter
The bandpass filter, which is a typical unit in the measurement system, is analyzed in this section. As shown in Figure 6, the input of the filter is $X = S + N$, where $S$ is a Gaussian random variable with power $P_S$, $N$ is white Gaussian noise with power $P_N$, and $S$ and $N$ are independent of each other. The differential entropy of $S$ can be expressed as:
$$h(S) = \frac{1}{2}\log(2\pi e P_S), \quad (24)$$
and the differential entropy of $N$ is denoted by:
$$h(N) = \frac{1}{2}\log(2\pi e P_N). \quad (25)$$
Before passing through the filter, since $S$ and $N$ are independent, the power of $X$ satisfies $P_X = P_S + P_N$. The mutual information between $X$ and $S$ is:
$$I(X; S) = h(X) - h(X \mid S) = h(X) - h(N) = \frac{1}{2}\log\left(1 + \frac{P_S}{P_N}\right). \quad (26)$$
After passing through the filter, the mutual information between $X'$ and $S'$ is:
$$I(X'; S') = \frac{1}{2}\log\left(1 + \frac{P_{S'}}{P_{N'}}\right), \quad (27)$$
where $P_{S'}$ and $P_{N'}$ represent the power of $S$ and $N$ after passing through the filter, respectively.
The increment of mutual information (IMI) is defined by:
$$\Delta I = I(X'; S') - I(X; S). \quad (28)$$
Suppose that the power of the noise is $P_N = N_0 B_N$, where $B_N$ is the bandwidth of the noise and $N_0/2$ denotes the bilateral power spectral density of the noise. The filter is an ideal bandpass filter with a bandwidth of $B_F$ and a gain of 1 in the passband. After passing through the filter, the power of the noise is $P_{N'} = N_0 B_F$; then, Equation (28) can be rewritten as:
$$\Delta I = \frac{1}{2}\log\left(1 + \frac{P_{S'}}{N_0 B_F}\right) - \frac{1}{2}\log\left(1 + \frac{P_S}{N_0 B_N}\right). \quad (29)$$
According to the characteristics of the filter, the passband should be consistent with the frequency band of $S$, that is, $P_{S'} = P_S$ and $B_F \le B_N$; therefore:
$$\Delta I = \frac{1}{2}\log\frac{1 + P_S/(N_0 B_F)}{1 + P_S/(N_0 B_N)} \ge 0. \quad (30)$$
Equation (30) shows that the IMI is related to the bandwidth $B_F$ of the filter and the signal-to-noise ratio (SNR) of the input signal $X$. The narrower the bandwidth of the filter is, the larger the increment of mutual information is. In general, $B_F \ll B_N$, but the SNR of the input signal $X$ is uncertain. For small signals, the SNR is less than 1 ($P_S/P_N < 1$); then we have:
$$\Delta I \approx \frac{1}{2}\log\left(1 + \frac{P_S}{N_0 B_F}\right). \quad (31)$$
If $P_S/(N_0 B_F) \gg 1$, then:
$$\Delta I \approx \frac{1}{2}\log\frac{P_S}{N_0 B_F}. \quad (32)$$
For large signals, the SNR is generally much greater than 1 ($P_S/P_N \gg 1$); then:
$$\Delta I \approx \frac{1}{2}\log\frac{B_N}{B_F}. \quad (33)$$
The function of the filter is to filter out the noise contained in the signal. In the above three cases, the IMIs are all greater than zero, which means that, at the information level, the role of the filter is to increase the amount of information that can be obtained.
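The following Python sketch (the parameter values are chosen here for illustration, not taken from the paper) evaluates Equation (30) and compares it with the small-signal and large-signal approximations of Equations (31)–(33):

```python
import numpy as np

def imi(P_S, N0, B_N, B_F):
    """Increment of mutual information, Equation (30), in bits."""
    I_before = 0.5 * np.log2(1 + P_S / (N0 * B_N))
    I_after = 0.5 * np.log2(1 + P_S / (N0 * B_F))
    return I_after - I_before

# Illustrative parameters: noise PSD level, noise and filter bandwidths.
N0, B_N, B_F = 1e-3, 1e6, 1e3

# Small signal: SNR_in = P_S/(N0*B_N) = 0.01 << 1, P_S/(N0*B_F) = 10 >> 1.
P_S = 10.0
print(imi(P_S, N0, B_N, B_F), 0.5 * np.log2(1 + P_S / (N0 * B_F)))  # Eq. (31)

# Large signal: SNR_in = 100 >> 1, so Delta_I ~ 0.5*log2(B_N/B_F).
P_S = 1e5
print(imi(P_S, N0, B_N, B_F), 0.5 * np.log2(B_N / B_F))             # Eq. (33)
```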
4.2. Case 2: Quantization Process
The quantization process is an important step in the measurement process. From the perspective of information acquisition, quantization is a process of information loss. In theory, a continuous random variable requires infinitely high precision to be described exactly, and its information entropy is infinite. After quantization, the continuous random variable is transformed into a discrete random variable with limited precision, whose information entropy is finite.
Given a continuous random variable $X$ with a probability density function $f(x)$, the range of $X$ is evenly divided into intervals of length $\Delta$. Assume that $f(x)$ is continuous within each interval. According to the mean value theorem, there exists a value $x_i$ within each interval such that:
$$f(x_i)\Delta = \int_{i\Delta}^{(i+1)\Delta} f(x)\,dx. \quad (34)$$
After quantization, the discrete random variable $X^{\Delta}$ is obtained, and its definition is:
$$X^{\Delta} = x_i, \quad \text{if } i\Delta \le X < (i+1)\Delta. \quad (35)$$
Then, the probability of $X^{\Delta} = x_i$ is:
$$p_i = \int_{i\Delta}^{(i+1)\Delta} f(x)\,dx = f(x_i)\Delta. \quad (36)$$
Therefore, the information entropy of $X^{\Delta}$ is:
$$H(X^{\Delta}) = -\sum_{i} p_i \log p_i = -\sum_{i} \Delta f(x_i)\log f(x_i) - \log \Delta. \quad (37)$$
If the function $f(x)\log f(x)$ is Riemann integrable, the first item in Equation (37) approaches $h(X)$ as $\Delta \to 0$, which means:
$$H(X^{\Delta}) \to h(X) - \log \Delta, \quad \Delta \to 0. \quad (38)$$
Since $\Delta \to 0$ is not achievable in practice, there is information loss in the quantization process. For an $N$-bit quantizer with full-scale range normalized to 1, $\Delta = 2^{-N}$; taking the log to base 2, the information loss $H_{loss}$ can then be defined as:
$$H_{loss} = H(X) - H(X^{\Delta}) = \lim_{\Delta_0 \to 0}\left(-\log \Delta_0\right) - N, \quad (39)$$
where $H(X) = h(X) - \lim_{\Delta_0 \to 0} \log \Delta_0$ is the (infinite) information entropy of the continuous variable $X$. The amount of information obtained from $X$ with the quantization process is:
$$I(X; X^{\Delta}) = H(X^{\Delta}) - H(X^{\Delta} \mid X) = H(X^{\Delta}) = h(X) + N, \quad (40)$$
since $X^{\Delta}$ is determined by $X$ and thus $H(X^{\Delta} \mid X) = 0$.
Therefore, the quantization process can be illustrated as shown in Figure 7. It can be seen from Equations (39) and (40) that the larger $N$ is, the less information is lost and the more information is obtained.
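Before turning to the uniform example below, the limiting behavior of Equation (38) can be checked numerically for a non-uniform density. The following Python sketch (an illustration under assumed parameters, not the authors' code) quantizes Gaussian samples with an $N$-bit uniform quantizer and compares $H(X^{\Delta})$ with $h(X) - \log_2 \Delta$:

```python
import numpy as np

# Quantize a Gaussian sample with an N-bit uniform quantizer over [-4s, 4s]
# and compare H(X_Delta) with h(X) - log2(Delta), per Equation (38).
rng = np.random.default_rng(2)
sigma = 1.0
x = rng.normal(0.0, sigma, 2_000_000)
h_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)   # h(X) in bits

for N in (4, 8, 12):
    lo, hi = -4 * sigma, 4 * sigma
    delta = (hi - lo) / 2**N
    idx = np.clip(((x - lo) / delta).astype(int), 0, 2**N - 1)
    p = np.bincount(idx, minlength=2**N) / x.size
    p = p[p > 0]
    H_q = -np.sum(p * np.log2(p))                        # empirical H(X_Delta)
    print(N, H_q, h_closed - np.log2(delta))             # should nearly match
```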
For example, consider a continuous random variable $X$ with a uniform distribution on $[0, 1]$. It is quantized by an $N$-bit quantizer, and the process is simulated with MATLAB R2018b (The MathWorks, Inc., Natick, MA, USA). $X$ is generated by the unifrnd function with 1,000,000 data points. The first 5000 data points of $X$ are shown in Figure 8a, and the probability density function of $X$ is shown in Figure 8b. It can be found that the simulated data of $X$ are not ideal, and the probability density is significantly less than 1 when the value is close to 0 or 1. Here, five quantizers with different numbers of bits $N$ are used to quantize $X$, and the corresponding information entropies of $X^{\Delta}$ are calculated; the results are shown in Figure 8c. As $\Delta = 2^{-N}$, according to Equation (37), the information entropy of $X^{\Delta}$ is equal to $N$ bits (since $H(X^{\Delta} \mid X) = 0$, the mutual information is also $N$ bits) when the log is to the base 2. It can be seen from Figure 8c that the simulation results are consistent with the theoretical values within the allowable error. This also shows that the more bits the quantizer has, the more information can be obtained, which is consistent with the theoretical analysis.
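The paper performs this simulation in MATLAB; an equivalent Python sketch is given below for reference (the five bit widths shown are assumed for illustration, as the paper's exact values are not listed here):

```python
import numpy as np

# Python analogue of the MATLAB simulation described above (a sketch, not
# the authors' code): quantize U[0,1] samples with an N-bit quantizer and
# compare H(X_Delta) with the theoretical value of N bits.
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 1.0, 1_000_000)     # counterpart of unifrnd

for N in (2, 4, 6, 8, 10):               # assumed example bit widths
    levels = 2**N
    idx = np.minimum((x * levels).astype(int), levels - 1)
    p = np.bincount(idx, minlength=levels) / x.size
    p = p[p > 0]
    H = -np.sum(p * np.log2(p))
    print(f"N={N:2d}: H(X_Delta)={H:.4f} bits (theory: {N} bits)")
```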
4.3. Case 3: Cumulative Averaging Procedure
In some practical measurement applications, the noisy signal is sampled at high speed, and then a cumulative averaging procedure is applied to the measured values to filter out the high-frequency parts of the noise and obtain higher measurement accuracy.
As shown in Figure 9, consider a Gaussian signal $S$ with zero mean and a Gaussian noise $N$ with zero mean, where $S$ and $N$ are independent of each other and $X = S + N$. In a very short period of time $T$, the amplitude of the signal can be considered constant, while the amplitude of the noise is variable. Therefore, the correlation coefficient between the signal amplitudes at any two moments in $T$ is 1, and for the noise, the correlation coefficient is zero. Assume that the number of cumulative averaging times is $n$ and that the powers of the signal and the noise at each sampling moment $t_i$ are $P_S$ and $P_N$ ($i = 1, 2, \ldots, n$); then, after the cumulative averaging procedure, their powers become:
$$P_{\bar{s}} = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} s_i\right)^2\right] = P_S, \quad (41)$$
$$P_{\bar{n}} = E\left[\left(\frac{1}{n}\sum_{i=1}^{n} n_i\right)^2\right] = \frac{P_N}{n}, \quad (42)$$
where $s_i$ and $n_i$ are the amplitudes of the signal and the noise at each sampling moment $t_i$, respectively; $\bar{s}$ and $\bar{n}$ are the average amplitudes of the signal and the noise during $T$, respectively; and $P_{\bar{s}}$ and $P_{\bar{n}}$ are the average powers of the signal and the noise during $T$, respectively.
After the cumulative averaging procedure, the mutual information that can be obtained from the processed data is:
$$I(\bar{X}; S) = \frac{1}{2}\log\left(1 + \frac{P_{\bar{s}}}{P_{\bar{n}}}\right) = \frac{1}{2}\log\left(1 + \frac{n P_S}{P_N}\right), \quad (43)$$
which is greater than the mutual information before the cumulative averaging procedure, that is:
$$\frac{1}{2}\log\left(1 + \frac{n P_S}{P_N}\right) > \frac{1}{2}\log\left(1 + \frac{P_S}{P_N}\right), \quad n > 1. \quad (44)$$
This shows that the cumulative averaging procedure is equivalent to a digital filter: it improves the signal-to-noise ratio and increases the mutual information. It can also be seen from Equation (43) that the mutual information increases as the number of cumulative averaging times $n$ increases.
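A short Python sketch (simulation parameters assumed for illustration, not from the paper) verifies Equations (41)–(43): within each short window the signal amplitude is held constant while the noise varies from sample to sample, so averaging $n$ samples leaves the signal power unchanged and reduces the noise power by a factor of $n$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, windows = 16, 200_000
P_S, P_N = 1.0, 4.0

s = rng.normal(0.0, np.sqrt(P_S), windows)            # one amplitude per window
noise = rng.normal(0.0, np.sqrt(P_N), (windows, n))   # n noise samples per window

P_s_bar = np.var(s)                                   # ~ P_S   (Eq. 41)
P_n_bar = np.var(noise.mean(axis=1))                  # ~ P_N/n (Eq. 42)
I_before = 0.5 * np.log2(1 + P_S / P_N)
I_after = 0.5 * np.log2(1 + P_s_bar / P_n_bar)        # ~ Eq. (43)
print(f"P_n_bar={P_n_bar:.4f} (theory {P_N/n:.4f})")
print(f"I before={I_before:.4f} bits, after={I_after:.4f} bits")
```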
5. Conclusions
In this paper, an information entropy-based modeling method for measurement systems is proposed. The modeling idea, based on the viewpoints of information acquisition and uncertainty, is presented first. On this basis, the entropy balance equation, derived from the chain rule for entropy, is proposed for system modeling. Then, information entropy-based models of measurement units and measurement systems are established with the entropy balance equation. Finally, three cases of typical measurement units or processes are analyzed using the proposed model. Compared with the existing modeling methods for measurement systems, the proposed method considers the modeling problem from the perspective of information and uncertainty, and focuses on the loss of measurand information in the transmission process and on representing the role of each measurement unit, such as filtering, amplification, and introduced noise. From the error entropy, the noise entropy, and the mutual information between the input and output of each unit, the changes of information can be reflected intuitively. If the system input is noise-free, the mutual information between the input and output of the system directly reflects the amount of information acquired from the measurand, and can be used directly as an evaluation index of the performance of the measurement system.
The proposed model has an excellent ability to intuitively describe the processing and changes of information in the measurement system. These characteristics make it easy to obtain a clear overall understanding of the measurement system and of the specific implementation of measurement with measurement units. Note that, although the proposed model has the above advantages, it is not proposed from the perspective of metrological analysis. Compared with the existing models of the measurement system, the output of the proposed model cannot be applied directly to represent measurement results in the traditional sense, and it loses the time information of the measurement result. The proposed model does not conflict with the existing models of measurement systems; rather, it complements them, thus further enriching the existing measurement theory.