2.2. The Event Energy Model Based on Markov Random Field
The core concept of the Markov Random Field is the Markov property: the future state of a random process depends only on its current state, not on the states that preceded it. More precisely, when the states of a sequence of random variables are arranged in chronological order, the conditional probability distribution of a future state, given the current state and all past states, depends only on the current state. In other words, given the current state, the future state is conditionally independent of the past states, and a random process with this behavior is said to have the Markov property. The term “random field” refers to the configuration obtained when each position in some space is randomly assigned a value according to a certain distribution; the entire set of assigned values is called a random field. Combining these two concepts yields the Markov Random Field.
The model based on the concept of the Markov Random Field [42] is composed of nodes and connecting lines, and it describes the undirected interactions between variables. As shown in Figure 5, nodes represent variables, and the connecting lines represent the probabilistic interactions between neighboring variables. Based on this prior knowledge, we use the Markov Random Field model together with the inherent correlation among the events in the event stream of the event camera: the nodes in the model represent the events in the event stream, and the connecting lines represent the correlation between the events.
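The mapping from events to nodes and from correlations to connecting lines can be sketched as follows. This is a minimal illustration, not the construction used in this work: the `(x, y, t, p)` event tuple, the function name `build_event_graph`, and the `radius` and `window_us` neighborhood parameters are all assumptions introduced here.

```python
from typing import List, Set, Tuple

# Hypothetical event layout: (x, y, t, p) = column, row, timestamp in microseconds, polarity.
Event = Tuple[int, int, int, int]

def build_event_graph(events: List[Event], radius: int = 1,
                      window_us: int = 5000) -> List[Set[int]]:
    """Return adjacency sets; events i and j are linked when they fall in the
    same (assumed) spatial-temporal neighborhood."""
    adjacency: List[Set[int]] = [set() for _ in events]
    for i, (xi, yi, ti, _) in enumerate(events):
        for j in range(i + 1, len(events)):
            xj, yj, tj, _ = events[j]
            close_in_space = abs(xi - xj) <= radius and abs(yi - yj) <= radius
            close_in_time = abs(ti - tj) <= window_us
            if close_in_space and close_in_time:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

# Three clustered events and one isolated event.
events = [(10, 10, 1000, 1), (11, 10, 1500, 1), (10, 11, 2000, 1), (90, 40, 1200, 0)]
graph = build_event_graph(events)
```

In this toy input, the three clustered events become mutually connected nodes, while the isolated event has no connecting lines, which is the kind of structure the energy functions later in this section are meant to score.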
In the previous section, we initially divided the event stream into two parts based on the correlation of each event with the overall event stream: the main event stream U, primarily composed of real events, and the non-main event stream N, primarily composed of noise events. The main event stream U is generated by object motion, while the non-main event stream N is generated by inherent hardware noise and external interference. Together, these two event streams form the original event stream L,
$$L = U \cup N.$$
However, to further filter out the noise events in the main event stream and the real events within the non-main event stream, it is necessary to combine the characteristics of the two event streams and design a distinction method that can dynamically adjust according to the situation and better reflect the correlation between events.
In the vast majority of cases, the event stream is dominated by real events, implying that the events in event stream U are strongly correlated with those in event stream L, whereas the correlation between the events in event stream N and event stream L is weaker. On this basis, we use the properties of the Markov Random Field model to represent, as joint probability distributions, the relationships between the events in event stream U and event stream L and between the events in event stream N and event stream L, and then proceed to the next step of processing. According to the Hammersley–Clifford theorem, the joint probability distribution of a Markov Random Field model can be written as a product of non-negative functions defined on the maximal cliques of its random variables. This operation is also known as the factorization of the Markov Random Field model,
$$P(X) = \frac{1}{Z} \prod_{C} \psi_C(X_C),$$
where $X$ denotes the random variables of the model, $C$ is a maximal clique, $X_C$ is the subset of variables in $C$, $\psi_C$ is the potential function of $C$, and $Z$ is the partition function. In the Markov Random Field model, any subset of nodes in which every pair of nodes is connected by an edge is called a clique. If C is a clique in the model and no additional node can be added to it to form a larger clique, then C is referred to as a maximal clique.
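To make the definition of a maximal clique concrete, the sketch below enumerates the maximal cliques of a small undirected graph with the basic Bron–Kerbosch recursion. The graph and the node names are illustrative assumptions, not the cliques of the event model.

```python
from typing import Dict, List, Set

def maximal_cliques(adjacency: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques: List[Set[str]] = []

    def expand(current: Set[str], candidates: Set[str], excluded: Set[str]) -> None:
        # No node can extend the clique and none was skipped: it is maximal.
        if not candidates and not excluded:
            cliques.append(set(current))
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & adjacency[node],
                   excluded & adjacency[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques

# Toy graph: a triangle a-b-c plus a pendant edge c-d.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
found = sorted(sorted(clique) for clique in maximal_cliques(toy_graph))
```

Here `{a, b}` is a clique but not a maximal one, because `c` can still be added; the two maximal cliques are `{a, b, c}` and `{c, d}`.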
The potential function $\psi_C(X_C)$ is defined per maximal clique, i.e., a graph with several maximal cliques has the same number of potential functions. The potential functions quantify the joint probability of the random variables in a maximal clique, but they are not normalized. Since the potential function is restricted to be strictly greater than zero to ensure $P(X) \geq 0$, it is more convenient to represent the potential function in exponential form,
$$\psi_C(X_C) = \exp\left(-E_C(X_C)\right).$$
$Z$ is the partition function, also known as the normalization factor. It is the sum, over all configurations of the random variables, of the product of the potential functions of all maximal cliques in the Markov Random Field model, and it normalizes the product of $\psi_C(X_C)$ into a probability,
$$Z = \sum_{X} \prod_{C} \psi_C(X_C).$$
$E_C(X_C)$ represents the energy function within the corresponding clique. Substituting the exponential form into the factorization expresses the potential functions in the joint probability distribution through the energy functions. The joint probability distribution then takes the form of a Gibbs distribution,
$$P(X) = \frac{1}{Z} \exp\left(-\sum_{C} E_C(X_C)\right).$$
For the Markov Random Field model of event stream U and event stream L, the sum in the exponent is the total energy function of the model,
$$E(U, L) = \sum_{C} E_C(X_C).$$
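The link between potential functions, the partition function, and the Gibbs distribution can be checked numerically on a very small model. The sketch below assumes a chain of three binary variables with two maximal cliques and an illustrative disagreement energy; none of these choices come from the event model itself.

```python
import itertools
import math

# Two maximal cliques on a chain of three binary variables: x0 - x1 - x2.
CLIQUES = [(0, 1), (1, 2)]

def clique_energy(a: int, b: int) -> float:
    """Illustrative energy: 0 when the two variables agree, 1 when they differ."""
    return float((a - b) ** 2)

def potential(a: int, b: int) -> float:
    """Exponential-form potential, strictly positive by construction."""
    return math.exp(-clique_energy(a, b))

def unnormalized(state) -> float:
    """Product of the clique potentials for one configuration."""
    return math.prod(potential(state[i], state[j]) for i, j in CLIQUES)

def total_energy(state) -> float:
    """Sum of the clique energies for one configuration."""
    return sum(clique_energy(state[i], state[j]) for i, j in CLIQUES)

# Partition function: sum of the unnormalized products over all configurations.
states = list(itertools.product([0, 1], repeat=3))
Z = sum(unnormalized(s) for s in states)

# Gibbs form: exp(-total energy) / Z, identical to the normalized product.
gibbs = {s: math.exp(-total_energy(s)) / Z for s in states}
```

Configurations in which neighboring variables agree have lower total energy and therefore higher probability, which is the property the event energy model relies on.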
In the Markov Random Field model that we use, represents a single event in U, represents a single event in N, and represents a single event in L. In the event stream, pixels only exist in three states: an event with increased brightness occurs ( = 1); an event with decreased brightness occurs ( = 0); and no event occurs. Based on this characteristic, we can define , , and , where , respectively represent the horizontal position information of the event, vertical position information and the timestamp information.
Moreover, when is a real event, it has a strong correlation with . In addition, has a strong correlation with other events and in the spatial and temporal neighborhoods. However, when is a noise event in U, the situation is completely opposite; at this time, does not have a strong association with , , and . , and represent the differences in horizontal position, vertical position, and timestamp between the original event and other events in the neighborhood of this event.
According to the above knowledge, the Markov Random Field model of event stream U and event stream L includes the following four maximal cliques:
,
,
, and
. For maximal clique
, in order to describe the relationship between events in the clique, we defined the corresponding energy function,
In this energy function, when and are in the same state, the energy produced by the energy function is lower, and the corresponding correlation is higher. When and are in opposite states, the energy produced by the energy function is higher, and the corresponding correlation is lower.
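The qualitative behavior described here (lower energy for matching states, higher energy for opposite states) is the behavior of an Ising-style pairwise term. The sketch below uses such a term as a stand-in; the function name `agreement_energy` and the `weight` parameter are assumptions, and this is not claimed to be the energy function defined in this work.

```python
def agreement_energy(p_a: int, p_b: int, weight: float = 1.0) -> float:
    """Ising-style stand-in: low energy when two binary states match,
    high energy when they are opposite."""
    # Map polarity {0, 1} to spin {-1, +1} so the product encodes agreement.
    s_a = 2 * p_a - 1
    s_b = 2 * p_b - 1
    return -weight * s_a * s_b

same_state = agreement_energy(1, 1)      # both events report increased brightness
opposite_state = agreement_energy(1, 0)  # the two events disagree
```

With the default weight, matching states give an energy of -1.0 and opposite states give +1.0, so a lower energy corresponds to a higher correlation.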
Similarly, we also defined corresponding energy functions for the other three maximal cliques,
α and β are non-negative dynamic weight parameters within the corresponding maximal cliques. α reflects the correlation of events in space,
β reflects the correlation of events in time,
and represent the pixel differences in the horizontal and vertical directions of the two events in the energy function. and represent the maximum differences in both the horizontal and vertical directions among all events within all maximum cliques, respectively. represents the minimum time interval between the event and other events within all maximum cliques, while represents the maximum time interval.
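One way to turn these quantities into non-negative weights is to normalize each difference by its maximum, so that nearby events receive a weight close to 1 and distant events a weight close to 0. The formulas below are purely an assumed normalization for illustration; they are not the definitions of α and β used in this work, and the function and parameter names are hypothetical.

```python
import math

def spatial_weight(dx: float, dy: float, dx_max: float, dy_max: float) -> float:
    """Assumed form of a spatial weight: 1 for coincident pixels, 0 at the maximum offset."""
    largest = math.hypot(dx_max, dy_max)
    if largest == 0.0:
        return 1.0  # every event sits on the same pixel
    return max(0.0, 1.0 - math.hypot(dx, dy) / largest)

def temporal_weight(dt: float, dt_min: float, dt_max: float) -> float:
    """Assumed form of a temporal weight: 1 at the minimum interval, 0 at the maximum."""
    if dt_max == dt_min:
        return 1.0  # all intervals are identical
    return max(0.0, 1.0 - (dt - dt_min) / (dt_max - dt_min))

alpha_near = spatial_weight(0, 0, dx_max=3, dy_max=3)
alpha_far = spatial_weight(3, 3, dx_max=3, dy_max=3)
beta_mid = temporal_weight(600, dt_min=100, dt_max=1100)
```

Because the maxima and minima are taken over the events currently in the cliques, weights of this kind change with the data, which is one sense in which such parameters can be called dynamic.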
In summary, we can obtain the complete energy function
corresponding to the main event stream U and the original event stream L,
, , , and correspond to the four maximal cliques , , and , respectively. The maximal clique a represents the relationship between events in event stream U and events in event stream L, and the other three maximal cliques represent the relationship between events in the spatial-temporal neighborhood in event stream U.
Similarly, we can obtain the complete energy function
corresponding to the non-main event stream N and the original event stream L,
Combining the relationships between the aforementioned energy functions, we derive the event energy model based on the Markov Random Field. Figure 6 shows the relationship between the maximal cliques. However, the situation in the noise-dominated event stream N differs from that in event stream U. In the previous section, when we separated the main event stream U from the non-main event stream N based on overall correlation, both random noise events with low intercorrelation, such as background activity noise, and unconventional noise events with high intercorrelation, such as diffusive reflection noise, were assigned to the non-main event stream N. For diffusive noise in the non-main event stream, the events within a noise region are strongly correlated with one another, but they are not strongly correlated with the events in the original event stream L. In contrast, the random noise events in event stream N are weakly correlated both with other events in N and with events in L. The real events in event stream N, however, are strongly correlated with events in both N and L. Using these characteristics as the basis for judgment, we can distinguish among the conventional noise events, unconventional noise events, and real events in event stream N.
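The three cases described for event stream N amount to a decision over two correlation measures. The sketch below encodes that logic with fixed placeholder thresholds; the function name, the correlation scores, and the threshold values are all assumptions made for illustration and do not represent the dynamic thresholds of the EEIM algorithm.

```python
def classify_non_main_event(corr_with_n: float, corr_with_l: float,
                            threshold_n: float = 0.5,
                            threshold_l: float = 0.5) -> str:
    """Label an event of stream N from its correlation with N and with L.
    Thresholds are fixed placeholders, not dynamic thresholds."""
    strong_n = corr_with_n >= threshold_n
    strong_l = corr_with_l >= threshold_l
    if strong_n and strong_l:
        return "real"                  # strong correlation with both N and L
    if strong_n and not strong_l:
        return "unconventional_noise"  # e.g. diffusive reflection noise
    if not strong_n and not strong_l:
        return "conventional_noise"    # e.g. background activity noise
    return "undetermined"              # weak with N, strong with L: not covered above

labels = [
    classify_non_main_event(0.9, 0.8),
    classify_non_main_event(0.9, 0.1),
    classify_non_main_event(0.1, 0.1),
]
```

The fourth combination (weak correlation with N but strong correlation with L) is not discussed in the text, so the sketch leaves it unlabeled rather than guessing.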
After representing the relationships between events using energy functions in this section, we can more effectively measure the correlation between events within the same or different event streams. Moreover, based on the different event characteristics corresponding to the energy functions of the two event streams obtained in this section, we can use the EEIM algorithm in the next section, which includes dynamic thresholds belonging to different event streams, to further identify the real events in the non-main event stream and the noise events in the main event stream. In this way, we can effectively preserve the real events while removing both conventional and unconventional noise events.