Appendix A. Simulation Study on Event Arrivals
In this appendix, we simulate the arrivals of several point processes with different intensities. We pursue two objectives. First, we want to exemplify how the results of the joint analysis of arrival times and interarrival durations vary with the intensity. Second, we want to illustrate how the results of the analysis of the number of event arrivals following the arrival of an event vary with the intensity.
In Chen and Stindl (2018), a simulation algorithm is described for the univariate Hawkes process. The algorithm has two steps: First, the baseline intensity function is used to simulate events of the zeroth generation. Then, the excitation function is used to simulate each event’s offspring. Both simulations are carried out with Algorithm A1, which in this form was derived from code in the R package “IHSEP”. The arguments of the function simNHPP are the current time, the time until which to simulate, the intensity function, and the maximum value taken by the intensity function over the interval under consideration.
We extend that algorithm in two directions. On the one hand, we allow the underlying process to have a history. This can be achieved by simulating future offspring of historic events with Algorithm A1. On the other hand, we allow the underlying process to be multivariate. This is achieved as follows: First, Algorithm A1, fed with the baseline intensity function of each component of the multivariate Hawkes process, is used to simulate events of the zeroth generation. Then, for each event of the zeroth generation, the offspring are simulated. The offspring may be events of the same component as the parent or events of different components. Each of these offspring simulations is again carried out with Algorithm A1.
Algorithm A1 Poisson simulation
function simNHPP(…)
    …
    …
    while … do
        …
        …
        if … then
            append … to …
        end if
    end while
    …
    …
    for … do
        if … then
            append … to …
        end if
    end for
    return …
end function
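The two-loop structure of Algorithm A1 (a while loop that collects candidate times, followed by a for loop that filters them) is consistent with simulation by thinning. The following Python sketch illustrates that technique under this assumption; it is not a transcription of Algorithm A1 or of the IHSEP code. The name `sim_nhpp`, the argument names, and the optional `rng` argument are hypothetical.

```python
import random


def sim_nhpp(t, T, intensity, intensity_max, rng=None):
    """Simulate arrivals in (t, T) of a Poisson process with time-varying intensity.

    Assumption: intensity(s) <= intensity_max for all s in (t, T).
    """
    rng = rng or random.Random()
    if intensity_max <= 0.0:
        return []  # an intensity that is zero everywhere produces no arrivals

    # Step 1: candidate arrivals of a homogeneous Poisson process with rate
    # intensity_max, built from exponential interarrival durations.
    candidates = []
    s = t
    while s < T:
        s += rng.expovariate(intensity_max)
        if s < T:
            candidates.append(s)

    # Step 2: thinning. Keep each candidate with probability
    # intensity(s) / intensity_max.
    arrivals = []
    for s in candidates:
        if rng.random() * intensity_max < intensity(s):
            arrivals.append(s)
    return arrivals
```

For a constant intensity equal to `intensity_max`, every candidate is kept, so the sketch reduces to an ordinary homogeneous Poisson simulation. The closer the intensity function stays to its maximum, the fewer candidates are discarded.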
Algorithm A2 describes in detail how a multivariate Hawkes process with history may be simulated. The arguments of the function simHawkes are the current time, the time until which to simulate, the baseline intensity functions and their maximum values over the interval under consideration, the excitation functions and their maximum values, and the history of the process. The maximum values of the excitation functions are the jump sizes.
We consider a period of 21.25 h which starts at
, hence
. The baseline intensity takes one of the following forms:
If the intensities of the point processes consisted only of these baseline intensities, the expected number of arrivals in
would be 63.75, 60.56, 56.71, and 55.65, respectively. We consider both the presence and the absence of self-excitation. When present, the self-excitation is of the exponential form, i.e.,
with
and
. Thus, we end up with 12 different models. For each model, we simulate the event arrivals
times.
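As a plausibility check on the first of the expected numbers of arrivals reported above: without self-excitation, the expected number of arrivals is the integral of the baseline intensity over the period. The symbols $\mu$, $\mu_0$, $t$ and $T$ below are assumed notation, and the rate of 3 arrivals per hour is inferred from the reported value under the assumption that 63.75 belongs to the constant baseline; it is not stated in the text.

```latex
\mathbb{E}[N] = \int_{t}^{T} \mu(s)\,\mathrm{d}s ,
\qquad
\mu(s) \equiv \mu_0 \;\Longrightarrow\; \mathbb{E}[N] = \mu_0\,(T - t),
\qquad
\mu_0 = \frac{63.75}{21.25\ \mathrm{h}} = 3\ \mathrm{h}^{-1}.
```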
Algorithm A2 Hawkes simulation
function simHawkes(…)
    …
    …
    …
    for … do
        … empty vector for generation-… timestamps of component i
        for … do
            …
            for … do
                … simNHPP(…)
                … extended by those … which are greater than t
            end for
        end for
        … simNHPP(…)
        … extended by … and sorted in ascending order
        if length of … then
            …
        end if
    end for
    while … do
        …
        …
        for … do
            … empty vector for generation-… timestamps of component i
            for … do
                …
                for … do
                    … simNHPP(…)
                    … extended by … and sorted in ascending order
                end for
            end for
            if length of … then
                …
            end if
        end for
    end while
    return …
end function
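The generation-by-generation scheme described for Algorithm A2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `sim_hawkes` and `sim_nhpp`, the argument layout (`kernels[i][j]` is the excitation of component i by an event of component j), the thinning step, and the stopping rule (stop once a generation is empty in every component) are all assumptions.

```python
import math
import random


def sim_nhpp(t, T, intensity, intensity_max, rng):
    """Thinning sketch: arrivals in (t, T) of a Poisson process with the given intensity."""
    if intensity_max <= 0.0:
        return []  # zero intensity (e.g., no excitation): nothing to simulate
    arrivals, s = [], t
    while True:
        s += rng.expovariate(intensity_max)  # next candidate time
        if s >= T:
            return arrivals
        if rng.random() * intensity_max < intensity(s):  # keep w.p. intensity/max
            arrivals.append(s)


def sim_hawkes(t, T, baselines, baseline_max, kernels, kernel_max, history, rng=None):
    """Simulate a multivariate Hawkes process on (t, T), given events up to time t.

    history[j] holds the past timestamps of component j; the result is one
    sorted list of new timestamps per component.
    """
    rng = rng or random.Random()
    d = len(baselines)

    def offspring(i, j, parent):
        # Component-i children of a component-j event at time `parent`.
        k = kernels[i][j]
        return sim_nhpp(parent, T, lambda s: k(s - parent), kernel_max[i][j], rng)

    # Generation 0: baseline events plus future offspring of historic events.
    generation = []
    for i in range(d):
        gen_i = sim_nhpp(t, T, baselines[i], baseline_max[i], rng)
        for j in range(d):
            for parent in history[j]:
                gen_i += [s for s in offspring(i, j, parent) if s > t]
        generation.append(sorted(gen_i))
    events = [list(g) for g in generation]

    # Later generations: offspring of the previous generation, until none arise.
    while any(generation):
        next_generation = []
        for i in range(d):
            gen_i = []
            for j in range(d):
                for parent in generation[j]:
                    gen_i += offspring(i, j, parent)
            next_generation.append(sorted(gen_i))
            events[i] = sorted(events[i] + gen_i)
        generation = next_generation
    return events
```

A univariate call with a hypothetical exponential excitation function could look like `sim_hawkes(0.0, 50.0, [lambda s: 1.0], [1.0], [[lambda u: 0.5 * math.exp(-u)]], [[0.5]], [[]])`. Here the jump size 0.5 is passed as the maximum of the excitation function, in line with the remark that the maximum values of the excitation functions are the jump sizes.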
First, we consider the joint analysis of arrival times and interarrival durations. Figure A1 shows the results. Let us consider the case where the (baseline) intensity is constant, see Figure A1a. The event arrivals are distributed uniformly over time. The bin covering the shortest durations between the arrival of an event and the arrival of the subsequent one has the highest frequency, no matter what time is considered. The larger the interarrival durations a bin covers, the smaller its frequency, again no matter what time is considered. Self-excitation does not invalidate these characteristics, but it does cause the frequencies of the bins that cover the shortest durations to increase and the difference between the frequencies of those bins and the following ones to become larger.
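The pairing of each arrival time with the duration until the next arrival, and the binning of these pairs, can be sketched as follows. The function name, the bin counts, and the choice to drop the last arrival of each simulation run (which has no successor within the period) are assumptions made for illustration; the text does not state how the histograms were actually computed.

```python
def arrival_duration_histogram(runs, t, T, n_time_bins, n_duration_bins):
    """Count (arrival time, duration until next arrival) pairs on a regular grid.

    runs: list of simulation runs, each a sorted list of arrival times in [t, T].
    Returns counts[ix][iy], with ix indexing arrival-time bins and iy indexing
    interarrival-duration bins; both axes span a length of T - t.
    """
    width_x = (T - t) / n_time_bins
    width_y = (T - t) / n_duration_bins
    counts = [[0] * n_duration_bins for _ in range(n_time_bins)]
    for arrivals in runs:
        # Pair every arrival with its successor; the last arrival is dropped.
        for current, following in zip(arrivals, arrivals[1:]):
            ix = min(int((current - t) / width_x), n_time_bins - 1)
            iy = min(int((following - current) / width_y), n_duration_bins - 1)
            counts[ix][iy] += 1
    return counts


# Example with two short hypothetical runs on a period of length 4.
counts = arrival_duration_histogram([[1.0, 2.0, 4.0], [0.5, 3.5]], 0.0, 4.0, 4, 4)
```

Because an arrival at time s can be followed by a duration of at most T - s, only the cells below the anti-diagonal of such a histogram can be populated.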
Figure A1. Two-dimensional histograms of event arrival times (horizontal axis) and durations between event arrival times and the arrival time of the next event (vertical axis) for different baseline intensities and self-excitation structures.
If the (baseline) intensity comprises not only a constant but also a component which grows visibly over the entire period of time under consideration, the interarrival durations which are associated with arrival times close to
take values in the region of 12.5 h, see Figure A1b. While these durations are much larger than the maximum durations observed in the case of the constant (baseline) intensity, they are also well below the maximum possible durations. With increasing time of arrival, the maximum values taken by the interarrival durations decrease. Furthermore, the frequencies of the bins covering small interarrival durations start to increase. As the (baseline) intensity is monotonically increasing, the bins with the highest frequencies are those which are close to
and which cover the smallest interarrival durations. The results in case of self-excitation are similar.
What is the effect if the (baseline) intensity also grows exponentially but the growth only becomes noticeable close to
?
Figure A1c sheds light on this question. In the hours after
, a concentration of arrival time-interarrival duration pairs in the region of 8.75 to 3.75 h below the maximum duration may be observed. These pairs are likely to stem from arrivals that are generated by the constant in the (baseline) intensity and that are followed by arrivals in the phase in which the growth of the (baseline) intensity becomes noticeable. This reasoning is supported by the fact that the arrival time-interarrival duration pairs after
in case of the (baseline) intensity without a constant are too few to explain the concentration, see Figure A1d. Once again, self-excitation causes the frequencies of the bins covering the short interarrival durations to increase. In Figure A1c, these increases are visible over the entire period under consideration because there are enough events over that period that trigger other events. By contrast, the (baseline) intensity without a constant but with strong growth towards
does not produce this feature because arrivals after
do not occur frequently enough.
The plots in Figure A1 build upon numbers of observations in the region of
and more. With Figure A2, where only
simulations are included in the analysis of arrival times and interarrival durations, we address the question of how the same plots change if fewer data are available. In the case of the constant (baseline) intensity, no noteworthy differences are observable. A similar conclusion holds for the (baseline) intensity that grows visibly over the entire period under consideration and for the (baseline) intensity that grows visibly only close to
and that does not comprise a constant. However, the concentration of arrival time-interarrival duration pairs 8.75 to 3.75 h below the maximum duration is no longer clearly observable.
Figure A2. Two-dimensional histograms of event arrival times (horizontal axis) and durations between event arrival times and the arrival time of the next event (vertical axis) for different baseline intensities and self-excitation structures.
Let us move on to the analysis of the event arrivals that follow the arrival of an event. The results for the 12 models, which we also considered in the analysis of arrival times and interarrival durations, are shown in Figure A3. If self-excitation is absent, the mean number of events in intervals of 5 s after an event arrival appears to fluctuate randomly around some level, no matter what the intensity is, see Figure A3a. In the presence of self-excitation, the mean number of events after an event arrival initially decreases clearly with increasing distance of the interval from the event, see Figure A3b–d. The decreases in the means become smaller with the distance of the interval, and at some point the means seem only to fluctuate around some level. The initial decline is steeper if
, i.e., when the branching ratio is smaller. There is no clear indication that the growth in baseline intensity carries through to the mean number of arrivals after an event arrival.
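The statistic discussed here, the mean number of arrivals in consecutive intervals after an event arrival, can be computed along the following lines. This is a sketch under assumptions: the function name and arguments are hypothetical, time is taken to be measured in seconds, and events whose 5 min window extends beyond the end of the observation period are skipped when `end` is given. The text does not say how such events were treated.

```python
from bisect import bisect_right


def mean_counts_after_event(arrivals, width=5.0, horizon=300.0, end=None):
    """Mean number of arrivals in the intervals (s + k*width, s + (k+1)*width]
    after each arrival s, for k = 0, ..., horizon/width - 1.

    arrivals: sorted list of arrival times (same time unit as width and horizon).
    """
    n_bins = int(round(horizon / width))
    totals = [0] * n_bins
    n_used = 0
    for s in arrivals:
        if end is not None and s + horizon > end:
            continue  # assumption: skip events whose window is cut off
        n_used += 1
        for k in range(n_bins):
            lo = bisect_right(arrivals, s + k * width)        # excludes s itself for k = 0
            hi = bisect_right(arrivals, s + (k + 1) * width)
            totals[k] += hi - lo
    if n_used == 0:
        return [0.0] * n_bins
    return [total / n_used for total in totals]


# Hypothetical example: four arrivals, two intervals of 5 s each.
means = mean_counts_after_event([0.0, 1.0, 6.0, 100.0], width=5.0, horizon=10.0)
```

For a Poisson process without self-excitation, each entry estimates the intensity times the interval width, which matches the flat profile described for Figure A3a. With self-excitation, the first entries additionally contain the triggered offspring.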
Figure A3. Mean number of event arrivals in time intervals of 5 s over 5 min after an event arrival.