With the lap-wise discretized race simulation (described in Section 1) as a basis, there are few alternatives to MCS for evaluating the effects of probabilistic influences. One option would be to use what-if scenarios, for example, “What happens if the SC gets deployed on lap 30?”. However, this is only manageable if we refrain from considering combinations of many probabilistic influences, because the complexity increases rapidly. Another idea would be to discretize the possible range of the random variables and then simulate races for all combinations (full factorial design). In contrast to MCS, this approach would lead to better sampling of the parameter space in low-probability regions. MCS, on the other hand, provides more meaningful results because it utilizes probability distributions that represent real behavior. Besides, a full factorial design suffers from the curse of dimensionality, which quickly increases computation time. As in the literature, MCS is therefore preferred. MCS requires that the generated random numbers are independent and identically distributed [3] p. 4. Provided that the computer’s random number generator (RNG) fulfills this requirement, it also holds for most of the commonly used random distributions, since they are sampled based on the RNG. This also applies to the distributions used in this paper: Gauss distribution [11], Beta distribution [12], and log-logistic (Fisk) distribution [12].
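To illustrate how such distributions are sampled on top of a single RNG, the following minimal sketch draws from the three distribution families using only Python's standard library. The distribution parameters and the function names `sample_fisk` and `draw_samples` are illustrative assumptions, not values or code from the paper. The Fisk samples are obtained by inverse transform sampling, which maps uniform RNG output through the inverse of the cumulative distribution function.

```python
import random


def sample_fisk(rng, c, scale=1.0):
    """Draw one log-logistic (Fisk) sample via inverse transform sampling.

    CDF: F(x) = 1 / (1 + (x / scale) ** (-c)), so the inverse is
    x = scale * (u / (1 - u)) ** (1 / c) for uniform u in (0, 1).
    """
    u = rng.random()
    while u == 0.0:  # rng.random() is in [0, 1); skip the degenerate value 0
        u = rng.random()
    return scale * (u / (1.0 - u)) ** (1.0 / c)


def draw_samples(n, seed=0):
    """Draw n samples per distribution from one seeded RNG (hypothetical parameters)."""
    rng = random.Random(seed)
    return {
        "gauss": [rng.gauss(0.0, 1.0) for _ in range(n)],
        "beta": [rng.betavariate(2.0, 5.0) for _ in range(n)],
        "fisk": [sample_fisk(rng, c=4.0, scale=2.0) for _ in range(n)],
    }
```

Because all three samplers consume the same underlying uniform stream, the independence of the RNG output carries over to the derived samples.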
The following sections describe how we have modeled the influences presented in the previous chapter in order to overcome the mentioned limitations. Damaged cars are not considered as we do not have the necessary data available. Besides, it is a rare case that a car that has been involved in an accident is only damaged so slightly that it can continue the race.
3.1. Modeling of Starting Performance
To be able to distinguish between good and bad starters, we need a reference, that is, an average starter. Therefore, we measured the times between race start and crossing the start line (in front of the starting grid) as a function of starting grid position for the 2019 races. This was done using videos from the cockpit perspective, which are available on F1 TV [13]. As Figure 5 reveals, a square root function is a good approximation of the average starter.
The square root function is physically plausible if we assume a constant acceleration for the race start phase. This assumption can be made because Formula 1 cars are grip limited and not power limited in the lower speed range. The function is established as follows:
$$t_{\mathrm{start}}(p_g) = t_{\mathrm{react}} + \sqrt{\frac{2 \cdot 8\,\mathrm{m} \cdot (p_g - p_{\mathrm{off}})}{a_{\mathrm{avg}}}}$$
Here, $p_g$ is the starting grid position. The parameters $p_{\mathrm{off}}$ and $t_{\mathrm{react}}$ stand for the (virtual) position of the start line and the reaction time of a human driver. They shift the origin so that a driver who would start directly on the start line would only have to overcome his reaction time. $p_{\mathrm{off}}$ is set to 0.8 because the start line is located only slightly in front of the pole position. Therefore, the distance to the pole starter is significantly smaller than the usual distance between two grid positions (which is 8 m). As a consequence, $p_{\mathrm{off}}$ cannot be set to 0. A fixed value is used for the reaction time $t_{\mathrm{react}}$. The average acceleration during the race start $a_{\mathrm{avg}}$ is then determined using a least-squares fit to the data of the 2019 season shown in Figure 5, using the distance of 8 m between two starting grid positions.
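The fit described above can be sketched as follows. This is a minimal illustration under stated assumptions: the reference curve is taken to be the constant-acceleration relation t = t_react + sqrt(2 s / a) with s = 8 m * (p - p_off), the reaction time passed in is a placeholder chosen by the caller (the paper's value is not reproduced here), and the least-squares problem is solved in the transformed parameter c = 1 / sqrt(a), for which the model is linear. The function names are hypothetical.

```python
import math

GRID_SPACING_M = 8.0  # distance between two grid positions, as stated in the text


def start_time(p_grid, a_avg, t_react, p_offset=0.8):
    """Time from race start to crossing the start line under constant acceleration."""
    distance = GRID_SPACING_M * (p_grid - p_offset)
    return t_react + math.sqrt(2.0 * distance / a_avg)


def fit_acceleration(grid_positions, times, t_react, p_offset=0.8):
    """Least-squares estimate of the average acceleration.

    With c = 1 / sqrt(a), the model (t - t_react) = c * sqrt(2 * s) is linear
    in c, so c follows from a closed-form regression through the origin.
    """
    xs = [math.sqrt(2.0 * GRID_SPACING_M * (p - p_offset)) for p in grid_positions]
    ys = [t - t_react for t in times]
    c = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return 1.0 / c ** 2
```

Fitting in c instead of a keeps the example dependency-free; a nonlinear fit directly in a would weight the residuals slightly differently.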
With the parameterized reference curve, as depicted in Figure 5, we can calculate the differences to the measured data points for every driver. These deviations are then used to calculate the mean and standard deviation of a driver-specific Gauss distribution, which is used to model the starting performance as stated by Equation (4). Samples from these distributions are added to the first lap time in the race simulation. The parameters of the drivers of the 2019 season can be found in Table A1 in Appendix A.
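As a minimal sketch of this step, the deviations of one driver from the reference curve can be turned into Gauss parameters and then sampled. The deviation values in the usage note are made up, the function names are hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption.

```python
import random
import statistics


def fit_start_distribution(deviations):
    """Return (mean, standard deviation) of a driver's start time deviations in s."""
    # Assumption: the sample standard deviation (n - 1 in the denominator) is used.
    return statistics.mean(deviations), statistics.stdev(deviations)


def sample_first_lap_time(base_first_lap_time, mu, sigma, rng):
    """Add one sampled starting performance offset to the first lap time."""
    return base_first_lap_time + rng.gauss(mu, sigma)
```

For example, `fit_start_distribution([0.1, -0.1, 0.3, -0.3])` describes a driver who is an average starter in the mean but varies by roughly a quarter of a second from start to start.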
3.3. Determination of Accident and Failure Probabilities
As mentioned, we want to differentiate between accidents and (technical) failures. Therefore, we assume that an accident depends on the driver, while a failure depends on the car, that is, the team. If a team changed its name from one season to the next, for example, when Sauber became Alfa Romeo Racing, we treat it under its original name to ensure that the failure probabilities are determined correctly. The accident and failure probabilities are determined by applying Bayesian inference, as suggested by Sulsters [7]. For Bayesian inference, a prior distribution and a likelihood function are required. As with Sulsters [7], the Beta distribution is used as prior distribution, and the Bernoulli distribution as likelihood function (the possible race outcomes are: “finished” or “did not finish”). The prior distribution parameters $\alpha$ and $\beta$ are determined according to [7] p. 11:
$$\alpha = \mu \left( \frac{\mu (1 - \mu)}{\sigma^2} - 1 \right), \qquad \beta = (1 - \mu) \left( \frac{\mu (1 - \mu)}{\sigma^2} - 1 \right)$$
$\mu$ and $\sigma$ stand for the mean and standard deviation of the prior distribution. They are determined using the total accident fraction per driver and the total failure fraction per team. Here, only drivers and teams with at least 30 races in the database are considered. The two prior distributions for accidents and failures then represent our knowledge about the respective probabilities across the entire database.
Thereafter, driver-, team- and season-specific posterior distributions are calculated, taking into account the corresponding accident and failure fractions within the particular season. This procedure combines the overall knowledge with the specific influence factors of driver, team, and season. For the chosen combination of prior distribution and likelihood function, the posterior distributions are also Beta distributions [7] p. 12:
$$\mathrm{Beta}(\alpha + z,\ \beta + N - z)$$
$z$ is the number of accidents or failures in the respective season, and $N$ stands for the number of attended races in that season.
Figure 6 shows the resulting probability density functions of the accident prior distribution and three driver-specific accident posterior distributions.
Finally, the mean values of the posterior distributions are used as accident and failure probabilities for the simulation. The parameters of the drivers and teams of the 2019 season are given in Table A1 and Table A2 in Appendix A.
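The Beta-Bernoulli update can be sketched in a few lines. The moment matching below is the standard relation between the mean and standard deviation of a Beta distribution and its two shape parameters, and the posterior mean follows from conjugacy. The function names and the numbers in the usage note are illustrative assumptions, not values from the paper's database.

```python
def beta_prior_from_moments(mu, sigma):
    """Convert mean and standard deviation into Beta shape parameters (alpha, beta)."""
    nu = mu * (1.0 - mu) / sigma ** 2 - 1.0
    if nu <= 0.0:
        raise ValueError("sigma is too large for a Beta distribution with this mean")
    return mu * nu, (1.0 - mu) * nu


def posterior_mean(alpha, beta, z, n):
    """Mean of Beta(alpha + z, beta + n - z): z events observed in n attended races."""
    return (alpha + z) / (alpha + beta + n)
```

With a hypothetical prior of mean 0.1 and standard deviation 0.05, the prior parameters are 3.5 and 31.5. A driver with 2 accidents in 21 races then obtains an accident probability of 5.5 / 56, or about 0.098, which lies between the prior mean and the raw season fraction of 2 / 21.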
3.4. Determination of Full Course Yellow Phases in Combination with Accidents and Failures
The determination of FCY phases and retirements must be performed before starting the actual race simulation so that the required information is available even if drivers further back reach the specified start of a phase in an earlier lap than the race leader. The alternative would be to determine the retirements and their corresponding FCY phases “live” during the simulated race, as done in some of the literature. A small example shows why this does not work correctly with the lap-wise discretization principle. Looking at the exemplary race times in Table 3, we find that driver 1 is ahead of driver 2 and driver 3 in laps 20–22 because he reaches the end of each lap earlier (driver 3 was in fact even lapped, because the race time of driver 1 at the end of lap 21 is smaller than that of driver 3 at the end of lap 20). Assuming that the simulation decides in lap 22 that a VSC phase should be activated at a race time shortly after driver 1 has started lap 22, driver 2 and driver 3 would already have been affected in lap 21. The problem is that once the simulation decides to activate the VSC phase in lap 22, the previous lap has already been fully simulated due to the lap-wise discretization. As a consequence, the VSC phase could no longer be considered for driver 2 and driver 3 in lap 21.
The solution is to determine all FCY phases and retirements before starting the actual race simulation, as explained in the following. For the definition of FCY phases, a process to fix start race times, durations, and type (VSC or SC) is required. The definition must happen in conjunction with the determination of accidents and failures since they are the causes of FCY phases. For our process, we assume that accidents lead to SC phases. In contrast, if a driver retires due to a failure, he tries to drive to a safe spot. Therefore, we assume that this either causes a VSC phase or no FCY phase at all. We use the following procedure to keep the overall chances of SC, VSC, accidents, and retirements as realistic as possible, although it violates the real cause-effect principle in the case of SC phases:
1. Determine SC phases (quantity, start, duration) and derive accidents
2. Determine failures (quantity, start) and derive VSC phases (duration)
3. Convert race progress to race time
Determine SC phases and derive accidents
The SC phases are fixed first because they have a significant impact on race strategy, and their probability of occurrence should therefore not be a conditional probability. The quantity of SC phases for a race is chosen between zero and three, whereby empirical probabilities according to the real fractions of the seasons 2014–2019 are used for each of the options, see Figure 7. The exact values are given in Table A3 in Appendix A.
Then, the start of every SC phase is defined. For this purpose, the race is divided into six groups (the first lap and five subsequent race sections) with individual probabilities. The laps in each group are then assigned equal shares of the corresponding probability. This classification can be compared with the actual data in Figure 8. The exact values of the group probabilities are given in Table A4 in Appendix A. The first lap has to be considered separately since over 36% of the SC phases start here, which can be explained by the small distances between the drivers shortly after the start, which cause a high probability of accidents.
The duration of an SC phase is chosen to be between two and eight laps, with empirical probabilities derived from data of the seasons 2014–2019. The exact values are given in Table 4. The start of an SC phase is further modified by a sample from a uniform distribution to account for the fact that it does not start precisely at the point where a lap is completed.
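The sampling of the start and duration of one SC phase can be sketched as follows. All probabilities used with this code are placeholders, not the values from Table A4 or Table 4. Two further assumptions are made loudly here: the five race sections after the first lap are taken to be equally long, and the uniform modification of the start is taken to be an offset between zero and one lap. The function names are hypothetical.

```python
import random


def lap_weights(n_laps, group_probs):
    """Spread group probabilities evenly over the laps of each group.

    group_probs[0] belongs to the first lap; the remaining entries belong to
    equally long sections of the remaining laps (assumption).
    """
    weights = [0.0] * n_laps
    weights[0] = group_probs[0]
    n_sections = len(group_probs) - 1
    rest = n_laps - 1
    for g in range(n_sections):
        lo = 1 + g * rest // n_sections
        hi = 1 + (g + 1) * rest // n_sections
        for lap in range(lo, hi):
            weights[lap] = group_probs[g + 1] / (hi - lo)
    return weights


def sample_sc_phase(n_laps, group_probs, duration_probs, rng):
    """Return (start, end) of one SC phase in race progress, measured in laps."""
    weights = lap_weights(n_laps, group_probs)
    lap_idx = rng.choices(range(n_laps), weights=weights)[0]
    start = lap_idx + rng.random()  # assumed uniform offset within the lap
    durations = list(duration_probs)
    duration = rng.choices(durations, weights=[duration_probs[d] for d in durations])[0]
    return start, start + duration
```

The number of SC phases per race would be drawn in the same way with `rng.choices` over the options zero to three.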
Before finally saving the created phase, it is ensured that it does not overlap with an already existing phase. An overlap exists if the new phase starts before the end of the existing phase and ends after the start of the existing phase, where both comparisons are made in terms of race progress and are widened by a minimum distance that should be kept between two phases.
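The overlap test can be sketched as a standard interval check with a safety margin. The tuple layout, the name `d_min`, and the use of non-strict comparisons are assumptions made for this illustration.

```python
def phases_overlap(new_phase, existing_phase, d_min):
    """Check whether two FCY phases come closer than d_min (all in race progress).

    Each phase is a (start, end) tuple. The two conditions must both hold for
    the new phase to be rejected.
    """
    new_start, new_end = new_phase
    ex_start, ex_end = existing_phase
    return new_start <= ex_end + d_min and new_end >= ex_start - d_min


def can_be_saved(new_phase, existing_phases, d_min):
    """A new phase is saved only if it keeps its distance to every existing phase."""
    return not any(phases_overlap(new_phase, ex, d_min) for ex in existing_phases)
```

A rejected phase could simply be resampled, although the text does not specify how such a case is handled.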
As mentioned before, we assume that every SC phase is caused by an accident. Therefore, the simulation chooses one driver who retires at the start of every SC phase. The selection is based on the drivers’ accident probabilities that were determined earlier. Selecting only a single driver for an accident is a simplification, since sometimes two or even more drivers are involved in reality. However, our available data is not detailed enough to model and parameterize these cases. Furthermore, retired drivers are not crucial for race strategy determination.
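A minimal sketch of this selection step draws one driver with weights proportional to the accident probabilities. The driver labels in the usage note are hypothetical, and excluding drivers who have already retired is an assumption that the text does not state explicitly.

```python
import random


def pick_accident_driver(accident_probs, rng, already_retired=()):
    """Choose the driver who retires at the start of an SC phase.

    accident_probs maps driver labels to accident probabilities, which are
    used as relative selection weights.
    """
    candidates = [d for d in accident_probs if d not in already_retired]
    weights = [accident_probs[d] for d in candidates]
    return rng.choices(candidates, weights=weights)[0]
```

For example, `pick_accident_driver({"A": 0.05, "B": 0.10}, random.Random(0))` selects driver "B" twice as often as driver "A" over many runs.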
Determine failures and derive VSC phases
Thereafter, the simulation determines, for those drivers not involved in an accident, whether they suffer a failure. The team-specific failure probability determined earlier is used for this purpose. Subsequently, the simulation checks for every failure that occurs whether it causes a VSC phase, using the conditional probability $P(\mathrm{VSC} \mid \mathrm{failure})$. Assuming that every VSC is caused by a failure, it can be calculated using the number of VSC phases $N_{\mathrm{VSC}}$ and the number of failures $N_{\mathrm{failure}}$ (2015–2019, as the VSC was introduced in 2015) in the database:
$$P(\mathrm{VSC} \mid \mathrm{failure}) = \frac{N_{\mathrm{VSC}}}{N_{\mathrm{failure}}}$$
This is a simplification because there are, for example, some cases where a VSC phase was first activated after an accident and shortly afterward replaced by an SC phase. However, as before, the available data is not detailed enough to analyze these cases. The start of the failure (and possibly of the phase) is sampled from a uniform distribution over the laps of the race, since no outstanding race section could be identified in the data. The duration of a possible VSC phase is chosen in the range between one and four laps, with empirical probabilities, and modified by a sample from a uniform distribution, as with the start of SC phases. The exact probabilities for the duration determination are given in Table 5.
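The failure step can be sketched as two nested Bernoulli trials per driver. The probabilities passed in are placeholders rather than the values from Table A2 or Table 5, the uniform start is assumed to range from zero to the number of laps, and the uniform modification of the duration is assumed to be an offset between zero and one lap. All names are hypothetical.

```python
import random


def vsc_given_failure(n_vsc_phases, n_failures):
    """Conditional probability that a failure causes a VSC phase."""
    return n_vsc_phases / n_failures


def sample_failures(drivers, team_of, p_failure, p_vsc, n_laps, vsc_durations, rng):
    """Return one event dict per driver who suffers a failure.

    drivers should contain only drivers not already retired by an accident.
    vsc_durations maps a duration in laps to its empirical probability.
    """
    events = []
    for driver in drivers:
        if rng.random() >= p_failure[team_of[driver]]:
            continue  # no failure for this driver
        event = {"driver": driver, "start": rng.uniform(0.0, n_laps), "vsc_end": None}
        if rng.random() < p_vsc:
            laps = rng.choices(list(vsc_durations), weights=list(vsc_durations.values()))[0]
            event["vsc_end"] = event["start"] + laps + rng.random()  # assumed offset
        events.append(event)
    return events
```

A complete implementation would additionally apply the overlap check against existing FCY phases before saving a VSC phase.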
Convert race progress to race time
Due to the lap-based nature of the information in the database, the definition of FCY phases and retirements is also based on laps (i.e., race progress). However, as mentioned in Section 2.5, race times are required instead of race progress so that every driver can be affected at the same point in time. This is achieved by converting the race progress information into race times using a pre-simulation of the actual race with a single driver. It gives a reasonable estimate of the race time at which a particular stage of progress is reached during the race. Thus, the progress information of the FCY phases can be converted to race times. Deviations between the race times of the pre-simulation and the real race simulation cause no problems, as they change the start and duration of the phases equally for every driver.
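The conversion can be sketched with the lap times of the single-driver pre-simulation. Linear interpolation within a lap is an assumption made for this illustration, and the function name and lap times in the usage note are hypothetical.

```python
def progress_to_race_time(progress, pre_sim_lap_times):
    """Convert race progress (in laps) into race time (in s).

    pre_sim_lap_times holds the lap times of the pre-simulation, lap 1 first.
    """
    full_laps = int(progress)
    if full_laps >= len(pre_sim_lap_times):
        return sum(pre_sim_lap_times)  # clamp to the end of the race
    time_full_laps = sum(pre_sim_lap_times[:full_laps])
    # Assumption: progress within a lap grows linearly with time.
    return time_full_laps + (progress - full_laps) * pre_sim_lap_times[full_laps]
```

With pre-simulated lap times of 100 s, 90 s, and 90 s, a phase starting at a race progress of 1.5 laps is converted to a race time of 145 s.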
3.5. Modeling of Accidents, Failures, and Full Course Yellow Phases
The modeling of accidents and failures is implemented by simply taking the concerned driver out of the race as soon as his race time exceeds the defined time of retirement.
Modeling of the virtual safety car
The VSC is modeled by increasing the lap times of the drivers to the slower VSC lap time $t_{\mathrm{lap,VSC}}$, cp. Section 2.5. However, since FCY phases can start and end at any point during a lap, we have to calculate the lap fractions that are driven normally, $f_{\mathrm{normal}}$, and affected by the VSC phase, $f_{\mathrm{VSC}}$, to obtain the correct lap time. If, for example, the phase starts within the current lap and ends in a later lap, the resulting lap time $t_{\mathrm{lap,res}}$ can be calculated by
$$f_{\mathrm{normal}} = \frac{t_{\mathrm{VSC,start}} - t_{\mathrm{race,prev}}}{t_{\mathrm{lap}}}, \qquad f_{\mathrm{VSC}} = 1 - f_{\mathrm{normal}}, \qquad t_{\mathrm{lap,res}} = f_{\mathrm{normal}} \cdot t_{\mathrm{lap}} + f_{\mathrm{VSC}} \cdot t_{\mathrm{lap,VSC}}$$
where $t_{\mathrm{VSC,start}}$ is the start race time of the VSC phase, $t_{\mathrm{race,prev}}$ the race time of the driver at the end of the previous lap, and $t_{\mathrm{lap}}$ the unaffected lap time of the driver in the current lap. Similar calculations are performed when the phase ends. Overtaking is forbidden in the simulation if a VSC affects at least 50% of a lap. Due to the limited speed, tire degradation and fuel consumption are reduced to 50% during the phase. This is an estimate since the exact values cannot be derived from the publicly available data. In reality, the saved fuel is, of course, consumed after the phase, for example, by increasing the engine power. In the simulation, the average consumption per lap after an FCY phase is therefore automatically adjusted so that the saved fuel is used up by the race finish.
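The lap fraction calculation for a VSC phase that starts within the current lap can be sketched as follows. The argument names and the numbers in the usage note are illustrative assumptions, including the VSC lap time, which is simply passed in by the caller.

```python
def lap_time_with_vsc_start(t_vsc_start, t_race_prev, t_lap, t_lap_vsc):
    """Lap time for a lap in which a VSC phase starts and does not end.

    t_vsc_start: start race time of the VSC phase
    t_race_prev: race time of the driver at the end of the previous lap
    t_lap:       unaffected lap time of the driver in the current lap
    t_lap_vsc:   lap time if the whole lap were driven under the VSC
    """
    frac_normal = (t_vsc_start - t_race_prev) / t_lap
    frac_normal = min(max(frac_normal, 0.0), 1.0)  # keep the fraction within one lap
    frac_vsc = 1.0 - frac_normal
    return frac_normal * t_lap + frac_vsc * t_lap_vsc
```

For a driver who starts the lap at 1000 s with an unaffected lap time of 90 s, a VSC phase starting at 1030 s leaves one third of the lap unaffected. With a hypothetical VSC lap time of 126 s, the resulting lap time is 30 s + 84 s = 114 s.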
Modeling of the safety car
For the realistic modeling of SCs, we use driver-individual “safety car ghosts” (SCGs). The concept is illustrated in Figure 9. An SCG can be imagined as a virtual car that is only visible to its corresponding driver and does not affect any other driver. Since it is a safety car, it cannot be overtaken. The driver-individual handling is necessary, since the drivers may be affected by the same SC in different laps due to lap-wise discretization. An SC deployment is modeled in two stages, a run-up stage and a following stage. If a driver reaches the start race time of an SC phase, we assume that his SCG starts driving on the finish line exactly at this time. As with the VSC, the lap time of the respective driver is then increased to the same slower lap time for the remaining part of the lap and the following laps to simulate the run-up stage under full course yellow conditions. Every driver catches up with his SCG within several laps, since it drives at 160% of the base lap time, cp. Section 2.5. The first SCG lap time is even slower to model the real behavior where the SC waits for the leading driver at the pit exit. If a driver’s calculated race time at the end of a lap is below that of his SCG, he would have overtaken it. Thus, his lap time is artificially increased so that he stays behind. A minimum temporal spacing between the drivers is ensured by adding an offset that depends on the driver’s rank position p to the individual SCG race times. Tire degradation and fuel consumption are reduced to 25% (again an estimate) while driving behind an SC. The value is smaller than that of the VSC phase because the speed is even lower. The SCGs remain active until the end of the lap in which the SC phase ends, even if the originally determined end time is reached before the end of the lap. This models the fact that the SC can only leave the race track by entering the pit lane at the end of a lap. This procedure allows a realistic simulation of the restart of the race with small gaps between the drivers. After each SC phase, the drag reduction system (DRS) is deactivated for two laps, as in reality. The DRS allows drivers to reduce aerodynamic drag on straights when following another driver at close range. It was introduced in the 2011 season to ease overtaking.
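The following-stage rule can be sketched as a simple lower bound on a driver's race time at the end of a lap. The exact form of the position-dependent offset is not recoverable from the text, so the sketch assumes an offset of (p - 1) times the minimum spacing, which leaves the leader directly behind his ghost. The function name and numbers are hypothetical.

```python
def race_time_behind_scg(t_race_driver, t_race_scg, rank, t_gap_min):
    """Race time of a driver at the end of a lap while his SCG is active.

    t_race_driver: calculated race time of the driver at the end of the lap
    t_race_scg:    race time of the driver's SCG at the end of the same lap
    rank:          rank position p of the driver (1 = leader)
    t_gap_min:     minimum temporal spacing between two drivers
    """
    # Assumption: the offset grows linearly with the rank position.
    t_earliest = t_race_scg + (rank - 1) * t_gap_min
    # The driver may not finish the lap before his (shifted) ghost does.
    return max(t_race_driver, t_earliest)
```

The artificially increased lap time is then the returned race time minus the driver's race time at the end of the previous lap.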
Adjustment of pit time losses under FCY condition
As mentioned, it is crucial to consider that the relative pit time loss decreases if a driver drives through the pit lane during an FCY phase. Therefore, smaller pit time losses are added if entering or leaving the pit is fully covered by an FCY phase. During an SC phase, the time losses are often even smaller than during a VSC phase. Since the data are not publicly available, we measured them using videos from the cockpit perspective in 2018 and 2019, which are available on F1 TV [13]. As Table 6 indicates, the differences between normal conditions and FCY conditions vary largely depending on the track layout. Substitute values (for the pit time losses under VSC and SC conditions) from similar tracks can be used for race tracks on which no VSC or SC phases were declared in 2018 and 2019.