The environment as a spatial reference is the most straightforward use of the multi-agent environment for mobility applications. A spatial environment provides a reference to the agents on which they can be located, on which they can move, and provides them with metrics to compute a distance between them. This use of the environment is the primary component provided by traffic simulation platforms such as SUMO [
7], and general-purpose simulation platforms such as Repast Simphony [
8]. The environment provides information about the present situation in the system, such as the positions of all the agents, the states of the entities (e.g., connected or non-connected vehicles), and the current constraints for the agents’ actions (e.g., traffic light signals states).
The environment as a spatial reference for the agents reflects the system’s present situation. Agents perceive the state of the environment before acting on it. However, cognitive agents usually have some planning activities and have a representation of the time, additionally to the space dimension. This section presents an approach where the multi-agent environment has two linked dimensions: space and time. This representation can either be used by the agents for planning, to reserve certain parts of the network at specific periods, for instance, or to synchronise agents’ actions and movements and also as a mental model for the agents, i.e., an internal knowledge representation of the dynamics of the networks through time.
2.1. Space–Time Model of the Environment
Let a transport network
, with a set of nodes
and a set of arcs
. Let two matrices
and
of costs, of dimensions
(the arc
has a distance of
and a travel time of
). The representation of the multi-agent environment is made of a duplication of
G,
H times, with
H the maximum allowed time of the considered application:
, with
a set of nodes at time
t and
a set of directed links at time
t, and
. The temporal copies of
G are not necessarily identical (cf.
Figure 2). Indeed, we can have different travel times between two copies of
G to reflect the traffic state. Some nodes may be present in one copy while absent in another, to reflect the expansion of a crisis situation for instance [
9]. Arcs may also be absent to reflect vehicle schedules in public transportation as in the application described in the next section. The costs on the edges usually refer to travel times.
This representation of the multi-agent environment through time can be either a “passive” knowledge representation used by the agent to plan their future actions or to store historical states of the environment. It can also be used more profitably as an “active” entity, which is materialised as an explicit data structure, may have its dynamics, be accessed by all the agents and have a behaviour that can influence the behaviours of the agents.
So-called time-expanded graphs have long been used in the literature of transportation science, especially for transit networks, to compute shortest paths (e.g., [
11,
12]). The difference with the model presented in this paper is that the space–time network is active. It can act and react in the system and influence the behaviours of the agents. We describe this concept of active space–time environment with two mobility applications described in the following sections.
2.2. Space–Time Environment in Mobility Simulation
Transportation systems are becoming progressively complex as they are increasingly composed of intelligent and mobile entities. Travellers equipped with mobile devices and vehicles with connection capabilities allow passengers and vehicles to adapt their behaviour with up-to-date information. However, without control, the massive dissemination of information via billboards, radio announcements and individual guidance can have adverse effects and create new traffic concentrations and disruptions. Indeed, with this generalisation of real-time traveller information, the behaviour of modern transportation networks becomes more difficult to analyse and predict. It is then essential to properly observe these effects to consider the adequate methods to face them. To this end, we developed a multi-agent simulation platform that represents travellers, drivers, and public transportation vehicles and makes them move realistically on a multimodal transportation network. To allow travellers to receive only the disruption information relevant to them, the spatio-temporal network model described above is integrated into the simulation platform. In the following, we briefly introduce the multi-agent simulation platform, called SM4T (Simulator for Multi-agent MultiModal Mobility of Travelers) [
13], before presenting our method of disseminating information to relevant travellers with spatio-temporal networks.
The multi-agent simulation platform that we designed and implemented allows for the individual representation of travellers moving on a transport network. We enrich it with traveller information capabilities, both at the stops and with personal information directly on the travellers’ smartphones. A simulation represents itinerary planners, passengers, public transportation vehicles, and information means in a micro-level and simulate their dynamic movements (cf.
Figure 3).
The multi-agent system of the simulator is composed of the following entities. We define four types of agents: Public transport vehicle agents (representing buses, metros, tramways, etc. in the system), Connected travellers, which represent the passengers that connect to real-time information sources, Non-connected travellers, that represent the travellers that only have a spatial representation of the network, and Local information agents, that provide real-time information locally on the stations of the network. A planning service is defined and is responsible for calculating the best route for the connected travellers only. It bases its calculation on the latest network status, including the ongoing disruptions (cf.
Figure 4).
Non-connected travellers base their calculations on a static view of the network. They compute their shortest path based on this view. They wait for vehicles at scheduled stops and do not change their route until they either get stuck in an ongoing disruption (delay or line disconnection) or receive local information (from a Local information agent) about an ongoing disruption. When they receive the information, they infer the new network by applying the changes to their mental-and static-view of the transportation network and calculate a new shortest path based on this representation.
On the other hand, Connected travellers have their itineraries monitored by the Planning service. The latter uses the spatio-temporal network model defined earlier, representing the public transportation network, the network topology and the vehicle schedules. Recall that an arc connects two nodes and in when there is a vehicle departing from to at t. Otherwise, the arc is absent. The spatiotemporal network is active in this application: the space–time arcs store listeners for traveller agents and inform them when the departure time or travel time changes or when the vehicle cancels its mission. To be aware of only the events that concern them, the Connected traveller agents subscribe to the only arcs of the multi-agent spatiotemporal environment that form their route. When the travel time of an arc or the departure time of the vehicle changes, the information is broadcast to the subscribed connected travellers. The planning process is then launched with the new network state.
Disruptions are modelled exclusively by modifying the space–time arcs (according to the vehicle schedules). Indeed, a vehicle’s delay is injected into the model by dynamically adjusting the space–time arcs representing the corresponding vehicle schedule, setting the destination node time to the delayed arrival times. Breakdowns are also modelled by removing the arcs corresponding to the vehicle’s mission from the space–time network. To model the failure of an entire line, the arcs representing the schedules of the remaining vehicles on the line are all deleted. As soon as a schedule is modified, based on the space–time network, the information is immediately detected by the relevant Connected agents, and only to them. Thus, when a timetable is modified, information about the delay or breakdown is sent only to connected travellers interested in these vehicles’ missions.
We executed the experiments with the data of the city of Toulouse in France. We chose this French city because we have detailed data about its network and a description of the travel patterns of the region [
14]. The data came from Tisséo-SMTC, the public transportation authority of the Greater Toulouse. The public transportation network of Toulouse is composed of 80 lines, 359 itineraries and 3887 edges. Frequencies and edges costs are updated hourly. The multiagent system comprises 18,180 vehicles and from 5000 to 30,000 passengers. We define the number of ticks per simulation to 5000 for a journey from 6 am to 2 am. Every simulated tick corresponds to approximately 14 s. We chose the origins and destinations of passengers coherently with travel patterns of the region (the origins-destinations generation method is in [
10]). We executed the simulations on a PC under Windows 7 with a processor Intel Xeon CPU E5-2630 (12 cores at 2 Ghz) with 50 GB of memory.
Previous works have shown that traveller information has little impact in case of minor disturbances. For this reason, we decided to use severe disorders instead in the form of complete disconnection of edges. In every simulation, we generated five random edge disconnections on the network during the whole simulation (one disconnection every 233 real-time minutes approximately). Every disconnection lasts 250 simulated ticks (slightly less than one hour in real-time). Due to edge disconnections, some passengers can no longer find an itinerary to their destination because edge disconnection impacts network connectivity. In the following results, these passengers were not considered (the ratio of passengers without an itinerary is stable, around 5% in all the simulations). Disturbances are random but concern only a certain number of edges that we consider significant to disconnect: the edges through which pass at least five different itineraries. We chose the five randomly disconnected edges between 21 candidate edges that satisfy this requirement.
We considered six different information level scenarios and executed each one 25 times. The first scenario is “the reference configuration” (to which we compared all the others) where no up-to-date information is provided to the passengers, neither local nor personalised. They only have the static description of the network and timetables. In the second scenario, the system provides only local information. The new travel times are available for the only passengers present at the considered stop. We did not consider any connected passengers in this scenario. The system provides local information at the stops in the remaining scenarios (3, 4, 5 and 6), and personal information is only available for the connected passengers. We considered 20%, 50%, 80%, then 100% of connected passengers, respectively, in these scenarios. In the scenarios with local information (all the simulations except the reference configuration), we placed local information agents in all the network stops. We report the average travel times for the passengers. We considered 30,000 passengers (approximately one-quarter of the actual number of passengers).
We executed this system twice: one time with a broadcast of the information about disturbances to all the connected passengers and once using the spatiotemporal network to circumscribe the sending to the only connected passengers. The impact of using the space–time environment on the number of exchanged messages is reported in
Table 1. It saves more than 60% of the messages in all the scenarios.
The use of the space–time multi-agent environment in this simulation platform makes the information exchange between agents more efficient and saves communication bandwidth. The impact of travellers’ real-time information is reported in [
15].
2.3. Space–Time Environment for Planning in Mobility Applications
Vehicle routing problems (VRPs), modelling real-world applications such as dynamic carpooling or online food delivery, are complex optimisation applications that have attracted an enormous research effort for decades. In the online version of these problems, the optimisation of the response time to connected travellers is vital, along with the optimisation of the classical cost criteria (e.g., the size of the fleet of vehicles, the total travelled distance, the total travel times, the total waiting times, etc.). Multi-agent systems, on the one hand, and greedy insertion heuristics, on the other hand, are among the most promising approaches for this purpose. This section describes a multi-agent system coupled with a novel regret insertion heuristic. The heuristic is based on a space–time representation of the multi-agent environment. It serves as a mental model for the agents and as an explicit and active entity, guiding the planning of the agents.
In a VRP, many nodes must be visited once by several capacitated vehicles. These problems are challenging optimisation problems, and solving them has great practical utility. The time-constrained problem is one of the most studied variants of the VRP (Vehicle Routing Problem with Time Windows, VRPTW henceforth). In this variant, the vehicles must visit the nodes within time windows. Vehicle routing problems are divided into two categories: static problems and dynamic problems. In static problems, all the problem data is available before the optimisation process begins. In dynamic problems, the problem data is incomplete before the execution and is discovered progressively during the optimisation. The incomplete data can be any problem element, such as traffic data or available vehicles. However, the dynamic aspect usually refers to the travellers to be transported, which are unknown before execution (as in carpooling and dial a ride systems). Operational problems are never completely static, and it is reasonable to assume that a static system does not meet current operational configurations. Indeed, in real vehicle routing problems, even when all travellers are known in advance (with a reservation system, for example), there is always an element that makes the problem dynamic. These elements can be no-shows, delays, breakdowns, etc.
Online vehicle routing problems could be considered an extreme case of dynamic vehicle routing problems. Indeed, not only is the problem data not completely known before the optimisation starts, but travellers connect to the system in real-time and expect almost immediate responses to their requests. Therefore, the response time of the system in this type of problem is vital. It is more beneficial to immediately provide the current solution to the traveller, rather than waiting a long time for the optimisation system to improve its current solution slightly. Indeed, the traveller is unlikely to wait long for a response to her request.
To meet the requirement of short response times, we relied on the multi-agent paradigm to solve online vehicle routing problems. Multi-agent modelling of online VRPTW is relevant for the following reasons. On the one hand, since it allows the distribution of computations, it should shorten the response time to travellers’ requests. On the other hand, nowadays, vehicles are more and more connected and have onboard computing capabilities. In this context, the transportation system is de facto distributed and requires appropriate modelling to take advantage of these facilities. The multi-agent system (MAS) we describe in this section comprises vehicle agents, passenger agents, interface agents and planner agents. The multi-agent environment is explicitly modelled as a space–time network and used by the agents to plan their routes. The MAS simulates a distributed version of the so-called “insertion heuristic”. Insertion heuristics is a method of inserting individual travellers into vehicle routes. Each traveller is inserted into the route of the vehicle with the minimum marginal cost (the cost can refer to the detour incurred, for example). This method is the fastest known heuristic since there is no reconsideration of previous insertion decisions.
In heuristics and multi-agent methods in the literature, the hierarchical objective of minimizing the number of vehicles mobilised is considered to take priority over the overall costs (including the distance travelled by all vehicles). Most of the heuristics are based on a two-phase approach: minimizing the number of vehicles followed by minimizing the distance travelled [
16]. The model we propose in this section aims at minimizing the number of vehicles used in priority while maintaining the use of a “pure” insertion heuristic, i.e., without any additional improvement to meet the response time requirements. To this end, our heuristic encourages vehicle agents to cover a maximum spatio-temporal area of the transportation network, avoiding the mobilisation of a new vehicle if a new traveller appears in an uncovered area.
A space–time pair
—with
i a node and
t a time—is said to be “covered” by a vehicle agent
v if
v can be in
i at
t. The “vehicle action zone” is the set of space–time nodes that the vehicle agent covers. In the context of online VRPTW, maximizing the action zones of the vehicle agents gives them a maximum chance to satisfy the demand of a future (unknown) traveller. By modelling the spatio-temporal action zones of the vehicle agents, we propose a new method to compute the price of inserting the traveller into a vehicle’s route. This proposal is a kind of regret insertion heuristic. Regret insertion heuristics, instead of choosing the vehicle with the minimum marginal cost, choose the vehicle and traveller with the greatest “regret”. Regret is a measure of the potential price to be paid if a given traveller were not immediately inserted into the route of a given vehicle. There are several methods for calculating regrets, such as the sum of the differences between all available prices and the minimum price [
17].
2.3.1. Intuition of Spatio-Temporal Action Zones
Consider a vehicle agent
v that has an empty route. Consider also a new traveller
c described by:
n a node,
a time window,
s a service time, and
q a quantity. For
v to fit
c into its schedule,
l must be large enough to allow
v to be in
n without violating its time constraints (if
e is too small,
v will have to wait until
e). More precisely, the current time
t plus the travel time from the depot to
n must be less than or equal to
l. Based on this observation, we define the action zone of a vehicle agent as the set of pairs
of the space–time network that remains valid given its current route (
n can be visited by the vehicle at
t). The conical shadow in
Figure 5 illustrates the action zone of a vehicle agent with an empty route.
When a vehicle agent inserts a traveller into its route, it has to recalculate its action zone. Indeed, some pairs
become unfeasible. In
Figure 6, a new traveller is inserted into the vehicle route. The vehicle agent’s action zone after the traveller’s insertion is represented by the inside contour of the bold lines, which represent the space–time nodes that remain feasible after the insertion of the traveller.
The insertion price sent by a vehicle agent v to a traveller agent c corresponds to the hypothetical decrease in the action zone of v following the insertion of c in its route, i.e., the number of space–time nodes that would no longer be feasible. The idea is that the vehicle chosen for the insertion of a traveller is the one that keeps the maximum chance of being a candidate for the insertion of future travellers. Thus, the maximised criterion by the vehicle agents’ fleet is the sum of their action zones, i.e., the capacity of the MAS to react to the appearance of travellers without mobilizing new vehicles.
2.3.2. Coordination of Action Zones
Until now, the space–time model of the environment is used as a mental model of the agents, but it is not explicitly modelled and shared between the agents, and they do not interact with it. This method results in a better space–time coverage of the transportation network, materialised by a minimal mobilisation of vehicles with the appearance of new travellers. Each vehicle agent tries to maximise its action zone independently of the other agents with the mechanism mentioned above. However, it would be more interesting if the agents covered the network in coordination. Specifically, for a vehicle to lose space–time nodes that it alone covers should be more costly than losing nodes that the other agents cover.
To this end, we modelled the environment explicitly and associated with each node in the space–time network the list of vehicles that cover it. Each vehicle notifies the space–time nodes that it is part of its action zone, and each node continuously updates its list. Similarly, when a vehicle agent loses a node in its action zone, the node is notified, updating its list of vehicles.
When the price of inserting a traveller is calculated, each vehicle agent first determines the space–time nodes it would lose if it were to insert the new traveller. Then, it asks each of these nodes about the “price to be paid” if it were no longer covering it. This price is inversely proportional to the number of vehicles covering that node. Specifically, the price to pay is equal to
with
designating the vehicle agents covering the space–time node
and
the number of these vehicles.
This method, based on an active space–time environment, associates a higher penalty with the decision to stop covering a node less covered by others. Thus, vehicle agents have an incentive to cover the entire network in a coordinated manner.
Marius M. Solomon [
18] created a set of different static problems for the VRPTW. It is now admitted that these problems are challenging and diverse enough to compare the different proposed methods with enough confidence. In Solomon’s benchmarks, six different sets of problems have been defined: C1, C2, R1, R2, RC1 and RC2. The travellers are geographically uniformly distributed in the problems of type R, clustered in the problems of type C, and a mix of uniformly distributed and clustered travellers is used in the problems of type RC. The problems of type 1 have narrow time windows (very few travellers can coexist in the same vehicle’s route) and the problems of type 2 have wide time windows. Finally, a constant service time is associated with each traveller, equal to 10 in the problems of type R and RC, and to 90 in the problems of type C. There are between 8 and 12 files containing 100 travellers in every problem set.
We chose to use Solomon benchmarks while following the modification proposed by [
19] to make the problem dynamic. To this end, let
the simulation time. All the time-related data (time windows, service times and travel times) are multiplied by
, with
the scheduling horizon of the problem. The authors divide the travellers set into two subsets; the first subset defines the travellers known in advance, and the second is the travellers who reveal during execution. We did not make this distinction since we consider no travellers known in advance. An occurrence time is associated with each traveller, defining the moment when the system knows the traveller. Given a traveller
i, the occurrence time that is associated is generated randomly between
, with:
It is known that the behaviour of insertion heuristics is strongly sensitive to the appearance order of the travellers to the system. For this reason, we do not consider only one appearance order. We launch the process that we just described ten times with every problem file, creating ten different versions of every problem file.
We implemented two MAS with almost the same behaviour; the only difference concerns the measure used by vehicle agents to compute the insertion cost of a traveller. For the first implemented MAS, it relies on the Solomon measure (noted Distance). The second relies on the space–time model (noted Space–Time). We chose to run our experiments with the problems of class R and C, of type 1, which are the instances that are very constrained in time (narrow time windows).
For each problem class and type, we considered different travellers numbers to verify the behaviour of our model with respect to to the problem size. To this end, we considered the 25 first customers, the 50 first customers, and finally, all the 100 customers in each problem file.
Table 2 summarises the results. Each cell contains the best-obtained results with each problem class (the sum of all problem files). The results show, with the two classes of problems, that the use of the space–time model mobilises fewer vehicles than the traditional model (
). This result validates the intuition of the model, which consists of maximizing the future insertion possibilities for a vehicle agent.
The results show the superiority of this method, in terms of execution times and response times [
20], and in terms of number of mobilised vehicles [
21], compared to traditional methods.