Figure 1.
Summary of the different phases of the Pavlov conditioning experiment with a dog. (A) Initially, the dog is untrained: when presented with the conditional stimulus (CS, the bell ring), it shows no reaction; when presented with the unconditional stimulus (US, food), it starts to salivate in anticipation (UR, the unconditional response). (B) In the training phase of the experiment, the dog is presented with food while it hears the bell ring (US + CS training). (C) In the testing phase, no food is presented to the dog, but the bell ring alone triggers its salivation (CR, the conditioned response). Hence, through training, the dog learns to associate the two stimuli and eventually reacts positively to the bell ring even in the absence of food.
Figure 2.
The aFish robot from the subCULTron project serves as a model for the simulated robots. It measures about 50 cm in length and 20 cm in height. Its main sensors and actuators include thrusters (forward and backward motion, lateral rotation) and a buoyancy system for navigation. An acoustic transceiver provides long-range communication (<500 m), while modulated light transceivers distributed around the body are used for short-range communication and the perception of other robots (<0.5 m). A downward-facing camera allows the detection of target objects. In the simulations, the light transceivers are implemented with a perception cone to account for range and aperture; visual occlusions are not considered.
Figure 3.
The experimental setup in our simulations is a circular pool (12.5 m in diameter) with a beacon at its center that periodically emits acoustic messages and acts as an aggregation device to keep the robots together (1.75 m range). Conditional and unconditional stimuli can be presented to the robots at any time by triggering long-range acoustic signals.
Figure 4.
Timeline of an experiment, with four main situations represented. (A) In the initial condition, robots are randomly scattered in the pool. (B) In the training phase, the robots can be exposed to two different conditions: either the unconditional stimulus alone, or the unconditional and conditional stimuli together. During this time, robots perform a random walk and aggregate under the beacon, in a mixed or segregated configuration depending on the stimuli perceived. (C) In the testing phase, once the robots are aggregated, the conditional stimulus is triggered alone to test whether the robots have learned to associate it with the unconditional stimulus. Robots observe their immediate neighbors to form an opinion about their local configuration. (D) To reach a collective response, robots exchange their opinions in a peer-to-peer manner and converge to a single opinion.
Figure 5.
The collective memory is encoded in the spatial structure, that is, the configuration adopted by the group of robots. To this end, the robots are divided into two teams that differ only by their color, red or yellow. On the left, the robots are aggregated in a mixed state, in which each robot has, on average, the same number of red and yellow neighbors. On the right, the robots are in a segregated state, in which each robot has, on average, a majority of neighbors of its own color.
Figure 6.
Average perception of a robot observing its neighbors in a mixed aggregate. The number of signals perceived from robots of each team (one count for the self team and one for the opposite team) is highly symmetrical and shows that the most frequent observations involve few neighbors' signals. The range and aperture of the sensors (field of view) prevent the detection of all present neighbors at once. Signals are accumulated in time windows of 10 s; a total of 60,000 observations are represented.
Figure 7.
Average perception of a robot observing its neighbors in a segregated aggregate, which is used to encode learned information in the collective memory. The number of signals perceived by a robot from the opposite team is significantly lower than from its own team. Signals are accumulated in time windows of 10 s; a total of 60,000 observations are represented.
Figure 8.
Decision tree describing the reactive behavior implemented by the robots. Circles denote the signals that can be perceived by the robot: B, the beacon signal of the aggregating device; US, the unconditional stimulus; and CS, the conditional stimulus (B, US, and CS are implemented in simulation as acoustic messages that can be perceived by all robots at once). Rounded rectangles denote the subroutines that can be executed: mainly the random walk, the quorum, and the decision to stop any motion with a probability that depends either on any nearby robots (self or opposite team) or on nearby robots of the same team only. This behavior requires no memory at the individual level, as it is based only on the execution of different subroutines controlled by the robots' current perception.
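The memoryless controller in this decision tree can be sketched as follows. This is only an illustrative reconstruction from the caption: the signal names B, US, and CS follow the text, but the stop probabilities, their scaling with neighbor counts, and the mapping of stimuli to stop rules are hypothetical placeholders, not the paper's actual controller.

```python
import random

def reactive_step(perceived, n_any_neighbors, n_same_team_neighbors):
    """One control step of a memoryless reactive robot.

    perceived: set of acoustic signals heard this step, e.g. {"B", "US", "CS"}.
    Returns the subroutine the robot executes.
    """
    if "B" not in perceived:
        return "random_walk"  # no beacon heard: keep exploring
    if "CS" in perceived and "US" not in perceived:
        return "quorum"  # CS alone marks the testing phase
    if "US" in perceived and "CS" in perceived:
        # Assumption: US + CS training makes stopping depend on same-team
        # neighbors only, which promotes a segregated aggregate.
        p_stop = min(1.0, 0.1 * n_same_team_neighbors)
    else:
        # Assumption: otherwise stopping depends on any nearby robot,
        # which promotes a mixed aggregate.
        p_stop = min(1.0, 0.1 * n_any_neighbors)
    return "stop" if random.random() < p_stop else "random_walk"
```

Because every decision is a function of the current perception alone, no state persists between calls, matching the caption's claim that the behavior needs no individual memory.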
Figure 9.
(A) Bifurcation diagram of the steady states of model (3) as a function of one control parameter, with a second parameter held fixed; (B) state diagram of the types of existing solutions as a function of these two parameters. The remaining parameter values are held fixed.
Figure 10.
Snapshots of the simulated experiments, illustrating the different phases of the experiments and the resulting behavior of the robots. In the initial condition, the 30 robots start randomly scattered in the pool. Two different conditions are tested: in the upper part of the figure, only the unconditional stimulus (US) is presented to the robots; in the lower part, the unconditional stimulus (US) and the conditional stimulus (CS) are presented together, as acoustic messages perceived at once by all robots. During the training phase, the robots aggregate within range of the beacon, adopting different spatial configurations depending on the perceived stimuli. During the testing phase, each robot forms an initial opinion that is advertised using the white and black colors of its status LEDs. The robots then carry out a quorum, after which the whole group has converged to a collective decision. In these snapshots, when the robots are exposed to both the US and CS stimuli during the training phase, they afterwards respond positively to the CS stimulus alone, indicating that they have learned the association. When only the US stimulus is presented during training, the robots produce a negative response to the CS stimulus in the testing phase, indicating that they did not learn the association.
Figure 11.
Impact of the recall threshold on the retrieval of information from the collective memory. The quorum responses of the robots in two different configurations, mixed or segregated aggregates, are tested for different values of the threshold. For each configuration and each tested value, 1000 simulations are performed. When the threshold is low, robots are more likely to consider their configuration segregated based on their local observations; lower values therefore increase the risk of false positives when the robots are in a mixed configuration. Conversely, when the threshold is high, robots are more likely to detect a mixed configuration; higher values increase the risk of false negatives, in which the group fails to detect a segregated configuration. The 95% confidence intervals are displayed around the proportion of errors in the experiments.
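The false-positive/false-negative trade-off described here can be illustrated with a small sketch. Everything below is an assumption for illustration only: a robot counts the signals from its own team (`n_self`) and the opposite team (`n_opp`) over a 10 s window and calls the aggregate segregated when the same-team fraction exceeds a threshold `theta`; the neighbor-count distributions are made-up stand-ins, not the paper's calibration.

```python
import random

def local_opinion(n_self, n_opp, theta):
    """True if the local observation looks segregated (hypothetical rule)."""
    total = n_self + n_opp
    if total == 0:
        return False  # no observation: default to "mixed"
    return n_self / total > theta

def error_rates(theta, trials=1000, seed=0):
    """Estimate false positives (mixed read as segregated) and
    false negatives (segregated read as mixed) for a given threshold."""
    rng = random.Random(seed)
    fp = fn = 0
    for _ in range(trials):
        # Mixed aggregate: roughly equal counts from both teams.
        if local_opinion(rng.randint(2, 6), rng.randint(2, 6), theta):
            fp += 1
        # Segregated aggregate: few opposite-team signals.
        if not local_opinion(rng.randint(2, 6), rng.randint(0, 1), theta):
            fn += 1
    return fp / trials, fn / trials
```

Under this toy model, a low `theta` accepts almost any same-team majority as "segregated" and inflates false positives, while a high `theta` rejects genuinely segregated observations and inflates false negatives, reproducing the trade-off the figure describes.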
Figure 12.
Outcome of the testing phase in the Pavlov experiment (n = 1000 trials, groups of 30 robots). The left plots show the dynamics of the quorum and how the majority opinion gradually propagates until the whole group has made a collective decision (median ±95% CI). The right plots show the response advertised by the group as a result of the quorum, in response to the conditional stimulus (CS) in the testing phase. The results show that, when the group is trained with the unconditional stimulus (US) alone, it does not recall information in the testing phase in 79% of the trials. However, when the group is trained with the unconditional and conditional stimuli together, it learns to associate the two stimuli and recalls the association during the testing phase, providing a positive response in 82% of the trials.
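The gradual propagation of the majority opinion shown in the left plots can be sketched as a simple local-majority process. This is a hedged illustration: the binary opinion encoding, the sampling of two random peers, and the update rule are assumptions for the sketch, not the paper's exact quorum protocol.

```python
import random

def run_quorum(opinions, steps=5000, seed=0):
    """Iterate local majority updates until consensus or the step limit.

    opinions: list of booleans (True = positive response, False = negative).
    """
    rng = random.Random(seed)
    opinions = list(opinions)
    n = len(opinions)
    for _ in range(steps):
        i = rng.randrange(n)  # pick a robot to update
        votes = [opinions[i],
                 opinions[rng.randrange(n)],
                 opinions[rng.randrange(n)]]
        opinions[i] = sum(votes) >= 2  # adopt the local majority of three
        if all(opinions) or not any(opinions):
            break  # consensus reached
    return opinions
```

In such majority dynamics, an initial bias toward one opinion is amplified by repeated local sampling, which is the qualitative behavior the quorum curves display.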