1. Introduction
In many situations, finding the source of a gaseous chemical released into the air becomes vital. For instance, robotic olfaction systems might be deployed in search and rescue, or in safety and security operations to find explosives or drugs in airports and industrial facilities, or to search for survivors after natural disasters. Despite a growing community of robotics and environmental engineers having studied this crucial topic for more than two decades, odor source localization using mobile or static sensor nodes remains a challenging operation in realistic environments. The main difficulty arises from the dispersion phenomenon [
1], which is a combination of molecular diffusion, which drives odor patches away from the source, and advection due to the airflow, which carries the molecules along its direction [
2]. As a result, the odor plume is shaped not only by the characteristics of the source, but also by the airflow which, in turn, is influenced by the environment. Furthermore, airflow characteristics such as turbulence, meandering motions, and fluctuations also affect the plume structure, making it more difficult to traverse.
To simplify the challenging problem of Odor Source Localization (OSL), scientists usually divide it into three sub-tasks [
3]: (i) odor plume acquisition, which refers to a search method that aims to find the plume in the environment; (ii) odor plume tracking, which is the phase where the robot leverages the samples obtained from the plume to localize the source; and (iii) odor source declaration, the final task, during which the robot validates and declares the location of the source.
The first and the last sub-tasks usually involve other sensing modalities (such as vision) and depend on the application or the scenario in which the strategy is taking place. Therefore, most studies in the literature focus on the plume tracking sub-task, which is the core and most challenging part of the mission. Various strategies have already been implemented for this sub-task, which can be classified into four often overlapping categories [
4]: gradient-based, bio-inspired, formation-based, and probabilistic algorithms.
Gradient-based algorithms assume the plume to be a smooth concentration gradient, which could be the case in an environment without an airflow. For this class of algorithms, the robot needs to sample at different points in the environment while moving slowly to measure the long-term average of odor concentration at each sampling point [
5]. Therefore, these algorithms, while being the most intuitive and computationally light ones, need relatively long time windows to find the source.
Bio-inspired algorithms define methods similar to existing behaviors in nature, such as search strategies of moths, dogs, bacteria, etc. [
6]. Casting, Surge-Cast and Surge-Spiral [
7,
8,
9] are among the most successful ones in this class; they are inspired by moth behavior. The advantage of these algorithms is their independence from any a priori information about the environment, its atmospheric conditions, or historical observations, which makes them robust, though not necessarily efficient, in unknown and dynamic areas. Nevertheless, due to the lower performance of current sensing and locomotion technologies compared to their biological counterparts, these algorithms are still far from being reliable in realistic environments [
10].
In a relatively new approach, formation-based algorithms are natively designed for multi-robot systems [
11]. Sampling multiple points at the same time is an advantage of this type of method but, since coordination between agents is necessary, the entire method relies on inter-robot communication and relative positioning.
Probabilistic algorithms model a belief on the source location in the form of a probability distribution derived from the observations made by the agents in the environment [
4]. After each observation, the belief is updated using Bayesian estimation. This process continues until the probability distribution reduces to a Dirac delta function. Infotaxis [
12], particle-based algorithms [
13], and Source Term Estimation (STE) [
14] are the main examples of this category.
While probabilistic algorithms can be computationally expensive, they have many advantages. Firstly, unlike the previously mentioned methods that provide only the source position, they usually generate a richer set of information about the environment (e.g., [
15]) or the source characteristics [
14]. Secondly, since the framework is probabilistic, the provided information is associated with an amount of uncertainty, which shows how trustworthy the data are. Moreover, unlike the other algorithmic classes, the sensing agent does not necessarily have to physically approach the source to localize it, hence the process is faster and remains applicable in situations where the source is inaccessible. Lastly, compared to the other categories, probabilistic algorithms are more flexible in terms of the type of underlying hardware (e.g., static, mobile, and single- or multi-agent).
Most of the probabilistic algorithms rely on a plume model to update their belief on the source position. Since these models need to be customized for each specific environment, they cannot be used in an unknown environment. STE methods, on the other hand, while relying on a mathematical formulation for the model based on the physics of transport phenomena, do not use any a priori information on the parameters of the model, i.e., the characteristics of the plume.
Considering the above-mentioned advantages, we chose to focus on an STE algorithm in this paper. The goal of STE algorithms is to learn the parameters of the source, such as the release rate, the release start and stop time, the dispersion in the space, and, most importantly, the position of the source. The choice of the parameters depends directly on the chosen plume model. The method aims to learn the parameters of the model while the sensory system gathers data in the environment. Since the concept is very broad, the algorithm is not exclusively used for gas sources; depending on the model, it can be applied to any type of source (e.g., radiation [
16]). It is also not limited to mobile sensor nodes, as it can be used with a static sensor network (e.g., [
17]).
However, when used on mobile robots, STE algorithms can be coupled with a navigation strategy which makes the data collection more time-efficient. Different navigation methods have already been used with STE in the literature. Partially Observable Markov Decision Processes (POMDP) [
16] and mutual information maximization [
18] are among the popular methods in this area. However, most of the works involving mobile robots have only been evaluated in simplified simulations (e.g., [
18,
19,
20]).
Hence, the contributions of this work are the following:
An enhanced navigation strategy for an STE algorithm is presented and evaluated both in a high-fidelity robotics simulator and in a controlled physical set-up.
To make the algorithm suitable for unknown environments, it is developed for two different frameworks, one with global knowledge about the environment and the other with only local knowledge.
A systematic performance evaluation was carried out to verify the robustness of the method to various environmental conditions.
The influence of a few key algorithmic parameters of the method on the performance was also studied.
The remainder of this paper is structured as follows. We present the details of the algorithm in the next section, and then explain the evaluation process and present the obtained results. Finally, we discuss the outcome of the experiments before concluding and presenting the outlook for this work.
2. Methods
As discussed in the Introduction, this work is built on the STE algorithm. STE is a probabilistic framework in which different strategies can be used. The general idea is to rely on a plume concentration model whose parameters have to be estimated throughout the experiment. The nature of the parameters depends on the model of choice but, in any case, they include the source position, which is the goal of the OSL problem.
STE methods are usually flexible enough to be performed on a large variety of sensing assets, ranging from static sensor networks to heterogeneous mobile robots. In the case of a mobile system, the estimation algorithm is augmented with a navigation strategy to increase the amount of acquired information at each step.
Since, in most OSL applications, the optimal method is the one with the shortest estimation time, we believe that using mobility can significantly improve the performance. Moreover, assuming slowly changing plume conditions, in the absence of mobility a large number of static sensors would be required to capture spatially rich enough information, while far fewer mobile sensing assets would suffice to gather the same amount of information. Therefore, in this work, we use a single mobile robot equipped with a chemical sensor to sample the environment.
As a result, our method consists of two main parts: estimation and navigation. In the estimation part, the algorithm estimates the parameters using the data that the robot has acquired. In the navigation part, the goal is to send the robot to the best neighboring point to obtain as much information as possible. This cycle, represented in
Figure 1, continues until the uncertainty on the parameter estimation becomes negligible.
2.1. Local vs. Global Frameworks
This concept is applied in two different frameworks that we call global and local. In the global framework, it is assumed that, firstly, the map of the environment is available to the robot and, secondly, the robot can localize itself using a global positioning system. In the local framework, on the other hand, the robot has no information about the map of the environment and does not have access to global positioning information. Therefore, it creates a fixed-size local map aligned with the wind direction around itself and performs the estimation and navigation on that local map using odometry. Once enough information has been acquired on the local map, the map is moved to another place.
Although the global framework seems more practical and easier to implement, we believe that the local framework is more realistic for unknown environments, where neither a predefined map nor a global positioning system is available.
Even though the two frameworks seem very different in terms of localization method, we applied the same concept to both of them, and there are only a few minor differences in the implementation. In the remainder of this section, we discuss the details of the method in both frameworks.
The method, as well as additional results concerning the global framework, is reported in a companion conference paper [
21].
2.2. Plume Model
Probabilistic algorithms usually rely on a plume model for updating their belief about the source position. In this work, we chose to use the pseudo-Gaussian concentration plume model [
22], presented in Equation (
1), which describes the time-averaged concentration of a continuous point source in a laminar flow, where Q is the source release rate and u the average wind speed.
In this equation, the X-axis is assumed to be aligned with the direction of the airflow, and the model is defined for all x downwind of the source (i.e., x > x_s). The concentration is simply 0 for all points upwind of the source (i.e., x < x_s). The source is positioned at (x_s, y_s, z_s), and σ_y and σ_z are the standard deviations of odor dispersion along the Y- and Z-axis, respectively; they are simplified to be linear functions of the downwind distance from the source (x − x_s).
Moreover, since we are leveraging an STE method, the goal of the algorithm is to estimate the parameters of the model. As the standard deviations σ_y and σ_z are assumed to be linear functions of the downwind distance, they each have two parameters to be estimated. Therefore, to simplify further, the plume is assumed isotropic in the Y and Z directions, hence σ_y = σ_z. Thus, after the simplifications, six parameters remain to be estimated (the source position, the release rate, and the two dispersion coefficients), making the problem six-dimensional.
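As an illustration, the simplified isotropic model can be sketched as follows. This is a hedged sketch, not the paper's exact Equation (1): the parameter ordering, the names (x_s, y_s, z_s, Q, a, b), and the treatment of the wind speed as a separately measured constant are assumptions on our part.

```python
import math

def plume_concentration(pos, m, u=1.0):
    """Time-averaged pseudo-Gaussian plume, isotropic simplification.

    pos: sampling point (x, y, z); the X-axis is aligned with the wind.
    m:   hypothetical parameter set (x_s, y_s, z_s, Q, a, b), where
         (x_s, y_s, z_s) is the source position, Q the release rate, and
         sigma = a * (x - x_s) + b the linear dispersion width.
    u:   average wind speed, assumed measured rather than estimated.
    """
    x_s, y_s, z_s, Q, a, b = m
    x, y, z = pos
    dx = x - x_s
    if dx <= 0:                        # upwind of the source: zero concentration
        return 0.0
    sigma = a * dx + b                 # isotropic: sigma_y = sigma_z = sigma
    return (Q / (2.0 * math.pi * u * sigma ** 2)) * math.exp(
        -((y - y_s) ** 2 + (z - z_s) ** 2) / (2.0 * sigma ** 2)
    )
```

The concentration peaks on the downwind centerline and widens as the plume grows with distance from the source.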
2.3. Parameter Estimation
The estimation is performed probabilistically using the Bayesian formulation presented in Equation (2), where m represents the set of the model's parameters and D the data obtained through sampling. The posterior P(m|D) represents the probability distribution over the parameter values.
The evidence P(D) being only a normalization factor, it can be neglected; hence P(m|D) ∝ P(D|m) P(m).
We consider the prior P(m) to be a uniform distribution between the limits of each parameter. Therefore, the posterior is proportional to the likelihood within the parameter limits, and equal to 0 outside.
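On a discretized parameter grid, this update reduces to an element-wise product followed by normalization. The sketch below is a minimal illustration; the actual posterior lives on a six-dimensional parameter space.

```python
def bayes_update(prior, likelihood):
    """Posterior ∝ prior × likelihood; the evidence is just the
    normalization constant, so it never needs to be computed explicitly."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    z = sum(unnormalized)              # plays the role of the evidence P(D)
    return [u / z for u in unnormalized]
```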
The likelihood P(D|m) defines the probability of obtaining a set of data given a set of parameters. In other words, it returns the likelihood of a set of parameters given the data that the robot has collected up to the present time. It is defined in [
17] as follows:
where σ_m and σ_d represent the standard deviations of the model error and the measurement error, respectively. Both errors are assumed to be normally distributed with zero mean.
The sum in Equation (4) runs over all samples the robot has gathered during the experiment. The model term is the concentration predicted, for a set of source parameters m and the position of sample i, by the pseudo-Gaussian plume model presented in Section 2.2. The measurement term is the actual sensed value for the very same sample i. Since we use a time-averaged plume model, each sample needs to be represented by the mean of the sensed concentrations over a fixed time window. Therefore, to gather one sample, the robot stops at each target point for 5 s and then averages the values sensed at 10 Hz (50 concentration values in total).
Since there are six parameters to estimate, the posterior probability density function is six-dimensional, which makes it very time-consuming to compute entirely. The solution to this problem is to use an approximation algorithm such as Markov Chain Monte Carlo (MCMC) [
23] that allows evaluating the posterior probability function through efficient sampling.
In this work, we use the Metropolis–Hastings method [
24]. The important factors in this method are the number of iterations and the proposal distribution, which is chosen as a 6-D Gaussian function, since the posterior probability function is also 6-D. Here, we only present the concept from a general perspective.
At each iteration, a candidate point p is drawn from the proposal distribution, which corresponds to a candidate set of parameters. Then, the likelihood of this point is calculated and compared to that of the previous point. The higher the likelihood of the new point compared to the previous one, the higher the chance of accepting it. If it is accepted, its likelihood is saved in the posterior probability distribution and the point replaces the previous one as the mean of the proposal distribution.
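The loop described above can be sketched as a minimal random-walk Metropolis sampler. It is shown in 1-D for readability; the paper's version uses a 6-D Gaussian proposal over the full parameter set.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_iter, step):
    """Draw samples whose density approximates exp(log_target)."""
    x, lp = x0, log_target(x0)
    chain = []
    for _ in range(n_iter):
        candidate = x + random.gauss(0.0, step)   # propose around current point
        lp_cand = log_target(candidate)
        # Accept with probability min(1, target(candidate) / target(x)):
        # higher-likelihood candidates are always accepted, lower ones sometimes.
        if math.log(random.random() + 1e-300) < lp_cand - lp:
            x, lp = candidate, lp_cand
        chain.append(x)
    return chain
```

Run against a standard normal log-density, the chain's sample mean and variance converge to 0 and 1, respectively.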
In both frameworks, the estimation part is performed identically. The only differences are the size and granularity of the posterior probability distribution on which the estimation takes place. More precisely, in the global framework, the range of the parameters related to the source position is based on the whole environmental map while, in the local framework, it is defined by the limits of the local map. The ranges of the other parameters are chosen identically in both frameworks.
2.4. Navigation
The goal of the navigation method is to feed the estimation part with worthwhile information. Therefore, it needs to predict which point is rich in terms of information to send the robot to. For this purpose, we propose to use POMDP [
25] as a predictive navigation method to guide the robot in a profitable fashion.
POMDP requires three components for its operation: a posterior distribution, a set of possible actions, and a reward function. The posterior distribution is already given by the estimation part. As for the set of actions, to simplify the robot's motion, we limit the movement to one axis at a time. Since movement is possible in both the positive and negative directions of all three axes, the set of possible actions comprises six actions in a 3-D environment. The most essential component of this navigation method is the reward function, as it determines the behavior of the robot. In this work, it is chosen to be the relative entropy (also known as the Kullback–Leibler divergence) [
26] which represents the gain of information from one probability density function
P to another
Q, defined in Equation (
5). In our case,
P would be the probability density function obtained through the estimation part, and
Q the predicted probability density function that is calculated separately for each of the target points.
Thus, at each step, once the six possible target points are determined, the robot will first predict the concentration that would be measured in each of them, using the current estimation of the model parameters. Then, it calculates the potential update of the posterior probability function using the Bayesian formulation, which would result in the predicted posterior probability function
Q. Finally, it evaluates the gain of information at each point using Equation (
5).
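The selection step can be sketched as follows; the function and action names are illustrative and not the authors' implementation.

```python
import math

def kl_divergence(p, q):
    """Relative entropy D_KL(P || Q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def best_action(current_posterior, predicted_posteriors):
    """Return the action whose predicted posterior differs most from the
    current one, i.e., the direction with the highest information gain."""
    return max(predicted_posteriors,
               key=lambda a: kl_divergence(current_posterior,
                                           predicted_posteriors[a]))
```

An action whose predicted measurement would leave the posterior unchanged scores zero; the robot prefers the action that reshapes the belief the most.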
The chosen target point leads the robot in the direction with the best information gain. However, when the robot has no information about the plume, i.e., no concentration has been sensed, no direction gives more information than the others. Hence, the robot tends to sample serially in all directions and, as a result, stays in the same region for a long time. To make the search more global, we need a component in the navigation strategy that promotes more exploration at the beginning and gradually leads the robot towards the source as the estimation becomes more accurate. Such an indicator can be found in the evolution of the maximum a posteriori estimate of the source position.
Indeed, while the entropy is near its maximum (i.e., very little information is available), the maximum a posteriori estimate remains very unstable, but it converges towards the ground truth, similarly to the expected value, as the estimation becomes more accurate.
Therefore, we define the movement vector as a weighted sum of two 3-D vectors: the one leading towards the target point that provides the most information, given by the Kullback–Leibler divergence calculation, and the one leading towards the maximum a posteriori estimate of the source position.
Intuitively, one might suggest that the algorithm should lead the robot towards the expected source position instead of the maximum a posteriori estimate. However, the expected source position never promotes exploration and might trap the robot in an equilibrium. The coefficient in Equation (6) determines how global or local the search is. Its optimal value is studied in Section 3.
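A plausible convex-combination reading of Equation (6) can be sketched as follows; the weight name gamma is our placeholder for the paper's coefficient, and the exact weighting may differ.

```python
def movement_vector(v_info, v_map, gamma):
    """Blend the information-gain direction (exploration) with the
    direction toward the maximum a posteriori source estimate
    (exploitation); gamma in [0, 1] tunes how global the search is."""
    return tuple(gamma * i + (1.0 - gamma) * m for i, m in zip(v_info, v_map))
```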
Specifications of the Navigation Method in the Local Framework
The above-mentioned navigation method is valid for both the local and the global framework. However, in the case of the local framework, we need to introduce additional algorithmic features to cope with the lack of global localization and of an environment map. A summary of all of them can be found in Algorithm 1.
As mentioned above, since the robot does not have the map of the environment, it has to generate a local one as a basis for its estimation and navigation. The same procedure as in the global framework is then applied to the local map, with the exception that it is done on a smaller scale and in distinct steps. More concretely, the algorithm is run on one local map, and then the map is moved to another place where the robot restarts from the beginning. The cycle continues until the source is localized inside the map and the entropy of the posterior probability distribution thus drops below a fixed threshold.
Unlike in the global framework, where the shape of the environment is known to the robot, in the local framework the robot might encounter an obstacle, for instance a wall, while navigating towards a target point to sample. In this work, for the sake of simplicity, when the robot encounters an obstacle on its way, it simply stops after carrying out an obstacle avoidance maneuver (in our case, a simple Braitenberg algorithm [
27]) and then sets the measurement of the target point to 0. In this way, that point and its neighbors are no longer considered as sampling-point candidates. The cycle then continues normally.
Another process specific to the local framework is the way the robot decides to move from one local map to the next. Conceptually, this should happen when there is enough information for the robot to move on. However, since the odometry error accumulates with every step, we want the robot to reset all position-based information (i.e., self-localization and sample positions) in the new local map in order to reduce the influence of the odometry error on the performance. Moving to a new map therefore means erasing all previously acquired information. Hence, spending too much time in one map is wasteful; spending too little time, on the other hand, although it reduces the number of iterations, could be misleading. As a result, the maximum number of samples in the local map is one important factor that we study in this paper.
Moreover, sampling the same number of points in all local maps does not appear to be optimal. Depending on where the local map is located with respect to the source and the plume in general, the robot might need to sample more or fewer points. One illustrative case is when the source is located inside the local map. In this situation, we would want the robot to continue sampling in the same map until the end condition is satisfied, rather than move to another map (which would most likely again be around the source) and start over. The opposite case is when the local map is completely outside the plume and the robot senses no concentration at any point of the map. In this situation, we would like the robot to move on to the next map as soon as possible, where it might have a better chance of obtaining relevant information.
Figure 2 illustrates the two mentioned examples.
Based on the arguments above, we adapted the number of samples in each map to the amount of information obtained through sampling. More specifically, we define an initial sample budget that represents the number of samples the robot would take if no information were available. The budget is then reduced every time a sample is taken. However, this reduction is not the same at every iteration and depends on how valuable the newly obtained sample was. If the sample does not bring any information, i.e., the entropy stays the same as in the previous iterations or even increases, the budget is reduced by 1. If the entropy drops, on the other hand, the budget is reduced by less, depending on the amount of obtained information. Equation (7) shows the mathematical definition of the budget reduction. Note that the budget can never reach 0 because, as explained in Section 2.5, as soon as the entropy drops below a certain positive threshold, the algorithm stops. The optimal value of the initial sample budget is studied further in this work.
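One plausible form of such a budget-reduction rule can be sketched as follows. Equation (7) itself is not reproduced here, so the normalization of the entropy drop by the maximum entropy is an assumption on our part.

```python
def update_budget(budget, h_prev, h_new, h_max):
    """Deduct a full unit for an uninformative sample, less otherwise.

    h_prev, h_new: posterior entropy before/after the new sample;
    h_max: maximum possible entropy, used to normalize the drop
    (hypothetical normalization, not necessarily the paper's).
    """
    if h_new >= h_prev:                    # no information gained
        return budget - 1.0
    gain = (h_prev - h_new) / h_max        # normalized entropy drop in [0, 1]
    return budget - (1.0 - gain)           # informative samples cost less
```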
Once we know when to move to a new map, we need to know the best point to move to. For the reasons mentioned in
Section 2.4, we believe that the maximum a posteriori estimate of the final posterior probability density function is the best option. Indeed, we need to explore the environment as much as possible when very little information is available, and we want the robot to get closer to the source step by step once it has enough information. Using relative entropy here would not be useful since, when we move to a new map, the previous points are forgotten. Therefore, pure exploitation seems preferable here. Thus, when the robot decides to move to a new map, it goes to the point indicated by the maximum a posteriori estimate of the posterior probability density function, which becomes the center of the new map. From that point, the whole procedure starts over.
Figure 3 shows a graphical representation of the move from one local map to another.
In the same spirit of exploitation, when no concentration is sensed in one map, there is still one source of information that can be exploited: the wind direction. Therefore, to increase the exploitation, our method moves the map crosswind when the sum of all sensed concentrations equals 0. Whether to move left or right along the crosswind direction is chosen uniformly at random and can change at each iteration.
Algorithm 1: Local framework: when and how to change the position of the local map.
2.5. End of Algorithm
As mentioned above, the algorithm stops when the uncertainty on the source parameter estimation becomes very low. More concretely, we calculate the entropy of the posterior probability function, which indicates the amount of uncertainty; when it drops below a certain threshold, the algorithm is considered to have enough information to stop and start the source declaration procedure.
Additionally, there is a timeout that forces the algorithm to stop when the estimation takes more time than expected. In the global framework, this timeout is represented by a pre-established total number of iterations; in the local framework, it corresponds to a given maximal number of local maps.
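The combined stopping test can be sketched as follows; the threshold and timeout values are application-specific and the function names are illustrative.

```python
import math

def entropy(posterior):
    """Shannon entropy of a normalized discrete posterior (in nats)."""
    return -sum(p * math.log(p) for p in posterior if p > 0)

def should_stop(posterior, threshold, step, max_steps):
    """Stop when uncertainty is low enough, or when the timeout is reached."""
    return entropy(posterior) < threshold or step >= max_steps
```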
4. Discussion
The algorithm shows robustness to the tested environmental conditions in the global framework but, in the local framework, it only works at high wind speeds. We believe this is because, in the global framework, the source is guaranteed to be located within the map where the estimation takes place. In the local framework, the local map does not initially enclose the source but, using the estimated posterior distribution, it supposedly guides the robot towards the source.
One advantage of the local framework that we have not yet addressed is the short travelled distance. In
Figure 15, the total travelled distances in both frameworks are shown for simulations in condition C. It appears that, in the local framework, the robot travels on average about one third of the distance covered in the global framework. The limited navigation strategy in the local framework, on the one hand, and the large exploration steps in the global framework, on the other hand, explain the difference between the two frameworks.
One of our previous works [
32], which proposed a 3-D odor source localization algorithm based on Infotaxis, used a structure similar to that of the present work. Both rely on an odor plume model but, in the Infotaxis algorithm, the model with all its parameters is assumed to be known before the experiment. This assumption, while applicable to a known environment, limits the usage of the method in real-world scenarios, where there is usually no a priori information about the plume. STE-based algorithms, on the other hand, do not use a priori information on the parameters of the model and estimate their values based on the acquired data. The only predefined values are the limits on the range of each parameter. Thus, by choosing realistic yet meaningful ranges, the algorithm can be run in different configurations.
It is worth mentioning that a few assumptions are included in this STE algorithm, as explained in
Section 2.3. However, they differ from a priori information, which is meant to simplify the evaluation. In this work, by making assumptions, we did not seek to simplify the evaluation of the algorithm, but only the algorithm itself and its computation. For example, assuming that the plume is isotropic (i.e., equal dispersion in the Y and Z directions) was meant to decrease the number of parameters to estimate and hence simplify the computation. Such simplifications might not match the real set-up, and the algorithm would then work less efficiently. However, since the algorithm was successful despite these limitations, we can expect it to work even better without them. The same reasoning holds for the choice of the model and the assumptions on which it is based.
Another interesting difference between this work and our previously mentioned one [
32] is that, in that work, the estimation was done on the entire map of the environment (globally) while the navigation was done in small steps, i.e., in a way more similar to our local framework. The results of that work also showed lower performance at low wind speeds compared to high wind speeds. However, the mean source localization error was lower in [
32] than in the present paper. Therefore, a combination of local navigation and global estimation seems to be more suitable for this type of algorithm.
5. Conclusions
We successfully developed an odor source localization algorithm based on the STE method. The algorithm was evaluated in two frameworks, global and local, in both simulation and physical reality. The impact of different algorithmic parameters was studied, and the robustness of the method to different environmental configurations, namely wind speed and source release rate, was evaluated.
One of the advantages of this algorithm is its ability to cover the three sub-tasks of the odor source localization problem. The enhanced navigation method proposed in this paper allows for an efficient search during the plume acquisition part, and the amount of uncertainty associated with the estimation supports the source declaration sub-task as well.
Thanks to the MCMC method, the high computational cost associated with probabilistic algorithms was considerably reduced, and the algorithm could be run entirely on a Khepera IV robot.
The successful evaluation of this method in the local framework, without any predefined environment map or global positioning system, shows the high potential of this algorithm for unknown environments. However, its downside is its limited robustness to different environmental conditions, which should be addressed in future work.