1. Introduction
Image segmentation is a fundamental step in image processing in which the region of interest is separated from the background. Segmentation techniques are usually grouped into methods based on the similarity between pixels in a region, among which thresholding techniques stand out, and methods based on edge detection [1]. The latter include algorithms based on convolution masks, such as the Roberts [2], Prewitt [3], and Sobel–Feldman [4] methods, and edge detectors such as those developed by Canny [5] and Deriche [6]. Notwithstanding their age, the Canny edge detection algorithm and its Deriche variant are still considered state-of-the-art filters and are widely used in diverse applications, particularly in computer vision [7,8], and are even combined with neural networks (NNs) [9].
Furthermore, recent advances in deep learning have given rise to methods based on neural networks, predominantly convolutional neural networks. Although NN-based techniques are claimed to outperform humans on small-scale datasets, they have significant limitations: they require large labeled datasets for training, are sensitive to noise, struggle to detect fine edges because convolutional layers emphasize larger, more salient details, incur high computational costs, and generalize poorly to new contexts. Additionally, most improvements in current architectures come at the cost of generalizability, so performance drops when the dataset distribution shifts, whether in the training or the test set. Consequently, the problem of edge detection remains far from fully solved, and interpretability in decision making is lacking [10].
Phase congruency is another, less well-known technique used for edge detection. Although several recent works have used this method, for instance by integrating it into pipelines for image registration or edge detection in various applications [11,12,13], there have been no new developments concerning the principles of the technique, except those produced by the authors themselves [14].
Edges can be defined as significant changes in the intensity level of an image that often occur at the boundary between two different regions [15].
In addition, significant changes in the brightness level of an image can correspond to ridges or valleys, and their study is important in applications such as fingerprint technologies. In this case, optimal detectors such as the Canny algorithm produce undesired results, yielding double edges, since they were not designed to detect this type of discontinuity [16]. There are different techniques to detect edges, which can be classified, according to the domain in which they work, as spatial, frequency, and wavelet methods [1]. In the spatial domain, gradient- or Laplacian-based techniques are used, which depend directly on the brightness changes in an image. This category includes the Roberts, Prewitt, and Sobel operators and the Canny detector [17]. In the frequency domain, the properties of the different frequency components of the image are used; for example, in phase congruency, the coincidence of the phases of the different frequency components is quantified to detect edges [18]. In the wavelet domain, multi-resolution images are used, which allows noise reduction and feature analysis at different scales [19].
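The double-edge behavior of gradient-based detectors on ridges can be illustrated in one dimension. The following minimal sketch uses a hypothetical triangular ridge profile and a plain central-difference gradient (not the full Canny detector): the gradient magnitude vanishes at the ridge center and is non-zero on both flanks, so a detector that marks large gradient magnitudes responds twice, once on each side of the true ridge.

```python
def central_gradient(signal):
    """Central-difference derivative; the two endpoints use one-sided differences."""
    n = len(signal)
    grad = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        grad.append((signal[hi] - signal[lo]) / (hi - lo))
    return grad


# Hypothetical 1-D ridge: a triangular bump of height 5 centred at index 10.
ridge = [max(0.0, 5.0 - abs(i - 10)) for i in range(21)]
magnitude = [abs(g) for g in central_gradient(ridge)]

centre = 10
print("gradient magnitude at the ridge centre:", magnitude[centre])
print("gradient magnitude on the two flanks:", magnitude[centre - 2], magnitude[centre + 2])
```

A phase-based measure does not rely on the gradient magnitude, which is why it can mark the ridge at its center instead of on its flanks.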
Phase congruency can detect different types of edges by modeling symmetric and antisymmetric signal behavior. The ideal symmetric edge corresponds to a square (boxcar) signal, and the ideal antisymmetric edge to a step signal. The former occurs when the intensity changes abruptly and then returns to its original value. The latter occurs when the intensity changes abruptly from one value to another but does not return to its original value. In practice, however, the changes are not truly discontinuous, and therefore the ideal boxcar-type and step-type edges correspond more closely to a triangular signal and a ramp, respectively [20], as illustrated in Figure 1.
Phase congruency has the advantage of being able to identify the three types of edges (steps, ridges, and valleys) by making use of the phase of the frequency components. The phase remains unaffected by changes in brightness or contrast, which sets the technique apart from spatial-domain edge detection methods and enables contours to be detected regardless of the signal magnitude at the edge. Several implementations of phase congruency have been published [
21,
22,
23], which can be grouped into two classes, taking into account the type of filter used to determine the phase. The first class employs directional filters at different scales using wavelets. The second class uses monogenic filters to improve the efficiency of the method, reducing the computational cost [
22]. Jacanamejoy et al. observed that phase congruency can be represented as the product of three factors, one of them being the quantization function of the phase congruency [24]. In 2021, the authors of [25] evaluated different quantization functions and their properties, finding that the most efficient were the exponential and the boxcar functions, depending on the type of edges present in the image. Therefore, this work presents a deeper study in which the quantization function is represented by a single mathematical expression that generalizes the exponential, Gaussian, and boxcar functions. This reduces the complexity of the phase congruency (PC) parameterization, since it is no longer necessary to switch between different quantization functions; it suffices to adjust two parameters that shape the function.
This article is organized as follows: Section 2 provides a brief description of phase congruency using monogenic filters and the importance of the quantization function for its calculation. Section 3 presents the materials used for the evaluation of the proposed quantization function, which is presented in Section 4. The results are presented in Section 5, and finally, the conclusions are presented in Section 6.
2. Monogenic Phase Congruency
Phase congruency was pioneered by the Italian scientist M.C. Morrone at the University of Western Australia in Perth in 1986 [26], but it was not until 1996 that the Australian researcher Peter Kovesi first employed it in his doctoral thesis [21]. The technique is grounded in the coincidence of the phases of the Fourier components of a signal where an edge is present, as depicted in Figure 2, which shows the approximation of a square signal in blue. Notably, the phases of the first four frequency components, represented by dotted lines, coincide at the edge.
One of the main advantages of this method is its capability to detect edges regardless of the image’s gray levels. This is achieved by considering that edges manifest when there is phase congruence among the frequency components.
This implies that, independent of contrast, if an edge is present, the phases of the various frequency components must coincide. Morrone and Owens introduced the mathematical definition of phase congruency, as shown in Equation (1) [27]:
$$PC(x) = \max_{\bar{\phi}(x) \in [0, 2\pi]} \frac{\sum_{n} A_n \cos\left(\phi_n(x) - \bar{\phi}(x)\right)}{\sum_{n} A_n} \qquad (1)$$
where the function $PC(x)$ results from maximizing the expression over the weighted average of the local phase, $\bar{\phi}(x)$, at each point $x$, and $\phi_n(x)$ is the local phase of the $n$-th Fourier component.
Equation (1) enables an effective determination of phase congruency, leading to robust edge detection, independent of image intensity variations.
The calculation of the amplitude of the Fourier components, $A_n$, is essential for solving the optimization problem, especially considering that these components are closely spaced in practice. Initially, Kovesi employed wavelet filters to calculate the PC [18]. However, this approach required 24 filters (six orientations at each scale) to extract the frequency components when working with four scales. As a consequence, the method was computationally expensive, since all of these filter responses had to be computed to assess whether the phases coincided.
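The phase coincidence behind the Morrone and Owens definition can be checked numerically on the square-wave example. The sketch below is only an illustration under stated assumptions: it takes the first four odd harmonics of a square wave with amplitudes 1/n, uses the angle of the analytic signal as the local phase, and evaluates the maximization in closed form (the maximum over the mean phase equals the length of the vector sum divided by the sum of amplitudes). The function and variable names are hypothetical.

```python
import cmath
import math


def phase_congruency(amplitudes, phases):
    """max over phi_bar of sum(A_n cos(phi_n - phi_bar)) / sum(A_n).

    The maximum is reached when phi_bar is the angle of the vector sum,
    so the ratio equals |sum(A_n e^{i phi_n})| / sum(A_n).
    """
    vector_sum = sum(a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases))
    return abs(vector_sum) / sum(amplitudes)


# First four components of a square wave: sin(n x) / n for odd n (assumed example).
orders = [1, 3, 5, 7]
amplitudes = [1.0 / n for n in orders]


def local_phases(x):
    # Local phase of sin(n x), taken as the angle of its analytic signal.
    return [n * x - math.pi / 2 for n in orders]


pc_at_edge = phase_congruency(amplitudes, local_phases(0.0))
pc_on_plateau = phase_congruency(amplitudes, local_phases(math.pi / 2))
# Scaling every amplitude (a contrast change) leaves the measure unchanged.
pc_low_contrast = phase_congruency([0.1 * a for a in amplitudes], local_phases(0.0))

print("PC at the edge:", pc_at_edge)
print("PC on the plateau:", pc_on_plateau)
print("PC at the edge, low contrast:", pc_low_contrast)
```

At the edge all four phases are equal, so the measure reaches its maximum, while on the plateau the phases alternate and the measure drops well below one.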
Aware of the computational cost of using wavelet filters to compute the PC, Felsberg and Sommer introduced a more efficient alternative based on monogenic filters [28]. These filters assess all directions simultaneously using a single filter per scale, thereby significantly reducing the computational cost. For instance, when employing four scales, as shown in Figure 3, the filter count drops from twenty-four to four. In light of these advances, Kovesi implemented a version of PC in software that incorporated these filters. Consequently, the optimization problem posed in Equation (1) was solved by means of the following equation [29]:
For analytical purposes, it can be decomposed into three components that signify frequency distribution weighting, phase congruency quantization, and noise compensation. This trichotomy not only facilitates comprehensive examination but also enables the consolidation of various technique variations through the utilization of the equation [
14]
where
represents the weighted frequency distribution,
the phase congruency quantization, and
the noise compensation. The second function,
, maps the mean phase deviation to a range between zero and one. These values correspond to image pixels belonging either to the background or to an edge, respectively.
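As a reading aid, the three-factor decomposition described above can be written schematically as follows. The symbol names $W$, $Q$, $N$, and $\Delta\phi$ are assumptions introduced here for illustration and are not necessarily the notation of the original equation.

```latex
% Schematic product form (symbol names are assumed, not taken from the source):
%   W      : frequency distribution weighting
%   Q      : phase congruency quantization, applied to the mean phase deviation
%   N      : noise compensation
PC(x) = W(x)\, Q\big(\Delta\phi(x)\big)\, N(x),
\qquad Q : [-\pi, \pi] \to [0, 1]
```

Under this reading, $Q$ is the only factor that decides how much phase deviation is tolerated before a pixel stops being treated as an edge, while $W$ and $N$ act as corrections.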
Conceptually, the quantization factor is the most significant: ideally it would stand alone, whereas the other two factors arise from practical constraints introduced by the approximations applied to the frequency components. Consequently, research has focused on the quantization function, with the aim of improving edge detection results [25].
As previously mentioned, phase congruency can discern various types of edges. Synthetic images were therefore used to emulate phase congruency detection in challenging scenarios. Since the quantization factor determines how closely the frequency components must align at a pixel, it was observed that the segmentation outcome depends on the mathematical function used for this factor. This function, termed the “quantization function”, quantifies phase congruency, as its name suggests. Its essential attributes are that it is centered at the origin, has even symmetry, and has a global maximum of one.
Moreover, the edge detection capability can be improved by changing the shape of the quantization function. Forero et al. therefore explored a set of mathematical functions having the attributes of a quantization function in order to observe their impact on edge detection via phase congruency. The experiments showed that the best response was achieved with symmetric functions with an aperture of 0.4, a concept formally defined in Definition 2. Among the functions investigated, the exponential function performed best, followed by the quartic and the boxcar functions, as depicted in Figure 4 [25].
3. Materials and Methods
To evaluate the new quantization function introduced in this work, an image bank was built consisting of thirty grayscale samples of different sizes, twenty of which were photographs taken with a mobile phone and ten of which were synthetic images. To facilitate the analysis, objects with straight lines were chosen that included the three types of edges studied and different levels of contrast. A ground-truth image showing the edges present was created for each sample.
The methods used in this study were written in Java as plugins for the freely available ImageJ software v1.54f. Figure 5 presents the three-step method for obtaining the edge image. First, the PC of the input image was computed, adjusting the quantization function as necessary. The result was a real-valued grayscale image with values ranging from 0 to 1, which was then normalized to the range 0 to 255. Second, the image was binarized using an empirically chosen global threshold of 70. Finally, a morphological skeletonization operation was applied to obtain the final edge image.
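The normalization and thresholding steps can be sketched as follows. This is a minimal illustration in Python rather than the ImageJ plugin code: the function names and the tiny input array are hypothetical, the comparison is assumed to be strict (pixel value greater than 70), and the final skeletonization step is deliberately left out.

```python
def normalize_to_8bit(pc_image):
    """Map real values in [0, 1] to integers in [0, 255], clamping out-of-range input."""
    return [[int(round(255 * min(max(v, 0.0), 1.0))) for v in row] for row in pc_image]


def binarize(image_8bit, threshold=70):
    """Global threshold (assumed strict): 1 where the pixel exceeds the threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in image_8bit]


# Hypothetical phase congruency output for a 2 x 4 image.
pc_image = [
    [0.05, 0.10, 0.90, 0.10],
    [0.02, 0.30, 0.85, 0.20],
]

edges = binarize(normalize_to_8bit(pc_image))
for row in edges:
    print(row)
```

On the 0 to 255 scale, a threshold of 70 corresponds to a phase congruency of roughly 0.27, so only pixels with fairly strong phase agreement survive before thinning.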
The proposed method for applying phase congruency relied on adjusting a single, general quantization function. Two parameters, each continuously variable within its valid range, controlled the function. This flexibility made it easier to find the function best suited to the characteristics of the processed image. The following section describes the proposed quantization function.
4. Generalized Quantization Function of Phase Congruency
The function used to quantify phase congruency plays an important role in edge detection. This factor measures how closely the phases of the frequency components coincide and, based on the resulting value, determines whether or not a pixel corresponds to an edge. The function is characterized by being centered at the origin and having even symmetry and a maximum value equal to one.
In a previous work, in which different mathematical functions were studied for the quantization of PC, the exponential, quartic, and boxcar functions were found to give the best edge detection results [25]. Therefore, to simplify the tuning of the phase congruency parameters, it was necessary to develop a mathematical function that generalized the exponential, Gaussian, and boxcar functions by means of a shape parameter.
The domain of the quantization functions had to be $[-\pi, \pi]$, since the average phase deviation was calculated with the atan2 function, which returns values in this range. Thus, taking into account the characteristics that this function had to satisfy, the following definition was proposed.
Definition 1 (Features of the quantification function). Let the function , defined on , be such that it satisfies all of the following conditions:
, then ;
has a global maximum equal to 1;
is non-decreasing ;
is non-increasing ;
.
In addition to the general characteristics of the quantization function, the concept of aperture (opening) was introduced to ensure accurate scaling adjustment. This property was defined under the premise that its value corresponded to the scale of the function.
Definition 2 (Property: opening of a function)
. Let be an even function ; its opening interval is given by such that:where is known as the opening of the function. The opening enabled the comparison of results from different functions under the same scaling condition. In conventional functions, the aperture is not explicitly present. Hereafter, the functions that were unified into a generalized form are defined.
Definition 3 (Exponential function)
. Let the functionwithThen, it is said that the function is an exponential function for this work. Definition 4 (Gaussian function)
. Let the functionwithThen, it is said that the function is a Gaussian function for this work. Definition 5 (Boxcar function)
. Let the functionwithThen, it is said that the function is a boxcar function for this work. The aperture parameter did not directly appear in the definition of each function, except in the case of the boxcar function. In this case, the requirement to satisfy the opening property was directly reflected in the proposed definition. This was due to the fact that a typical boxcar function, based on its inherent definition, cannot adhere to the opening property.
In the generalization of the function proposed in this study, considering the case of the boxcar function, which involves discontinuities, it became necessary to employ the concept of pointwise convergence, as defined below.
Definition 6 (Pointwise convergence). Let $\Omega$ be a subset of the real numbers, and let $\{f_n\}$ be a sequence of functions defined on a set of real numbers $X$. Furthermore, let $f$ be a function from $X$ to $\Omega$. It is said that the sequence $\{f_n\}$ converges pointwise to the function $f$ on $X$ if, for all $x \in X$, the sequence of real numbers $\{f_n(x)\}$ converges to $f(x)$, i.e.,
$$\lim_{n \to \infty} f_n(x) = f(x).$$
Proposed Function
Definition 7. Let the functiondefined as To obtain comparable results between the different quantization functions, a strategy was designed to make them take the value of at the same point in x.
It can also be observed that and .
Starting from the Function (
8) shown in Definition 7 and taking
, it follows that
from which the value of
k was obtained in terms of
, i.e.,
Substituting Equation (
9) into Equation (
8) produced the function
with
and
. Note that setting $q = 1$ in the resulting function yielded the exponential function, given in Definition 3, and setting $q = 2$ yielded the Gaussian function, given in Definition 4.
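The derivation above can be followed with a worked example. The block below is a sketch under explicit assumptions, not a restatement of the original equations: it assumes that the starting function has the form $e^{-k|x|^{q}}$, that $a$ denotes the aperture, and that the common value imposed at $x = a$ is one half. The symbols $f$, $k$, $q$, and $a$ are used here for illustration only.

```latex
% Assumed starting form and assumed half-maximum condition at the aperture a:
f(x) = e^{-k\lvert x\rvert^{q}}, \qquad f(a) = \tfrac{1}{2}
\;\Longrightarrow\; e^{-k a^{q}} = \tfrac{1}{2}
\;\Longrightarrow\; k = \frac{\ln 2}{a^{q}},
% Substituting k back into f gives a two-parameter family in (a, q):
\qquad
f(x) = \exp\!\left(-\ln 2 \,\left\lvert \frac{x}{a} \right\rvert^{q}\right)
     = 2^{-\lvert x/a \rvert^{q}}, \qquad a > 0,\; q > 0 .
```

With these assumptions, $q = 1$ gives an exponential profile, $q = 2$ a Gaussian profile, and increasing $q$ flattens the top and steepens the sides toward a boxcar of half-width $a$.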
Proposition 1. Let , the sequence of functions be of the form (10), and the function be defined by (5). It is said that the sequence converges pointwise to on Ω if for every and for every , there exists a natural number such that whenever . Proof. Consider the cases below.
It is observed that
and so
Case 3:
Thus,
Therefore,
converged point-wise to the function
. □
Figure 6 shows three examples of the function family, obtained by modifying the
q parameter, showing how effectively all three types of behavior could be obtained by the proposed function given in (
10).
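The behavior of the function family can also be explored numerically. The sketch below is hedged in the same way as any reconstruction of the formula: it assumes the form exp(-ln 2 · |x/a|^q), that is, a half-maximum value at the aperture, uses the aperture of 0.4 mentioned earlier as a default, and compares large values of q with an assumed boxcar limit that takes the value 0.5 exactly at the aperture. All names are hypothetical.

```python
import math


def generalized_quantization(x, aperture=0.4, q=1.0):
    """Assumed form exp(-ln(2) * |x / aperture| ** q); equals 0.5 at |x| = aperture."""
    return math.exp(-math.log(2.0) * abs(x / aperture) ** q)


def boxcar(x, aperture=0.4):
    """Assumed pointwise limit: 1 inside the aperture, 0.5 on it, 0 outside."""
    if abs(x) < aperture:
        return 1.0
    if abs(x) == aperture:
        return 0.5
    return 0.0


points = (0.0, 0.2, 0.4, 0.6)  # inside, inside, on, and outside the aperture
for q in (1.0, 2.0, 20.0, 200.0):
    samples = [generalized_quantization(x, q=q) for x in points]
    print("q =", q, [round(s, 4) for s in samples])
print("boxcar ", [boxcar(x) for x in points])
```

The three cases of the convergence argument appear directly in the output: values inside the aperture tend to one, values outside tend to zero, and the value on the aperture stays at one half for every q.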
5. Results
To evaluate the new quantization function using the constructed database, a phase congruency code written in Java for the freely available software ImageJ was modified [30].
In addition to being processed with the PC, the images were also processed with the Canny edge detector to compare the results of both techniques. The Canny method was chosen since it is still considered state-of-the-art among the gradient-based edge detection techniques [31,32,33,34]. Finally, to evaluate the quality of the results, the Dice–Sørensen index (DS) [35,36] and the Abdou and Pratt figure of merit (FOM) [37] were used.
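The two metrics respond differently to localization errors, which the following sketch makes concrete. It is a minimal illustration, not the evaluation code used in the study: the Dice–Sørensen index is computed on the sets of edge pixels, the figure of merit uses the customary scaling constant of 1/9 (an assumption here), and the test images are hypothetical.

```python
def edge_pixels(mask):
    """Coordinates of the non-zero pixels of a binary image given as a list of rows."""
    return [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]


def dice_sorensen(detected, truth):
    """2 |A and B| / (|A| + |B|) over the sets of edge pixels."""
    a, b = set(edge_pixels(detected)), set(edge_pixels(truth))
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def pratt_fom(detected, truth, alpha=1.0 / 9.0):
    """Pratt's figure of merit; alpha = 1/9 is the customary choice (assumed here)."""
    det, ref = edge_pixels(detected), edge_pixels(truth)
    if not det or not ref:
        return 0.0
    total = 0.0
    for r, c in det:
        # Squared distance from this detected pixel to the nearest true edge pixel.
        d2 = min((r - rr) ** 2 + (c - cc) ** 2 for rr, cc in ref)
        total += 1.0 / (1.0 + alpha * d2)
    return total / max(len(det), len(ref))


# Hypothetical vertical edge, and the same edge detected one pixel to the right.
truth = [[0, 0, 1, 0, 0] for _ in range(3)]
shifted = [[0, 0, 0, 1, 0] for _ in range(3)]

print("Dice, exact match:", dice_sorensen(truth, truth))
print("Dice, shifted by one pixel:", dice_sorensen(shifted, truth))
print("FOM, shifted by one pixel:", pratt_fom(shifted, truth))
```

A one-pixel shift drives the overlap-based index to zero while the distance-based figure of merit stays high, which is why the two measures are best read together.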
Table 1 contains the DS indices and the FOM obtained by comparing the ground truth of the 30 test images with the results of the Canny edge detection and phase congruency methods. In addition to the generalized function proposed in this study, the boxcar and quartic functions were also employed to compute the PC values, so that their results could be compared. An explicit column for the exponential function was not included, since this case corresponded to the outcome obtained with the generalized function when $q = 1$, as demonstrated in the example of Image 16. The cases highlighted in green represent the best results achieved, indicating that, for the majority of the images, the proposed method produced better edge detection indices.
It is important to underline that when ridge or valley edges were present, the figure of merit could be misleading, because double edges were detected in the vicinity of the actual edge. It was therefore crucial to use both metrics in a complementary manner. For this reason, to select the best outcome in each row of Table 1, the two indices were averaged, and the case with the highest average was highlighted in green.
In
Figure 7, a synthetic image composed of rectangles with varying dimensions can be observed. The results obtained by the different methods appeared to be similar, attributed to the absence of noise and the high contrast at the edges. However, the edge localization in Canny’s method was imprecise, as evident in
Table 1, where the Dice–Sørensen index is low, despite the high figure of merit.
Figure 8 depicts the outcomes obtained with the evaluation metrics for different values of the parameter
q. These results, utilizing Image 29, revealed that the variation in
q significantly impacted edge detection when
q was close to 2. The Dice–Sørensen index increased in such cases due to a more precise edge localization, reaching a maximum value of
, while the figure of merit remained high, around
. For Canny’s method, the Dice–Sørensen index yielded a value of
, and the figure of merit was
.
Figure 9 displays the outcomes achieved by segmenting the image of a shelf using all methods, showcasing the response to subtle variations in the grayscale. The result with the Canny detection method showed a loss of information at the edges that demarcated the depth of the shelf.
As observed in
Figure 10, the optimal edge detection occurred with the proposed function when the parameter
, indicating that the function approximated a Boxcar shape. This yielded the highest combined values of the Dice–Sørensen index and the figure of merit, with values of
and
, respectively. In contrast, for the Canny method, the Dice–Sørensen index was
, and the figure of merit was
.
Image 13 of the database, illustrated in
Figure 11, had very distinctive features, because other types of edges, such as ridges and valleys, were present. Canny’s method was at a disadvantage here, detecting double edges parallel to the real one, because it was not designed for the detection of ridges and valleys. The results with phase congruency were better because this technique was designed to detect all three types of edges. As observed in Figure 12, the optimal edge detection occurred with the boxcar function, closely followed by the proposed function. This happened because the proposed function approached the boxcar as $q \to \infty$, and the study was conducted only up to
.
In Image 30 of the database, as shown in
Figure 13, similar edges to those seen in Image 13 of
Figure 11 were present. It was evident that the Canny method produced double edges; however, in the results obtained with the phase congruency (PC) approach, this disadvantage in the detection response was not observed.
Figure 14 reveals that the best result was achieved when
, meaning the function approached a boxcar shape. Indeed, the outcome obtained with the boxcar function itself was slightly worse, and the flexibility of the proposed function allowed for a superior result. In contrast, the Dice–Sørensen index obtained with Canny’s method was very low, below
, due to the error caused by the double edges in the ridges and valleys of the image. However, the figure of merit was higher, at
, though not reaching a very high value, as it was based on the calculation of the distance between the detected edge and the real edge.
From
Figure 8,
Figure 10,
Figure 12 and
Figure 14, the optimal values of
q were identified as
, 19,
, and 20, respectively. These results demonstrate that obtaining the best outcomes did not strictly require using a specific exponential, Gaussian, or boxcar function, as was carried out in previous works. Instead, intermediate values can be considered. For instance, in the case of Image 17, the best result occurred when
, which lies between 1 and 2. This clearly demonstrates the advantages of utilizing the proposed function, expanding the possibilities for edge detection by enabling the use of non-standardized quantification functions that have not previously been considered. Similarly, as observed in the results of
Table 1, in no case did the quartic function, which could not be obtained from the proposed function, yield the best result. This further underscores that the inclusion of such functions within the general proposed framework is unnecessary, reaffirming the generality of the proposed method.
6. Conclusions
Phase congruency is a technique that can be defined as the product of three factors, including the quantization function, which is crucial because its appropriate selection allows for better edge detection results. In this work, a general mathematical function was introduced that combined different families of functions into a single formula, enabling the adjustment of the shape, amplitude, size, and displacement of this function.
To validate the function, several images were selected based on their characteristics, aiming to include different types of edges, such as steps, ridges, and valleys. The validation confirmed that, since phase congruency detects step, ridge, and valley edges, its results were often superior to those obtained with the Canny edge detection technique. It is worth noting that in the cases where the Canny method achieved better results, the differences in the Dice–Sørensen index and the figure of merit between the two methods were small. Conversely, in the cases where phase congruency yielded better results, the difference was considerably larger.