The Geographically Weighted Multivariate Poisson Inverse Gaussian Regression Model and Its Applications

Mardalena, Selvi; Purhadi, Purhadi; Purnomo, Jerry Dwi Trijoyo; Prastyo, Dedy Dwi

doi:10.3390/app12094199

Department of Statistics, Faculty of Science and Data Analytics, Institut Teknologi Sepuluh Nopember (ITS), Surabaya 60111, Indonesia

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(9), 4199; https://doi.org/10.3390/app12094199

Submission received: 27 March 2022 / Revised: 13 April 2022 / Accepted: 19 April 2022 / Published: 21 April 2022

Abstract

This study aims to develop a method for overdispersed multivariate spatial count data with a mixed Poisson distribution, namely the Geographically Weighted Multivariate Poisson Inverse Gaussian Regression (GWMPIGR) model. The parameters of the GWMPIGR model are estimated locally using the maximum likelihood estimation (MLE) method by considering spatial effects. Therefore, the significance of the regression parameters differs for each location. Four GWMPIGR models are evaluated, which differ in the exposure variable (one exposure variable or three exposure variables) and in the spatial weighting function. We compare the performance of these four models in a real-world application using data on the number of infant, under-5 and maternal deaths in East Java in 2019 with five predictor variables. Compared to the fixed kernel Gaussian weighting function, the GWMPIGR model with the fixed kernel bisquare weighting function and one exposure variable has a better fit based on the AICc value. Furthermore, according to the best GWMPIGR model, there are several regional groups formed based on predictors that significantly affected each event in East Java in 2019.

Keywords:

infant, under-5, and maternal deaths; overdispersion; exposure; spatial analysis; GWMPIGR

1. Introduction

This study aims to develop a method for overdispersed multivariate spatial count data. In general, count data can be analyzed using Poisson regression models. However, the assumption of equidispersion in Poisson regression is difficult to fulfill: variance that exceeds the mean (overdispersion) is frequently found when analyzing real data [1,2]. Therefore, alternative methods are needed to model overdispersed data. Mixed Poisson models, such as the negative binomial regression (NBR) model [3,4,5] and the Poisson-inverse Gaussian regression (PIGR) model [6,7,8], are often used as alternatives to Poisson regression models. In this study, the PIGR model was chosen because it performs better when modeling data with high overdispersion. The PIGR model has also been extended to a multivariate model with two or more response variables [9,10].

The PIGR model is a global regression model that assumes every observation location is influenced by the predictors in the same way. However, in some cases, the effects of location cannot be ignored because each location has different characteristics, such as geography and culture. Point-based spatial regression models for overdispersed count data have also been studied. The Geographically Weighted Bivariate Generalized Poisson Regression (GWBGPR) model was used by [11] to analyze the factors affecting the number of infant and maternal deaths in East Java, Indonesia. Parameter estimation and hypothesis testing for the Geographically Weighted Multivariate Generalized Poisson Regression (GWMGPR) model were studied theoretically by [12]. Both studies use the maximum likelihood estimation (MLE) method and the Newton-Raphson (NR) iterative algorithm to estimate the model parameters. The NR method requires the second derivatives of the log-likelihood function with respect to each parameter in the model.

Generalized Poisson regression deals with problems related to over- or underdispersion. Meanwhile, the negative binomial regression (NBR) and PIGR models only deal with overdispersion. The Geographically Weighted Negative Binomial Regression (GWNBR) model was studied by [13], who applied it to simulated and real data to show the superiority of the model over global regression models, such as the Poisson and negative binomial regression models. The iteratively reweighted least squares (IRLS) and NR methods are often used in parameter estimation. The GWNBR model was also used by [14] to model the confirmed COVID-19 cases in East Java. That model uses the adaptive bisquare kernel weighting function, and parameter estimation is performed using a combination of the IRLS and NR methods. The results showed that COVID-19 spread quickly in locations with a high population density. The authors of [15] evaluated the multivariate extension, the Geographically Weighted Multivariate Negative Binomial Regression (GWMNBR) model, with parameters estimated by a combination of the MLE and NR methods. The results showed that the GWMNBR method performed better than the global method.

As mentioned before, the PIGR method only deals with overdispersion, and according to [1] and other previous empirical studies [7,8,16], the PIGR model has larger variance than the NBR model. Therefore, the PIGR model can accommodate greater overdispersion than the NBR model can. The PIGR model has been developed into a multivariate model with two or more response variables [10,17]. Globally, the bivariate Poisson inverse Gaussian regression (BPIGR) model was developed by [9], while the multivariate Poisson inverse Gaussian regression (MPIGR) was developed by [10,18]. Locally, the Geographically Weighted Poisson Inverse Gaussian Regression (GWPIGR) model was created by [19]. For bivariate cases, Ref. [20] developed the Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model. To accommodate a combination of global and local parameter estimates, Ref. [21] developed an alternative GWBPIGR model, namely the Mixed Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (MGWBPIGR) model. In all those studies, parameter estimation was carried out using the MLE and NR methods and the spatial kernel weighting function.

The purpose of this study was to develop a point-based spatial model for multivariate count data with the PIG distribution, namely the Geographically Weighted Multivariate Poisson Inverse Gaussian Regression (GWMPIGR) model. The proposed model is based on the MPIGR model developed by [10]. There are two main differences between the model developed in this study and those developed in previous studies. First, by considering the probability of each occurrence in each research unit, several exposures were used for the case study. We have previously studied the univariate case, via a case study, in which the PIGR model with exposure performed better than the PIGR model without exposure [9]. Second, we used the iterative Berndt-Hall-Hall-Hausman (BHHH) method for parameter estimation. Compared to the NR method, the BHHH method does not need the second derivatives of the log-likelihood function with respect to each parameter in the model. The second derivatives can cause problems; for example, the Hessian matrix may not be positive definite. They also typically result in a complex nonlinear function whose expected value remains unknown. By eliminating the second derivatives, the BHHH method provides a simpler computation with guaranteed convergence [22,23].

For the real-world application of the proposed model, we analyzed one of the health problems in Indonesia, namely the high number of infant, under-5 and maternal deaths. Numerous studies have been conducted to produce effective policies to reduce the mortality rate of these three groups. This problem is also one of the targets in the third goal of the SDGs for 2030, living a healthy and prosperous life, which can partially be achieved by reducing the neonatal mortality rate to 12 per 1000 live births and the under-5 mortality rate to 25 per 1000 live births by the same year. Furthermore, the maternal mortality rate should also be reduced to less than 70 per 100,000 live births.

Indonesia, the fourth most populous country in the world, is struggling to achieve the 2030 SDG targets for maternal and under-5 mortality. Based on the Indonesian Demographic and Health Survey (IDHS) in 2017, the Maternal Mortality Rate (MMR) was 305 per 100,000 live births. In 2020, the Infant Mortality Rate (IMR) in Indonesia was 17.6 per 1000 live births and the under-5 Mortality Rate (U5MR) was 23 per 1000 live births [24,25]. East Java is one of the provinces in Indonesia with the highest number of infant, under-5 and maternal deaths. Therefore, it is necessary to study the causal factors to reduce infant, under-5 and maternal mortality in East Java.

In terms of the factors that affect infant, under-5 and maternal mortality, various aspects of life such as socioeconomic status, health facilities, environment, education and culture play equally important roles. Several factors are often the closest factor or the direct cause of death. These factors include maternal factors, nutrition and disease control in infants and children under five, while bleeding and hypertension are the main causes of death in mothers [26,27]. In 2018, Ref. [28] conducted a study on the factors of neonatal mortality in Indonesia and found that low birth weight (LBW), giving birth to children too close together, medical personnel and postnatal/postpartum visits have a significant effect on neonatal mortality in Indonesia.

In this study, we use four GWMPIGR models that differ in the exposure variable and the spatial weighting function. This study aims to compare the performance of each model in modeling the number of infant, under-5 and maternal deaths and to determine the factors that cause those events based on the best model. District/city level data with different characteristics (spatial heterogeneity) are used. Thus, the functional forms and parameters differ between locations. As a result, the parameter significance is also different for each location. Through the GWMPIGR model, districts/cities in East Java can be grouped based on the predictor variables that significantly affect the number of infant, under-5 and maternal deaths. This grouping is expected to help the relevant agencies make policies to solve this issue.

The specifications of the GWMPIGR model are described in Section 2, the real-world application of the GWMPIGR model is described in Section 3, and the factors affecting the number of infant, under-5 and maternal deaths in East Java in 2019 based on the selected model are discussed in Section 4.

2. Specifications of Geographically Weighted Multivariate Poisson Inverse Gaussian Regression

The MPIGR model extends the PIGR model, which is a generalized linear model (GLM) with a logarithmic link function. Let (Y₁, Y₂, …, Y_m) ~ MPIG (μ₁, μ₂, …, μ_m); the joint probability mass function is as follows:

P (y_{1}, y_{2}, \dots, y_{m}) = {(2 z π^{- 1})}^{\frac{1}{2}} e^{\frac{1}{τ}} K_{s} (z) {(z τ)}^{- \sum_{j = 1}^{m} y_{j}} \prod_{j = 1}^{m} \frac{μ_{j}^{y_{j}}}{y_{j}!}

(1)

with $y_{j} = 0, 1, 2, \dots$, where $τ$ is the dispersion parameter $(τ > 0)$, $s = \sum_{j = 1}^{m} y_{j} - \frac{1}{2}$, $z = \frac{1}{τ} \sqrt{1 + 2 τ \sum_{j = 1}^{m} μ_{j}}$, and $K_{s} (z) = K_{\sum_{j = 1}^{m} y_{j} - \frac{1}{2}} (\frac{1}{τ} \sqrt{1 + 2 τ \sum_{j = 1}^{m} μ_{j}})$ is the modified Bessel function of the third kind [8]. The properties of the PIG distribution (PIGD) can be found in [29,30]. The MPIGR model can be described as

E (Y_{j}) = μ_{j} = q_{j} \exp (x^{T} β_{j})

(2)

where $j = 1, 2, \dots, m$, $μ_{j}$ is the predicted mean of the jth response variable, $q_{j}$ is the exposure for the jth response variable, and τ is the overdispersion parameter. $β_{j}$ is the vector of regression parameters for the jth response variable, whose kth element corresponds to predictor variable k, for k = 1, 2, …, p [10,18].
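As a rough illustration of Equation (1), the following sketch evaluates the MPIG log-probability using only the Python standard library. The function names are hypothetical and are not taken from the paper. It relies on the fact that the Bessel order s is a half-integer, so K_s(z) can be built from K_{1/2}(z) = sqrt(π/(2z)) e^{−z} with the standard three-term recurrence; this is one possible way to compute it, not necessarily how the authors did.

```python
import math

def log_scaled_bessel_k(n, z):
    """Return log[K_{n-1/2}(z) * exp(z) * sqrt(2z/pi)] for integer n >= 0.

    Uses K_{1/2} = K_{-1/2} and the recurrence
    K_{nu+1}(z) = K_{nu-1}(z) + (2 nu / z) K_nu(z), applied to successive ratios
    so that large orders do not overflow.
    """
    log_k, ratio = 0.0, 1.0          # ratio starts as K_{1/2} / K_{-1/2} = 1
    for i in range(1, n):
        nu = i - 0.5
        ratio = 1.0 / ratio + 2.0 * nu / z   # K_{nu+1} / K_nu
        log_k += math.log(ratio)
    return log_k

def mpig_logpmf(y, mu, tau):
    """Log of Equation (1) for counts y, means mu (same length) and tau > 0."""
    total = sum(y)
    z = math.sqrt(1.0 + 2.0 * tau * sum(mu)) / tau
    # (2z/pi)^(1/2) * K_s(z) collapses to exp(-z) times the scaled Bessel term.
    out = 1.0 / tau - z + log_scaled_bessel_k(total, z)
    out -= total * math.log(z * tau)
    for yj, mj in zip(y, mu):
        out += yj * math.log(mj) - math.lgamma(yj + 1)
    return out
```

A quick sanity check is that the probabilities sum to one over a sufficiently large grid of counts and that the mean of each margin is close to the corresponding μ_j.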

The GWMPIGR model extends the MPIGR model by incorporating spatial (location-based) aspects. The GWMPIGR model is a point-based spatial model where the regression parameters depend on the geographic location. Therefore, each location has different functions and parameters. The GWMPIGR model for the jth response variable can be written as follows:

μ_{j} (u_{i}, v_{i}) = q_{j} e^{x^{T} β_{j} (u_{i}, v_{i})}

(3)

where $(u_{i}, v_{i})$ refers to the longitude and latitude coordinates of the ith location, and $β_{j} (u_{i}, v_{i}) = {[\begin{matrix} β_{j 0} (u_{i}, v_{i}) & β_{j 1} (u_{i}, v_{i}) & \dots & β_{j p} (u_{i}, v_{i}) \end{matrix}]}^{T}$ is the regression parameter vector of the jth response variable for location i, with dimension (p + 1) × 1.

For the GWMPIGR model, parameter estimation is performed using the maximum likelihood estimation (MLE) method and the Berndt-Hall-Hall-Hausman (BHHH) iteration method. At each observation location (point), parameter estimation is carried out using a spatial weight matrix (W). The elements of W, denoted $w_{i i^{*}}$, show how much influence the observations at a particular location have on the surrounding observations.

The joint probability mass function of $Y_{i 1}, Y_{i 2}, \dots, Y_{i m}$, with j = 1, 2, …, m and i = 1, 2, …, n, at coordinates $(u_{i}, v_{i})$ is as follows:

\begin{array}{l} P (y_{i 1}, y_{i 2}, \dots, y_{i m}) \\ = {(\frac{2}{π τ (u_{i}, v_{i})})}^{\frac{1}{2}} e^{\frac{1}{τ (u_{i}, v_{i})}} K_{s_{i}} (z (u_{i}, v_{i})) {(1 + 2 τ (u_{i}, v_{i}) \sum_{j = 1}^{m} μ_{j} (u_{i}, v_{i}))}^{- \frac{(2 \sum_{j = 1}^{m} y_{i j} - 1)}{4}} \prod_{j = 1}^{m} \frac{{(μ_{j} (u_{i}, v_{i}))}^{y_{i j}}}{y_{i j}!} \end{array}

(4)

with $s_{i} = \sum_{j = 1}^{m} y_{i j} - \frac{1}{2}$, $z (u_{i}, v_{i}) = \frac{1}{τ (u_{i}, v_{i})} \sqrt{1 + 2 τ (u_{i}, v_{i}) \sum_{j = 1}^{m} q_{i} e^{x_{i}^{T} β_{j} (u_{i}, v_{i})}}$, $K_{s_{i}} (z (u_{i}, v_{i})) = K_{\sum_{j = 1}^{m} y_{i j} - \frac{1}{2}} (\frac{1}{τ (u_{i}, v_{i})} \sqrt{1 + 2 τ (u_{i}, v_{i}) \sum_{j = 1}^{m} q_{i} e^{x_{i}^{T} β_{j} (u_{i}, v_{i})}})$, j = 1, 2, …, m, $y_{i j} = 0, 1, 2, \dots$, and $τ > 0$.

Let $θ_{i} = {[\begin{matrix} β_{1}^{T} (u_{i}, v_{i}) & β_{2}^{T} (u_{i}, v_{i}) & \dots & β_{m}^{T} (u_{i}, v_{i}) & τ (u_{i}, v_{i}) \end{matrix}]}^{T}$ be the parameter vector of the GWMPIGR model. Then, the likelihood function for the GWMPIGR model is

L (θ_{i}, i = 1, 2, . . ., n) = \prod_{i = 1}^{n} P (y_{i 1}, \dots, y_{i m}; β_{1} (u_{i}, v_{i}), . . ., β_{m} (u_{i}, v_{i}), τ (u_{i}, v_{i}))

\begin{array}{l} = \prod_{i = 1}^{n} ({(\frac{2}{π})}^{\frac{1}{2}} e^{\frac{1}{τ (u_{i}, v_{i})}} {(\frac{1}{τ (u_{i}, v_{i})})}^{\frac{1}{2}} K_{s_{i}} (z_{i} (u_{i}, v_{i})) \times \\ {(1 + 2 τ (u_{i}, v_{i}) \sum_{j = 1}^{m} q_{i} e^{x_{i}^{T} β_{j} (u_{i}, v_{i})})}^{- \frac{(2 \sum_{j = 1}^{m} y_{i j} - 1)}{4}} \prod_{j = 1}^{m} \frac{{(q_{i} e^{x_{i}^{T} β_{j} (u_{i}, v_{i})})}^{y_{i j}}}{y_{i j}!}) \end{array}

(5)

Let $θ_{i^{*}} = {[\begin{matrix} β_{1}^{T} (u_{i^{*}}, v_{i^{*}}) & β_{2}^{T} (u_{i^{*}}, v_{i^{*}}) & \dots & β_{m}^{T} (u_{i^{*}}, v_{i^{*}}) & τ (u_{i^{*}}, v_{i^{*}}) \end{matrix}]}^{T}$ be the parameter vector for location i*. Then, the likelihood function used to estimate the parameters at location i* with the spatial weights $w_{i i^{*}}$ is as follows:

L (θ_{i^{*}}) = \prod_{i = 1}^{n} {(P (y_{i 1}, y_{i 2}, \dots, y_{i m}; β_{1} (u_{i^{*}}, v_{i^{*}}), . . ., β_{m} (u_{i^{*}}, v_{i^{*}}), τ (u_{i^{*}}, v_{i^{*}})))}^{w_{i i *}}

= \prod_{i = 1}^{n} {[e^{\frac{1}{τ (u_{i^{*}}, v_{i^{*}})}} {(\frac{2}{π τ (u_{i^{*}}, v_{i^{*}})})}^{\frac{1}{2}} K_{s_{i}} (z_{i} (u_{i^{*}}, v_{i^{*}})) {(1 + 2 τ (u_{i^{*}}, v_{i^{*}}) \sum_{j = 1}^{m} q_{i} e^{x_{i}^{T} β_{j} (u_{i^{*}}, v_{i^{*}})})}^{- \frac{(2 \sum_{j = 1}^{m} y_{i j} - 1)}{4}} \prod_{j = 1}^{m} \frac{{(q_{i} e^{x_{i}^{T} β_{j} (u_{i^{*}}, v_{i^{*}})})}^{y_{i j}}}{y_{i j}!}]}^{w_{i i^{*}}}

(6)

The log-likelihood function of Equation (6) is as follows:

\begin{array}{ll} ℓ^{*} (θ_{i^{*}}) & = \log \prod_{i = 1}^{n} {(P (y_{i 1}, \dots, y_{i m}; β_{1} (u_{i^{*}}, v_{i^{*}}), \dots, β_{m} (u_{i^{*}}, v_{i^{*}}), τ (u_{i^{*}}, v_{i^{*}})))}^{w_{i i^{*}}} \\ & = \sum_{i = 1}^{n} w_{i i^{*}} \log (P (y_{i 1}, \dots, y_{i m}; β_{1} (u_{i^{*}}, v_{i^{*}}), \dots, β_{m} (u_{i^{*}}, v_{i^{*}}), τ (u_{i^{*}}, v_{i^{*}}))) \\ & = \sum_{i = 1}^{n} (\frac{w_{i i^{*}}}{τ (u_{i^{*}}, v_{i^{*}})} + \frac{1}{2} \log (\frac{2}{π τ (u_{i^{*}}, v_{i^{*}})}) w_{i i^{*}} + \log K_{s_{i}} (z (u_{i^{*}}, v_{i^{*}})) w_{i i^{*}} - \\ & \quad (\frac{2 \sum_{j = 1}^{m} y_{i j} - 1}{4}) \log (1 + 2 τ (u_{i^{*}}, v_{i^{*}}) \sum_{j = 1}^{m} q_{i} e^{x_{i}^{T} β_{j} (u_{i^{*}}, v_{i^{*}})}) w_{i i^{*}} + \sum_{j = 1}^{m} y_{i j} \log (q_{i}) w_{i i^{*}} + \\ & \quad \sum_{j = 1}^{m} y_{i j} x_{i}^{T} β_{j} (u_{i^{*}}, v_{i^{*}}) w_{i i^{*}} - \sum_{j = 1}^{m} \log (y_{i j}!) w_{i i^{*}}) \end{array}

(7)

Several methods can be used to determine the spatial weights for each different location in the GWMPIGR model, including the kernel function. There are two types of the kernel functions: the fixed kernel function and the adaptive kernel function. The difference between those two functions lies in the bandwidth value (h). The bandwidth value for the fixed kernel function is the same for every location (h). Meanwhile, the bandwidth value for the adaptive kernel function is different for every location (h_i). In this study, we used fixed kernel functions, specifically the fixed Gaussian kernel functions and fixed bisquare kernel functions. The fixed Gaussian kernel function is formulated as follows:

w_{i i *} (u_{i}, v_{i}) = \exp (- \frac{1}{2} {(\frac{d_{i i *}}{h})}^{2})

(8)

Meanwhile, the fixed bisquare kernel function is

w_{i i *} (u_{i}, v_{i}) = \begin{cases} {(1 - {(d_{i i *} / h)}^{2})}^{2}, & \text{for } d_{i i *} \leq h \\ 0, & \text{for } d_{i i *} > h \end{cases}

(9)

where $d_{i i *} = \sqrt{{(u_{i} - u_{i *})}^{2} + {(v_{i} - v_{i *})}^{2}}$ is the Euclidean distance between location $(u_{i}, v_{i})$ and location $(u_{i *}, v_{i *})$, and h is a smoothing parameter (bandwidth).
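The two fixed kernels in Equations (8) and (9) can be sketched in a few lines of Python. The function names are illustrative only; coordinates are assumed to be (u, v) pairs in whatever planar units the bandwidth h is expressed in.

```python
import math

def euclidean_distance(loc_a, loc_b):
    """Distance d_{ii*} between two (u, v) coordinate pairs."""
    return math.hypot(loc_a[0] - loc_b[0], loc_a[1] - loc_b[1])

def gaussian_weight(d, h):
    """Fixed Gaussian kernel, Equation (8)."""
    return math.exp(-0.5 * (d / h) ** 2)

def bisquare_weight(d, h):
    """Fixed bisquare kernel, Equation (9): zero beyond the bandwidth."""
    return (1.0 - (d / h) ** 2) ** 2 if d <= h else 0.0

def weight_vector(locations, target, h, kernel):
    """Weights w_{ii*} of every location i relative to the target location i*."""
    return [kernel(euclidean_distance(loc, target), h) for loc in locations]
```

Note the practical difference: the Gaussian kernel gives every location a positive weight, whereas the bisquare kernel drops locations farther than h from the target entirely.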

The bandwidth value is related to the accuracy of the model. A very small bandwidth value causes the variance to become larger. On the other hand, a large bandwidth value can cause a larger bias. Therefore, it is very important to choose the appropriate bandwidth value. According to [31], the selection of the optimal bandwidth value can be achieved using the cross validation (CV) criteria, which can be formulated mathematically as follows:

C V (h) = \min {\sum_{i = 1}^{n} {[y_{i} - {\hat{y}}_{(\neq i)} (h)]}^{T} [y_{i} - {\hat{y}}_{(\neq i)} (h)]}

(10)
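The leave-one-out structure of Equation (10) can be illustrated with a deliberately simplified sketch. Here the prediction at location i is a Gaussian-kernel-weighted mean of the other locations' responses; this is only a stand-in for the full GWMPIGR fit without observation i, which the paper uses. The function names and the candidate bandwidth grid are assumptions.

```python
import math

def loo_cv_score(coords, responses, h):
    """Sum of squared leave-one-out errors for bandwidth h.

    coords: list of (u, v) pairs; responses: list of response vectors y_i.
    Stand-in predictor: kernel-weighted mean of the other locations.
    """
    score = 0.0
    for i, (ci, yi) in enumerate(zip(coords, responses)):
        # Observation i itself gets weight zero (the "not equal to i" fit).
        weights = [0.0 if k == i else
                   math.exp(-0.5 * (math.dist(ci, ck) / h) ** 2)
                   for k, ck in enumerate(coords)]
        wsum = sum(weights)
        for j in range(len(yi)):
            pred = sum(w * yk[j] for w, yk in zip(weights, responses)) / wsum
            score += (yi[j] - pred) ** 2
    return score

def select_bandwidth(coords, responses, grid):
    """Return the candidate bandwidth with the smallest CV score."""
    return min(grid, key=lambda h: loo_cv_score(coords, responses, h))
```

In practice the grid would span a range of plausible bandwidths; a very small h can make all weights underflow to zero, which this sketch does not guard against.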

The ML estimator that maximizes the log-likelihood function of the GWMPIGR model in Equation (7) is obtained by solving the system of equations for all of the first partial derivatives of the log-likelihood function for each parameter and then equating it with zero as follows:

\frac{\partial ℓ^{*} (θ_{i^{*}})}{\partial θ_{i^{*}}} = 0

(11)

The first-order partial derivatives of the log-likelihood function are as follows:

\frac{\partial ℓ (θ_{i^{*}})}{\partial β_{1} (u_{i^{*}}, v_{i^{*}})} = \sum_{i = 1}^{n} [y_{i 1} - M (y_{i 1}, y_{i 2}, \dots, y_{i m}) \exp (x_{i}^{T} β_{1} (u_{i^{*}}, v_{i^{*}}))] x_{i}^{T} w_{i i^{*}}

(12)

\frac{\partial ℓ (θ_{i^{*}})}{\partial β_{2} (u_{i^{*}}, v_{i^{*}})} = \sum_{i = 1}^{n} [y_{i 2} - M (y_{i 1}, y_{i 2}, \dots, y_{i m}) \exp (x_{i}^{T} β_{2} (u_{i^{*}}, v_{i^{*}}))] x_{i}^{T} w_{i i^{*}}

(13)

\frac{\partial ℓ (θ_{i^{*}})}{\partial β_{m} (u_{i^{*}}, v_{i^{*}})} = \sum_{i = 1}^{n} [y_{i m} - M (y_{i 1}, y_{i 2}, \dots, y_{i m}) \exp (x_{i}^{T} β_{m} (u_{i^{*}}, v_{i^{*}}))] x_{i}^{T} w_{i i^{*}}

(14)

\begin{array}{l} \frac{\partial ℓ (θ_{i^{*}})}{\partial τ (u_{i^{*}}, v_{i^{*}})} & = \frac{1}{τ^{2} (u_{i^{*}}, v_{i^{*}})} \sum_{i = 1}^{n} (M (y_{i 1}, y_{i 2}, \dots, y_{i m}) \times \\ (1 + τ (u_{i^{*}}, v_{i^{*}}) \sum_{j = 1}^{m} \exp (x_{i}^{T} β_{j} (u_{i^{*}}, v_{i^{*}}))) - 1 - \frac{1}{τ (u_{i^{*}}, v_{i^{*}})} \sum_{j = 1}^{m} y_{i j}) w_{i i *} \end{array}

(15)

where

M (y_{i 1}, y_{i 2}, \dots, y_{i m}) = \frac{1}{\sqrt{1 + 2 τ (u_{i^{*}}, v_{i^{*}}) \sum_{j = 1}^{m} q_{i} \exp (x_{i}^{T} β_{j} (u_{i^{*}}, v_{i^{*}}))}} \frac{K_{\sum_{j = 1}^{m} y_{i j} + \frac{1}{2}} (z (u_{i^{*}}, v_{i^{*}}))}{K_{\sum_{j = 1}^{m} y_{i j} - \frac{1}{2}} (z (u_{i^{*}}, v_{i^{*}}))}

(16)
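A useful way to check score equations of this kind is to compare them with finite differences of the weighted log-likelihood. The sketch below does this under stated assumptions: the mean is taken as μ_ij = q_i exp(x_i^T β_j), so the exposure enters the β-scores through μ_ij, and the τ-score is written as obtained by differentiating Equation (7) directly, which readers may wish to compare with Equation (15). All function names are hypothetical.

```python
import math

def bessel_terms(n, z):
    """Return (log of scaled K_{n-1/2}(z), K_{n+1/2}(z) / K_{n-1/2}(z)), integer n >= 0."""
    log_k, ratio = 0.0, 1.0            # K_{1/2} / K_{-1/2} = 1
    for i in range(1, n + 1):
        log_k += math.log(ratio)
        ratio = 1.0 / ratio + (2.0 * i - 1.0) / z   # recurrence on ratios
    return log_k, ratio

def local_loglik_and_score(beta, tau, X, Y, q, w):
    """Weighted log-likelihood (Equation (7)) and its gradient at one target location.

    beta: m coefficient lists; X: rows x_i; Y: rows (y_i1, ..., y_im);
    q: exposures q_i; w: spatial weights w_{ii*}.
    """
    m, p = len(beta), len(beta[0])
    loglik, score_tau = 0.0, 0.0
    score_beta = [[0.0] * p for _ in range(m)]
    for x, y, qi, wi in zip(X, Y, q, w):
        mu = [qi * math.exp(sum(b * v for b, v in zip(bj, x))) for bj in beta]
        a = 1.0 + 2.0 * tau * sum(mu)
        z = math.sqrt(a) / tau
        n = sum(y)
        log_k, ratio = bessel_terms(n, z)
        big_m = ratio / math.sqrt(a)                 # M in Equation (16)
        ll = 1.0 / tau - z + log_k - 0.5 * n * math.log(a)
        ll += sum(yj * math.log(mj) - math.lgamma(yj + 1) for yj, mj in zip(y, mu))
        loglik += wi * ll
        for j in range(m):
            for k in range(p):
                score_beta[j][k] += wi * (y[j] - big_m * mu[j]) * x[k]
        score_tau += wi * (big_m * (1.0 + tau * sum(mu)) - 1.0 - tau * n) / tau ** 2
    return loglik, score_beta, score_tau
```

Comparing `score_tau` and the entries of `score_beta` with central differences of `loglik` on a small artificial data set is a cheap safeguard before plugging the scores into an iterative optimizer.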

The first derivatives of the log-likelihood function with respect to the GWMPIGR model parameters (Equations (12)–(15)) do not yield closed-form solutions when set to zero. Thus, the estimates are obtained through the Berndt-Hall-Hall-Hausman (BHHH) iteration method using the following algorithm:

Step-1: Determine the initial value of each parameter using the results of the MPIGR model parameter estimator.
Step-2: Define the gradient vector $g (θ_{i})$, whose elements are the first derivatives in Equations (12)–(15).
Step-3: Define the Hessian matrix $H (θ_{i})$, approximated by the negative sum of the outer products of the individual gradient vectors,

$H (θ_{i}) = - [\sum_{i = 1}^{n} g_{i} (θ) {(g_{i} (θ))}^{T}]$

where $g_{i} (θ)$ is the individual gradient vector.
Step-4: Define the tolerance limits (ε = 10⁻³) and maximum iteration (t^* = 1000).
Step-5: Start the BHHH iteration using the following equation:

${\hat{θ}}_{i}^{(t + 1)} = {\hat{θ}}_{i}^{(t)} - H^{- 1} ({\hat{θ}}_{i}^{(t)}) g ({\hat{θ}}_{i}^{(t)}),$

(17)
Step-6: Stop the iteration when the maximum number of iterations t* is reached or when $‖ {\hat{θ}}_{i}^{(t + 1)} - {\hat{θ}}_{i}^{(t)} ‖ \leq ε .$

Repeat this algorithm for each location i (i = 1, 2, …, n).
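The steps above can be sketched on a much simpler model. The example below runs the BHHH update of Equation (17) on a spatially weighted Poisson log-likelihood rather than on the GWMPIGR likelihood, purely to keep the code short; the structure (individual gradients, outer-product matrix, update, stopping rule) is the same. The function names are hypothetical, and no step-size control is included, although practical implementations often add a line search.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def bhhh_weighted_poisson(X, y, w, theta0, eps=1e-3, max_iter=1000):
    """BHHH iterations for a weighted Poisson model with log link (stand-in model)."""
    theta = list(theta0)
    p = len(theta)
    for t in range(max_iter):
        g = [0.0] * p
        outer = [[0.0] * p for _ in range(p)]
        for x, yi, wi in zip(X, y, w):
            mu = math.exp(sum(b * v for b, v in zip(theta, x)))
            gi = [wi * (yi - mu) * v for v in x]      # individual gradient g_i
            for r in range(p):
                g[r] += gi[r]
                for c in range(p):
                    outer[r][c] += gi[r] * gi[c]
        # H = -sum_i g_i g_i^T, so theta - H^{-1} g equals theta + outer^{-1} g.
        step = solve(outer, g)
        theta = [b + s for b, s in zip(theta, step)]
        if math.sqrt(sum(s * s for s in step)) <= eps:
            break
    return theta, t + 1
```

For instance, `bhhh_weighted_poisson([(1.0, 0.0), (1.0, 1.0), (1.0, 2.0), (1.0, 3.0)], [1, 3, 4, 9], [1.0, 0.8, 0.5, 0.2], [0.0, 0.0])` would fit an intercept and a slope at one target location; in the GWMPIGR setting the individual gradients would instead come from Equations (12)–(15), and the initial values from the global MPIGR fit.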

The simultaneous hypothesis testing of the GWMPIGR model parameters is carried out by testing H₀: $β_{j 1} (u_{i}, v_{i}) = β_{j 2} (u_{i}, v_{i}) = \dots = β_{j k} (u_{i}, v_{i}) = \dots = β_{j p} (u_{i}, v_{i}) = 0$ versus H₁: at least one $β_{j k} (u_{i}, v_{i}) \neq 0$, where j = 1, 2, …, m; k = 1, 2, …, p; and i = 1, 2, …, n. The test statistic is:

G^{2} = 2 (ℓ (\hat{Ω}) - ℓ (\hat{ω}))

(18)

where $ℓ (\hat{Ω})$ is the log-likelihood function for the parameter set under the population, $Ω = \{β_{1} (u_{i}, v_{i}), β_{2} (u_{i}, v_{i}), \dots, β_{m} (u_{i}, v_{i}), τ (u_{i}, v_{i}); i = 1, 2, \dots, n\}$, and $ℓ (\hat{ω})$ is the log-likelihood function for the parameter set under the null hypothesis, $ω = \{β_{01 ω} (u_{i}, v_{i}), β_{02 ω} (u_{i}, v_{i}), \dots, β_{0 m ω} (u_{i}, v_{i}), τ_{ω} (u_{i}, v_{i}); i = 1, 2, \dots, n\}$.

The log-likelihood function of the GWMPIGR model for the parameter set under the population is:

\begin{array}{l} ℓ (\hat{Ω}) & = \frac{n}{2} \log (\frac{2}{π}) + \sum_{i = 1}^{n} \frac{1}{\hat{τ} (u_{i}, v_{i})} - \frac{1}{2} \sum_{i = 1}^{n} \log \hat{τ} (u_{i}, v_{i}) + \sum_{i = 1}^{n} \log (K_{s_{i}} (z_{i} (u_{i}, v_{i}))) \\ - \sum_{i = 1}^{n} (\frac{2 \sum_{j = 1}^{m} y_{i j} - 1}{4}) \log (1 + 2 \hat{τ} (u_{i}, v_{i}) \sum_{j = 1}^{m} q_{i} e^{x_{i}^{T} {\hat{β}}_{j} (u_{i}, v_{i})}) + \\ \sum_{i = 1}^{n} \sum_{j = 1}^{m} y_{i j} \log (q_{i}) + \sum_{i = 1}^{n} \sum_{j = 1}^{m} y_{i j} x_{i}^{T} {\hat{β}}_{j} (u_{i}, v_{i}) - (\sum_{i = 1}^{n} \sum_{j = 1}^{m} \log (y_{i j}!)) \end{array}

(19)

The log-likelihood function of the GWMPIGR model for the parameter set under H₀ is:

\begin{array}{l} ℓ (\hat{ω}) & = \frac{n}{{\hat{τ}}_{\hat{ω}} (u_{i}, v_{i})} + \frac{n}{2} \log (\frac{2}{π {\hat{τ}}_{\hat{ω}} (u_{i}, v_{i})}) + \sum_{i = 1}^{n} \log (K_{s_{i}} (z_{\hat{ω}} (u_{i}, v_{i}))) - \sum_{i = 1}^{n} \frac{(2 \sum_{j = 1}^{m} y_{i j} - 1)}{4} \times \\ \log (1 + 2 τ_{\hat{ω}} (u_{i}, v_{i}) \sum_{j = 1}^{m} q_{i} e^{{\hat{β}}_{j 0 \hat{ω}} (u_{i}, v_{i})}) + \sum_{i = 1}^{n} \sum_{j = 1}^{m} y_{i j} \log (q_{i}) + \sum_{i = 1}^{n} \sum_{j = 1}^{m} y_{i j} {\hat{β}}_{j 0 \hat{ω}} (u_{i}, v_{i}) - \\ \sum_{i = 1}^{n} \sum_{j = 1}^{m} \log (y_{i j}!) \end{array}

(20)

The estimates for the parameter set under H₀ are obtained using the same steps as those for estimating the parameters of the GWMPIGR model.

The G² test statistic (Equation (18)) follows a $χ_{t r a c e (R)}^{2}$ distribution, where trace(R) is the number of effective parameters for the GWMPIGR model, in which the elements of the matrix R are formulated as follows:

r_{i} = X_{i} {(X^{T} W (u_{i^{*}}, v_{i^{*}}) A_{(t)} (u_{i^{*}}, v_{i^{*}}) X)}^{- 1} X^{T} W (u_{i^{*}}, v_{i^{*}}) A_{(t)} (u_{i^{*}}, v_{i^{*}})

For the simultaneous test of the regression parameters of the GWMPIGR model, H₀ is rejected if $G^{2} > χ_{α, t r a c e (R)}^{2}$ [32,33].

This study will compare several models based on the exposure variable and the spatial weighting function used. The exposure variable in the GWMPIGR model in Equation (3) can consist of the same or different exposure levels for all of the response variables. Therefore, the best model will be selected using the corrected AIC (AICc) value:

A I C_{c} = - 2 L (θ_{M P I G R}) + 2 k^{*} + \frac{2 k^{*} (k^{*} + 1)}{n - k^{*} - 1}

(21)

where k* is the number of effective parameters in the GWMPIGR model, and $L (θ_{M P I G R})$ is the likelihood function of the GWMPIGR model (Equation (5)).
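Equation (21) translates directly into a small helper. The sketch below assumes, as is usual for information criteria, that the fit term is the maximized log-likelihood; the function name and the numbers in the usage note are made up for illustration and are not results from the paper.

```python
def aicc(log_likelihood, k_eff, n):
    """Corrected AIC as in Equation (21).

    log_likelihood: maximized log-likelihood of the fitted model (assumption),
    k_eff: effective number of parameters k*, n: number of locations.
    """
    penalty = 2.0 * k_eff + 2.0 * k_eff * (k_eff + 1.0) / (n - k_eff - 1.0)
    return -2.0 * log_likelihood + penalty
```

For example, `aicc(-100.0, 5, 38)` evaluates to 211.875. Given several candidate models fitted to the same data, the one with the smallest value would be preferred, and the correction term grows quickly as k* approaches n.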

3. Results

This section shows the fit of the GWMPIGR model on the number of infant, under-5 and maternal deaths in East Java in 2019. The data were sourced from the Public Health Office and Statistics Indonesia. The Health Office of Indonesia provides data on health indicators from the provincial and district/city levels, and these data are published regularly every year. Meanwhile, Statistics Indonesia provides data for all indicators (including health), and these data are published annually. District/city level data are provided by the Public Health Office and Statistics Indonesia for every province in Indonesia.

In this study, the data covered the 38 districts/cities of East Java Province. There were three response variables: the number of infant deaths (Y₁), the number of under-5 deaths (Y₂) and the number of maternal deaths (Y₃); five predictor variables: the percentage of active integrated service posts (X₁), the percentage of active family planning participants (X₂), the percentage of the population with BPJS health insurance (X₃), the education index (X₄) and the percentage of households that have improved sanitation (X₅); and three exposure variables sourced from the Public Health Office and Statistics Indonesia. Four models are used based on the exposure and the spatial weighting function, as presented in Table 1.

The first two GWMPIGR models only had one exposure variable, the number of live births, which was used by considering the definitions of the mortality rate for each individual group. Meanwhile, the third and fourth models use three exposure variables by considering the number of people who are at risk in each individual group. The best of the four models will be selected based on the AICc value. Then, the factors affecting the number of infant, under-5 and maternal deaths in East Java in 2019 are discussed based on the best model.

Figure 1, Figure 2 and Figure 3 are thematic maps that provide descriptions of the response variables. The number of deaths in each population is divided into four levels from the lowest to the highest number of deaths in the district/city.

Figure 1 and Figure 3 show that most of the districts/cities in East Java have a high number of infant and maternal deaths. Meanwhile, Figure 2 shows that the Probolinggo District and Surabaya City have the highest number of under-5 deaths. Furthermore, the summary statistics of the predictor variables and exposure variables can be seen in Table 2.

According to Table 2, the mean percentage of active integrated service posts in East Java in 2019 was 80.95%, with a minimum of 31%. Furthermore, the mean percentage of active family planning participants was 75.4%, with a minimum of 67.03%. This means that, in the district/city with the lowest percentage, around 33% of people do not participate in family planning. The mean percentage of the population with BPJS health insurance was 50.47%, which means that the other half of the population uses other types of health insurance. The mean of the education index is 0.628. The education index is calculated based on the expected years of schooling and the mean years of schooling. The higher the value of the education index, the better the quality of education and society, which will consequently improve the quality of life. As for the environment variable, there was a large gap in the percentage of households with improved sanitation, with the minimum percentage being 25.25%. This means that there are still many households with inadequate sanitation.

Furthermore, a multicollinearity check was carried out, since the regression analysis requires that there is no strong relationship among the predictor variables. In this study, the variance inflation factor (VIF) criterion was used to identify multicollinearity among the predictor variables. The last column in Table 2 shows that the VIF value of all of the predictor variables is lower than ten (VIF < 10). Thus, there are no multicollinearity issues among the predictor variables.

An initial examination of the relationship between the response and the predictor variables is important because it relates to the initial description of the relationships between variables found in the modeling. Based on Figure 4, the relationship between the predictor variable and the response variable seems non-linear. Log(Y₁/q₁) has the strongest correlation with X₃ (−0.23). Meanwhile, log(Y₂/q₂) has the strongest correlation with X₁ (−0.36) and log(Y₃/q₃) has the strongest correlation with X₂ (−0.29).

Several assumptions must be met before carrying out GWMPIGR modelling. Specifically, the response variables must not be Poisson distributed, and there must be spatial heterogeneity. Testing the distribution of the response variables was carried out using the Crockett test, and it was found that the response variables did not have a trivariate Poisson distribution (because there were three responses). This result is also supported by the overdispersion of the response variables. The methods used to test for overdispersion are deviance per degree of freedom (deviance/df) and the Lagrange multiplier (LM) test, the results of which are shown in Table 3.

Overdispersion exists if the value of deviance/df is greater than one and the LM method produces a value of $χ^{2} > Z_{α}$ or a p-value of <α. Based on Table 3, the response variables are overdispersed. Therefore, the GWMPIGR model can be used to model data on the number of infant, under-5 and maternal deaths in East Java in 2019. Furthermore, the spatial heterogeneity test was carried out using the Glejser test, and spatial heterogeneity was found because the test statistic G = 210.5993 is larger than $χ_{0.05; 15}^{2} = 24.996$.

The parameters of the GWMPIGR model were estimated locally using spatial weighting so that each observation location had a different parameter estimate. Spatial weighting represents the proximity between observation locations. The closer an observation location is to another observation location, the greater the weight and the greater its influence on the estimates at that location. In this study, spatial weighting was determined using a fixed Gaussian and bisquare kernel function. The optimum bandwidth, minimum CV and AICc values for each model are presented in Table 4.

Based on Table 4, model 2 has the lowest CV minimum value. However, this does not mean that model 2 is the best model. The CV value is used to determine the optimum bandwidth for the spatial weight matrix. Therefore, the CV values in Table 4 refer to the minimum CV value of the optimum bandwidth for each model. On the other hand, the AICc can be used to evaluate how well a model fits the data and to determine the best model from multiple models for the same dataset. AICc uses the MLE of the model (log-likelihood) as a measure of fit. The model that fits the data the best has the maximum likelihood. Therefore, the model with high log-likelihood has a low AICc value. Based on Table 4, model 1 has the smallest AICc value. Thus, in this study, the GWMPIGR model with the fixed Gaussian kernel spatial weighting function and the number of live births as the exposure is the best model to determine the number of infant, under-5 and maternal deaths in East Java in 2019. Therefore, only model 1 will be discussed further.

The parameter estimates of the GWMPIGR model differ for each observation location, as does the significance of the parameters. The GWMPIGR model generated 722 coefficient estimates for the 38 districts/cities in the province of East Java (19 parameters per location). Simultaneous testing of the GWMPIGR model parameters shows that at least one parameter has a significant effect on the response variable (G² = 384.17 > $\chi^{2}_{0.05;12} = 21.026$). Meanwhile, partial hypothesis testing was carried out locally on the parameters of the GWMPIGR model for each observation location. For example, the parameter estimates for the GWMPIGR model with the fixed Gaussian kernel spatial weighting function for Banyuwangi District (i = 10) and Surabaya City (i = 37) can be seen in Table 5.
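The partial test reported in Table 5 can be illustrated with a Z statistic computed as the estimate divided by its standard error and compared with the two-sided standard normal critical value. This is a sketch under that assumption, with hypothetical function names. Because Table 5 reports rounded estimates, the recomputed Z values differ slightly from the published ones, although the significance decisions are the same.

```python
from statistics import NormalDist


def wald_z(estimate: float, std_error: float) -> float:
    """Z statistic for a single local regression parameter."""
    return estimate / std_error


def is_significant(z: float, alpha: float = 0.05) -> bool:
    """Two-sided test against the standard normal distribution."""
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(z) > critical


# Coefficient of X3 on Y1 (rounded estimate, standard error) from Table 5.
z_banyuwangi = wald_z(0.008, 0.007047)
z_surabaya = wald_z(0.015, 0.001777)

print("Banyuwangi:", round(z_banyuwangi, 3), is_significant(z_banyuwangi))
print("Surabaya:", round(z_surabaya, 3), is_significant(z_surabaya))
```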

Table 5 shows that the parameter coefficient values and the parameter significance differ between Banyuwangi District and Surabaya City. Some of the predictors that were significant in Surabaya City were not significant in Banyuwangi District. Variables X₃ and X₅ did not significantly affect Y₁ in Banyuwangi District, while all of the predictors significantly affected Y₁ in Surabaya City. Variables X₁, X₃ and X₅ did not significantly affect Y₂ in Banyuwangi District, while only variable X₃ did not significantly affect Y₂ in Surabaya City. Meanwhile, similar parameter significance was found for the variable Y₃. More specifically, variable Y₃ is significantly influenced by variables X₂ and X₄ in both Banyuwangi District and Surabaya City. Because the parameter significance also differs across the other regencies/cities in the province of East Java, the regencies/cities can be formed into several regional groups according to the predictors that significantly affect the number of infant, under-5 and maternal deaths. This grouping will be discussed further in the Discussion section.

To provide an example, the GWMPIGR model for Surabaya City based on Table 5 is presented as follows:

\begin{array}{l} \hat{\mu}_{1} (u_{37}, v_{37}) = q_{37} \exp (- 4.729 + 0.010 X_{1, 37} + 0.009 X_{2, 37} + 0.015 X_{3, 37} - 3.680 X_{4, 37} - 0.003 X_{5, 37}) \\ \hat{\mu}_{2} (u_{37}, v_{37}) = q_{37} \exp (1.249 - 0.042 X_{1, 37} - 0.114 X_{2, 37} - 0.002 X_{3, 37} + 7.195 X_{4, 37} - 0.011 X_{5, 37}) \\ \hat{\mu}_{3} (u_{37}, v_{37}) = q_{37} \exp (0.059 + 0.009 X_{1, 37} - 0.123 X_{2, 37} - 0.014 X_{3, 37} + 3.653 X_{4, 37} + 0.0003 X_{5, 37}) \end{array}

The values of the regression coefficients in the model above show the magnitude of the change in the average number of infant, under-5 and maternal deaths in Surabaya due to the influence of each predictor variable. The interpretation of the model will be discussed further in the Discussion section.
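Because the model is log-linear with an exposure term, a coefficient can be read as a multiplicative effect on the mean: changing a predictor by delta multiplies the fitted mean by exp(beta × delta) when the other predictors are held fixed. The sketch below evaluates the Surabaya equations with the coefficients from Table 5. The predictor and exposure values are hypothetical placeholders (the Table 2 means), not Surabaya's actual data, and the function names are illustrative.

```python
import math

# Surabaya City (i = 37) coefficients from Table 5: intercept, then X1..X5.
SURABAYA_BETA = {
    "Y1": (-4.729, 0.010, 0.009, 0.015, -3.680, -0.003),
    "Y2": (1.249, -0.042, -0.114, -0.002, 7.195, -0.011),
    "Y3": (0.059, 0.009, -0.123, -0.014, 3.653, 0.0003),
}


def fitted_mean(q, x, beta):
    """Fitted mean q * exp(b0 + b1*x1 + ... + b5*x5) for one response."""
    eta = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return q * math.exp(eta)


def rate_ratio(beta_k, delta=1.0):
    """Multiplicative change in the mean when one predictor rises by delta."""
    return math.exp(beta_k * delta)


# Hypothetical inputs: Table 2 means, used only as placeholders.
q = 15240
x = (80.95, 75.40, 50.47, 0.628, 72.20)

mu1 = fitted_mean(q, x, SURABAYA_BETA["Y1"])
x_plus = (x[0] + 1.0,) + x[1:]  # raise X1 by one percentage point
mu1_plus = fitted_mean(q, x_plus, SURABAYA_BETA["Y1"])

print(mu1_plus / mu1, rate_ratio(0.010))  # the two ratios coincide
print(rate_ratio(-3.680, delta=0.01))     # X4 is an index, so use a 0.01 step
```

Under these assumptions, a one-point increase in X₁ raises the fitted mean of Y₁ by about 1%, while a 0.01 increase in the education index lowers it by roughly 3.6%. A full one-unit change in X₄ lies outside the observed range of the index in Table 2, so the smaller step is the more meaningful one.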

4. Discussion

Model 1’s performance for Surabaya City was interpreted based on each predictor variable. Increasing the percentage of active integrated service posts (X₁) will increase the average number of infant deaths (Y₁), reduce the average number of under-5 deaths (Y₂) and increase the average number of maternal deaths (Y₃). However, X₁ does not significantly affect the number of maternal deaths (Y₃).

Increasing the percentage of active family planning participants (X₂) will increase the average number of infant deaths (Y₁), reduce the average number of under-5 deaths (Y₂) and reduce the average number of maternal deaths (Y₃). Increasing the percentage of the population with BPJS health insurance (X₃) will increase the average number of infant deaths (Y₁), reduce the average number of under-5 deaths (Y₂) and reduce the average number of maternal deaths (Y₃). However, X₃ does not significantly affect the number of under-5 deaths (Y₂) or the number of maternal deaths (Y₃).

Increasing the education index (X₄) will reduce the average number of infant deaths (Y₁), increase the average number of under-5 deaths (Y₂) and increase the average number of maternal deaths (Y₃). Moreover, increasing the percentage of households with improved sanitation (X₅) will reduce the average number of infant deaths (Y₁), reduce the average number of under-5 deaths (Y₂) and increase the average number of maternal deaths (Y₃). However, X₅ does not significantly affect the number of maternal deaths (Y₃).

Table 6 compares the signs of the correlations with the signs of the GWMPIGR coefficients for Surabaya City. Some predictors have a sign that is consistent with the correlation, and some do not. The sign of the correlation between the number of under-5 deaths (Y₂) and the education index (X₄) or the percentage of households that have improved sanitation (X₅) is not the same as that observed in the GWMPIGR modelling. Additionally, the sign of the correlation between the number of maternal deaths (Y₃) and the percentage of the population with BPJS health insurance (X₃) is not consistent with the GWMPIGR modelling. However, X₃ is neither significantly correlated with Y₃ nor a significant predictor of Y₃. Note that the pattern of the relationship between the response variable and the predictor variable is different for each location.

Furthermore, the results of the parameter testing for the GWMPIGR model 1 produced several regional groups based on significant predictors, which are presented in Table 7.

There are three regional groups according to the significant predictors for the number of infant deaths and for the number of under-5 deaths, and two regional groups for the number of maternal deaths. For the number of infant deaths (Y₁), 16 districts/cities in East Java are affected by all of the predictor variables; 21 districts/cities are affected by the percentage of active integrated service posts (X₁), the percentage of active family planning participants (X₂), the percentage of the population with BPJS health insurance (X₃) and the education index (X₄); and only one district/city is affected by the percentage of active integrated service posts (X₁), the percentage of active family planning participants (X₂) and the education index (X₄).

For the number of under-5 deaths (Y₂), 11 districts/cities in East Java are affected by the percentage of active integrated service posts (X₁), the percentage of active family planning participants (X₂), the education index (X₄) and the percentage of households that have improved sanitation (X₅); 26 districts/cities are affected by the percentage of active integrated service posts (X₁), the percentage of active family planning participants (X₂) and the education index (X₄); and only one district/city is affected by the percentage of active family planning participants (X₂) and the education index (X₄). Moreover, for the number of maternal deaths (Y₃), 36 districts/cities in East Java are affected by the percentage of active integrated service posts (X₁), the percentage of active family planning participants (X₂) and the education index (X₄), and two districts/cities are affected by the percentage of active family planning participants (X₂) and the education index (X₄). Visually, the groupings are presented in the thematic maps shown in Figure 5, Figure 6 and Figure 7.
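The grouping step described above amounts to collecting, for each district/city, the set of predictors whose local Z value is significant and then pooling locations that share the same set. The sketch below illustrates this for Y₁ using only the two locations whose Z values are published in Table 5; the function name and data layout are hypothetical, and the remaining 36 locations would be handled in the same way.

```python
from collections import defaultdict
from statistics import NormalDist

# Local Z values for the Y1 coefficients, copied from Table 5.
Z_Y1 = {
    10: {"X1": 2.689, "X2": 3.397, "X3": 1.110, "X4": -443.894, "X5": 0.175},
    37: {"X1": 8.738, "X2": 5.119, "X3": 8.597, "X4": -59.976, "X5": -2.139},
}


def group_by_significant(z_by_district, alpha=0.05):
    """Map each tuple of significant predictors to the districts sharing it."""
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    groups = defaultdict(list)
    for district, z_values in sorted(z_by_district.items()):
        key = tuple(name for name, z in z_values.items() if abs(z) > critical)
        groups[key].append(district)
    return dict(groups)


for predictors, districts in group_by_significant(Z_Y1).items():
    print(", ".join(predictors), "->", districts)
```

With these inputs, Banyuwangi (i = 10) falls in the group with X₁, X₂ and X₄, and Surabaya (i = 37) in the group with all five predictors.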

Based on Figure 5, Figure 6 and Figure 7, the GWMPIGR modeling forms several regional groups. Each group is assumed to have the same characteristics, so the predictors that significantly affect each event are also the same. Thus, this will make it easier for the government to make policies to decrease the number of infant, under-5 and maternal deaths. For example, based on Figure 7, the percentage of active family planning participants and the education index affect the number of maternal deaths in the Pamekasan and Sumenep districts. Therefore, those two districts should focus on increasing the number of people using family planning services and on improving the quality of education. The same applies to other regional groups.

5. Conclusions

The GWMPIGR model is a point-based spatial regression model in which the parameter estimator is influenced by location effects. In this study, four GWMPIGR models were constructed based on the exposure and on the spatial weighting function, either the fixed Gaussian kernel function or the fixed bisquare kernel function. Based on the AICc value, the GWMPIGR model with the fixed Gaussian kernel weighting function and the number of live births as the exposure was better at modeling data on the number of infant, under-5 and maternal deaths in East Java in 2019. Several regional groups in East Java were formed based on predictors that significantly affect each event. However, further research, such as simulation studies, is needed to evaluate the proposed method.

Author Contributions

Conceptualization, S.M., P.P., J.D.T.P. and D.D.P.; methodology, P.P. and J.D.T.P.; software, S.M. and D.D.P.; validation, P.P. and D.D.P.; formal analysis, S.M. and D.D.P.; data curation, S.M.; writing (original draft preparation), S.M.; writing (review and editing), P.P., J.D.T.P. and D.D.P.; supervision, P.P., J.D.T.P. and D.D.P.; project administration, P. All authors have read and agreed to the published version of the manuscript.

Funding

This research and the APC were funded by the Ministry of Education and Culture (Kemendikbud) of the Republic of Indonesia with grant number 969/PKS/ITS/2021.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

All authors thank the editor and reviewers for providing helpful comments and suggestions to improve this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. Distribution of the number of infant deaths in East Java in 2019.

Figure 2. Distribution of the number of under-5 deaths in East Java in 2019.

Figure 3. Distribution of the number of maternal deaths in East Java in 2019.

Figure 4. The matrix plot of the response variable and predictor variables.

Figure 5. Regional groups based on factors that influence the number of infant deaths in East Java in 2019.

Figure 6. Regional groups based on factors that influence the number of under-5 deaths in East Java in 2019.

Figure 7. Regional groups based on factors that influence the number of maternal deaths in East Java in 2019.

Table 1. Model Specifications.

Model	Number of Exposures	Exposure	Spatial Weighting Function
1	1	q: The number of live births	Gaussian Kernel fixed function
2	1	q: The number of live births	Bisquare Kernel fixed function
3	3	q₁: The number of live births; q₂: The number of people aged 1-4 years old; q₃: The number of pregnant women	Gaussian Kernel fixed function
4	3	q₁: The number of live births; q₂: The number of people aged 1-4 years old; q₃: The number of pregnant women	Bisquare Kernel fixed function

Table 2. Summary statistics of predictor variables and exposure variables.

Variable	Mean	Standard Deviation	Minimum	Maximum	VIF
The percentage of active integrated service posts (X₁)	80.95	14.91	31.00	98.00	1.871
The percentage of active family planning participants (X₂)	75.40	3.39	67.03	82.97	1.186
The percentage of the population with BPJS health insurance (X₃)	50.47	10.70	32.83	74.87	1.998
Education index (X₄)	0.628	0.075	0.490	0.770	2.999
The percentage of households that have improved sanitation (X₅)	72.20	18.87	25.51	97.43	3.104
The number of live births (q₁)	15,240	10,370	2012	44,378
The number of under-5 population (q₂)	14,787	10,008	2079	41,646
The number of pregnant women (q₃)	59,252	39,641	8013	168,060

Table 3. Overdispersion test.

Variable	Deviance/df	LM χ²	LM p-Value
Y₁	38.78	211,028.5	0.000 *
Y₂	8.13	1336.02	0.000 *
Y₃	5.44	3532.65	0.000 *

* Significant at α = 5%.

Table 4. Optimum bandwidth, minimum CV and AICc for each model.

Model	Optimum Bandwidth	Minimum CV	AICc
1	0.975566	505,601.5	64,309.95
2	2.043404	496,850.6	64,445.95
3	0.880499	529,872.7	74,930.46
4	1.834768	524,106.5	74,962.95

Note: Model 1, which has the smallest AICc, is the best model.

Table 5. Parameter estimates of the GWMPIGR model with the fixed Gaussian kernel spatial weighting function for Banyuwangi District (i = 10) and Surabaya City (i = 37).

Parameter	Estimate (Banyuwangi, i = 10)	Standard Error (Banyuwangi, i = 10)	Z Value (Banyuwangi, i = 10)	Estimate (Surabaya, i = 37)	Standard Error (Surabaya, i = 37)	Z Value (Surabaya, i = 37)
$β_{01} (u_{i}, v_{i})$	−4.828	0.007138	−676.437 *	−4.729	0.074794	−63.238 *
$β_{11} (u_{i}, v_{i})$	0.007	0.002722	2.689 *	0.010	0.001115	8.738 *
$β_{21} (u_{i}, v_{i})$	0.015	0.004518	3.397 *	0.009	0.001725	5.119 *
$β_{31} (u_{i}, v_{i})$	0.008	0.007047	1.110	0.015	0.001777	8.597 *
$β_{41} (u_{i}, v_{i})$	−3.686	0.008304	−443.894 *	−3.680	0.061363	−59.976 *
$β_{51} (u_{i}, v_{i})$	0.0006	0.003648	0.175	−0.003	0.001202	−2.139 *
$β_{02} (u_{i}, v_{i})$	1.242	0.00081	1532.867 *	1.249	0.005383	232.158 *
$β_{12} (u_{i}, v_{i})$	−0.025	0.015424	−1.595	−0.042	0.003436	−12.198 *
$β_{22} (u_{i}, v_{i})$	−0.159	0.037767	−4.225 *	−0.114	0.007923	−14.398 *
$β_{32} (u_{i}, v_{i})$	0.048	0.062046	0.769	−0.002	0.010041	−0.176
$β_{42} (u_{i}, v_{i})$	7.193	0.000808	8902.213 *	7.195	0.005621	1280.048 *
$β_{52} (u_{i}, v_{i})$	−0.020	0.018594	−1.083	−0.011	0.003638	−2.918 *
$β_{03} (u_{i}, v_{i})$	0.046	0.00162	28.508 *	0.059	0.009337	6.349 *
$β_{13} (u_{i}, v_{i})$	0.014	0.014956	0.962	0.009	0.006989	1.405
$β_{23} (u_{i}, v_{i})$	−0.142	0.020068	−7.092 *	−0.123	0.006156	−19.954 *
$β_{33} (u_{i}, v_{i})$	0.023	0.028089	0.826	−0.014	0.01074	−1.298
$β_{43} (u_{i}, v_{i})$	3.652	0.001021	3574.817 *	3.653	0.007974	458.116 *
$β_{53} (u_{i}, v_{i})$	−0.010	0.018	−0.569	0.0003	0.00712	0.004
$τ (u_{i}, v_{i})$	41.709	0.000228	182,747 *	41.706	0.002148	19,415.9 *

* Significant at α = 5%.

Table 6. The comparison of signs between correlation and regression parameters coefficient of the GWMPIGR model for Surabaya City.

Response Variable	Predictor	Cor (Y/q, X)	GWMPIGR: X₁	GWMPIGR: X₂	GWMPIGR: X₃	GWMPIGR: X₄	GWMPIGR: X₅
Y₁	X₁	+ *	+ *
	X₂	+		+ *
	X₃	+			+ *
	X₄	- *				- *
	X₅	-					- *
Y₂	X₁	- *	- *
	X₂	-		- *
	X₃	-			-
	X₄	-				+ *
	X₅	+					- *
Y₃	X₁	+	+ *
	X₂	-		- *
	X₃	+			-
	X₄	+				+ *
	X₅	+ *					+

* Significant at α = 5%; + Positive relationship; - Negative relationship.

Table 7. Regional groups based on significant predictors by the GWMPIGR model 1.

Response Variable	Number of Groups	District/City Order Number	Total Number District/City	Significant Predictor
Y₁	3	6–8, 14–18, 25, 30, 32–35, 37–38	16	X₁, X₂, X₃, X₄, X₅
		1–5, 9, 11–13, 19–24, 26–29, 31, 36	21	X₁, X₂, X₃, X₄
		10	1	X₁, X₂, X₄
Y₂	3	1–3, 9, 11–12, 19–21, 29, 36	11	X₁, X₂, X₄, X₅
		4–8, 13–18, 22–28, 30–35, 37–38	26	X₁, X₂, X₄
		10	1	X₂, X₄
Y₃	2	1–27, 30–38	36	X₁, X₂, X₄
		28–29	2	X₂, X₄

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
