1. Introduction
Sustainable land use and transportation planning often requires a close investigation of human flow dynamics, including how people move in daily life and change their locations of residence and, more importantly, how (and why) these movement patterns change over time. Given their importance in understanding the growth, decline, and transformation of our cities/regions, the dynamics of human flows have drawn growing attention in various strands of academic literature. For instance, numerous studies in urban planning, geography, and other social and behavioral sciences have been devoted to revealing the dynamics of migration and to understanding how our cities/regions have been reshaped by in- and out-flows of population (see, e.g., [1,2,3,4,5,6]). There is a similarly large number of theoretical and empirical studies on commuting and other types of urban travel, including recent research on emerging travel patterns in contemporary cities in which advanced information and communication technologies play critical roles and various mode choice options are increasingly available (see, e.g., [7,8,9,10,11,12,13]).
In the literature concerning such human flow dynamics, increasing efforts have been made to better understand the complex, reciprocal interactions between residential relocation and commuting patterns. Recent studies have challenged the conventional, unidirectional view of the linkage, namely that commuting patterns are determined by residential relocation, which redistributes the origins of commuting flows over space, and have suggested that the potential influence of commuting on household relocation (i.e., the reverse linkage) would be equally important. For instance, van Ommeren et al. [14,15] provided a theoretical framework that enables researchers to better characterize and analyze relocation dynamics with careful consideration of the importance of commuting in individuals’ search processes. Kan’s [16,17] examination of job and residential location choice also emphasized the critical role of commuting in the interplay between job and residential choices as well as the intrinsic uncertainties involved in the decision making.
However, our knowledge base regarding the residential relocation–commuting nexus is still limited. (This does not mean that there has been no effort to develop analytical frameworks in which the two critical human flow dynamics are examined together. Integrated urban system models shed light on the causal connections between residential relocation and commuting dynamics with a focus on land use-transportation interactions [18,19,20].) In particular, much remains to be learned about how these two critical human flow dynamics are actually interrelated and how this relationship has been evolving over time, given that rapid advancements in various technologies have reshaped the way our cities/regions function. Furthermore, there has been little effort to ascertain under what circumstances the reciprocal interactions between residential relocation and commuting tend to be more apparent.
This study attempts to fill this gap and advance our knowledge about the critical interrelationship between residential relocation and commuting through an empirical investigation of the three largest metropolitan areas in the state of California: the San Francisco, Sacramento, and Southern California (covering Los Angeles and San Diego) regions. More specifically, we present a matrix forecasting experiment designed to test some alternative approaches, in which the interrelationship is taken into account to capture their influences on one another, and compare them with the conventional, separate flow matrix forecasting. Through the experiment, we expect to reveal how residential relocation and commuting patterns are interconnected with each other in the regions and examine whether explicit consideration of the interrelationship can help us improve forecasting performance and accomplish a more effective analysis of complex human flow dynamics.
2. Residential Relocation and Commuting: Interlinkages
In the field of urban planning and associated literature, it has long been suggested that residential relocation and commuting are interconnected with each other. However, traditional research (and planning practice) has often focused on how residential relocation (or migration) can shape commuting, but not the other way around. For instance, in conventional four-step travel demand modeling and analysis, household relocation flows have been regarded as an exogenous factor that should be taken into account in the very first stage of the analysis of commuting (e.g., trip generation). By contrast, commuting has typically been assumed as an outcome of dynamic redistribution of people and jobs over space rather than as a determinant that can influence relocation dynamics simultaneously.
Such a unidirectional characterization of the relation between residential relocation and commuting has increasingly been challenged by recent studies that recognize the importance of commuting in various household decisions, including where to live, a choice with significant implications for sustainable development. As urban economists have suggested, in making their residential location choice, individual households confront a trade-off between two costs, those associated with commuting and those associated with housing, which are likely to be considered simultaneously. Given the trade-off, the decision would be made with the aim of maximizing utility, considering resistance against both traveling (commuting) and moving (residential relocation). Geographers have also recognized the important role of commuting in determining the geographic scope of residential and job locations in urban spaces [21,22]. However, by mainly focusing on the time-invariant geographic distance, earlier studies tended to pay little attention to the dynamics of urban spatial structure with mobile labor and mobile businesses.
Zax’s [23] research on moving and quitting, two possible reactions of workers to the relocation of their employer, suggested that a change in commuting (caused by the employer’s relocation) can influence residential moves. In a later empirical study, Zax and Kain [24] analyzed how the probabilities of workers’ residential moves differed between two groups: those who had to travel longer distances (i.e., losers) and those who had to travel shorter distances (i.e., gainers) due to the relocation of their employer. The authors found that the losers were more likely to move than their counterparts, suggesting that an increase in commuting time can elevate the probability of residential relocation. A similar point was reported by Levinson and Kumar [25], who contended that during rapid suburbanization, rational decision makers may relocate both residence and workplace with the aim of optimizing commuting time.
van Ommeren et al. [14,15,26] investigated the interrelationships between residential mobility and labor market mobility in a search-theoretic framework and found an interrelated, forward-looking decision-making process in which commuting plays a critical role. According to the authors, commuting is undoubtedly determined by residential and job location choice, but at the same time has an effect on location choices. More specifically, commuting can alter the probability of accepting an offer in the process of a new job or residence search; for instance, when the commuting distance is long, there can be a larger degree of desire to reduce it, and this can increase residential or job mobility.
The critical role of commuting in shaping location choices has also been recognized in Kan’s [16,17] research on the dynamics of residential location changes. As in van Ommeren et al. [14,15,26], commuting cost is viewed in his analyses as a significant factor that can motivate or discourage both residential moves and job changes. Clark et al. [27] analyzed the Puget Sound Transportation Panel survey dataset and reported a systematic linkage between commuting distance/time and residential location change, indicating “rational behavior of reducing the commute distance and time with greater separation, … [and] the importance of a critical isochrone, in this case about 8 miles, beyond which the likelihood of decreasing the distance to work grows rapidly” (p. 218). More recently, the way people view commuting has changed with technological advancements that allow them to accomplish different types of tasks while on the move. For instance, Bissell investigated the complexity of urban mobility with a focus on how the changing nature of commuting can transform our cities [28]. Additionally, empirical studies of European countries demonstrated a shift in commuting patterns due to the changing nature of jobs and shed some light on how urban planners can deal with the ongoing change [29,30].
The above studies have suggested that it is not desirable to characterize the residential relocation–commuting nexus as a unidirectional relation or to analyze them separately. Commuting can have a sizable influence on the residential location choices of households in the sense that rational households may seek to avoid a long-distance commute, which generates substantial costs or disamenities that can exceed the benefits they can enjoy while living in a location far from their workplace. This mechanism may have contributed to preventing commuting distance/time from increasing dramatically in a rapidly growing metropolitan region, while more efforts (or interventions) are needed to create a sustainable metropolitan spatial structure.
However, in reality, the way commuting actually affects residential relocation would not be that straightforward. Local land use control, uneven distribution of housing and employment opportunities, and other contextual factors can alter the visible pattern of the relation between commuting and residential relocation dynamics (see, e.g., [31,32]). Furthermore, in a metropolitan area in which numerous relocation flows are systematically interconnected with each other through housing and job vacancy chains, the impact of commuting on household relocation is hardly determinate. Rapid advancements in information and transportation technologies can also reshape the relation by alleviating the costs of commuting a certain distance (see, e.g., [33,34]). The smart city movement and new work/life styles emerging in contemporary metropolises based on cutting-edge communication technologies can also defy the traditional patterns of residential relocation–commuting interactions (see, e.g., [10,35]).
3. Study Areas and Data
How are residential relocation and commuting dynamics associated with each other in reality, particularly in large metropolitan areas? Are there any systematic patterns in the interrelationship? Additionally, can we understand relocation and commuting dynamics more effectively by reflecting the interrelation in our investigation of the dynamics? As noted earlier, in this study, we investigate the following three regions anchored by the largest metropolitan areas in the state of California to address these questions: (1) San Francisco Region, (2) Sacramento Region, and (3) Southern California Region (Figure 1).
The San Francisco region comprises the following twelve counties around the Bay Area: Alameda, Contra Costa, Marin, Napa, San Benito, San Francisco, San Joaquin, San Mateo, Santa Clara, Santa Cruz, Solano, and Sonoma counties. The Sacramento region represents a seven-county area, surrounding the state capital—Sacramento, CA—and consists of El Dorado, Nevada, Placer, Sacramento, Sutter, Yolo, and Yuba counties. Finally, in this study, Southern California refers to a seven-county region encompassing multiple metropolitan statistical areas (including both Los Angeles and San Diego): Imperial, Los Angeles, Orange, Riverside, San Bernardino, San Diego, and Ventura counties. As shown in the figure, densely populated counties are along the coastal line for the San Francisco and Southern California regions, whereas Sacramento County, the home of the state capital that shares borders with the San Francisco region, is the most densely populated county in the Sacramento metropolitan area. Jobs tend to be concentrated in these counties, while variation exists.
For US states and counties, the Internal Revenue Service’s (IRS) Statistics of Income (SOI) division provides annual migration data that describe place-to-place residential relocation flows. The IRS migration data date back to 1983, and in the absence of alternative annual datasets, the IRS is known to be the best available source to facilitate various analyses of dynamic intercounty migration patterns over a reasonably extensive timespan [36]. Although the dataset tracks county-level spatial flows of individuals and households in a comprehensive manner, it is still far from perfect in a few respects. First, there is a temporal mismatch between migration year and tax year. The actual period of the annual migration data collection runs from the tax filing date (15 April) of the previous year to that of the current year, and this does not necessarily match other datasets provided on a calendar-year basis. Second, the IRS migration data do not cover all population and household flows, because not all individuals file a tax return every year. While the coverage rate is known to be approximately 90% of the population, three groups (the poor, the very wealthy, and the retired) tend to have lower tax filing rates [37]. Accordingly, the data tend to underrepresent the migration flows of these groups. Even with these shortcomings, the IRS migration dataset has been widely used in a broad range of applied research (see, e.g., [38,39,40]) and is employed in this study, as it is known to be the most reliable data source to capture residential relocation flows of households on an annual basis.
The data for commuting, the other important type of human flows, are from the U.S. Census, specifically the Longitudinal Employer-Household Dynamics (LEHD) Origin-Destination Employment Statistics. The US Census’ Center for Economic Studies operates the LEHD program, which releases data on employers and employees through its partnerships with other agencies. The program utilizes information related to unemployment insurance earnings as well as quarterly employment and wage data provided by state organizations to construct data files concerning the spatial distribution of population, employment, and their interactions. The Origin-Destination statistics present detailed patterns of commuting flows considering job locations and residential locations. Using the LEHD data, previous studies mainly focused on analyzing job-to-job flows taking into account business cycles [41] and estimating the rates of job-to-job and job-to-employment status flows for missing states [42]. Others studied workers’ transition patterns among employers (and more broadly among industries) with consideration of workers’ demographic characteristics (see, e.g., [43,44]). Our study utilizes the information about the annual commuting flows of employees that can be extracted from the LEHD database.
Using the two data sources (i.e., IRS and LEHD datasets), county-level household relocation and commuting flow information can be compiled for each study area and organized in matrix form. Table 1 and Table 2 show the San Francisco region’s matrices (year 2010), in which each value indicates the size of the residential relocation or commuting flow from an origin (O) to a destination (D) county. Regarding the residential relocation pattern, while the majority of people do not move across county boundaries within a given one-year time window (and thus fall on the diagonal of the matrix), many off-diagonal origin-destination (OD) pairs also have more than a thousand households, such as from Alameda County to Contra Costa, San Francisco, and Santa Clara counties, suggesting that the size of relocation flows between these counties is considerable. The inter-county flows are even more substantial in terms of the commuting pattern, involving a few OD pairs with more than 50,000 commuters.
A more important point to be stressed is the association between residential relocation and commuting. On the surface, the magnitude of residential relocation and that of commuting are highly correlated, as demonstrated in Figure 2, in which each OD pair’s household relocation and commuting flow shares (obtained by excluding the diagonal elements and then performing row-normalization) are plotted. Although the correlation does not necessarily imply any causality, the revealed pattern appears to suggest that a connection may exist between residential relocation and commuting and that the nexus between the two types of human flows deserves empirical investigation.
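The share calculation described above (dropping the diagonal and then row-normalizing) and the resulting correlation can be illustrated with a minimal sketch. The function names and the 3 × 3 flow counts below are hypothetical and are not taken from Table 1 or Table 2; they only show the mechanics.

```python
import math

def off_diagonal_shares(flows):
    """Zero the diagonal, then divide each row by its off-diagonal total."""
    shares = []
    for i, row in enumerate(flows):
        total = sum(v for j, v in enumerate(row) if j != i)
        shares.append([0.0 if j == i or total == 0 else v / total
                       for j, v in enumerate(row)])
    return shares

def off_diagonal(matrix):
    """Flatten a square matrix into a list of its off-diagonal cells."""
    return [v for i, row in enumerate(matrix)
            for j, v in enumerate(row) if i != j]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical origin-by-destination counts (illustrative values only).
relocation = [[9000, 300, 100], [200, 8000, 200], [50, 150, 7000]]
commuting = [[90000, 7000, 3000], [5000, 80000, 5000], [1000, 3000, 70000]]

reloc_shares = off_diagonal_shares(relocation)
comm_shares = off_diagonal_shares(commuting)
r = pearson(off_diagonal(reloc_shares), off_diagonal(comm_shares))
```

Each row of the resulting share matrices sums to one over the off-diagonal destinations, so `r` compares destination shares rather than raw flow volumes.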
4. Methodology
As noted earlier, we conduct a matrix forecasting experiment to explore the linkages between residential relocation and commuting flows in the three study regions and to take the linkages into account to better describe and forecast the evolution of the two interrelated human flow dynamics. Matrix forecasting has become an integral part of a wide range of analyses that lead to a more complete understanding of complex system behaviors, including dynamic relationships among multiple sectors (e.g., input-output matrix, representing inter-industry linkages through supply chains) and the transformation of a system from one state to another (e.g., Markov chain matrix, showing land use conversion over a period of time). Our experiment employs and compares two sets of matrix forecasting approaches: (i) traditional matrix forecasting methods and (ii) alternative models in which the interrelationship between migration (residential relocation) and commuting is taken into account. More specifically, the following ten (five traditional and five alternative) approaches have been tested to determine how the consideration of the interrelationship can improve the forecasting performance and/or analytical capabilities of the models. Table 3 summarizes the ten matrix forecasting approaches tested.
Approach 1. Most recent matrix: According to Rogerson and Plane [45], “a common method of modeling temporal change in spatial system is to ignore it. This is clearly evident in the widespread use of Markov models” (p. 148). For many socio-economic analyses in which migration and/or commuting are involved (especially when just a single set of historical observations is available), it is typically assumed that the future migration or commuting pattern will be constant over time and the same as the most recently observed pattern.
Approach 2. Historical average: Another simple traditional method, which also ignores temporal changes (similar to the first method but not completely), is to take the average of available historical matrices.
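The two baseline methods above can be written in a few lines. The following is a minimal sketch in plain Python; the function names and the two small share matrices are hypothetical illustrations, not data from the study.

```python
def most_recent(history):
    """Approach 1: reuse the latest observed matrix as the forecast."""
    return [row[:] for row in history[-1]]

def historical_average(history):
    """Approach 2: element-wise mean of all observed matrices."""
    n_years = len(history)
    return [
        [sum(cells) / n_years for cells in zip(*rows)]
        for rows in zip(*history)
    ]

# Hypothetical row-normalized share matrices for two consecutive years.
year1 = [[0.0, 0.6, 0.4], [0.5, 0.0, 0.5], [0.2, 0.8, 0.0]]
year2 = [[0.0, 0.7, 0.3], [0.4, 0.0, 0.6], [0.3, 0.7, 0.0]]
history = [year1, year2]

forecast_recent = most_recent(history)
forecast_average = historical_average(history)
```

Both forecasts are constant over the forecast horizon: the same matrix is used for every future year.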
Approach 3. First-order lag model: The third approach tested is a simple lag model discussed by Plane ([46], pp. 452–455). Here, the future flow matrix pattern is formulated using two sets of previous observations, as below.
$$F_{t+1} = F_t + \alpha \, (F_t - F_{t-1})$$
where,
$F_t$ = flow matrix in year $t$
$\alpha$ = first-order trend coefficient
The first-order trend coefficient ($\alpha$), which is to be estimated, captures the flow pattern change over time. If $\alpha$ = 0, this model is identical to the first method, using the most recent observation for future flow matrices.
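As a rough illustration of the lag model, the sketch below assumes the additive form "latest matrix plus a trend coefficient times the latest year-to-year change". That functional form, the function name, and the sample matrices are assumptions made for illustration; the exact specification should be taken from Plane [46].

```python
def first_order_lag(prev, curr, alpha):
    """Forecast next year's matrix as curr + alpha * (curr - prev), cell by cell."""
    return [
        [c + alpha * (c - p) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]

# Hypothetical share matrices for years t-1 and t.
prev = [[0.0, 0.6, 0.4], [0.5, 0.0, 0.5], [0.2, 0.8, 0.0]]
curr = [[0.0, 0.7, 0.3], [0.4, 0.0, 0.6], [0.3, 0.7, 0.0]]

no_trend = first_order_lag(prev, curr, alpha=0.0)    # same as the most recent matrix
full_trend = first_order_lag(prev, curr, alpha=1.0)  # extrapolates the last change
```

With a coefficient of zero the forecast collapses to the most recent matrix, which matches the remark in the text about the relation to approach 1.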
Approach 4. Causative change matrix approach: This method introduces one or more $n \times n$ causative matrices, describing the transition of a flow matrix (see, e.g., [45,47,48]). Either a right-side or left-side causative change matrix with destination- and origin-based implications, respectively, can be adopted. Alternatively, to avoid the arbitrary choice, both right-side and left-side causative change matrices can be used at the same time, although this increases the computational burden. Simply, the left-side causative change matrix can be formulated as follows.
$$F_{t+1} = A \, F_t$$
where,
$F_t$ = flow matrix in year $t$
$A$ = $n \times n$ left-side causative change matrix, explaining the transition
It should be noted that we have also tested the right-side causative change matrix; however, we have not quite identified a distinct forecasting result, although differences in terms of estimation outcomes and forecasting performance do exist. Therefore, in this paper, we do not present the right-side causative change matrix as a separate traditional approach. Furthermore, we found that a double causative change matrix could generate a relatively poorer forecasting outcome in this experimental setting, perhaps due to the short time span with the available data for the estimation. For this reason, we also decided not to include the approach separately.
Approach 5. Bi-causative matrices approach: The last traditional approach included in our experiment is the use of two bi-causative matrices to explain the temporal change of a flow pattern. This method, presented by de Mesnard [49], uses diagonal matrices to measure and project the structural changes, as follows.
$$F_{t+1} = L \, F_t \, R$$
where,
$F_t$ = flow matrix in year $t$
$L$ and $R$ are bi-causative $n \times n$ diagonal matrices where all non-diagonal cells are assumed to be equal to zero.
The main advantage of this approach, compared to the traditional causative approaches, is a reduced number of parameters to be estimated. In other words, while each causative matrix (e.g., the left-side matrix in our approach 4) bears $n^2$ unknown parameters, each bi-causative matrix (i.e., each of the two diagonal matrices) just has $n$ parameters (only on the diagonal), so the computation and interpretation can be more straightforward and convenient. For more detailed explanations, please refer to de Mesnard [49,50] or Wan et al. [51].
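The parameter saving can be made concrete with a small sketch. It assumes that the bi-causative forecast pre- and post-multiplies the flow matrix by two diagonal matrices, so each cell (i, j) is scaled by one origin factor and one destination factor. The function names and numeric values are hypothetical.

```python
def bicausative_forecast(flow, left_diag, right_diag):
    """Scale cell (i, j) by left_diag[i] (origin) and right_diag[j] (destination).

    Equivalent to diag(left_diag) @ flow @ diag(right_diag).
    """
    return [
        [left_diag[i] * value * right_diag[j] for j, value in enumerate(row)]
        for i, row in enumerate(flow)
    ]

def parameter_counts(n):
    """Unknowns in one full causative matrix vs. a pair of diagonal matrices."""
    return {"causative": n * n, "bi_causative_pair": 2 * n}

# Hypothetical share matrix and diagonal factors.
flow = [[0.0, 0.6, 0.4], [0.5, 0.0, 0.5], [0.2, 0.8, 0.0]]
unchanged = bicausative_forecast(flow, [1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
scaled = bicausative_forecast(flow, [2.0, 1.0, 1.0], [1.0, 0.5, 1.0])

# The San Francisco region in this study has twelve counties.
counts = parameter_counts(12)
```

For a twelve-county region, one full causative matrix has 144 unknowns, whereas the two diagonal matrices together have 24.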
Approach 6: The first alternative approach assumes one of the simplest forms of the joint determination of migration and commuting patterns: both migration and commuting at $t$ are influenced by their own state and the counterpart at $t - 1$ in a linear fashion, as follows.
$$M_t = \alpha_1 M_{t-1} + \beta_1 C_{t-1}$$
$$C_t = \alpha_2 C_{t-1} + \beta_2 M_{t-1}$$
where,
$M_t$ = migration flow matrix in year $t$
$C_t$ = commuting flow matrix in year $t$
$\alpha_k$ and $\beta_k$ ($k$ = 1, 2) represent scalar coefficients to be estimated.
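A one-step forecast under this joint linear structure can be sketched as follows. The coefficient names (`own`, `cross`), the function name, and the numeric values are hypothetical; the sketch only assumes that each flow matrix is a scalar-weighted sum of its own lag and the lag of the counterpart.

```python
def linear_step(own_lag, other_lag, own, cross):
    """Return own * own_lag + cross * other_lag, cell by cell."""
    return [
        [own * a + cross * b for a, b in zip(own_row, other_row)]
        for own_row, other_row in zip(own_lag, other_lag)
    ]

# Hypothetical row-normalized matrices for year t-1.
migration = [[0.0, 0.6, 0.4], [0.5, 0.0, 0.5], [0.2, 0.8, 0.0]]
commuting = [[0.0, 0.7, 0.3], [0.5, 0.0, 0.5], [0.25, 0.75, 0.0]]

# Illustrative coefficients: a dominant own-lag term and a small cross term.
migration_next = linear_step(migration, commuting, own=0.98, cross=0.02)
commuting_next = linear_step(commuting, migration, own=0.98, cross=0.02)
```

Setting the cross coefficient to zero and the own-lag coefficient to one reproduces the most-recent-matrix baseline, which makes the traditional and alternative approaches easy to compare.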
Approach 7: As discussed in the previous section, migration from area $i$ to $j$ can be associated with commuting from $j$ to $i$, and vice versa. The second alternative approach (i.e., approach 7) takes this possibility into account by slightly modifying the approach 6 model, as shown below.
$$M_t = \alpha_1 M_{t-1} + \gamma_1 C'_{t-1}$$
$$C_t = \alpha_2 C_{t-1} + \gamma_2 M'_{t-1}$$
where,
$M_t$ = migration flow matrix in year $t$
$M'_t$ = transposed migration flow matrix in year $t$
$C_t$ = commuting flow matrix in year $t$
$C'_t$ = transposed commuting flow matrix in year $t$
$\alpha_k$ and $\gamma_k$ ($k$ = 1, 2) represent scalar coefficients to be estimated.
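Approach 8: This approach simply combines the above two alternative approaches by considering both normal and transposed forms of the counterpart in each model as follows.
$$M_t = \alpha_1 M_{t-1} + \beta_1 C_{t-1} + \gamma_1 C'_{t-1}$$
$$C_t = \alpha_2 C_{t-1} + \beta_2 M_{t-1} + \gamma_2 M'_{t-1}$$
where,
$M_t$ = migration flow matrix in year $t$
$M'_t$ = transposed migration flow matrix in year $t$
$C_t$ = commuting flow matrix in year $t$
$C'_t$ = transposed commuting flow matrix in year $t$
$\alpha_k$, $\beta_k$, and $\gamma_k$ ($k$ = 1, 2) represent scalar coefficients to be estimated.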
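The role of the transposed counterpart can be seen in a short sketch. It assumes that the forecast is a scalar-weighted sum of the flow's own lag, the counterpart's lag, and the transposed counterpart's lag, so that cell (i, j) of migration is linked to commuting from j to i. All names and values are hypothetical.

```python
def transpose(matrix):
    """Swap origins and destinations."""
    return [list(column) for column in zip(*matrix)]

def combined_step(own_lag, other_lag, own, cross, cross_t):
    """Return own*own_lag + cross*other_lag + cross_t*transpose(other_lag)."""
    other_t = transpose(other_lag)
    n = len(own_lag)
    return [
        [own * own_lag[i][j] + cross * other_lag[i][j] + cross_t * other_t[i][j]
         for j in range(n)]
        for i in range(n)
    ]

# Hypothetical row-normalized matrices for year t-1.
migration = [[0.0, 0.6, 0.4], [0.5, 0.0, 0.5], [0.2, 0.8, 0.0]]
commuting = [[0.0, 0.7, 0.3], [0.5, 0.0, 0.5], [0.25, 0.75, 0.0]]

# cross_t = 0 gives the approach 6 structure; cross = 0 gives approach 7.
like_approach_6 = combined_step(migration, commuting, 0.98, 0.02, 0.0)
like_approach_7 = combined_step(migration, commuting, 0.98, 0.0, 0.02)
like_approach_8 = combined_step(migration, commuting, 0.97, 0.02, 0.01)
```

Writing the combined form once and switching terms off with zero coefficients is a design convenience of this sketch, not a description of how the original models were implemented.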
Approach 9: The ninth approach utilizes the bi-causative approach and then considers the relationship between migration and commuting as in approach 8. In this formulation, the bi-causative method is embedded along with a parameter. The formulation involves the migration and commuting flow matrices in year t, their transposed forms, bi-causative matrices for migration, bi-causative matrices for commuting, and scalar coefficients to be estimated.
Approach 10: Similar to approach 9, the final approach also adopts the logic of a bi-causative approach. Here, four bi-causative matrices are used in each model to describe the migration and commuting change dynamics, respectively. The formulation involves the migration and commuting flow matrices in year t, bi-causative matrices for migration, bi-causative matrices for commuting, and scalar coefficients to be estimated.
The above ten frameworks are calibrated based upon the four annual (i.e., 2002, 2003, 2004, and 2005) residential relocation and commuting matrices for the three regions: the Southern California, Sacramento, and San Francisco regions. In calibration (and our forecasting, explained in the following section), we exclude the diagonal cells of the flow matrices (to avoid the large influence of these elements) and row-normalize the matrices. In other words, the matrices used in the present experiment contain proportions that represent the share distribution over destinations from the perspective of each origin county. To accomplish the calibration, we employ a quasi-Newton method of optimization, more specifically the BFGS approach [52,53,54,55], available as an option of the optim function in R’s stats package.
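To give a sense of what calibration does for the simplest joint model (own lag plus counterpart lag with two scalar coefficients), the sketch below fits the coefficients by ordinary least squares over the off-diagonal cells. This is a swapped-in technique for illustration: the study itself uses BFGS through R's optim, and the squared-error objective, the function names, and the synthetic data here are all assumptions.

```python
def off_diagonal(matrix):
    """Flatten a square matrix into a list of its off-diagonal cells."""
    return [v for i, row in enumerate(matrix)
            for j, v in enumerate(row) if i != j]

def fit_two_coefficients(targets, own_lags, other_lags):
    """Least-squares fit of target = a * own_lag + b * other_lag.

    Each argument is a list of matrices, one per calibration year pair.
    Solves the 2 x 2 normal equations in closed form.
    """
    y, x, z = [], [], []
    for target, own, other in zip(targets, own_lags, other_lags):
        y += off_diagonal(target)
        x += off_diagonal(own)
        z += off_diagonal(other)
    sxx = sum(v * v for v in x)
    szz = sum(v * v for v in z)
    sxz = sum(p * q for p, q in zip(x, z))
    sxy = sum(p * q for p, q in zip(x, y))
    szy = sum(p * q for p, q in zip(z, y))
    det = sxx * szz - sxz * sxz
    if det == 0:
        raise ValueError("lagged matrices are collinear; coefficients not identified")
    a = (sxy * szz - szy * sxz) / det
    b = (szy * sxx - sxy * sxz) / det
    return a, b

# Synthetic check: build a target from known coefficients and recover them.
mig_prev = [[0.0, 0.75, 0.25], [0.5, 0.0, 0.5], [0.25, 0.75, 0.0]]
com_prev = [[0.0, 0.7, 0.3], [0.5, 0.0, 0.5], [0.25, 0.75, 0.0]]
mig_next = [[0.98 * m + 0.02 * c for m, c in zip(mr, cr)]
            for mr, cr in zip(mig_prev, com_prev)]

a_hat, b_hat = fit_two_coefficients([mig_next], [mig_prev], [com_prev])
```

A numerical optimizer such as BFGS becomes necessary once the model is no longer linear in its parameters, as in the bi-causative variants, where diagonal factors multiply the flow matrices.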
After calibration, we compared the forecasting performance of each of the ten approaches (i.e., five traditional and five alternative methods). In detail, we generated the forecasts of residential relocation and commuting matrices for the subsequent five years (i.e., 2006 through 2010) using the models calibrated with 2002–2005 data, and we evaluated the performance of each approach based on the following five error metrics that have been discussed and/or used by many studies in the forecasting literature, such as Armstrong and Collopy [56], Hyndman and Koehler [57], and Hierro [58]: (1) root mean square error (RMSE), (2) mean absolute error (MAE), (3) median absolute error (MdAE), (4) Theil’s U2, as explained in Bliemel [59] and Armstrong and Collopy [56], and (5) percent better (PB) as defined in Armstrong and Collopy [56]. (Since this study deals with proportions (i.e., row-normalized elements of migration and commuting matrices) and thus involves a large number of zeros or almost zero values, it would not be reasonable to use percentage-based metrics. Therefore, we used MAE and MdAE, instead of the mean absolute percentage error or median absolute percentage error, discussed in Armstrong and Collopy [56].) Lower values for the first four error measures indicate better forecasting performance of an approach, while a higher value for PB indicates better performance.
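Three of the absolute-error metrics, together with a simple version of percent better, can be sketched as below. The inputs are flat lists of matrix cells. The percent-better function assumes the common reading "share of cells where the method's absolute error is strictly smaller than that of a benchmark forecast"; the exact definition and the benchmark used in the study should be taken from Armstrong and Collopy [56]. Theil's U2 is omitted because its precise form is not given in the text. All names and values are hypothetical.

```python
import math
import statistics

def abs_errors(actual, forecast):
    """Absolute forecast error for each cell."""
    return [abs(f - a) for a, f in zip(actual, forecast)]

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs_errors(actual, forecast)) / len(actual)

def mdae(actual, forecast):
    """Median absolute error."""
    return statistics.median(abs_errors(actual, forecast))

def percent_better(actual, forecast, benchmark):
    """Percentage of cells where the forecast beats the benchmark (ties excluded)."""
    wins = sum(
        1 for e_f, e_b in zip(abs_errors(actual, forecast), abs_errors(actual, benchmark))
        if e_f < e_b
    )
    return 100.0 * wins / len(actual)

# Hypothetical shares for four origin-destination cells.
actual = [0.5, 0.3, 0.2, 0.0]
forecast = [0.4, 0.3, 0.3, 0.0]
benchmark = [0.7, 0.3, 0.25, 0.1]

scores = {
    "RMSE": rmse(actual, forecast),
    "MAE": mae(actual, forecast),
    "MdAE": mdae(actual, forecast),
    "PB": percent_better(actual, forecast, benchmark),
}
```

Because many cells of a row-normalized matrix are zero or close to zero, dividing by the actual value (as percentage-based metrics do) would be unstable, which is why only absolute and relative-to-benchmark measures appear here.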
5. Results
5.1. Forecasting Model Calibration Outcomes
Table 4 presents the calibrated coefficients in the five alternative approaches, where consideration is given to the association between residential relocation and commuting changes. Among others, the calibrated values of the own-lag coefficients (i.e., those attached to each flow’s own matrix in the previous year) clearly demonstrate that both residential relocation and commuting can largely be explained by their patterns in the previous year. More specifically, under approaches 6, 7, and 8, these coefficients for all three regions fall in the range between 0.96 and 1.03, suggesting that temporal variation does exist but is subtle and gradual. When the bi-causative matrix method is combined as in approaches 9 and 10, the magnitudes of the coefficients differ more substantially, but these coefficients are still found to play the largest role in most cases.
However, despite the large explanatory power of the previous year’s patterns, it appears that residential relocation and commuting influence each other as well. For instance, in the case of the San Francisco region, commuting in year t (non-transposed) seems to have a positive effect on residential relocation in year t + 1 (the corresponding coefficient is 0.01527 in approach 6 and 0.01585 in approach 8).
These positive relationships may indicate that a large flow of commuting often induces residential relocation, perhaps to optimize the commuting costs, as suggested by Levinson and Kumar [25], Clark et al. [27] and others. A similar positive relationship is also found in the San Francisco area for the commuting part of the model (the corresponding coefficient is 0.02526 in approach 6 and 0.02086 in approach 8). The commuting flow tends to be larger between two counties with a greater volume of household relocation in the previous year.
More importantly, the other regions tested in this study show dissimilar patterns in terms of the interrelationships between residential relocation and commuting. In the case of the Sacramento metropolitan area, one of the two (non-transposed) cross-effect coefficients exhibits negative estimates in both approaches 6 and 8. Although the sign of the other turns out to be positive as in San Francisco, its magnitude is much smaller than that for the San Francisco region. Estimation results for Southern California also differ from those for San Francisco in the sense that commuting (again, the non-transposed one) shows a negative impact on residential relocation, whereas the impact of residential relocation on commuting is positive, consistent with the finding from San Francisco. (Note that the coefficients for the transposed matrices in approaches 7 and 8 show more consistent patterns of signs, but these estimates are relatively smaller.) These results may be attributable in part to the uneven distribution of new housing within the region. In fact, during this period of urban expansion, a large number of households continued to move to Riverside and San Bernardino counties, where housing supply was more elastic, rather than relocating to job-rich Los Angeles and Orange counties to shorten their commuting distances.
The variation in the coefficients across study regions may suggest that the way residential relocation and commuting are associated is not always determinate. Rather, their interrelationship can be influenced by many indigenous factors that are unique to each metropolitan area, such as its history, culture, and institutional arrangements. Depending on urban growth patterns, each metropolitan area might be in a different stage of urbanization, suburbanization or densification. For instance, a typical trend of suburbanization with increasing residential relocation from core counties to surrounding counties can result in increased commuting from suburbs to core counties. Other metropolitan areas, with the return of both jobs and population back to their core counties, do not necessarily present such a pattern.
It would also be probable that the spatial structure and transportation systems in each metropolis can shape the relationship. Rapid advancement in transportation can generally save commuting costs, and the potential benefit of residential relocation can increase accordingly. This situation can induce more intra-metro relocation and increased long-distance commuting flows, but possibly with cheaper costs. However, public transit systems with cost-effective and reliable services within a central part of the metropolis can shape both residential relocation and commuting patterns differently. Admittedly, the relocation and commuting dynamics in a metropolitan area with poor connectivity are also less likely to be similar to the above cases.
Furthermore, it should be noted that business relocation dynamics, which would be critical in determining the nexus between household relocation and commuting but are not taken into account here due to data deficiency, might help explain the distinct relational patterns found in the three study areas. Among the relocation decision-making criteria, the ‘when to move’ decision is closely related to the potential relocation of businesses. For instance, if a current employer relocates to a county where its employees mainly reside, the employees can reduce commuting cost even without residential relocation. This is the case where jobs follow people in an intra-metropolitan setting. Although workers are generally more mobile than employers, in certain metro areas where suburbanization is evident, employers often move toward the places where their labor force resides (see, e.g., [31]). Such critical business relocation patterns may also vary by urban development history, technological environments, and many other factors. The variation found in our model estimation could be attributed to business relocation as well as other region-specific characteristics in urban development, including transportation and land use regulation.
5.2. Forecasting Performance
Another important issue we explored in this study is whether considering the relocation–commuting relationship can lead to more effective analysis and/or forecasting of the two important human flow dynamics. As noted in the previous section, this was assessed using five well-known error metrics. Table 5, Table 6, Table 7, Table 8, Table 9 and Table 10 show the outcomes of this forecasting performance evaluation for each study area’s residential relocation and commuting. First of all, in the case of San Francisco’s residential relocation, the variation in short-run forecasting performance is small (e.g., all approaches show RMSE between 0.00373 and 0.00497), but the gap widens as the forecast time horizon extends (see Table 5). One notable point is that alternative approaches generally perform better than traditional approaches, although the judgment can differ by evaluation metric (for instance, in terms of MdAE, the first three traditional approaches can be evaluated more favorably than the alternative approaches). Approaches 6, 7, and 8 show RMSE values ranging from 0.00728 to 0.00787 for the forecasting year 2010, whereas those of the five traditional approaches range from 0.01112 to 0.01733. This finding is clearly demonstrated in Figure 3, where each approach’s performance in terms of RMSE over the forecasting time horizon is illustrated. As shown in the figure and table, the causative change matrix approach (i.e., approach 4), which is one of the most widely used traditional methods, turns out to be less accurate than all alternative approaches tested in this study.
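The five measures reported in the tables can be illustrated with a short Python sketch. RMSE, MAE, and MdAE follow their standard definitions. The forms of Theil’s U2 and PB (percent better) shown here are common variants that compare an approach against a benchmark forecast; they are assumptions made for illustration and may differ from the exact formulas applied in this study. The function name `forecast_errors` and its `benchmark` argument are hypothetical.

```python
import numpy as np

def forecast_errors(actual, forecast, benchmark):
    """Compare a forecast matrix with the observed matrix, cell by cell.

    `benchmark` is a reference forecast (for example, the most recent
    observed matrix) needed by the two relative measures, U2 and PB.
    """
    a = np.asarray(actual, dtype=float).ravel()
    f = np.asarray(forecast, dtype=float).ravel()
    b = np.asarray(benchmark, dtype=float).ravel()

    err = f - a          # errors of the approach being evaluated
    bench_err = b - a    # errors of the benchmark forecast

    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "MdAE": float(np.median(np.abs(err))),
        # Assumed form: root of the ratio of squared errors, approach vs. benchmark.
        # Values below 1 mean the approach beats the benchmark.
        "U2": float(np.sqrt(np.sum(err ** 2) / np.sum(bench_err ** 2))),
        # Assumed form: percentage of cells where the approach's absolute
        # error is strictly smaller than the benchmark's.
        "PB": float(100.0 * np.mean(np.abs(err) < np.abs(bench_err))),
    }
```

Because MdAE uses the median rather than the mean, it is insensitive to a few large cell errors, which is one reason rankings based on MdAE can differ from those based on RMSE. Note that U2 in this form is undefined when the benchmark reproduces the observed matrix exactly.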
The findings from the commuting models for the San Francisco region are similar (see Table 6). Approach 7 shows the best performance in terms of RMSE for the forecasting year 2010, followed by approaches 8 and 6 (see Figure 4). An evaluation based on Theil’s U2 yields the same conclusion, whereas the traditional approaches 1, 2, and 3 can be considered better if the judgment is based on other error metrics, such as MAE and MdAE.
Unlike in the San Francisco area, the performance of alternative approaches is not found to be superior in the case of residential relocation modeling for the Sacramento and the Southern California regions. For the relocation matrix forecasting for these two study regions, the first three traditional approaches show a higher degree of forecasting accuracy. In other words, alternative approaches are not better than the simple methods, using the most recent matrix or the historical average value, in forecasting how residential relocation dynamics will change over time in these areas.
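The two simple benchmarks mentioned above, carrying the most recent matrix forward and using the historical average, can be sketched as follows. This is a minimal illustration that assumes the observed matrices are stored as a time-ordered list of equally sized arrays; the function names are hypothetical and are not taken from the study.

```python
import numpy as np

def last_matrix_forecast(history):
    """Carry the most recent observed matrix forward unchanged."""
    return np.asarray(history[-1], dtype=float).copy()

def historical_average_forecast(history):
    """Use the cell-by-cell mean of all observed matrices as the forecast."""
    return np.mean(np.asarray(history, dtype=float), axis=0)
```

Neither benchmark uses any information on commuting, so their good performance for Sacramento and Southern California relocation indicates how stable (or how hard to predict) those matrices were over the evaluation period.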
A close look at the error patterns reveals that the relative performance of alternative approaches tends to get worse after 2008. This poor performance may indicate that the way in which commuting influences residential relocation dynamics changed substantially over time. As mentioned earlier, before the recession, the number of households that moved from job-rich Los Angeles and Orange counties to the Inland Empire area (Riverside and San Bernardino counties) was much larger than the magnitude of the reverse flow in Southern California. In later years, the gap narrowed significantly with changes in housing market conditions.
However, in our commuting matrix forecasting, alternative approaches appear competitive across all the study areas. In the case of the Sacramento area’s commuting (see Table 8), approach 7 provides the lowest RMSE and Theil’s U2, and most alternative approaches (particularly approaches 6, 7, and 8) perform better than both the causative and bi-causative change matrix approaches (i.e., approaches 4 and 5). For the Southern California area (see Table 10), approach 6 shows the best long-term commuting forecasting performance based on RMSE, MAE, Theil’s U2, and PB. The best approach in terms of MdAE (for Southern California with the forecasting year 2010) is approach 9, another alternative method in which the association between residential relocation and commuting is reflected.
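For readers unfamiliar with the causative change matrix benchmark referred to above, the sketch below shows one formulation commonly found in the literature, in which a constant matrix C links consecutive transition matrices (P_t = P_{t-1} C) and is reused to project forward. This right-multiplication form, the assumption of row-standardized inputs, and the function name `causative_forecast` are illustrative assumptions; the specification actually used for approach 4 in this study may differ.

```python
import numpy as np

def causative_forecast(p_prev, p_curr, steps=1):
    """Project a transition matrix forward with a constant causative matrix.

    Assumes p_curr = p_prev @ C, estimates C from the two most recent
    matrices, and applies it `steps` more times to p_curr.
    """
    p_prev = np.asarray(p_prev, dtype=float)
    p_curr = np.asarray(p_curr, dtype=float)

    # Solve p_prev @ C = p_curr for C; raises LinAlgError if p_prev is singular.
    c = np.linalg.solve(p_prev, p_curr)

    forecast = p_curr.copy()
    for _ in range(steps):
        forecast = forecast @ c
    return forecast
```

When both input matrices have rows summing to one, the projected rows also sum to one, but individual cells are not guaranteed to stay within [0, 1]. This sensitivity to small changes between two observed matrices is one plausible reason such extrapolations can lose accuracy over longer horizons.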
In summary, the alternative approaches seem to offer some potential benefits as a framework for dealing with complex human flow dynamics, although their forecasting performance does not always surpass that of traditional approaches. When used to forecast commuting patterns, the new, integrated approaches generally produce more accurate forecasts than traditional ones, although the result is somewhat sensitive to the evaluation metric. They also perform better for residential relocation modeling in the case of San Francisco. This finding may suggest that even the simple linear form used in this study for comparison can, to some extent, capture the connection between the two matrix evolutions and support a more complete analysis of human flow dynamics in metropolitan areas.
6. Summary and Discussion
Although the interconnection between residential relocation and commuting has been widely acknowledged, traditional research and planning practices have often failed to reflect this connection in the examination of human flow dynamics. The present study attempts to address this limitation by exploring how the two dynamic patterns are interrelated in reality and presenting an integrated framework in which the inter-linkages between residential relocation and commuting are taken into account. This is accomplished through an experiment testing a wide range of matrix forecasting models with the use of county-level migration and commuting data for three broadly defined metropolitan regions in California.
Our experiment indicates that there are bi-directional connections between the two critical human flows—i.e., household relocation can shape the commuting patterns within the metropolitan areas and vice versa—and, more importantly, shows that the way they are associated with each other is not always determinate or straightforward. While the San Francisco region exhibits positive, reciprocal interrelationships between residential relocation and commuting, this pattern does not always hold for other study areas. The variation detected in this study seems to imply that the traditional, simple view of the relationship is not precise enough. Rather, it suggests that the nexus between residential relocation and commuting can be influenced by many region-specific factors, such as the region’s unique development stage, land use regulation, transportation systems, institutional settings, and technological environments.
Even though the residential relocation–commuting relationship does vary by region, it appears that joint forecasting can achieve higher accuracy and thus provide meaningful value to urban planners and other policy makers. For instance, approach 6 performed well in forecasting San Francisco’s residential relocation and Southern California’s commuting matrices, suggesting that these human flow dynamics can be better analyzed when their interactions are considered, even in a simple fashion. Furthermore, in the other cases, its performance was not worse than that of many other approaches, although it should be acknowledged that the judgment inevitably depends on the target year and the evaluation metric used.
This, however, does not mean that joint forecasting approaches are always better than traditional ones, and a one-size-fits-all method is not supported. In some cases, using historical averages (i.e., approach 2) can better minimize forecasting errors than relatively more sophisticated methods. Approaches that work well for some regions (or time periods) may not necessarily show the same level of performance in other settings. This is particularly true when circumstances change dramatically. In fact, some of the tested approaches showed a noticeable shift in their error trajectories around 2008 when the recession hit the study regions.
Nevertheless, it is hard to deny that joint forecasting has potential as a new means of analyzing human flow dynamics, and it can be more useful when various region-specific (and/or time-specific) factors are carefully incorporated into the framework. As acknowledged previously, a lack of data on business relocation dynamics and other variables limits the development of a comprehensive forecasting model in the present study. Future research may present a more advanced analytical framework that can help us examine real-world human flow dynamics more effectively for sustainable land use and transportation planning. Such a framework can be used to better assess planning/policy interventions and thus support more informed decision making by taking into account the changing nature of commuting and residential relocation in urban spaces. Directions for future research may also include the development of a more generalizable taxonomy of matrix forecasting models with consideration of each region’s unique circumstances as well as the ongoing transformation of residential relocation–commuting interactions and the application of joint forecasting to flow dynamics at a more disaggregated geographical level.