1. Introduction
Academic literature on cycling has been growing at an enormous pace across multiple disciplines over the past years [1]. This relevance can be explained by the vast potential of cycling for mitigating negative effects of car-oriented transport systems. Cycling efficiently meets mobility demands, especially in compact areas [2], contributing to healthier [3], less polluted [4] and more livable environments [5]. In addition, the cost–benefit ratio of cycling has been shown to be positive [6], in contrast to motorized road transport [7]. Against this backdrop, comprehensive cycling promotion is imperative. Many cities and regions have accelerated their efforts for a transition from car-centered to sustainable mobility systems in recent years [8].
However, the effects of measures for promoting cycling, such as new or improved infrastructure, remain hard to capture at a systemic level. In most evaluation studies, the use of single methods only sheds light on specific aspects, such as route shifts [9], mode choice, or perception of the environment [10]. In recent years, several authors have pointed to the limited capabilities of single-method approaches in adequately reflecting the complexity of cycling. Te Brömmelstroet et al. [8] follow the conceptualization of cycling as a socio-technical system [11], which requires an interplay of different concepts, methods, and tools. Anaya-Boig [12] makes the case for integrated cycling policies instead of “cycling policy patchwork” [12] (p. 20). For such an integration, it is necessary to take into account various environmental, social, cultural, or governance aspects of cycling mobility. In order to translate this paradigmatic aspiration into practice, sound multi- or mixed methods approaches are required. Psarikidou et al. [13] call for a “more comprehensive and synthesized understanding of cycling” [13] (p. 226), which comprises infrastructural as well as social aspects. In their rather provocative piece, Bertolini et al. [14] state that single disciplines can only illuminate a few aspects, whereas a pooling of domain knowledge facilitates systemic understanding.
Over the past few years, systemic investigations and impact analyses of sustainable modes have been emerging. Storme et al. [15] reviewed existing literature on the effects of new mobility services in various dimensions and found beneficial effects in terms of efficiency, social equity, sustainability, and quality of life. Pisoni et al. [16] demonstrated the potential of active mobility and highlighted the economic and societal benefits that could be gained from increased shares of walking and cycling. Their Europe-wide analysis is based on national data and does not inform any impact assessment at the local scale. In the subsequent section, we review existing approaches that focus on the investigation of single interventions or local cycling promotion activities. However, to the best of our knowledge, these approaches have not been integrated into an established framework for investigating the systemic effects of cycling promotion measures. Thus, critical questions remain largely unanswered, such as: What mobility alternatives (mode and route alternatives, including their availability and accessibility) do cyclists have and how do they affect route choice and mode choice? How do individual preferences and interactions with the physical environment (e.g., regarding infrastructure or weather) influence mobility behavior? Can the factual use of newly built or improved infrastructure be traced back to a permanent change, or are there singularities (construction sites, holiday seasons, etc.) that influence road use patterns? How do personal, intrinsic factors and extrinsic factors influence perception and use of newly built or improved infrastructure and how do they relate to each other?
It is evident that these fundamental matters, which are relevant for planning, maintaining, and managing infrastructure as well as for marketing and communication purposes, can hardly be addressed with single, existing methods.
We therefore suggest a first step towards an integrated, mixed methods approach for capturing systemic effects of cycling promotion measures, primarily with regard to improving and building dedicated infrastructure. In this paper, we present a framework for linking methods from social sciences, geoinformatics (GIS), and human sensing. The proposed mixed methods approach allows for investigating cyclists’ perception of the built environment, its actual use, and the relation of these aspects with external factors. Moreover, we are able to determine the systemic effect of infrastructure measures with regard to infrastructure quality, accessibility, and centrality.
The focus of our research lies on the conceptual and technical integration of the aforementioned methods and their respective data and information sources. Based on a knowledge graph, in which we conceptualize the integration of data sources and information layers, we formulate hypotheses and domain questions that are enabled by utilizing mixed methods instead of single methods. We contextualize our research within the existing literature, which is reviewed eclectically in the following section.
1.1. Literature Review
In the wake of recent paradigmatic discussions in mobility research [17] and social sciences (“mobility turn” [18]), the need for methods that go beyond the respective domain, be it sociology or transport planning, has become evident. Consequently, an increasing number of mixed methods approaches for investigating sustainable mobility, and cycling specifically, have been published. However, the understanding of mixed methods is diverse and the range of studies that claim this label is broad.
The most basic combination of methods published as mixed methods for investigating the effects of interventions links quantitative surveys with qualitative interviews. Crane et al. [19] conducted interviews with retailers and residents in close proximity to a cycle way in Sydney, Australia, before and after it was built. These data were then combined with a quantitative survey among users in a sequential, partially mixed design. Guell et al. [20] follow a sequential explanatory design (scoping–explanation), in which they first conducted a large quantitative survey and subsequently conducted qualitative interviews with a sub-sample recruited according to answers in the survey.
In recent years, more sophisticated mixed methods studies with a variety of sensors and methods have emerged in the mobility sector. In their investigation of transport disadvantages in rural areas, Shay et al. [21] calculate and map theoretical risk for transport disadvantages by spatial overlay analysis of census data. The output then serves as a basis for in-depth expert interviews and focus groups with residents. This sequential design confirms some of the results of the GIS analysis, but also extends the insights by local, sometimes informal knowledge and suggestions for solutions. Lucas et al. [22] apply a similar research design to a social assessment of a local transport project (bypass of a major road) in Wales, UK. The results of a GIS analysis of impacts on communities in proximity to the project are reflected in qualitative fieldwork by conducting a survey, expert interviews, and focus groups. The authors conclude that the qualitative part helped uncover unintended effects of measures that would remain hidden in a purely structural approach. Moreover, with their qualitative fieldwork, the authors are able to give a voice to frequently unheard and hard-to-reach groups of people. By using an online survey, Vasilev et al. [23] reached a high number of users of an intervention site in Trondheim, Norway. They combined an assessment of stated preference regarding road design with interactive mapping of locations where participants felt unsafe. The authors conclude that participatory mapping is a valuable method for evaluating interventions in transport planning. While Shay et al. [21], Lucas et al. [22] and Vasilev et al. [23] do not collect any quantitative data in the field, other authors combine in situ measurements with qualitative data. Gadsby et al. [24] present a mixed methods approach where self-stated behavior variables are compared to sensor data from an equipped bicycle. Coded qualitative data are related to environmental variables in order to identify stressors. Gamble et al. [25] apply an ethnographic approach with photo diaries and associated sentiments collected by pro-cycling activists. These collected experiences are then related to the physical environment (100 m “experience space” around the route) by spatial overlay analysis. Regression analysis reveals negatively and positively experienced environment variables. Similarly, Resch et al. [26] equip study participants with physiological sensors (“human sensors”), a first-person video camera and a mobile application for recording in situ feedback. These data sources are further complemented by post hoc interviews. The objectively measured response to the environment, together with self-stated perception and experience, serves as input for generating walkability and bikeability maps. Emerging cold- and hot-spots are validated by video footage and post hoc interviews. Wesener et al. [27] investigate a newly built major cycle way in Christchurch, NZ. For this, they collected different user-centered, in situ data from audio and video recordings from cyclists. Together with recorded movement data (GNSS, gyroscope, accelerometer) and post hoc ratings of the overall experience, the authors are able to assess the interaction of cyclists with the road environment and other road users.
All studies have in common that qualitative and quantitative methods are integrated at different stages of the investigation. The degree of integration, as well as the dependencies between qualitative and quantitative methods, ultimately depend on the study design. Moreover, the studies vary in terms of where (in situ vs. post hoc vs. independent) and how (passively measured vs. self-stated) measurements are taken and whether the design focuses on sites or users. Creswell and Creswell [28], as well as Steinmetz-Wood et al. [29], point to the difference between a parallel use of methods and true mixed methods approaches, where different strands are integrated. However, the integration can be implemented at very different levels. To the best of our knowledge, there are no examples from the mobility domain where the integration is implemented at the level of the study design (method integration) or at the level of data (data fusion). What we learn from the reviewed studies is an integration of results from qualitative and quantitative data acquisition and analysis.
1.2. Research Gap
Despite the growing body of literature that is relevant for planning infrastructure for cycling and active mobility, we identified a gap regarding the assessment of systemic effects that infrastructure interventions may induce. While diverse methods have been developed to acquire and analyze data for assessing singular aspects of an intervention, none of these approaches provides a multifaceted view that includes static external factors of the (built) environment, dynamic measurements of infrastructure use and experience, as well as individual aspects such as personal preference and perception. Therefore, we identified the need for a mixed methods approach embedded in an interdisciplinary setting to gain such insights from combining quantitative and qualitative as well as in situ and post hoc data.
Our research addresses the following three questions:
How can we integrate different data and information sources conceptually and technically? This question covers the integration of different data types (quantitative and qualitative), as well as different temporal (in situ and post hoc) and spatial references.
Which questions regarding an infrastructure intervention can we answer by using single methods and for which questions do we need a combination of methods?
How can we identify input measurements that may be used as proxies for other measurements?
This paper proposes a framework that facilitates design and implementation of purpose-specific mixed methods studies for assessing effects of infrastructure interventions. Its main contribution is to guide and ease the process of selecting and integrating appropriate methods and data sets depending on the respective domain questions to be answered. While we focus on an exemplary set of methods, the methodology outlined can be used to extend and transfer this framework to adjacent use cases.
2. Materials and Methods
For addressing the posed research questions, we chose an approach consisting of several stages, which is visually outlined in Figure 1. First, we identified four aspects of interest relevant to cycling mobility, for which the mixed methods approach is designed: safety, stress, smoothness, as well as acceptance and behavior. For each of these core topics, we defined more detailed sub-aspects that contribute relevant information from different perspectives. In a second stage, we evaluated established methods that can contribute data and information to one or more of the previously defined thematic aspects of interest. From existing literature and our own experience in applying these methods, we identified capabilities and limitations of each primary method. Based on these findings, as a third step, we developed a knowledge graph representing relevant semantic links between the given data and information inputs and the thematic aspects of interest. This graph provides the semantic basis for integration on the level of data and information and for deriving advanced mixed methods designs. It is meant to aid researchers and practitioners in selecting appropriate combinations of information and data sources for answering their specific questions within applied use cases. In addition to the aforementioned steps, we applied this framework in a series of case studies to verify its applicability and to assess different aspects of technical data and information integration.
In the following subsections, we provide detailed information on the building blocks of the framework: the aspects of interest as well as the data sources and methods considered. Furthermore, we outline our approach to linking data sources for deriving the conceptual framework. We then conclude this section by providing guidance for integrating these building blocks into mixed methods designs.
2.1. Aspects of Interest
The aspects of interest represent core topics relevant to answering domain questions. Their purpose is to provide the structure essential for relating data and information layers to facilitate semantic integration. Within the chosen application domain in cycling mobility we identified and defined the following aspects of interest:
Safety: Previous research has shown that safety concerns are among the main deterrents for choice of active modes, especially for cycling [30]. Therefore, considering not only objective safety but also individuals’ perceived safety is crucial to infrastructure design [31].
Stress: Similar to the impact of safety, the sensation of stress during participation in traffic was shown to be an important obstacle for active mobility, especially cycling [32]. Although individual tolerance to stress may vary by person and trip purpose, it is an essential variable to consider and consequently forms the basis for well-known concepts such as Levels of Traffic Stress [33] or Bicycle Stress Level [32].
Smoothness: We use the term “smoothness” to describe the overall experience of cyclists passing a series of road segments with regard to traffic fluidity, necessary stops and required turns, e.g., at intersections. Less literature exists on this aspect than on safety and stress. However, studies on route choice such as that by Hochmair [34] indicate that simplicity and speed or fluidity of a route are important.
Besides the aforementioned aspects, acceptance and behavior are understood as overarching themes with high relevance within the domain context. Safety, stress and smoothness can all influence acceptance and behavior [35,36,37], while other, less known or highly subjective aspects may still play a role.
2.2. Primary Data Sources and Methods for Data Acquisition
In this section, we give an overview of the primary methods for data acquisition, including data and information sources considered for this research. A brief outline of these data sources is given in Table 1, before each method is explained in detail. An understanding of the individual methods, including their capabilities and limitations, is essential for contextualizing and understanding the conceptual integration as well as the potential contributions of mixed methods approaches that implement the framework. The description of our approach to conceptualizing the framework and to integrating the building blocks in mixed methods designs follows in Section 2.3 and Section 2.4.
2.2.1. Quantitative Human Sensing and Stress Detection in Wearable Sensor Data
Human sensing facilitates the measurement of immediate stress reactions of individuals through bio-physiological sensors. In field studies, participants are equipped with non-invasive wearable sensors that capture parameters such as galvanic skin response (GSR), skin temperature (ST), and heart rate variability (HRV). From these physiological signals, moments of stress are derived, e.g., following a sensor fusion method by Kyriakou et al. [38]. For enabling spatial analysis of measurements, location data from Global Navigation Satellite Systems (GNSS), such as GPS or Galileo, are stored alongside these measurements.
One key reason for using human sensing is to exclude cognitive bias, such as the illusory truth effect, consistency bias, or memory errors, which can arise with post hoc methods such as surveys or interviews. However, if used as a single method without complementary assessment, human sensing does not allow for direct inference of the stressors and context that may have induced a stress response, because contextual information for interpreting moments of stress is lacking. Furthermore, it does not provide any information on how individuals react to these stress responses in the short or long term by adapting their mobility behavior. Moreover, human sensing in the context of infrastructure evaluation has to be applied in a naturalistic setting. Thus, there are numerous internal and external variables, such as mood or singular events, that cannot be directly controlled for.
Domain questions that we can answer by using this method:
Aspect of interest: stress
2.2.2. Measurements of Lateral Distances
Ultrasonic distance sensors in combination with highly integrated and energy-efficient microcontrollers and Systems on Chip (SoC) allow for affordable and reliable portable distance measurements. An example of a community-driven implementation is the “OpenBikeSensor” (OBS) [39]. When mounted to a bicycle, such devices continuously track lateral distances of the cyclist. Due to the technical principle of ultrasonic distance sensing, only objects of a given minimum size within the detection range are detected. Consequently, narrow passages and minimum distances of passing vehicles can be detected, while detailed 3D reconstruction of the environment is not possible. For example, thin objects such as lamp posts or obstacles close to the road surface may not be detected. As a consequence, measurements towards the outer side of the road can indicate narrow passages when the detected values are low. However, the reverse conclusion (a high measured distance meaning plenty of space available) cannot be drawn. Furthermore, the raw distance data do not reveal the cause of a low distance.
With an additional push button mounted to the handlebar, a sensor device can record user input alongside distance measurements. This can be used to manually tag specific situations such as close passes or moments of fear, depending on individual instruction. With this option, users can add context information which helps overcome the semantic gap of raw distance measurements.
Domain questions that we can answer by using this method:
Aspect of interest: safety
How much space is practically available to cyclists (lateral, on both sides)?
Does motorized traffic keep the minimum safety distance towards cyclists as defined in road traffic regulations? How high is the share of overtaking maneuvers below this distance?
Which parts of infrastructure do frequently show unsafe overtaking distances?
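The second and third question above can be illustrated with a minimal sketch. It assumes that overtaking events have already been extracted from the raw distance stream and matched to road segments; the `Overtake` record, the segment identifiers, and the 1.5 m threshold are hypothetical placeholders, since the legal minimum distance depends on the applicable road traffic regulations.

```python
from dataclasses import dataclass

# Assumed minimum overtaking distance in metres (placeholder value; the
# legally required distance differs between jurisdictions).
MIN_PASSING_DISTANCE_M = 1.5


@dataclass
class Overtake:
    segment_id: str    # hypothetical identifier of the road segment
    distance_m: float  # measured lateral distance of the passing vehicle


def close_pass_share(events, threshold_m=MIN_PASSING_DISTANCE_M):
    """Share of overtaking events with a distance below the threshold."""
    if not events:
        return 0.0
    close = sum(1 for e in events if e.distance_m < threshold_m)
    return close / len(events)


def close_pass_share_by_segment(events, threshold_m=MIN_PASSING_DISTANCE_M):
    """Close-pass share per road segment, to locate problematic sections."""
    grouped = {}
    for e in events:
        grouped.setdefault(e.segment_id, []).append(e)
    return {seg: close_pass_share(evts, threshold_m)
            for seg, evts in grouped.items()}


# Toy data, invented for illustration only.
events = [
    Overtake("A", 1.2), Overtake("A", 1.8),
    Overtake("B", 0.9), Overtake("B", 1.1),
]
overall_share = close_pass_share(events)
share_by_segment = close_pass_share_by_segment(events)
```

Such a share only describes measured distances; as noted above, it does not explain why a pass was close.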
2.2.3. Movement Trajectory
Movement trajectories can be generated from data of different location providers. Most commonly, for the given context and purpose, smartphone-based GNSS is used. In addition, positioning through wireless network signals helps increase accuracy, especially in densely populated areas, which is relevant for detecting stops and deriving speed levels from the data. While movement trajectories obtained through GNSS and wireless network positioning are highly objective, signal distortions need to be considered as a potential source of error. While the analysis of location and movement trajectories can provide valuable insights on its own, location is also the key attribute, besides time, for joining different datasets. Additionally, location data are the prerequisite for enabling spatial analyses.
Domain questions that we can answer by using this method:
Aspect of interest: smoothness
Where, how often and for how long do cyclists need to slow down or stop?
What are realistic cyclist travel speeds for each segment of infrastructure?
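As a minimal sketch of how these two questions could be approached, the following example derives speeds between consecutive position fixes and flags stops. The track format `(time in seconds, latitude, longitude)`, the speed threshold of 0.5 m/s, and the minimum stop duration of 5 s are assumptions chosen for illustration; real trajectories would additionally require filtering of the signal distortions mentioned above.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 coordinates."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def speeds_mps(track):
    """Return (t_start, t_end, speed) for each pair of consecutive fixes."""
    out = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        out.append((t0, t1, haversine_m(la0, lo0, la1, lo1) / dt))
    return out


def detect_stops(track, max_speed_mps=0.5, min_duration_s=5):
    """Return (start, end) intervals where speed stays below the threshold."""
    stops = []
    start = end = None
    for t0, t1, v in speeds_mps(track):
        if v < max_speed_mps:
            if start is None:
                start = t0
            end = t1
        else:
            if start is not None and end - start >= min_duration_s:
                stops.append((start, end))
            start = end = None
    if start is not None and end - start >= min_duration_s:
        stops.append((start, end))
    return stops


# Toy track: two seconds of riding, ten seconds standing, two seconds of riding.
track = [(0, 47.80000, 13.04), (1, 47.80005, 13.04), (2, 47.80010, 13.04)]
track += [(t, 47.80010, 13.04) for t in range(3, 13)]
track += [(13, 47.80015, 13.04), (14, 47.80020, 13.04)]
stops = detect_stops(track)
```

Aggregating the resulting speeds and stop intervals per road segment would then require matching the fixes to the network, which is not shown here.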
2.2.4. First-Person Videos
First-person videos that capture the road space in front of a cyclist enable in-depth analysis of specific cycling maneuvers and interactions with other road users. In combination with other data sources, such footage provides contextual information and can be used for validation as well as for explanatory purposes. Video footage is an objective information source but requires time-consuming manual interpretation. Therefore, it is included in our conceptualization as an auxiliary data source that may be consulted for very specific questions.
Domain questions that we can answer by using this method:
Aspect of interest: safety
Where did impacts to safety occur?
What was the cause for a given safety impact?
Which other parties (if any) were involved in these situations?
Aspect of interest: stress
Aspect of interest: smoothness
Aspect of interest: acceptance and behavior
How did the cyclist actually use the given infrastructure? (e.g., lane used, individual maneuvers)
How did the cyclist react to potential stressors or impacts to safety?
Did the cyclist’s behavior potentially induce stressful or risky situations?
2.2.5. Quantitative and Qualitative Social Sciences Data Acquisition
The social science approach in this context allows for assessment of human perception, opinion, attitude, and behavior based on empirical, qualitative and quantitative data. The field of social science methods is very broad and rich in different paradigms. Thus, we need to outline our understanding of social science, including its concepts, data and methods, in the particular context of evaluating interventions in the road space.
First and foremost, we distinguish between qualitative and quantitative data acquisition and associated methods.
Empirical data reflect knowledge from experience and sensory perception. Empirical research methods are, for example, surveys, observations, or interviews [40]. Data acquired by these methods can be either qualitative or quantitative.
Qualitative data help understand human behavior on a deeper level by enabling the study of underlying reasons, opinions, and motivations. Qualitative research aims at explaining ‘how’ and ‘why’ people behave as they do [41]. The sample size in qualitative research is typically small since the analysis focuses on explanations and facets of a certain phenomenon, e.g., strategies of coping with perceived unsafety in traffic. The sampling follows theoretical considerations (e.g., maximum variety of perceptions) [42]. Methods to collect qualitative data are, for example, in-depth interviews, focus group interviews, observations, and unstructured questionnaires using open-ended questions. The analysis methods in qualitative research are diverse and the results range from dense descriptions on a manifest level to the representation of latent levels that emerge from an in-depth analysis. Open guideline interviews can be used to collect data for analyses that deliver results on a manifest level, e.g., using thematic content analysis [43]. Such an approach allows for more detail than standardized surveys and enables extending the range of topics and aspects covered.
Quantitative data originate from quantitative research, which is used to quantify preferences, opinions, facts or behaviors and to generalize results from a larger sample population. Methods to collect these data are, e.g., surveys, structured questionnaires, and online polls using closed-ended questions. The analysis procedures include statistical methods such as descriptive, correlative and difference analyses. Representative sampling aims to accurately reflect the characteristics of the population.
Social science methods by design involve and target subjective or socio-structural aspects. While they enable insights into individual perception, experience and behavior that cannot be captured by technical sensors, cognitive bias may exist in individual responses and sample bias needs to be avoided.
Domain questions that we can answer by using this method:
Aspect of interest: safety
How do cyclists experience safety in relation to the infrastructure available?
Which aspects of the built environment or temporal interaction with space and other road users do they perceive as safety threat?
Aspect of interest: stress
What is the overall stress level experienced by a study participant?
When and where did a study participant feel stressed?
What are the stressors and reasons for experienced stress stated by study participants?
Aspect of interest: smoothness
How smooth do cyclists perceive their ride to be on a given infrastructure?
What are the impacts to smoothness they experience?
How important do cyclists rate the factor of smoothness regarding their own mobility behavior?
Aspect of interest: acceptance and behavior
What is the individual’s personal mobility behavior?
Which influencing factors do individuals state regarding mode choice and route choice? For example, how do individuals rate the impact of (perceived) issues with regard to safety, smoothness or experienced stress on their mobility behavior?
Which other intrinsic and extrinsic factors do they see as primary push- and pull-factors regarding their use of active mobility?
2.2.6. GIS Analyses and Data Describing the Static (Built) Environment
The method set of GIS enables various options for relating data by location, for retrieving objective information on network structure and for adding information on spatial context. However, the quality and explanatory value of results from geospatial analysis ultimately depend on the type and characteristics of input data.
While GIS methods are highly objective, bias and uncertainty may still be introduced. This can occur in the process of selecting input layers and when integrating potentially subjective aspects, such as those prevalent in the definition of place. Therefore, these aspects need to be considered in method design, data acquisition and implementation.
Within the domain context of this research, GIS methods and spatial data sets contribute objective information on available road and cycling infrastructure. Authoritative or crowd-sourced network data provide a digital representation of available roads, cycling and pedestrian infrastructure and their spatial configuration. Based on these data, quality indicators regarding infrastructure suitability for cycling and walking can be derived [44] that serve as a basis for computing realistic routes. Morphological analyses use these routes as an input to determine catchment areas and help better understand impacts of an intervention at the population as well as the individual scale. The potential and quality of analyses based on network data sets are highly influenced by the attributes available per edge (road segment) and their respective consistency and quality.
Domain questions that we can answer by using this method:
Aspect of interest: safety
Aspect of interest: acceptance and behavior
How important is the intervention site for bicycle traffic within a given area?
For which route relations (origin to destination) does the intervention have an impact?
How many residents live in the catchment area of the intervention?
How does infrastructure suitability for cycling within the intervention area compare to overall infrastructure suitability within the whole catchment area?
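The catchment question above can be sketched on a toy network. The example below is an illustration under simplifying assumptions, not the procedure used in this research: the graph, the edge lengths in metres, the 1000 m network distance limit, and the resident counts per node are all invented, and plain edge length is used as cost, whereas a realistic analysis would weight edges by a suitability indicator.

```python
import heapq


def build_graph(edges):
    """Build an undirected adjacency list from (node_a, node_b, length_m)."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, []).append((b, w))
        graph.setdefault(b, []).append((a, w))
    return graph


def network_distances(graph, source):
    """Shortest network distance from source to every reachable node (Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # outdated heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def catchment(graph, source, max_dist_m):
    """Nodes reachable from the intervention site within max_dist_m."""
    return {n for n, d in network_distances(graph, source).items()
            if d <= max_dist_m}


# Toy network and population figures (hypothetical values).
graph = build_graph([("A", "B", 400), ("B", "C", 500),
                     ("A", "D", 1200), ("C", "D", 300)])
residents = {"A": 120, "B": 300, "C": 80, "D": 500}

area = catchment(graph, "A", max_dist_m=1000)
residents_in_catchment = sum(residents[n] for n in area)
```

Here node "A" stands for the intervention site; node "D" lies 1200 m away along the network and therefore falls outside the assumed catchment.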
2.3. Linking Data Sources and Semantic Enrichment
For linking the diverse data sources, we propose a systematic approach based on semantic categorization. We identified five stages of data and information flow and semantic enrichment, which are illustrated in Figure 2: Originating from a data source or sensor (1), one or more data sets (2) are obtained that, in conjunction with additional context, form a semantically enriched information layer (3) derived from a single data source. By joining this information layer (3) with data sets and information layers from other data sources, integrated information layers (4) are generated that again raise the semantic value. In the last stage, these integrated information layers (4) are assessed within mixed methods approaches to generate multifaceted insights into the thematic aspects of interest (5). This last step again facilitates semantic enhancement.
Joining data and information layers is enabled through the availability of common semantic keys. Within the given thematic context, we identified these keys as time, location, person, and theme. Each data source and derived information layer bears at least one of the primary keys (time, location, person) and a theme. In every stage of linking information layers, the common primary keys are then used to join layers. The semantic delta between the thematic keys of the involved layers forms the basis for information gain. For example, one may consider two information layers regarding stress: measured physiological body response and reported stress of study participants. The semantic delta in this case is formed by aspects such as perception and cognitive processes as well as causal reasoning. Considering the case of measured lateral distances and reported perceived safety, the semantic gap is even larger, as two potentially related but very distinct themes are addressed. However, through assessing both information layers in conjunction, insights into their co-existence and co-influence are generated. In order to support identification of common semantic deltas that arise due to, e.g., internal vs. external view, dynamic vs. static data and information, or individual vs. collective assessment, we provide a schematic at the end of Section 3.1. This schematic view can also guide practical implementation as detailed in Section 3.2.
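The joining principle described above can be illustrated with a minimal sketch. The `Layer` structure, the field names, and the example records are hypothetical and serve only to show how two layers with different themes are joined on the primary keys they share. Keys are matched by exact equality here, which is a strong simplification: in practice, time and location would be matched within tolerances or via aggregation to common spatial and temporal units.

```python
from dataclasses import dataclass

PRIMARY_KEYS = ("time", "location", "person")


@dataclass
class Layer:
    theme: str     # thematic key, e.g., "measured_stress"
    keys: tuple    # primary keys carried by every record
    records: list  # dicts holding the primary keys and a "value"


def common_keys(a, b):
    """Primary keys that both layers carry, in a fixed order."""
    return tuple(k for k in PRIMARY_KEYS if k in a.keys and k in b.keys)


def join_layers(a, b):
    """Join two layers on their shared primary keys (exact matching)."""
    keys = common_keys(a, b)
    if not keys:
        raise ValueError("layers share no primary key")
    index = {}
    for rb in b.records:
        index.setdefault(tuple(rb[k] for k in keys), []).append(rb)
    joined = []
    for ra in a.records:
        for rb in index.get(tuple(ra[k] for k in keys), []):
            merged = {k: ra[k] for k in keys}
            merged[a.theme] = ra["value"]  # both themes are kept side by side
            merged[b.theme] = rb["value"]
            joined.append(merged)
    return keys, joined


# Hypothetical example: sensor-derived stress vs. stress reported post hoc.
measured = Layer("measured_stress", ("time", "location", "person"), [
    {"time": 10, "location": "seg1", "person": "p1", "value": True},
    {"time": 25, "location": "seg2", "person": "p1", "value": False},
])
reported = Layer("reported_stress", ("location", "person"), [
    {"location": "seg1", "person": "p1", "value": "stressed"},
    {"location": "seg2", "person": "p1", "value": "relaxed"},
])
keys, joined = join_layers(measured, reported)
```

Because the reported layer carries no time key, the join falls back to location and person; the two themes then sit next to each other in every merged record, so that agreement and divergence between them can be examined.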
Following the conceptual foundation outlined above, we developed a framework for integrating various data and information sources in the context of assessing infrastructure interventions for cycling. The resulting knowledge graph is presented in Section 3.1. For practical implementation of this framework, different spatial and temporal references and possibilities for aggregation have to be adequately considered. This is further detailed in Section 3.2.
2.4. Integration in Mixed Methods Designs
Besides the given conceptual foundation on how to join different data and information layers, the question of how to integrate these building blocks into mixed methods approaches arises. As outlined in
Table 1, some data have to be acquired in situ during cycling within a case study, while other data may be acquired and assessed post hoc or independently of a field study. While this might give a first indication regarding temporal succession, different mixed methods designs may be applied depending on the specific questions to be answered. In different settings, some elements of data acquisition may be conducted multiple times or even left out. Therefore, we do not aim to provide a full set of possible mixed methods designs within the scope of this paper, but rather point to the basic designs that are relevant. More complex mixed methods designs can then be developed by combining the basic designs for a specific purpose, based on the respective research questions and on the individual selection of available data inputs and methods.
Following Creswell [
45], there exist three basic mixed methods designs:
Convergent design,
explanatory sequential design and
exploratory sequential design. These are characterized by the sequence in which quantitative and qualitative stages are conducted. Consequently, this sequence also determines the purpose of conducting the qualitative and quantitative parts of the research. An exploratory sequential design uses qualitative methods first to explore the broad range of a certain topic in order to design a quantitative assessment that is conducted at a later stage. In contrast, an explanatory sequential design uses qualitative methods to explore findings from a quantitative assessment, e.g., to find potential explanations for the results. The third, convergent design, applies quantitative and qualitative methods in parallel and links the results afterwards. This means that this design neither allows for nor requires that results of one stage influence the conceptualization of the other.
All three basic mixed methods designs are possible options for designing a study using the proposed conceptual framework. In practice, almost certainly more complex designs with several assessments informing other (later) assessments will arise. This is fostered by the availability of numerous quantitative as well as qualitative methods in our proposed framework, as this setting goes beyond the common examples from social sciences with one quantitative and one qualitative method involved.
To conclude this section, we want to stress the importance of thorough conceptualization of such an advanced mixed methods design based on the specific requirements. The proposed conceptual framework for joining data and information layers can support this process through clear representation of links between the building blocks. We therefore propose the following workflow to derive an advanced mixed methods design in this context:
Precise definition of (research) questions to be answered
Determining data sources and methods that can provide insights regarding the aspects of interest
Identification of possible temporal stages and basic mixed methods designs that are applicable
Developing a draft for the advanced mixed methods design
Ensuring that all necessary keys for joining data and information layers are available per data source
Implementation of the study
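The fifth workflow step, ensuring that the necessary join keys are available, could be supported by a simple check like the following sketch. The listed sources, their key sets and the planned joins are hypothetical assumptions chosen for illustration.

```python
# Illustrative sketch: sources, key sets and planned joins are assumptions.

PRIMARY_KEYS = {"time", "location", "person"}

# Keys assumed to be available per data source in a draft study design.
sources = {
    "lateral distance sensor": {"time", "location", "person"},
    "questionnaire": {"person"},
    "traffic counts": {"time", "location"},
}

# Pairs of sources whose layers the draft design intends to join.
planned_joins = [
    ("lateral distance sensor", "questionnaire"),
    ("lateral distance sensor", "traffic counts"),
    ("questionnaire", "traffic counts"),
]


def missing_join_keys(sources, planned_joins):
    """Return all planned joins whose sources share no primary key."""
    problems = []
    for a, b in planned_joins:
        common = sources[a] & sources[b] & PRIMARY_KEYS
        if not common:
            problems.append((a, b))
    return problems


problems = missing_join_keys(sources, planned_joins)
for a, b in problems:
    print(f"No common key between '{a}' and '{b}'")
```

A reported pair indicates that the draft design needs an additional key, for example by recording location in the questionnaire, before the study is implemented.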
3. Results
In this section, we outline the results from the approach described in
Section 2. First, we detail the conceptual integration represented by a knowledge graph. This is followed by results regarding technical implementation. The third subsection links back to the thematic starting point by stating hypotheses and domain questions that may be answered through the integrated mixed methods approach and which could not be answered by a singular method.
3.1. Conceptual Integration
If read from left to right, the knowledge graph shows the flow from data source via data and information layers to aspects of interest. See also
Figure 2 for the conceptual formulation of the stages involved. Small pictograms attached to data and information blocks represent the keys available for joining the respective layers. Insights are generated at the bold red lines that link aspects of interest or their sub-aspects. In
Figure 4 we provide a simplified excerpt from the full knowledge graph to give some guidance on how to read and use this graph. The left section of the diagram shows one exemplary information layer representing lateral distance during overtaking maneuvers, which is derived from measurements of lateral distance and push button markers that label overtaking maneuvers. By following the links to the green boxes, we can identify the joint thematic information layers it contributes to. For both aspects of interest, “
stress” and “
safety”, it contributes to the layer “
objective detected framing conditions”. These joint information layers are embedded within a network of several semantically connected information layers which are highlighted in the right section of
Figure 4 using the exemplary aspect of interest “
stress”. The semantic relations between these layers are represented as red lines, and their respective quality and meaning are provided as textual labels. For simplicity, this figure leaves out the numerous relations to other data and information layers that contribute to the joint information layers and aspects of interest. These can be assessed using the full knowledge graph provided online.
For
practical application of this theoretical framework in various settings, the graph may be read from right to left in order to focus on questions that need to be answered and their associated aspects of interest. As the necessary integrated information layers (4) per aspect of interest (5) are identified, the different input information layers (3) connected to them can be followed until reaching the underlying data source(s) (1) on the left end of the graph. To ease this process, it proved helpful to utilize an interactive tool for graph assessment which allows for filtering, e.g., for neighborhood to a selected element. Filtering reduces complexity and thereby eases readability. An example for such filtering based on neighborhood is shown in
Figure 4 (left).
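Reading the graph from right to left can be sketched as a backward traversal. The edge list below is a strongly simplified, partly hypothetical excerpt: the layer names follow the example in the text, while the data source names and the questionnaire branch are assumptions.

```python
from collections import deque

# Illustrative sketch: a simplified excerpt, edges point from left to right.
edges = [
    ("distance sensor", "lateral distance measurements"),
    ("push button", "push button markers"),
    ("lateral distance measurements", "lateral distance during overtaking"),
    ("push button markers", "lateral distance during overtaking"),
    ("lateral distance during overtaking", "objective detected framing conditions"),
    ("objective detected framing conditions", "stress"),
    ("objective detected framing conditions", "safety"),
    ("questionnaire", "reported stress"),
    ("reported stress", "individual perception"),
    ("individual perception", "stress"),
]


def upstream(edges, target):
    """Collect every node from which the target can be reached."""
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, []).append(src)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def data_sources(edges, target):
    """Return upstream nodes without incoming edges, i.e., the data sources."""
    has_parent = {dst for _, dst in edges}
    return sorted(n for n in upstream(edges, target) if n not in has_parent)


print(data_sources(edges, "stress"))
print(data_sources(edges, "safety"))
```

The set returned by `upstream` corresponds to a neighborhood filter restricted to the inputs of one aspect of interest.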
In addition to the knowledge graph, with
Figure 5 we provide a conceptual view on different categories of observations and how these data and information layers can be joined. While the knowledge graph focuses on semantic integration from a thematic perspective,
Figure 5 highlights different temporal and semantic scopes of input layers on a more generic level.
First, we distinguish between internal and external data and information. Internal refers to aspects related to an individual, such as demographics but also individual sensation and reactions. External refers to observations more closely related to the (built) environment, or which reveal how people interact with the environment as perceivable from the outside. For both categories, we found data and information that represent dynamic as well as static aspects; examples are given in
Table 2. The internal and external static layers can be described as context variables that may help explain and understand dynamic aspects or provide the basis for in-depth analysis, e.g., by generating subsets of dynamic input data based on values of static attributes.
For internal and external dynamic inputs, two different levels of aggregation can be chosen for further assessment. One option is to proceed at the individual level, which means assessing singular events during one cyclist’s ride or statistical results for a set of trips by one person. The other option is to aggregate the data of all study participants by location and to continue the assessment at this less detailed level. Such aggregation, however, removes the ability to relate other data and information layers based on person and time. In such cases, location becomes the only key for relating input layers. Which approach to choose in a practical use case depends on the questions to be answered as well as on the available resources.
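The two aggregation options can be contrasted in a short sketch. The records and field names are hypothetical. The point is only that the location-level result no longer carries person or time and can therefore be related to other layers by location alone.

```python
from collections import defaultdict
from statistics import mean

# Illustrative sketch: records and field names are assumptions.
records = [
    {"person": "P01", "time": 120, "location": "segment_A", "lateral_distance_cm": 100},
    {"person": "P01", "time": 300, "location": "segment_B", "lateral_distance_cm": 180},
    {"person": "P02", "time": 125, "location": "segment_A", "lateral_distance_cm": 140},
    {"person": "P02", "time": 310, "location": "segment_B", "lateral_distance_cm": 160},
]


def aggregate(records, key, field):
    """Average a field over all records that share the same key value."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record[field])
    return {value: mean(items) for value, items in groups.items()}


# Individual level: one summary value per person.
by_person = aggregate(records, "person", "lateral_distance_cm")
# Location level: all participants pooled; person and time are lost.
by_location = aggregate(records, "location", "lateral_distance_cm")

print(by_person)
print(by_location)
```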
The decision whether an observation should be acquired in situ or post hoc in general depends on the type of observation. For dynamic observations an in situ acquisition is required unless relying, e.g., on post hoc questionnaires or interviews. However, static observations as per their definition are less time-dependent and in consequence can be acquired independently before or after an in situ acquisition.
3.2. Technical Integration
The frame for technical integration is set through the decisions on conceptual integration as outlined in the previous section. The semantic keys for relating or grouping data and information layers are utilized to perform the respective task on the underlying data sets. Using person or time as key can in general be reduced to a data join operation on the respective attribute (person ID or timestamp). Still, the time attribute may contain relative information, as is prevalent in human descriptions provided in an interview. In such cases, the qualitative information needs to be manually converted into an absolute timestamp. The same issue applies when utilizing location as common key, for instance when respondents refer to places in describing their observations. Here, however, complexity is added through potentially subjective place names and the presence of different types of spatial features. The latter can occur, e.g., in an interview, depending on whether a precise location, an area or a vague definition of proximity is given.
Additionally, the potential impacts of the chosen temporal and spatial scale need to be considered.
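The conversion of relative time information and the subsequent join on time can be sketched as follows. The sketch assumes that an interview statement has already been manually coded as minutes after the start of the ride. The date, the sensor events and the one-minute tolerance are hypothetical.

```python
from datetime import datetime, timedelta

# Illustrative sketch: date, events and tolerance are assumptions.
ride_start = datetime(2023, 5, 4, 8, 0, 0)
tolerance = timedelta(minutes=1)

sensor_events = [
    {"timestamp": datetime(2023, 5, 4, 8, 4, 40), "lateral_distance_cm": 90},
    {"timestamp": datetime(2023, 5, 4, 8, 12, 0), "lateral_distance_cm": 170},
]


def to_absolute(ride_start, minutes_after_start):
    """Convert a manually coded relative time into an absolute timestamp."""
    return ride_start + timedelta(minutes=minutes_after_start)


def nearest_within(target, events, tolerance):
    """Return the event closest in time to target, or None if none is within tolerance."""
    best = min(events, key=lambda e: abs(e["timestamp"] - target), default=None)
    if best is None or abs(best["timestamp"] - target) > tolerance:
        return None
    return best


# "About five minutes after the start, a car passed very closely."
statement_time = to_absolute(ride_start, 5)
match = nearest_within(statement_time, sensor_events, tolerance)
print(match)
```

The tolerance makes the effect of the temporal scale explicit: a wider window links more statements to sensor events but also raises the risk of false matches.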
3.3. Hypotheses and Domain Questions
As a result of applying the proposed diverse set of methods, numerous data sets encompassing vast amounts of data are generated. To efficiently handle these large amounts of data and to generate information and knowledge from the data, a structured process and strategy for data analysis is crucial. One possible approach to guide the process of analysis in a structured way is to utilize ex-ante hypotheses. As hypotheses may target different semantic levels, we defined three different levels of interest for clustering hypotheses on a theoretical level. These levels are designed to be generic and applicable to assessment of various thematic aspects, also beyond the present context of active mobility. We defined the three levels of hypotheses as follows:
Level 1: USE CASE
This level comprises hypotheses that are designed to answer questions regarding the specific thematic focus of a study. For our present research context, these questions target the effects of infrastructure interventions on different factors such as stress, safety and acceptance.
Level 2: METHOD
Within this level, hypotheses focus on the applied set of methods. It targets questions regarding the integration of different methods to assess the added value of a mixed methods approach. Hypotheses can foster validation or the gain of detailed knowledge regarding singular methods as well as their combined use.
Level 3: DATA
This level encompasses questions regarding the meaning of results within the respective thematic frame as well as potential influencing factors and their relevance and impact for inferring conclusions. Exemplary questions are “How can the data be interpreted?” or “Which influencing factors have to be considered—and how?”
We can subdivide the thematic, use-case-specific hypotheses (level 1) further by attributing them to the respective aspect of interest that they address. For example, a hypothesis could be “the infrastructure at location XY prevents overtaking distances lower than 150 cm”. This can be directly attributed to the aspect of interest safety.
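As a minimal sketch, the exemplary level 1 hypothesis can be checked against a set of measured overtaking distances. The values are hypothetical. The 150 cm threshold is taken from the example above.

```python
# Illustrative sketch: the measured values are assumptions.
THRESHOLD_CM = 150


def evaluate_hypothesis(distances_cm, threshold_cm=THRESHOLD_CM):
    """Summarize how many overtaking maneuvers fall below the threshold."""
    if not distances_cm:
        raise ValueError("no overtaking maneuvers recorded")
    below = [d for d in distances_cm if d < threshold_cm]
    return {
        "n": len(distances_cm),
        "n_below": len(below),
        "share_below": len(below) / len(distances_cm),
        # The hypothesis is supported only if no maneuver is below the threshold.
        "supported": not below,
    }


result = evaluate_hypothesis([162, 175, 148, 190])
print(result)
```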
Following up on this example, if distance measurements are assessed in conjunction with statements on perceived safety or physiological measurements that reveal moments of stress, more advanced conclusions are possible. Such assessment facilitates better understanding of how passing distances affect subjective safety, physiological stress reaction and systemic effects. In consequence, this allows for answering hypotheses regarding the methods used, their individual contributions and their value within mixed methods settings (level 2).
When adding more context such as information on personal familiarity with the infrastructure at hand, frequency of use or the temporal pattern of traffic flow, hypotheses regarding the data level (level 3) can be answered. It allows for assessing the relativity of effects found for an intervention.
3.3.1. Questions That Can Be Answered Using the Mixed Methods Approach
In this section we provide an exemplary set of questions that the mixed methods approach enables in addition to the method-specific abilities documented in
Section 2.2. Due to the high number of possible combinations of methods it is not feasible to provide an all-encompassing list.
Level 1: USE CASE
How stressful is the infrastructure for cyclists and what causes stress at the given infrastructure? (objectively measured physiological stress reaction, questionnaire, interview)
How do individual aspects such as stress, safety and smoothness contribute to infrastructure usage and mobility behavior in the given environment? (questionnaire, interview, morphological network analysis, traffic counts)
How do passing distances affect perceived safety and stress at the given infrastructure? (measured lateral distances, measured moments of stress, questionnaire, interviews)
Where and why is smoothness of travel impacted at the given infrastructure? (movement trajectories, questionnaire, interview)
Level 2: METHOD
How well do measured physiological stress reactions align with subjectively stated stress? What are potential explanations in cases of mismatch? (objectively measured physiological stress reactions, questionnaire, interview)
How does experienced stress in general contribute to perceived safety and acceptance of infrastructure? (objectively measured physiological stress reactions, questionnaire, interview)
How do passing distances influence perceived stress, safety and acceptance of infrastructure? (objectively measured lateral distances, questionnaire, interview)
How does smoothness influence perceived stress, safety and acceptance of infrastructure? (movement trajectories, questionnaire, interview)
Can quality indices derived from infrastructure characteristics as represented in digital network models adequately indicate perceived safety, stress and/or acceptance of infrastructure? (GIS analysis, objectively measured physiological stress reactions, questionnaire, interview)
Level 3: DATA
What are further factors besides stress, safety and smoothness that influence acceptance and mobility behavior? (all methods involved)
Which additional parameters have significant influence on perceived safety, stress and/or acceptance of infrastructure besides the factors commonly represented in digital network models?
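The first level 2 question above, on the alignment of measured and stated stress, can be approached with a simple comparison per person and location. The sketch below assumes that both layers have been reduced to a binary indication of stress; all values are hypothetical.

```python
# Illustrative sketch: both layers and all values are assumptions.
# Keys are (person, location); values state whether stress was indicated.
measured = {
    ("P01", "segment_A"): True,
    ("P01", "segment_B"): False,
    ("P02", "segment_A"): False,
    ("P02", "segment_B"): False,
}
reported = {
    ("P01", "segment_A"): True,
    ("P01", "segment_B"): True,
    ("P02", "segment_A"): False,
}


def compare(measured, reported):
    """Return the share of agreeing cases and the list of mismatching keys."""
    common = sorted(set(measured) & set(reported))
    if not common:
        return None, []
    mismatches = [key for key in common if measured[key] != reported[key]]
    agreement = (len(common) - len(mismatches)) / len(common)
    return agreement, mismatches


agreement, mismatches = compare(measured, reported)
print(agreement, mismatches)
```

The mismatching cases are the ones for which qualitative material such as interviews may offer potential explanations.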
3.3.2. Identification of Potential Proxies
Using the knowledge graph presented in
Section 3.1, methods that may potentially serve as proxies for other methods can be identified. In order to locate candidates that provide similar insights as another method, all information layers connected to an aspect of interest or its specific integrated information layer need to be traced back to identify the method used for its generation. If two methods are connected to one aspect of interest but two different integrated information layers, it is important to reflect on the semantic difference between these layers. For example, within the aspect of interest
safety,
measured lateral distances contribute to
objective detected framing conditions, whereas post hoc
reported details on overtaking maneuvers contribute to
individual perception. So, while both information layers contribute to the same aspect of interest and have a common thematic focus, they describe different facets: one consists of objective measurements of distances, the other represents the subjective perception of these situations. Whether one method can suitably replace another depends on the individual purpose of an assessment. Therefore, we do not aim to provide a generic set of potential proxies, but rather a toolset that enables purpose-specific decisions.
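The tracing described above can be sketched as a lookup over the contributions of methods to integrated information layers. The first three entries follow the example in the text. The video-based method is a hypothetical addition that is not part of the proposed framework.

```python
from itertools import combinations

# Illustrative sketch: (method, aspect of interest, integrated information layer).
contributions = [
    ("measured lateral distances", "safety", "objective detected framing conditions"),
    ("reported details on overtaking maneuvers", "safety", "individual perception"),
    ("measured lateral distances", "stress", "objective detected framing conditions"),
    # Hypothetical method, added only to show a same-layer candidate.
    ("video-based distance estimation", "safety", "objective detected framing conditions"),
]


def proxy_candidates(contributions, aspect):
    """Pair all methods linked to an aspect and flag whether they share a layer."""
    entries = sorted((m, layer) for m, a, layer in contributions if a == aspect)
    result = []
    for (m1, layer1), (m2, layer2) in combinations(entries, 2):
        relation = "same layer" if layer1 == layer2 else "different layers"
        result.append((m1, m2, relation))
    return result


candidates = proxy_candidates(contributions, "safety")
for m1, m2, relation in candidates:
    print(f"{m1} <-> {m2}: {relation}")
```

Pairs flagged as "different layers" are those for which the semantic difference needs to be reflected before one method is considered as a proxy for the other.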
4. Discussion
With this research, we were able to answer the research questions defined in
Section 1.2 as follows:
How can we integrate different data and information sources conceptually and technically? This question covers the integration of different data types (quantitative and qualitative), as well as different temporal (in situ and post hoc) and spatial references.
With this research we propose a theoretical framework that facilitates semantic integration of various data and information layers on a conceptual level. This framework is represented as a knowledge graph, which allows for individual extension towards additional methods, input layers and aspects of interest. Furthermore, we provide a conceptual reference for technical implementation, highlighting different options for aggregating and joining data and information layers. By successfully conducting first field studies that implement the theoretical framework, we were able to show its feasibility. Details and results of these field studies will be published separately.
Which questions regarding an infrastructure intervention can we answer by using single methods and for which questions do we need a combination of methods?
We found several questions that may be answered by singular methods as presented in
Section 2.2, as well as questions that require an integrated mixed methods approach as outlined in
Section 3.3.1. Additionally, we provide three generic levels for clustering hypotheses that benefit from an integrated mixed methods approach.
How to identify input measurements that may be used as proxy for other measurements?
We outlined how to utilize the knowledge graph that represents a main result of this research to facilitate identification of methods that may serve as proxy for other methods.
4.1. Extending the State of the Art in Mixed Methods Mobility Research
We see our main contribution in providing a generic, extensible framework that offers guidance for developing and implementing case-specific mixed methods designs. Within the applied domain of assessing infrastructure for active mobility, it supports the process of selecting appropriate methods to gain multifaceted insights. With this, we strengthen the current paradigm shift towards evidence-based planning. Moreover, the proposed approach paves the way for holistic assessments and monitoring in the re-design of road spaces. This is a crucial building block for a solid evidence base that informs future projects. Currently, technical norms are exclusively decisive in planning processes, whereas experiences from previous projects, including evaluation by users, are hardly ever considered explicitly. This is mainly due to the absence of objective evaluations. As we showed how a new set of hypotheses and domain questions may be answered through advanced mixed methods settings, we strengthen evidence for their future application in planning for all road users, particularly for active modes.
In context of the existing literature as summarized in
Section 1.1 our results contribute to advancing the capabilities of mixed methods approaches in active mobility research. The framework facilitates new, more complex combinations of methods and input layers. Consequently, it supports moving towards multifaceted assessment for better understanding complex real-world mobility interactions with the environment.
From interpreting the knowledge graph it becomes evident how single methods can contribute important details on very specific aspects, while the combination and integration of methods allows for a comprehensive and diverse assessment.
4.2. Better Understanding of Cycling Mobility
For the application domain, with our present focus on cycling mobility, we showed clear advantages of the systemic mixed methods approach. By integrating various perspectives, it adequately reflects the multifaceted and complex real-world mobility interactions. Furthermore, methods and input layers that partly overlap thematically reduce the risk of misinterpreting a single data or information source as well as the risk of bias. With the given framework, both explanatory and exploratory approaches are possible, and in an ideal scenario both are combined. Consequently, multifaceted mixed methods approaches support evidence-based and inclusive planning and may thereby strengthen communication and citizen participation as important elements of democratic societies.
4.3. Limitations
Our proposed framework comes with an exemplary set of methods suitable for assessing interventions for cycling mobility. However, this is not all-encompassing. The framework is meant to be easily extensible to additional methods and adjacent use cases. Following its generic and extensible structure, it does not represent a full domain ontology that may be used as a direct manual for step-by-step implementation of an applied case study focused on a specific set of questions. For this, additional transfer steps are required as outlined in
Section 2.4. The methodology presented in this paper provides guidance for specifying such studies.
We regard the epistemological comparison of single-method and mixed methods approaches as a major contribution of this paper. However, it is not possible to go beyond a qualitative description of the added value. Given the diversity of methods, data sets and their prevailing semantic gaps, defining a consistent and universal set of quantitative indicators for systematic assessment is not feasible. Such systematic assessment is expected to require numerous case studies that implement the theoretical framework for specific applications. Thus, this exceeds the scope of the present paper.
4.4. Future Research and Application
The feasibility of applying an advanced mixed methods study in practice may be hindered by limited budgets for planning in combination with the high effort such studies require. Besides requiring specific expertise and, depending on the method, a substantial amount of time for applying each individual method, the integration of methods adds further complexity, resulting in higher costs. An extended level of knowledge, acceptance and understanding of each other’s methods and their respective contribution to the overall research aim is required. Achieving this level may be a time-consuming process when no common denominator or domain language exists for the parties involved. This is where a generic framework such as the one presented in this publication, which provides guidance for planning and implementing domain-specific mixed methods studies, may help streamline the process.
Despite potentially higher costs compared to applying a single method, an advanced mixed methods study is capable of providing multifaceted and more detailed insights. As it helps to better understand potential interrelations and additional influencing factors, it substantially improves the quality of conclusions drawn, e.g., regarding the effects of a specific intervention. As a consequence, the limited budget is spent more effectively, as potentially adverse effects of an intervention can be detected and planning of future interventions can benefit from such learnings. The potential benefits can be further increased when applying such studies in structured pre- and post-evaluations.
Concluding from our results and the experiences from collaborating with domain experts from various backgrounds in mixed methods studies, we see strong benefits for scientific discourse and mutual learning in such settings. Bridging domains, linking very different paradigms and methodologies, and finding common ground regarding (domain) language and methods not only helps to better understand other disciplines, but also enriches work within one’s own discipline. For example, synergies may be found, and critical reflection on one’s own assumptions, methods and their application can lead to fruitful new ideas and methodological advances.
Very immediate research directions are to extend the framework for serving specific use cases and to add further data and information sources as well as additional aspects of interest.
We see a major demand for future work in the implementation of more complex, advanced mixed methods studies for assessing various aspects of interventions in road space. This targets two research directions. First, it supports developing a set of standardized mixed methods designs that can easily be reproduced and that generate comparable results. This may serve the further advancement and dissemination of such methodology. Second, having standardized method sets and implementation guidelines available can foster a better understanding of mobility interaction on a generic level. By generating results for various implementation sites with very different framing conditions, the potential knowledge gain can be maximized.
In conjunction with the development of standardized mixed methods designs and implementation guidelines, we see high value in defining indicators that can be used to benchmark methods and implementations. As individual applications have their specific requirements, domain questions, and methods, the definition of such indicators must be integrated into the process of creating purpose-specific mixed methods designs.
Another research direction is the assessment of costs and benefits associated with each method when used stand-alone and in combination with other methods. This is related to finding proxies for input layers that may be used as fallback option in case that costs for a study need to be reduced. However, such assessment needs to clearly state which implications the substitution of a method has for the overall methodology and interpretation of results. Furthermore, we see great value in assessing costs and benefits of individual methods and different mixed methods settings for enriching results obtained in this research with quantitative data.