1. Introduction
With robotaxis already on the roads in several Chinese and U.S. cities, their spread to other locations, notably in Europe, is a matter of time and depends on factors such as customer adoption, political support, and technological availability, among others.
As highlighted by [1], the deployment of a shared autonomous vehicle (SAV) service in urban areas promises large benefits. For instance, in the Lisbon area, mobility needs would be satisfied with just 10% of the current car parks if SAVs replaced existing passenger cars. Moreover, such a replacement would mean a 27% drop in CO2 emissions, and all surface parking would be eliminated.
Given how significant the effects forecast in 2017 were, it may seem surprising that SAVs are not yet ubiquitous, since changes with large benefits tend to be adopted quickly. The reason is that obstacles remain, or that pre-conditions for the deployment of SAVs are not yet in place.
Refs. [2,3] built a reference model with the main concepts sustaining an SAV service and demonstrated it with the existing Phoenix ride-hailing service operated by Waymo and the Beijing ride-hailing service provided by Baidu Apollo. Both real examples validated the twenty-nine concepts of the reference model, adding four more in Phoenix and another two in Beijing. Altogether, the referred research work identified 35 concepts that should exist to have a ride-hailing service in an urban area.
The reference model, demonstrated and fine-tuned with real examples, is solid because it is based on scientific research (papers) and processes (systematic literature review and topic modeling). However, because the papers produced in this fast-growing research area and the two cited examples do not fully coincide and are continuously changing, more validations are needed.
Ref. [4] stated that demonstrating an artifact in more than one situation strengthens that artifact. Ref. [5] suggested that the reference model has to be verified in more than one way, and Ref. [6] discussed methodological field trials based on surveys for quality assurance.
In this case, listening to potential stakeholders about each of the concepts that constitute the reference model of such a digital ecosystem is a step forward in model validation.
Therefore, a group of respondents was asked to answer a questionnaire built to validate the 35 concepts. The sample comprised professionals working directly or indirectly in the mobility industry, that is, people who could theoretically participate in the digital ecosystem that sustains an SAV service in an urban area.
This new validation resulted in a much more reliable group of concepts, which were then modeled in ArchiMate to produce a revised reference model for SAVs in urban areas. The goal is for this model to serve as a decision-making tool that SAV stakeholders can adopt in their strategic options, contributing to the quicker adoption of this new type of service.
The rest of this paper is organized as follows. Section 2 presents the research background. Section 3 describes the survey-based research method. Section 4 presents the results, Section 5 discusses them, and Section 6 presents the updated reference model. Finally, Section 7 presents our conclusions, including the main contributions, limitations, and future work.
2. Research Background
According to [7], the term “survey” is often used to describe a technique for obtaining data from a group of people in order to acquire information about the larger population from which the sample was drawn.
Continuing with [7], there are many types of surveys and many reasons to use them, although some features are common. A survey is not a census, because it obtains data from only a small number of individuals. In a bona fide survey, the sample is not selected haphazardly or only from individuals who volunteer to take part; it is methodically selected so that everybody has the same chance of being selected. Therefore, the outcomes can be reliably projected to the larger population. Data are obtained using standardized questions so that all respondents answer the same questions.
The goal of a survey is not to describe an individual but to obtain a statistical profile of the population. The people who answer the survey are never identified, and the outcomes are presented as summaries such as statistical tables or charts. The sample size depends on the desired reliability, so there is no universal rule for sample size.
For [8], surveying populations is an appropriate research method, often used for non-experimental descriptive purposes that aim to characterize a population. For example, survey research can be used to establish the frequency of a specific feature. Similarly, a survey approach is often used to obtain data regarding attitudes and behavior. Some topics are better addressed by a traditional trial, in which the respondents are randomly assigned to an intervention group or a control group. In practice, such a design is not easy to achieve: there may be ethical and practical reasons why the respondents should not be randomly assigned, and it may be impossible to identify a control group because the randomization process is hard to control.
Staying with [8], surveys can offer both internal and external validity, are well-organized, can cover geographically wide samples, can present ethical advantages, and are flexible. On the other hand, surveys depend on the sampling frame, do not always explain why people behave as they do, and, when based on interviews, depend heavily on the interviewers’ effectiveness.
Still according to these authors, a survey is a type of research design, whereas interviews and questionnaires are techniques for collecting information. There are many ways to obtain people’s data, of which the three principal ones are: (a) face-to-face interviews; (b) telephone interviews; and (c) questionnaires. The right approach depends on several aspects: contact with the respondents; learning level; theme; incentive; and existing resources.
According to [8], questionnaires are less expensive than individual interviews, and they are faster for large and widely distributed samples. As with telephone interviews, a postal survey can be valuable if the respondents are dispersed. However, because there is no direct contact, careful prior design and layout of the questionnaire are critical.
Ref. [9] discussed the questionnaires used with a sample. For the author, a questionnaire is a set of standardized questions that operationalize a number of constructs. The objective is to present a uniform stimulus to respondents so that their answers are comparable. There is evidence that small variations in the wording or order of questions are enough to considerably affect the answers, which means that questions should be precisely phrased and presented in a fixed order to gather comparable information. The same author notes that people usually do not answer the intended meaning of a question, but instead answer what they think it was meant to ask.
Nowadays, surveys can be web-based. Ref. [10] states that collecting information online allows researchers to reach people from a wide variety of locations at low cost. Online surveys provide exceptional opportunities for obtaining information. They are particularly suitable for collecting primary information as well as for pretesting the research design and the respondents’ understanding of the questions.
Many software packages can support survey analysis, and Statistical Product and Service Solutions (SPSS) is a common option. SPSS is a user-friendly, Windows-based statistics package that offers users a point-and-click way to produce output. It also includes advanced features that allow users to run advanced statistical procedures. Researchers may also use the syntax editor to write “code” for precise control over an analysis, in contrast to the point-and-click way of producing output [11].
Taken together, given the available options, a web-based survey was used to understand what experts think about the identified concepts, and the subsequent statistical treatment was performed in SPSS.
The ArchiMate Specification, a standard of The Open Group, is an open and independent modeling language for Enterprise Architecture that is supported by different tool vendors and consulting firms. It provides instruments that enable Enterprise Architects to describe, analyze, and visualize the relationships among business domains in an unambiguous way [12].
3. Research Method
3.1. Sample
Sampling means selecting a subset of the persons that belong to a population in order to estimate the features of the entire population. With this method, the costs are lower and the information gathering process is faster [13].
In brief, the idea was to build a stakeholder sample that is relevant in terms of both quantity and quality (real decision makers). The sample should include people who, at this stage, are thought to represent the elements of the reference model updated with Phoenix and Beijing. These decision makers would then be approached through a structured questionnaire for a final check of the improved list of concepts.
Therefore, an online survey for collecting primary data was conducted to validate the hypotheses created. The platform used for this purpose was Qualtrics, a web-based survey platform that provides tools for building and distributing surveys, analyzing responses, and generating reports. It also offers a user-friendly interface and a wide range of features for creating and conducting surveys [14].
The objective was to verify whether the twenty-nine reference model concepts, together with the four new concepts found in Phoenix and the two others identified in Beijing, were confirmed by potential SAV stakeholders. The respondents were asked for some general information about themselves (to check whether they fulfilled the conditions to take part in the desired sample) and about the relevance of each of the thirty-five identified concepts.
The target population consisted of top-tier decision makers (C-level executives, managers, directors, etc.) of various industries, ages, genders, and nationalities. The survey was sent to personal and non-personal contacts via social media (LinkedIn), direct e-mails, and WhatsApp to ensure that the targeted population was effectively reached. In other words, the sample was not randomly selected; instead, the respondents were chosen based on defined criteria: people with decision-making capacity, from industries that potentially support SAVs, from several countries, of various ages, and including both men and women.
Participants were asked to express their level of agreement with each statement by using a rating scale from 1 (strongly disagree) to 7 (strongly agree). Through the survey, the goal was to test 35 hypotheses, one per concept, namely whether each concept is considered relevant for the implementation of SAVs in urban areas.
3.2. Online Questionnaire
3.3. Respondents’ Data
The survey was answered by 122 respondents, but 34 did not finish the study, leaving 88 valid responses (participants were forced to respond to all items to avoid losing observations). Hundreds of people were approached, but the difficulty of obtaining responses was immense, especially because the survey was sent to top-tier managers.
Among the valid participants, 87.5% were male and 86.3% were aged between 41 and 60 years. Regarding nationality, 68.2% of the participants were from Portugal, and a total of 12 nationalities (USA, Brazil, UK, France, Germany, Italy, Mexico, Poland, Portugal, Slovakia, Spain, and Turkey) were included in this study. Of the 88 valid participants, 87.5% placed themselves in the top three levels of decision making on a scale from 1 (lowest) to 7 (highest). For this study, to ensure that the respondents were people with a strong decision-making capacity, only the 77 who reported a decision-making level of 5, 6, or 7 (on a Likert scale of 1 (lowest) to 7 (highest)) were considered (Figure 3).
In other words, requests were sent only to people with recognized decision-making capacity, and, on top of that, only the answers from respondents who confirmed that capacity were considered. This extra measure made the obtained sample more likely to match the targeted one: it filtered out, for instance, answers from a person who appears to make decisions but in fact does not, and it reduced the risk of the survey having been delegated to someone without decision-making capacity.
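The two-step screening described above (complete answers only, then a self-reported decision level of 5 to 7) can be illustrated with a minimal Python sketch. The `Response` record, its field names, and the toy data are hypothetical and are not the authors' actual data export or procedure; only the counts 122, 34, and 77 come from the text.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical survey record; the field names are assumptions."""
    respondent_id: int
    completed: bool
    decision_level: int  # self-reported, 1 (lowest) to 7 (highest)


def select_sample(responses, min_level=5):
    """Keep completed responses whose decision level is min_level..7."""
    return [
        r for r in responses
        if r.completed and min_level <= r.decision_level <= 7
    ]


# Toy data, for illustration only.
toy = [
    Response(1, True, 7),
    Response(2, True, 4),   # completed, but decision level too low
    Response(3, False, 6),  # did not finish the questionnaire
    Response(4, True, 5),
]
kept = select_sample(toy)

# Consistency check on the figures stated in the text.
started, unfinished, considered = 122, 34, 77
valid = started - unfinished          # valid responses
share_top_levels = considered / valid  # share with level 5, 6, or 7
```

With the stated figures, `valid` equals 88 and `share_top_levels` equals 0.875, which matches the 87.5% reported for the top three decision-making levels.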
Table 1 represents the frequency of the level of decision making of the respondents.
Respondents were also questioned about the industry in which they worked. From the 77 responses considered for this study, 33.8% worked in the automotive industry, 33.8% worked in the banking/finance services industry, and 15.6% worked in the mobility industry. Approaching people from the banking/financial services industry does make sense because it is often through captive finance companies that automakers can promote new mobility concepts. The respondents in insurance, law, consulting, and research were also people specializing specifically in the automotive/mobility industry.
In general, the people approached were decision makers in areas forecasted to be around SAVs. Overall, the idea behind the present survey was to approach potential SAV stakeholders, and the sample attained that goal.
Table 2 shows the industry frequency where the respondents worked.
In addition to their level of decision making and industry, the participants were asked about their job position, which resulted in several types of answers (Table 3).
For the purpose of this study, the positions were aggregated into three categories: C-level (top management), directors, and middle managers.
C-level executives, also known as C-suite executives, are high-ranking executives within an organization who typically hold titles beginning with the letter “C” such as CEO, COO, CFO, CMO, or CIO.
These individuals are part of the top leadership team and are responsible for making strategic decisions that impact the organization as a whole [15].
Directors are individuals who are appointed or elected to serve on the board. The board of directors oversees the company’s administration and planned strategy. Directors have fiduciary responsibilities to make decisions in the best interest of the company and its owners. They are involved in making high-level decisions, setting company policies, and providing guidance to the executive management team [15].
A middle manager is an individual within an organization who holds a management position and is typically responsible for overseeing a team or department within the company [15].
From the selected 77 responses considered for this study, 51.9% were C-level executives, 44.2% were directors, and 3.9% were middle managers.
Table 4 shows the frequency of the positions of the respondents.
In general, the sample included people relevant to the implementation of SAVs, such as a Rent-a-Car association leader, the mayor of a European capital city (which dominates an urban area), and the global chief digital officer of an automaker's captive finance company.
Regarding gender, women CEOs run 10.4% of Fortune 500 companies. Taking that as a feature of the population, the 12.5% share of female respondents, although far from a positive figure, is in line with the population. Additionally, according to the consulting firm Deloitte, women represent only 20% of the automotive workforce and less than 10% at the executive level, which means that a sample with 12.5% women is aligned with the gender distribution of the top-management executive market.
Regarding age, according to figures from the consulting firm Korn Ferry, in 2019 the average age of C-suite members in the U.S. market was 56 years, compared with an average of 49.3 years in our sample. The sample was therefore, on average, 6.7 years younger than the population. However, since C-suite members will make SAV decisions at some uncertain point in the future, the respondents will be older when those decisions are made, so the age gap is unlikely to be material.
It is also fair to question whether the sample is large enough to extrapolate statistical conclusions. Among other factors, the required sample size depends on the population size, in this case the number of decision-making executives at potential SAV stakeholders worldwide, which is a highly subjective number that is impossible to calculate.
The available literature is usually conservative about recommending a sample size [14]. Instead, it points out that if the statistics are descriptive, such as means or frequencies (as in the present case), then nearly any sample size will suffice, suggesting that a minimum of 100 elements is required for each major group or subgroup in the sample and that a sample of 20 to 50 elements is enough for each minor subgroup. Therefore, in this study, our final sample of 77 respondents is still within the boundaries defined by the author.
In this way, considering the size of the sample, its quality (40 C-level people), diversity (12 countries, 8 industries), gender, and age, to the best of our knowledge, our sample was valid for proceeding with the statistical analyses.
4. Results
To test the relevance of the 35 concepts, descriptive statistics and the one-sample T-test were employed to assess the significance of the respondents’ agreement with each concept. Descriptive statistics were used to compute the mean and standard deviation, offering a concise summary of the central tendency and variability in the participants’ responses.
The one-sample T-test was then applied to compare the obtained mean with a theoretical value, specifically 4 on a 1–7 Likert scale, which represents neutral agreement. This statistical approach enables an exploration of whether the companies, as indicated by the survey respondents, exhibited a noteworthy inclination toward the validation of the concepts or not. The use of the one-sample T-test facilitates an objective assessment of whether the observed means significantly deviate from neutrality, providing a robust foundation for conclusions.
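As an illustration of this test, the following minimal Python sketch computes the mean, the sample standard deviation, and the one-sample t statistic against the neutral midpoint of 4. The analysis in this paper was run in SPSS; the function name `one_sample_t` and the toy scores are assumptions made for the example and are not survey data.

```python
import math
import statistics

NEUTRAL_MIDPOINT = 4.0  # neutral point of the 1-7 agreement scale


def one_sample_t(scores, mu0=NEUTRAL_MIDPOINT):
    """Return (mean, sample sd, t statistic, degrees of freedom).

    t = (mean - mu0) / (sd / sqrt(n)), with sd the sample standard
    deviation (n - 1 in the denominator).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    if sd == 0:
        raise ValueError("t is undefined when all scores are identical")
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    return mean, sd, t_stat, n - 1


# Toy ratings for a single concept (illustrative only).
toy_scores = [5, 6, 7, 4, 6, 5, 7, 6]
mean, sd, t_stat, df = one_sample_t(toy_scores)
```

A positive `t_stat` means the sample mean lies above the neutral value. For readers working in Python rather than SPSS, `scipy.stats.ttest_1samp` with `popmean=4` and `alternative="greater"` should return the same statistic together with a one-sided p-value.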
To facilitate the analysis, we decided to categorize the hypotheses into four distinct groups: Strongly Supported, Supported, Partially Supported, and Not Supported at all, based on a threshold analysis derived from the mean scores of the respondents (Table 5).
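A threshold-based categorization of this kind can be sketched as follows. The cut-off values in `ASSUMED_THRESHOLDS` are placeholders chosen only for illustration; the actual thresholds are those defined in Table 5 and are not reproduced in this sketch.

```python
# PLACEHOLDER cut-offs on the 1-7 scale: NOT the values from Table 5.
ASSUMED_THRESHOLDS = [
    (6.0, "Strongly Supported"),
    (5.0, "Supported"),
    (4.0, "Partially Supported"),
]


def categorize(mean_score, thresholds=ASSUMED_THRESHOLDS):
    """Map a concept's mean score to one of the four support groups."""
    if not 1.0 <= mean_score <= 7.0:
        raise ValueError("mean score must lie on the 1-7 scale")
    # Thresholds are ordered from highest to lowest cut-off.
    for cutoff, label in thresholds:
        if mean_score >= cutoff:
            return label
    return "Not Supported at all"


# Hypothetical mean scores, for illustration only.
labels = [categorize(m) for m in (6.3, 5.0, 4.5, 3.2)]
```

Passing the thresholds as a parameter keeps the rule separate from the cut-off values, so the values from Table 5 can be substituted without changing the function.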
Table 6 shows a summary of the figures obtained through descriptive analysis, using the categorization explained above for an easier understanding.
Table 7 summarizes the results obtained.
Table 8 shows a summary of the one-sample T-test conducted for the questions.
The initial analysis of the descriptive statistics provided insights into the central tendency and variability of the responses for each concept. The subsequent one-sample T-test served as a statistical validation, supporting the initial analysis by confirming the level of support for each hypothesis. The combination of both analyses enhanced the robustness of the findings, offering a more comprehensive understanding of the respondents’ attitudes and perceptions related to SAV services in urban areas. In this case, the one-sample T-test corroborated and reinforced the conclusions drawn from the descriptive statistics, providing a more robust basis for the support levels of each hypothesis.
The high t-values observed in the one-sample T-test signify a significant deviation of the sample means from the hypothesized population mean, providing strong statistical evidence in favor of the hypotheses. The very low one-sided p-values, often below the conventional significance level of 0.05, further indicate the support for the acceptance of the concepts. Additionally, the positive mean differences signify a consensus among respondents in favor of the concepts being tested. Collectively, the statistical indicators contributed to the robustness of the findings, reaffirming the support levels derived from the descriptive statistics and underscoring the reliability of the conclusions drawn from the survey data.
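To make the link between a t-value and its one-sided p-value concrete, the sketch below integrates the Student t density numerically using only the Python standard library. It is an illustrative approximation under stated assumptions, not the routine used by SPSS; the function names `t_pdf` and `one_sided_p` are introduced here for the example.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def one_sided_p(t_stat, df, steps=2000):
    """Upper-tail probability P(T >= t_stat), via Simpson's rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    a = abs(t_stat)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t_stat|)
    upper = 0.5 - area    # P(T >= |t_stat|), by symmetry
    return upper if t_stat >= 0 else 1.0 - upper


# With 77 respondents, each test has 77 - 1 = 76 degrees of freedom.
p_example = one_sided_p(2.0, 76)
```

The larger the t-value for a given number of degrees of freedom, the smaller the upper-tail area, which is why high t-values go together with very low one-sided p-values.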
In short, the strong support for these hypotheses indicates a consensus among the survey participants regarding the critical factors that contribute to the success of SAV services. The positive mean differences, along with the narrow confidence intervals and very low p-values, reinforce the robustness of these findings. This alignment in opinions and the high level of support across concepts suggest a unified perspective among the respondents on the essential components and considerations that need to be considered by companies, cities, and governments when implementing SAV services in urban areas.
The initial analysis of descriptive statistics revealed valuable insights into the respondents’ attitudes and perceptions regarding SAV services in urban areas. This examination provided a clear understanding of the tendency and variability associated with each concept. Subsequently, the one-sample T-test served as statistical validation, affirming and reinforcing the initial analysis by confirming the level of support for each hypothesis. This dual-method approach not only deepened our comprehension of the respondents’ perspectives, but also enhanced the robustness of the findings. Together, these analyses offer a comprehensive evaluation of the relevance of various concepts, providing a solid foundation for understanding the factors that are crucial to the successful implementation of SAVs in urban areas.
5. Discussion
The artifact obtained is mostly supported by scientific articles, and the resulting group of concepts is the outcome of several processes (SLR, topic modeling). Nevertheless, being derived from science does not by itself make the model scientifically validated, so our objective was to validate the reference model by checking whether it fits real cases and the opinions of those who could theoretically implement and participate in an SAV service.
The model under demonstration and evaluation is predictive rather than descriptive because it is mostly supported by papers that anticipate a reality. Therefore, confronting it with reality is, in a way, a litmus test of whether it makes sense.
These confrontations confirmed concepts and brought about some changes. In fact, updating the model with the incorporation of concepts that came from real cases in Phoenix and Beijing strengthened the model.
Considering the lessons from Phoenix, it is possible to have a shared autonomous vehicle service based on car intelligence and precise mapping long before full connections are available, though it is not possible to estimate when this will happen. In fact, while V2X is very important, it is not currently a reality, so accurate mapping is a workaround that allows for the existence of SAVs. Additionally, having an interface with the client and mobility provider is also important, as well as customer perks, targeting people with special needs with serious mobility constraints, and boosting client engagement.
Regarding Beijing, aside from the mapping already added because of Phoenix, the two new concepts not raised by the reference model but identified in Beijing were related to the increase in customer adherence to SAVs. The remote driving capacity aims to boost the customers’ confidence, offering the capability that if something goes wrong, the car will not stop on the street, but will instead be driven to a safe place. The QR code can also be associated with the customer experience, which is very important in a COVID-19 environment when customers are still reluctant to access transportation recently used by other customers.
Next, the twenty-nine reference model concepts, with the addition of the four extra concepts found in Phoenix and the two in Beijing were evaluated by a group of decision makers with origins in industries that could participate in the digital ecosystem constituted by the identified concepts.
The survey findings suggested that nine of the concepts were strongly validated and fifteen others were validated, making a total of 24 concepts that the respondents agreed were crucial for the implementation of a shared autonomous vehicle service in an urban area. The one concept that was not validated was “QR Code with a sanitary test”, which makes sense because the COVID-19 period is over; in a way, this reinforces the validity of this study. Therefore, this concept is definitively excluded from the final model.
The 10 concepts that were only partially supported require deeper analysis and further reflection.
Regarding the concepts “carsharing” and “ridesharing”, the respondents may seem skeptical about giving them more relevance. However, people often have a “car ownership” mentality and the shift toward a shared mobility mindset is still slow, so the idea of sharing such a personal item may take some time to accept.
Following the Tesla strategy, a peer-2-peer carsharing concept would be useful to convince AV owners to share their cars, lowering or even recovering their investment. This would allow car owners to continue to use the car whenever they need it and to receive fees from a ride-hailing service whenever the car is not needed. It would also make it easier for SAV promoter companies, here called “firms”, to enter the business, because the investment would not be as large with this rent-2-rent concept. Of course, this will not be easy to implement, as the sense of ownership prevails and is still quite important for many people. On the other hand, this could be an important step toward SAVs because it makes economic sense for both car owners and SAV platforms, so it will probably be a concept that is adopted not at the beginning, but over time.
Ridesharing might be perceived as threatening because sharing a ride with strangers in a confined space without witnesses can be dangerous. This is likely to be a service segment for low-cost customers, and possibly for young people, with its relative weight depending on the price decrease that a shared trip can provide.
Regarding the concept of “client”, it is strange that the respondents did not find it highly relevant, since this service would not be achievable without clients. It is possible that the respondents had in mind some risk aversion that could freeze SAV adoption, which is something to address with other concepts like customer perks or remote driving capacity to improve the customers’ confidence.
When it comes to the concept of “customer perks”, it seems essential to recognize the importance of customer satisfaction and the role it plays in the success of shared autonomous vehicle services. The partial support can be related to the lack of importance given to perks in comparison with the other concepts, which makes sense. These perks might not only be small gifts to attract customers, for instance, when Uber offered bottled water to its customers, but could also be child seats or wheelchair facilities, enlarging the service to a wider segment of customers.
The concept of “firms” may have been misinterpreted: it may not have been clear that we were referring, as intended, to the companies that implement this service, and respondents may instead have considered how the service could benefit firms as clients. This concept should have been called “mobility provider” to avoid this type of doubt. To have SAVs in an urban area, someone must move forward and launch the service, so the existence of mobility providers is mandatory.
The concept of “fraud” had positive agreement amongst the respondents, but a considerable diversity in perspectives. This is an important concept to keep, as enhancing fraud prevention strategies is important for confidence among SAV service users.
Regarding “metropolitan”, this concept may be valid in a further stage of implementation, as the initial stage focused more on an urban area, which may be the reason for some of the respondents’ different perspectives. It seems clear that once implemented in more urbanized locations, the SAV trend will spread throughout metropolitan areas. Right now, the stakeholders’ doubts make sense, because SAVs will not operate in all metropolitan areas, at least in an early stage.
The “remote driving service” concept, similar to the concept “firms”, may have been misinterpreted, as this is a futuristic technology of the car driving itself to the nearest help point in the event of a malfunction during the ride. Since this is not a service that exists currently, it may be subject to some holdbacks in its utility and practicality. However, it is a concept to consider in terms of service security.
When it comes to the concept of “researcher”, the respondents may have had a more practical perception and did not see this role as fully necessary, especially in comparison with the other concepts. Collaborative efforts between researchers and decision makers can ensure that research activities are aligned with the practical needs and expectations of stakeholders and can foster a mutually beneficial relationship, which is precisely the intention of the present research.
Finally, as previously addressed, the concept of V2X was understandably questioned, as this is a technology from the future being assessed in the present. This concept suggests seamless interaction between the vehicle grid and the surrounding infrastructure such as traffic lights, public services, and other cars. Although it was only partially supported by the respondents, it seems like a valid concept to keep in the model.
All in all, after evaluating the reference model against Phoenix and Beijing and surveying the resulting 35 concepts, only one concept should be excluded (the QR code with the sanitary test), so the remaining 34 concepts constitute the updated reference model, which is the main value proposition of the present research.
6. Updated Reference Model
The final set of concepts is represented in Table 9.
Repeating the process followed by [3], that is, classifying all 34 concepts into the layers and sublayers of the ArchiMate 3.1 Specification and then using the Archi open-source ArchiMate modeling tool, it was possible to model the obtained concepts as well as the relationships among them, according to the relationships defined in ArchiMate.
The results are represented in Figure 4. They are not very different from the reference model updated with Phoenix in [3]: the remote driving capacity identified in Beijing was added, and the QR code used in Beijing was not included, as it was clearly rejected by the survey because the COVID-19 period is over.
7. Conclusions
Overall, the main outcome of matching the reference model first with Phoenix and Beijing, and now with the survey, is encouraging.
In fact, the reference model is logical and, to the best of our knowledge, its update with the realities in Phoenix and Beijing, now reinforced by the present survey, makes it a trustworthy instrument for decision making in this mobility area.
Now that the digital ecosystem behind SAVs and its concepts have been identified, it is clear that SAVs will spread only if these concepts exist, and highlighting these conditions is itself a pre-condition for the decision-making process. The availability of an AV is not enough to have an SAV fleet operating in an urban area: the existence of such a service is conditioned by the list of assumptions detailed here, which now includes the remote driving capability and definitively does not include the QR code to enter the car.
There are, however, some limitations related to this work.
An important constraint is that the reference model was built on scientific works published within a limited time window. New papers published in the meantime could, in theory, transform the artifact presented here, which would mean repeating the work in this study with more real situations and, consequently, a newer survey.
Furthermore, the identified differences between Phoenix and Beijing prove that there is a need for local fine-tuning, that is, the reference model presented in this research will always need to consider the geography where the SAV service is deployed.
In novel research about multi-modal transportation networks, Ref. [16] specifically combined adaptation with innovation and collaboration, targeting the creation of a transportation ecosystem that is safer, more efficient, more accessible, and more sustainable for generations to come.
Finally, the survey only validated the concepts, and the respondents were not asked to add other concepts. The alternative would have been to risk a diversity of opinions that would have been impossible to incorporate into the model, which was already solid enough due to its scientific origin, either in papers or in processes.
As artifact mutability is related to evolution [4], this model, while built on a reliable and trustworthy foundation, is logically a never-ending creation that can be continuously updated, which is also a lesson learned in the present research.
In this sense, further validations are expected to bring updates to the model, and future work is essential to keep the model up to date.