1. Introduction
As a consequence of the profound impact of information technologies (IT) on societal organization during the last decades, a growing body of knowledge and practice has evidenced the innovative potential of the digital transformation of public administration, not only in terms of internal procedural management but also in terms of external service provision, including its relationship with citizens [1,2,3,4,5,6]. Moreover, this digital transformation has become a key objective of political agendas and governmental strategies [3,7,8], a trend corroborated by the United Nations, which envisages the use of digital tools to support policy making and public service delivery within its sustainable development goals [3,8,9].
As in other areas of research with intense dynamism, different concepts have emerged and evolved over time to characterize digitally enhanced public services [8], such as digital government or e-government [10]. Digital government has been broadly defined as the process of implementing IT-enabled government innovations by transforming public organizational structures and service delivery [3,11]. This definition emphasizes the use of electronic means, particularly the internet, to deliver government information and processes to governmental and non-governmental entities, businesses, and citizens [8,10].
According to different studies (e.g., [1,12,13,14,15,16,17]), various maturity stages of digital government may coexist, reflecting different degrees of technical sophistication and interaction with citizens [18]: catalog (i.e., online existence of digital services), transaction (i.e., electronic transactions between the government and citizens), vertical integration (i.e., connections between local systems and higher-level systems), and horizontal integration (i.e., systems’ integration across different functions, allowing citizens to access different public services) [12,18]. Considering these maturity stages, citizens’ engagement assumes a normative perspective; that is, since the adoption of IT presents several advantages, it is desirable that citizens be actively engaged [8]. However, when the focus is citizens’ political participation, their engagement is required not only for public service delivery but also for active participation in public governance activities [8,19,20], namely in terms of decision making, policy formulation, collaboration, and the overall management of governmental and societal affairs. In this respect, digital governance or e-governance might be considered the application of electronic means, including innovative technologies such as artificial intelligence [21], to support both internal government operations and interactions between governmental and non-governmental entities, businesses, and citizens, in order to improve information and service delivery, encourage citizens’ participation in decision-making processes [22], and contribute to accountability, transparency, and democratic values [8,19,23].
Citizens’ engagement is determined by multiple socio-organizational circumstances, such as citizens’ awareness of and motivation to participate, or the mitigation of digital divide challenges (e.g., IT literacy, the availability of accessible communication infrastructures, or the adequacy of user interfaces) [24,25,26,27]. Moreover, previous experiences and self-efficacy can influence citizens’ satisfaction with and expectations towards electronic public services, as well as their engagement [24]. This means that user experience (i.e., users’ states resulting from their characteristics and prior experience as well as the context of use of a specific product or service [28]) and the related concept of usability (i.e., the ability of a product or service to help the user achieve a specific goal in a given situation while enjoying its use [29,30]) are fundamental features of people-centered technological applications [31]. User experience and usability have been considered fundamental dimensions of digital government and governance quality models (e.g., [21,32,33,34,35,36,37,38,39]), and usability assessments of digitally enhanced public services have been performed worldwide, particularly on institutional websites, portals, and online public services (e.g., [40,41,42,43,44,45,46,47]).
Considering secondary research on the importance of the user experience and usability of digital government applications, it is possible to identify several reviews: (i) Desmal et al. [38] explored the impact of usability quality attributes such as the efficiency, satisfaction, memorability, error, and compatibility of mobile government services; (ii) Aldrees and Gračanin [18] identified factors (e.g., perceived usefulness or perceived ease of use) affecting the user experience of digital government applications and provided recommendations to support the design and implementation of future applications; (iii) Desmal et al. [48] identified quality attributes (i.e., usability, interaction, consistency, information, accessibility, and privacy and security) that impact users’ satisfaction with mobile digital government portals; (iv) Menezes et al. [49] systematized models, dimensions, instruments, and tools to evaluate public services from the perspective of users and identified the main dimensions of service evaluation (i.e., quality, success, and acceptance of information systems, user satisfaction, and user experience); (v) Lyzara et al. [50] identified adequate methods to assess the usability of digital government applications; (vi) Monzón et al. [51] identified models for measuring the level of balance between usability and safety; (vii) Alshamsi et al. [52] performed a mapping review to establish the trade-off between usability and security; (viii) Yerlikaya and Durdu [53] reviewed the usability research conducted on university websites over a decade (2006–2016) to identify the most frequently used usability evaluation methods; (ix) Cisneros et al. [54] established how accessibility evaluations of digital government web applications are performed; and (x) Zhang et al. [55] systematized how eye-tracking technology has been used in the usability evaluation of digital government applications.
However, the authors of this systematic review were not able to identify systematic literature reviews focused on the methodological quality of the user-centered usability evaluation of digital applications to promote citizens’ engagement and participation in public governance. To address this research gap, this systematic literature review aimed to assess the methodological quality of the user-centered usability evaluation of these digital applications by (i) systematizing their purposes; (ii) analyzing the evaluation procedures, methods, and instruments that were used; (iii) determining their conformance with recommended usability evaluation good practices; and (iv) identifying the implications of the reported results for future developments. Therefore, this systematic review might contribute to (i) increasing awareness of the importance of the user-centered usability evaluation of digital applications to promote citizens’ engagement and participation in public governance, (ii) identifying good practices and methodological issues, and (iii) providing evidence to support the development of recommendations to improve the planning, conduct, and reporting of future user-centered usability evaluation studies.
4. Discussion
A total of 34 studies were included in this review. This relatively small number, when compared to the number of studies focused on digital government and governance [8], does not reflect the level of importance being given to the development of digital applications to promote citizens’ participation in public affairs, but rather the limited attention given to user-centered usability within the development of such applications. As usability is an essential factor for citizens’ adherence to and acceptance of digital applications [22,24,31,32,33,34,35,36,37,38,39], it was hypothesized that user-centered usability evaluation would deserve more interest from researchers focused on the specific topic of this systematic review.
A possible reason for the reduced number of studies focused on the user-centered usability evaluation of the digital applications considered for this systematic review is that a significant percentage of the reported applications are still in early development stages (e.g., requirements elicitation, general overviews of the proposed architectures, or performance evaluations of the proposed applications or some of their components) [134] and, therefore, cannot yet be subject to real-world evaluations by end-users. However, considering the distribution of the included studies by publication year, it is possible to conclude that there is growing interest in the usability evaluation of digital applications to promote citizens’ engagement in public affairs.
In terms of geographical distribution, Europe made the largest contribution, which might be a consequence of the importance of European scientific productivity in the development of sustainable smart cities [136,137].
Concerning the specific purposes of the applications (i.e., the first research sub-question), the included studies were categorized into seven different purposes: participatory reporting of urban issues, environmental sustainability, civic participation, urban planning, promotion of democratic values, electronic voting, and chatbots. The last two categories include only one study each. Participatory reporting of urban issues was the most relevant category, with 35% of the studies, and the remainder were distributed between environmental sustainability (18%), civic participation (15%), urban planning (15%), and promotion of democratic values (12%). These results corroborate those of other reviews, since participatory reporting of urban issues, environmental sustainability, and urban planning are important purposes in the scientific literature on smart cities, while the promotion of democratic values is a fundamental issue in the modernization of public administration [2,8,134].
In general, the studies failed to present evidence about how data privacy, integrity, and confidentiality are guaranteed, as well as about how citizens’ engagement is incentivized: only two studies [100,122] addressed concerns with data security and privacy mechanisms, and five studies [102,107,124,126,127] proposed incentive mechanisms (e.g., gamification). This might result from the fact that the studies were focused on the usability evaluation of the proposed digital applications. However, privacy and security mechanisms might impact usability [52,53], and incentive mechanisms are important for the acceptance and continuous use of digital governance [138].
Considering the second research sub-question (i.e., what usability evaluation procedures, methods, and instruments are being used?), it is possible to conclude that both test and inquiry methods are being applied and that there is high heterogeneity in terms of procedures and instruments. Concerning the level of conformance of the procedures, methods, and instruments with recommended usability evaluation good practices (i.e., the third research sub-question), the results of the application of CAUSS (Figure 3) suggest the existence of both good and bad practices across the five dimensions of this scale (i.e., usability assessment instruments, procedures, participants, study evaluators, and context and tasks).
In terms of good practices, three CAUSS items were scored positively by more than 90% of the studies: (i) item 3 (i.e., coherence between the procedures used to assess usability); (ii) item 4 (i.e., adequacy of the assessment procedures to the solutions’ development state); and (iii) item 13 (i.e., representativeness of the tasks used for the usability evaluation).
In turn, five items were scored positively by more than 50% but less than 70% of the studies: (i) item 1 (i.e., use of valid measurement instruments of usability); (ii) item 2 (i.e., use of reliable measurement instruments of usability); (iii) item 8 (i.e., representativeness of the participants); (iv) item 12 (i.e., number of participants); and (v) item 15 (i.e., adequacy of the analyses that were performed and variables that were assessed).
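Conformance percentages such as those above follow from a simple per-item tally across the included studies. The minimal sketch below (in Python, with invented codings rather than the review’s actual data) illustrates one way such a tally could be computed, assuming each study is coded per CAUSS item as “yes”, “no”, or “not reported”.

```python
from collections import Counter

# Hypothetical codings (illustrative only): each study is coded per
# CAUSS item as "yes" (good practice present), "no", or "not reported".
codings = {
    "study_A": {"item_3": "yes", "item_12": "yes"},
    "study_B": {"item_3": "yes", "item_12": "not reported"},
    "study_C": {"item_3": "yes", "item_12": "no"},
    "study_D": {"item_3": "not reported", "item_12": "yes"},
}

def positive_rate(item: str) -> float:
    """Percentage of studies scoring positively on a given CAUSS item."""
    counts = Counter(study.get(item, "not reported") for study in codings.values())
    return 100 * counts["yes"] / len(codings)

for item in ("item_3", "item_12"):
    print(f"{item}: {positive_rate(item):.0f}% of studies scored positively")
# item_3: 75% of studies scored positively
# item_12: 50% of studies scored positively
```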
In this review, almost 50% of the studies did not use reliable and validated measurement instruments of usability. Moreover, almost 40% of the studies developed ad hoc questionnaires. Among the studies that used validated scales and questionnaires, the System Usability Scale (SUS) was the most used, which is in line with other reviews on user-centered usability evaluation [134,139].
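Since SUS scoring is standardized, a brief illustration may be useful. The sketch below (in Python, with a hypothetical response pattern) computes the 0–100 SUS score from the ten 1–5 Likert responses: odd-numbered (positively worded) items contribute (response − 1), even-numbered (negatively worded) items contribute (5 − response), and the sum is multiplied by 2.5.

```python
def sus_score(responses: list[int]) -> float:
    """Compute the System Usability Scale score (0-100) from the ten
    item responses, each given on a 1-5 Likert scale."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten responses in the range 1-5")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 is item 1 (odd-numbered)
        for i, r in enumerate(responses)
    ]
    # The summed contributions range from 0 to 40; scaling by 2.5
    # yields the conventional 0-100 score.
    return 2.5 * sum(contributions)

# Hypothetical response pattern; 68 is the commonly cited average benchmark.
print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]))  # 80.0
```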
Moreover, a considerable number of studies failed to report on the quality criteria defined by seven CAUSS items: (i) item 5 (i.e., adequacy of the procedures to the participants’ characteristics); (ii) item 6 (i.e., employment of triangulation methods for the assessment of usability); (iii) item 7 (i.e., usability assessment with both potential users and experts); (iv) item 9 (i.e., training of the investigator responsible for the usability assessment); (v) item 10 (i.e., independence of the investigator responsible for the usability assessment from the development process); (vi) item 11 (i.e., usability assessment conducted in the real, or a close-to-real, context of use of the evaluated solution); and (vii) item 14 (i.e., usability assessment based on continuous and prolonged use of the evaluated solution).
Despite the heterogeneity in the procedures, methods, and instruments used in the user-centered usability evaluations, most studies failed to show their adequacy to the characteristics of the participants involved, particularly in terms of age, given that more than 50% of the studies did not indicate the participants’ age.
Additionally, less than 40% of the studies used both test and inquiry methods, which means that most of the included studies did not triangulate methods to assess usability. Moreover, less than 30% of the studies conducted usability evaluations with both users and experts, which is a recommended practice to identify potential usability problems [140]; this finding is in line with the results of other reviews (e.g., [50]).
Considering the training and independence of the investigator responsible for the usability assessment, only one study reported that the responsible investigator was a trained researcher, and six studies reported that the evaluators were not involved in the development process. These results should be analyzed carefully, since they might reflect not bad practice in performing the usability assessment but rather an omission in reporting it. However, this information is of great relevance to clarify that neither the researchers’ inexperience nor potential conflicts of interest impacted the results of the usability evaluation [140].
Most of the included studies were conducted in a laboratory context. Consequently, the results of the usability evaluations did not reflect the use of the proposed applications in real environments (i.e., the applications were evaluated in their real context by less than one-third of the studies) or their continuous and prolonged use (i.e., only 20% of the studies evaluated the applications’ usability considering prolonged and continued use).
Considering the fourth research sub-question (i.e., what are the implications of the usability assessment results for the future development of digital applications to promote citizens’ engagement and participation in public governance?), diverse factors should be considered during application development to increase usability and acceptance: applying universal design principles to promote the inclusion of people with disabilities or other disadvantaged groups, such as people with low literacy [105]; investing in visual and aesthetic quality, as also identified by Desmal et al. [48]; minimizing the effort required to achieve the intended results, in accordance with the results reported by Aldrees and Gračanin [18]; applying motivational features (e.g., gamification) to promote the continuous and sustainable use of the proposed applications; and duly considering human values (e.g., transparency, fairness, or trust [135]) when designing the applications. Moreover, low usability might reinforce participants’ distrust in both the applications and the authorities and might negatively impact collaborative tasks due to the added cognitive load.
Finally, concerning the research question that informed the present study (i.e., what is the methodological quality of the studies performing user-centered usability evaluation of digital applications to promote citizens’ engagement and participation in public governance?), the analysis of the included studies suggests that the methodological quality of user-centered usability evaluation should be increased to facilitate the reproducibility and comparability of results across studies. The methodological quality could be improved along diverse dimensions. The study evaluators should have usability evaluation expertise, and the reporting should clarify whether they are internal or external to the application’s development. Moreover, the study evaluators should select valid and reliable assessment instruments. In terms of procedures, a rationale should be provided for the combination of methods and techniques. Considering the context of use and tasks, the study promoters should develop a participant script with a detailed description of the tasks, facilities, and materials needed, and should identify and justify the choice of a lab test, a field test, or both. Finally, in terms of participants, a clear definition of inclusion and exclusion criteria (e.g., age, gender, educational level, and academic background) should be provided, together with a rationale for the sample size, the sampling methods, and the recruitment.
The evidence collected in this systematic review might be used, together with other information sources, to support the development of recommendations for user-centered usability evaluations of digital governance applications, including methodological guidelines, standardized study designs, and reporting checklists, to help researchers when designing their experiments.
6. Conclusions
The specific purposes of the digital applications developed by the included studies were distributed across participatory reporting of urban issues, environmental sustainability, civic participation, urban planning, promotion of democratic values, electronic voting, and chatbots. However, a large percentage of the reported applications are still in an early development phase and, consequently, at this stage, do not significantly contribute to the development of citizen participatory models with impact at the societal level.
The review results suggest that there is high heterogeneity both in the usability evaluation procedures, methods, and instruments being used and in their conformity with recommended usability evaluation good practices. In terms of implications for future developments, most studies focused on evaluating the usability of their applications or on showing the viability of user-centered development approaches, rather than on generalizing implications for future developments. Even so, the results of a minority of the included studies pointed out that the application of universal design principles, visual and aesthetic quality, the existence of motivational features, and effort and performance expectancies contribute to better usability and might increase citizens’ trust in the applications and authorities. Moreover, several human values (e.g., transparency, safety, universal usability, feedback, authenticity, fairness, representativeness, accountability, legitimacy, informed consent, autonomy, awareness, human welfare, attitude, and trust [135]) should be incorporated into the development of digital applications to promote citizens’ engagement and participation in public governance.
Considering the methodological quality of the studies performing user-centered usability evaluation of digital applications to promote citizens’ engagement and participation in public governance (i.e., the research objective that informed this review), the results suggest that researchers failed to consider and report relevant methodological aspects. Therefore, recommendations to support user-centered usability evaluations of digital governance applications should be established and disseminated to improve the methodological quality of future studies. Conducting rigorous user-centered usability experiments is likely to improve the comparability of usability results across studies, facilitate further research on the impact of usability on other outcomes, and provide efficient digital solutions that maximize the societal impact (e.g., wellbeing, sustainability, transparency, efficiency, accountability, or the promotion of democratic values such as representativeness) of citizens’ engagement and participation in public governance. In this respect, as the main conclusion of this review, it should be highlighted that there is a need to increase the research community’s awareness of the existing knowledge on good practices of user-centered usability evaluation.
The assessment of the impact of digital applications to support citizens’ engagement and participation in public governance goes far beyond usability evaluation and requires multidisciplinary teams with expertise beyond IT (e.g., political or social sciences). Therefore, future reviews are required to systematize the frameworks, metrics, procedures, and methods being used to assess the societal impact of these digital applications, as well as the methodological quality of the assessments being performed and both the positive and negative outcomes being measured.