2. Related Work
Previous research on the legibility of text has been concerned with embedded reading, using metrics and tasks based on reading entire paragraphs of text [14]. However, modern reading behavior is fragmented into brief glances, whether looking at a smartphone or scanning an in-vehicle screen [15]. In the driving context, the text displayed on the in-vehicle screen has a significant impact on driving safety. Existing studies have therefore also addressed visual signal interaction, text design standards, and driving cognitive models.
A. Visual Signals’ Classification and Human–Computer Interaction Model
In visual interaction, the human and the computer exchange information through interfaces in the form of visual signals composed of text, icons, and colors [16]. Visual signals are of primary importance in driving and can be described along sensory dimensions such as color, luminance, and contrast, as well as stimulus dimensions such as location, size, height, length, shape, spacing, and periodicity [6]. Visual signals must be seen to be effective, and placing them in optimal locations in the vehicle helps drivers detect them quickly and respond to them faster. In addition, the sensory and stimulus dimensions mentioned above can be combined to maximize the legibility and comprehensibility of messages.
At each step of the interaction, drivers issue instructions to the computer and receive feedback, and the interface serves as the communication channel between human and computer. The visual information carries one meaning for the human and one for the computer. To establish successful human–computer interaction through visual signals, Bottoni et al. [17] suggest that correct communication occurs if a pair of inverse morphisms can be established between user meaning and computer meaning.
B. Standards and Guidelines about the Size and Length of Text
Optimum text presented in the driver–vehicle interface is legible across a wide range of viewing distances, viewing angles, and environmental conditions. A large body of literature provides reference values for text size.
ISO 15008 recommends that the character height of alphanumeric text on in-vehicle displays be at least 20 arc-minutes of visual angle, although 16 arc-minutes is also acceptable. Furthermore, it is one of the few standards that provide reference values for Chinese font sizes, recommending that the minimum or modification size of Chinese characters be 24 × 24 pixels. However, studies on Chinese character sizes and related recommended values are very limited. Huawei suggests that 5.3 mm is the minimum size at which Chinese characters can be recognized [18].
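To relate the arc-minute thresholds above to physical character heights, the following minimal sketch converts a height in millimetres to visual angle. It uses the 710 mm eye-to-screen distance and the four character sizes reported in the Method section. The function name and the labels are illustrative, and applying the alphanumeric thresholds to Chinese character height is an assumption made here for illustration only, not something the standard or this study states.

```python
import math

def visual_angle_arcmin(height_mm: float, distance_mm: float) -> float:
    """Visual angle, in arc-minutes, subtended by a character of the given height."""
    angle_rad = 2 * math.atan(height_mm / (2 * distance_mm))
    return math.degrees(angle_rad) * 60

VIEWING_DISTANCE_MM = 710.0        # eye-to-screen distance from the Method section
SIZES_MM = [3.0, 4.5, 6.0, 7.5]    # character sizes tested in this study
ACCEPTABLE_ARCMIN = 16.0           # lower bound cited for alphanumeric text
RECOMMENDED_ARCMIN = 20.0          # recommended value cited for alphanumeric text

for size in SIZES_MM:
    angle = visual_angle_arcmin(size, VIEWING_DISTANCE_MM)
    if angle >= RECOMMENDED_ARCMIN:
        label = "meets recommended value"
    elif angle >= ACCEPTABLE_ARCMIN:
        label = "acceptable"
    else:
        label = "below acceptable value"
    print(f"{size} mm at {VIEWING_DISTANCE_MM:.0f} mm: {angle:.1f} arcmin ({label})")
```

Under these assumptions, only the 3 mm size falls below the 16 arc-minute bound, while the three larger sizes exceed 20 arc-minutes.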
Few standards provide reference values for the number of characters per row. However, the capacity of short-term memory is instructive for length design: people can easily remember about 5–7 items [19].
Before starting this research, we surveyed the central console interfaces of 22 intelligent vehicles on the Chinese market and catalogued their information architectures and the main features of their Chinese characters, especially size and length. The results show that first-level headings use a font size of 36–40 pt and second-level headings 20–24 pt. In terms of length, first-level headings contain 4–6 characters and second-level headings 10–16 characters.
We then invited six professional test drivers to rate their satisfaction with the central control screen interfaces, and some vehicles' interfaces received high scores. The minimum font size on these interfaces is 3 mm and the average number of characters per row is about seven. Based on the literature and our survey results, we selected the ranges of the two independent variables, size and length, for Chinese characters.
C. Driving Cognitive Model
Conceptual cognitive models [20,21] try to explain the components and processing stages of driving, including information perception. Mathematical cognitive models explain cognitive principles through signal detection theory [22] applied to a driver's information processing [23] and through information theory [24]. Computational cognitive models [25,26,27,28] focus on understanding the driving process and its interaction with different cognitive patterns in order to reproduce driver behaviors.
Among these cognitive models, we chose Endsley's situation awareness model to explain the cognitive process in the test tasks [29]. Endsley defines situation awareness as “knowing what is happening around you”, which refers to the user's cognition of the surrounding environment and state. The model describes three levels of understanding: perception of elements in the current situation, comprehension of the current situation, and projection of future status. In other words, drivers perceive factors in the environment within a certain time and space, understand their meanings and relationships, and predict their states in the near future.
The driver must quickly draw a conclusion from the observed information and must understand its meaning or importance [30]. Therefore, the time from perception to understanding is very important. During driving, 90% of the driver's information comes from the visual system. When engaging in non-driving tasks, drivers should not take their eyes off the road for more than 2 s [15]. K. L. Young et al. [31] summarized the decline in the cognitive and physical abilities of the elderly through research on their use of interactive interfaces, and put forward requirements for HMI design, such as guidelines on the amount of text and research on the optimal font size or display luminance. We explain the experimental tasks using this situation awareness model in the Procedure section.
3. Method
We used a driving simulator to conduct the experiment. Participants were required to complete a series of text information cognitive tasks while driving safely. Font size and sentence length were the independent variables. We measured the time participants took to complete each task, and after a group of tasks was completed, we asked them to complete the usability scale and the workload scale.
A. Participants
A representative sample of 30 experienced drivers (16 males and 14 females) was recruited. Ages ranged from 25 to 44 years (mean 33.2), with more than 2 years of driving experience. Most participants reported using central consoles frequently or occasionally while driving.
B. Experimental Environment
The layout of the in-vehicle environment and its devices can affect the reading and understanding of text, so we positioned the driver and the central control screen as in a real car, following ergonomics in the automotive design process [32]. With a seat height of 260 mm, the horizontal distance from the steering wheel to the accelerator heel point was set to 440 mm and the height of the steering wheel was set to 660 mm [32]. For the central control screen, we set the height from the screen center to the ground to 710 mm, the transverse distance between the screen center and the center of the steering wheel to 390 mm, and the horizontal longitudinal distance to 150 mm [32]. We also tried to keep participants' eyes 710 mm from the center of the central control screen, although this distance may vary with participants' height. A Surface tablet computer was placed at the position of the central control screen to display the test content. It ran a Unity application designed to support commands such as initiating a trial, and an experimenter controlled the tablet. Daytime lighting (about 5 klx with a relative tolerance of 5%) was selected as the illumination condition for this test of the vehicle visual display, as proposed by SAE J1757/1:2015 (SAE, 2015) (see Figure 1).
C. Task and Stimuli
The primary driving task was to drive straight ahead at about 30 km/h, and the non-driving task was to name the text on the central console while maintaining safe and stable driving. Naming tasks, in which participants identify and read the target stimulus, have been used in many English and Chinese character experiments [13,33]. The processing of fonts was measured by the reaction time from the appearance of the target stimulus to the participant's response.
The stimuli for the retelling tasks were Mandarin sentences written in Simplified Chinese characters. The text content covered navigation, entertainment, vehicle settings, and driver assistance functions (see Figure 2). The stimuli varied along two independent variables: character size (3 mm, 4.5 mm, 6 mm, and 7.5 mm) and length, that is, the number of characters per row (7, 9, 11, 13, and 15). Each participant performed all 20 tasks formed by crossing the levels of size and length (4 sizes × 5 lengths = 20 tasks).
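The full crossing of the two variables, together with the random presentation order mentioned in the Procedure, can be sketched as follows. This is only an illustration of a 4 × 5 within-subject design. The function name and the seeding scheme are hypothetical and do not describe the Unity application used in the study.

```python
import itertools
import random

SIZES_MM = [3.0, 4.5, 6.0, 7.5]      # character sizes from the text
LENGTHS = [7, 9, 11, 13, 15]         # characters per row from the text

def randomized_trials(seed: int) -> list[tuple[float, int]]:
    """Return all size x length conditions in a shuffled order.

    The seed is a hypothetical device (e.g. one per participant) so that
    each order is random but reproducible.
    """
    conditions = list(itertools.product(SIZES_MM, LENGTHS))
    rng = random.Random(seed)
    rng.shuffle(conditions)
    return conditions

trials = randomized_trials(seed=1)
print(len(trials), "trials; first three:", trials[:3])
```

Each participant sees every one of the 20 conditions exactly once, so differences between conditions are not confounded with differences between participants.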
Chinese characters in the Noto Sans S Chinese Regular typeface were displayed in pure white (RGB: 255, 255, 255) on a pure black background (RGB: 0, 0, 0) in the center of a 12.3-inch screen with a 16:9 aspect ratio (see Figure 2).
D. Procedure
Participants read and signed an informed consent form and filled in a questionnaire covering demographic information and driving experience. They then moved to the simulator and adjusted the seat until they were comfortable. Next, the experimenter described the test purpose, content, and tasks, and instructed them on how to interact with the driving simulator. It was emphasized to participants that, while performance on the non-driving task was important, they should maintain driving safety while attempting to complete the tasks, just as if they were driving a real car. A brief practice session familiarized them with the simulator environment and the task process, after which the formal experiment began.
In the experiment, participants received instructions for the non-driving tasks while driving smoothly at about 30 km/h. A notification tone cued participants that the non-driving task was ready; they then scanned the text that had just appeared and tried to remember its meaning. They signaled to the experimenter that they had completed the task by pressing a button on the steering wheel, at which point the text disappeared and they repeated what they had seen. In terms of the cognitive process, after hearing the notification tone, participants began to obtain visual information from the central control screen and chose how to scan and recognize the text based on the driving condition and surrounding environment, judging whether they could retell the information completely and quickly from short-term memory. Depending on their abilities and experience, they then decided when to press the button to start retelling and carried out the corresponding action.
After each task, participants filled out a usability questionnaire and a single global workload rating [34]. The experiment ended once this process had been repeated 20 times. The tasks were presented in random order. The whole test lasted approximately 30 min.
E. Questionnaires and Data Analysis
In the experiment, we used a visual subjective rating questionnaire, a usability questionnaire, and a global workload rating questionnaire. The visual subjective rating questionnaire used a 7-point scale (from −3 to 3) to capture participants' visual impressions of the size and length of the text. For character size, −3 meant the characters were too small, 0 meant they were the appropriate size, and 3 meant they were too big. Similarly, for length, −3 and 3 meant that the text was too short and too long respectively, while 0 meant the length was most appropriate. Usability is generally described in terms of efficiency, effectiveness, and satisfaction. Because the experiment focused on visual evaluation, we chose the clarity of Chinese characters to represent efficiency and the understanding of Chinese characters to represent effectiveness; clarity and understanding are thus the operationalized items for these two dimensions. We then designed a usability questionnaire using a 7-point Likert scale (α = 0.814) and combined the three aspects to describe the usability of Chinese characters comprehensively. For workload, we chose the global workload rating questionnaire, which has shown results consistent with NASA-TLX [34]. The global rating is a short form for rating how much workload is associated with each type of task on a 0 to 10 scale, where 0 = very low workload and 10 = very high workload.
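The reliability coefficient α reported for the usability questionnaire is conventionally computed as Cronbach's alpha. The sketch below shows that computation under the assumption that this is the statistic meant. The scores are invented purely for illustration and are not data from this study, and the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha for a questionnaire.

    item_scores holds one list per questionnaire item, each containing
    one score per respondent (all lists the same length).
    """
    k = len(item_scores)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical 7-point ratings from six respondents on three items
# (clarity, understanding, satisfaction); NOT data from this study.
example_items = [
    [5, 6, 4, 7, 3, 6],
    [5, 7, 4, 6, 3, 5],
    [4, 6, 5, 7, 2, 6],
]
print(f"alpha = {cronbach_alpha(example_items):.3f}")
```

Values closer to 1 indicate that the items vary together across respondents, which is the sense in which a questionnaire is called internally consistent.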
Since the data were not normally distributed, Spearman's rank correlation coefficient was used. All data were analyzed and visualized in SPSS and Excel.
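For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The following standard-library sketch illustrates the idea. It is not the SPSS procedure used in the study, and the sample values are hypothetical.

```python
import math

def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks; tied values receive the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x: list[float], y: list[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman_rho(x: list[float], y: list[float]) -> float:
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical characters-per-row values and reaction times (s); NOT study data.
lengths = [7, 7, 9, 11, 13, 15]
times = [2.1, 2.4, 2.3, 2.9, 3.4, 3.3]
print(f"rho = {spearman_rho(lengths, times):.3f}")
```

Because only the ordering of the values matters, the coefficient does not require normally distributed data, which is the reason given above for choosing it.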
4. Results
We analyzed objective and subjective data, including response time, visual satisfaction, usability, and workload. The results show that font size and line length both influence reaction time. When the font size is below 6 mm, reaction time decreases as font size increases; above 6 mm, reaction time does not change significantly. When the line length is below 13 characters, response time increases with the number of characters; above 13 characters, it does not change significantly.
A. Reaction Time
Longer reaction times may indicate increased processing and higher uncertainty or difficulty [35]. Shorter reaction times may therefore indicate less uncertainty and a smaller adverse influence on drivers' behavior. Several reaction time effects appear in the data. Character size had a significant effect on reaction time (r = 0.125, p < 0.05), as shown in Figure 3: reactions to larger Chinese characters were faster. The reaction time for 4.5 mm characters was 6.97% shorter than for 3 mm characters, and the reaction time for 6 mm characters was 6.63% shorter than for 4.5 mm characters, whereas participants reacted only 2.39% faster to 7.5 mm characters than to 6 mm characters. There are thus clear differences in reaction time among the sizes of 3 mm, 4.5 mm, and 6 mm. Once the size exceeds 6 mm, the decrease in reaction time slows markedly. As a result, when Chinese characters are larger than 6 mm, size has little impact on reaction time.
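The step-by-step percentages above can be combined into a cumulative figure. The sketch below does so under the assumption, suggested by the wording but not stated explicitly, that each percentage is relative to the mean reaction time at the preceding size level. The function name is illustrative.

```python
# Relative reductions in reaction time between adjacent size levels,
# as reported in the text.
STEP_REDUCTIONS = {
    "3 mm -> 4.5 mm": 0.0697,
    "4.5 mm -> 6 mm": 0.0663,
    "6 mm -> 7.5 mm": 0.0239,
}

def cumulative_reduction(step_reductions: list[float]) -> float:
    """Overall relative reduction after applying each step in sequence.

    Assumes every step is expressed relative to the level just before it,
    so the remaining fractions multiply.
    """
    remaining = 1.0
    for reduction in step_reductions:
        remaining *= 1.0 - reduction
    return 1.0 - remaining

steps = list(STEP_REDUCTIONS.values())
print(f"3 mm -> 6 mm:   {cumulative_reduction(steps[:2]):.1%}")
print(f"3 mm -> 7.5 mm: {cumulative_reduction(steps):.1%}")
```

Under this assumption, most of the overall reduction is already reached at 6 mm, and the final step to 7.5 mm adds comparatively little.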
As for the number of Chinese characters per row, reaction time increases with the number of characters (r = 0.337, p < 0.05). Figure 4 shows that reaction time tends to increase as the number of characters per row rises from 7 to 13, and even decreases from 13 to 15 characters per row. Drivers therefore use more attentional resources as the length increases. When the number of Chinese characters exceeds 13, the effect of length on reaction time and driver performance is unclear.
B. Visual Subjective Rating
After each task, participants were immediately asked to rate their visual perception of the Chinese characters in that task. With character size as the independent variable, size had an effect on the visual subjective rating, and the two variables were strongly correlated (r = 0.725, p < 0.01). As shown in Figure 5, participants considered the sizes of 6 mm and 7.5 mm both appropriate, with 6 mm rated better. They judged the 4.5 mm characters to be a little small and the 3 mm characters to be particularly small, which may affect legibility.
The length of the text also significantly affects the visual subjective rating (r = 0.603, p < 0.01). As illustrated in Figure 6, the ratings for 7 and 9 characters per row were closest to 0, which means these are the most visually appropriate lengths for participants. As the number of characters per row increases, the visual subjective rating of length increases. When the number of characters per row was 13 or more, participants considered the text particularly long.
C. Usability Scores and Workload
In this section, usability and workload questionnaires are used to measure participants’ perception of different Chinese character sizes and lengths.
In this experiment, character size had a significant effect on usability (r = 0.346, p < 0.01) and workload (r = 0.399, p < 0.01). As the character size increases from 3 mm to 4.5 mm and from 4.5 mm to 6 mm, usability increases by 19.8% and 8.2% respectively. When the size changes from 6 mm to 7.5 mm, usability decreases by 1.6%. Regarding workload, as character size increases from 3 mm to 4.5 mm and from 4.5 mm to 6 mm, workload decreases by 31.2% and 27.5% respectively, while the workload score for 7.5 mm characters is 5.3% higher than for 6 mm characters (see Figure 7). The character size of 6 mm is thus a turning point: at or above 6 mm, usability and workload change more gradually than they do below 6 mm. In summary, participants' perception of character size improves as size increases, but this effect diminishes once the size exceeds 6 mm.
As shown in Figure 8, as the number of characters per row increases, usability decreases and workload increases. As the number of characters per row increases from 7 to 15, the usability scores decrease by 9.6%, 0.4%, 3.5%, and 3.7% at each successive step. In other words, 9 characters per row is a turning point in usability scoring: beyond 9 characters, usability declines more gradually. Meanwhile, workload increases steadily as length increases. Overall, participants' perception of the text is better at 7 or 9 characters per row and gradually deteriorates as the length continues to increase.