14th IFAC Symposium on Analysis, Design and Evaluation of Human Machine Systems
Tallinn, Estonia, Sept. 16-19, 2019
Available online at www.sciencedirect.com
ScienceDirect
IFAC PapersOnLine 52-19 (2019) 347–352
Measurement of Driver’s Mental Workload in Partial Autonomous Driving
Weiya Chen*, Tetsuo Sawaragi** and Yukio Horiguchi***
*Kyoto University, Kyoto, 〒 615-8540 Japan (Tel: 080-7046-7306; e-mail: [email protected]).
** Kyoto University, Kyoto, 〒 615-8540 Japan (e-mail: [email protected])
*** Kyoto University, Kyoto, 〒 615-8540 Japan (e-mail: [email protected])
Abstract: Autonomous driving has the potential to be the next critical technology that changes the lifestyle of a human. It has, however, not been verified whether the autonomous technology at different lower levels decreases or increases the mental workload of drivers. This paper verifies the relationship between a driver's mental workload and level 0, 1 and 2 of autonomous driving using NASA-TLX and adjusted response time in a secondary task simulation study. Experiment results show that level 1 causes the lowest mental workload of the driver, followed by level 2 and level 0. We discuss this nonlinear relationship between the levels of autonomous driving support and the mental workloads incurred to the human driver.
Keywords: mental workload; autonomous driving; human-automation-interaction; NASA-TLX; response time; secondary task
2405-8963 Copyright © 2019. The Authors. Published by Elsevier Ltd. All rights reserved. Peer review under responsibility of International Federation of Automatic Control. 10.1016/j.ifacol.2019.12.083
1. INTRODUCTION
Autonomous driving has the potential to be the next critical technology that changes our lifestyle. The technology itself would be mature enough to be introduced to the commercial market by many automobile manufacturers, as far as lower levels of automation are concerned. It has, however, not been verified whether the autonomous technology decreases or increases the mental workload of the drivers at different partial autonomous levels.
The paper has the goal to categorize a driver's mental workload in partial autonomous driving via simulator studies. The importance of keeping the driver in the loop will be discussed, which is essential to the design of a cooperation strategy between a human and automation in lower level autonomous driving vehicles. For this purpose, a variety of methods have been introduced so far. For instance, Visual Auditory Cognitive Psychomotor (VACP) is a method used for mental workload prediction, and NASA-TLX and response time are the two indicators of mental workload in the experiment results. Further, sets of experimental data are analyzed using ANOVA. Experiment results are discussed concerning the different levels of automation driving and the different secondary tasks that a driver is requested to perform during driving simulation, and the measured mental workloads therein are compared and discussed.
2. THEORETICAL BASICS
2.1 Human-Automation Issue
There are many categorizations of vehicle automation, among which the International Society of Automotive Engineers' (SAE) Levels of Driving Automation is widely recognized and accepted (SAE J3016). The levels of automation in autonomous driving are distinguished based on the allocation of the "dynamic driving task" between the automation system and the human driver. If a human driver entirely performs the driving tasks, the vehicle is said to be at level 0; if the autonomous driving system is in charge of all driving tasks, the vehicle is classified as level 5.
Different from fully autonomous driving, level 0 to level 2 autonomous driving requires human drivers to monitor the driving environment by themselves, supposing they should be kept in the loop. Although the autonomous driving system at such a level could perform a part of the driving tasks independently, not all driving tasks can be taken over by the automation when the task has to deal with more complex driving environments. When the automation system cannot deal with the situation, the control right will be returned to a human driver. At that time, the human driver needs to grasp the situation quickly and appropriately to avoid potential damages incurred to the driver at the time of that turnover.
This requirement is hard for a driver to meet when he/she over-trusts the automation system. If automation replaces human drivers in some function blocks, it is unavoidable for drivers to depend on the system and lose their vigilance on monitoring the environment. Even if the driver could react to requests from the autonomous driving system, their situation awareness capabilities would be challenged.
At that time, situation awareness requires extra mental workload from a driver to perceive the surroundings correctly and to comprehend their meanings by projecting them onto his/her status. In a word, a driver needs more mental resources to
figure out what is going on around him/her in the current traffic situation.
Compared to the workload saved by automatic accelerating or braking, such a situation requires extra mental workload from a driver. How does the total mental workload of a driver change as the level of partial driving automation increases? Will it decrease, as expected from the original design purpose of autonomous driving, or will it rather increase because of the workload needed for monitoring and perception tasks?
Cognitive demand can be placed upon different resource channels, each of which has its own capacity.
As Table 1 shows (due to the page limit, only the Visual part is shown), seven resource channels are defined: Visual, Auditory, Cognitive, Fine motor, Gross motor, Speech, and Tactile. Each channel is scored according to how much of the corresponding resource is occupied, and a higher score represents more resource use. The total mental workload is calculated as the sum of the seven channels' scores. Using VACP, the mental workload of any task can be quantitatively predicted.
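The summation over the seven channels can be illustrated with a minimal Python sketch. Only the Visual scale values come from Table 1; the function name `vacp_total`, the dictionary keys, and the non-visual ratings in the example are hypothetical and serve only as an illustration.

```python
from typing import Mapping

# The seven VACP resource channels named in the text.
CHANNELS = ("visual", "auditory", "cognitive", "fine_motor",
            "gross_motor", "speech", "tactile")

# Visual scale values from Table 1 (Visual part only). The scales of the
# other channels are not reproduced in this paper.
VISUAL_SCALE = {
    "none": 0.0,
    "register_detect": 1.0,
    "inspect_check": 3.0,
    "locate_align": 4.0,
    "track_follow": 4.4,
    "discriminate": 5.0,
    "read": 5.1,
    "scan_search_monitor": 6.0,
}


def vacp_total(channel_scores: Mapping[str, float]) -> float:
    """Sum the per-channel scores; channels that are not listed count as 0."""
    unknown = set(channel_scores) - set(CHANNELS)
    if unknown:
        raise ValueError(f"unknown channel(s): {sorted(unknown)}")
    return sum(channel_scores.get(channel, 0.0) for channel in CHANNELS)


# Hypothetical rating of a single task: the visual value is taken from
# Table 1, the cognitive and speech values are made up for illustration.
example_task = {
    "visual": VISUAL_SCALE["read"],
    "cognitive": 4.0,  # hypothetical
    "speech": 2.0,     # hypothetical
}
total = vacp_total(example_task)
print(f"Predicted VACP workload: {total:.1f}")
```

Treating unlisted channels as zero is a design choice of this sketch: it mirrors the "No Visual Activity" value of 0.0 in Table 1 and is assumed here to carry over to the other channels.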
Table 1. VACP Scales (Visual part) (Alion, 2015)
| Value | Visual Descriptors |
|---|---|
| 0.0 | No Visual Activity |
| 1.0 | Visually Register/Detect (detect the occurrence of image) |
| 3.0 | Visually Inspect/Check (discrete inspection/static condition) |
| 4.0 | Visually Locate/Align (selective orientation) |
| 4.4 | Visually Track/Follow (maintain orientation) |
| 5.0 | Visually Discriminate (detect the visual difference) |
| 5.1 | Visually Read (symbol) |
| 6.0 | Visually Scan/Search/Monitor (continuous/serial inspection, multiple conditions) |
2.2 Mental Workload Measurement
Mental workload (MWL) has been defined as the proportion of the human operator's mental capabilities that are occupied while performing a given task (Boff et al., 1994). Three types of evaluation of human mental workload are the most well-known. The first type is subjective measures, which are obtained by asking humans to provide opinions through interviews and questionnaires or by observing their behaviors. The second type is task performance measurement. The third type is physiological indicators such as galvanic skin responses, EEG (electroencephalography), gaze, and so on.
2.3 Secondary Task Measures of Workload
Secondary task measurement is one of the most widely used methods to assess an operator's mental workload. The operator performs the primary task under a specific condition and uses spare resources and attentional capacity to complete the required secondary tasks. How well the operator performs the secondary tasks indicates how much mental workload the operator is under at that time (Gawron, 2008). The secondary-task technique has several advantages: first, it may provide a sensitive measure of an operator's capacity to deal with a task; second, it may provide a sensitive index of task impairment due to stress; and finally, it may provide a standard metric for comparing various tasks.
2.4 Subjective Measures of Workload
Casali and Wierwille identified several advantages of subjective measures: "inexpensive, unobtrusive, easily administered, and readily transferable to full-scale aircraft and a wide range of tasks" (Casali & Wierwille, 1983). Muckler and Seven state that subjective measures may be essential among the mental workload measurement methods (Muckler & Seven, 1992).
2.5 Quantitative Evaluation of Mental Workload
The Visual Auditory Cognitive Psychomotor (VACP) method (Bierbaum, Szabo, & Aldrich, 1989) is a quantitative implementation of multiple resource theory. The VACP method builds upon Wickens' Multiple Resource Theory (Wickens & Hollands, 2000).
2.6 Previous study
Long before the autonomous driving technology revolution, the aircraft domain had already built mature systems combining manual pilot control and automated control. Joey Mercer and other researchers from NASA conducted an experiment on air traffic controllers' ability to detect and resolve conflicts under varying task sets, traffic densities, and run lengths. The ability of controllers to resolve conflicts was measured separately with and without their engagement in the conflict detection task. Their results suggest that involving the controller in the detection task helped them to instruct a resolution maneuver in less time and perhaps with less effort (Mercer et al., 2016).
In the automobile field, Young and Stanton conducted a study using a secondary task measure and found that workload is significantly reduced when ACC (Adaptive Cruise Control) is engaged (Young & Stanton, 1997). Their result reflects the mental workload comparison between level 0 and level 1. However, the relation of the driver's mental workload across the different partial autonomous driving levels has not been verified.
3. EXPERIMENTS
3.1 Experiment Design
The purpose of the experiments is to measure the driver's mental workload incurred when he/she drives at level 0, level 1 and level 2 of partially autonomous driving using a driving simulator.
The experiments are carried out on workstations using the PreScan driving simulation software (PreScan, 2017), which can simulate level 0, level 1 and level 2 driving experiences.
mental workloads in recent research. It has been shown that NASA-TLX is an effective way to measure mental workload (Jin, Zheng, Pei & Li, 2017). A higher NASA-TLX score represents more mental workload incurred. NASA-TLX asks participants to rate their "Mental Demand," "Physical Demand," "Temporal Demand," "Performance," "Effort" and "Frustration."
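How the six subscale ratings are combined into one score can be sketched as follows. The text does not state whether the unweighted (raw) or the weighted NASA-TLX score was used, so both common variants are shown under that caveat; the function names and all example ratings and weights are hypothetical.

```python
# The six NASA-TLX subscales listed in the text.
SUBSCALES = ("mental_demand", "physical_demand", "temporal_demand",
             "performance", "effort", "frustration")


def raw_tlx(ratings):
    """Unweighted score: the mean of the six subscale ratings (0-100 each)."""
    missing = [s for s in SUBSCALES if s not in ratings]
    if missing:
        raise ValueError(f"missing subscale(s): {missing}")
    return sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)


def weighted_tlx(ratings, weights):
    """Weighted score: each weight counts how often a subscale was chosen
    in the 15 pairwise comparisons, so the weights must sum to 15."""
    if sum(weights[s] for s in SUBSCALES) != 15:
        raise ValueError("weights must sum to 15")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


# Hypothetical ratings and weights for one subject and one secondary task.
ratings = {"mental_demand": 70, "physical_demand": 20, "temporal_demand": 50,
           "performance": 30, "effort": 60, "frustration": 40}
weights = {"mental_demand": 5, "physical_demand": 0, "temporal_demand": 3,
           "performance": 2, "effort": 4, "frustration": 1}

print(raw_tlx(ratings))                # unweighted mean of the six ratings
print(weighted_tlx(ratings, weights))  # weighted by pairwise-comparison counts
```

With either variant, a higher value corresponds to a higher reported workload, which matches the interpretation used in this section.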
With a Logitech G29 Driving Force Racing Wheel (Fig. 1, left) as the external device, a subject can drive in the human-in-the-loop case in the simulation. Subjects are restricted to those who hold a driver's license for automatic transmission cars, and the shift strategy is set to automatic shift. Except for the clutch pedal, the pedals are left unchanged, so the brake and the accelerator work as usual.
3.3 Secondary Task and Predicted Mental Workloads
Nine secondary tasks are chosen according to their suitability for autonomous driving and their feasibility in the experiment environment. Their process descriptions and corresponding VACP scores for the mental workload are shown in Table 2.
Three Dell 34-inch digital high-end monitors (U3415W) are used to form a surround visual effect. The angles between neighboring monitors are adjusted to 120 degrees. As shown in Fig. 1, right side, the subject will face the middle monitor during the actual driving situation.
Fig. 1. Logitech G29 Driving Force Racing Wheel (left) and Driving Simulation Environment (Highway Scenario) (right)
The experiment at each level is first executed in a highway scenario as a training trial. Once a subject is familiar with the simulator, for example how fast the brake responds or how sensitive the steering wheel is, the formal experiment is carried out in an urban scenario where more visual information exists, including other vehicles, pedestrians, buildings, and so on.
Level 0 of automation driving, which involves no automation system, is easy to realize in PreScan. Different from Level 0, Level 1 has an ACC function embedded. Adaptive Cruise Control (ACC) is a partially autonomous system that aims to control the longitudinal velocity of a car and keep a safe distance to the preceding vehicle. Besides the ACC function of Level 1, Level 2 also involves the Lane Keeping Assist System (LKAS), which assists steering operations to help the driver keep the vehicle in the center of the lane.
Table 2. Process description and VACP score for nine secondary tasks
| Task | Process Description | VACP |
|---|---|---|
| AOSPAN | Five letters in random order are presented on an iPad. After they are shown to the subject, five simple math problems are presented by a soundtrack. After the math problems are solved, the investigator asks the subject to recall the five letters in order. | 27.3 |
| Time perception | Estimate a time interval. To prevent subjects from silently counting the seconds, a distracting question is asked during each section. | 10.0 |
| PASAT | Numbers are presented about every 3 s by a soundtrack, and the subject's task is to add them continuously. | 17.3 |
| Calling | Call the investigator using a cell phone. | 17.8 |
| 2-back task | Ten numbers are shown on an iPad one by one; the subject decides whether the current number is the same as the one presented 2 trials ago. | 14.9 |
| Texting | Send a message to the investigator using a cell phone. | 16.7 |
| Math arithmetic | Solve simple math problems read by a soundtrack. | 15.0 |
| Simple question | Answer questions about the weather, favorite food, etc. | 6.2 |
| Simple instruction | Follow the instruction of cross turning. | 6.8 |
3.4 Experiment Procedure
The subject first signs a participation agreement. After a brief introduction to the equipment and an explanation of each secondary task, every subject is required to choose at least five of the nine secondary tasks mentioned before. Subjects choose only a subset because of the duration of the experiment: it takes too long to perform all nine secondary tasks at all three levels of simulation. Fatigue, frustration, and other emotional factors arising from long operation should be avoided because a subjective rating method is used.
3.2 Mental Workload Measurement Method Selection
Due to the lack of specialized equipment for physiological signal measurement, secondary task performance measurement and subjective rating techniques are chosen as the methodology. Compared with other secondary task performance indicators, response time is easy to observe and suitable for every kind of task. Thus, response time is regarded as the reference indicator for secondary task performance evaluation. A longer adjusted response time means that more mental workload is incurred. As the subjective rating form, NASA-TLX is chosen. NASA-TLX is widely accepted and has often been adopted as a subjective method to measure
Then, the test trial in the highway scenario of level 0 starts. When the subject is ready, the formal experiment with secondary tasks is conducted in the urban scenario of level 0. A video camera records the whole urban driving simulation. After the experiment in level 0, the subject completes the NASA-TLX questionnaire for each secondary task using the NASA-TLX App on iPad (NASA, 2018).
After level 0, level 1 and level 2 are carried out following the same procedure as level 0.
All ten subjects are students from Kyoto University, aged from 21 to 26, including seven males and three females. They all have valid driver's licenses. The small sample size may negatively affect the accuracy of the results. However, the conclusion can provide reliable indicators, which are useful for guiding further research.
Woltz and Was (Woltz & Was, 2006) proposed the rate correct score (RCS) as follows:
$RCS = (1 - PE) / RT_c$ (2)
where $RT_c$ is the average response time of correct responses. This score represents the number of correct responses produced per time unit of response activity. IES intuitively shows the tendency of the adjusted response time, while RCS represents its reciprocal.
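The two speed-accuracy measures can be sketched in a few lines of Python. The sketch follows the definitions as stated in this paper: IES divides the average response time of correct responses by the proportion of correct responses, and RCS is the reciprocal of that ratio. The function names and the example trial data are hypothetical.

```python
def mean_correct_rt(correct_rts):
    """Average response time over the correct responses only."""
    if not correct_rts:
        raise ValueError("at least one correct response is required")
    return sum(correct_rts) / len(correct_rts)


def inverse_efficiency_score(correct_rts, n_trials):
    """IES: mean correct RT divided by the proportion of correct responses."""
    proportion_correct = len(correct_rts) / n_trials
    return mean_correct_rt(correct_rts) / proportion_correct


def rate_correct_score(correct_rts, n_trials):
    """RCS as written in Eq. (2): proportion correct divided by mean correct RT."""
    proportion_correct = len(correct_rts) / n_trials
    return proportion_correct / mean_correct_rt(correct_rts)


# Hypothetical secondary-task data: 4 trials, 3 answered correctly,
# with response times in seconds for the correct answers.
correct_rts = [2.0, 3.0, 4.0]
n_trials = 4

ies = inverse_efficiency_score(correct_rts, n_trials)
rcs = rate_correct_score(correct_rts, n_trials)
print(ies, rcs)
```

A longer IES therefore corresponds to a slower or more error-prone response, which is why it is read as the adjusted response time.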
IES evaluation results are presented in Table 4 for a more explicit comparison between NASA-TLX and response time.
Table 4: IES Change Summary (rows: AOSPAN, PASAT, Calling, 2-back task, Texting, Math arithmetic, Simple question, Simple instruction; columns: subjects 1 to 10)
4. EXPERIMENT RESULTS
The NASA-TLX results and the response times obtained from the experiment videos are shown and discussed in this section.
4.1 NASA-TLX Results
The change tendencies of the NASA-TLX score from level 0 via level 1 to level 2 are summarized by secondary task and subject. The four tendencies, "down and up," "down," "up" and "up and down," are marked in green, orange, blue and yellow, respectively.
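The four change tendencies can be made concrete with a small Python sketch that labels a triple of scores (level 0, level 1, level 2) and counts the labels. How ties between two levels were handled is not stated in the text, so the rule below (a tie counts toward the monotonic direction, and three equal scores give "no change") is an assumption; the function name and the example scores are hypothetical.

```python
from collections import Counter


def change_tendency(level0, level1, level2):
    """Label how a score changes from level 0 via level 1 to level 2."""
    first = level1 - level0    # change from level 0 to level 1
    second = level2 - level1   # change from level 1 to level 2
    if first < 0 and second > 0:
        return "down and up"
    if first > 0 and second < 0:
        return "up and down"
    if first == 0 and second == 0:
        return "no change"     # assumption: case not described in the text
    if first <= 0 and second <= 0:
        return "down"
    return "up"


# Hypothetical NASA-TLX scores (level 0, level 1, level 2) for four cells.
scores = [
    (60, 40, 50),
    (60, 50, 40),
    (40, 50, 60),
    (40, 60, 50),
]
counts = Counter(change_tendency(*triple) for triple in scores)
for label, count in counts.items():
    print(f"{label}: {count}/{len(scores)}")
```

Counting the labels over all filled cells of a summary table gives the kind of occurrence proportions reported for the tables in this section.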
[Change summary table cells for subjects 1 to 10 (one arrow symbol per performed secondary task): the original row and column alignment is not recoverable from the extracted layout.]
4.3 Data Analysis
In the Time Perception task the response time cannot be measured, so the remaining eight secondary tasks are evaluated. In a total of 48 secondary tasks, "down and up," "down," "up" and "up and down" appear 19, 13, 5 and 11 times, respectively. In terms of IES, "down and up" is the most frequent of the four change tendencies. How much the IES changes across the levels of automation driving is analyzed in Section 4.3.
Table 3: NASA-TLX Change Summary
Table 3 columns (secondary tasks): AOSPAN, Time perception, PASAT, Calling, 2-back task, Texting, Math arithmetic, Simple question, Simple instruction; rows: subjects 1 to 10.
MANOVA is used for data analysis. The three independent variables are the level of automation, the subject and the secondary task. The NASA-TLX score and the adjusted response time are the dependent variables.
Table 3 shows that the dominant tendency is for mental workload to decrease as the level of driving automation increases. The proportions of "down and up," "down," "up" and "up and down" are 17/56, 27/56, 9/56 and 3/56, respectively. Another factor to consider is the amount of change, which is analyzed in Section 4.3.
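The four tendency labels can be assigned mechanically from the three scores of one subject and task. The sketch below is one possible rule, assuming that "down" and "up" mean monotonic change and that ties are reported separately as "no change" (the text does not say how ties were handled). The score triples are hypothetical; the counts in `reported` are the Table 3 totals quoted above.

```python
from collections import Counter


def tendency(level0, level1, level2):
    """Label how a score moves from level 0 via level 1 to level 2."""
    first = level1 - level0    # change from level 0 to level 1
    second = level2 - level1   # change from level 1 to level 2
    if first < 0 < second:
        return "down and up"
    if second < 0 < first:
        return "up and down"
    if first <= 0 and second <= 0 and (first < 0 or second < 0):
        return "down"
    if first >= 0 and second >= 0 and (first > 0 or second > 0):
        return "up"
    return "no change"         # assumption: identical scores at all levels


# Hypothetical NASA-TLX triples (level 0, level 1, level 2), not measured data.
scores = [(70, 40, 50), (60, 50, 45), (30, 40, 55), (40, 60, 35)]
counts = Counter(tendency(*triple) for triple in scores)

# Totals reported in the text for Table 3.
reported = {"down and up": 17, "down": 27, "up": 9, "up and down": 3}
total = sum(reported.values())
shares = {label: n / total for label, n in reported.items()}
print(counts, shares)
```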
The combination of NASA-TLX and RCS has a p-value of 0.024613, which indicates a significant difference among the automation levels (level 0, level 1 and level 2). The combination of NASA-TLX and IES has a p-value of 0.035657, which also indicates a significant difference. The root-mean-square standardized effect size Ψ of NASA-TLX in a one-way ANOVA (with level as the independent variable) is 9.5662, with an analytical confidence interval of [2.1017, 17.0307].
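For readers unfamiliar with the one-way ANOVA mentioned above, the following sketch computes its F statistic with level as the only factor. It is a deliberate simplification, not the MANOVA used in the paper: it treats the three levels as independent groups (ignoring that the same subjects drove at every level), it does not compute a p-value, and the scores are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the samples around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical NASA-TLX scores per level (not experimental data).
level0 = [60, 62, 64]
level1 = [40, 42, 44]
level2 = [45, 47, 49]
f_value, df_b, df_w = one_way_anova_f([level0, level1, level2])
print(f_value, df_b, df_w)
```

A large F value means the level means differ by much more than the scatter within each level; converting it to a p-value requires the F distribution, which a statistics package would supply.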
4.2 Response Time
Experiments in cognitive psychology usually deal with two dependent variables: the error rate and the reaction time (RT) of correct responses. Several measures that combine speed and accuracy into a single score have been proposed. Among those, the most popular and classical one is the inverse efficiency score (IES) (Townsend & Ashby, 1978). This measure is the ratio of the average correct RT ($RT_c$) to the proportion of correct responses:
The root-mean-square standardized effect size Ψ of IES in a one-way ANOVA (with level as the independent variable) is -0.2454, with an analytical confidence interval of [NaN, 0.8653].
4.4 Relation of driver’s mental workload among levels of partial autonomous driving
$\mathrm{IES} = RT_c/(1 - PE)$ (1)
This measure is an estimate of RT adjusted for the frequency of incorrect responses. Following this, three decades later,
Since the ANOVA indicates a significant difference among the levels of automation, boxplots are used to examine the values at each level and to see how the driver's mental workload changes as the level of partial driving automation increases.
The bottom two boxplots in Fig. 2 show NASA-TLX and IES for each secondary task. NASA-TLX differs clearly among the secondary tasks. "Simple Question" has the lowest NASA-TLX score, which indicates that it imposes the lowest mental workload on the driver. "Calling" and "Texting" have the highest NASA-TLX scores, which indicates that these two tasks impose more mental workload on the driver. Compared with NASA-TLX, the spread of IES across tasks is narrower. However, "Calling" and "Texting" have the smallest IES, which would suggest the lightest mental workload. This may be due to the complicated procedure of these tasks: after the initial reaction, the subject still has many steps to complete, such as reaching for the phone, unlocking it and tapping. These follow-up steps may explain why the subjects report more mental workload yet have the shortest response times.
The top two boxplots in Fig. 2 show NASA-TLX and IES at the three levels. For NASA-TLX, the difference between level 0 and level 1 is obvious, while the score at level 2 is only slightly higher than at level 1. The lowest NASA-TLX score is thus at level 1, followed by level 2, and the highest is at level 0. The gap between level 1 and level 2 is much smaller than the gap between level 0 and level 1. This suggests that introducing an autonomous system reduces the driver's mental workload, and that with greater engagement of the autonomous system the driver has somewhat more mental workload, though still much less than in fully manual driving. NASA-TLX thus shows a "down and up" tendency. As for IES, the adjusted response times at the three levels appear nearly identical.
5. DISCUSSIONS
5.1 Mental Workload of lower autonomous driving levels
Our hypothesis was that when the level of autonomous driving increases from level 0 to level 1, in the scenario using adaptive cruise control (ACC), the mental workload for foot actions is saved, but more mental workload is needed to monitor the dashboard. Similarly, from level 1 to level 2, the mental workload of lane keeping is released, while more visual resources are required for dashboard monitoring. As shown in the boxplots of Fig. 2, level 1 has the lowest subjectively rated mental workload, followed by level 2 and level 0. As for IES, the tendency is less evident, but it seems to increase first and then decrease as the level of autonomous driving increases. Because of the small sample size, the IES, which involves correct rates, may not be accurate. The response time was measured by inspecting the videos by eye, which introduced significant deviation. The decrease in IES from level 1 to level 2 may arise because the subjects became increasingly familiar with the secondary tasks during the experiment. Thus, NASA-TLX is found to be the better indicator of mental workload. Based on NASA-TLX, level 1 automated driving reduces a driver's mental workload relative to level 0, whereas level 2 increases it relative to level 1. In general, when automation technology is first introduced, the human driver has much less mental workload. However, as more dynamic driving tasks are performed by the automation system, more mental workload becomes necessary. Although the added mental workload is quite small compared with the saved mental workload, we can infer that a higher automation level leads to higher mental workload than level 1. The comparison of the mental workload at level 1 and level 4 needs further research. One reason for such a transitional change may be the situation awareness issue.
Level 1 driving automation still requires the driver to be in the loop: the driver must observe the driving conditions because he or she still needs to control the speed or the steering. However, when the human driver is out of the loop at level 2 because of full trust in the automation system, although monitoring the environment is still the driver's task, it takes more time and more mental resources for the driver to build a complete picture of what is currently happening in the driving environment.

Fig. 2. NASA-TLX and IES boxplots by level of driving automation (top two) and by secondary task (bottom two).
subject to perform secondary tasks while driving, the relation has been found. The driver's mental workload first decreases from level 0 to level 1 and then increases from level 1 to level 2, but level 2 still causes a lower mental workload than level 0. The secondary tasks cause different mental workloads because they occupy different resource channels. Further research can help clarify the relations among the mental workloads caused by different resource channels.
Another reason may be the acceptability issue. At level 1, subjects are kept in the loop by the remaining steering task. All subjects kept their hands on the steering wheel at level 1, whereas at level 2 some hands-off-wheel behavior was observed. When a take-over situation occurred at level 2, the subjects acted more anxiously. None of the participants had experience driving an actual level 2 autonomous vehicle, which caused uncertainty when driving. If a human driver is sufficiently familiar with autonomous technology, he or she may handle emergencies more calmly and with lower mental workload.
REFERENCES
Alion. (2015). Improved Performance Research Integration Tool. Alion Science and Technology Corporation.
Bierbaum, C., Szabo, S., & Aldrich, T. (1989). Task Analysis of the UH-60 Mission and Decision Rules for Developing a UH-60 Workload Prediction Model. US Army Research Institute.
Boff, K. R., Kaufmann, L., & Thomas, J. P. (1994). Handbook of Perception and Human Performance, Vol. 2, Cognitive Processes and Performance. 42-2, New York, NY: John Wiley & Sons.
Casali, J., & Wierwille, W. (1983). A comparison of rating scale, secondary task, physiological, and primary-task workload estimation techniques in a simulated flight task emphasizing communications load. Human Factors, 623-642.
Gawron, V. J. (2008). Human Performance, Workload, and Situational Awareness Measures Handbook. CRC Press.
Jin, X., Zheng, B., Pei, Y., & Li, H. (2017). A Method to Estimate Operator’s Mental Workload in Multiple Information Presentation Environment of Agricultural Vehicles. EPCE 2017, 3-20.
Mercer, J., Gomez, A., Gabets, C., Bienert, N., Edwards, T., Martin, L., Gujral, V., & Homola, J. (2016). Impact of Automation Support on the Conflict Resolution Task in a Human-in-the-Loop Air Traffic Control Simulation. IFAC-PapersOnLine 49-19, 36-41.
Muckler, F. A., & Seven, S. (1992). Selecting performance measures: "objective" versus "subjective" measurement. Human Factors, 441-455.
NASA TLX App (2018). https://humansystems.arc.nasa.gov/groups/TLX/tlxapp.php, NASA.
PreScan Manual Version 8.1.0 (2017). TASS International.
Townsend, J. T., & Ashby, F. G. (1978). Methods of modeling capacity in simple processing systems. In J. N. Castellan & F. Restle, Cognitive Theory 3, 199-239. Lawrence Erlbaum Associates, New York.
Wickens, C. D., & Hollands, J. G. (2000). Engineering psychology and human performance. Prentice Hall, Upper Saddle River, N. J.
Woltz, D. J., & Was, C. A. (2006). Availability of related long-term memory during and after attention focus in working memory. Memory & Cognition, 668-684.
Young, M. S., & Stanton, N. A. (1997). Automotive automation: Investigating the impact on drivers' mental workload. International Journal of Cognitive Ergonomics 1 (4), 325-336.
5.2 Mental Workload of Secondary Tasks
Table 5 summarizes the mental workload categories of the nine secondary tasks, comparing the scores expected from VACP with the experimental results. Because the scoring units and dimensions differ, the comparison is qualitative rather than quantitative.

Table 5: Mental Workload Summary
Task                 VACP      NASA-TLX
AOSPAN               High      Average
Time perception      Low       Average
PASAT                High      Average
Calling              High      High
2-back task          Average   High
Texting              High      High
Math arithmetic      Average   Average
Simple question      Low       Low
Simple instruction   Low       Average
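The agreement between the VACP prediction and the NASA-TLX result in Table 5 can be checked with a short tally. The sketch below uses the category labels from Table 5 as given; the variable names are illustrative only.

```python
from fractions import Fraction

tasks = ["AOSPAN", "Time perception", "PASAT", "Calling", "2-back task",
         "Texting", "Math arithmetic", "Simple question", "Simple instruction"]
# Workload categories per task, in the same order as `tasks` (from Table 5).
vacp = ["High", "Low", "High", "High", "Average",
        "High", "Average", "Low", "Low"]
nasa_tlx = ["Average", "Average", "Average", "High", "High",
            "High", "Average", "Low", "Average"]

# A prediction counts as correct when both methods give the same category.
matches = [t for t, v, n in zip(tasks, vacp, nasa_tlx) if v == n]
rate = Fraction(len(matches), len(tasks))
print(matches, rate)
```

The matching tasks are Calling, Texting, Math arithmetic and Simple question, giving 4 of 9.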
When NASA-TLX is taken as the experimental result, the prediction is correct for 4 of the 9 tasks. This rate seems quite low, but the small sample size cannot provide a sufficiently accurate dataset. Every secondary task occupies different resource channels and causes a different mental workload in each channel. The data on the secondary tasks should be analyzed more carefully with a quantitative evaluation method. Some actions are performed simultaneously, and this overlap should also be considered. Building such a quantitative system is left for future research. Moreover, motor coordination may be another factor that influences the driving task. It is reasonable to believe that, with a larger sample size and a new analysis system, the prediction accuracy can be improved.
6. CONCLUSION
This paper focuses on the driver's mental workload in partially autonomous vehicles and how it changes across different levels of partial driving automation. Through a driving simulation experiment, which requires the