14th IFAC Symposium on 14th IFAC Symposium on 14th IFAC Symposium on 14th IFACDesign Symposium on Analysis Design and Evaluation Evaluation of Human Human Machine Systems Available online at www.sciencedirect.com Analysis and of Machine Systems 14th IFAC Symposium on Analysis Design and Evaluation of Analysis Design and Evaluation of Human Human Machine Machine Systems Systems Tallinn, Estonia, Sept. 16-19, Tallinn, Estonia, Estonia, Sept. 16-19, 2019 2019 Analysis Design and Evaluation of Human Machine Systems Tallinn, Sept. 16-19, 2019 Tallinn, Estonia, Sept. 16-19, 2019 Tallinn, Estonia, Sept. 16-19, 2019
ScienceDirect
IFAC PapersOnLine 52-19 (2019) 235–240
Analysis of Human Skill Development Analysis of Human Skill Development Analysis of Human Skill Analysis of Human Skill Development Development Manual Ramp-Tracking Tasks Manual Ramp-Tracking Tasks Manual Ramp-Tracking Manual Ramp-Tracking Tasks Tasks M. M. M. M. M.
in in in in
Willems, D. M. Pool, van der H. J. Willems, D. M. Pool, K. K. van der El, El, H. J. 1Damveld, Damveld, Willems, D. M. K. van El, H. Willems, D. van M. Pool, Pool, K. and van der der El, H. J. J. Damveld, Damveld, M. M. Paassen Max Mulder M. M. M. Paassen Max Mulder Willems, D. van M. Pool, K. and van der El, H. J. 1111Damveld, M. van Paassen and Max Mulder M. M. van Paassen and Max Mulder 1 M. M. van Paassen and Max Mulder Control and Simulation, Faculty of Aerospace Engineering, TU Delft, Delft, Control and Simulation, Faculty of Aerospace Engineering, TU Control Faculty of Aerospace Engineering, Control and and Simulation, Simulation, Faculty ofThe Aerospace Engineering, TU TU Delft, Delft, 2629 HS, Delft, Netherlands 2629 HS, Delft, The Netherlands Control and Simulation, Faculty of Aerospace Engineering, TU Delft, 2629 HS, HS, Delft, Delft, The The Netherlands Netherlands 2629 2629 HS, Delft, The Netherlands Abstract: Abstract: Human Human modelling modelling approaches approaches are are typically typically limited limited to to feedback-only, feedback-only, compensatory compensatory Abstract: Human modelling approaches are limited to compensatory Abstract: Human modelling approaches are typically typically limited to feedback-only, feedback-only, compensatory tracking tasks. Advances in system identification techniques allow us more tracking tasks. tasks. Advances in system system identification techniques allow us to to consider consider compensatory more realistic realistic Abstract: Human modelling approaches are typically limited to feedback-only, tracking Advances in identification techniques allow us to consider more realistic tracking tasks. Advances in system identification techniques allow uspaper to consider more realistic tasks that involve feedforward and even precognitive control. In this we study the human tasks that involve feedforward and even precognitive control. In this paper we study therealistic human tracking tasks. Advances in system identification techniques allow us to consider more tasks that involve feedforward and even precognitive control. In this paper we study the tasks that involve feedforwardcontrol and even precognitive control.to Inaccurately this paperfollow we study the human human development of a feedforward response while learning a ramp-shaped development of a feedforward control response while learning to accurately follow a ramp-shaped tasks that involve feedforward and even precognitive control. In this paper we study the human development of of a a feedforward feedforward control control response response while while learning learning to to accurately accurately follow follow aa ramp-shaped ramp-shaped development target in presence a acting on element. experiment target signal signal of in athe the presence of of a disturbance disturbance actinglearning on the the controlled controlled element. An experiment development feedforward control response while to accurately follow An a ramp-shaped target signal in the presence of a disturbance acting on the controlled element. An experiment target signal in the presence of a disturbance acting on the controlled element. An experiment was conducted in which two groups of eight subjects each tracked ramps of different steepnesses was conducted conducted inthe which two groups groups of eight eight subjects subjects each tracked rampselement. of different different steepnesses target signal inin presence of a disturbance acting each on the controlled An experiment was which two of tracked ramps of steepnesses was conducted in which fashion. two groups of eight subjects each tracked rampsby of adifferent steepnesses in a random or ordered In addition, ordered runs were followed ‘surprise’ run in a random or ordered fashion. In addition, ordered runs were followed by a ‘surprise’ run with with was conducted in which two groups of eight subjects each tracked ramps of different steepnesses in a random or ordered fashion. In addition, ordered runs were followed by a ‘surprise’ run with in a random or ordered fashion. In addition, ordered runslearn wererapidly, followedcontinue by a ‘surprise’ run with a random ramp steepness. Results show that operators to learn during a random ramp steepness. Results show that operators learn rapidly, continue to learn during in a random or ordered fashion. In addition, ordered runs were followed by a ‘surprise’ run with a random random ramp ramp steepness. steepness. Results Results show show that that operators operators learn learn rapidly, rapidly, continue continue to to learn learn during during a the entire and adapt very quickly to Experiments involving the entire experiment, experiment, and can can adapt verythat quickly to surprise surprise situations. Experiments involving a random ramp steepness. Results show operators learn situations. rapidly, continue to learn during the entire experiment, and can adapt very quickly to surprise situations. Experiments involving the entire experiment, and can adapt very quickly to surprise situations. Experiments involving learning are challenging, it difficult balance-out all learning operators are and challenging, asvery it is isquickly difficulttoto tosurprise balance-out all experimental experimental conditions the entireoperators experiment, can adaptas situations. Experimentsconditions involving learning operators are challenging, as it is difficult to balance-out all experimental conditions learning operators are challenging, as it is difficult to of) balance-out all experimental conditions and for differences (groups subjects. and control control for inevitable inevitable differencesasbetween between (groups subjects. all experimental conditions learning operators are challenging, it is difficult to of) balance-out and and control control for for inevitable inevitable differences differences between between (groups (groups of) of) subjects. subjects. and control© for inevitable differences between (groups of)rights subjects. Copyright 2019. The Authors. Published by Elsevier Ltd. All reserved. Keywords: cybernetics, manual control, skill, learning, modeling Keywords: cybernetics, cybernetics, manual manual control, control, skill, skill, learning, learning, modeling modeling Keywords: Keywords: cybernetics, manual control, skill, learning, modeling Keywords: cybernetics, manual control, skill, learning, modeling 1. ramp-shaped 1. INTRODUCTION INTRODUCTION ramp-shaped target target signal, signal, with with aaa pursuit pursuit display. display. The The 1. ramp-shaped target signal, with The 1. INTRODUCTION INTRODUCTION ramp-shaped target signal, with a pursuit pursuit display. display. The feedforward path was found to approximate the inverse feedforward path was found to approximate the inverse 1. INTRODUCTION ramp-shaped target signal, with a pursuit display. The feedforward path was found to approximate the inverse feedforward path was found to approximate the inverse of the CE dynamics, for range of target and disturbance Models Models of of human human control control behaviour behaviour have have existed existed since since feedforward of the CE dynamics, aaa range target and disturbance path wasfor found to of approximate the inverse Models of human control behaviour have existed since of the CE dynamics, for range of target and disturbance Models of (Krendel human control behaviour have existedetsince of the amplitude CE dynamics, for a range of target and disturbance and 1960; Wasicko al., variations (Drop al., for the the 1960s 1960s (Krendel and McRuer, McRuer, 1960; Wasicko etsince al., signal signal variations (Drop et al., 2013) for all comModels of (Krendel human control behaviour have existedet of the amplitude CE dynamics, for a range and disturbance and McRuer, 1960; Wasicko al., the signal amplitude variations (Dropofet ettarget al., 2013) 2013) for all all comcomthe 1960s 1960s (Krendel and McRuer, 1960; Wasicko etcomal., signal amplitude variations (Drop et al., 2013) for all common CE dynamics (Laurense et al., 2015). and becoming 1966; McRuer and Jex, 1967). Research focused on 1966;1960s McRuer and Jex, Jex, 1967). Research focused onetcomcommon CE dynamics (Laurense et al., 2015). and the (Krendel and1967). McRuer, 1960; focused Wasickoon al., signal amplitude variations (Drop et al., 2013) forbecoming all com1966; McRuer and Research mon CE dynamics (Laurense et al., 2015). and becoming 1966; McRuer and Jex, 1967). Research focused on common CE dynamics (Laurense et al., 2015). and becoming learning advanced (Zhang et al., pensatory tracking, where the controller oppensatory tracking, where the human human controller (HC) op- stronger stronger when learning advanced (Zhang et and al., 2017). 2017). 1966; McRuer and Jex, 1967). Research focused(HC) on common CE when dynamics (Laurense et al., 2015). becoming pensatory tracking, where human controller (HC) oppensatory tracking,system, where the the human controller (HC) op- stronger stronger when when learning learning advanced advanced (Zhang (Zhang et et al., al., 2017). 2017). controlled element (CE), erates a dynamic the erates a dynamic system, the controlled element (CE), pensatory tracking, where the human controller (HC) opstronger when learning advanced (Zhang et al., 2017). controlled element (CE), erates a dynamic system, the In this paper, we aim to study the adaptation huerates on a dynamic system, the controlled element (CE),a In this paper, we aim to study the adaptation of of huthis paper, aim to the of hubased the presented tracking between based on on the visually visually presented tracking error error between erates a dynamic system, the controlled element (CE),aa In In this paper, we we aimlearning to study study thetoadaptation adaptation of hubased the visually presented tracking error between man controllers while how perform a rampbased on the visually presented tracking error between a man controllers while learning how to perform a rampIn this paper, we aim to study the adaptation of human controllers while learning how to perform a rampquasi-random target signal and the system output. The quasi-random target signal signal and tracking the system system output. The based on the visually presented error between a man controllers while learning how to perform a rampquasi-random target and the output. The tracking task, using the SOP as theoretical basis. An quasi-random target signal and the system output. The man tracking task, using the SOP as aaa theoretical basis. An controllers while learning how to perform a ramptracking task, using the SOP as theoretical basis. human operator then behaves as a feedback-only controller human operator then behaves as a feedback-only controller quasi-random target signal and the system output. The tracking task, using the SOPwhere as a theoretical basis. An An human operator then as feedback-only controller experiment will be presented subjects performed a humancan operator then behaves behaves as aa and feedback-only controller experiment will be presented where subjects performed a tracking task, using the SOP as a theoretical basis. which be accurately modelled predicted using the experiment will will be be presented presented where where subjects subjects performed performedAn a which can can be accurately accurately modelled and predicted controller using the the experiment human operator then behaves as a and feedback-only a which be modelled predicted using combined ramp-tracking disturbance-rejection task, with which canmodel be accurately modelled and predicted using the experiment combined ramp-tracking disturbance-rejection task, with will be presented where subjects performed a combined ramp-tracking disturbance-rejection task, with crossover (McRuer and Jex, 1967). crossover model (McRuermodelled and Jex, Jex,and 1967). which canmodel be accurately predicted using the combined ramp-tracking disturbance-rejection task, with crossover (McRuer and 1967). single integrator (SI) dynamics, while manipulating the crossover model (McRuer and Jex, 1967). single integrator (SI) dynamics, while manipulating the combined ramp-tracking disturbance-rejection task, with single integrator (SI) dynamics, while manipulating the crossover model (McRuer and Jex, 1967). single integrator (SI) dynamics, while manipulating the In real-life tasks, such as or the target of the ramp target signal. The experiment had In many many real-life real-life tasks, tasks, such such as as driving driving or or flying, flying, the the target target steepness steepness of the ramp target signal. The experiment had single integrator (SI) dynamics, while manipulating the In of the ramp target signal. The experiment had In many many real-life tasks, such as driving driving or flying, flying, the target steepness steepness of the ramp target signal. The experiment had two parts, one in which the subject performed the same or ‘reference’ signal is directly observable and its future or many ‘reference’ signal is such directly observable and the its future future two parts, which the subject performed the same In real-life tasks, as driving or flying, target steepness ofone thein ramp target signal. The experiment had or ‘reference’ signal is directly observable and its two parts, one in which the subject performed the same or some ‘reference’ signal is directly observable andtoitsactivate future two parts,condition one in which the subject performed the same in consecutive runs, ‘ordered’ to extent predictable, allowing the to some some extentsignal predictable, allowing the HC HC toitsactivate activate steepness condition in ten tenthe consecutive runs, the the the ‘ordered’ or ‘reference’ is directly observable andto future steepness two parts,condition one in which subject performed same to extent predictable, allowing the HC steepness in ten consecutive runs, the ‘ordered’ to some extent predictable, allowing the HC to activate steepness condition in ten consecutive runs, the ‘ordered’ session, the other in which different steepness condition aato feedforward control loop. This charactera versatile versatile feedforward control loop. the ThisHCis is to characterthe other in which aaa different steepness some extent predictable, allowing activate session, steepness condition in ten consecutive runs, the condition ‘ordered’ versatile feedforward control loop. This is charactersession, the other in which different steepness condition a versatile feedforwardOrganization control loop.of This is charactersession, the other inrun, which a different session. steepness condition ized in the Perception (SOP) performed each the In addition, in the the Successive Successive Organization of This Perception (SOP) was was performed each thea‘random’ ‘random’ In addition, aized versatile feedforwardOrganization control loop.of is charactersession, the other inrun, which different session. steepness condition ized Perception (SOP) was performed each run, session. In addition, ized in in (Krendel the Successive Successive Organization ofwhere Perception (SOP) was performed eachsession run, the thea ‘random’ ‘random’ session. Indone addition, and McRuer, 1960), humans can after each ordered ‘surprise’ run was with theory theory (Krendel and McRuer, McRuer, 1960),ofwhere where humans can after afterperformed each ordered ordered session ‘surprise’ session. run was wasIndone done with ized in (Krendel the Successive Organization Perception (SOP) was eachsession run, theaa ‘random’ addition, and 1960), humans can theory each ‘surprise’ run with theory (Krendel and McRuer, 1960), where humans can after each ordered session a ‘surprise’ run was done with a different ramp steepness. The HC control models will progress from feedback-only compensatory control (level progress(Krendel from feedback-only feedback-only compensatory control (level aa different ramp steepness. HC control will theory and McRuer,compensatory 1960), wherecontrol humans(level can after each ordered session a The ‘surprise’ run wasmodels done with progress from different ramp steepness. The HC control models will progress from feedback-only a different ramp steepness. The HC control models will compensatory control (level be identified using data from single runs, using averaged 1) to feedback-feedforward pursuit control (level 2) to 1) to feedback-feedforward pursuit control (level 2) to be identified using data from single runs, using averaged progress from feedback-only compensatory control (level a different ramp steepness. The HC control models will 1) to feedback-feedforward pursuit control (level 2) to be identified using data from single runs, using averaged 1) to feedback-feedforward pursuit control (level 2)conto data, be identified using data from single runs, using averaged possibly ‘open loop’ (feedforward-only), precognitive and data from small time intervals within a run. possibly ‘open loop’ (feedforward-only), precognitive condata, and data from small time intervals within a run. 1) to feedback-feedforward pursuit control (level 2) to be identified using data from single runs, using averaged possibly ‘open loop’ precognitive conpossibly ‘open loop’ (feedforward-only), (feedforward-only), precognitive con- data, data, and and data data from from small small time time intervals intervals within within a a run. run. trol (level 3), on (McRuer trol (level‘open 3), depending depending on the the HC HC experience experience (McRuer possibly loop’ (feedforward-only), precognitive con- The data,paper and data from small time intervals within a run. trol (level 3), depending on the HC experience (McRuer is structured as follows: Section 2 summarizes trol (level 3), depending on the HC experience (McRuer The paper is structured as follows: Section 2 summarizes The paper is structured as follows: Section 2 summarizes and Jex, 1967). Despite their paramount importance in and (level Jex, 1967). 1967). Despite on their paramount importance in The trol 3), depending theparamount HC experience (McRuer paper is structured as follows: tasks. Section 2 summarizes and their importance in previous research on ramp-tracking The experiment and Jex, Jex, manual 1967). Despite Despite their previous research on ramp-tracking The experiment paper is structured as follows: tasks. Section 2 summarizes in previous research on ramp-tracking tasks. The experiment everyday control, these higher received everyday manual control, their these paramount higher levels levelsimportance received only only and Jex, manual 1967). Despite paramount importance in The previous research on ramp-tracking tasks. The experiment everyday control, these higher levels received only is described in Section 3, its results are discussed in everyday manual control, these higher levels received only previous is described in Section 3, its results are discussed in research on ramp-tracking tasks. The experiment little attention in the literature (Mulder et al., 2018). is described in Section 3, its results are discussed in little attention in the literature (Mulder et al., 2018). everyday manual control, these higher levels received only is described in Section 3, its results are discussed in little Section 4, and conclusions are drawn in Section 5. little attention attention in in the the literature literature (Mulder (Mulder et et al., al., 2018). 2018). Section 4, and andinconclusions conclusions are drawn in Section Section 5. is described Section 3, its results are discussed in Section 4, are drawn in 5. little attention in the literature (Mulder et al., 2018). Section 4, and conclusions are drawn in Section 5. Empirical Empirical evidence evidence for for human human feedforward feedforward control control has has Section 4, and conclusions are drawn in Section 5. Empirical evidence for human feedforward control has Empirical evidence human feedforward has been in tracking, tasks predictable 2. BACKGROUND been found found evidence in pursuit pursuitfor tracking, and tasks with with control predictable Empirical for humanand feedforward control has 2. BACKGROUND been in tracking, and tasks with 2. been found found in pursuit pursuit tracking, and tasks with predictable predictable 2. BACKGROUND BACKGROUND target signals (Wasicko et al., 1966; Magdaleno et targetfound signals (Wasicko et al., al.,and 1966; Magdaleno et al., al., been in pursuit tracking, tasks with predictable target signals (Wasicko et 1966; Magdaleno et al., 2. BACKGROUND target Hess, signals1981; (Wasicko et al.,1990; 1966;Drop Magdaleno et al., et al., 2016). 1969; Yamashita, 1969; Hess, 1981; Yamashita, 1990; Drop et al., 2016). Organization of Perception Perception Successive target Hess, signals1981; (Wasicko et al.,1990; 1966;Drop Magdaleno et al., 2.1 et al., 2016). 1969; Yamashita, 2.1 Successive Organization of Successive 1969; Hess, 1981; Yamashita, 1990; Drop et al., 2016). 2.1 Successive Organization Organization of of Perception Perception Fairly recently, it found a feedbackFairly Hess, recently, it was was found that that a combined combined feedback1969; 1981; Yamashita, 1990; Drop et al., 2016). 2.1 Fairly recently, it was found that a combined feedback2.1 Successive Organization of Perception Fairly recently, it was found that a combined feedbackfeedforward model accurately describes observed HC feedforward model accurately describes observedfeedbackHC bebe- Krendel Fairly recently, it was found that a combined and McRuer proposed the feedforward model accurately describes observed HC beKrendel and McRuer proposed the Successive Successive OrganizaOrganizafeedforward model accurately describes observed HC beKrendel and proposed the Successive haviour levels. In the Krendel and McRuer McRuer proposed the Successive OrganizaOrganizahaviour on on higher higher SOP levels. describes In these these tasks tasks the HC HC was feedforward model SOP accurately observed HC was be- tion of Perception (SOP) scheme to characterize HC haviour on higher SOP levels. In these tasks the HC was tion of Perception (SOP) scheme to characterize HC conconKrendel and McRuer proposed the Successive Organizahaviour tion of Perception (SOP) scheme to characterize HC conon higher SOP levels. In these tasks the HC was to accurately follow a deterministic, predictable instructed tion of Perception (SOP) scheme to characterize HCthree coninstructed to accurately follow a deterministic, predictable haviour ontohigher SOP follow levels. aIndeterministic, these tasks the HC was trol strategies (Krendel and McRuer, 1960). It has accurately predictable instructed trol strategies (Krendel and McRuer, 1960). It has three tion strategies of Perception (SOP)and scheme to characterize HCthree coninstructed to accurately follow a deterministic, predictable trol (Krendel McRuer, 1960). It has 1 trol strategies (Krendel and McRuer, 1960). It has three instructed to accurately follow a deterministic, predictable levels, see Fig. 1: compensatory, pursuit and precognitive E-mail:
[email protected] [email protected] 1 1 E-mail: levels, see Fig. Fig. (Krendel 1: compensatory, compensatory, pursuit and precognitive trol strategies and McRuer, 1960). It has three and precognitive levels, see 1: pursuit
[email protected] 1 1 E-mail: levels, see Fig. 1: compensatory, pursuit and precognitive
[email protected] 1 E-mail: levels, see Fig. 1: compensatory, pursuit and precognitive E-mail:
[email protected] 2405-8963 Copyright © 2019. The Authors. Published by Elsevier Ltd. All rights reserved.
Copyright © under 2019 IFAC IFAC 235 Peer review responsibility of International Federation of Automatic Control. Copyright © 2019 235 Copyright © 235 Copyright © 2019 2019 IFAC IFAC 235 10.1016/j.ifacol.2019.12.105 Copyright © 2019 IFAC 235
2019 IFAC HMS 236 Tallinn, Estonia, Sept. 16-19, 2019
M. Willems et al. / IFAC PapersOnLine 52-19 (2019) 235–240
control. The level at which the HC can exercise control depends on the task variables: the type of visual display, the CE dynamics Yc and the characteristics of the target ft and disturbance fd signals acting on the closed loop. Effects of these task variables on HC tracking behaviour have been extensively studied (Wasicko et al., 1966; McRuer and Jex, 1967); here, we will also investigate an operator-centered variable, namely the level of experience. n
HC ft
e
+ -
fd +
Y pe
u
θ
+
Yc
+
+
(a) First SOP level: compensatory behaviour ft ft
e
+ -
θ
n +
Y pe
+ +
fd +
u
+
Yc
+
θ
+
Y pθ
(b) Second SOP level: pursuit behaviour HC ft
Synchronous generator
Mode selector
≈
1 Yc
n
2.2 HC model and identification
fd
+ + + + +
u
+
Yc
θ
We use the pursuit HC model of Fig. 1(b), without the state feedback, Ypθ = 0. We expect the HC open loop, precognitive response to the target ramps to become apparent in the feedforward (FF) path Ypt , the response to ft . This is equivalent to considering Fig. 1(c) with the ‘Mode Selector’ set to ≈ 1/Yc , but with an additional feedback loop Ype to compensate for fd and remaining errors in responding ‘open loop’ to ft . The third component of the control signal u is the remnant n, which reflects the control input that is not linearly related to the input of the HC model (McRuer and Jex, 1967).
+
Learned response
(c) Third SOP level: precognitive behaviour
Fig. 1. Successive Organization of Perception (SOP).
e
(a) Compensatory
At the third level, precognitive control, three possible ‘open-loop’ control modes are defined. The HC is assumed to adopt one of these modes when becoming very experienced with the task. Ideally, the HC FF response would be a perfectly timed and scaled response to an expected feature in the target, requiring a perfect inversion of the CE dynamics. For some types of (easy) controlled elements and (predictable) target signals humans can indeed be trained to the precognitive control level, e.g., in tracking sinusoids (Yamashita, 1990; Drop et al., 2016). To limit the degrees of freedom, in this paper we only consider the control of a single integrator CE with a pursuit display, and focus on investigating the adaptation in the HC feedback and feedforward paths when learning to perform a ramp-tracking task. Besides the ramp-shaped target signal ft , our analysis requires the insertion of a second signal into the closed loop, an unpredictable disturbance fd , to identify the HC behaviour, Fig. 1. This means that the HC continuously needs to compensate for effects of fd acting on the controlled element, and the observed HC behaviour will here never be purely open loop.
HC
Y pt
a combined feedback-feedforward model, with Ype the HC feedback response on e and with Ypt the HC feedforward response on ft can accurately describe HC behaviour.
e f θ t
With an SI controlled element, Ype is given by:
(b) Pursuit
Fig. 2. Example of a compensatory and a pursuit display for a pitch tracking task, with the tracking error e, the target ft and the output θ indicated. At the lowest SOP level the HC is shown a compensatory display, Fig. 2(a), and minimizes the error e between an unpredictable target reference ft and the controlled element output θ. Compensatory HC behaviour can be predicted well with McRuer’s crossover model (McRuer and Jex, 1967) which describes the HC as a feedback-only servo-controller, Ype in Fig. 1. When using a pursuit display, Fig. 2(b), or when the target signal has characteristics that allow the HC to predict its (near) future values, the HC can move on to the second SOP level, pursuit control. Here, the HC can be described as a multi-loop controller acting on error e, target signal ft and CE state θ. Wasicko et al. (1966) showed that (because e = ft − θ) a HC model with two inputs can fully capture the observed behaviour. Drop et al. (2013) reported that 236
Ype (s) = Kpe e−sτpe Ynms (s), (1) with Kpe and τpe the gain and effective time delay of the feedback response, and Ynms the neuromuscular (NMS) dynamics, modeled by a second order system. The FF dynamics Ypt are given by (Laurense et al., 2015): s 1 Ypt (s) = Kpt e−sτpt Ynms (s), (2) Kc (1 + TIt s) with Kpt and τpt the gain and effective time delay of the feedforward response. In (2) we see the inversion of the SI dynamics (s/Kc term). A lag term (time constant TIt ) is included to model the imperfect HC response to the discrete onset and ending of the ramp segments (Drop et al., 2013; Laurense et al., 2015). 2.3 Ramp tracking tasks Ramp signals Ramp-like target signals are representative for a variety of discrete flight and driving maneuvers and have been used extensively in previous research (Pool et al., 2010; Drop et al., 2013; Laurense et al., 2015). In the ramp-tracking task, the target line on the pursuit display
2019 IFAC HMS Tallinn, Estonia, Sept. 16-19, 2019
M. Willems et al. / IFAC PapersOnLine 52-19 (2019) 235–240
starts and stops moving at instances not communicated to the HC, i.e., no preview is available. Their movement has a constant velocity q that is typically used for all ramps occuring in a run, see Fig. 3; higher q’s yield steeper ramps.
ft , deg
ramp 2
ramp 3 q = 1 deg s−1 q = 10 deg s−1 0
40
20
time, s
ramp 4 60
80
Independent Variables The two independent variables were the target signal ramp steepness q (4 levels) and the order of conditions (2 levels: ordered and random).
100
Fig. 3. Typical ramp-like target signal definition. Phases in ramp tracking In the HC response to a ramp signal we hypothesize three phases, similar as McRuer et al. (1968), see Fig. 4. First, in the delay phase (I) the IIb
I
III
θ, deg
I IIa
ramps), 2, 4 and 6 deg/s. To study skill development, the experiment had two Sessions. In each Session, eight subjects performed the same condition in ten consecutive runs, the ‘ordered’ part, the other eight subjects performed a different condition every run, the ‘random’ part. Then, we could investigate control adaptations from Session 1 to Session 2, to study the overall learning process, but also how this process depends on a situation where subjects know exactly what to expect in the next run, versus one where each run may have a different ramp target.
Participants, Instructions Sixteen subjects participated, all males, students or staff of TU Delft and experienced in tracking tasks. Two groups of 8 subjects were comprised: Group A first performed the ordered session, then the random session; Group B did it the other way around. Subjects were instructed to minimize the pitch tracking error e on the pursuit display. Controlled Element Single integrator dynamics were used: Yc = Kc /s, with gain Kc = 1.0 (Drop et al., 2013), such that subjects would never reach the maximum deflection limits of the stick and were still able to provide fine, accurate control inputs.
ft θ time, s
Fig. 4. Definition of ramp response phases. HC is unaware of the onset of the ramp and is suddenly confronted with an error which rises with ramp steepness q. In the rapid response phase (IIa) the HC quickly reacts to the growing error, possibly in an open-loop fashion. In the ramp-tracking phase (IIb) the HC aims to match the velocity of the system to the ramp. It is during this phase that we hypothesize that the HC has ‘recognized’ the signal as a ramp with steepness q and tries to predict the remainder of the ramp. In phase III the compensatory path will again dominate; we focus on phases IIa and IIb as here we expect most of the feedforward activity.
Forcing Functions For each run, the target signal had 8 identical ramps (3 seconds each) constructed with the four levels of ramp steepness q, see Fig. 5. Note that in the first condition, q=0 deg/s, our subjects essentially performed a disturbance-rejection task (ft = 0). 20
q = q = q = q = fd
15
ft , fd , deg
ramp 1
9 6 3 0 −3 −6 −9
237
10 5
0 2 4 6
deg/s deg/s deg/s deg/s
0 −5
−10 −15
Previous experiments Pool et al. (2010) used timedomain identification to parameterize a combined feedbackfeedforward HC model for ramp tracking. A strong FF response could be identified only, however, when the disturbance fd was small. Drop et al. (2013) investigated the presence of the HC FF response as a function of a Steepness to Disturbance Ratio, SDR = q/Kd , with q the ramp steepness and Kd the disturbance signal scale factor. Their analysis showed that the FF path is more beneficial and increases in strength relative to the feedback path for higher SDR values. These experiments confirmed that human controllers, with predictable target signals on a pursuit display, perform at a higher SOP level than compensatory tracking. These also showed the promising applicability of time domain identification methods, which will be applied here. 3. EXPERIMENT Rationale A target-following disturbance-rejection task was done, with four ramp steepness conditions: q=0 (no 237
−20 0
10
20
30
40
50
60
70
80
90
time, s
Fig. 5. Target and disturbance forcing function time traces. To make sure that subjects could not predict the start of the first ramp, a random time τ , varying between 0 and 5 s was added to the 90 s measurement time. Furthermore, to prevent subjects to use the property that ramps stop on the horizon, the target signal was off-set by a random number between 0.1 and 1 degrees each run. The disturbance signal fd was defined as a multisine, with gain 0.4, similar as done by Drop et al. (2013). Dependent Measures We focus on just a few measures: (i) tracking performance, i.e., the RMS tracking error RMS(e), and (ii) the HC model parameters representing the gains in the feedback Kpe and FF Kpt channels. The HC model will be fit to the measurements using MLE identification (Zaal et al., 2009). For the q=0 condition only the feedback parameters will be estimated, as this is essentially a disturbance-rejection compensatory tracking
2019 IFAC HMS Tallinn, Estonia, Sept. 16-19, 2019 238
1 2 3 45 6 78 910 1 2 3 45 6 78 910 runs
0.6 0.4 0.2
(a) RMS(e), q =0 deg/s Session 1
Session 2
Kpe , -
2.5
1.0
Session 1
ord. rand rand ord.
1 23 45 6 78 910 1 23 45 6 78 910 runs
0.4
3.0
Session 2
(e) Kpe , q =0 deg/s
1.0
1 23 45 6 78 910 1 23 45 6 78 910 runs
Session 1
1.0
Session 2
0.5
0.8 0.6
(d) RMS(e), q =6 deg/s 3.0
1 23 45 6 78 910 1 23 45 6 78 910 runs
Session 1
Session 2
1.0
(i) Kpt , q =2 deg/s
1 23 45 6 78 910 1 23 45 6 78 910 runs
(h) Kpe , q =6 deg/s 1.0
Session 1
Session 2
0.8
0.7
0.6 0.5
Session 2
0.9
0.7
1 23 45 6 78 910 1 23 45 6 78 910 runs
Session 1
1.5
0.8
A, ord. B, rand A, rand B, ord.
1 2 3 45 6 78 910 1 2 3 45 6 78 910 runs
2.0
Kpt , -
Kpt , -
0.6
Session 2
2.5
0.9
0.8
Session 1
1.0
(g) Kpe , q =4 deg/s
0.9 0.7
Session 2
1.5
(f) Kpe , q =2 deg/s 1.0
Session 1
2.0
1.5 1.0
1 2 3 45 6 78 910 1 2 3 45 6 78 910 runs
2.5
2.0
A, B, A, B,
0.6
1.2
(c) RMS(e), q =4 deg/s
2.5
2.0 1.5
3.0
Session 2
0.8
(b) RMS(e), q =2 deg/s
Kpe , -
3.0
1 2 3 45 6 78 910 1 2 3 45 6 78 910 runs
Session 1
Kpe , -
0.2
1.0
Session 2
RMS(e), deg
0.4
Session 1
Kpe , -
0.6
0.8
Kpt , -
Session 2
RMS(e), deg
Session 1
RMS(e), deg
RMS(e), deg
0.8
M. Willems et al. / IFAC PapersOnLine 52-19 (2019) 235–240
0.6 1 23 45 6 78 910 1 23 45 6 78 910 runs
(j) Kpt , q =4 deg/s
0.5
1 23 45 6 78 910 1 23 45 6 78 910 runs
(k) Kpt , q =6 deg/s
Fig. 6. RMS error (top row), error feedback response gain (center) and feedforward response gain (bottom); data are shown for both sessions and groups. Group A (ordered session first) is shown in white symbols, Group B (random session first) is shown in colored symbols. Circles and triangles represent ordered and random session data, respectively. task where the HC will not be able to develop a FF response. The model will be fitted on data of entire runs and on data per ramp. Data Analysis Data are analyzed in three ways. First, we consider the data per run, to see how humans learn to control the ramps in either the ordered or random condition. Second, we consider the data averaged over the last five runs, the common approach to measuring human performance. Third, we consider the surprise run, for which we will also study the data per ramp.
to improve especially for the more difficult conditions. That is, the learning curve becomes less steep towards the end of the experiment, but its gradient is still non-zero for the harder runs (q = 4, 6 deg/s). When considering the q = 0 condition, purely disturbance rejection, we see that our subjects rapidly show a more or less constant performance, with Group A slightly better in Session 1, a performance difference which disappears in the second session. From this we can safely say that, at least for this condition, our two groups of subjects had – on average – comparable tracking skills, supporting H.I.
Hypotheses First, we grouped our subjects such that both groups were assumed to perform about the same, our first hypothesis (H.I). Second, we hypothesize that in the ordered session, human control behavior develops more towards the highest level, precognitive control, because after a few runs subjects are familiar with and can anticipate for the ramp steepness (H.II). Third, we hypothesize the effects of the surprise run to be largest in the first ramps of the surprise runs (H.III), as we expect our subjects to have ‘perfectly tuned’ their response to the repeated ramp conditions in the ordered session.
Considering the q = 2 deg/s condition, Fig. 6(b), we see that Group A, starting with the ordered runs, performs better than Group B, who started with the random runs 2 . In Session 2 the performance of both groups improves, where Group A continues to become better trackers in the random session, and Group B stablizes to the performance of Group A in Session 1 while tracking the ordered runs. Tentatively, the ordered runs help subjects to increase their skills very rapidly, and when then confronted with the random runs, they are able to keep up with this performance and even slightly improve further.
4. RESULTS AND DISCUSSION
Regarding the q = 4 and q = 6 deg/s conditions, Fig. 6(c)6(d) reveal that here a similar benefit exists for learning with ordered ramp conditions (Group A in Session 1), but tracking performance of Group B in Session 2, i.e., moving to the ordered conditions while coming from the random ones, improves quite remarkably, even outperforming subjects in Group A. This will be elaborated on further below.
4.1 Data per run Fig. 6 shows the performance RMS(e) and the HC model feedback gain Kpe and FF gain Kpt , for all ten runs in the first and second sessions. To compute or estimate each variable, the entire run (i.e., all 8 ramps) was used. Tracking performance Figs. 6(a)-6(d) show that performance worsens when ramps come into play, and for steeper ramps. It improves rapidly in the first runs, and continues 238
2 Note that here the runs were not performed in a sequential 1-2-3-... way, but were all taken from randomized conditions. I.e., although run 2 was done after run 1, there was at least one other run (different q) in between.
2019 IFAC HMS Tallinn, Estonia, Sept. 16-19, 2019
M. Willems et al. / IFAC PapersOnLine 52-19 (2019) 235–240
1.0 S1 S2
S1 S2
S1 S2
S1 S2
0.8
S1 S2
S1 S2
S1 S2
1.0
S1 S2
S1 S2
S1 S2
0.4
0.9
K pt , −
K pe , −
RMS(e), deg
3
0.6
0.2
1.1
4 S1 S2
239
0.8
2
0.7
1
0
2
q, deg/s
4
6
0
A, B, A, B,
2
0
(a) RMS(e)
q, deg/s
(b) Kpe
4
ordered random random ordered
6
A, B, A, B,
0.6 0.5
2
4 q, deg/s
ordered random random ordered
6
(c) Kpt
Fig. 7. RMS error (top), error feedback response gain (center) and feedforward response gain (bottom); data are averaged over last five runs per session. HC feedback gain Figs. 6(e)-6(h) show estimates of the feedback gain Kpe , and reveal that for all runs and sessions these gains were markedly lower for Group B compared to Group A. Only for the more difficult conditions (q = 6 deg/s) we see some evidence of a learning curve (increasing gain towards the end), but overall learning effects are small. Note that the feedback path operates on the disturbance signal fd , which is the same for all runs, and on what is left after the feedforward operation on ft . Apparently, Group A subjects were better ‘disturbancerejectors’, and/or Group B subjects were slightly better at following the ramp targets. When regarding ordering effects, it is clear that for Group B the ordered runs led to markedly higher feedback gains as the random runs, which could be attributed to the ordering effect, but may also have had a strong component from a continuing learning effect, as for all groups the gains were higher in Session 2 as compared to Session 1. Overall, our subjects continued to improve throughout the experiment, and whereas performance more or less stabilizes towards the end, the HC model parameters still show slight changes towards improvement. HC feedforward gain The feedforward gains Kpt , Figs. 6(i)6(k), slowly creep towards a higher value (ideally: Kpt = 1.0) towards the end of the experiment. Gains are higher for the steeper ramps (q = 4, q = 6 deg/s), confirming the SDR analysis of Drop et al. (2013) as steeper ramps have a higher SDR value with a constant disturbance signal power. Especially for the steeper ramps the increase in feedforward gain for Group B in Session 2 is substantial. That is, when confronted with a randomized ramp signal to be tracked, gains are more or less constant and lower with respect to the case the ramp conditions are ordered. Subjects rapidly increase their feedforward gains when they see that the ramps are identical every run. The subjects who go from the ordered runs towards the random runs (Group A) appear to slightly lower their gains, especially for the two more difficult ramp conditions. 4.2 Data averaged over last five runs Fig. 7 shows the RMS(e) and the HC model feedback gain Kpe and FF gain Kpt , when averaging over the last five runs in the two sessions. This is what is commonly being done in cybernetics-studies, averaging-out learning and adaptation effects (Mulder et al., 2018). 239
Performance Fig. 7(a) shows the averaged RMS(e) for the two groups in the two sessions. Performance for Group A is slightly better than Group B in the disturbancerejection task (q = 0), but differences are very small. Performance is, on average, better in the second session, which makes sense as our subjects continued to learn and improve. Clearly, in Session 1 the performance in the ordered runs is better, whereas in Session 2 the performance becomes more or less the same for both groups. When moving from the random session to the ordered session, Group B in Session 2, led to quite a substantial performance improvement, especially for the steeper ramps (q = 4, 6 deg/s). HC feedback gain Fig. 7(b) shows the averaged Kpe gains, again showing higher gains for Group A in almost all conditions. Gains slightly increase when moving from the ordered to the random runs (Group A), and more markedly increase when moving from the random to the ordered runs (Group B). Our subjects were clearly adapting and learning to the very end of the experiment, but changes in the feedback gain are on average very small. HC feedforward gain The averaged feedforward gains Kpt are shown in Fig. 7(c). Here we see that the gains slightly increase towards the end of the experiment no matter what groups we consider, indicating learning. When considering Group A (ordered runs first) the feedforward gain increases for the steeper ramps, reported in (Drop et al., 2013). Moving to Session 2 (random runs), the gains either increase (q = 2) or remain the same. Feedforward gains for group B are smaller for the steeper runs in the first, random, session, but then steeply increase when moving towards the second session. It is clear that this group benefits the most in the second session, especially for the steeper and more difficult ramps. These results support H.II: ordered conditions yield stronger feedforward control. 4.3 Behavioral Changes in Surprise Runs The surprise runs suddenly exposed subjects to a different ramp steepness when they had fully adapted their response characteristics to the ramp signal steepness of the ordered block. Data were averaged over the final five runs. Fig. 8 shows the estimates of Kpt , comparing the gains applied in the surprise runs to the averages of the gains applied in the ordered and random sessions, using the full
2019 IFAC HMS 240 Tallinn, Estonia, Sept. 16-19, 2019
M. Willems et al. / IFAC PapersOnLine 52-19 (2019) 235–240
balance-out these individual learning trajectories, a large number of subjects is mandatory.
1.4
1.4 Ordered
1.2
Random Surprise
K pt , −
Kpt , −
1.2 1.0
REFERENCES
1.0 0.8
0.8
0.6
0.6 2
4
6
4 q, deg/s
2
q, deg/s
(a) full run data
6
(b) first ramp data
Fig. 8. Means, 95% conf. intervals of Kpt estimates.
1.0
1.0
0.8
1.0
0.8
0.6
0.4
K pt , −
1.4 1.2
K pt , −
1.4 1.2
K pt , −
1.4 1.2
0.8
0.6
12345678 ramps
(a) q=2
0.4
0.6
12345678 ramps
(b) q=4
0.4
12345678 ramps
(c) q=6
Fig. 9. Mean, 95% conf. intervals of Kpt parameter estimates, surprise run. run data (left) or using only the data corresponding to the first ramp (right). Clearly, the differences are much larger in the latter case. For each ramp steepness, subjects adjust their gain towards the gain used in the random sessions. Fig. 9 shows how Kpt (averaged over all subjects) progresses throughout the surprise run. Subjects adapt quickly, within the first four to five ramps towards the value used in the random session, supporting H.III. For the surprise run with steepness q=2 deg/s the feedforward gain is initially too high (> 1.0) as subjects were experiencing either a q=4 or q=6 deg/s condition in the previous ten consecutive runs (requiring a high gain), but quickly reduce their gain after the first ramps. The opposite is true for the q=6 deg/s condition where Kpt is initially too low, as subjects were experiencing either a q=2 or 4 deg/s condition in the previous ten ordered runs (requiring a lower gain), and quickly increase their gain. For the surprise run with steepness 4 deg/s, the gain shows more variations around an average value, which is likely the result of averaging, since for this surprise run the previous runs either had a lower (q=2) or a higher ramp steepness (q=6). The lower value of Kpt in the seventh ramp is an artifact, caused by the disturbance signal fd , which at that time coincidentally moved the system in the right direction, requiring a smaller gain. 5. CONCLUSIONS We investigated human control behavior in ramp tracking tasks, and assessed manual control skill development using a combined feedback/feedforward model. Results show that: 1) as hypothesized, ordered runs lead to higher feedforward gains, 2) training with random runs and then moving to ordered runs yields the strongest performance benefits, and 3) HCs adapt very quickly to a change in ramp steepness. In addition, our data show that HCs continue to learn throughout the experiment, but the extent to which they do that depends on the individual subject. To 240
Drop, F.M., de Vries, R., Mulder, M., and B¨ ulthoff, H.H. (2016). The Predictability of a Target Signal Affects Manual Feedforward Control. In 13th IFAC/IFIP/IFORS/IEA Symp. on Analysis, Design, and Evaluation of Human-Machine Systems, Kyoto, Japan, Aug. 30 - Sep. 2, 177–182. Drop, F.M., Pool, D.M., Damveld, H.J., Van Paassen, M.M., and Mulder, M. (2013). Feedforward Behavior in Manual Control Tasks with Predictable Target Signals. IEEE Trans. on Cybernetics, 43(6), 1936–1949. Hess, R.A. (1981). Pursuit Tracking and Higher Levels of Skill Development in the Human Pilot. IEEE Trans. on Systems, Man, and Cybernetics, SMC-11(4), 262–273. Krendel, E.S. and McRuer, D.T. (1960). A Servomechanics Approach to Skill Development. J. of the Franklin Inst., 269(1), 24–42. Laurense, V., Pool, D.M., Damveld, H.J., Van Paassen, M.M., and Mulder, M. (2015). Effects of Controlled Element Dynamics on Feedforward Manual Control. IEEE Trans. on Cybernetics, 45(2), 253–265. Magdaleno, R.E., Jex, H.R., and Johnson, W.A. (1969). Tracking Quasi-predictable Displays Subjective Predictability Graduations, Pilot Models for Periodic and Narrowband Inputs. In Proc. of the 5th Ann. NASAUniversity Conference on Manual Control, 391–428. McRuer, D.T., Hofmann, L.G., Jex, H.R., Moore, G.P., Phatak, A.V., Weir, D.H., and Wolkovitch, J. (1968). New Approaches to Human-Pilot/Vehicle Dynamic Analysis. Technical Report AFFDL-TR-67-150. McRuer, D.T. and Jex, H.R. (1967). A Review of QuasiLinear Pilot Models. IEEE Trans. on Human Factors in Electronics, HFE-8(3), 231–249. Mulder, M., Pool, D.M., Abbink, D.A., Boer, E.R., Zaal, P.M.T., Drop, F.M., van der El, K., and van Paassen, M.M. (2018). Manual Control Cybernetics: State-ofthe-Art and Current Trends. IEEE Trans. on HumanMachine Systems, 48(5), 468–485. Pool, D.M., van Paassen, M.M., and Mulder, M. (2010). Modeling Human Dynamics in Combined RampFollowing and Disturbance-Rejection Tasks. In Proc. of the AIAA Guidance, Navigation, and Control Conference, Toronto, Canada, Aug. 2-5, AIAA-2010-7914. Wasicko, R.J., McRuer, D.T., and Magdaleno, R.E. (1966). Human Pilot Dynamic Response in Singleloop Systems with Compensatory and Pursuit Displays. Technical Report AFFDL-TR-66-137. Yamashita, T. (1990). Effects of Sine Wave Combinations on the Development of Precognitive Mode in Pursuit Tracking. Q. J. of Exp. Psych. Sec. A, 42(4), 791–810. Zaal, P.M.T., Pool, D.M., Chu, Q.P., van Paassen, M.M., Mulder, M., and Mulder, J.A. (2009). Modeling Human Multimodal Perception and Control Using Genetic Maximum Likelihood Estimation. J. of Guidance, Control, and Dynamics, 32(4), 1089–1099. Zhang, X., Wang, S., Hoagg, J.B., and Seigler, T.M. (2017). The Roles of Feedback and Feedforward as Humans Learn to Control Unknown Dynamic Systems. IEEE Trans. on Cybernetics, 48(2), 543–555.