Human Control Strategy: Modeling and Transfer

This page is undergoing major revisions. Please excuse any temporarily broken links or incongruities. For a more comprehensive discussion of our work, consult our list of publications.

Learning and Transfer of Human Control Strategies

Introduction

We are developing methodologies for modeling and transferring human control strategy (HCS) in response to real-time inputs. The problem has significant impact in a variety of applications such as space telerobotics, the Intelligent Vehicle Highway System (IVHS), human-machine interfacing, real-time training, and agile manufacturing. The research provides a framework for abstracting human skills so as to facilitate analysis of some aspects of human control, to develop human-like intelligent machines, to allow robots to learn from human partners in human-robot coordination, and to transfer skill from human to human through learning human-machine interfaces. We are addressing the following issues:

How to efficiently model human control strategy
How to validate the computed models
How to select state inputs in order to generate reliable models
How to evaluate the quality of the HCS models
How to optimize performance of the HCS models
How to effectively transfer human control strategy

We learn human control strategy with an efficient cascade neural network architecture, whose parameters are updated through extended Kalman filtering. The architecture offers a flexible functional form, with a variable number of hidden units and variable activation functions. We independently verify HCS model fidelity through a novel stochastic model validation procedure. A similarity measure, based on Hidden Markov Model (HMM) observation probabilities, compares dynamic human and model trajectories. By connecting this validation procedure with simultaneously perturbed stochastic approximation (SPSA) in an iterative loop, we can automatically select the input representation that maximizes model fidelity. In the process, we determine the fundamental control properties of order, granularity, and control delay for any specific individual.

We also evaluate the skill exhibited in human control strategy and corresponding models through several defined task-dependent as well as task-independent performance criteria, including generalizability, long-term consistency, and robustness. Using specific performance criteria, we then optimize performance in initially stable HCS models through adaptive SPSA parameter refinement. One formulation of the performance criteria allows us to simplify a model's structure after training. Finally, we propose to use HCS models as virtual teachers in human-to-human skill transfer.

Learning human control strategy (HCS)

Scientific knowledge of human intelligence is not nearly advanced enough to characterize human actions analytically. Therefore, in modeling human control strategy (HCS), as with other poorly understood phenomena, we must rely on modeling by observation, or learning, rather than theoretical or physical derivation. Human control strategy is dynamic, stochastic, possibly discontinuous, and nonlinear in nature. To meet these challenges, we model human control strategy using cascade neural networks with variable activation functions. We have demonstrated the applicability of cascade networks to modeling HCS in an inverted pendulum simulator and a driving simulator.

A cascade neural network grows in complexity as is required by the training data.
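The cascade structure described above can be sketched as a forward pass in which each hidden unit sees the inputs, a bias term, and the outputs of every hidden unit added before it. This is only a minimal illustration, not the authors' implementation: the function name `cascade_forward`, the `ACTIVATIONS` menu, and the weight layout are all assumptions made for the example, and training (adding units, extended Kalman filtering) is not shown.

```python
import math

# Hypothetical menu of activation functions; the choices here are assumptions.
ACTIVATIONS = {
    "sigmoid": lambda z: 1.0 / (1.0 + math.exp(-z)),
    "sine": math.sin,
    "gaussian": lambda z: math.exp(-z * z),
}


def cascade_forward(x, hidden_units, output_weights):
    """Evaluate a cascade network on the input vector x.

    hidden_units: list of (activation_name, weights), in the order the units
        were added. Unit k needs len(x) + 1 + k weights: one per input, one
        for the bias, and one per previously added hidden unit.
    output_weights: one weight list per output, each of length
        len(x) + 1 + len(hidden_units). Outputs are linear in this sketch.
    """
    signals = list(x) + [1.0]  # inputs followed by a constant bias signal
    for name, weights in hidden_units:
        if len(weights) != len(signals):
            raise ValueError("hidden unit has the wrong number of weights")
        z = sum(w * s for w, s in zip(weights, signals))
        # The new unit's output becomes an input to all later units.
        signals.append(ACTIVATIONS[name](z))
    outputs = []
    for row in output_weights:
        if len(row) != len(signals):
            raise ValueError("output has the wrong number of weights")
        outputs.append(sum(w * s for w, s in zip(row, signals)))
    return outputs
```

Because every new unit only appends one signal, the network can grow one hidden unit at a time without changing the weights of the units already in place.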

Similarity analysis & model validation

The main strength of modeling by learning, as required for human control strategies, is that no explicit physical model is required. This is also its biggest weakness, however, especially when the unmodeled process is (1) dynamic and (2) stochastic in nature, as is the case for human control strategy. For such processes, model errors can feed back on themselves to produce trajectories which are not characteristic of the source process or are even potentially unstable. Yet most learning approaches today, including feedforward neural networks, use some static error measure as a test of convergence for the learning algorithm. While this measure is very useful during training, it offers no guarantees, theoretical or otherwise, about the dynamic behavior of the resulting learned model. Thus, we have developed a similarity measure, based on Hidden Markov Model analysis, as a means of validating learned models of human control strategy.

An example of a left-to-right Hidden Markov Model (HMM) with five states and 16 observables.

We are primarily interested in generating a probabilistic similarity measure between the dynamic system trajectories generated by the human and those generated by the learned HCS models (i.e., the cascade network models). The diagram below illustrates the overall approach. We generate normalized probabilities P1 and P2 for the HMM trained on the human control data. The relationship between these normalized probabilities defines the similarity measure. As an example, we have applied this validation procedure to the learning of human driving.
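To make the idea of normalized HMM probabilities concrete, the sketch below scores two discrete observation sequences under a single HMM using the scaled forward algorithm, then normalizes each probability by sequence length. This page does not give the exact formula that relates P1 and P2, so the plain ratio used in `similarity` is an assumption chosen for illustration, as are all function names; the published papers should be consulted for the actual definition.

```python
import math


def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM) for a discrete HMM.

    pi[i]: initial state probabilities, A[i][j]: transition probabilities,
    B[i][k]: probability of observing symbol k in state i.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this HMM
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def normalized_probability(obs, pi, A, B):
    """Length-normalized probability P(obs | HMM) ** (1 / len(obs))."""
    return math.exp(log_likelihood(obs, pi, A, B) / len(obs))


def similarity(human_obs, model_obs, pi, A, B):
    """Illustrative ratio P2 / P1 (an assumption, not the published measure).

    P1 scores the human sequence and P2 the model sequence, both under an
    HMM assumed to have been trained on the human control data.
    """
    p1 = normalized_probability(human_obs, pi, A, B)
    p2 = normalized_probability(model_obs, pi, A, B)
    return p2 / p1
```

Normalizing by length lets sequences of different durations be compared on the same footing. The sketch assumes the human sequence has nonzero probability under the HMM, which should hold if the HMM was trained on it.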

Input selection

Little if anything is known a priori about the (1) structure, (2) order, (3) granularity, or (4) control delay inherent in a particular individual's internal controller. We therefore require a procedure which automatically refines the input representation for HCS models so as to arrive at better approximations of the actual human control strategy. To this end, we combine the model validation procedure with simultaneously perturbed stochastic approximation (SPSA) to select the best model input representation.
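As a rough illustration of what an input representation involves, the sketch below assembles a model input from time-delayed samples of state and control histories. The parameterization is an assumption made for this example, not the authors' definition: `n_x` and `n_u` stand in for order, `skip` for granularity, and `delay` for control delay, and the function name `build_input` is hypothetical. An outer search (for instance with SPSA) could then vary these parameters and keep the setting that gives the highest model fidelity.

```python
def build_input(states, controls, t, n_x, n_u, skip=1, delay=0):
    """Assemble a flat model input for time step t from sampled histories.

    states, controls: per-time-step lists of feature vectors (lists of floats).
    n_x, n_u: number of past state / control vectors to include.
    skip: spacing, in time steps, between the included samples.
    delay: how many time steps old the newest included sample is.
    Returns None when there is not enough history before t.
    """
    if skip < 1 or delay < 0 or n_x < 0 or n_u < 0:
        raise ValueError("skip must be >= 1; delay, n_x, n_u must be >= 0")
    if t < 0 or t >= min(len(states), len(controls)):
        raise ValueError("t is outside the recorded data")
    newest = t - delay
    state_idx = [newest - k * skip for k in range(n_x)]
    # Controls start one sample earlier: the control at `newest` is treated
    # as the quantity the model would predict (an assumption of this sketch).
    control_idx = [newest - (k + 1) * skip for k in range(n_u)]
    indices = state_idx + control_idx
    if indices and min(indices) < 0:
        return None
    flat = []
    for i in state_idx:
        flat.extend(states[i])
    for i in control_idx:
        flat.extend(controls[i])
    return flat
```

For example, with `n_x=2`, `n_u=1`, and `skip=2`, the input at step 4 contains the states at steps 4 and 2 and the control at step 2.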

Skill evaluation

We are interested in assessing the skill exhibited in human control strategies and their corresponding models. Skill characterizes the performance inherent in the HCS models, and consists of an entire set of task-independent as well as task-dependent performance criteria, which may or may not be in conflict. Consider, for example, the driving control strategies of Stan and Oliver. Which control strategy demonstrates better skill? On the one hand, Stan's control strategy offers a smoother ride than Oliver's and is significantly more fuel efficient. On the other hand, Oliver's strategy achieves higher average speeds over equivalent roads (72 m.p.h. vs. 56 m.p.h.) and exhibits stability for a broader range of initial and environmental conditions, as shown in the diagram below. There, each control strategy model is asked to steer through s-curves of various radii (y axis) at different initial velocities (x axis).

Stan's stability profile (a) and Oliver's stability profile (b). Orange and yellow indicate a successful maneuver through the s-curve, red indicates a marginally successful maneuver, and brown indicates an unsuccessful maneuver.

Performance optimization

Once the skill in a human control strategy and its corresponding models has been evaluated, we optimize the performance of initially stable HCS models through simultaneously perturbed stochastic approximation (SPSA).
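The core SPSA update can be sketched in a few lines: all parameters are perturbed at once by a random vector of +1/-1 entries, and two loss evaluations yield a gradient estimate regardless of how many parameters there are. This is a generic textbook-style sketch, not the authors' adaptive variant. The function name `spsa_minimize`, the gain constants, and the toy quadratic loss in the usage note are assumptions; in the setting described here, the loss would be the negative of a chosen performance criterion evaluated by running the HCS model.

```python
import random


def spsa_minimize(loss, theta, iterations=500, a=0.1, c=0.1,
                  stability=10.0, alpha=0.602, gamma=0.101, seed=0):
    """Minimize loss(theta) using two loss evaluations per iteration."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + stability) ** alpha  # decaying step size
        c_k = c / (k + 1) ** gamma              # decaying perturbation size
        # Perturb every parameter simultaneously by +1 or -1.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Gradient estimate for parameter i is diff / (2 * c_k * delta[i]).
        theta = [t - a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta
```

A toy usage, with a quadratic standing in for the real performance measure: `spsa_minimize(lambda th: (th[0] - 1.0) ** 2 + (th[1] + 2.0) ** 2, [0.0, 0.0])` moves the parameters toward `[1.0, -2.0]`. Because each iteration needs only two evaluations, the approach suits cases where every evaluation means simulating the model.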

Human-to-human skill transfer

Using HCS models, we seek to replace an actual human expert instructor with a virtual expert instructor to transfer skill from one human to another. Rather than get advice from the human expert directly, an apprentice can get advice through the expert's HCS model, which acts as the virtual teacher. The model-generated advice can be presented continuously to the apprentice, while exploiting multiple sensor modalities. This has the potential to improve both learning speed and the quality of learning by the apprentice.

Related Publications & Technical Reports

	[1]	M. Nechyba and Y. Xu, Stochastic Similarity for Validating Human Control Strategy Models, Proc. IEEE Conf. on Robotics and Automation, vol. 1, pp. 278-83, 1997.
	[2]	M. Nechyba and Y. Xu, Human Control Strategy: Abstraction, Verification and Replication, to appear in IEEE Control Systems Magazine, 1997.
	[3]	M. Nechyba and Y. Xu, On the Fidelity of Human Skill Models, Proc. IEEE Int. Conference on Robotics and Automation, vol. 3, pp. 2688-93, 1996.
	[4]	M. Nechyba and Y. Xu, Human Skill Transfer: Neural Networks as Learners and Teachers, Proc. IEEE Int. Conference on Intelligent Robots and Systems, vol. 3, pp. 314-9, 1995.
	[5]	M. Nechyba and Y. Xu, Neural Network Approach to Control System Identification with Variable Activation Functions, Proc. IEEE Int. Symp. on Intelligent Control, vol. 1, pp. 358-63, 1994.
	[6]	M. Nechyba and Y. Xu, Stochastic Similarity for Validating Human Control Strategy Models, Technical Report, CMU-RI-TR-96-29, Carnegie Mellon University, 1996.
	[7]	M. Nechyba and Y. Xu, Towards Human Control Strategy Learning: Neural Network Approach with Variable Activation Functions, Technical Report, CMU-RI-TR-95-09, Carnegie Mellon University, 1995.

Last updated April 7, 1997 by Michael C. Nechyba