CEREBELLAR CONTROL MODEL DESIGN FOR THE TEMPORAL COORDINATION OF ARM TRANSPORT AND HAND PRESHAPE

ZHANG SHAO-BAI1*, ZHOU NING-NING2, FENG ZHI-QUAN3
1College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, 210003, China
2College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu, 210003, China
3School of Information Science and Engineering, Jinan University, Jinan, Shandong, 250022, China
* Corresponding Author : adzsb@163.com

Received : 15-12-2011     Accepted : 16-01-2012     Published : 19-03-2012
Volume : 2     Issue : 1       Pages : 8 - 14
J Pattern Intell 2.1 (2012):8-14

Cite - MLA : ZHANG SHAO-BAI, et al "CEREBELLAR CONTROL MODEL DESIGN FOR THE TEMPORAL COORDINATION OF ARM TRANSPORT AND HAND PRESHAPE ." Journal of Pattern Intelligence 2.1 (2012):8-14.

Cite - APA : ZHANG SHAO-BAI, ZHOU NING-NING, FENG ZHI-QUAN (2012). CEREBELLAR CONTROL MODEL DESIGN FOR THE TEMPORAL COORDINATION OF ARM TRANSPORT AND HAND PRESHAPE . Journal of Pattern Intelligence, 2 (1), 8-14.

Cite - Chicago : ZHANG SHAO-BAI, ZHOU NING-NING, and FENG ZHI-QUAN "CEREBELLAR CONTROL MODEL DESIGN FOR THE TEMPORAL COORDINATION OF ARM TRANSPORT AND HAND PRESHAPE ." Journal of Pattern Intelligence 2, no. 1 (2012):8-14.

Copyright : © 2012, ZHANG SHAO-BAI, et al, Published by Bioinfo Publications. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

Abstract

Based on Hoff and Arbib's minimum-jerk control theory, this paper presents a new control model with a cerebellar-like structure that can account for the temporal coordination of arm transport and hand preshape during reach-and-grasp tasks. It also suggests how the structure could learn two key functions required by the Hoff-Arbib theory, namely state look-ahead and time-to-go (TTG) estimation. The simulation results demonstrate that, in both one-dimensional and two-dimensional space, the model can obtain accurate, smooth motor trajectories through training.

Keywords

Control model, Arm transport, Preshape, Temporal coordination.

Introduction

Constructing a balanced control model of arm transport is an important subject in robotics and control science. In [1], we set up a cerebellar control model for directional arm movement and assumed that the kinematic plan for the movement is generated in premotor cortex. The trajectories of arm transport were generated in a feed-forward way, i.e., without taking into account the actual position of the hand other than at movement initiation. It is well established that human reaching trajectories in simple movements are smooth, with a typical bell-shaped tangential velocity profile, but it has also been shown that trajectory generation is in part a feedback process in which the positions of both the hand and the target are constantly monitored.
However, arm control is not an end in itself; it is a means of bringing the hand to a target so that the target can be grasped. We are interested in the temporal coordination between arm transport and hand preshape during reach-and-grasp tasks. If the task is to grasp an object, the hand is preshaped during the transport phase so that it is slightly larger than the object. Hand shape, or, to simplify, aperture size, is temporally coordinated with arm transport so that the hand attains maximum aperture approximately 200 ms before the target is reached, followed by an enclose phase that is timed so that the aperture matches the object size when transport terminates.
It should be noted that grasping is a very complex process. There are more than eight grasp modes alone [2], such as power grasp, cylindrical grasp, hook grip, and span grip. S. A. Stansfield defined three grasp modes for a robot system, called pre-grasp hand shapes [3], namely wrap, grip, and pinch. This paper does not discuss a specific grasp mode but rather the grasping process, based on the modes defined by S. A. Stansfield.
Hoff and Arbib developed a control theoretic model [4] , based on the minimum jerk optimality criteria, that accounts well for the kinematics of hand transport and preshape under a variety of conditions, including perturbations in object position and size. A critical part of the model was a state look-ahead unit, needed to compensate for long efferent and afferent delays. We show that the cerebellum, by virtue of its structure and connectivity in the motor system, is uniquely suited to learn this function.
In this paper, we show how the cerebellar module, here suggested to represent a part of the lateral cerebellum, can learn a forward model of the system in which it is embedded. A forward model takes the control inputs to a system as its inputs and predicts the future system state. We show how the lateral cerebellum could use collaterals of descending motor commands, combined with a delayed sensed state of the arm and hand, to provide predictions of the future state of the arm and hand to premotor areas, thereby allowing smooth, accurate kinematic trajectories to evolve.

Hoff-Arbib Model

As shown in [Fig-1A&B] , coordination between reach and grasp in the Hoff-Arbib model is achieved by determining the time needed for each of the aperture and transport controllers to reach its goal, then setting the duration input to the controllers to the maximum of the two values. The controllers had the form shown in [Fig-1A&B] . The control law for the arm,

(1)

with x the location and D the time-to-go at time t, was designed to produce minimum-jerk trajectories.
However, the model described by Equation (1) assumes that the system state can be measured without delay. In a realistic simulation of the nervous system, the long delays have to be taken into account, which requires a look-ahead unit (or state estimator), as shown in [Fig-6b], to provide updated state information.
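As an illustration of the kind of control law discussed here, the sketch below integrates the commonly cited feedback form of the minimum-jerk law, in which the jerk is computed from the current state and the time-to-go D. Equation (1) is not reproduced in this text, so the exact form used here is an assumption, and the function name, the step size dt, and the lower bound d_min on D are hypothetical choices made only to keep the example numerically stable. The sketch also assumes an undelayed state, which is precisely the simplification the look-ahead unit is meant to remove.

```python
def simulate_min_jerk(x0, target, duration, dt=1e-4, d_min=1e-3):
    """Integrate a minimum-jerk feedback law with forward Euler.

    Assumed law (commonly cited form, not copied from Equation (1)):
        jerk = 60/D**3 * (target - x) - 36/D**2 * v - 9/D * a
    where D is the time-to-go. d_min is a hypothetical floor that keeps
    the gains finite as D approaches zero.
    Returns a list of (t, x, v, a) samples.
    """
    x, v, a = float(x0), 0.0, 0.0
    samples = [(0.0, x, v, a)]
    steps = int(round(duration / dt))
    for k in range(steps):
        t = k * dt
        D = max(duration - t, d_min)  # time-to-go, bounded away from zero
        jerk = 60.0 / D**3 * (target - x) - 36.0 / D**2 * v - 9.0 / D * a
        # Forward Euler update of position, velocity, and acceleration.
        x += v * dt
        v += a * dt
        a += jerk * dt
        samples.append((t + dt, x, v, a))
    return samples


if __name__ == "__main__":
    trajectory = simulate_min_jerk(x0=0.0, target=1.0, duration=0.5)
    peak_velocity = max(sample[2] for sample in trajectory)
    print("final position:", round(trajectory[-1][1], 4))
    print("peak velocity:", round(peak_velocity, 3))
```

Because the law replans from the current state at every step, the velocity profile stays bell-shaped even if the target or the duration input changes during the movement, which is the property the perturbation experiments rely on.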

Model Constructing

As shown in Figure 2, we have preserved most of the Hoff-Arbib model and added cerebellar modules that learn forward models of the controlled plants and provide time-to-go (TTG) estimates. The reason for complicating the basic Hoff-Arbib model is that biological systems have to deal with significant afferent and efferent delays. In Hoff and Arbib's model, this problem was solved using an analytical forward model of the system. Here we show how the cerebellar module could acquire such a model through training.
Two models were implemented. In the first (Model 1), for simplicity, both the transport and the grasp aperture are represented by scalars corresponding to distance moved and aperture size, respectively. The second (Model 2) extends the first by replacing the scalar distance representation of the transport phase with the two-joint planar arm described in [1]. That is, training moves from one-dimensional distance to multi-dimensional movement (direction and distance) in Cartesian space, which makes the evaluation more demanding in terms of system complexity and practicality.

Construction of Cerebellar Module

For this model we used the abstract cerebellar module as described in the paper [5] . Although a single module, sharing all the inputs, was used for both models, the outputs were grouped into subsystems. For Model 1 the subsystems comprised the grasp system and the (scalar) distance system. For Model 2 the distance system was broken up into separate shoulder and elbow systems, to give three subsystems.
The subsystems have two outputs each to predict the current state (position and velocity). It is assumed that an estimate of acceleration can be determined from the motor command signal. Two further outputs are trained to be contact anticipation signals, and could also be interpreted as TTG estimates. Both Model 1 and Model 2 used only one TTG output for each of the grasp and transport systems.
For Model 1, the state estimates were used directly in Equation (1) to determine the "motor command" for each subsystem. For Model 2, however, the outputs were estimates of the arm state in joint space (to be consistent with the inputs available to the inferior olive), while trajectories are planned in Cartesian space. To allow the planning to proceed in Cartesian space, the joint-space estimates were converted to Cartesian state estimates and updated for a virtual wrist position using Equation (1). The new desired joint state was then computed from this virtual position. For this model we used a simple feedback controller to generate joint torques for the arm.
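The conversion between joint space and Cartesian space described for Model 2 can be sketched with the standard kinematics of a two-joint planar arm. This is a minimal illustration, not the implementation from [1]: the link lengths, the function names, and the choice of the elbow solution with a non-negative elbow angle are all assumptions.

```python
import math

# Hypothetical link lengths in metres; the text does not give the arm geometry.
UPPER_ARM = 0.30
FOREARM = 0.33


def forward_kinematics(shoulder, elbow):
    """Map joint angles (radians) to the Cartesian wrist position."""
    x = UPPER_ARM * math.cos(shoulder) + FOREARM * math.cos(shoulder + elbow)
    y = UPPER_ARM * math.sin(shoulder) + FOREARM * math.sin(shoulder + elbow)
    return x, y


def inverse_kinematics(x, y):
    """Map a reachable wrist position back to joint angles.

    Returns the solution with the elbow angle in [0, pi] (an assumed convention).
    """
    cos_elbow = (x * x + y * y - UPPER_ARM**2 - FOREARM**2) / (2 * UPPER_ARM * FOREARM)
    cos_elbow = max(-1.0, min(1.0, cos_elbow))  # guard against rounding error
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(y, x) - math.atan2(
        FOREARM * math.sin(elbow), UPPER_ARM + FOREARM * math.cos(elbow)
    )
    return shoulder, elbow


if __name__ == "__main__":
    wrist = forward_kinematics(0.4, 1.1)
    print("wrist position:", wrist)
    print("recovered joint angles:", inverse_kinematics(*wrist))
```

In a planning loop of the kind described above, the estimated joint state would be passed through forward_kinematics, the wrist position would be advanced by the planner, and inverse_kinematics would give the desired joint state for the feedback controller.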
Biological evidence suggests that parietal cortex is concerned with the visual guidance of hand movement, especially in matching movements to the spatial characteristics of the object. Cortical areas project to the cerebellar cortex via the pontine nucleus. The cerebellar cortex receives information about the current state of the limb via the cuneocerebellar tract.
In the model, the cerebellar module receives five population-coded inputs as mossy fiber afferents from each subsystem. Four of these are the delayed state (position, velocity, acceleration), representing spinal afferents, together with the difference between the target value and the current position, which could be seen as originating in the posterior parietal cortex. The fifth is an efference copy of the current motor command.
Lastly, the module received the previous TTG prediction, again one each for the grasp and transport systems.
Mossy fibers were modeled as an array, with each row vector of the array coding a specific input variable and each element of the vector tuned to a different value of that variable to form the population code. The activity of element i in such a row vector was determined as:

(2)

With
(3)

(4)

where x is the value to be coded and the remaining parameters determine the range of the variable.
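To make the idea of population coding concrete, the following sketch encodes a scalar with a row of units whose preferred values are spread evenly over the range of the variable. Equations (2) to (4) are not reproduced in this text, so the Gaussian tuning curve, the default width, and all names below are assumptions chosen for illustration rather than the paper's definitions.

```python
import math


def population_code(x, x_min, x_max, n_units=10, width=None):
    """Encode scalar x as activities of n_units tuned units.

    Assumptions: preferred values are evenly spaced over [x_min, x_max] and
    each unit has a Gaussian tuning curve. The default width equals the
    spacing between neighbouring preferred values.
    """
    spacing = (x_max - x_min) / (n_units - 1)
    if width is None:
        width = spacing
    centers = [x_min + i * spacing for i in range(n_units)]
    return [math.exp(-0.5 * ((x - c) / width) ** 2) for c in centers]


if __name__ == "__main__":
    activities = population_code(0.5, 0.0, 1.0, n_units=11)
    print([round(a, 3) for a in activities])
```

Each input variable (position, velocity, and so on) would occupy one such row, so the full mossy fiber input is a two-dimensional array of activities.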
As described in [5], four randomly selected mossy fibers and one Golgi cell synapse with each granule cell. Granule cells were modeled as leaky integrators with a real-valued output, computed as a sigmoidal function of the membrane potential, that represents the instantaneous firing rate of the cell. The numbers were found empirically by increasing them until performance no longer improved. The granule cell membrane potential was defined as:

(5)

where M is the set of four mossy fibers randomly selected for each granule cell, the time constant is 1.33 ms, and the remaining constant is 0.5. Granule cell firing rate was computed as

(6)
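A granule cell of the kind described above can be sketched as a leaky integrator followed by a sigmoid. Equations (5) and (6) are not reproduced in this text, so the specific form below is an assumption: the drive is taken to be the weighted sum of four mossy fiber activities minus weighted Golgi inhibition, and the value 0.5 from the text is used as the Golgi weight purely for illustration. The function names and the step size are hypothetical.

```python
import math

TAU = 1.33e-3  # membrane time constant in seconds (1.33 ms, from the text)


def sigmoid(v):
    """Logistic function mapping membrane potential to a firing rate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def step_granule_cell(v, mf_activity, mf_weights, golgi_activity,
                      golgi_weight=0.5, dt=1e-4):
    """Advance the membrane potential by one forward-Euler step.

    Assumed dynamics: TAU * dv/dt = -v + sum(w_i * mf_i) - golgi_weight * golgi.
    Returns the new potential and the corresponding firing rate.
    """
    drive = sum(w * m for w, m in zip(mf_weights, mf_activity))
    drive -= golgi_weight * golgi_activity
    v_new = v + dt / TAU * (-v + drive)
    return v_new, sigmoid(v_new)


if __name__ == "__main__":
    v = 0.0
    for _ in range(1000):
        v, rate = step_granule_cell(v, [1.0] * 4, [0.25] * 4, golgi_activity=0.4)
    print("steady-state potential:", round(v, 4), "rate:", round(rate, 4))
```

With a constant input the potential settles at the net drive, so the cell output follows its inputs with a lag set by the time constant.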

The mossy fiber weight was updated using a local learning rule to maximize information transfer:

(7)

Where
(8)

(9)

Golgi cell activity was defined simply as the sum of the granule cell activities:

(10)

A single linear Purkinje cell was used for each output. Each Purkinje cell received input from all the granule cells via the parallel fibers, with connection weights that were updated during training as described below. Purkinje cell firing was computed as

(11)

Nuclear cells receive excitation from mossy fiber collaterals and inhibition from Purkinje cells. Each Purkinje cell was paired with a single linear nuclear cell. The cells computing predicted state variables received the corresponding delayed state variable as mossy fiber input, while the cell computing the TTG signal received only constant background activity:

(12)

where the mossy fiber input term is the delayed state variable that the specific nuclear cell is learning to predict.
Inferior olive cells were paired with nuclear cells. Each received excitatory sensory input (a delayed version of the variable coded by the nuclear cell) combined with inhibition from the nuclear cells, allowing them to act as error detectors for state prediction. To align the prediction (nuclear cell output) with the delayed sensed value, the cerebellar inhibition was also delayed by 60ms (see also the paper [1] ).
An exact match is not required, but the duration of this delay determines the time offset of the prediction, i.e., how far ahead the module predicts. Such long latency responses have been experimentally confirmed, although the effect of the nucleo-olivary pathway is not quite as simple.

(13)

where, once again, the sensory input term is the delayed state variable.

Learning Rule

The model applies the learning rule described in [5] to update the parallel fiber-Purkinje cell weights:

(14)

where the coefficient is a small constant, IO is the climbing fiber (inferior olive) activity, and e(t) is the eligibility trace, postulated to exhibit second-order dynamics as described in [5] and defined by

(15)

(16)

where GC is the granule cell activity and the time constant is 0.0625.
The dynamics of the differential equations smooth out the granule cell input and once again do not have to match the delays exactly.
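The interplay of eligibility trace and climbing fiber signal can be sketched as follows. Equations (14) to (16) are not reproduced in this text, so this is only one plausible reading: the second-order trace is modeled as two cascaded first-order low-pass filters with the stated time constant (assumed to be in seconds), and the weight change is taken to be proportional to the product of inferior olive activity and the trace, with a sign chosen so that a positive error signal lowers the Purkinje weight. All names and the learning rate are hypothetical.

```python
TAU_E = 0.0625  # eligibility time constant from the text (assumed seconds)


def eligibility_trace(gc_signal, dt=1e-3):
    """Filter granule cell activity through two cascaded low-pass stages.

    Assumed dynamics:
        TAU_E * de1/dt = gc - e1
        TAU_E * de2/dt = e1 - e2
    Returns the second-stage trace e2 for every sample of gc_signal.
    """
    e1 = 0.0
    e2 = 0.0
    trace = []
    for gc in gc_signal:
        e1 += dt / TAU_E * (gc - e1)
        e2 += dt / TAU_E * (e1 - e2)
        trace.append(e2)
    return trace


def update_weight(w, io, e, alpha=1e-3):
    """Apply one step of the assumed rule: delta_w = -alpha * IO * e."""
    return w - alpha * io * e


if __name__ == "__main__":
    # A 10 ms burst of granule cell activity followed by silence.
    gc_signal = [1.0] * 10 + [0.0] * 490
    trace = eligibility_trace(gc_signal)
    peak_step = trace.index(max(trace))
    print("trace peaks at step", peak_step, "after a 10-step burst")
```

The trace peaks well after the burst has ended, which is what lets a delayed climbing fiber signal still be credited to the granule cell activity that preceded it.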

Grasp Processing Module

Hoff and Arbib determined that the maximum aperture was related to the size s of the object, with the maximum aperture equal to 0.75s + 0.4, and that the time of maximum aperture was coordinated with the transport phase so that the enclose time was approximately a constant 200 ms [4]. Based on these data, the grasp-phase module was implemented as a single decision box that took as input the estimated TTG for the transport controller and set the target for the hand controller to the maximum aperture if the input was greater than 200 ms, and to s otherwise.
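The decision box described above is simple enough to write down directly. The sketch below follows the relation and the 200 ms threshold stated in the text; the function and constant names are hypothetical, and the units of the object size are left as in the source relation, which the text does not state.

```python
ENCLOSE_TIME = 0.2  # seconds; the approximately constant enclose time from the text


def max_aperture(object_size):
    """Maximum aperture as a function of object size (relation quoted from [4])."""
    return 0.75 * object_size + 0.4


def hand_target(transport_ttg, object_size):
    """Select the aperture target for the hand controller.

    While the transport time-to-go exceeds the enclose time the hand aims for
    the maximum aperture; afterwards it aims for the object size itself.
    """
    if transport_ttg > ENCLOSE_TIME:
        return max_aperture(object_size)
    return object_size


if __name__ == "__main__":
    for ttg in (0.4, 0.25, 0.15, 0.0):
        print("TTG", ttg, "-> aperture target", hand_target(ttg, 1.0))
```

Because the switch depends on the estimated transport TTG rather than on elapsed time, a perturbation that raises the TTG estimate puts the hand back into preshape mode without any extra logic.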

Model 1 Training Result

The model was exercised in two modes: training and evaluation. During training mode a total of 2000 normal reach and grasp movements were made where both the target size and movement duration were selected randomly in the ranges 5-10cm and 0.25-0.5s respectively. Computing the TTG for normal movements is trivial and could be used to train the TTG estimators.
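The training set-up described above can be sketched as follows: each trial draws a target size and a movement duration from the stated ranges, and the TTG training signal for an unperturbed movement is simply the remaining time. The uniform sampling, the sampling interval, and all names are assumptions made for illustration.

```python
import random

N_TRIALS = 2000  # number of training movements stated in the text


def sample_trial(rng):
    """Draw one training trial with size in 5-10 cm and duration in 0.25-0.5 s."""
    return {
        "target_size_cm": rng.uniform(5.0, 10.0),
        "duration_s": rng.uniform(0.25, 0.5),
    }


def ttg_labels(duration, dt=0.01):
    """Time-to-go at each sample of an unperturbed movement: duration minus elapsed time."""
    n_steps = int(round(duration / dt))
    return [max(duration - k * dt, 0.0) for k in range(n_steps + 1)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    trials = [sample_trial(rng) for _ in range(N_TRIALS)]
    labels = ttg_labels(trials[0]["duration_s"])
    print("first trial:", trials[0])
    print("TTG starts at", round(labels[0], 3), "and ends at", round(labels[-1], 3))
```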
Simulated movements made with an "untrained cerebellum" exhibit oscillations and overshoot due to the inaccurate state estimations sent to the feedback controller. In the absence of a TTG signal, the transport and preshape controllers run independently with no guarantee of cotermination. While this bears some resemblance to deficits caused by cerebellar lesions, it has been suggested (Thach, 1996c) that cerebellar patients will also consciously change strategy to compensate for the lack of coordination, e.g. by slowing down and excessive opening of the hand.
After training the accuracy of the model was evaluated by comparing its outputs to human performance for two perturbation experiments [6,7] : in the first experiment the distance to the object is unexpectedly increased; in the second the object size increases during the reach.

The Training Process

The training process is shown in Figure 3. Each panel shows trajectories for five different movement speeds. The left column shows performance before learning, and the right column shows performance after learning. Note how delayed state feedback causes overshoot and ringing.

Training Results

Figure 4 shows the simulation of an unperturbed transport and grasp movement. Panels A and B are simulation results: A shows transport position and velocity, and B shows aperture size and velocity. Panels C, D, and E are human data provided by Paulignan et al. [6,7], as used for the Hoff-Arbib model: C is distance, D is velocity, and E is aperture size. Both movement and aperture size are smooth and timed to coterminate.
During the location perturbation experiment, subjects exhibit a distinctive second bump in the transport velocity profile. Also, the hand interrupts the enclose phase and opens up again before following a normal enclose trajectory when the target is eventually reached. As seen in Figure 5, the model faithfully reproduced the human data. The upper row shows the simulation results, and the lower row shows the human data given by Paulignan et al. [6,7].
Human and model performance for the size-perturbation experiment is shown in Figure 6. What is interesting to note here is that the arm transport slows down in order to give the hand time to preshape appropriately.
The estimated TTG signals during the perturbation experiments are shown in Figure 7. When the distance is suddenly increased, the estimated TTG for the transport controller increases accordingly, which puts the hand back in preshape mode. This then increases the TTG estimate for the hand also and ensures temporal coordination.
During the size perturbation experiment, the increase in object size leads to an increase in the hand TTG estimate. Because the transport TTG estimator receives as input the maximum of the two TTG signals, it now increases the estimate of the transport TTG.

Training on Model 2

Model 1 used a trivial "plant" model for the arm and could only be used to simulate movements as a one-dimensional distance from the starting point. Model 2 integrated the system with a more interesting arm (see [1]) and could be used to test the model in a more challenging environment. We have done some work on this and obtained good training results, but due to space constraints these will be reported in a separate paper.

Conclusion

This paper presents a new control model with a cerebellar-like structure and suggests how the structure could learn two key functions required by the Hoff-Arbib theory, namely state look-ahead and time-to-go (TTG) estimation. The model makes two separate predictions:
The first prediction is that the lateral cerebellum learns a forward model of the limb being controlled, which serves as a finite-time state look-ahead predictor. It does this by monitoring delayed spinal feedback, and it provides the updated expectation to the premotor cortex, where it is used for trajectory generation. In contrast to the Smith predictor, we do not posit the cerebellar forward model to be part of a feedback loop for control.
The second prediction is that the cerebellum helps the temporal coordination of multi-joint movements by providing an estimate of time-to-contact. Biological studies suggest that cerebellar patients have difficulty estimating the time intervals involved in reaching for objects or making contact with moving stimuli. On this view, the general lack of coordination of multi-joint movements seen in cerebellar patients is the result of a basic deficit in timing computation rather than in motor control. By simulating this cerebellar function, the model performs the temporal coordination of multi-joint movements better. Note, though, that what is needed is not a metronome-like clock signal. We only need a system that can generate an internal "trajectory" to predict the onset of a sensory event.
Many problems with the model remain to be resolved or studied further, mainly the following:
The current learning scheme needs a somewhat artificial TTG signal during training to bootstrap the process. A more plausible alternative might be to use some form of temporal difference algorithm.
Models using internal cerebellar circuitry for timing are notoriously sensitive to sensor noise [8] . While our system bases its timing estimates on external variables such as the observed state of the limb and should therefore not suffer from deficits such as diverging population vector trajectories, noise is a topic deserving of further study.
We have only looked at the very basic coordination of transport and aperture. A next step would be to incorporate other aspects, such as the speed-accuracy trade-off and the adaptation of the final wrist position based on the affordances for grasping offered by the target object.

Acknowledgements

We would like to thank reviewers for their helpful comments on the manuscript.
Funding: The work is supported by the National Natural Science Foundation of China (No. 61073115, No. 61173079).
Conflict of interest: none.

References

[1] Ruan X.G., Zhang S.B. (2007) Chinese Journal of Electronics, 35(5), 991-995.

[2] Xiong Y.L., Xiong C.H. (2004) Huazhong Univ. of Sci. and Tech. (Natural Science), 32(9), 5-10.

[3] Townsend B.R., Subasi E. (2011) The Journal of Neuroscience, 31(40), 14386-14389, doi:10.1523/Jneurosci.2451-11.

[4] Hoff B., Arbib M.A. (1993) Journal of Motor Behavior, 25(3), 175-192.

[5] Zhang S.B., Ruan X.G. (2012) Journal of Nanjing University of Posts and Telecommunications (Natural Science), 42(2), 18-22.

[6] Paulignan Y., McKenzie C., Marteniuk R. (2006) Experimental Brain Research, 83, 502-512.

[7] Paulignan Y., McKenzie C., Marteniuk R. (2006) Experimental Brain Research, 87, 407-420.

[8] Kawato M., Kuroda S., Schweighofer N. (2011) The Cerebellum, doi:10.1007/s12311-010-0241-2.

Images
Fig.1A&B- Schematic of the Hoff-Arbib model
Fig.2- Schematic of the new model of temporal coordination of reach and grasp
Fig.3a-d- Training process schematic of Model 1
Fig.4a-e- Simulation of a normal unperturbed reach and grasp
Fig.5a-d- Location perturbation experiments
Fig.6a-d- Perturbation experiments of the object size
Fig.7a- Estimated time-to-go (TTG) during the perturbation experiments: A, position perturbation
Fig.7b- Estimated time-to-go (TTG) during the perturbation experiments: B, size perturbation