Constructions underlying theory of mind and language
by Peter Ford Dominey
http://www.interdisciplines.org/coevolution/papers/2

Statistical learning pervades cognition
Sergio Navega
Mar 3, 2004 13:25 UT

Peter Dominey's target article proposes an appealing vision relating the co-evolution of mechanisms able to process complex embedded structures and the emergence of high level constructs required by a Theory of Mind. Apart from a minor quibble related to the use of the predicate-argument format, I concur with Dominey's main claims. What follows is another path to reach similar conclusions.

Since the seminal study by Saffran et al. (1996) with 8-month-old infants, language learning has been put into a new perspective. Universal grammars and innate language modules are constructs that don't seem necessary to explain how children acquire linguistic expertise, although the whole issue is still under heavy discussion. Following Saffran's results, a number of studies provided similar accounts of statistical learning in the visual domain (Fiser & Aslin 2002; Kirkham, Slemmer & Johnson 2002) and also in audition and touch (Conway & Christiansen 2002).
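To make the idea of statistical learning concrete, here is a minimal sketch of one statistic commonly discussed in this line of work: the transitional probability between adjacent syllables. The nonsense words, the ordering string and the threshold below are made up for illustration, and the code is not meant as a model of what infants actually compute. It only shows that word boundaries can be recovered from an unbroken syllable stream when transitions inside a word are more predictable than transitions across words.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Estimate P(next syllable | current syllable) for every adjacent pair."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(syllables, threshold):
    """Place a word boundary wherever the transitional probability falls below threshold."""
    tp = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Hypothetical three-syllable nonsense words, concatenated without pauses.
lexicon = {"A": ["bi", "da", "ku"], "B": ["pa", "do", "ti"], "C": ["go", "la", "bu"]}
order = "ABCACBABCBACCAB"  # arbitrary word order, assumed for this example
stream = [syllable for word in order for syllable in lexicon[word]]

# Within-word transitions are fully predictable here; cross-word ones are not.
recovered = segment(stream, threshold=0.9)
print(recovered[:3])
```

The threshold is a free parameter of this sketch. In a noisier stream one would more likely look for local dips in transitional probability than for a fixed cut-off.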

These statistical learning abilities have also been found in nonhuman primates (Hauser, Newport & Aslin 2001; Conway & Christiansen 2001), an important result in support of a potential evolutionary path for these fundamental cognitive mechanisms.

However, one hypothesis that is rarely explored considers these statistical mechanisms acting not only on lower level signals, but also during the interpretation of high level behaviors. This idea may be supported by a study by Baldwin et al. (2001) that shows how infants can learn typical sequences of acts in an unsupervised way, reminding us of Saffran's results.

Baldwin et al. conclude that this ability to extract relevant "units of action" from continuous behavior is of great developmental importance. As a typical situation, they refer to the kind of learning that might happen when an infant observes a cashier's actions during a supermarket checkout. At this point in an infant's mental development, he/she may not have the necessary set of high level constructs such as the intentions, beliefs or desires of others. However, even without these high level concepts, the infant could be thought to be using the sequences of learned actions as part of the necessary grounding for the later development of such concepts, in a way similar to lexical and grammatical learning. Under a constructivist viewpoint, this grounding would be a prerequisite to the formation of a set of notions about the intentions of others. Thus, it seems reasonable to entertain the hypothesis that some of the mechanisms responsible for language learning are also in use during the development of ToM.

References

Baldwin, Dare A.; Baird, Jodie A.; Saylor, Megan M.; Clark, M. Angela (2001) Infants parse dynamic action. Child Development, May/June 2001, Vol 72, No. 3, 708-717.

Conway, C. M. and Christiansen, M. H. (2001). Sequential learning in non-human primates. Trends in Cognitive Sciences, 5(12):539-546.

Conway, C.M., & Christiansen, M.H. (2002). Sequential learning by touch, vision, and audition. Twenty-fourth Annual Meeting of the Cognitive Science Society.

Fiser, József; Aslin, Richard (2002) Statistical learning of new visual feature combinations by infants. PNAS Vol 99, No. 24, 15822-15826.

Hauser, Marc D.; Newport, Elissa L.; Aslin, Richard N. (2001). Segmentation of the speech stream in a nonhuman primate: Statistical learning in cotton-top tamarins. Cognition, 78, B53-B64.

Kirkham, Natasha Z.; Slemmer, Jonathan A.; Johnson, Scott P. (2002) Visual statistical learning in infancy: evidence for a domain general learning mechanism. Cognition 83 (2002) B35-B42.

Saffran, Jenny R.; Aslin, R. N; Newport, E. L. (1996) Statistical learning by 8-month-old infants. Science 274, 1926-1928.

    A Unified Construction Approach Motor control, Theory of Mind, and Language
Peter Ford Dominey
Mar 8, 2004 14:52 UT

Sergio Navega's discussion provides an interesting point of departure for an even larger generalization. We can consider that from a certain formal "generativist" perspective there are three "impossible" problems solved by the nervous system. They are impossible in the sense that they are ill-defined, with too many degrees of freedom for generalized learning mechanisms, and thus require some heavy domain-specific machinery to render the problems solvable. The first two are language acquisition and acquisition of theory of mind, and the third for our purposes (though the first in an evolutionary sense) is the "inverse" problem of motor control. The problem here has to do with the fact that to move your hand from your keyboard to your cup of coffee, there are an infinite number of solutions, due to the extra degrees of freedom in your shoulder, elbow, wrist and fingers. Thus there is an infinite set of possible trajectories that could get your hand from one place to another. Furthermore, for each of these trajectories there are an infinite number of velocity profiles that can be used, and the calculation of the inverse dynamics (the force effects of moving masses) is likewise unbounded for even such a simple movement. Thus, from this sad perspective we would never move, never have that sip of coffee. This inverse (kinematics and dynamics) problem rings familiar from the "poverty of the stimulus" perspective, and a viable solution appears similarly from the construction perspective. Indeed, data suggest that the nervous system chooses a workable solution, based on the calculation of motor commands from a kind of parameterized construction inventory, in which specific input-output mappings (constructions) are parameterized and combined to provide an acceptable level of extension and generalization, while allowing for learnability (see Kawato 1999).
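The redundancy at the heart of the inverse problem can be seen even in a toy arm. The sketch below assumes a planar arm with three joints and made-up link lengths, which is far simpler than a real shoulder, elbow, wrist and fingers. For one fixed target it picks several orientations for the last link, solves the remaining two-joint problem analytically, and collects the resulting joint configurations. All names and numbers are hypothetical.

```python
import math

L1, L2, L3 = 0.30, 0.25, 0.15  # hypothetical link lengths in metres

def forward(q1, q2, q3):
    """Hand position of a planar three-joint arm for given joint angles (radians)."""
    a1, a2, a3 = q1, q1 + q2, q1 + q2 + q3
    x = L1 * math.cos(a1) + L2 * math.cos(a2) + L3 * math.cos(a3)
    y = L1 * math.sin(a1) + L2 * math.sin(a2) + L3 * math.sin(a3)
    return x, y

def solutions(target, n=12):
    """Collect distinct joint configurations that all put the hand on the target."""
    tx, ty = target
    found = []
    for k in range(n):
        phi = 2 * math.pi * k / n  # freely chosen orientation of the last link
        wx, wy = tx - L3 * math.cos(phi), ty - L3 * math.sin(phi)  # wrist position
        c2 = (wx * wx + wy * wy - L1 ** 2 - L2 ** 2) / (2 * L1 * L2)
        if abs(c2) > 1:
            continue  # the first two links cannot reach this wrist position
        q2 = math.acos(c2)  # one elbow posture; -q2 would give yet another solution
        q1 = math.atan2(wy, wx) - math.atan2(L2 * math.sin(q2), L1 + L2 * math.cos(q2))
        found.append((q1, q2, phi - q1 - q2))
    return found

target = (0.35, 0.20)
configs = solutions(target)
for q in configs:
    print([round(angle, 2) for angle in q], [round(v, 3) for v in forward(*q)])
```

Increasing n yields ever more configurations for the same target, which is the sense in which the set of solutions is unbounded. This covers only the kinematic part; the velocity profiles and inverse dynamics mentioned above add further freedom on top of it.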

In this context, the idea is that during sensorimotor exploration, the infant begins to develop an inventory of visual-transformation-movement constructions that link parameters of visual target orientation (e.g. looking at a target in space, such as that cup of coffee) and the (initially randomly generated) motor commands that bring the hand to that region of space. These motor constructions will initially be similar to the holoconstructions in language and theory of mind, and with experience they become compositional. It is interesting that this range of compositional motor control appears to have co-evolved with the development of cortical influence on the striatum in the parallel loops of cortex, basal ganglia and thalamus. We (Dominey et al. 1995, 1998, 2003) have demonstrated how this canonical circuit provides capabilities for aspects of motor control and language, and neurophysiological investigations indeed reveal that this canonical circuit is reduplicated throughout the neocortex and striatal complex, providing an ensemble of functional circuits related to motor, cognitive and emotional states. It will be of interest to determine whether a more systematic analysis of this construction-based learning capability can indeed demonstrate its proposed generalization across these three domains of motor control, theory of mind and language.
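As a rough illustration of such an inventory, the following sketch lets a simulated two-joint arm issue random motor commands, stores each command together with the hand position it produced, and later reaches for a target by reusing the stored command whose outcome was closest. The nearest-neighbour lookup is a stand-in chosen for brevity, not the cortico-striatal model cited above, and the arm, link lengths and trial counts are all assumptions. It corresponds to the early, holistic stage only; nothing here is compositional.

```python
import math
import random

L1, L2 = 0.30, 0.25  # hypothetical link lengths in metres

def hand_position(q1, q2):
    """Forward kinematics of a planar two-joint arm (angles in radians)."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y

def babble(n_trials, seed=0):
    """Issue random motor commands and store (observed hand position, command) pairs."""
    rng = random.Random(seed)
    inventory = []
    for _ in range(n_trials):
        command = (rng.uniform(0.0, math.pi), rng.uniform(0.0, math.pi))
        inventory.append((hand_position(*command), command))
    return inventory

def reach(inventory, target):
    """Reuse the stored command whose observed outcome landed closest to the target."""
    _, command = min(inventory, key=lambda item: math.dist(item[0], target))
    return command

def reach_error(inventory, target):
    """Distance between the target and where the reused command puts the hand."""
    return math.dist(hand_position(*reach(inventory, target)), target)

target = (0.30, 0.30)
error_early = reach_error(babble(20), target)    # little exploration so far
error_late = reach_error(babble(2000), target)   # same exploration, continued
print(round(error_early, 3), round(error_late, 3))
```

Because both runs use the same seed, the larger inventory contains the smaller one, so the reaching error can only shrink with more exploration. A step toward the compositional stage would be to interpolate or combine stored entries instead of replaying a single one.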

References:

Dominey PF, Hoen M, Blanc JM, Lelekov-Boissard T (2003) Neurological basis of language and sequential cognition: Evidence from Simulation, Aphasia and ERP Studies, Brain and Language, 86(2):207-25

Dominey PF, Lelekov T, Ventre-Dominey J, Jeannerod M (1998) Dissociable Processes for Learning the surface and abstract structure of sensorimotor sequences. Journal of Cognitive Neuroscience, 10:6, 734-751

Dominey PF, Arbib MA, Joseph JP (1995) A Model of Cortico-Striatal Plasticity for Learning Oculomotor Associations and Sequences, J Cog Neuroscience, 7:3, 311-336

Kawato M (1999) Internal models for motor control and trajectory planning. Curr Opin Neurobiol. Dec;9(6):718-27.

    The importance of interaction
Sergio Navega
Mar 9, 2004 20:43 UT

I am again in full agreement with Peter Dominey's reply. It was opportune to point out that the generativists' arguments concerning the "impossible problems" dissipate on a closer analysis of how human cognition might have developed. It pays to remind generativists that if one proposes innate organs instead of learned linguistic abilities, this leaves open the bigger issue of how these innate organs could have evolved in the first place. In the case of human symbolic language, it seems untenable to maintain that those proposed "innate organs" evolved in such a short period, on the time scale of Homo sapiens evolution. A much more plausible explanation seems to be the idea that language emerges spontaneously as the result of the activity of a community of sufficiently intelligent agents (Kirby 2000).

Dominey rightly mentions that the problem of obtaining good performance from a sensorimotor standpoint is similar to the "poverty of stimulus" issue. It is, likewise, difficult to understand how a learning mechanism is capable of any learning at all given the immense number of degrees of freedom (motor controls) and the dimensionality (sensory signals). On the side of the nativists, we can agree that innate circuitries capable of solving these problems occur frequently in nature. A gazelle is able to walk a few hours after birth. However, it is intriguing that we humans take several months to accomplish such a feat, which suggests that our native endowment in this area is quite restricted.

Perhaps this is a point where I could add some ideas to Dominey's remarks. It really seems useful to consider the workings of parametrized input-output constructions as one of the fundamental processes in the control of such a complicated apparatus as the human arm and hand in its effort to grasp a cup of coffee. However, one issue that appears to be missing is how the learning of these abilities takes place. Given the huge number of possible variations (the dynamics of arm movements is altered if one is wearing a thick coat), it seems necessary to learn the invariant aspects of these controlled movements (which, to be coherent with my previous post, requires statistical learning abilities, although not only relative to "surface aspects"). These constructs (or sensorimotor schemas, to use another wording) must necessarily be the result of abstractions over several prior experiences. But they must also be seen as being learned by trial and error methods, and, in my view, even this is not enough.

A recently born infant isn't aware of his/her own arms, and needs months to learn how the movement of the hands covaries with the corresponding perceptual identification of these movements. More to the point, an infant appears to be learning, in these episodes, not only the correct motor schemas and parametrized input-output sequences required to command the arms and legs in a certain way, but also all the concurrent perceptual activity resulting from these movements. This task is not a simple one, because perception must process not only visual stimuli, but also proprioception and frequently tactile feedback. An important implication of this process is that the schemas that emerge from such learning must simultaneously represent both kinds of information: one of a perceptual nature and another of a "motoric" nature. Also, these structures must be firmly associated with one another, otherwise one wouldn't know how to "check" whether what one is doing is correct or not (tennis players often hit the ball without looking at it, using only proprioception as feedback). This binding between perception/action structures seems essential to the development of efficacious and controlled movements. However, not only for that.

What happens to be a reasonable next step is to consider these "bound schemas" as the grounding on which several high level constructs may be supported. This is where publications such as Rizzolatti and Arbib (1998) can lead us. Another indication that children use abstractions of perception/action schemas to ground meaning can be found in David Bailey's PhD thesis (1997). [On a relatively off-topic matter, this is also one of the greatest challenges of purely symbolic AI: the lack of a body from which to derive the meaning conveyed by sensorimotor schemas.] From here, we go to "intentionality detection", as suggested by Rizzolatti et al. (2000), which again appears to support part of Dominey's initial thesis in his target article.

References

Bailey, David (1997) When push comes to shove. PhD Thesis, University of California, Berkeley.

Kirby, Simon (2000) Syntax without natural selection: how compositionality emerges from vocabulary in a population of learners. In: Knight, C.; Studdert-Kennedy, M., Hurford, James; The Evolutionary Emergence of Language. Cambridge University Press.

Rizzolatti, G., and M. A. Arbib. (1998) Language within our grasp. Trends in Neurosciences 21(5): 188-194, 1998

Rizzolatti G, Fogassi L, Gallese V (2000) Mirror neurons: Intentionality detectors? Int J Psychol 35: 205-205