Evolution of Language from Theory of Mind or Coevolution of Language and Theory of Mind?
Anne Reboul


 Moderators: Peter Ford F. Dominey, Anne Reboul, Gloria Origgi
 

Introduction

For centuries, philosophers and scientists have been trying to answer the question of what is specific to humankind, as opposed to the rest of nature, i.e. the other animals. Among the many abilities which have been proposed, from bipedalism to laughter, through tool making and tool use, two stand out as abilities that no animal can clearly and uncontroversially be said to possess: language and theory of mind (i.e., the ability to interpret and predict others' behavior by attributing mental states, such as beliefs, desires, feelings, etc.). Thus, despite the by now fairly numerous attempts to teach them language, no great ape, even the deservedly renowned Kanzi, can be said to have mastered a language comparable in complexity (both qualitatively and quantitatively) to human languages. On the qualitative side, it remains very doubtful whether apes can learn syntax, and on the quantitative side, apes' vocabularies remain extremely limited. The same may be said for theory of mind (henceforth ToM): despite the undisputed fact that great apes (mostly chimpanzees of both species) live in sophisticated hierarchical societies, with a fair amount of social fluidity, it has not been conclusively proved that they have anything like the kind of ToM with which (normal) human beings are endowed. In fact, there seems to be fair evidence that, according to Sterelny's distinction (2000) between behavior-readers and mindreaders, they may belong to the former rather than to the latter category, while human beings clearly are mindreaders.

On the other hand, the evolutionary status of language remains unclear: is it an adaptation or some kind of exaptation (i.e., a by-product of other abilities which were adaptations in their own right)? In the first case, has it evolved from animal communication or is it sui generis? In the second case, what was it exapted from? And what were those adaptations from which language emerged and how exactly did they allow it to emerge? The very fact, outlined above, that both language and ToM seem to be species-specific abilities, combined with the fact that ToM seems to have some role in any sophisticated communicative behavior, and with the idea that language evolved from animal communicative systems, has led some researchers to the idea that ToM was a prerequisite for language evolution, or, alternatively, that it was one of the adaptations from which language was exapted. (It is presumably important to note that the fact that language might have first appeared as an exaptation does not preclude it from having become an adaptation, i.e. from being selected for and biologically inscribed in the genotype through environmental pressure. This is indeed a plausible scenario for bipedalism, see e.g. Berge & Gasc 2001, Picq 2003.) If the link between communication and ToM and the hypothesis that language evolved from animal communication are accepted, then language, whether it evolved or was exapted, is seen primarily as a communicative system, i.e. a system whose main function is communication. Note that this does not mean that language cannot have other functions: it means only that the function it was adapted for (or that made it a useful exaptation) was communication. In fact, this agrees pretty well with the Social Cognition hypothesis, proposed by Humphrey (1976), according to which social cognition evolved under the pressure of group size among social animals and other cognitive abilities are derived from it.

The idea that language and ToM are strongly linked does not come as a surprise to the pragmatician, especially in the Gricean line: it is one of the central tenets of contemporary pragmatics that what is being linguistically communicated is semantically underdetermined and that the decoding of the sentence must be accompanied by inferential processes yielding the complete interpretation of the utterance (see, e.g., Sperber & Wilson 1995, Levinson 2000, Reboul & Moeschler 1998). A good candidate for these inferential processes may be something along the lines of ToM. Indeed, it is hard to imagine that linguistic communication could take place if our species could not mindread. In two recent papers, Origgi & Sperber (2000, in press) have linked this peculiarity of linguistic communication to the whole problem of language evolution. Basically, they point out that it was not language as such that was adapted, but rather the ability for language acquisition. They outline the role of ToM in solving the paradox that this raises, i.e. what is the use of language acquisition if there is no language to be acquired? This paradox, which is intractable on a code-based view of linguistic communication, dissolves on an inferential or mixed view, as inference steps in. Basically, what this means is that language could not have appeared without some sort of mindreading ability. Indeed, this is a fairly frequent hypothesis in the literature on language evolution (see the papers in Givon & Malle 2002 and, for a review, Reboul 2003a).

I will not be directly concerned here with the precise process of language evolution (but see the papers in Christiansen & Kirby 2003) but with the more delicate problem of whether the evolution of ToM did indeed precede the evolution of language, i.e. with the more precise chronological aspects of the problem, as well as with what sort of ToM, if any, did indeed precede language.

Successive evolutions or coevolution of language and ToM?

To sum up, the evolution of language could presumably not have taken place without a workable ToM, and this suggests a scenario in which the evolution of ToM preceded and conditioned the evolution of language. As was previously pointed out by Malle (2002), things are however not as simple as they seem: for instance, our nearest relatives in the great ape family, i.e. the two species of chimpanzees, do not seem to have anything like a human ToM (this is not to suggest that we might have inherited our ToM from chimpanzees, but that both they and we might have inherited it from our common ancestor of about 7 million years ago); what is more, the main test for ToM, the false belief test, is not passed by normal children much before 4 years of age. In addition, a more recent test, the opaque context test (see Kamawar & Olson 2000, Robinson & Apperly 2001), is not passed before 5 years. However, language acquisition begins at approximately nine months, long before children can pass the false belief test, as shown in the following table (built from the data in Baron-Cohen 1995 and Bloom 2000):

 

Age                    | Language acquisition      | ToM acquisition
From birth to 9 months |                           | ID and EDD
9 months to 18 months  | Going from 6 words to 40  | SAM
24 months              | 311 words                 | Development of ToM
30 months              | 574 words                 | Development of ToM
48 months              | Development of vocabulary | False belief test
60 months              | Development of vocabulary | Opaque context test

 

(Here ID is the detector of intentionality, EDD is the detector of eye direction and SAM is the shared attention mechanism.) The problem is that, just as a workable ToM is supposed to allow language evolution, a workable ToM is supposed to be necessary for language acquisition (Bloom 2000). The above table seems to suggest either that these hypotheses are wrong or that a more complex view, involving coevolution and coacquisition, rather than a succession of ToM and language, is needed. Indeed, there is independent experimental evidence of a link between passing the false belief test and linguistic abilities, whether these abilities are semantic, syntactic or both (see e.g. Yun Chin & Bernard-Opitz 2000, de Villiers & Pyers 2002, Ruffman et al. 2003). I will not try to adjudicate between syntactic and semantic hypotheses, as I think it probable that both syntactic and semantic factors play a role. I will however adopt the conclusion that whatever sort of ToM allows for language evolution (and acquisition), it is unlikely to be a full-blown ToM of the sort that allows passing the false belief and opaque context tests. What this means is not that ToM does not play a role in the evolution/acquisition of language, but that the question of the nature of the ToM that facilitates both processes has to be carefully examined.

What kind of ToM allows both language evolution and acquisition?

In a recent and extremely fascinating book on the design of animal communication (see Hauser & Konishi 1999, and Reboul 2003b, for a review), Perrett devotes a chapter to "A cellular basis for reading minds from faces and actions", centering on primate abilities. Basically, he shows that there are three types of cells in the temporal cortex of macaques: cells that encode the visual appearance of body and face, both static and mobile; cells that encode specific bodily and facial movements; cells that respond to particular faces and body movements as goal-directed actions. There are, as well, prefrontal cells which code both the motor component and the visual appearance of specific movements, a sort of general concept of the action. This leads Perrett to a two-step scenario in which temporal cells recognize the movement as intentional (in the vernacular sense) and alert prefrontal cells which identify the action. Additionally, a fourth type of cell in the temporal area seems to distinguish between self-generated and other-generated dimensions of visual stimuli. According to Perrett, this sophisticated visual system may be sufficient for a complex social life without needing an additional mindreading capacity.

Finally, the temporal population of neurons seems to have analogues in the human brain. What would the system just described amount to in ToM terms? Well, it should cover at least one module associated with ToM, ID, and maybe also EDD. That is, an individual with such a neuronal system should be able to detect intentionality in others' behavior and, possibly, to detect eye direction. Whether it covers SAM as well is unclear, as, on Baron-Cohen's account (1995), SAM yields ternary representations of type [Mummy-see(I-see-girl)], that is, embedded representations, and there is no reason to think that such a neuronal system would in and of itself allow embedding. What is clear, however, is that, at the beginning of language acquisition, though ID and EDD are acquired, SAM is not entirely operational. This suggests that, despite the title of Perrett's paper, what is in question here might be more on the behavior-reading side than on the mindreading side. The question then is: if ID and EDD are necessary for language acquisition though SAM is not (at least at the beginning of acquisition), and given that there is some evidence of ID and EDD in primates, would not ID and EDD have been sufficient (in terms of behavior/mindreading) for language evolution too? Note that this is not a question of ontogeny recapitulating phylogeny, but rather a question of necessary conditions for the evolution of a new ability, the ability to acquire a language. Note as well that I am not claiming that there is nothing more to language acquisition/evolution than a behavior-reading ability. What I am suggesting rather is that the kind of social ability necessary for the evolution of the language acquisition device may have been of the order of behavior-reading rather than mindreading abilities. Thus SAM, which can be taken to be the first step in mindreading, would not be necessary to the acquisition (or evolution) of language, at least at the beginning.

The next question is: is a linguistic ability of some kind (however immature) necessary for the development of SAM? In other words, could a non-linguistic animal develop SAM? Well, chimpanzees have been shown to be capable of gaze following (see, e.g., Povinelli 2000), just as babies are. The question then is whether this indicates that chimpanzees and babies have SAM or whether it merely indicates that they have ID and EDD. The only common behavior that might indicate that both chimpanzees and babies have SAM is that they are able to follow the gaze of another individual. It is not clear why this should entail a ternary representation of the kind described by Baron-Cohen. In other words, though ternary representations may appear on the basis of gaze-following abilities, it is not clear that such abilities are based on ternary representations.

If this is right, then it appears that language evolution/acquisition, though it unsurprisingly rests on social abilities, may need behavior-reading rather than mindreading abilities. A full mindreading ability such as ToM would not be needed. This would enable us to restate the question of the relation between acquisition/evolution of language and acquisition/evolution of ToM.

Coevolution of language and ToM

As the table above makes clear, language and ToM are acquired in tandem, at least if one relies on false belief and opaque context tests as the acid tests for possession of ToM. This reliance has been contested by Bloom & German (2000), who pointed out that the false belief test tests plenty of other things in addition to ToM, such as memory. This criticism is probably right, though it does not mean that the false belief task does not test ToM as well: possession of ToM relies on the ability to attribute to other people beliefs different from one's own. There is also the methodological point that until now no better test seems to have been proposed. However, the issue regarding the false belief and opaque context tests might rather be that, as they depend on language, they obviously cannot be passed until mastery of language is well advanced. Thus, such an objection would go, it is no surprise that those tests are passed late, and they do not give us any information about the mindreading abilities of children before they have attained a linguistic ability sufficient for the task. This objection may well be right, but it is not clear that it goes through, given the mounting evidence in favor of a strong link between mindreading and language. Supposing that there is indeed such a link, what more precisely could it be?

In a series of experiments, Povinelli (2000) has tried to demonstrate that chimpanzees, despite their social and tool-using abilities, do not have ToM or naive physics comparable to human ToM or naive physics (note however that these experiments have been criticized by, e.g., Hauser 2001). The difference is due to the fact that both abilities are supported in humans by abstract concepts corresponding to invisible but supposedly causally efficient "entities" such as force, belief, etc. According to Povinelli, chimpanzees not only lack these concepts: no amount of learning leads them to acquire them. Povinelli's theory regarding this major difference is that it can be explained by the presence in humans and absence in chimpanzees of language. In other words, language is, under that hypothesis, what allows humans to develop abstract concepts of the sort described above. Note that though abstract concepts in ToM might depend on an embedding capacity of the sort described in Hauser et al. (2002), there is no reason to think that this is the case for abstract concepts in naive physics. Thus, it may be that we do indeed need language for mindreading, though we do not need mindreading for language at its inception, even though we need behavior-reading abilities. This means that language may have a cognitive function, though I will not discuss here whether this was the basis for language evolution (see Newmeyer 2003 for a nice discussion).

Conclusion: what is a ToM useful for?

If we do not need a ToM to acquire language and did not need it to evolve language, what is the use of a full-blown ToM? A first (Gricean) suggestion might be that we need it for utterance interpretation. This position has been criticized by Sperber & Wilson (2002), who propose that utterance interpretation rests on a dedicated comprehension module, which evolved especially for the comprehension of linguistic communication, exploiting some metarepresentational principles, as well as the relevance-based principle of least cognitive effort. This would seem to rob ToM of any significant role in utterance interpretation. However, I would like to go back to a tripartite distinction introduced by Sperber (1994) between three interpretive strategies: Naive optimism, in which the hearer considers the speaker to be both competent and benevolent; Cautious optimism, in which the hearer considers the speaker to be benevolent but not necessarily competent; and Sophisticated understanding, in which the hearer assumes neither that the speaker is competent nor that the speaker is benevolent. In the first strategy, no ToM is required, though some is necessary in the second strategy and a lot more in the third strategy. It is a commonplace to say that communication (especially linguistic communication) brings with it the possibility of deception. This may be where the coevolution scenario takes on a new dimension: the evolution of a language acquisition device allowed the emergence of language and linguistic communication, which respectively allowed the development of a full-blown ToM and of deception. And ToM may be the best tool for detecting and foiling deception, though I would not go so far as to claim any sort of Baldwin effect such as that described in Godfrey-Smith (forthcoming), in which a new situation (here linguistic communication) leads to the adoption of new behavioral skills (here ToM from language), leading to a change in social ecology (better mindreading individuals are favored), hence to changes in selection pressures, which might lead to the new skills being adapted, i.e. to an increase of genotypes predisposing individuals to acquire these skills (here full-blown ToM).

A final word: Dehaene (1997) has shown how sophisticated contemporary mathematics arises from and is still determined by numerosity, a relatively low-level capacity which we share with a wide range of animal species, from birds to primates. Numerosity needs to be supplemented by language and a symbolic system of notation to be the foundation of mathematics. My suggestion is that full-blown mindreading may rest on relatively simple behavior-reading abilities, supplemented by language, to develop. It may then be used in sophisticated linguistic communication (such as fiction, for instance), though it is probably not necessary for language acquisition, for language evolution, or for most commonplace linguistic communication.

References

Baron-Cohen, S. (1995) Mindblindness. An essay on autism and theory of mind, Cambridge, MA, The MIT Press.

Berge, C. & Gasc, J-P. (2001) "Quand la bipédie devient humaine", in Picq, P. & Coppens, Y. (eds), Aux origines de l'humanité. Le propre de l'homme, Paris, Fayard, 80-125.

Bloom, P. (2000) How Children Learn the Meanings of Words, Cambridge, MA, The MIT Press.

Bloom, P. & German, T. (2000) "Two reasons to abandon the false belief task as a test of theory of mind", Cognition 77, B25-B31.

Christiansen, M.H. & Kirby, S. (eds) (2003) Language evolution, Oxford, Oxford University Press.

De Villiers, J.G. and Pyers, J.E. (2002), "Complements to cognition: a longitudinal study of the relationship between complex syntax and false-belief understanding", Cognitive Development 17, 1037-1060.

Dehaene, S. (1997) The number sense, Oxford, Oxford University Press.

Givon, T. & Malle, B. (eds) (2002) The evolution of language out of pre-language, Amsterdam, John Benjamins Pub. Co.

Godfrey-Smith, P. (forthcoming) "Between Baldwin skepticism and Baldwin boosterism", in Weber, B. & Depew, D. (eds), Learning and evolution. The Baldwin effect reconsidered, Cambridge, MA, The MIT Press.

Hauser, M.D. (2001) "Elementary, my dear chimpanzee. Review of 'Folk physics for apes' (D. Povinelli)", Science 291, 440-441.

Hauser, M., Chomsky, N. & Fitch, T. (2002) "The faculty of language. What is it, who has it, and how did it evolve?", Science 298, 1569-1579.

Hauser, M. & Konishi, M. (eds) (1999) The design of animal communication, Cambridge, MA, The MIT Press.

Humphrey, N. (1976) "The social function of intellect", in Bateson, P. & Hinde, R. (eds), Growing points in ecology, Cambridge, Cambridge University Press, 303-317.

Kamawar, D. and Olson, D.R. (2000) "Children’s Representational Theory of Language: Problems of Opaque Contexts", Cognitive Development 14, pp. 531-548.

Levinson, S. (2000) Presumptive meaning. The Theory of Generalized Conversational Implicature, Cambridge, MA, The MIT Press.

Malle, B. (2002), "The relation between language and theory of mind in development and evolution", in Givon, T. & Malle, B. (eds), The evolution of language out of pre-language, Amsterdam, John Benjamins Pub. Co., 265-284.

Newmeyer, F. (2003) "Grammar is grammar and usage is usage", Language 79/4, 682-707.

Origgi, G. & Sperber, D. (2000) "Evolution, communication and the proper function of language: a discussion of Millikan in the light of pragmatics and of the psychology of mindreading", in Carruthers, P. & Chamberlain, A. (eds), Evolution and the human mind. Language, modularity and social cognition, Cambridge, Cambridge University Press, 140-169.

Perrett, D. (1999) "A cellular basis for reading minds from faces and actions", in Hauser, M. & Konishi, M. (eds) The design of animal communication, Cambridge, MA, The MIT Press.

Picq, P. (2003) Au commencement était l'homme. De Toumaï à Cro-Magnon, Paris, Odile Jacob.

Povinelli, D. (2000) Folk Physics for Apes, Oxford, Oxford University Press.

Reboul, A. (2003a) "Review: Givon, Talmy and Bertram F. Malle, eds (2002) The Evolution of Language out of Pre-language. John Benjamins Publishing Company, paperback ISBN 1-58811-238-1, ix+392", Linguist list issue 14.1734.

Reboul, A. (2003b) "Review: Hauser, Marc D. and Mark Konishi, ed. (2003) The Design of Animal Communication, MIT Press.", Linguist list issue 14.3088.

Reboul, A. & Moeschler, J. (1998) La pragmatique aujourd'hui. Vers une nouvelle science de la communication, Paris, Le Seuil.

Robinson, E.J. and Apperly, I.A. (2001) "Children’s difficulties with partial representations in ambiguous messages and referentially opaque contexts", Cognitive Development 16, pp. 595-615.

Ruffman, T., Slade, L., Rowlandson, K., Rumsey, C. and Garnham, A. (2003) "How language relates to belief, desire and emotion understanding", Cognitive Development 18, pp. 139-158.

Sperber, D. (1994) "Understanding verbal understanding", in Khalfa, J. (ed.), What is intelligence?, Cambridge, Cambridge University Press, pp. 179-198.

Sperber, D. & Origgi, G. (in press) "Qu'est-ce que la pragmatique peut apporter à l'étude de l'évolution du langage?", in Hombert, J-M. (ed.) L'origine de l'homme, du langage et des langues.

Sperber, D. & Wilson, D. (1995) Relevance. Communication and cognition, Oxford, Basil Blackwell, 2nd edition.

Sperber, D. and Wilson, D. (2002) "Pragmatics, modularity and mind reading", Mind and Language 17, pp. 3-23.

Sterelny, K. (2000) "Primate worlds", in Heyes, C. & Huber, L. (eds), The evolution of cognition, Cambridge, MA, The MIT Press, 143-162.

Yun Chin, H. and Bernard-Opitz, V. (2000) "Teaching Conversational Skills to Children with Autism: Effect on the Development of a Theory of Mind", Journal of Autism and Developmental Disorders 30/6, pp. 569-583.

Bricolage in the Evolution of Language and Mind-Reading
Michael Arbib, Feb 21, 2004 18:52 UT

1. I agree with Anne Reboul’s conclusion that “full-blown mind reading may rest on relatively simple behavior-reading abilities supplemented by language to develop … [and] probably is not necessary for either language acquisition or evolution ...” I agree, too, with Gärdenfors that we must treat ToM as a complex of structures rather than a single one, and with Diesenbruck that language acquisition must also be subdivided, but think that the emphasis on word acquisition may be mistaken and that “fully fledged” ToM (ffToM) is impossible without more complex utterances. Here are some specific comments on different aspects of Reboul’s discussion:

2. Reboul asserts that “it is hard to imagine that linguistic communication could take place if our species could not mind read”. I would agree that linguistic communication would seem to require some form of shared attention (cf. Gärdenfors) but do not see why it would require mind reading rather than behavior reading to be of great social benefit. Indeed, Reboul quotes Perrett’s [1999] analysis of "A cellular basis for reading minds from faces and actions" which may perhaps be more akin to behavior reading than mind reading, and yet be sufficient for a complex social life without needing an additional mind reading capacity. I will return to this topic in my own paper discussing the Mirror System Hypothesis later in this series.

3. I would like to see a more critical discussion of ID as “the detector of intentionality”. I think there is compelling evidence to distinguish what is called imitation in the newborn from what I would consider “true” imitation in the one-year-old infant. Perhaps a similar distinction between “the early appearance of something with some of the characteristics of intentionality” and “later appearance of an understanding of the intentions of others” might be helpful in our multi-aspect approach to ToM.

4. I would like to see a thread discussing why mind reading is useful, as distinct from behavior reading. If I see that you are angry and behave in a more guarded fashion because of this, one might explain it in “mind talk”, but in this case analysis of behavioral disposition in relation to facial expression seems more than adequate. Why then do mental states figure in folk psychology as more than dispositions for current behavior? The ability to follow quasi-causal chains to link a variety of behaviors by an individual seems crucial, and it may require both good episodic memory and the ability to use language to organize it to ground a folk psychology that makes these mental states feel real. This seems to support Reboul’s position.

5. Reboul notes Sperber & Wilson’s (2002) notion of a dedicated comprehension module, which evolved especially for the comprehension of linguistic communication, but this seems too all-or-none. Language is regenerative - not simply expressing extant meanings but also creating new realities as experience crystallizes around new words and expressions and stories. However, Sperber’s (1994) distinction between three interpretive strategies that she cites does seem helpful. Whereas the speaker of early language (in either sense) might simply be uttering a description or command, effective use of modern language will often (not always) have a communicative goal of affecting the hearer in a certain way. With experience, one builds increasingly subtle models of individuals to tailor one’s utterance not only to one’s general experience with them but also to one’s sense of their current state. This is true for behavior, but becomes amplified in subtlety because language can communicate so many more messages, which can thus go wrong in so many ways. Basic animal behavior in terms of hierarchy seems to be the basis for this - the protoToM, if you like, once coupled to, e.g., the facial expression of the emotions or other communicators (e.g., pheromones) of current state.

6. Let me just add that much of the “evolutionary process” described in Reboul’s article and here may well be a historical process building on a biology adequate to support protolanguage, rather than resting on specific biological changes.

 
 
© 2008 interdisciplines.