Verbal exchange is undoubtedly the pinnacle of human communication. Language use is often conceptualized as guided by internal mental representations. Relatively few research programs have focused on the sonic and embodied aspects of linguistic discourse and exchange. Indeed, language-based communication cannot be separated from at least some vocal determinants that contribute to shaping the meaning of a message. The materiality of linguistic expression (voice, gesture, etc.) is the foundation for conversational exchange. Conversation analysts take into account the expressive multimodality of verbal and non-verbal behaviour between speakers. They are interested in the order and sequencing of chunks of speech that convey specific thoughts, as well as the temporal continuity of the interaction as a whole. More specifically, the gestures, prosody and phonology used by speakers provide major cues for understanding interaction at both local and global levels. Thus, integrating temporality and corporeality is necessary if the aim is to understand the process of interpersonal coordination. In any case, the paradigm of representational cognition seems nowadays to be less meaningful than that of embodied cognition.

Within this framework, musical communication between musicians, and between musicians and their audiences, has recently emerged as a valuable field of study. Music forces us away from studying the mere representational function of verbal language, because it is naturally suggestive of broader, polysemic, and perhaps more intimate, meanings (emotions, moods, feelings). Furthermore, musical meaning is even more clearly dependent on its temporal dimensions (sequence, order, synchrony). In his well-known essay, “Making Music Together: A Study in Social Relationships,” the sociologist Alfred Schütz[1] describes musical activity from both the musician’s and the audience’s perspectives, underscoring the existence of a pre-communicative, wordless social interaction that is sufficient for establishing intersubjectivity and mutual understanding. Schütz uses music as a theoretical backdrop for gaining insight into interpersonal relations in general.

John Dewey also underscores music’s privileged position among the arts, which derives from its temporal dimension and its collective practice.[2] It is difficult to doubt music’s communicative power and its eminently social function. Music contributes to defining differences between cultures and groups, but it also brings them together. Its omnipresence and its role in all sorts of human celebrations and rituals are a case in point. Some authors support the idea that music played an adaptive role in human evolution because of its social dimension. Thus, the ambiguity of musical meaning might have offered an adaptive advantage as it helped maintain and control the remarkable social flexibility that is characteristic of our species,[3] while also fostering social cohesion and cultural belonging.[4] Arjun Appadurai regards music as the best way to create communities of feeling, transporting both musicians and their audiences beyond their individual perspectives.[5] It is possible that musical cultures mark boundaries far less clearly than linguistic cultures do. In all musical traditions, despite real acoustic, rhythmic and harmonic differences, certain basic principles are found fairly systematically, such as pulse (strong regular beats inviting movement and anticipation), quality (allowing the expression of movement and melody), and global performance structure (a narrative with a beginning and an end, and rising and declining tension involving repetitions and changes). In this paper, we contend that these general principles of music represent primary means of coordination, observable even in the earliest forms of human communication.

The study of musical communication can therefore shed light on the synchronization process that takes place between individuals. We are particularly interested in the kinds of improvisational practices that can be independently observed within different musical genres, but which are more representative of some of them, for example, jazz. Keith Sawyer has described in detail the similarities between collective improvisation and verbal conversation; in both situations, individuals can express themselves freely, provided they match or “attune” to each other.[6] In this paper, we will suggest that musical improvisation also resembles the interactions between adults and pre-verbal infants, who nevertheless produce expressive sounds and gestures. We will suggest that the meaningful coordination of sound and gesture is a generalized human skill that appears as early as the first weeks of life. In terms of human evolution, these skills appeared long before the technically elaborate and culturally sophisticated forms of synchronization that characterize jazz performance came into being. Thus, infants’ spontaneous body movements and vocalizations already represent the child’s ability to synchronize and communicate feeling states and intentions. This early skill, which lends itself to myriad new situations as a child develops and enters into language, continues to foster the shared temporal experiences that build relationships, communities and histories.

In the first section of the paper, we discuss the notion of synchrony by looking at its origins in infant communication. In the second part of the paper, we describe processes of coordination observable in improvised jazz performance. In the last section of the paper, we explore how these processes affect interpersonal relationships and their cultural contexts. Thus, we hope to demonstrate that the temporal coordination of gesture and sound represents the basis of human communication, shared emotion and culture.

Synchrony and the roots of musical communication

Since the early sixties, researchers have worked to describe the communicative processes that take place between babies and their caregivers. Many of the terms they have used to describe the intimate, non-verbal relations between adults and infants connote embodied musical experience. Synchrony is one of these terms. It was used, in a now famous paper by William S. Condon and Lou W. Sander, to describe how newborns rhythmically adjust their body movements to the speech of adults:

As early as the first day of life, the human neonate moves in precise and sustained segments of movement that are synchronous with the articulated structure of adult speech. […] This study reveals a complex interaction system in which the organization of the neonate’s motor behavior is entrained by and synchronized with the organized speech behavior of adults.[7]

Daniel Stern refers to mother-infant interaction as a duet and a dance and Condon talks of the orchestration and choreography of communicative behavior,[8] while Colwyn Trevarthen and Stephen Malloch use the term “communicative musicality” to describe communication with infants in the first few months of life.

Communicative musicality refers to the infant’s motivation to communicate with close others through coordinated timing, emotive expression and sympathetic mirroring across sense modalities. According to Trevarthen and Malloch, infants are able to partake in these meaningful collaborative performances because they are born with the ability to share a time sense with others.[9] Parents and infants parse expression in similar ways. Knowing what chunks or units of expressive behavior the other will perceive, they thus participate in a dynamic mutual shaping of patterns of expression.

Three dimensions of expression in time are considered to be the core temporal processes of “communicative musicality”: pulse, quality and narrative. These are hierarchically organized and interactively upheld. Pulse is the shortest unit of time, lasting between 500 milliseconds and 2 seconds, and is created by the regular succession of behavioral elements (vocal or gestural accents). Quality refers to the contours of expression in body and voice that provide cues to individual intentions and emotions. It is based on musical features such as pitch, intensity and timbre. Narrative is defined as an unfolding musical story built on variations of pitch and intensity that perform shared episodes with growing and declining excitement. It is interesting that a number of researchers have highlighted a correspondence between musical or communicative time-scales and the frequency ranges of various human bodily activities. For instance, a musical phrase is associated with the time frames of breathing, of arm-length gesturing and of swaying. Musical pulse is closest to the rhythms of chewing and sucking, of heartbeat, walking, sexual intercourse and head nodding.[10]

Many studies have shown that parents spontaneously adopt a musical form of speech when addressing young infants. Infant-directed speech (IDS) has been shown to match very precisely the capacities of infants at different ages. It can be easily recognized in any language of the world for being rhythmic, presenting a wide pitch range with high amplitude variations in pitch, as well as containing more repetition of sounds and prosodic contours.[11] From birth, infants respond to their parents with whole-body, multi-modal expressions that are precisely coordinated. By the second month, their vocalizations finely match the acoustic properties of adult speech.[12] Parents perceive infants to be taking part in dialogues; they feel they have “something to say.”

From the first few weeks after birth, when infants acquire the physical strength to channel their drive to communicate, they show interest in the lives of others and are indeed able to command and direct the attention of the adults around them. How do they do this without bodies to transport them, and without words and gestured signs? While the answer is not yet entirely clear, it seems that one way they achieve this is through time. Infants have an intimate and precise sense of time, and of timing, as experimental studies have shown.[13] They are able to organize their own expression in time, and to perceive the timing of sequences in other people’s behaviour.

It may seem implausible that an 8-week-old infant is capable of simultaneously monitoring and performing expressive behavior occurring within very short time frames. Traditional theories of cognitive psychology cannot account for the ability of young infants to anticipate events at multiple hierarchical levels. There is, however, strong evidence that infants do possess this expertise and cognitive psychology must adjust to this fact. There is today a substantial literature in developmental psychology pertaining to music to support the notion that young infants are capable of remarkable musical perception and cognition.[14] Furthermore, recent theories of embodiment acknowledge body-based processing of complex information.[15]

The study of vocal communication between mothers and infants reveals that mothers often act as pulse givers and that infants participate in the activity of collaborative time keeping.[16] Pulse is, most of the time, an implicitly recurring unit that is felt, but not always heard. However, one crucial feature of this rhythmic dimension of pulse is that it involves variability, that is, minor deviations from strict temporal regularity. The pulse of early communication is expressive.[17] Infants have been shown to entrain to regular beats in music from the first months of life,[18] but it seems that they also anticipate rhythmic patterns and, most importantly, that they expect these patterns to deviate from strict regularity. The phenomenon of expressive timing is well known in music; expression in musical performance depends on small temporal deviations from expected or written forms.[19]
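
The notion of expressive timing described above, that is, small deviations from a strictly regular pulse, can be made concrete with a short sketch. This is only an illustration under stated assumptions, not the method used in the studies cited: the function names and the onset times are hypothetical, and onsets are assumed to be given in milliseconds, one per vocal or gestural accent.

```python
from statistics import mean


def inter_onset_intervals(onsets_ms):
    """Return the time between each pair of successive accents, in ms."""
    return [later - earlier for earlier, later in zip(onsets_ms, onsets_ms[1:])]


def timing_deviations(onsets_ms):
    """Return how far each interval departs from the average interval, in ms.

    A perfectly regular (metronomic) pulse would give all zeros; an
    expressive pulse gives small positive and negative values.
    """
    intervals = inter_onset_intervals(onsets_ms)
    average = mean(intervals)
    return [interval - average for interval in intervals]


# Hypothetical accent onsets (ms), invented purely for illustration.
example_onsets = [0, 600, 1180, 1800, 2400]
print(inter_onset_intervals(example_onsets))
print(timing_deviations(example_onsets))
```

With these invented values the average interval is 600 ms, and individual intervals depart from it by up to 20 ms in either direction, which is the sort of variability the text refers to.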

Musicality, then, might be considered an antecedent of both music—in all its forms—and language. It is perhaps what gives rise to human sense-making. It is likely that oral discourse cannot convey meaning without some musicality. Language, like music, is primarily patterned sound in time. The timing of speech, its progression along multiple dimensions formed by melody, tone, accent and grouping (all that is subsumed under the term prosody) is what holds attention and holds communication.[20] A number of researchers interested in the organization of everyday conversation highlight the rhythmic properties that enable speaker coordination.[21] Some conversations flow better and are more enjoyable than others, even in very mundane situations. A certain seamlessness can exist between the productions of the interlocutors, which makes them sense what the other is about to say. This ability to anticipate the words and thoughts of another individual generally goes hand in hand with a high degree of intersubjective understanding and affiliative emotion.

As an aside, it is interesting to note that musicality also plays an important role in written discourse. Although syntactic structures are crucial to comprehension, the activity of reading is fundamentally temporal (and culturally situated). The reader picks up the timing of the written sentences, the way they fall, in accordance with the writer’s intentions. Reading is an intersubjective experience; to be intelligible, a written text must translate into patterned sound in the mind of the reader and the reader must tune in to the writer’s temporality, sharing not only his thoughts and impressions, but also his “flux of experience in inner time.”[22]

One problem with doing research on synchrony in infant communication is that infants cannot use words to describe their interactive experiences. Parents, however, can describe their experiences of communicating with their young ones, and often describe the excitement they feel when their babies “tell” them things and when they sense their “non-discursive” meanings.[23] We can nonetheless ascertain a great deal about the interactive processes underlying the experience of “tuning in” and synchrony that “communicative musicality” supports by listening to and studying the rich oral discourse used by jazz musicians to describe their own experiences of improvising together, and the embodied performances themselves. In their verbal accounts, jazz musicians rely heavily on metaphor; they use gestures and onomatopoeia because there is, at bottom, something ineffable about their experiences. Jazz musicians feel they are “saying” something when they play well, particularly when they are, as they say, “sharing good time” or “in a groove.”

We propose that music, and improvised music in particular, has remained closest in form to the ways in which humans become intimate and build culture together through time. At the same time, it must be granted that if musicality accounts for the intuitive, spontaneous part of music, it is not by any means sufficient to account for the beautifully complex musical productions from around the world that move us aesthetically and spiritually and which rely on centuries of accumulated know-how and creativity. What we are suggesting is that musicality plays a fundamental role in shaping cultural forms and styles in an interactive way, over time, a time that is at once subjective, intersubjective and historical.

Synchrony and improvisation in sound-based interaction

Musical improvisation is often thought to involve playing something that has not been planned ahead of time and that is not based on any preexisting scheme. But this definition is not entirely accurate. Most jazz musicians, for instance, base their improvisations on standard songs that are part of a well-defined repertoire. And all improvising musicians adhere to a number of rules that are, if they are not explicit, implicitly established in the course of performance. In all so-called improvised genres, there is a question of the extent to which any performance is, indeed, improvised. Improvisation, according to Sawyer, always involves a dialectical relation between structure and variation, or between the known and the new.[24] Musicians must establish what Sawyer calls an “improvisation zone” that lies between predictability and novelty. Although most would admit that there is less improvisation in the interpretation of a scored symphony than in the performance of a piece of free jazz, quantifying degrees of improvisation in music remains highly problematic.

However, the experience of being together in time or of being in synchrony is at the heart of improvised music. It refers to the sense of having a shared “feel,” both for the beat and for the unfolding musical performance’s possible futures. When jazz musicians feel they are playing well together, they are able to sense each other’s movements and expressions, they sense how the tempo might progress and, at the same time, they anticipate harmonic progressions and melodic lines.[25] Many jazz musicians describe the experience of togetherness afforded by synchrony or “groove” as a collective emotional experience. The musician Don Byron says grooving is “a euphoria that comes from playing good time with somebody.”[26] Another jazz musician, Phil Bowler, likens groove to a “mutual feeling of agreement.”[27] Synchrony in music does not come about by simply adhering to some common time signature and harmonic framework. A common time feel is collaboratively created; it requires a present moment effort based on shared purposes.[28] It may be that in jazz performance, as in preverbal communication, setting up and sensing the underlying pulse is a crucial step for feeling connected and tuned in. Pulse may be thought of as a basic framework for experiencing the sorts of alignments and interpersonal affiliations that jazz musicians and mothers often describe as synchrony.[29]

Empirical analysis of interactions between improvising musicians highlights other means by which individuals arrive at what Schütz called “a ‘We’ experience.” The musical phrase, for instance, can be considered an important unit for the creation of musical meaning. It has been described as intimately grounded in human biological and kinesic temporal cycles.[30] Musical phrasing expresses both dynamic emotion or feeling and intentional motion.[31] The phrase itself can be thought of as a narrative unit presenting both start and end points that are often marked by final lengthening or pausing with an ending.[32] It is a way of shaping the experience of time in music and giving sound a direction (intentionality). This is achieved through continuous interaction with ongoing musical pulse. Therefore, phrasing constitutes another means for the expressive coordination of musical ideas and feelings. Each musician can react to, and partake in, the phrasing proposed by another.

Gratier described various processes involving both melodic and rhythmic phrasing in two performances of a jazz duet.[33] An acoustic analysis of short audio and video excerpts taken from recordings of live performances showed both of the featured musicians (a guitar player and a drummer) making pervasive use of various forms of imitation and matching of melodic, rhythmic and harmonic elements. Other frequent techniques the musicians used to establish mutual understanding and shared purpose were punctuation and completion. In addition to audible musical cues, the musicians signaled their musical connectedness and intentions through various visual bodily indices. Facial expressions, posture shifts and gestures partake in and reveal the dynamic temporal coordination of an improvised performance.[34] Mutual gaze is a sign of attention and can facilitate the negotiation of transition points. Smiles can index recognition, pride, or embarrassment, and often the perception of humour. All bodily movements can potentially indicate to other musicians and to audiences, voluntarily or not, what a musician is feeling at a particular point in time. Performers also sometimes use vocal cues to signal a state, or a change that is about to occur. Synchrony between improvising musicians can thus be reinforced or transformed through para-musical signals. But synchrony must be seen as a means for musical expression, supporting collective creativity, and not as an end in itself. However, it clearly also has an aesthetic dimension, as it produces emotion in performers and audiences.

Another empirical study, this one conducted by Rebecca Evans, focused on micro-temporal adjustments made between two rhythm section jazz musicians in the course of playing multiple takes of the same song for the recording of a commercial album.[35] In this study, the separate audio tracks of each musician were analyzed with regard to beat positioning. The aim of the study was to ascertain whether the rhythm section musicians (bass and drums) played in synchrony at the pulse level. Evans’s study revealed two interesting and interconnected phenomena. First, it appeared that musicians knew when they were not in synchrony at the pulse level; they also knew that this knowledge did not require verbal explication. Indeed, one of the takes that was performed was unanimously discarded, whereas three other takes were evaluated and discussed (involving some debate) regarding their suitability for inclusion in the album. Analysis of the four takes showed that the lag between the drummer’s and the bassist’s beats was far greater in the discarded take than in the three others. The second phenomenon that emerged from this analysis was that in all three suitable takes, there was some lag between the beats of the two musicians. These lags alternated throughout the song, such that one musician lagged behind the other on the first beat and the roles were reversed on the second beat. There is, it seems, a conversational “give and take” between the two musicians. Synchrony is thus a much more complex process than simply “playing together at the same time.”
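
To make this kind of beat-position comparison more tangible, here is a minimal sketch of how lags between two tracks might be computed. It is not Evans’s actual procedure, which the text does not detail: the function names, the choice of milliseconds, and the onset values are all hypothetical, and the sketch assumes that each track has already been reduced to one onset time per beat.

```python
from statistics import mean


def beat_lags(drum_onsets_ms, bass_onsets_ms):
    """Return the signed lag on each beat, in ms.

    A positive value means the bass sounds after the drums on that beat;
    a negative value means the bass sounds first.
    """
    if len(drum_onsets_ms) != len(bass_onsets_ms):
        raise ValueError("expected one onset per beat in each track")
    return [bass - drum for drum, bass in zip(drum_onsets_ms, bass_onsets_ms)]


def summarize_lags(lags_ms):
    """Return (mean absolute lag, number of changes in who is behind).

    The first value indicates how far apart the two players are on average;
    the second counts how often the lagging role passes from one player
    to the other between consecutive beats (beats with zero lag are skipped).
    """
    mean_abs = mean(abs(lag) for lag in lags_ms)
    bass_behind = [lag > 0 for lag in lags_ms if lag != 0]
    role_changes = sum(1 for a, b in zip(bass_behind, bass_behind[1:]) if a != b)
    return mean_abs, role_changes


# Hypothetical onsets (ms) for four beats, invented purely for illustration.
drums = [0, 500, 1000, 1500]
bass = [20, 480, 1030, 1490]
lags = beat_lags(drums, bass)
print(lags)
print(summarize_lags(lags))
```

In such a sketch, a large mean absolute lag would correspond to the situation of the discarded take, while frequent role changes with small lags would correspond to the alternating “give and take” observed in the takes judged suitable.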

Synchrony involves both a shared timing, as in the experience of groove that comes from a negotiated pulse, and a sharing of audible intentions or ideas, as in matching and punctuating phrases. We suggest that sharing “good time” through sensing the beat of a lively pulse underlies interpersonal affiliation, but also engenders people’s sense of belonging in human interactions. The “mutual tuning-in relationship”[36] that a shared pulse affords produces feelings of intimacy and closeness,[37] and constitutes a form of meaning-making which can be described as non-discursive.[38] In the final section of this paper, we suggest that both infants and improvising musicians achieve a sense of belonging through the process of getting into a groove, or achieving interpersonal synchrony, which opens up spaces for co-constructed meaning and the sedimentation of dynamic cultural forms of interaction.

Synchrony, meaning and culture

Human communication cannot be regarded merely as informational exchange. We must also take into account the fact that individuals become involved with each other, each contributing their own knowledge and desires. The process of interacting, furthermore, starts to blur the boundaries between self and other, situating knowledge in a shared space. In conversation, one person frequently ends what another has begun, and intentions are not so much in the mind as in the action being carried out.[39] De Jaegher and Di Paolo theorize a crucial link between interpersonal coordination and the production of meaning. They define “participatory sense-making,” which is a natural outcome of all social encounters, as:

The coordination of intentional activity in interaction, whereby individual sense-making processes are affected and new domains of social sense-making can be generated that were not available to each individual on her own.[40]

The social foundation of self has been much discussed in sociology, philosophy and psychology.[41] Individuals seek recognition from others, but it is only through involvement with them that they gain a definable individuality. Research on infant behaviour suggests, however, that social agency and motives at birth must form a rudimentary sense of self, a self that recognizes the other.

With this in mind, one sees that to communicate is to negotiate one’s own identity and sociocultural situation, along with that of the other. It involves being acknowledged while acknowledging the other. In human interaction, there is always scope for inter-individual sharing and differentiation, for accord and discord. These antagonistic forces generate a vital tension that is indeed well represented in the social dynamics of jazz ensembles and that partakes in the creative process. Many researchers have described the tensions, common to all musical ensembles, between individual and collective identities and expressive intentions.[42] This dimension is an integral part of musical coordination between improvising musicians.

However, musicians communicate more than just their musical relations. What exactly do they communicate? What is exchanged during performances, between both musicians themselves and their audiences? What do musical signals represent? Clearly, their function is not to represent or to transmit information about the external world. Expected responses to musical performance are above all signs of aesthetic involvement—emotional experience that can translate into body movement (including brain activity) that itself coordinates with the musical signal. It is in this sense that Ole Kühl proposes the notion of embodied meaning, suggesting that musical phrases are reproduced as iconic signs within the temporal structures of individuals’ physical reactions.[43] Musical signs themselves afford synchrony as they call forth tuned-in responses from other musicians. This explains why music lends itself so well to collective practice and shared social experience. This meaningful reaction to musical motifs is, however, never specific or systematic, for it is the context itself that determines the final sense and effect of the musical sign.

Because musical meaning is so dependent on context, it is relevant to study indexical processes in musical works, particularly in performances. Indexicality has been well described in spontaneous linguistic discourse[44] and has been applied to musical interaction by Ingrid Monson and Keith Sawyer, who describe the constant commerce of significant material between jazz musicians in the course of improvised performance.[45] This process of sense-making in improvisation cannot be dissociated from that of synchrony. Sawyer reminds us that the aim in improvisation is not so much the final result as the process itself, the fluctuating degree of convergence of individual orientations within a group voice.

Most forms of musical communication are grounded in collective knowing, acquired through both music-based and language-based socialization processes. It is through the practice of playing together over significant periods of time and through gradual socialization into a particular way of listening that musicians come to embody rules of play and etiquette.[46] At the same time, it is through performance, and through the use and modification of implicitly learned rules by groups of performers, that this musical tradition evolves. Most musical improvisation is based on preexisting structures. The repetition-variation dialectic of all musical idioms is intrinsically a culture-producing process. It provides a basic architecture for intersubjective engagement, one that is evident in the first vocal and gestural encounters of young infants.

New forms emerge and stabilize in the course of interaction, some of which can be attributed to individual plans, and others of which are the result of collaborative shaping. But there is a necessary continuity between emergent and recurrent forms. Creativity is rooted in a history of successfully organized social interaction. The re-uptake of emergent forms of expression builds guiding principles that are at once biologically rooted and culturally shaped, and blurs the boundaries between individual and collective agency and authorship.

We believe that improvised collaborative performance builds a sense of belonging and creates communities of like-minded people. But there is clearly a dynamic, mutually shaping relationship between the interactions of like-minded people and the habits, ideas and values that come to be built into them. Styles of musical communication index the contexts they are associated with; these contexts (constellations of historical, social, cultural and physical situations), in turn, guide the expectations of social agents.[47] In conversational exchange, the processes by which individuals come to establish mutual understanding (monitoring, confirming, repeating, etc.) have been described as interactive “grounding,” whereas the content of mutual understanding (the values, ideas and things talked about) has been described as “common ground.”[48] Common ground is thus shared knowledge that enables people to interact; at the same time, it is being continually transformed through the grounding processes embedded in the interactions themselves.

In the context of improvised jazz performance, synchrony can be seen as a form of grounding.[49] Synchrony is thus vital in building musical common ground, or musical culture. Magnier compared performances of duets by jazz musicians who knew each other well and by those who had never before met. His analyses revealed that although both types of duets used grounding processes in their interactions—that is, although both the duets played by acquainted and unacquainted musicians called upon pre-existing rhythmic, melodic and harmonic patterns—the already acquainted musicians were able to be far more creative and humorous in their exchanges than the unacquainted musicians, who tended to stick more rigorously to common cultural codes. This study highlights an interesting relation between culture and creativity. This line of research merits further investigation, in order to determine more precisely how common ground evolves in the course of interaction, how new patterns emerge and stabilize into shared knowledge.[50]

As a participative practice, collaborative musical improvisation also plays an important role in the construction of identity. When the practice is authentic and based on a search for new meaning, the process is transformative for performers. Alessandro Duranti and Kenny Burrell explore the search for “a unique self” that jazz improvisation not only implies, but requires, by studying the discourse and insights of professional jazz musicians, teachers and students.[51] They show how jazz musicians define their identity through musical and conversational exchange based on a history of narratives of remarkable musicians’ unique styles of playing. By appropriating and transforming the elements of an inherited tradition, they strive to develop their own musical voice within a collective search for meaning, while respecting tacit aesthetic and moral rules. Thus, experienced musicians and less experienced students collaboratively set up and explore the boundaries of a jazz community in which they participate to varying degrees. Practicing performers create what Jean Lave and Etienne Wenger have called “communities of practice,” and which can be defined as domains of knowledge that communities of people are engaged with and for which they develop shared practices.[52] The students of jazz described by Duranti and Burrell are “legitimate peripheral participants”[53]: they are legitimate because they are drawn into the field of shared knowledge and know-how by more experienced musicians and because they acknowledge the implicit “rules of the game”[54] that define the jazz idiom. As their sense of belonging develops and they move away from the periphery, musicians play a more central role in collaboratively shaping and transforming the “rules of the game” and, by so doing, confirm the uniqueness of their musical identities. Because interaction produces shared knowledge, it creates a sense of belonging.

We can link the organization of sound in time to an ethos, to cultural identity and to communities of practice. Jazz musicians both invoke and construct traditions musically by relying on particular signals that are recurrent and emergent, known and new. These are the “licks,” “riffs” and “tricks”[55] that they borrow from the historical repertoires they identify with and that carry value for them. But as each of these musical elements or signs, taken from the history of the tradition, is brought into performance, it becomes an opportunity for individual expression. These signs attach musicians to their communities of practice through the repetition and variation of a shared history and through collective anticipation of a common future, while, at the same time, contributing to the emergence of individual style. These are the signatures of belonging and identity that define individuals and ensembles while situating them within one or many traditions.

In the next paragraphs, we argue that signatures and styles, taken as stabilized forms of negotiated expression, are found in all social interactions. They are strong markers of belonging in conversational exchange. Conversational styles are laden with riff-like features, though these do not always have a clear origin.[56] Distinct vocal styles can be identified within close-knit communities or within families. Styles of talk are also associated with particularly charismatic or notorious individuals who set trends for ways of being in the world that often go against accepted collective norms.[57] The study of style in different domains of human activity—in conversation, music, subcultures, artistic production and fashion—may provide important insights into the ways in which cultural meaning gets built into social and physical worlds and interactions.

Young babies have clear expectations about unfolding acoustic events.[58] Does their listening, in social contexts, gradually become more situated? When do they start to sense the cultural affordances of the communicative expressions directed toward them?

We have described how infants, from the second month of postnatal life, become involved in sustained social exchanges, effectively coordinating multiple modalities in their expressive behaviour. Because language has long been conceptualized as a system of abstract symbols, the question of what sort of meaning infants access before they learn to speak has been largely overlooked. Trevarthen has most cogently addressed the question of infant semiosis, proposing that “protoconversation” (a turn-taking exchange with a non-verbal infant) is, at its core, a semiotic activity.[59] Infants intend meanings and they do so through interpersonally coordinated timing that is motivated, from birth, by the desire to share meaningful experience. There is a whole world of meaning before language which coexists and is in continuity with symbol-based meaning throughout life. Linguistic meaning is perhaps grounded in this realm of “fluid,” “non-discursive” meaning.[60] A number of theorists in developmental psychology have focused on the capacity for joint attention and proto-declarative pointing as intentional, attention-orienting behaviour that forms the basis of language acquisition.[61] But careful study of the temporal aspects of social interaction in the first months of life provides some indication that communicative partners intentionally orient their attention much earlier, and that they do so through a musical semiotic process. Cultural learning cannot emerge solely as a function of language or proto-linguistic behaviour; it must be built into the very fabric of social engagement from the beginning of an individual’s life.[62]

There is some evidence that infants and adults develop specific styles of engagement through repeated social encounters, and that these are related both to broader cultural habits conveyed by enculturated adults and to idiosyncratic practices.[63] Tonya R. Bergeson and Sandra E. Trehub, for example, show that mothers use specific signature tunes when talking to their infants.[64] These tunes not only identify them as individuals but also identify ways of expressing emotion in particular contexts. Another study has shown that infants, by the age of two months, have become familiar with and sensitive to their own mother’s way of timing her behaviour when socially engaging with them. Indeed, two-month-olds are less responsive when interacting with an unfamiliar partner whose timing is either faster or slower than that of their own mother, but respond more to an unfamiliar partner whose timing is close to that of their mother.[65] Synchrony, at least interactional expressive synchrony, thus implies a style. There are ways of being together in time. This is well described by the jazz musicians interviewed by Monson.[66]

Implicit cultural practices, such as conversational patterns, were also found to identify styles of social interaction between mothers and infants. A study of vocal interaction between two- to four-month-old infants and their mothers in France, India and the United States showed, on the one hand, that in these three cultural contexts, mothers and infants rely on the temporal processes of pulse, phrasing and narrative to frame their exchanges, and, on the other hand, that their interactions present a number of cultural specificities. Indian mother-infant interactions were most different from those in France and North America. They were characterized by greater overlap between mother and infant (less of a turn-taking format), less verbalization on the part of the mother, more vocalization (perhaps surprisingly) on the part of the infant, and a much shorter between-speaker pause, indicating that Indian mothers and infants, on average, wait less time before taking a turn than do mothers and infants in the other two contexts.[67] The between-speaker pauses in French and North American mother-infant dyads corresponded roughly to the between-speaker pauses typically found in adult verbal conversation in these two cultures.[68]

Synchrony in social encounters must therefore be seen as the necessary condition for participatory sense-making that entails emotional rapport (positive or negative) and the collaborative process of linking past, present and future based on a shared semiosis.[69]

* * *

Synchrony is ubiquitous in biological systems and physiological processes.[70] In its strictest sense, it relies on coordination, that is, correlations between two regularly oscillating systems. In this paper, we have described processes of synchrony between human agents: on the one hand, young infants who have very little cognitive and communicative competence but who are equipped from birth with the fundamental competency to achieve synchrony with competent social partners; on the other, improvising musicians whose competencies are highly developed in specific domains involved in collective music making. It is our contention that human synchrony, from its earliest to its most sophisticated forms, is unique in that it relies on imperfect or expressive rhythmic coordination involving subtle moment-to-moment adjustments between interacting agents who continually situate themselves in time in relation to each other. Coordination between human agents can also be achieved through temporally contingent forms of imitation, expressive matching and anticipation that are highly significant. This leads to a coordination of ideas and/or feeling states that builds on and contributes to a shared culture. Synchrony in human interaction is thus seen as a primordial basis for the elaboration of an intimate, non-verbal culture that is itself a precondition for language-based cognition, communication and culture, considered the hallmarks of our species.