

While most interpreters will agree that they need to see the speaker and the audience in order to interpret well, most cannot explain why, and so far there has been no large-scale study able to conclusively prove the necessity of visual input. Visual input is not limited to nonverbal communication: it extends to a complete view of the venue, the audience, and visual presentations, all of which provide additional information to the interpreter. Nonverbal communication, however, as an integral part of communication, is certainly one of the largest and most fascinating sources of information for the interpreter. This paper will first look at nonverbal communication and how it relates to simultaneous interpreting. After providing this theoretical background, I will discuss how visual nonverbal information can be interpreted and present the findings of a small-scale experiment on the relevance of visual input in simultaneous interpreting.

Nonverbal Communication

Oral communication, the working tool of the interpreter, consists not only of what is said, but also of how it is said – with a sullen face and an irritated tone of voice, or with a smile and a nod. Poyatos (1997b: 259) describes oral communication as follows:

[…] trying to be simply realistic in our approach to speech, we must recognize that what has been called orality is produced in reality in an aural and visual manner through the combination of internal (phonetic) articulations and sound modifications, and external articulations that depend on our facial and bodily anatomy.

Oral communication is, in fact, the combination of three elements that act together to facilitate the process of communication – verbal language, paralanguage (pitch, intonation, pauses, volume, etc.) and kinesics (Poyatos 1987). These elements can occur individually or in different combinations, fulfilling a number of different functions. The entire gamut of visual nonverbal communication encompasses not only kinesics (gestures, facial expressions, gaze direction), but also proxemics (interpersonal distance), cultural and social traits (clothing, socially determined behavior) and certain visible physiological reactions (such as blushing or tear-shedding), since they, too, convey a message (Poyatos 1997b). Nonverbal communication signals always convey information, but this sending is usually not deliberate or fully controlled by the sender, and the information is usually also not received or decoded consciously by the recipient (Argyle 2002, Bühler 1985, Scherer [1977]/1984). This may account for the difficulties interpreters have when asked which signals aid their task most, and also makes research in this area difficult (cf. Bühler 1985).

While paralanguage is an important aspect of nonverbal communication – in fact, oral communication cannot occur completely without it – it is also that part of nonverbal communication that is accessible to the interpreter even when he is deprived of visual input. It may also often be the only way for the interpreter to convey what has been expressed kinesically by the speaker.

Some forms of visual nonverbal communication (e.g., clothing, blushing) provide contextual information rather than a message that would have to be interpreted. When we think of visual nonverbal communication in the context of interpreting, however, we usually refer to body language, or kinesics, defined by Poyatos (1987: 88f) as

the conscious or unconscious psycho-muscularly based body movements and intervening or resulting positions, either learned or somatogenic, of visual, visual-audible, and tactile or kinesthetic perception, which, whether isolated or combined with the linguistic and paralinguistic structures and with other somatic or objectual behavioral systems, possess intended or unintended communicative value.

Kinesics can be further divided into gestures – conscious or unconscious movements of the head, face (including eyes) or limbs, which play an important role in communication (e.g., smiles, gaze movements, a hand gesture for emphasis); manners – mainly learned and socially ritualized according to specific situations (e.g., the way we greet others); and postures – more static and also codified by social norms, although they may reveal cultural background and mental attitudes (e.g., boredom, tenseness).

The three elements of the “triple structure language – paralanguage – kinesics” (Poyatos 1987: 76) can occur in different combinations. While visual nonverbal communication (such as gestures) can occur without accompanying verbal communication, verbal signals never occur completely independently of paralanguage and kinesics.

Nonverbal signals can fulfill a number of different functions in relation to the verbal message they precede, accompany or follow. They can add information, support, repeat, emphasize, de-emphasize or even contradict what is being said verbally, and, in the case of kinesics, they can even be used instead of words, either as an economy device or because the speaker is at a loss for a word (cf. Poyatos 1997b: 258f.).

Morphological and functional categories of nonverbal behavior

The nonverbal elements described above can have a number of different functions, which shall be described here in some depth, as it is precisely these categories of visual nonverbal information that may have to be verbalized or otherwise expressed in simultaneous interpreting. This applies mainly to visual elements; however, these may at times be accompanied or substituted by paralanguage. There are three main categories: emblems, illustrators, and adaptors (Ekman and Friesen [1972]/1984, Poyatos 1987: 93-97, 1997b: 267-270).

Emblems are nonambiguous gestures (sometimes paralanguage) with a clearly defined meaning within a culture. They can be replaced by a word or phrase and vice versa (e.g., “Okay,” “Stop,” “Yes,” “No,” “Hush”). While some gestures are nowadays recognized worldwide (mainly through the mass media), the same gesture can have completely different meanings in different cultures or even subcultures. Emblems are usually conscious signals intended to convey a message and can be understood unambiguously by a listener of the same culture. They can occur completely independently of verbal language, e.g., in situations where verbal communication is not possible due to distance, noise etc.

Adaptors are a large group of gestures with which we contact ourselves (self-adaptors, e.g., thoughtfully rubbing one’s chin), other people (alter-adaptors, e.g., greetings and goodbyes, patting someone on the shoulder), objects (object-adaptors; e.g., twirling one’s pen, shifting the papers on the table) or objects or substances relating to our body (body-adaptors; e.g., unconsciously twirling a ring on one’s finger). Not all gestures that involve contact are adaptors; they may also be emblems and some types of illustrators (see below).

Illustrators are elements of kinesics and paralanguage that refer directly to verbal language by emphasizing, adding information, or substituting the verbal message. Unlike emblems, they are closely linked to verbal communication and usually do not have a direct verbal equivalent; in fact, it can be very hard to express them verbally.

Illustrators are divided into several overlapping categories. Their number varies slightly depending on the author. I have used Poyatos’ classification (1987: 94-97, 1997b: 267-270) and referred to Ekman and Friesen ([1972]/1984: 113f.), on whose work Poyatos’ categories are based as well:

Language markers are always present in discourse to some extent. They are closely related to culture; they are, in fact, what Poyatos (1997b: 268) calls a “visual accent.” Language markers are the small movements of head, hands and face that accentuate and punctuate language, following the rhythm of the speech and the contents of the message. Their function can be pronominal (e.g., gestures to indicate “me,” pointing with nods or gaze to someone who is present or even to a person who is not present), prepositional (e.g., brief gestures or expressions accompanying “until,” “but,” “without”), conjunctional (e.g., gestures for “however”) or verbal (mainly temporal, sometimes modal), and they can also further accentuate the message or follow the intonation (nods, facial expressions, hand gestures).

Space and time markers illustrate size, distance, and location (“over there”), and time (past, present and future), and can also illustrate the duration of an event.

Deictics point to a person, place or point in time, and are often accompanied by paralanguage (stressing the referent, e.g., “This conference is the best one ever”).

Pictographs, echoics, kinetographs and kinephonographs imitate their referents. Pictographs draw the shape or contour of a physical referent (sometimes out of verbal deficiency, but they can also occur together with words), e.g., a spiral staircase, while kinetographs depict movement, e.g., wave motion, cranking. Echoics imitate a sound, be it through paralanguage (“vroom” for a starting motorcycle) or nonvocally (finger-rapping for a galloping horse), and kinephonographs imitate both movement and sound (e.g., a throwing motion and whistling sound to describe the throwing of a stone).

Ideographs and event tracers trace the direction of a thought or the development of an event and may be accompanied by paralanguage. Ideographs can depict a heroic deed or a beautiful work of art, while event tracers can describe the coming and going of a person or the excessive duration of a meeting.

Identifiers give bodily form to abstract concepts (“absurd,” “impossible”), moral and physical qualities (“cold,” “cautious,” “tough”) and qualities of objects or the environment (“dirty,” “smooth”). They can occur together with verbal language or alternate with it, and are at times more expressive than words.

Externalizers do not illustrate words but are reactions to “other people’s past, present, anticipated or imagined reality, to what has been said, is being said or will be said, silenced, done or not done by us or someone else, to past, present, anticipated or imagined events, to esthetic experiences and to spiritual experiences” (Poyatos 1987: 96f.). They are frequently unconscious reactions, and need not, or should not, always be verbalized by the interpreter (e.g., a speaker’s nervousness despite his efforts to suppress it) (cf. Ekman and Friesen [1972]/1984, Poyatos 1987, 1997b).

The interpreter and the nonverbal

While practitioners and researchers agree that simultaneous interpreting should convey not the words but the message, words are still widely considered the only source of this message. Often only background information and general knowledge are mentioned as extralinguistic elements (Viaggio 1997: 283), while nonverbal input is mostly ignored. Considering language and communication in the context of the “basic structure language – paralanguage – kinesics,” however, it becomes evident that the interpreter consciously or unconsciously receives other forms of input than merely the verbal.

Interpreting should facilitate communication not only at the linguistic level, but also at the cultural level (cf. Kalina 1998: 38). Culture-specific expressions and gestures cannot be conveyed word for word; instead, the interpreter should consider the culture-specific and individual knowledge of the sender and recipient and convey the information in this light (Reiß and Vermeer 1984: 65). Bühler (1985: 49) summarizes the task of the conference interpreter as follows:

Conference interpreting is not merely a question of repeating words or phrases in another language, a question of code switching, but rather a question of understanding and making oneself understood, of assuming responsibility for the success of the communication process as a second sender in a communication channel that is interrupted because original sender and receiver use different codes. The task of the interpreter is therefore to convey a message, the message of the original sender in real time without loss of information content.

In order to understand the context and convey the meaning of the message, the source text of course has to be understood and analyzed within a very short time span.

The tasks (or efforts) of listening and analyzing, storing information, and producing a target language speech (Gile 1991, 1997) usually occur more or less simultaneously: the interpreter conveys one part of the message while already hearing, analyzing and storing the next one. This means that interpreters always work with split attention. Since our processing capacity is not unlimited, it needs to be divided among these three efforts as necessary. If the source text is hard to understand – either because of its content or due to poor acoustics, a heavy accent, etc. – the interpreter has to focus more on receiving and analyzing information, leaving fewer resources for storing and conveying it, which can lead to problems or mistakes in target text production (cf. Gile 1991). Since analysis and understanding of the source text is the crucial part, interpreters should make use of any information that can make this process easier or faster (Bühler 1985: 51). As nonverbal signals are usually decoded subconsciously, they do not produce additional cognitive stress but rather support the cognitive process (Viaggio 1997: 289).

As noted above, the interpreter does not convey words, but meaning. Since not only the speaker’s words, but also his paralanguage and kinesics carry meaning, they should also be considered. As nonverbal communication can not only supplement the verbal message, but may in some cases be the sole conveyor of part of the message, a lack of visual input would deprive the interpreter of relevant parts of the source text, making it harder or even impossible to understand, or leading to misunderstandings.

Keeping in mind all the different ways in which kinesics can interact with verbal and paralinguistic signals

[…] it would seem that the translator would have to “keep an eye” on the speaker’s total speech, lest he should miss something which has been said kinesically only. This amounts to saying that an interpreter can translate visually and not only audibly, in other words, that he is not only the translator of verbal language, but of the whole triple structure.

Poyatos 1987: 91

A clear view of the speaker not only allows for a better and easier understanding of the speech itself, but also functions as a “backup” for information. If the interpreter misses a part of the acoustic input (e.g., because of technical problems, a heavy accent, or focus on the output), he may be able to receive this information through the visual channel, e.g., if he missed the name of the next speaker but sees the chairman’s hand gesture as he gives the floor to the next speaker (provided the delegates’ names are known to him) (Strolz 1992: 90f.).

Since nonverbal signals often precede the verbal message, they can be a valuable source of information for anticipating the message (Poyatos 1997b: 261, 271). If, for instance, the speaker sighs and shakes her head slightly just before beginning to speak, the interpreter can expect a negative answer. Kinesics can also indicate who will take the floor next, e.g., through a nod and hand movement by the chairman, or if someone sits up straighter and makes some notes. Bühler (1985) notes that interpreters appear to be particularly sensitive to these “speech-preparatory movements” or “turntaking cues” (cf. Scherer & Wallbott 1984). The interpreters in Bühler’s study said they could usually guess who was going to speak next. This may also warn the interpreter of a language change and give them a few moments to prepare (Bühler 1985). In a discussion with more than two participants, hand gestures, nods or gaze direction can indicate who the words are directed at – for the words themselves may at times only indicate approval or disapproval (e.g., “I must disagree with my colleague here”). This knowledge can prepare the interpreter for the argument to ensue (Viaggio 1997).

Visual nonverbal communication, however, is not the only reason why a clear view of the conference room is necessary. “Verbal visual” information (Pöchhacker 1994: 98), such as statistics or quotes in PowerPoint presentations, is often presented at high speed, but the information may be redundant – if pressed for time, the interpreter may opt for “As you can see in this image” instead of verbally repeating what the audience can see anyway (cf. Bühler 1980: 49, Alonso Bacigalupe 1999: 135, Eder 2003: 107). It has also been demonstrated that the visibility of lip movements aids understanding even when the listener does not know how to read lips, in particular if the sound quality is limited (Massaro et al. 1993: 446). It remains to be determined whether this plays a large role in simultaneous interpreting, where the distance between speaker and interpreter can sometimes be very large; Bühler (1980: 47) assumes that this may very well be the case.

It is important for the interpreter to not only see all parties to the communication process, but also to have the same visual information they have. Information derived from context or situation that is obvious to the audience will usually not be referred to explicitly, but references to it can be quite confusing for an interpreter who does not have access to this information (cf. Strolz 1992: 90). Such information can be the fact that papers are being distributed, the behavior of other panelists or the audience, or even something outside the window, if the speaker decides to refer to it.

Seeing the audience’s body language is also often the only way for an interpreter to know whether the message is getting across and the only way to get feedback. In a normal communication situation small cues indicate that the listener is paying attention and understands. A lack of these cues is a source of anxiety. While the interpreter is not the primary sender of the message, listener feedback is nevertheless important to them – perhaps even more important, as Viaggio (1997: 288) notes (cf. Poyatos 1987, Viaggio 1997).

How to interpret nonverbal communication

In a simultaneous interpreting situation, the listener usually sees the speaker and his body language while hearing, with a short delay, the verbal language and paralanguage of the visually absent interpreter (cf. Pöchhacker 1994: 100, Poyatos 1997b: 252, Weale 1997: 295). Due to cultural differences, it may be hard for the listener to correctly decode the speaker’s kinesics; they may not be perceived at all, they may not be understood, or in the worst case, they may be misunderstood. This is particularly the case with emblems, since they can carry much meaning, but it can also apply to other types of kinesics. Since interpreting should convey the entire message, and nonverbal communication is an essential part of communication, nonverbal elements should be conveyed in the target language in one way or another. This goes particularly for emblems that are not verbalized in the source language.

Nonverbal thanks in Indian cultures is verbalized in German, e.g., as dankeschön! (since the target recipient expects verbal thanks and might otherwise assume impoliteness). […] Under certain circumstances, accompanying gestures must be expressed verbally or paraverbally, e.g., a gesture of resignation through intonation. […] Gestures are interpreted, not transcoded.

Reiß & Vermeer 1984: 65

Of course most parts of nonverbal communication are not perceived or decoded consciously, but they still influence the listener’s understanding of the message (Argyle 2002: 17). Naturally, only those nonverbal elements that the interpreter perceives consciously can be interpreted. Whether or not to interpret an emotion depends on whether it is intended by the speaker (e.g., irony, benevolence) or unintended (e.g., nervousness), and it is up to the interpreter to judge whether conveying it verbally or through paralanguage would be in the speaker’s interest or whether it would constitute an invasion of the person’s privacy (cf. Poyatos 1997b: 255). Interpreting nonverbal communication is certainly important if it adds information by stressing or substituting words, in particular if the interpreter can assume the listener would not decode this kinesic statement correctly. When kinesics are used out of verbal deficiency, it is up to the interpreter to supply the word verbally in his interpretation (cf. Poyatos 1997b: 259). When a gesture in the source language exists in the target language with a different meaning, or when it would likely not be perceived as a kinesic message at all, it would make sense to provide “what we might call ‘oral footnotes,’ that is, the verbal replacement for what has been expressed nonverbally in a way the native listener could decode, but not the foreign one” (Poyatos 1997b: 262). This, of course, requires the interpreter to have a good knowledge of emblems and other kinesic expressions in both the source and the target languages. Above all, he should be aware of gestures that can typically be misunderstood by the target language listener – the kinesic equivalent of “faux amis” at the verbal level (Poyatos 1997b: 267). The most obvious example of such false cognates is the reversal of nodding for “yes” and shaking your head for “no” in some cultures, and there are a number of emblems that convey a perfectly innocent message in one language but are severe insults in another.

Emblems that accompany a verbal message (e.g., “I don’t know” accompanied by a shrug) are part of the overall message and do have a certain effect, but they may be redundant, and if pressed for time, the interpreter may choose to omit them. In Viaggio’s (1997: 287) words: “More often than most interpreters realize, the speaker’s body language has made his own words – and therefore the interpreter’s – redundant.”

The gestures that most frequently require verbal interpretation are emblems, identifiers, and pictographs, if they are used instead of words. However, not all forms of nonverbal communication will or can be conveyed verbally. Since the simultaneous interpreter is usually not visible to the audience, the message he has received as a compound of verbal language, paralanguage and kinesics can only be expressed vocally, i.e., with words and paralanguage. Words alone often do not suffice to express what has been conveyed nonverbally. A simple “What?” can have a number of different meanings, depending on the kinesics and paralanguage that accompany it. Verbalizing the surprise or anger that may be expressed with that one word would be time-consuming and often near impossible. But while the interpreter cannot convey the message by kinesic means, he can nevertheless use his own paralanguage to create a similar effect.

Nonverbal Communication and Visual Input in Simultaneous Interpreting: An Experiment

Based on the hypothesis that visual input plays an important role in facilitating understanding in simultaneous interpreting, I conducted an experiment at the University of Vienna in March 2004 in order to determine whether there were any appreciable differences in interpreting with and without visual contact. While an experiment can never completely recreate a natural situation for the subjects, a number of realistic elements were introduced to make it less artificial.

As Kurz (1996) shows, interpreting from video screens (remote interpreting or pre-taped material) leads to higher fatigue than interpreting in a live situation. Additionally, it can be assumed that a video image will not always give access to all the visual information that would be available to the interpreter in a live situation (Anderson 1994). I therefore opted for a more realistic interpreting situation by choosing a live speaker. A native speaker of English was asked to give two short speeches (each approximately 10 minutes) in front of an audience. The presence of an audience both gave the test subjects someone to interpret for, thus making the situation less artificial, and made the communication situation more natural for the speaker. The speeches were largely unprepared and spontaneous, with few written notes and no visual material (such as overhead projections). Balzani (1990) shows that errors are more common in interpreting speeches that are read aloud, since speakers who read their papers usually have a higher presentation speed and a less pronounced prosody and use fewer kinesic signals. Written and spoken communication also differ in form and complexity, which often makes pre-written speeches more complex than a freely delivered presentation. The speaker was given complete freedom in her choice of topics in order to ensure her own interest in the topic and in transmitting a message, rather than having to give a speech on a topic that did not interest her and that she might have presented with less enthusiasm and possibly less pronounced body language.

The speaker was aware of the purposes of the experiment, but she was not instructed to use an especially large amount of body language. This of course meant that it was impossible to know or influence the amount and types of nonverbal elements in advance. While it would have been interesting to introduce particularly hard passages or even “traps” or to try to achieve a balance in the use of different nonverbal elements, it must be taken into account that a large part of our nonverbal communication is sent unconsciously and often decoded unconsciously as well. Had the speech been practised in advance with all its nonverbal elements, as in Alonso Bacigalupe (1999), the usually unconscious movements and expressions would have had to be performed consciously, thus making the situation artificial, or they would have been left out. For the purposes of this experiment a natural speech with all its conscious and unconscious visual nonverbal elements seemed the best solution.

Since in interpreting research it is only possible to observe the product, while the processes that created it remain largely hidden, it would be folly to assume that the lack or presence of a piece of information in the interpretation has a causal relationship with the lack or presence of visual input. There are many other reasons for information loss or errors in the interpretation, such as a large time lag, acoustic or lexical problems, comprehension difficulties, and the skill of the individual interpreter. Therefore my study only contrasts and comments on input and output, and highlights situations where visual information may have had an influence on the output.

Material and Subjects

Two English-language speeches of approximately 10 minutes each were given by a native speaker of English and interpreted into German. The topics of the two speeches were of a rather general nature, so as not to present any major difficulties of terminology or content; the aim was to avoid interpreting problems caused by the message itself rather than by its presentation and the input provided. The subjects of this study were two graduates and three advanced students of the Department of Translation and Interpreting of the University of Vienna. One of the students was male; the other participants were female. All subjects had German “A” (with one exception where German was the language of habitual use but studied as “B”) and English “B.”

The topics were announced one day ahead of the experiment; due to the nature of the topics no large amount of terminological preparation was required. The speaker mentioned any terms that might be unusual or might cause difficulty at the beginning of the experiment. The subjects were aware of the goal of the experiment, but the exact manner in which the experiment would be conducted was only revealed at the beginning of the experiment.


The subjects were divided into two groups, group A with 3 and group B with 2 members. The windows of three of the interpreting booths were covered to deprive the interpreters in those booths of visual input (referred to as “blind” booths in the following). Group A interpreted speech 1 without and speech 2 with visual input, while group B interpreted speech 1 with and speech 2 without visual input.

After the experiment, the subjects were asked to complete questionnaires in order to determine the benefit of visibility and the usefulness of different forms of visual input.

The speeches and interpretations were recorded digitally as audio files and later transcribed; the speeches were also recorded on video and analyzed to identify the kinesic elements.

Analysis and Discussion

In the descriptive analysis I sought to identify types of visual nonverbal communication that were particularly important for the understanding of the message. I assumed that these would mainly be emblems and some types of illustrators (kinetographs, deictics), since although adaptors, language markers and externalizers can provide information on the speaker, they contain no information that would have to be verbalized. The exceptions are situations where language markers overlap with the categories of space and time markers and deictics (e.g., if two persons or facts are contrasted and the language markers for the first are always made on the speaker’s left side and those for the second on the right). In such cases they may be helpful for differentiating between the two, and I considered them in the analysis where it appeared that there might have been a connection between language markers in the original speech and prosodic features in the interpretation.

Overall, no significant positive or negative influence of visual input could be found, with a few noteworthy exceptions.

The speaker began her first speech with “Good evening, ladies and gentlemen… ladies and gentleman” (followed by a little chuckle), using deictics to indicate first the audience as a whole and then the only male person in the audience. This is a good example of the importance of seeing the entire room and the audience, since this utterance only makes sense if the interpreter is aware that there is only one male person present. The three interpreters in the “blind” booths ignored this correction on the speaker’s part and used the standard greeting “meine (sehr verehrten) Damen und Herren,” as did one of the interpreters with visual input. The other one, however, interpreted “Guten Abend meine sehr verehrten Damen und Herren… meine sehr verehrten Damen und mein sehr verehrter Herr,” with an additional emphasis on the singular “Herr,” thus conveying the humorous note of the original speech.

In one instance, a pronominal language marker that merely emphasized “my” (which was not emphasized with paralanguage) was converted into a paralinguistic emphasis by the subjects with visual input.

In another instance, the interpreters with visual input apparently verbalized a completely nonverbal expression. The speaker followed a statement that expressed criticism and doubt by tilting her head, raising her eyebrows, making a slight “hand-shrug” and pressing her lips together, followed by a light tongue-click. Both subjects who had visual input expressed this gesture verbally, apparently without being conscious of doing so (when told of it later, neither had a memory of it): “Also ob ihnen das nun zugute kam, ist eine wichtige Frage.” (B1); “Es wird nicht sehr gut geheißen” (B2).

In another part of the speech, the speaker’s opinion on the subject matter was quite apparent in her gestures and particularly her facial expression, which conveyed disapproval and exasperation. This was of course also apparent from her paralanguage, but her facial expression and gestures emphasized her feelings. Whether or not the speaker’s feelings should be conveyed by the interpreter should of course be decided by the individual interpreter depending on the situation; in this case, however, the speaker made no secret of her opinion, reinforcing her words by nonverbal means. This is one of the few cases where externalizers can influence the interpretation. It is hard to determine conclusively whether this was really the case, but the interpreters with visual contact emphasized this statement with their paralanguage and appeared to convey the speaker’s “mood” to some extent.

The interpreters with visual input also appeared to have fewer problems following several particularly complex statements in which two sides of an argument were separated spatially with language markers, and they made a similar kind of separation of these parts with their paralanguage. In one instance, language markers clearly distinguished between “effects” and “actual events” in the original speech. While the interpreters in the blind booths seemed to have trouble understanding the statement, the group with visual contact expressed this juxtaposition quite clearly, if in a slightly summarized way. They also placed an additional emphasis on the “effects” in their interpretation.

Since the second speech was about sign language, there were some cases where visual contact was quite obviously helpful. In one case, the speaker described the sign language gesture for “thank you” verbally (“holding your hand to your chin and moving it forward such- like this”) while at the same time signing it, in fact repeating the gesture four times (twice probably unconsciously). It should be noted that this gesture is of course a word in sign language and not an emblem, since it does not stand for a word but is a word. Within the context of this speech, however, the movement is illustrated, and can therefore be considered a kinetograph. Since it is made redundant by the words, the verbal information would have been enough for the purposes of the interpretation. However, one of the subjects in the “blind” booths stated in the questionnaire that it bothered them to know a gesture was being made and not to be able to see it. One of the interpreters with visual contact opted for using this redundancy as an economy device and only referred to the gesture, which she knew was visible to the audience, by saying “wie ich grade meine Hand bewege,” instead of describing it verbally as the others did.

In one case the visual input may have been misleading. Here part of the verbal message was hard to understand, and the kinesics contradicted this message. Interestingly, the “blind” booths interpreted this part correctly (possibly due to background knowledge), while the subjects with visual contact interpreted it incorrectly, possibly misled by the gesture. This may be an example of the fact that in cases of conflicting information, the nonverbal is considered more reliable than the verbal (cf. Bugental et al. 1984, Massaro 1993).

In many instances, the visual nonverbal information was redundant, since the information was contained in the verbal message as well. Here it was often difficult to judge the influence of visual input, as the information was conveyed by subjects from both groups. There are several cases where the group with visual contact and one subject from a blind booth conveyed information present in both the verbal and the nonverbal material, but it cannot be determined conclusively whether the visual nonverbal information was helpful. One case where redundancy may have been beneficial is a passage where the verbal information was very hard to understand due to speed and unclear pronunciation. The speaker was explaining that a certain television program would become available, “but only via satellite” – this last bit added quickly, as an afterthought. This was accompanied by pointing upwards with her right index finger and glancing briefly upwards. This information was only present in two interpretations – one from group A and one from group B. Due to this distribution it is hard to say whether the gesture facilitated understanding, as the subject in the blind booth (group B) had obviously received the information from the verbal message. However, the gesture may have supported understanding for the interpreter from group A.

The questionnaires showed that interpreting without visual contact was harder for the subjects, required more concentration and led to anxiety, in particular because they were afraid of missing something – especially when they knew the speaker was using a gesture which they could not see. This seems to corroborate the findings of Bühler (1980) and Kurz (1996). Hand gestures and facial expressions were considered the most important sources of information, and the main functions of visual nonverbal communication were described as facilitating understanding and emphasis.

While objectively very few indications could be found to support the theory that visual nonverbal communication aids understanding, the subjective assessment by the subjects indicates – as in Bühler (1980, 1985) and Kurz (1996) – that kinesics are considered helpful and that interpreting without visual contact apparently requires more concentration and may lead to more stress due to the feeling of missing out on information. In a task that in itself requires such a high degree of concentration, it would appear important to avoid any additional stress.

The above results should not be generalized, since the scale and scope of the experiment were too small, and the subjects had little or no experience as professional conference interpreters. Differences in the level of skill of the subjects may also have influenced the results.

The findings of the study suggest that while for the most part, visual input appeared to have no appreciable positive or negative effect, certain types of visual information may support the interpreter, either by providing additional information that aids the understanding or by repeating the verbal message, thus making the processing of the message easier by providing redundant information on two input channels.

Future research into this field could include a closer examination of kinesics that replace verbal information partially or completely, add information, illustrate it, or refer to the conference venue or audience. In addition to emblems, kinetographs, pictographs and illustrators, which are all very expressive forms of nonverbal communication, it might be interesting to examine the role of language markers, which have so far been widely neglected in the context of simultaneous interpreting. While it bears repeating that they will not typically occur without accompanying paralanguage, it does seem possible that they can help with visualizing the structure of complex sentences and arguments. It would also be interesting to look at the effect of language markers on the paralanguage of interpreters in order to see whether interpreters emphasize utterances that are emphasized with language markers more than those that are only emphasized prosodically.


While there is as yet no conclusive proof to support the claim of many interpreters that they need to have a full view of the conference room, several studies indicate that a lack of visual input may increase stress and fatigue. The extent to which the interpreter takes advantage of this input depends largely on the individual’s style; some interpreters prefer to close their eyes during difficult passages. While studies show that visual nonverbal information is often redundant, it can nevertheless aid the processing of information. Visual contact can certainly be of importance when the verbal message refers to the audience or the conference room or when the nonverbal element adds information that is not present in the speech. This may also be the case with some types of language markers, which are usually accompanied by vocal stress patterns but can serve to visually structure complex sentences.

The interpreter may not need all the information he receives through the visual channel; however, this is hard to determine since a large part of this information is received unconsciously and may affect the understanding of the source text or the delivery of the interpretation without the interpreter being aware of it. It should ideally be up to the interpreter to decide which visual information to utilize and which to ignore, rather than depriving the interpreter of this communication channel entirely or partially (e.g., in remote interpreting). By the same token, I would suggest that there are benefits to using visible live speakers (as opposed to audio or video tapes) in interpreter training. In addition to making the training situation more natural, providing visual contact and teaching the students to utilize this information, it could also teach them to make their own text production livelier by using the entire triple structure of oral communication – language, paralanguage and kinesics.