Sound Synthesis, Representation and Narrative Cinema in the Transition to Sound (1926-1935)

  Maurizio Corbella, Università degli Studi di Milano
  Anna Katharina Windisch, Universität Wien

The trouble about these beautiful, novel things is that they interfere so with one’s arrangements. Every time I see or hear a new wonder like this I have to postpone my death right off.

Mark Twain (1906)

September 29, 1906, New York City: in a building on Broadway and 39th Street, a huge electronic instrument called the Telharmonium [1] delivered a particular kind of musical performance. From that day on, and with a certain regularity until World War I, live music played on the Telharmonium keyboard was transmitted four times daily to various public and private venues via telephone cables: a number of cafés, restaurants, theatres and private homes, including Mark Twain’s, were filled with dynamo-generated music, or rather Muzak, as someone later called it (Lincoln 1972; Chadabe 1997, p. 3). The reporter covering the event for the New York Times chose the adjective “magic” (Anonymous 1906) to evoke two aspects of the musical event: on the one hand the effect (still novel although not unprecedented [2]) of experiencing music acousmatically (Chion 1999), “searching the room for the source of the sound” and locating it in “the bank of flowers in the centre, where, hidden in the depths is a horn” transmitting “full organ tones” (Anonymous 1906); on the other hand, the astonishing fact that the music was not produced by either acoustical instruments or a phonograph at the other end of the cable:

For all that has been seen the music might be the reproduction of a wonderfully versatile and sweet-toned phonograph. But this is not reproduction. On the contrary, it is the original production of music in a simple telephone receiver. The secret of what produces the sounds is in the hidden chambers below

Anonymous 1906, emphasis added

Through the suggestive images of the “musical flowers” and the “hidden chambers below” concealing the somewhat eerie secrets of the Telharmonium, [3] this report provides us with a paradigmatic scenario that can subsume the cultural implications of sound synthesis throughout the following decades: a scenario in which sound synthesis would function as a “magic” vehicle for fantastic synaesthetic imagination (e.g. flowers producing music) and perturbing uneasiness (e.g. hidden chambers shielding mysterious secrets).

“Sound synthesis” is an expression that neither the New York Times reporter nor the Telharmonium’s inventor Thaddeus Cahill would have used: the terminology for electronic and computer music did not in fact become established until the 1960s. In this essay we use sound synthesis in its broad etymological meaning (syn+thesis = put together, combine) to stress the creation of novel sound objects through the transformation, assemblage and combination of heterogeneous input signals, such as electro-magnetic waves, light and sound. Sound synthesis is often understood in the stricter sense of generating acoustic signals through procedures of calculation, such as algorithms and electronic methods (additive, subtractive, filter synthesis, etc.). Montage-related and re-recording techniques, which were crucial to musical currents such as tape music and musique concrète, would then be excluded from the strict definition of synthesis. Nonetheless we believe that in popular culture (within which cinema plays a major part) such distinctions are less significant, and these typologies can be subsumed under the broad category of synthesis: after all, both sound montage and re-recording, which have played a fundamental role since the very beginnings of sound cinema, are conceptually related to the notion of sample-based synthesis.

After the Telharmonium, which can legitimately be considered the first synthesizer, [4] a range of electric synthesizers proliferated in the 1920s, such as the Theremin, the Ondes Martenot [5] and other, less famous prototypes. Meanwhile, in the film studios that were perfecting the art of synchronizing sound and image, a form of optical, non-electrical synthesis was devised: drawn sound, commonly described by sound engineers of the period as “synthetic sound,” a technique for generating and controlling sound through “graphic markings made directly onto film” and “played back using a film projector and a conventional sound system” (Davies 2001).

As is evident from this short overview, the development of sound synthesis techniques underwent a particular intensification in the period of cinema’s transition from silent to sound film (ca. 1926-1935) (Beck 2011, pp. 67-70). In this essay we explore the way sound synthesis intersected with this fundamental discontinuity of cinema’s history and complemented, through its symbolic connotations, the construction of sonic realism in narrative films. We suggest that synthetic sound went to the core of the representational and narrative paradigms of the production of sound films, still in their infancy, and functioned as a largely overlooked counterpart to sound reproduction. Sound reproduction has been investigated in recent years [6] as a vehicle for the formation of notions of realism in narrative sound cinema. A similar interrogation with regard to sound synthesis will show how narrative realism was indeed balanced by a paradigm of the uncanny that enabled cinema to reflect on its own nature as a medium. Sound technologies had co-existed with cinema from the very beginnings of media culture, when, rather than discrete domains, the media were “nebulae” of encounters, experimentations and negotiations between different forms of expression and performance (much like today, we might add). [7]

This essay will thus develop an idea of sound synthesis as a cross-medial vector capable of informing the cultural imagery of the early sound film era. It is structured in four parts. We start by elaborating on the intricate overlapping of sound synthesis and film genres. Then we explore how the notion of sound synthesis related to audiovisual synchronization by triggering a metaphorical duality with sound reproduction. In the third part we discuss how sound synthesis implemented a different notion of disembodiment than the one promoted by phonography, and we investigate the potential of synthetic sound for shaping ideas of a virtual reality. Finally, we illustrate the role of synthetic sound drones and loops in inducing hypnotic setups and conveying senses of mechanized environments and soundscapes.

A Haunting Question: Sound Synthesis and Film Genres

Optical sound synthesis was independently conceived in the Soviet Union and in Germany. In the USSR it was developed by Arseny M. Avraamov and Yevgeny Sholpo and employed in the first, now lost Soviet sound documentary Plan velikikh rabot (Abram Room, Five-Year Plan, a.k.a. The Plan for Great Works, 1930) (MacKay 2005, p. 5); simultaneously, Grigory V. Aleksandrov and Sergei M. Eisenstein utilized it for the storm and animated sequences of the short film Romance sentimentale (France, 1930) (Burke 2007 and Davies 2001). In the context of the German Bauhaus, Rudolf Pfenninger, Edmund Meisel, László Moholy-Nagy, Oskar Fischinger and Paul Arma pursued research that was mainly oriented towards demonstrative abstract films, such as Moholy-Nagy’s Tönendes ABC (1932) and Pfenninger’s Tönende Handschrift (1932). On the other side of the ocean, drawn sound was making its way to Hollywood: Ub Iwerks’ The Village Barber (1930) was pivotal in establishing some of the Mickey Mousing conventions for synthetic sound in animated film (e.g. glissandi, bouncing sounds and vocal utterances) [8] and Rouben Mamoulian’s Dr. Jekyll and Mr. Hyde (1931) was arguably the most effective early example of connecting synthetic sound to cinematic horror. We will return to this last example.

As for the Theremin, it was reportedly first used to accompany the silent film Aelita (Yakov Protazanov, Aelita, the Queen of Mars, USSR, 1924) in the score by Valentin Y. Kruchinin (Orton and Davies 2001), while its first use in a sound film was to underline a snowstorm sequence in Odna (Grigori Kozintsev and Leonid Trauberg, Alone, USSR, 1931) scored by Dmitry Shostakovich, [9] who was also responsible for the first documented cinematic use of the Ondes Martenot, in Vstrechnyy (Fridrikh Ermler and Sergei Yutkevich, Counterplan, USSR, 1932). Despite the Theremin’s success as a concert instrument after it arrived in the United States in 1927, we have no evidence of any use of the instrument in Hollywood before 1944, [10] when Robert E. Dolan introduced it in his score to Lady in the Dark (Mitchell Leisen, 1944), soon followed by Miklós Rózsa in Spellbound (Alfred Hitchcock, 1945) and The Lost Weekend (Billy Wilder, 1945) and by Bernard Herrmann in The Day the Earth Stood Still (Robert Wise, 1951). This latter film inaugurated the golden era of 1950s Hollywood science fiction, when the Theremin became a stable studio device together with other electronic sound resources. Nonetheless, the impact of the Theremin in the United States in the 1930s should not be overlooked: the theme music of the radio series The Green Hornet, broadcast from 1936 to 1952 (Wierzbicki 2002, p. 127), featured its buzzy synthetic effect mixed with an arrangement of Rimsky-Korsakov’s The Flight of the Bumblebee.

By undermining the “illusion” of documentation conveyed by sound reproduction, sound synthesis made it possible to conjure up acoustic dimensions that had never been experienced before, except possibly in dreams or altered states of consciousness, and to transpose them into narrative situations. Synthetic sound drew close to cinema on narrative grounds, and contributed to the deep structure of film genres such as science fiction and horror. These genres are often viewed as allowing more freedom for “developments in sound technology as well as innovations in sound signification” (Whittington 2007, p. 4) and as driving the grammar of narrative cinema towards and beyond the borders of realistic representation (Deleon 2010, p. 10). [11] We could borrow Philip Tagg’s (2012, pp. 523-24) semiotic terminology to maintain that synthetic sound worked as a “style indicator” for sci-fi and horror, whereas it became a “genre synecdoche” [12] when deployed in different contexts, such as other genres (thriller, fantasy, mystery, melodrama, parody, etc.) or other media artefacts (e.g. popular music records). The means by which sound synthesis produced a relatively homogeneous range of symbolic correlates in popular culture were primarily associated with the timbre, articulation and spatiality of the generated sounds. Therefore, we should emphasize the “anaphonic” functions of synthetic sound (Tagg 2012, pp. 487-514), that is to say, the sonic, tactile, kinetic, gestural and spatial analogies triggered in listeners and filmgoers of a particular social and geo-cultural environment: for instance, the Theremin’s and the Ondes Martenot’s qualities of portamento and vibrato and their timbral resemblance to the human voice were crucial for associating them with spectral, extraterrestrial and disturbing anthropomorphic presences in films.

Nevertheless, this perspective raises a series of issues. The fact that sound synthesis and sound cinema are both technological means sharing a sonorous nature suggests that the cultural sensibility towards sound preceded and to some extent directed technological research, which in turn profoundly shaped cultural production and aesthetics. This hermeneutic circle leads us to challenge the way synthetic sound is frequently associated with cinema as a mere response to clichés and conventions of a given film genre. According to Rick Altman (1999), film genres are products of complex interactions between diverse agents such as producers, audiences and critics, and result from the combination of syntactic, semantic and pragmatic factors. Yet the notion of sound synthesis is itself negotiated by analogous agents and factors. In both cases, we deal with conceptual constructs in a state of flux throughout history and culture. Is the notion of virtual reality, for instance, a result or a premise of sound synthesis? Are the “ethereal” qualities of the Theremin and their associations with extraterrestrials a consequence of its timbre and performance style or are they rather products of the metaphysical speculations that pervaded early twentieth-century Russian theosophic circles? Borrowing Jonathan Sterne’s (2003, p. 5) programmatic idea that “the history of sound is already connected to the larger projects of the human sciences,” we will try to show that sound synthesis is simultaneously a cultural product and a source for narratives and paradigms of representation that shape ensuing developments. It is our intention to point out that the dissemination of sound synthesis techniques in the 1920s afforded different interpretations to distinct socio-cultural milieus, such as the “avant-garde” and “popular culture.” Hence, sound synthesis is at once a meta-historical notion, spanning the pre-Telharmonium to the post-digital era, and a cultural artefact that crystallized in different forms throughout historical periods and geopolitical contexts.

“This is not reproduction”: Sound Synthesis and the Cultural Discourse

As Alan Williams (1992, p. 134) has remarked, Hollywood’s transition to sound did not necessarily imply a revolution in terms of stylistic paradigms, but rather caused radical adjustments in narrative content, to be observed in the decline of genres such as the stage melodrama and the success of other genres founded on the centrality of the voice and the dialogue setting, such as horror (Spadoni 2007 and Eugeni 2002).

At the basis of Hollywood talkies there is the technological reproduction of the quintessential human act: elocution, or the matching of language and speech act (embodiment of speech). Audiovisual lip-synch became a key feature of Hollywood’s language-centrism and disclosed its ideological nature as soon as the industrial propaganda insisted on its supposed naturalness:

By creating a new myth of the origins, it displaces our attention 1) from the technological, mechanical, and thus industrial status of the cinema, and 2) from the scandalous fact that sound films begin as language––the screenwriter’s––and not as pure image

Altman 1980, p. 69

This also explains the resistance to its application by film directors from other countries, particularly Alfred Hitchcock in the United Kingdom, René Clair in France and Dziga Vertov in the Soviet Union, who, together with the other signatories of the “Sound Manifesto”, opposed

the redundancy of synchronous sound by experimenting with the use of subjective sound, off-screen sounds, the “doubling” of voices, and manufactured sound effects

Beck 2011, p. 70

Both the talkies’ and the Sound Manifesto’s paradigms emphasize the inscriptive property [13] of recording technologies and foreshadow in different ways the pendant of synchronism, that is to say, the a-synchronism or, more broadly, the acousmêtre (Chion 1999, p. 19 and passim) as a creative resource for audiovisual narration.

It is on this subtle borderline between correspondence and non-correspondence of sound and image that we can try to understand sound synthesis. In the regime of cultural prominence accorded to recording, that is, technology’s capacity of trans-scribing reality and playing back a simulacrum of it in reproduction, sound synthesis works as a potentially divergent vector, as it skips the inscriptive principle that enables correspondence between sound and source or, rather, between reality and its mark left on an analogue support. But when sound synthesis gets reified in the framework of narrative cinema, it transforms into an essential complement of recording, as we shall illustrate.

As Rainer Maria Rilke (1987, pp. 1085-93) had imaginatively sensed, following the invention of the phonograph any groove on a surface could potentially become a source of sound. Yet what happens when the act of reproducing that sound does not imply a previous act of inscription? What happens if we apply the phonograph’s stylus to the grooves on a human skull, to the coronal suture of the cranium? “Before [Rilke],” noted Friedrich A. Kittler (1999, p. 44), “nobody had ever suggested decoding a trace that nobody had encoded and that encoded nothing.” It was an insight that László Moholy-Nagy (quoted in Passuth 1985, p. 291) turned into a theorization to renovate musical (and overall artistic) language, in his writings about the transformation of the phonograph “from an instrument of reproduction into one of production.” This insight also informed the first attempts at drawing sound on optical film. To make sound visible and vision sonorous had been an enduring obsession, from Ernst Florens Friedrich Chladni’s Klangfiguren (tone figures) in the eighteenth century to the speculations on synaesthesia in the first decades of the twentieth century; the technology of sound film finally made it possible (Levin 2003, p. 39). [14]

Nonetheless, the other face of this Janus-like phenomenon was also developing: the perturbing, uncanny and fascinating effects of synthesis were making their mark right across cultural reception. Kittler (1999, pp. 44-45) traced them back to Rilke’s intuition:

What the coronal suture yields upon replay is a primal sound without a name, a music without notation, a sound even more strange than any incantation from the dead for which the skull could have been used. Deprived of its shellac, the duped needle produces sounds that “are not the result of a graphic transposition of a note” but are an absolute transfer, that is, a metaphor.

This is, we infer, the metaphoric implication embedded in synthesis. In the domain of 1920s avant-garde Rilke’s intuitions were directed towards expanding the frontiers of art languages, but in the field of popular culture the metaphoric principle gained the upper hand. By the end of the decade, once the Theremin spread with its appealing performance style, metaphorical principles became almost inescapable, foreshadowing the radical fracture between “art” and “popular” music that was to characterize twentieth-century cultural discourse. To a certain extent, cinema can be considered one of the subjects of this controversial negotiation: while abstract cinema became the expression of an avant-garde dialogue between musical and cinematic experimentation, famously demonstrated by John Cage’s and Edgar Varèse’s writings [15] (Manning 2003, p. 6; Russet 2004, p. 111), narrative cinema would suffer (with some notable exceptions) from a set of stigmatizations by “art” composers and music theoreticians. In the period we are examining, both these tendencies co-existed. On the one hand, the trade press and newspapers of the late 1920s and early 1930s repeatedly emphasized the metaphorical qualities of synthetic sound with sensationalistic accounts: music “drawn out of the air” (Anonymous 1929) was the common way to describe evocative performances on the Theremin, an instrument that “captivated the attention of much of the Western World, because it was performed in an unusual manner: without touching it” (La Rosa 2012, p. 4). Its “ether music,” [16] “which one plays by waving the hands before it” (Anonymous 1929a), was “the nearest thing to magic one sees nowadays” (Sutton 1933, p. 15). [17] On the other hand, Moholy-Nagy himself (quoted in Passuth 1985, p. 311) showed his suspicion of the Theremin when he wrote in a footnote of his article “Problems of the Modern Film”:

The distinguished scientist Theremin, inventor of ether wave music, provided a good example of a false method of approach, when he aimed at imitating the old orchestral music with his new ether wave instrument.

This assessment resonates with Cage’s even stronger claim: “although the instrument is capable of a wide variety of sound qualities . . . Thereminists act as censors, giving the public those sounds they think the public will like” (Cage 1961, p. 3). It is important to understand these positions not merely in terms of cultural biases: in Moholy-Nagy’s and Cage’s view, the use of the Theremin as a bizarre simulator of old instruments was preventing sound synthesis from revealing authentic new horizons of sound organization. If we extend this notion of simulation to the domain of cinema, we realize that sound synthesis compelled cinema to expose its deceptive ideology of realism and to react by developing meta-textual procedures that are crucial to genres such as horror and sci-fi.

“Re-voicing” Reality: Synthesis as Simulation

“On the evening of February 15, 1931 four journalists were invited to the London offices of the Producers Distributing Company, an American film distributor,” as documented by Jean-Marc Pelletier (2009). There, the engineer Eric Allan Humphriss, who had been working on the RCA Photophone sound-on-film technology, reproduced in front of the journalists the vocal line “all of a tremble” that he had manually painted in ink “on a strip of cardboard” and then “photographed onto the sound track of a blank film.” By all accounts the journalists themselves trembled. [18] Cecil Thompson (1931, p. 1) opened his report for the Daily Express by setting the atmosphere: “Four men sat in a darkened room in London”; then, astonished by the events, he imparted that his biases about the supernatural had been superseded by the progress of technology, “It was not a spiritualistic séance. There was nothing supernatural about the phenomenon. The experience can only be described as the birth of the world’s eighth wonder––the creation of the ‘robot’ voice”; finally, he portrayed his uneasiness, terror, almost horror, on hearing the robot speak:

There was silence. The “robot” voice had spoken. It was terrifying for the moment, almost horrible. I felt a tingle down my spine. . . . I heard a voice that was not a voice, words that had never been spoken

Thompson 1931, p. 2

The issue of disembodiment is consubstantial with the invention of the phonograph, as many have noted. Tom Gunning (2001, p. 16) remarked that

the phonograph had in effect separated the human senses, divorcing ear from eye, and . . . Edison’s original intention in pursuing motion pictures was to bring them back together.

The contemporary flourishing of mechanical automatons and “audio autographs”––recordings of famous personalities for “posthumous revivals” (Christie 2001, p. 5)––mirrored the spectral and phantasmal metaphors associated with moving images (p. 9). However, when Humphriss’ method was first utilized in a feature film, most likely Born to Love (Paul L. Stein, USA, 1931) starring Constance Bennett (Pelletier 2009), the synthetic voice marked the difference from phonographic disembodiment: while audio autographs implied the recollection of a (late) existent voice, synthetic voice “had never been spoken.” Synthetic voice finally found its specular equivalent in film: a “dis-voiced” body. Bennett’s voice was materially replaced every time she pronounced the name of a character, in order to avoid his identification with a living British peer. [19] John Kobler Jr. (1932, p. 6) reported the episode in the Winnipeg Free Press:

In a recent film, Constance Bennett was heard utter coyly the name of a fictitious British peer. Her public was fascinated by the way the name rolled from her tongue with just the proper intonation. But “Connie” never spoke that name. Neither did anyone else. And I am not kidding. The voice that spoke those words belongs to no being earthly or spiritual.

[Emphasis added.]

As we can infer from this example, sound synthesis had the ability to creatively “re-voice” bodies, providing them with constructed new timbres, potentially more perfected than the “originals.” If the disembodiment of the phonograph was founded on a separation between voice and source, the re-voicing via sound synthesis decisively tended towards total artificiality, or as we would say today, virtual reality. The striving for the perfected musical instrument that had animated Cahill’s lifetime research devoted to the Telharmonium [20] was paralleled in the age of sound film by the most utopian of desires: the perfect synthetic voice. Thirty years would pass before computer synthesis took over this mission, when in 1961 an IBM computer at Bell Labs was programmed to sing Daisy Bell, the first song ever sung by a computer (an episode that would strikingly resonate in a memorable scene of 2001: A Space Odyssey [Stanley Kubrick, 1968]), but already in the early 1930s Kobler (1932, p. 6) praised Humphriss’ outstanding achievement:

When I spoke to Humphriss he pointed to another strip of film on his desk. He told me it was the world’s perfect voice. It wasn’t Caruso’s and it wasn’t McCormack[’s]. It was no one’s. This time it was singing. Humphriss has succeeded in making a throatless voice sing an entire octave. It is perfectly recorded––a bass now, but with a few strokes of the pen it can be a tenor or even a soprano.

Imitations of Caruso notwithstanding, voice was predominantly reworked through optical synthesis to produce uncanny animal cries, as Carl Dreher (1931, p. 759), the director of the RKO Sound Department, recalled:

Sometimes synthetic sounds must be devised. An amusing instance occurred in a current picture in which it was required to imitate the noises of prehistoric animals. No one knew what noises these animals made, but it was left to the sound effects man to put in something which would be interesting and plausible.

Examples of similar uses can be found in the high-pitched cries of Dr. Pretorius’ small human figures in Bride of Frankenstein (James Whale, USA, 1935) and in the terrifying roars of King Kong in the eponymous film (Merian C. Cooper and Ernest B. Schoedsack, USA, 1933), although here re-recording techniques were preferred to optical sound synthesis (Whittington 2007, p. 71). [21]

Primeval Fears: Synthetic Drones, Loops and the Sound of Machinery

In his well-known analysis of Das Testament des Dr. Mabuse (Fritz Lang, Germany, 1933), Michel Chion (1999, pp. 31-57) noted how the character of Mabuse is deprived of a voice, for it is never possible to clearly identify his physical presences (the aphasic patient of Dr. Baum, the ghost, Dr. Baum himself) with his vocal conformations (Baum’s “real” voice, his voice reproduced by the gramophone, the ghost’s voice): “never is [Mabuse] seen speaking except as a ghost or in superimposition, endowed with the eerie voice of an old witch” (p. 31), the “androgynous squeaky whisper of a possessed sorcerer” (p. 35).

Chion does not comment on the musical cue underlying the central sequence, wherein a misshapen ghost takes possession of Baum’s body and mind. This cue holds a central role in conveying the eerie attributes of the ghost’s voice. An irregular pulse of the timpani underpins a cluster of high-pitched long notes. The association of percussion instruments with a series of still shots of tribal masks, skulls and expressionist faces painted on the wall of Baum’s studio strongly suggests a primeval anthropomorphic fear which the spectator would associate a few instants later with the close-ups of Baum’s hypnotized gaze and the bug-eyed ghost. At the same time the high-pitched drone, arguably crafted by optical synthesis or manipulation of acoustic sounds, [22] conveys an aseptic and clinical mood for the whole scene, accentuated by the fact that the drone unfolds during the sequence through the montage of unrelated “harmonic” blocks. Ethnomusicologist Judith Becker (2004, p. 21) has observed that a sudden noise or unexpected bright light were standard devices used to induce “cataleptic trance” in clinical hypnosis sessions at the turn of the twentieth century. In this respect the synchronization between Baum’s transfixed gaze into the camera and the fortissimo in the high-pitched drone strikingly recalls Becker’s description. A third sound layer, consisting of shorter single notes resembling a clarinet timbre, surfaces in the mid frequencies when the ghost stands up and “enters” Baum’s body. Finally, a stronger stroke on the timpani marks the end of the hypnosis “session.”

Rouben Mamoulian’s Dr. Jekyll and Mr. Hyde shows striking similarities in the creation of a transfiguration scene. Before drinking his potion, Dr. Jekyll (Fredric March) takes a look at the model skeleton in the studio, thus establishing once again a parallel between human concrete corporeity and science’s promise to transcend the bodily burden and open new horizons of plural identities. [23] Mamoulian presents the entire transformation through a virtuoso subjective sequence shot, in which a vertiginous circular movement combines with blurred and dazzling visual effects and with a particularly memorable sound mix. The “sound stew,” as Mamoulian called it, claiming authorial responsibility for this particular element, is a sophisticated achievement in the history of sound synthesis. As Neil Lerner (2010, pp. 66-73) has illustrated, [24] three sound layers form the texture for the first transfiguration of Jekyll into Hyde: candlelight photographed on the soundtrack, [25] a gong’s reverberation played back in reverse and the recording of a heartbeat. The way the Paramount press-book advertised the heartbeat element is of particular interest:

To obtain the “boom boom” of the heart-pump, the microphone was held over March’s heart. The sensitiveness of the instrument boosted the sound past that which one naturally hears while holding an ear over a heart to the quality attained by listening through a stethoscope

quoted in Lerner 2010, p. 68

The production company stressed the “stethoscopic,” that is to say, the hyper-realistic function of the microphone as a means to produce bodily intimacy, and it did so to the point of misreporting the events by identifying the sound source as March’s heart, although we know from Mamoulian’s interviews that he recorded his own heart (Lerner 2010, p. 70). This last detail is of course of enormous relevance for the issue of identification implied by the artifice of synchronization. After all it appears totally natural to assign the heart to the subjective point of view of the shot, and ultimately to spectators themselves, who experience the transfiguration literally in the first person.

The proximity between the heart’s “boom boom” and the use of percussion drums as primeval rhythms opens up cultural issues that go beyond the scope of this essay. [26] Still, it is important to point out the divergent (and yet related) directions opened up by percussion sounds within films of that period. Stressing the continuity between human inner rhythmic circularity (heartbeat, breathing) and the mechanized rhythms of industrial machinery, films thematized the idea of technologies as human prostheses and drew emotional and narrative consequences from it. We are of course referring to Chion’s acute discussion of the “acousmachine” in Das Testament des Dr. Mabuse, in which the ticking of the bomb threatening Kent’s and Lili’s lives and the acousmatic industrial pulse pervading the opening sequence are associated with “the same impossibility of seeing the heartbeat of the terrible acousmatic machine” (Chion 1999, pp. 37-42).

Circular, repetitive and inexorable noise loops are topical presences in horror films and are well illustrated by the frequent occurrence of potentially threatening machineries such as mills, presses and engines. [27] In the final sequence of Vampyr (Carl Theodor Dreyer, The Vampire, Germany, 1932), the low frequency pulse, probably produced by slowed-down timpani strokes mixed with industrial noises, is looped to evoke a striking correlation with the close-ups of gears and mechanisms of the mill, where the village doctor is eventually to be killed. This scene is rendered hauntingly beautiful by the alternating audiovisual montage with the dreamlike voices (again: voices!) of the runaways crying, almost singing, “Hallo!” over a celestial high-pitched chord.

The combination of machinery and voice that we have observed in Vampyr is paralleled with different consequences in White Zombie (Victor Halperin, USA, 1932), one of the most discussed independent American horror movies as well as the first sound film ever to portray zombies. Peter Dendle (2007, p. 46) has explained the connection between the Haitian people and zombiism:

Ghosts and revenants are known worldwide, but few are as consistently associated with economy and labour as the shambling corpses of Haitian vodun, brought back from the dead to toil in the fields and factories by miserly land-owners or by spiteful houngan or bokor priests.

The Haitians are constantly accompanied by drum and choral music, stressing both their cultural “otherness” and the tribal rituality of their society. Clichéd racial subtexts notwithstanding, percussive and vocal profiles play a major part in setting up a double narrative layer pervading the whole film: more directly than in the previous examples, percussion here subtly evokes, from the very opening titles, the disquieting implications of social organization, set against repetitive modules in which individuals lose their importance. The function of the voice is more ambivalent, since the funeral chant of the opening titles serves to connect the Haitian belief in zombies with the unfolding of the plot, whereas another chant, this time arranged in an African-American spiritual style, functions as a somewhat hypnotic call directed by half-dead Madeleine to rescue her husband Neil, through the voice of the dead Haitians. The undead are portrayed throughout the film as silent ghostly characters, with the exception of an episode set in a mill where they work under the voodoo spell of Legendre (Bela Lugosi). The sound profile of the mill sequence is an effective counterpart to the Haitian ritual chants. In this disturbing sequence, it is the incessant labour, rather than the literally alienated workers, which produces the sound. The scratchy, undulating sound contour, resembling effects obtainable with extended techniques on a cello or an upright bass, is undoubtedly a prime example of sound manipulation which still preserves its disturbing effects. Furthermore, the looping of the sound episode, a montage technique that anaphonically evokes the hopeless repetitiveness of the industrial production line, offers a glimpse of the reasons why this film produced a considerable volume of literature dealing with the representation of social and racial issues. [28]


This research represents a preliminary insight into the phenomenon of sound synthesis in early sound cinema. Much is left to be done in recovering production files and documentation that could complement the abundant accounts spread throughout the trade press and newspapers. Further threads of investigation will concern the use of synthesizers such as the Theremin and the Ondes Martenot in late silent films as well as in other performance arts, in order to outline dramaturgical continuities and discrepancies with sound films. As is evident from our analyses, case study research into the sound design of single films should help unearth poetics and strategies of relating synthetic sound to narrative paradigms and issues of cinematic representation in the early sound era.

