
One of the founding principles of modern linguistics is the phoneme. It entails the concept of différence (Saussure 1916) or opposition (Ogden 1932), as a fundamental feature of the structure of language. As is well known, it is defined broadly as any perceptually distinct phonic cue in a specified language that distinguishes one word from another.

The topic of the psychological reality of the phoneme became, almost from the outset, a major area of interest among linguists, semioticians, and psychologists. Some of the most interesting offshoots of phoneme theory are the ideas proposed by the early structuralists working within the so-called Prague School (see Toman 1995; Sériot 2014), in particular regarding sounds as “imitative” or “modeling” devices and thus comprising an inherent audio-aural iconicity in the generation of verbal signs. One of the explanatory frameworks that developed from this conceptualization was, of course, that of sound symbolism, also known as phonetic symbolism, phonesthesia, or phonosemantics, which posits essentially that phonic structures carry meaning in and of themselves. This has, in turn, led to various linguistic and psycholinguistic studies on the connection between phonemes, perception, and conceptualization. Does the aural nature of a phoneme suggest a referent or, vice versa, does a referent evoke a specific phonemic response? With the advent of cognitive linguistics and its emphasis on language as a transformation of bodily experience into conceptual structures, interest in what can be called “phonemic modeling”, for lack of a better term, has increased considerably. This can be defined simply as the use of phonemes to suggest meanings at various levels, from words to figural assemblages.

This paper will review relevant work within sound symbolism theory and then attempt to put forth a framework for describing phonemic modeling, based on fundamental notions developed by the Tartu School of semiotics (see Lotman 1991; Sebeok & Danesi 2000; Andrews 2003; Aleksei 2012). The particular perspective taken here is that any phonemic element, isolated from its occurrence or recurrence in words, is a modeling device that leads to the construction of words, remaining a phonemic cue that allows for a concatenation of other structures. In other words, the phoneme is the structure that leads to syntactic and figural assemblages by virtue of its suggestiveness. The underlying psychobiological hypothesis is that the phoneme is part of a process of embodiment into form and thus into cognition. The term embodiment is defined concretely by Rosch, Thompson & Varela as follows :

By using the term embodied we mean to highlight two points : first that cognition depends upon the kinds of experience that come from having a body with various sensorimotor capacities, and second, that these individual sensorimotor capacities are themselves embedded in a more encompassing biological, psychological and cultural context.

(Rosch, Thompson & Varela 1991 : 172-173)

Semiosis is based on some property of the senses, and in the case of word formation and its extensions, that property is sound perception.

Sound Symbolism : An Overview

The starting point for any examination of phonemic modeling is, of course, sound symbolism theory. This has made it possible to establish a repertory of facts from which the theory of phonemic modeling can be derived. But even well before the advent of sound symbolism theory in the twentieth century, ancient writers typically linked sound, writing characters, and meaning speculatively. For instance, in ancient Chinese writings, as Schuessler (2007) points out, words with /m/ were associated with something black, words with /n/ with something soft or flexible, and those with /k/ with some abrupt action. In Plato’s dialogue Cratylus (see Plato 2013), Socrates suggests that words are originally constructed with sounds that reflect some property of their referents. However, the many counterexamples given to Socrates by his interlocutor, Hermogenes, lead Socrates to admit that his view was, after all, highly speculative.

The concept of sound symbolism was discussed, somewhat indirectly, during the Middle Ages and the Renaissance. But ultimately it was dismissed as specious, first by Locke (1690) in his Essay Concerning Human Understanding and then by Leibniz (1765) in his New Essays on Human Understanding. Locke made the argument that if sound symbolism were a principle of language creation then we would all be speaking the same language, betraying a universalist view of human cognition. He maintained that the relation between words and their phonetic structure is arbitrary, with only a few onomatopoeic exceptions. Leibniz expressed a similar view, but attenuated Locke’s dismissal by stating that the relationship between words and their phonetic make-up is suggestive, rather than purely arbitrary. As Foucault (1994) pointed out in The Order of Things, post-Renaissance philosophers like Locke and Leibniz saw knowledge as based on difference, rather than resemblance – the model of knowledge held by pre-Renaissance thinkers. Foucault’s view is prefigured in Giambattista Vico’s New Science (see Danesi 1993), where the origin of language is described as being based on an inherent creative audio-aural process called “poetic logic”, whereby language and sound merge to produce words and ideas that are forged through creative associations. The same view was held a few centuries later by Russian psychologist Lev Vygotsky (1962), who traced the emergence of language in children to the same kind of poetic logic and thus to a sense of the interconnectedness between sound, meaning, and bodily experience. Actually, Wilhelm von Humboldt (1836) was among the first to argue that words and their referents are always interconnected in the process of creation – a theory that has, of course, had its descendants in linguistic relativity theory as formulated by Boas (1940), Sapir (1921), and Whorf (1956).

Arbitrariness theory as expressed by Locke found its most coherent articulation, of course, in Saussure (1916). Saussure argued that the connection established between the physical structure of a sign and its meaning is an arbitrary one, developed over time for some specific social purpose. There was no evident reason for using, say, tree or arbre to designate “an arboreal plant”, other than to name it as such. Indeed, any well-formed word signifier could have been used in either language. Like Leibniz, Saussure did admit, however, that some signs were fashioned in imitation of some sensory or perceivable property detectable in their referents. Onomatopoeic words, he granted, were indeed put together to simulate actual physical sounds. But, like Locke before him, he maintained that the coinage of such words was the exception, not the rule. Moreover, the highly variable nature of onomatopoeia across languages proved that it was itself a largely arbitrary sign-making process. For instance, the expression used to refer to the sounds made by a rooster is cock-a-doodle-do in English, but chicchirichì in Italian; and the expression employed to refer to the barking of a dog is bow-wow in English, but ouaoua in French. Obviously, representing what a rooster or a dog sounds like when it crows or barks is largely a discretionary process, depending on culture. Nevertheless, Saussure could not dismiss the fact that such words are highly suggestive of actual crowing and barking, no matter how different they may seem phonetically. Moreover, Saussure’s claim that onomatopoeia is a sporadic and random phenomenon in word-formation does not stand up to closer scrutiny, as pointed out by various key works on sound symbolism published in the latter part of the twentieth century (Hinton, Nichols, and Ohala 1994; Magnus 1999), which point to a shift in the treatment of how words are formed, away from arbitrariness theory to so-called motivation theory.

The serious investigation of sound symbolism started in the 1920s, extending through the 1930s and 1940s, with a series of studies looking systematically at word formation from a phonosemantic perspective (Sapir 1929; Bentley and Varon 1933; Tsuru and Fries 1933; Newman 1933; Allport 1935; Guillaume 1937; Bolinger 1949). These studies basically adopted Jespersen’s (1922) overall theory of language origins as a phonosemantic phenomenon. In the 1950s, Morris Swadesh (1951, 1959, 1971) championed sound symbolism, drawing attention to the fact that most of the world’s languages used front vowels (/i/-type and /e/-type) to construct words in which “nearness” was implied, in contrast to back vowels (/a/-type, /o/-type, and /u/-type) to construct words in which the opposite concept of “distance” was implied. In English common examples are here-versus-there, near-versus-far, this-versus-that, and so on. The same kind of phonosemantic opposition is found across languages to distinguish between this (implying nearness) and that/you (implying distance), suggesting that it might be a universal tendency. Examples include the following :

[Table of cross-linguistic examples not reproduced here; source image: 2060281n.jpg]

Since Swadesh, a host of studies have appeared examining phonosemantic processes in diverse ways (Brown, Black, and Horowitz 1955; Maltzmann, Morrisett, and Brooks 1956; Wertheimer 1958; Marchand 1959; Miron 1961; Taylor and Taylor 1962; Fónagy 1963; Weiss 1964, 1968; Aztet & Gurard 1965; Heise 1965; Reid 1967; Haas 1970; Wescott 1973; French 1977; Kim 1977; Koriat 1977; Fischer-Jorgensen 1978; Fónagy 1980; Jakobson & Waugh 1987; Allott 1989; Hinton, Nichols & Ohala 1994). Cumulatively, they propose that, beyond a direct phonosemantic relation between the sounds that make up a word and its referent, there is a latent, suggestive link between the primary phoneme that is used to construct the word and its extension in the construction of other words. For example, continuants are found typically in words that refer to things that are perceived to have “continuity”. The /fl/ cluster is found commonly in the make-up of English words that refer to things that move or run smoothly with unbroken continuity, in the manner that is characteristic of a fluid : flow, flake, flee, float, fly. On the other hand, the cluster /bl/, which begins with an obstruent, is found in words that refer typically to actions that involve blocking, impeding, or some other form of occlusion : block, blitz, blunt, blow. In effect, stop phonemes are found in words which refer to objects or actions that are perceived to involve “stoppage”, and continuants in words that refer to objects or actions that are perceived to involve “flow”. The framework being proposed here is that these are modeling devices that recur throughout a particular system of language, leading to the hypothesis that sounds are in themselves the originating force of language creation.

Magnus (1999) has amassed a significant corpus of data to show that phonemes have such a function, even though she does not directly identify the function as such. Words that begin with the same phoneme tend to coalesce around a similar core of meanings, whereas different phonemes suggest different clusters of reference which seem not to overlap among the phonemes. The phonosemantic relation is not one-to-one – it is “symbolic”, that is, one cannot predict what phoneme a given language will use for imprinting some audio-aural property of a particular referent into the formation of its words. The pattern emerges only when comparing large numbers of words. Recalling von Humboldt, Magnus puts forth four basic phonosemantic categories :

  1. Onomatopoeia is the least significant phonosemantic process, since it involves straightforward, intentional imitation of sounds in the phonemic make-up of a word : splash, pop, bang.

  2. Clustering refers to the fact that words sharing a phoneme cluster around a referential domain; so, if /h/ is used for house, then a disproportionate number of words will start with /h/ within the same referential or lexical field : hut, home, hovel, habitat.

  3. Iconism is a modeling process that becomes evident when comparing words that have similar or analogous referents. For instance, words such as stomp, tramp, and step show iconism among themselves, whereby the phonemic pattern of /m/ + /p/ or /s/ + /t/ unconsciously produces a phonosemantic linkage. Needless to say, it was Charles Peirce (1938-1956) who characterized this form of meaning-making as “motivated”, that is, guided by our sensory perceptual apparatus. And, of course, it was Peirce who introduced the concept of iconicity as a resemblance-based sign-making process.

  4. Phenomimes and psychomimes are “quasi onomatopoeic” words that imitate soundless referents; they are called phenomimes when they encode external phenomena and psychomimes when they refer to psychological states. The word duck is a phenomime because it suggests the sound made by a “duck”, already encoded onomatopoeically with quack. A psychomime would be any emotive expression such as Ugh, which is meant to resemble some inner state indirectly.

Magnus (2013) has written perhaps the most up-to-date history of sound symbolism analysis with a summary of the relevant research findings. Another work that adds considerably to the reservoir of facts on sound symbolism is the anthology of studies put together by Max Nänny and Olga Fischer (1993). Genette (1976) has called this whole area of study mimologics, an apt term for the investigation of what has here been called phonemic modeling. Suffice it to say that the field of sound symbolism theory has shown that iconicity is a major force in word formation. The study of sound symbolism thus provides a unique kind of insight into language processes that may have been operative during the original formation of speech.

The validity of this theory as a psychological process, whereby sounds suggest referents and, vice versa, referents suggest specific phonemic forms, has been put to the test in various psycholinguistic studies. A classic one is by Roger Brown (1970 : 258-273), who asked native speakers of English to listen to pairs of antonyms from a language unrelated to English and then to try to guess, given the English equivalents, which foreign word translated which English word. The subjects were asked to guess the meaning of the foreign words by attending only to their sounds. When he asked them, for example, to match the words ch’ing and chung to the English equivalents light and heavy, not necessarily in that order, Brown found that about 90% of English speakers correctly matched ch’ing to light and chung to heavy. He concluded that the degree of translation accuracy could only be explained “as indicative of a primitive phonetic symbolism deriving from the origin of speech in some kind of imitative or physiognomic linkage of sounds and meanings” (Brown 1970 : 272). More specifically, words constructed with the vowel /i/ have a perceptible “lightness” quality to them and those constructed with /u/ have a “heaviness” quality. This perceptual differentiation shows up in the kinds of meanings assigned to the words themselves.

Sound symbolism is not a very satisfactory term to cover the various facets of the phenomenon, as can be seen even from the cursory survey above. A better description for this might be “naturalness” (Peterfalvi 1970; Wellems & De Cuypere 2008). Bolinger (1963) prefers the semiotic notion of iconicity. Whatever designation we adopt, there is really no doubt today that this inherent principle of word formation cannot be dismissed. Arbitrariness theory is an ideal; the fact that we use our sensory apparatus in the creation of signs is now virtually a law of semiosis. The evidence in favor of phonemic modeling, as it has been called here, is overwhelming. Already in 1922, Otto Jespersen suggested that this process was not only a force in the initial formation of language, but one that operated continually to shape words according to their senses.

Empirical Evidence

Linguistic research on sound symbolism is based typically on comparisons among languages or on the analysis of phonemic systems and their functions in word formation within languages. The question arises : Is there any corroborative psychological evidence other than these analyses? The experiment discussed above by Brown is one example of how sound symbolism has been examined within the psychological sciences, giving it validity beyond the purely descriptive.

Already Piaget (1955) discovered that children uniformly relate the sounds of words to the objects to which they refer. Piaget concluded that for a child, every object seems to possess a necessary name reflecting the object’s nature. The same type of finding was documented extensively by Vygotsky (1962). Sound symbolism is sometimes explained as a form of synesthesia, which suggests that phonemes may well be synesthetic responses to the world of sound that is processed as meaningful information by the brain. As we recognize sound properties in one domain of reference, it is a small step to projecting these properties onto other referential domains – constituting a veritable form of conceptual blending in the Lakoffian sense of the term (Lakoff & Johnson 1980, 1999; Lakoff 1987). Sounds and shapes in objects are commonly abstracted and projected onto domains that have some perceived resemblance to them. Words beginning with, say, /v/, involve a metaphorical property that can be related to life and vitality – vigor, velocity, vacillation, veering, varying, and so on. These forms are now the basis for metaphorical expressions such as : “Life goes by too fast” (velocity); “One must live vigorously”; and so on. The connection between sound symbolism and figurative cognition has only sporadically been explored, but it is definitely a pattern in linguistic cognition that requires further study.

In order to cover the extensive possibilities of sound symbolism as a basis for linguistic cognition and, ostensibly, other creative semiosic processes, the concept of phonemic modeling is proposed as the originating force in semiosis of this kind. It can now be defined more specifically as the use of individual phonemes as the elemental guides to word formation. This leads to assemblages ranging from the level of individual words to the level of syntax and figural structures, whereby the latter are all interconnected to the latent properties of the phoneme. The term ideophone is sometimes used to describe phonemic modeling phenomena across certain languages. This is a word or expression that refers to some sensory referent. Brown’s experiment above is an example of how we associate sound to image. But phonemic modeling breaks down the ideophone even further to suggest that the meaning potential of sound is itself the guide to ideophonic phenomena.

Brown was a Gestalt-based psycholinguist, and his experiment recalls those of other Gestalt psychologists such as Asch (1950), who essentially experimented with the concept of ideophone, without labeling it as such. Asch examined metaphors of sensation (hot, cold, heavy, etc.) in several unrelated languages as descriptors of emotional states. He found that hot stood for rage in Hebrew, enthusiasm in Chinese, sexual arousal in Thai, and energy in Hausa. This suggested to him that, while the specific emotion implicated varied from language to language, the metaphorical process did not. Simply put, people seem to think of emotions in terms of physical sensations and express them as such. Moreover, the sounds of the words used indicate a sound symbolism that is essentially ideophonic. In English, too, the metaphor hot is based, in part, on the association of /h/ with a state of reacting phonically to the sensation of internal heat. It is a small expiration that alludes phonemically to perceived aspects of heat. Brown (1958 : 146) commented on Asch’s findings, stating that there is “an undoubted kinship of meanings” in different languages that “seem to involve activity and emotional arousal”, and furthermore that this “kinship” is revealed through metaphor.

Peterfalvi’s 1970 book is one of the first to review experimental studies on sound symbolism, allowing us to conclude already in the 1970s that the psychological literature points unequivocally to phonosemantics as an unconscious linguistic force, as Jespersen called it, in verbal creativity and verbal processing in general. The evidence also suggests that there is not one or several manifestations of this force, but rather that it crosses all linguistic levels, including the syntactic one, where word order or word relationships in sentences are iconic rather than based arbitrarily on some rule-making system (Haiman 1985; 1992). The complexity of the phenomenon has always been widely known. Brown (1958) gives the example of Samoan ongololo, referring to “centipede”, as an example of how the syllables in a word correspond to the number of distinct elements in the sound, object, or action. The same process is extended to shapes. In Chinese, visual contour leads to a sound symbolic modeling of the feelings that the shapes evoke. This is why many of the Chinese classifiers (words indicating semantic category) are based on shape, such as morphemes that indicate long, flat and round objects, containers, pairs and sets. Incorporating the size of object and the length of word to encode meaning is commonly found throughout the world’s languages. The conclusion seems to be that words are aural-acoustic models, guided in their creation and interpretation by phonemic cues that transcend the purely phonological level, reaching into the grammatical and semantic levels of language.

The evidence for the reality of phonemic modeling is, in a word, strong. This does not in any way imply a universalist stance in language formation. It simply means that the propensity to use phonemic cues as models, and then for these models to remain latent in words so as to generate different assemblages, is part of verbal semiosis. The data suggest, basically, that the phonemic modeling of words leaves a residue in the words that is extended to other levels of language; this means that phonemics is much more fundamental to language creation than anything else. The sound symbolism of the whole word (sometimes called morphosymbolism) resonates with further suggestiveness, much like the poetry for children by Dr. Seuss or the late novels of James Joyce (Danesi 2004). Once a phoneme has been imprinted into a word it leads to a series of latent entailments between the sense and the form of other signs connected to the original word in some way. This is an imaginative process based on sense-making. Susanne Langer compared it, appropriately, to a “fantasy” :

Suppose a person sees, for the first time in his life, a train arriving at a station. He probably carries away what we should call a “general impression” of noise and mass. Very possibly he has not noticed the wheels going round, but only the rods moving like a runner’s knees. He does not instantly distinguish smoke from steam, nor the hissing from the squeaking. Yet the next time he watches a train pull in, the process is familiar. His mind retains a fantasy which “means” the general concept, “a train arriving at a station”. Everything that happens the second time is, to him, like or unlike the first time. The fantasy… was abstracted from the very first instance, and made the later ones “familiar”.

(Langer 1948 : 129)

Recent work in cognitive science has been central in establishing phonemic modeling as having substance at the neurological level. In a 2014 study, Kanero, Imai, Okada & Matsuda used functional magnetic resonance imaging to investigate how Japanese mimetic (phonosemantic) words are processed by the brain. In one experiment, the researchers compared processing for motion in mimetic words with non-sound symbolic motion in verbs and adverbs. They found that mimetic words uniquely activated the right posterior superior temporal sulcus (STS). In another experiment, they examined the generalizability of the findings by testing another domain : shape mimetics. The results showed that the right posterior STS was active when subjects processed both motion and shape mimetic words, thus suggesting that this area may be the primary neural system for processing sound symbolism. Increased activity in the right posterior STS may also reflect how sound symbolic words function as both linguistic and non-linguistic iconic signs. This association with the right hemisphere has many implications, since it is that hemisphere which has been studied abundantly as the source of meaning structures such as metaphor (Danesi 2004). If indeed a phonemic model is the ultimate source of figural assemblages, then one would expect this pattern of neuroscientific findings to establish a link between sound symbolism and figuration.

Phonemic Modeling

A major problem with the theories and research findings pertaining to sound symbolism in general is that there seems to be no connecting paradigm – no one theory of the phenomenon. More to the point of this article : How does one go from phonosemantics (onomatopoeia, for example) to metaphoric symbolism and beyond? Also, with respect to semiotic theory : How does sound symbolism relate to the basic definition of a sign as a structure that refers to something in some way?

The traditional goal of semiotic theory has been to figure out how signs are constituted and how they encode referents. One approach to realizing this goal that has borne some fairly interesting and insightful results comes from so-called Modeling Systems Theory (MST) as developed by the Tartu School of semiotics (see Sebeok and Danesi 2000). A model in this perspective is defined as anything that is made to stand for something else in a way that captures some essence of the referent – it can thus be a sign, a text, a code, and so on. Extending this theory, it can be said that any element within a sign or among sign structures can be modeled, so that the element itself may not be a meaning-bearing sign structure but a feature that is suggestive of meaning. So, in an onomatopoeic word such as quack it is easy to see that the entire sign structure (/kwak/) is itself phonosemantic. Now, one can assume that the primary phonic element in that sign is the /k/ which seems to simulate the sounds emitted by the animal in question; it is this element, imprinted in the word, that then becomes suggestive at different levels of various interconnected meanings, from the name of the animal that emits the sound, duck (/dək/), to its metaphorical uses (quack as in “talk loudly”). The phoneme /k/ can now be defined as a modeling device, that is, as something that guides or perhaps triggers an association between sound and sense, as Jakobson & Waugh (1987) aptly described it. This implies a revision of the basic definition of the phoneme : it is a modeling device that leads to the creation of words and other structures. Its use as a feature of différence is a methodological or epistemological aspect of phonological systems; in other words, it has a function in language to keep units of meaning distinct. But its more fundamental use is as a modeling device, which implies that word signs can be thought of as composed of phonemic elements that are themselves suggestive of meaning. These suggestions, as it were, are then distributed throughout the levels of language.

By extension, signs that are based on different modalities, such as visual ones, can now be reconsidered in terms of MST. Although this is not the purpose here, it is likely that sensory modalities crisscross at the primary originating level of semiosis so that a phoneme might correspond to, say, a viseme, and that the two are isomorphic elements with a similar sign-making function.

As an example of how a primary system analysis of phonemic modeling would unfold, consider stop phonemes as they are found in words which refer to objects or actions perceived to involve “stoppage”, and continuants in words that refer to objects or actions perceived to involve “flow”. Here are other examples of this dichotomy in word construction (Crystal 1987 : 174) :

In MST there are three levels of modeling – primary, secondary, and tertiary. The above description refers to the primary level where phonemic modeling occurs – this is when the sound itself suggests meaning. Although these levels have been used to explain a broad range of semiosic phenomena, they can now be adapted to include the concept of phonemic modeling. As mentioned, this is a primary force in the creation of meaning and its expressive forms. It is the level at which phenomena such as onomatopoeia guide the formation of sign structures. The difference between sound symbolism theory and phonemic modeling theory is that in the latter each phoneme is already itself a pre-symbolic structure that bears with it suggestiveness of meaning. At a secondary level of modeling, this suggestiveness takes on a composite shape in the connection of words on the basis of the phonemic force of the originating word, and includes clustering, iconism, and the formation of phenomimes and psychomimes. This secondary level is the level of extension, as is the case with all secondary modeling systems. At this level, the same phonemically-constructed forms are now projected onto broader domains of meaning and structure that are sensed to have some affinity with the primary forms. Finally, at the tertiary level, associative structures are interconnected more intricately, producing metaphorical, metonymic, and other figural structures. While the original phonemic cue may not be consciously recognizable, by deconstructing such structures into their sound elements we can always recover it. So, a metaphorical expression such as “to duck under the bridge” can ultimately be decomposed into the /k/ phoneme as suggestive of the animal, its sounds, and the images that emanate from the associations with /k/ – abruptness, lowering (of the head), and so on.

The overall interconnection of the modeling systems can be broken down as follows :

  1. Primary modeling system – based on single phonemes (onomatopoeic, simulative);

  2. Secondary modeling system – extending the phonemic models (clustering, iconism, phenomimes and psychomimes);

  3. Tertiary modeling system – combining structures through association (metaphor and other figural assemblages).

A highly simplified and schematic example of how this framework might be seen to operate in conversations is as follows (based on Sebeok & Danesi 2000). Take the phoneme /s/; among various other domains of reference, it is suggestive of sounds that refer, for example, to the movement of snakes. Thus, it is a primary phonemic model for constructing or using words referring to various aspects of snakes – hiss, slither, slippery, and so on. At a secondary level, the word snake itself is a phonemic consequence of this modeling tendency. Here too, words that begin with /s/ are sometimes constructed with this referent in mind. Finally, the use of the word snake in figurative ways to describe human personality (e.g. “He’s a snake in the grass”) completes the modeling system. At this level, other word structures can easily be linked to the original concept via the /s/ phoneme which is part of their phonological make-up : scammer, serpentine, slithery, slimy, and so on.

Of course, other linkages and combinations can coalesce at the tertiary level. But, ultimately, they can be traced back to the phonemic modeling possibilities of /s/, producing an “associative network” of interrelated meanings. The following brief stretch of recorded conversation between two students on the University of Toronto campus (Danesi 1999) shows how this associative network unfolds even in a simple conversation, one which is guided by the /s/ device and its concatenations to other modeling systems :

Student 1 : You know, that prof is a real snake.

Student 2 : Yeah, I know, he’s a real slippery guy.

Student 1 : He somehow always knows how to slide around a tough situation.

Student 2 : Yeah, tell me about it! Keep away from his courses; he bites!

Like an organism, which is made up of atoms that combine into molecules and then into organs, human semiosis involves composition, with each element in the composition linked to the others in some specific way. Michel Foucault (1972) characterized such intertwining of forms and sense as an endless “interrelated fabric” in which the boundaries of meanings are never clear-cut. Of course, this whole explanatory framework can be critiqued as reducing verbal semiosis to a simplistic “phonic” view, but given the overwhelming evidence that has accrued on sound symbolism, it cannot be totally dismissed as an implausible hypothesis. The analogy here is with music : in a melody, a particular note (or key) has resonance and suggests all the other notes, which in turn are interconnected with one another. A phoneme is very much like a key in music – it starts the process of invention and then guides the entire composition of linguistic forms. While all this has been implicit in various approaches to sound symbolism, it has never been articulated as such, to the best of my knowledge.

While there is no current empirical evidence available to back up this model of verbal semiosis, it is, as mentioned, supported indirectly by previous empirical work on sound symbolism from which it has been extracted. Gaining such evidence is part of an agenda for future research. One could potentially write a “phonemic grammar” of a language, showing how all its levels, from the purely phonological to the syntactic and beyond, are derivatives of phonemic cues.

Concluding Remarks

The purpose here has been to argue that phonemic modeling is a primary force in language formation and cognition, suggesting that we use our senses to construct originating forms. One of the great conundrums in semiotics and linguistics is explaining how we can locate the ultimate source of semiosis in the body. The descriptive apparatus of how semiosis occurs has been developed in great detail since at least the nineteenth century, but a viable explanatory framework of why it occurs in the first place still seems to be clouded in vagueness. Phonemic modeling is an attempt to provide some clarity as a suggestive sensory-based framework for understanding the formation of signs and sign assemblages in language.

The primary task of any science is to explain how and why phenomena are the way they are by means of suitable models or theories and, as new facts emerge or are collected about the relevant phenomena, to adjust, modify, or even discard those models and theories on the basis of the new data. In this piecemeal and cumulative fashion, the ideal of science is to explain the “final causes” of the components of reality. Sound symbolism was, until recently, largely relegated to marginal status within general linguistics. As a result, a theoretical exploration of the “final cause” of this phenomenon fell by the wayside. However, since the advent of the embodied cognition movement of the late 1970s, the tide has turned, allowing theories such as the present one to be evaluated in the light of current scientific paradigms.

The notion of phonemic modeling arguably opens up a coordinating methodology for examining relations among subsystems (phonology, morphology, and so on) within language. It may also open up a common ground for future research that integrates linguistic and semiotic theories and methods with empirical psychological ones. An indirect source of evidence for the reality of phonemic models in language and cognition, for example, is in the world of advertising, where slogans, taglines, brand names and the like show a consistent isomorphism between phonemics and semantics (Danesi 2008). In a recent study, Aliyeh & Zeinolabedin (2014) chose English and Persian onomatopoeic structures randomly from different Internet sites and print sources as a basis for comparison. They concluded that some onomatopoeic activities in Persian and English were different, but that these differences could be traced to the different species of animals, the different phonological or morphological systems of each language, and other such factors. Overall, however, the differences were minor. They also found a pattern of phonosemantic similarity in the ads in both languages, which were very similar in how they conveyed moods, emotions and actions through a common system of phonemic modeling. This area has received very little attention and may be a critical one in assessing how phonemic modeling works across domains of pragmatic use, from conversational structure to advertising.

There are many questions that phonemic modeling raises. I would like to suggest that only through empirical research can the hypotheses put forward here be either corroborated or refuted. The study of semiosis as a study in modeling systems provides a specific agenda for conducting research that is truly interdisciplinary and apt to produce interesting results. The attractive aspect of MST is that it allows us to study semiosis in all its manifestations as a bodily-based phenomenon that produces interconnected modes of meaning-making.