Review of Making Sense of Recordings: How Cognitive Processing of Recorded Sound Works by Mads Walther-Hansen, Oxford University Press 2020.

Stephen S. Hudson


Mads Walther-Hansen’s admirably short monograph on cognitive metaphors for sound quality and timbre may not be filed under music theory, but lies just outside of a disciplinary boundary that is rapidly expanding towards it. This condensed theoretical text doesn’t look like music theory traditionally has; it contains no score examples, it analyzes a corpus of technical and journalistic writing about music instead of patterns of pitch and duration, and it cites very few card-carrying members of SMT. It has been published by Oxford, but not under their Studies in Music Theory series—and yet, it is an important contribution to the field that resonates with recent trends, especially in embodied cognition and topic theory. In this review, I will examine how Walther-Hansen’s book is compatible with established music-theory epistemologies, and propose ways in which aspects of his model could be adapted to make an even clearer fit for music theory’s norms of systematicity and rigor. Finally, I will discuss the advantages and consequences of this field including more research that departs from its traditional methodologies, which have usually focused on segmentation and classification of patterns of notes in a score. Specifically, music theory must include more research like Walther-Hansen’s work on cognitive metaphors if it is to describe listeners’ musical intuitions and experiences with a substantial degree of completeness or veridicality.

1.      Cognitive Metaphors and Music Theory

The goal of music theory has famously been described as a “formal description of the musical intuitions of a listener who is experienced in a musical idiom” (Lerdahl and Jackendoff 1983, 1; italics are original). Traditionally, most music theory has formalized musical knowledge as patterns of notes found in a score; then in analysis, these familiar patterns are demarcated in a score, and the resulting segmented score is often described as a map of composers’ or listeners’ musical expectations, understanding, or experience. Walther-Hansen describes a very different kind of musical knowledge, and provides a very different kind of formalization, in terms of “cognitive metaphors” (also known as “conceptual metaphors”), analogical mappings which we use to understand and experience one idea or domain of experience in terms of another. (Some famous examples include TIME IS MONEY, according to which one can save time, spend time, bank time, etc.; ARGUMENT IS WAR, by which arguments are experienced as conflicts, and one can take sides, concede territory, and win or lose; and HAPPINESS IS UP, which means that sadness is down, motivating thoughts and expressions such as “she’s feeling down today” or “he’s over the moon.”) Walther-Hansen’s theory explains how we use cognitive metaphors to perceive sound quality, which he describes as “the timbral characteristics of the sound as it emerges in experience, rather than the characteristics of the sound source or the physical properties of the sound wave” (2020, 8).

“Cognitive metaphors” is a term made famous by Lakoff and Johnson (1980), and Walther-Hansen uses this term to describe how we process music by understanding sound in terms of other domains of knowledge or experience. Following Lakoff and Johnson, Walther-Hansen argues that metaphors are not merely poetic comparisons in artful language, but form the basic structure of virtually all human cognition of sound, ranging from our basic understanding of what sound is to timbral qualities like “heavy” or “wet.” (Walther-Hansen generally uses italics for sound qualities, even when they are considered as metaphors; metaphor names are in all-caps only when “functioning in the background of our cognitive system.” See Walther-Hansen 2020, 127.) Some of these metaphors directly invoke physical qualities, such as when “heavy” sounds feel powerful and are low in pitch, qualities associated with large objects in human experience; but Walther-Hansen also argues that cognitive metaphors can be more abstract mappings with less immediate connection to physical experiences (2020, 3). For example, “wetness” does not describe a quality of sounds made by wet objects, but by convention refers to the amount of reverberation, and this mapping is one which is usually learned from encounters with the discourse of sound producers, rather than one which is intuitively understood.

Cognitive metaphors may also be used to understand unfamiliar, unconventional, or even unreal descriptions (see Walther-Hansen 2020, 51). For example, a recent obituary of ZZ Top’s late bass player, Joe Michael “Dusty” Hill, quoted Hill’s crude, self-effacing parody of rock guitarists’ self-obsessed equipment-talk as a testament to Hill’s shy but off-color character: “Someone once asked me to describe my tone, and I said it was like farting in a trash can. What I meant is it’s raw, but you’ve got to have the tone in there” (Risen 2021). Of course, Hill’s guitar tone doesn’t literally sound like farting in a trash can, and is quite easily recognizable as a bass guitar; but this description directs our attention towards particular salient aspects of bass guitar timbre and allows us to experience them in a new way. Even if we ourselves have never been in the situation Dusty Hill described, we can imagine what he might have meant by drawing on our previously separate experiences of flatulence and of large metal cans. Part of why Hill’s joke is so effectively raunchy is that our understanding of his timbral description is already physical, even though it references an imaginary experience. At the same time, it serves as an immediately understood and highly memorable metonym for Hill’s bass guitar tone, a metaphor that (for me, at least) gives vivid character and renewed physical impact to a sound that had not previously caught my attention.

2.      Chapter Outlines

Walther-Hansen’s book proceeds from the assertion that any conscious perception we have of timbre is filtered through cognitive metaphors for physical qualities. To that end, the book is split into two halves, Part I “Foundations and Theory,” followed by a concrete survey of conceptual metaphors in Part II “Encyclopedia.” Before Part I, the Introduction explains the concept of “cognitive metaphors,” and justifies the author’s choice to focus exclusively on these metaphors rather than sound spectra (this may be one reason why the author engages so little with existing analytical studies of timbre, which often are grounded in spectrographic analysis, such as Cogan 1984, Fales 2002, Berger and Fales 2005, and Lavengood 2020). Chapters 1–3 in Part I subsequently develop this framework, contextualize it in history, and interface with other fields. Chapter 4, “An Encyclopedia of Selected Sound Terminology,” comprises all of Part II and defines the most common cognitive metaphors from the author’s corpus of writing about sound recordings.

Chapter 1 explores the evolution of sound recording media and traces the development of one particular cognitive metaphor for sound, showing how the early assumption that a sound recording captured a more-or-less veridical record of reality (the THERE IS ONE REALITY metaphor, 32) gradually evolved into an understanding that sound recording could capture that reality from many different perspectives, and even fabricate sonic unrealities (the MULTIPLE REALITIES metaphor, 41). Many of the examples in this chapter are well-rehearsed objects and scenes from sound studies scholarship on the history of recording and listening (such as Sterne 2003). Walther-Hansen’s contribution here is to formalize the evolution of particular cognitive metaphors during this history. In doing so, he introduces a key move from cognitive linguistics that underlies much of this book, the argument that the structure of discourse represents the structure of cognition; the evolution in words for describing how sound represents reality indicates an evolution in ways of thinking.

Chapter 2 explores “ontological metaphors,” or metaphors we use to understand the nature of sound itself. For example, we often talk about something happening “in” the sound, or something “sticking out too much” from the sound; these descriptions draw on the SOUND CONTAINER metaphor (2020, 60–62). This chapter also explores how the framing of sound in terms of cognitive metaphors relates to previous scholars’ philosophical positions on whether we hear sounds as representations of real-world objects, or as purely sonic events (for example, the latter position is represented by Pierre Schaeffer’s “acousmatic listening”; see Schaeffer 1966, Chapter IV).

Chapter 3 extends the methods of the previous chapters to nonverbal dimensions, exploring cross-domain mappings between timbre and color, physical shape, and smell. This chapter is more speculative than the other chapters; for example, several of the author’s arguments culminate in a prediction that the future of audio interfaces will represent sounds as tactile, physical shapes through the SOUND CONTAINER metaphor, replacing the now dominant SIGNAL FLOW metaphor in sound engineering. Lakoff and Johnson’s concept of “image schemas,” referenced throughout the book, is discussed at more length here to explore our sensorimotor experiences of sounds. This discussion resonates with music theorists’ similar applications of the idea (for example, Cox 2016 and Zbikowski 2017), though the author does not cite this work. The chapter ends with a compelling argument that cognitive processing of sound quality works best if metaphors, discourse, action, etc. have the greatest possible fit or resonance between different sensory domains.

Throughout this book, but especially in Chapter 3, Walther-Hansen often considers the actions and experiences of producers and sound engineers, while rarely discussing fan culture in much depth. Cognitive metaphors for timbre have an enormous impact on fans’ listening experiences and relationship with their favorite styles, so this omission is a critical missed opportunity to demonstrate the power and relevance of Walther-Hansen’s framework. Examples include experiences of rough timbre as “threat” and “violence” in death metal culture (Wallmark 2018), or the use of tape and vinyl sounds as markers of pastness or nostalgia in hip-hop (Harrison 2006; Fouché 2011). Another missed opportunity is Walther-Hansen’s omission of listeners’ perceiving actions from his discussion of how actions can be used to understand sound, not just shape it (2020, 73). In my own research, I argue that metal listeners’ headbanging creates and amplifies experiences of heaviness, by adding corporeal impact to whatever is already heard in the sound (Hudson forthcoming). But emphatic listener motion doesn’t have to be heavy; it can engage other conceptual metaphors. In the music video for the 2007 rap hit “Pop, Lock & Drop It” by the late St.-Louis-based artist Huey, the female dancers “drop it,” squatting down and bouncing up simultaneously with the bass hits on beat 2 while the chorus vocals repeat the song title. Whether we join the dance or just watch the video, this butt-drop surely contributes something significant to our multi-modal understanding of the bass’s weight, motional quality, and meaning, and it definitely adds something different than the heaviness of headbanging.

The methodology of music theory (as articulated in the quotation above from Lerdahl and Jackendoff 1983) is most clearly approached in Chapter 4, a brief “Encyclopedia” which explores 15 opposing pairs of sound qualities such as “Dark/Bright” and “Clean/Dirty.” Each entry has the following sections: “Metaphor” describes the domains which the metaphor maps sound onto (emotion, physical and spatial qualities, etc.), “Physical Signal” briefly describes the sonic attributes associated with this metaphor, and “Discourse” sketches the metaphor’s usage in technical and critical writing. Finally, each entry includes a table of binary characteristics which are entailed by this metaphor. For example, the table for “Clean/Dirty” lists under the header “Clean sound,” “Is non-distorted; Sounds sterile (unexciting); Is noise-free; Is unoffensive,” while in the opposing column “Dirty sound” it lists the opposite qualities, “Is distorted; Does not sound sterile (exciting); Is noisy; Is morally unclean/offensive” (2020, 95). In sum, the cultural meanings and physical characters evoked by these descriptors are certainly “musical intuitions of a listener who is experienced in a musical idiom,” and Walther-Hansen’s detailed encyclopedia entries are certainly “formal descriptions” of these intuitions.

3.      Adaptation for Musical Analysis

Musical analysis usually works through segmenting and labelling a musical score or an auditory experience, as Dora Hanninen (2001, 2012) has highlighted. Walther-Hansen’s short encyclopedia provides an understandably coarse-grained taxonomy, which is arguably not yet a rigorous segmentation method for musical analysis (to be fair, this is outside the stated scope of his book). Below I suggest additional degrees of systematicity that help this encyclopedia meet some recommendations Hanninen has for theories of segmentation and analysis.

Hanninen suggests that in the most general sense, “Music analysis might be described as the conceptualization and representation of musical relationships” (2001, 345), which can include relationships between individual notes, but also relationships between different kinds of musical objects or concepts. Hanninen argues that music analysis’s descriptions of these relationships can be more powerful when there are clearer criteria for demarcating these objects or concepts and distinguishing between them, because this additional rigor can “open up the possibility for precise and reasoned intersubjective discourse about how…analytic interpretations differ, and about ambiguity, richness, and multiplicity of hearings” (2001, 346).

Adding more systematicity to Walther-Hansen’s method enables it to speak to the richness and plurality for which Hanninen advocates, by elucidating both the relationships between different cognitive metaphors for timbre, and the criteria for their distinction. One way to make Walther-Hansen’s account more systematic is to import additional tools from cognitive linguistics to describe the relationships between concepts and linguistic objects, such as Ronald Langacker’s (2002) account of schematicity and interrelated senses of lexical items. “Lexical items” is a broad category which includes parts of words, single whole words, and conventional chains of words that together form a language’s basic vocabulary. Langacker argues that our understanding of any one of these items is more complex than a single definition, but is better depicted as a network of related metaphors, derivative terms, and more fine-grained distinctions.

The precise configuration of such a network is less important than recognizing the inadequacy of any reductionist description of lexical meaning. A speaker’s knowledge of the conventional value of a lexical item cannot in general be reduced to a single structure, such as the prototype or the highest-level schema. For one thing, not every lexical category has a single, clearly determined prototype, nor can we invariably assume a high-level schema fully compatible with the specifications of every node in the network. (Langacker 2002, 2–3)

Similarly, timbres and the cognitive metaphors we use to understand them are often best conceived as a “considerable array of interrelated senses” (Langacker 2002, 2) in which each sound quality consists of a network between several overlapping metaphors, related near-synonyms, and diverse resonances and associations with other dimensions of experience. Walther-Hansen’s encyclopedia definitions are already admirably multi-layered and multi-modal, but an application of his ideas to musical analysis must recognize the breadth of overlap between related cognitive metaphors as well as the depth of more fine-grained distinctions.

For example, the cognitive metaphor for “Heavy” overlaps considerably with “Dark,” “Hard,” and “Rough.” While these are not identical metaphors, most instances of “Heavy” arguably also draw on one or more of the other three metaphors. Additionally, in Walther-Hansen’s definitions, these four cognitive metaphors share many overlapping entailments, as I’ve mapped out in Figure 1. For example, “Heavy,” “Hard,” and “Rough” sounds all entail apparent force or effort; “Heavy” and “Dark” sounds are both low in pitch; etc.

Figure 1. Four cognitive metaphors with their overlapping entailments. Top row: cognitive metaphors for sound quality; Bottom row: entailments / characteristics from other domains of experience. Based on Walther-Hansen’s encyclopedia definitions (Chapter 4). Dotted lines represent two additional entailments I added: rough sounds are often literally loud or imply loudness, and heaviness is often associated with badness or evil.

Additionally, a single metaphor like HEAVY operates in the background for a large network of related sound qualities with distinct connotations and associations, which often are not entirely represented within a single definition or term. Figure 2 takes a few of the large number of senses for HEAVY used within the metal genre, grouped into two categories by speed. The Heavy & Fast category is also closely related to another background metaphor, HARD. The broad metaphor of HEAVY could be described as a kind of schema which passes on many entailments (like size, weight, impact, etc.) to each of the more specific senses (such as brutal, thunderous, adrenalized, etc.). But many of these individual senses resonate with other metaphors as well, and those other metaphors could be viewed as schematic for these individual terms. For example, “funereal” could be described as a finer sense of both HEAVY and DARK. This network represents a diverse and multidimensional space of interrelated senses, which cannot be reduced to a single definition for HEAVY; for example, “funereal” and “adrenalized” are practically opposite in meaning, but both are senses of HEAVY which apply this metaphor in divergent ways to create their distinct qualities of physical impact.

Figure 2. Network of senses of the cognitive metaphor HEAVY. Square boxes contain cognitive metaphors. Shaded circles provide two distinct senses of “heavy” categorized by the characteristic of speed. Individual descriptive terms are in normal text. Dotted lines show that a term draws on a specific metaphor. Double-dashed line indicates that HEAVY and HARD are closely related metaphors; both metaphors are activated by the sense “Heavy & Fast.”

4.      Consequences for the Field

But there’s something else that theorists can get from this book, besides a new method of segmentation analysis. Music theory has traditionally focused on the syntax of notes and patterns of notes, but there are many other kinds of intuitions listeners make beyond distinguishing between different musical pattern-objects and construing syntactic structure. The greatest strengths of Walther-Hansen’s approach to timbre may lie not in distinguishing between different timbres, but in mapping out the qualities and experiences invoked by those timbres, and explaining timbre’s instantaneous and compelling pull over us. In other words, Walther-Hansen’s work points towards a new direction for music theory and analysis, but one which is still within the scope of formalizing listeners’ intuitions: instead of segmenting and labelling different regions of a score or temporal experience, or elucidating principles of syntax, music theory and analysis can investigate the rich web of metaphors and concepts that listeners might bring to understanding individual musical qualities such as timbre, harmony, melodic motion, or topic.

If Lakoff and Johnson’s arguments about cognitive metaphor are correct, and virtually all cognition involves metaphor, then music theory needs more cognitive metaphor research like Walther-Hansen’s book if it is to map the “intuitions of an experienced listener” and describe our musical experiences. And music theory is already moving in this direction; musical meaning and embodied cognition have become hot topics in the last decade, to the point that many of Walther-Hansen’s non-music references are already well-cited in some areas of music theory. In fact, though it is not framed in this way, the recently ascendant subfield of topic theory is already a kind of cognitive metaphor theory, as it maps how music can metaphorically represent and evoke affects, images, people, and even other music, although unlike Walther-Hansen’s theory of timbre, topic theory still shares music theory’s traditional locus of note patterns in scores. While the field of music theory may once have prioritized “abstract principles” of musical structure, over the last decade or two more and more attention has been devoted to concrete and situated explorations of cognition and experience. Fully realizing this scope will mean including more research that works in new modes other than dividing scores into discrete segments or describing the syntax of those note patterns. It’s a substantial shift, but one the field seems poised to make.

Stephen S. Hudson
University of Richmond
455 Westhampton Way
University of Richmond, VA 23173


Berger, Harris M., and Cornelia Fales. 2005. “‘Heaviness’ in the Perception of Heavy Metal Guitar Timbres: The Match of Perceptual and Acoustic Features over Time.” In Wired for Sound: Engineering and Technologies in Sonic Cultures, edited by Paul D. Greene and Thomas Porcello, 181–97. Wesleyan University Press.

Cogan, Robert. 1984. New Images of Musical Sound. Harvard University Press.

Cox, Arnie. 2016. Music and Embodied Cognition: Listening, Moving, Feeling, and Thinking. Indiana University Press.

Fales, Cornelia. 2002. “The Paradox of Timbre.” Ethnomusicology 46 (1): 56–95.

Fouché, Rayvon. 2011. “Analog Turns Digital: Hip-Hop, Technology, and the Maintenance of Racial Authenticity.” In The Oxford Handbook of Sound Studies, edited by Trevor Pinch and Karin Bijsterveld, 505–25. Oxford University Press.

Hanninen, Dora A. 2001. “Orientations, Criteria, Segments: A General Theory of Segmentation for Music Analysis.” Journal of Music Theory 45 (2): 345–433.

———. 2012. A Theory of Music Analysis: On Segmentation and Associative Organization. University of Rochester Press.

Harrison, Anthony Kwame. 2006. “‘Cheaper than a CD, Plus We Really Mean It’: Bay Area Underground Hip Hop Tapes as Subcultural Artefacts.” Popular Music 25 (2): 283–301.

Hudson, Stephen S. 2022 (forthcoming). “Bang Your Head: Construing Beat Through Familiar Drum Patterns in Metal Music.” Music Theory Spectrum 44 (1).

Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By. University of Chicago Press.

Langacker, Ronald W. 2002. Concept, Image, and Symbol: The Cognitive Basis of Grammar. 2nd ed. Mouton de Gruyter.

Lavengood, Megan L. 2020. “The Cultural Significance of Timbre Analysis: A Case Study in 1980s Pop Music, Texture, and Narrative.” Music Theory Online 26 (3).

Lerdahl, Fred, and Ray Jackendoff. 1983. A Generative Theory of Tonal Music. The MIT Press.

Risen, Clay. 2021. “Dusty Hill, Long-Bearded Bassist for ZZ Top, Dies at 72.” The New York Times, July 28, 2021.

Schaeffer, Pierre. 1966. Traité des objets musicaux: essai interdisciplines. Éditions du Seuil.

Sterne, Jonathan. 2003. The Audible Past: Cultural Origins of Sound Reproduction. Duke University Press.

Wallmark, Zachary. 2018. “The Sound of Evil: Timbre, Body, and Sacred Violence in Death Metal.” In The Relentless Pursuit of Tone: Timbre in Popular Music, edited by Robert Fink, Melinda Latour, and Zachary Wallmark, 65–87. Oxford University Press.

Zbikowski, Lawrence M. 2017. Foundations of Musical Grammar. Oxford University Press.