1 Physics of Sound
Alexander Pillai
Learning Objectives
- Define sound and describe how it travels from a source to our ears.
- Describe music as a complex wave.
- Discuss how we hear music.
- Recognize how our brains dissect incoming auditory information in order to understand music.
Introduction
Think about what a simple rock song consists of: drums, a guitar, a bass guitar, and vocals. How does a single speaker transmit all of this information to our ear? We know sound travels as a wave, but now imagine one sound wave containing the melodic, harmonic, and rhythmic information of each instrument. Now think about what happens when this mysterious wave reaches our ear. Regardless of how all of that information was packaged into a single wave, we now have to figure out a way of extracting it in order to hear the song. Our ears and brain must take this single wave and extract the drums, guitar, bass guitar, and vocals and process this information in a way that allows it to make sense—allowing us to hear the song.
In his TED talk, Michael Tilson Thomas described classical music as “a dialogue between the two powerful sides of our nature: instinct and intelligence” (Thomas, 2012). However, this definition can be applied to all kinds of music. Humans are born with a variety of reflexes, including some related to sound, such as turning our heads toward the source of a sound (“Turning the Head Toward a Sound”, 2009). In other words, we are born with instinctual responses to sounds. As we develop, our musical abilities are refined dramatically, and by childhood, we can effectively listen to and understand music. Thus, we acquire and develop our intellectual responses to sounds. It is this interplay between instinct and intellect that helps us break down and process music.
Physics of Sound
Before understanding how our brains take apart and comprehend music, we must understand how we hear the music. And before this, we must understand what sound is and how sound travels (albeit on a basic level). To do this, we must look to physics. Sound waves are classified as traveling longitudinal waves (Knight, 2019). In order to understand what that means, it is easier to break it down into its two components: longitudinal waves and traveling waves.
To learn about longitudinal waves, we must look at the basics of the wave itself. Mechanical waves (which include sound waves) rely on the movement of the medium through which the wave travels (Knight, 2019). In the case of sound waves, the most common medium is air. Thus, a sound wave is not something that travels through air molecules; rather, it is the specific movement of the air molecules themselves. Sound waves are classified as longitudinal waves because of the specific way that the air molecules are disturbed (Knight, 2019): the air molecules move parallel to the motion of the overall wave. To imagine this, think of a spring. When you push on one end of the spring, the compression travels through the spring to the other end. The spring itself did not move, and no special object was forced through the spring. Instead, this longitudinal wave resulted from the parallel motion of the medium (the spring). Compare this to a rope, through which a wave can travel if you snap the end of the rope up and down. In this case, the movement of the rope is perpendicular (up and down) to the movement of the wave (sideways); such waves are called transverse waves.
Sound waves arise from special movements in air molecules. But how do these movements translate into a wave that can travel from a source (such as a speaker) to our ears? To understand this, we can delve into the second classifier of sound waves: traveling waves.
We can easily do this by imagining a speaker. The speaker cone moves back and forth in accordance with the incoming signal—but for right now we do not need to worry about that signal. As the cone moves forward, it compresses the air molecules directly in front of it. This increases the local air pressure in that region. However, the molecules naturally tend to return to a more stable, lower pressure. To do so, these compressed air molecules push against nearby molecules, which then compress in turn. This cycle of compression repeats with each adjacent region of molecules, generating a region of compressed air that travels in a particular direction. As the speaker cone moves in and out, it produces repeated regions of compression; in between these compressions are regions of lower pressure called rarefactions (Knight, 2019).
In this GIF, sound waves are traveling to the right. At point A, the speaker is moving back and forth, creating these waves. At point B, the shaded regions are areas of compression and the white regions are rarefactions. Vibrations (i.e., the alternating pattern of compressions and rarefactions) travel through the air. Once these vibrations reach our ears, our eardrums will vibrate and we will process the vibrations as sound.
Basic physical features of these compressions relate to the sound information that these waves carry. Two basic elements are frequency and amplitude. The frequency is defined as the number of compressions per second, and it directly relates to the pitch of the sound: as the frequency increases, so does the pitch. Frequency is measured in Hertz (Hz); a common example of the relationship between frequency and pitch is the note A above middle C, which has a frequency of 440 Hz. A related characteristic of sound waves is their period, which refers to the time it takes for one cycle of the wave to complete and is the inverse of the frequency. The second basic feature of sound waves is their amplitude. This refers to the intensity, or the amount of pressure change occurring between the rarefactions and compressions, and it translates to the loudness of the sound: as with frequency and pitch, as the amplitude increases, so does the loudness (Hansen, 2001). These two features are critical for understanding how musical information is carried in waves. From amplitude and frequency, the loudness and pitch of a sound can be determined, and soon we will see how these waves can combine to transmit all of the information in a song simultaneously.
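These relationships are easy to express in code. The short Python sketch below (the function names are our own, purely for illustration) computes a tone's period from its frequency and the instantaneous pressure deviation of a simple sinusoidal tone:

```python
import math

def period(frequency_hz):
    # The period (seconds per cycle) is the inverse of the frequency
    # (cycles, i.e., compressions, per second).
    return 1.0 / frequency_hz

def pressure_at(t, frequency_hz, amplitude):
    # Instantaneous pressure deviation of a simple sinusoidal tone:
    # the amplitude scales the size of the pressure swing (loudness),
    # while the frequency sets how fast it oscillates (pitch).
    return amplitude * math.sin(2 * math.pi * frequency_hz * t)

# The A above middle C completes 440 cycles per second,
# so each cycle lasts about 2.27 milliseconds.
print(period(440))
```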
The simplest sound wave is a pure tone, such as a single note produced by a computer. This sound wave is a perfect sinusoid at exactly the specified frequency. In reality, most sounds are not like this. Think again about the note A at 440 Hz. A computer can produce a simple tone that corresponds exactly to 440 Hz. However, this note can also be produced through a variety of means, including a piano, a guitar, and the human voice. And from experience, we know that each of these sounds different even though they are all producing the same note. Why is this so? The answer lies in what are called complex sound waves.
Complex sound waves can be thought of as the combination of multiple simple sound waves added together. While the exact mechanisms for how this is accomplished can become quite complicated, we can investigate this phenomenon on a more basic level in order to gain a baseline understanding of how music is transmitted via one complex sound wave. Complex sound waves make up most of the sounds we hear every day—including those of music. These waves arise from the superposition of simple waves. In essence, the combination of each simple wave's compressions and rarefactions generates a single complex wave with less regular intervals between compressions and rarefactions, as well as less regular changes in amplitude.
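Superposition itself is just addition: at every instant, the pressure of the complex wave is the sum of the pressures of its component waves. A minimal Python sketch (the helper names and the choice of two component tones are ours, for illustration only):

```python
import math

def simple_wave(freq_hz, amplitude, t):
    # A pure tone: a single sinusoid at one frequency.
    return amplitude * math.sin(2 * math.pi * freq_hz * t)

def complex_wave(components, t):
    # Superposition: the complex wave is simply the sum of its simple parts.
    return sum(simple_wave(f, a, t) for f, a in components)

# Two simultaneous tones, e.g. A4 (440 Hz) and a quieter E5 (659.25 Hz),
# sampled at the CD rate of 44,100 samples per second.
chord = [(440.0, 1.0), (659.25, 0.8)]
samples = [complex_wave(chord, n / 44100) for n in range(100)]
```

Plotting `samples` would show the irregularity the text describes: because the two tones compress and rarefy the air at different rates, their sum no longer has evenly spaced peaks.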
With this basic understanding of complex sound waves, we can now more confidently revisit the differences between an A note (440 Hz) on a piano and a guitar. It is obvious that there is some level of difference between these sounds and the sound of a pure 440 Hz tone from a computer. This difference arises from the waves generated by pianos and guitars: these instruments (like most other natural sound sources) generate complex sound waves. Both the piano and guitar produce complex sound waves corresponding to the note A at 440 Hz. However, in addition to the clear differences between these instruments and a computer tone, we still have the ability to distinguish between a piano and a guitar. This is because while the basic elements of the guitar and piano complex waves are similar (specifically, they share a basic period), each instrument's sound waves differ in their finer structure. The specific qualities unique to each instrument's complex sound waves generate that instrument's timbre, or the quality of its sound. Thus, we use the basic elements of a wave to determine pitch while using the associated complex elements to identify the source, or the timbre.
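One common source of this finer structure is harmonics: component tones at integer multiples of the fundamental frequency, mixed in different proportions by different instruments. The sketch below (the two harmonic "recipes" are invented for illustration, not measured from real instruments) builds two waves that share the period of A440, and hence its pitch, but differ in shape, and hence in timbre:

```python
import math

def tone_with_harmonics(fundamental_hz, harmonic_amps, t):
    # Sum the fundamental and its integer multiples (harmonics),
    # each weighted by an instrument-specific amplitude.
    return sum(amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * t)
               for k, amp in enumerate(harmonic_amps))

# Two hypothetical instruments playing A440: the same fundamental
# (so the same pitch and period) but different harmonic weights
# (so a different waveform shape, heard as a different timbre).
bright_instrument = [1.0, 0.1, 0.6, 0.3]
mellow_instrument = [1.0, 0.5, 0.25, 0.12]
wave_a = [tone_with_harmonics(440, bright_instrument, n / 44100) for n in range(200)]
wave_b = [tone_with_harmonics(440, mellow_instrument, n / 44100) for n in range(200)]
```

Because every harmonic completes a whole number of cycles in one period of the fundamental, both waves repeat every 1/440 of a second, which is why they sound like the same note.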
Now that we have a basic understanding of sound waves and how they carry information from a source to our ears, we can uncover how we actually hear these sounds and make sense of the information they transmit.
For an additional explanation of sound waves visit https://pudding.cool/2018/02/waveforms/
Hearing Music
We rely on two systems to process incoming auditory stimuli: the auditory system and the brain. In this section, we will focus on the former: the various parts of the auditory system, how they work, and how sound information is transmitted from the outside world to the brain.
The human auditory system is primarily centered around the ear, which is divided into three distinct parts: the outer, middle, and inner ear (Swenson, 2006). The outer ear, which comprises the visible parts of the ear and the ear canal, directs sound waves from the external environment to the middle and inner ear. The handoff between the outer and middle ear occurs at the tympanic membrane, or eardrum (Swenson, 2006). The function of the tympanic membrane is to transmit incoming sound waves safely to the rest of the middle and inner ear. To do so, the tympanic membrane vibrates according to the incoming sound waves, and its vibrations are transmitted through three small bones (the ossicles) that comprise the middle ear. Together, the tympanic membrane and ossicles conduct the sound waves from outside of the body to the inner ear, where they are converted to signals for the brain (Swenson, 2006).
The inner ear houses the primary sensory organ of the auditory system: the cochlea. The cochlea is a fluid-filled organ containing numerous sensory hair cells that are linked to nerves that eventually travel to the brain (Swenson, 2006). As sound travels to the ear, the middle ear conducts the sound waves to the cochlea, where they then travel through the fluid of the organ. The cochlea responds to different frequencies because different frequencies vibrate different regions of the cochlea; in other words, frequencies are mapped to positions along the cochlea. When a region vibrates (due to a specific external frequency), hair cells in that region transmit signals to the brain (Swenson, 2006). Thus, when a complex wave comprised of two different frequencies reaches the cochlea, hair cells corresponding to each frequency will be activated and send signals to the brain, allowing us to detect and interpret complex sound waves.
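The cochlea's frequency-to-place mapping is loosely analogous to a Fourier analysis, which reports how strongly each frequency is present in a complex wave. The Python sketch below is a rough analogy, not a model of the ear: it uses a naive discrete Fourier transform written from scratch, and the threshold and sampling choices are arbitrary values picked for the demonstration. It recovers the two component frequencies of a two-tone wave:

```python
import cmath
import math

def spectrum_peaks(samples, sample_rate, threshold=0.25):
    # Naive discrete Fourier transform: for each candidate frequency bin,
    # measure how strongly that frequency is present in the signal, and
    # report the frequencies whose normalized strength exceeds the threshold.
    n = len(samples)
    peaks = []
    for k in range(1, n // 2):
        coeff = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        if abs(coeff) / (n / 2) > threshold:
            peaks.append(k * sample_rate / n)
    return peaks

# A complex wave built from a 440 Hz tone and a quieter 880 Hz tone,
# sampled for 0.1 seconds at 8,800 samples per second.
rate = 8800
wave = [math.sin(2 * math.pi * 440 * t / rate) +
        0.5 * math.sin(2 * math.pi * 880 * t / rate)
        for t in range(rate // 10)]
print(spectrum_peaks(wave, rate))  # reports both components: [440.0, 880.0]
```

Just as hair cells at two different places along the cochlea would fire for this wave, the analysis reports energy at exactly the two component frequencies.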
Understanding Music
The auditory system allows us to dissect complex waves from our environment and send these interpreted signals to our brain; however, this is only the first step in how we process these sounds. The process of understanding music can best be described as:
Multiple successive processing stages represent the collections of perceptual features corresponding to a particular instrument or melody; disambiguate simultaneous instruments and melodies; link these representations with stored musical memories and knowledge; import information from other cognitive domains; and ultimately, [program] an appropriate [behavioral] response (Warren, 2008).
Thus, to properly uncover the processes for how we understand and interpret the sounds we hear, we first must understand our memory systems, how auditory memory works, and how this is combined with the auditory system in order to understand music.
Human memory can be divided into three distinct yet interconnected systems: sensory memory, short-term memory, and long-term memory (Camina, 2017). In the basic model of human memory, initial sensory information (i.e., visual, auditory, and kinesthetic information) about our environment and experiences is first stored in the sensory memory system. This represents a basic encoding of the information and makes it possible for this information to be manipulated and further stored if desired. Information stored in sensory memory persists for less than one second before fading, as this system only represents the basic encoding of the multitude of sensory information that we are exposed to on a regular basis.

The next level of human memory is short-term memory. This system involves the storage and manipulation of information, acts as a central mediator between sensory memory, long-term memory, and certain decision-making processes, and deals with slightly longer-term storage of information than sensory memory (20-30 seconds). In addition to its short-term storage abilities, the short-term memory system is also involved in processing information used in executing responses or functions. To do this, short-term memory can receive information from both sensory and long-term memory and relies on a system known as working memory to handle the manipulation of this retrieved information.

The last level of memory is long-term memory, which is responsible for storing crucial information for a much longer period of time (possibly even indefinitely). Long-term memory can be further divided into two retrieval mechanisms: explicit and implicit memory. Explicit memory involves the conscious recall of information and is further divided into episodic memory (dealing with personal memories) and semantic memory (dealing with factual information).
Conversely, implicit memory revolves around the unconscious recall of information, such as how to use the body or an object. Examples of implicit memories include habits and practiced skills (such as driving a car or using a pencil).
With this basic understanding of human memory, it is now possible to delve deeper into the processes involved in the processing of music. The fundamental goal of auditory perception is to “recognize the plausible physical causes of incoming sensory information” (Agus, 2010). To accomplish this, our brains learn to recognize certain features of sounds and form associations between these elements and a specific source. This is partially the role of auditory working memory: to match specific features of incoming complex sound waves with information stored in memory. With this system in place, we can readily match these sound features with sources that we recall from memory in order to make sense of the incoming sounds.
However, what “specific features” are used in auditory working memory in order to match incoming sounds with memories and information previously stored in the brain? In a 2009 review, Daniel Levitin identified eight basic elements of these auditory stimuli that are used in our perception of sound: pitch, rhythm, timbre, tempo, meter, contour, loudness, and spatial location (Levitin, 2009). As explained previously, all of this information can be transmitted through a single complex sound wave to our ears. These elements can be combined and applied in different ways to create a variety of sounds and music—and these variations in application help to differentiate the music of different cultures (Levitin, 2009).
When faced with a complex sound wave, the brain first relies on the more fundamental of the aforementioned elements, such as pitch, duration, and loudness (Warren, 2008). With this information and more nuanced features of the complex waves, the brain uses the auditory working memory system to assist in processing and identifying significant aspects of these waves. Through complex interactions of various brain regions, we can quickly identify the sources of sounds (such as instruments), link incoming sounds with past memories associated with similar or the same sounds (such as past emotions and experiences with similar music), and execute responses to the sounds. In fact, while the brain processes a sound for its musical elements, a separate system simultaneously processes the music for its emotional elements (Warren, 2008). The combination of these two processing systems helps to coordinate and direct our emotional and behavioral responses to sounds and music while also helping to shape future behavior (such as whether we will continue to listen to this music or avoid it).
How We Hear a Song
Throughout this chapter, we have uncovered the characteristics of sound, our auditory system, and our memory, and applied these elements to one central question: how do we hear music? You should now have an understanding of the basics of sound waves and how they transmit information from a source to our ears. We dissected the human auditory system to understand how it responds to incoming sound waves and translates them into signals for the brain. Lastly, we dove into human memory to understand its role in processing incoming auditory information, learning how the different levels of memory work and how they work together to make sense of the sounds we hear.
With all of this, we can now try to tackle our central question: how do we hear a song? First, a speaker creates a complex sound wave that combines all of the instruments and other musical elements of the song into one unit. The ear receives these complex sound waves and transmits their information to the brain, including the sounds' pitches, loudness, and timbre. Next, the brain sends these signals through a hierarchical network of pathways that processes the basic information. Part of this process is the application of our memories and stored information to process the sounds' timbre and recognize the sources of the sounds in the song (i.e., the instruments). Simultaneously, we process the emotions of the song and relate the song to past emotional experiences stored in memory. Thus, the musical information, memory-related information, and emotional information together dictate our understanding of and reaction to music.
References
Agus, T. R., Thorpe, S. J., & Pressnitzer, D. (2010). Rapid formation of robust auditory memories: insights from noise. Neuron, 66(4), 610-618. https://doi.org/10.1016/j.neuron.2010.04.014
Camina, E., & Güell, F. (2017). The neuroanatomical, neurophysiological and psychological basis of memory: current models and their origins. Frontiers in Pharmacology, 8, 438. https://doi.org/10.3389/fphar.2017.00438
Hansen, C. H. (2001). Fundamentals of acoustics. In Occupational Exposure to Noise: Evaluation, Prevention and Control (pp. 23-52). World Health Organization. Retrieved from https://www.who.int/occupational_health/publications/noise1.pdf
Kaiser, J. (2015). Dynamics of auditory working memory. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2015.00613
Knight, R. D., Jones, B., & Field, S. (2019). College Physics: A Strategic Approach (4th ed.). New York, NY: Pearson.
Kumar, S., Joseph, S., Gander, P. E., Barascud, N., Halpern, A. R., & Griffiths, T. D. (2016). A brain system for auditory working memory. Journal of Neuroscience, 36(16), 4492-4505. https://doi.org/10.1523/JNEUROSCI.4341-14.2016
Levitin, D. J., & Tirovolas, A. K. (2009). Current advances in the cognitive neuroscience of music. Annals of the New York Academy of Sciences, 1156, 211–231. https://doi.org/10.1111/j.1749-6632.2009.04417.x
Norman-Haignere, S., Kanwisher, N. G., & McDermott, J. H. (2015). Distinct cortical pathways for music and speech revealed by hypothesis-free voxel decomposition. Neuron, 88(6), 1281–1296. https://doi.org/10.1016/j.neuron.2015.11.035
Swenson, R. (2006). Chapter 7D – Auditory System. Review of Clinical and Functional Neuroscience. Retrieved from https://www.dartmouth.edu/~rswenson/NeuroSci/chapter_7D.html
Thomas, M. T. (2012). Music and emotion through time. Retrieved from https://www.ted.com/talks/michael_tilson_thomas_music_and_emotion_through_time
Fairview Health Services. (2009). Turning the Head Toward a Sound [PDF File]. Retrieved from https://www.fairview.org/fv/groups/internet/documents/web_content/turninghe_2010092621080629.pdf
Warren, J. (2008). How does the brain process music? Clinical Medicine (London, England), 8(1), 32-36. https://doi.org/10.7861/clinmedicine.8-1-32