.. _chap-music-basics:

==============
 Music basics
==============

Areas: acoustics, physics, curiosity

[status: just-starting]

Motivation, prerequisites, plan
===============================

As I write in 2020 we listen to recorded music almost exclusively
through a computer.  It is interesting, instructive, and useful to
understand how the computer represents music, how music is stored,
compressed, and manipulated, and how interesting things get done
with it.

To work through this material you should be comfortable with making
plots, discussed in :numref:`chap-data-files-and-first-plots`.  You
should also install the packages ``ffmpeg`` and ``sox``.

We will start by discussing what sound is.  Then we will discuss how
it can be represented mathematically.  Finally we will look at the
various formats that have been devised for computers to store sound
files, and how to convert between them.

What is sound?
==============

Sound is a wave-like sequence of compression and decompression of the
air (or other medium).  The compression/decompression "front" pushes
the next layer of air forward and backward along the direction of
motion.  This is called a *longitudinal* wave.

This can be contrasted with other types of waves, like water waves or
electromagnetic waves (light, radio, X-rays, ...), where the "up and
down" of the wave is perpendicular to the direction in which it
moves.  Those are called *transverse* waves.

.. _fig-waves-longitudinal:

.. figure:: Onde_compression_impulsion_1d_30_petit.*
   :width: 50%

   Longitudinal waves: the expansion/contraction happens along the
   direction of motion.  (Image from Wikipedia.)

.. _fig-waves-transverse:

.. figure:: Onde_cisaillement_impulsion_1d_30_petit.*
   :width: 50%

   Transverse waves: the displacement happens perpendicular to the
   direction of motion.  (Image from Wikipedia.)

Some gentle introductions to sound can be found at:

https://www.mediacollege.com/audio/01/sound-waves.html

https://www.youtube.com/watch?v=qV4lR9EWGlY

In class we discuss the quantities of interest in talking about
sound.  Some of these are

* amplitude/sound pressure/intensity
* frequency/pitch
* wavelength
* period
* speed
* pure tone versus superposition of frequencies

How is sound generated?
=======================

Ask the class to discuss various ways in which they have seen sound
generated.  Some could be: drum head, guitar soundboard, loudspeaker
diaphragm, tweeters, woofers, ...

Measuring and recording
=======================

The *human ear*, before it has been over-exposed to repetitive sounds,
can hear from 20 Hz to 20000 Hz (20 kHz).  Microphones usually try to
pick up a very clean (non-distorted) signal over the same frequency
range.

How do microphones work to translate the vibration of air into an
electrical voltage that changes in time?

First came the lover's telephone, then the carbon microphone, and
then it gets modern.

.. _fig-lovers-telephone:

.. figure:: lovers_telephone.png
   :width: 30%

   Robert Hooke's "Lover's Telephone".  (Image from Wikipedia.)

.. _fig-carbon-microphone:

.. figure:: Carbon_microphone.svg
   :width: 50%

   A diagram of how the carbon microphone works.  When the air
   pressure compresses the carbon it conducts better, so you get a
   higher voltage signal coming out.  (Image from Wikipedia.)

Can someone research the technical specs of microphones?  What is a
"frequency response curve"?  What would it be for a high quality
studio microphone, versus various types of smartphones?

.. _fig-frequency-response-curve:

.. figure:: Oktava319vsshuresm58.png
   :width: 60%

   A "frequency response" curve for two different microphones: the
   Oktava 319 and the Shure SM58.  (Image from Wikipedia.)

What is music
=============

An art form whose medium is sound.  Music uses modulations of pitch
and amplitude to achieve aesthetic effects.

Discuss some concepts like stereo.

Interesting definitions of "music" proposed by students: "A sound that
is pleasant, has many different ..., and doesn't have to be liked by
everyone." and "Many frequencies that move together in a pattern that
makes it pleasant to hear."

Understanding what we plot in an *amplitude* plot
=================================================

Make the following simple plot:

::

   $ gnuplot
   gnuplot> plot sin(x)

That shows a basic :math:`\sin(x)` wave, but it does not connect to
the physical quantities involved.  To see how frequency might enter
the picture try this out:

::

   gnuplot> A = 2.5          # amplitude of 2.5
   gnuplot> freq_hz = 440    # 440 hertz - the A above middle C
   gnuplot> set xlabel 'time'
   gnuplot> set ylabel 'amplitude'
   gnuplot> plot A * sin(2 * pi * freq_hz * x)

This frequency is rather high, so the plot is not really showing
enough information.  To see a bit more you can make the gnuplot
sampling higher:

::

   gnuplot> set samples 10000
   gnuplot> plot A * sin(2 * pi * freq_hz * x)

Clearly we have to zoom in.  To show just a few full periods of the
wave let us restrict the domain:

::

   gnuplot> plot [-0.01:0.01] A * sin(2 * pi * freq_hz * x)

Now we are ready to talk about how to read those axes.  Look for the
period, and understand how the amplitude, frequency, and period
appear on the plot.  For 440 Hz the period is :math:`1/440 \approx
2.27` ms, so this 20 ms window shows almost nine full periods.
Discuss why the :math:`2 \pi` is in there.

How does the GNU/Linux microphone work?
=======================================

We will use the programs ``rec`` and ``play``, both of which are part
of the ``sox`` package in most distributions.  ``rec`` will record a
sound, and ``play`` will play it back.  As we saw in
:numref:`chap-advanced-plotting`, you can invoke them like this:

::

   rec myvoice.dat

then speak into the microphone, or play some music into it, and hit
control-C after just a couple of seconds.  You can play it back with

::

   play myvoice.dat

If you list your directory you will find that the file
``myvoice.dat`` has been created, and it has three columns: time,
left channel, right channel.
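
To look at the numbers in such a file from a program, it can also be read in Python.  The following is only a minimal sketch: it assumes the layout that ``sox`` normally writes for ``.dat`` files, namely a few header lines that start with ``;`` followed by whitespace-separated columns.  The function name ``parse_dat_lines()`` and the sample values are made up for illustration.

```python
def parse_dat_lines(lines):
    """Parse the text lines of a sox-style .dat file.

    Returns (header, rows): header is a dict built from the ';'
    comment lines, rows is a list of tuples of floats, one tuple per
    sample: (time, channel 1, channel 2, ...).
    """
    header = {}
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue            # skip blank lines
        if line.startswith(';'):
            # assumed header layout: "; Sample Rate 48000"
            words = line[1:].split()
            if len(words) >= 2:
                header[' '.join(words[:-1])] = words[-1]
            continue
        rows.append(tuple(float(word) for word in line.split()))
    return header, rows


# a tiny made-up sample with the same three columns as myvoice.dat
sample = """; Sample Rate 48000
; Channels 2
              0   0.0001220703125   0.0001220703125
  2.0833333e-05  -0.0003051757812  -0.0002441406250
""".splitlines()

header, rows = parse_dat_lines(sample)
left_peak = max(abs(row[1]) for row in rows)   # largest left-channel value
print(header)
print(len(rows), 'samples, left channel peak', left_peak)
```

To read a real recording, pass an open file instead of the sample: ``with open('myvoice.dat') as f: header, rows = parse_dat_lines(f)``.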

We will plot this file like this:

::

   $ gnuplot
   gnuplot> plot 'myvoice.dat' using 1:2 with lines    # you can also try 1:3

Generating your own musical tone
================================

A single tone
-------------

So how would you generate a tone yourself?

.. literalinclude:: play_freq.py
   :language: python
   :caption: play_freq.py - play a single note.

The note we have put in here is the A above middle C (*La*), which
has a frequency of 440 Hz.  Run the program, save its output to a
file, and play it with:

::

   chmod +x play_freq.py
   ./play_freq.py > note.dat
   play note.dat

The frequencies for "do, re, mi, fa, sol, la, si, do"
(C,D,E,F,G,A,B,C) are (in Hertz): 261.63, 293.66, 329.63, 349.23,
392.00, 440.00, 493.88, 523.25.

Note that you could change your ``main()`` function to play several
notes, or a full scale, and it might look like this:

.. _listing-two-notes-py:

.. code-block:: python
   :caption: Play a few notes by invoking ``play_freq()`` multiple times.

   def main():
       play_freq(2, 440.00, 70000)     # play 70000 samples at 48kHz
       play_freq(2, 523.25, 70000)     # play 70000 samples at 48kHz
       # freq_sequence = [261.63, 293.66, 329.63, 349.23, 392.00, 440.00, 493.88, 523.25]
       # for freq in freq_sequence:
       #     play_freq(2, freq, 10000)

.. _sec-from-notes-to-frequencies:

From notes to frequencies
-------------------------

Let us take the Italian (Do, Re, Mi, Fa, Sol, La, Si) or
German/English (A, B, C, D, E, F, G) notation for musical notes and
figure out how to convert those into frequencies.  This will allow us
to write more versatile programs that take a music specification and
play it out.

The general mathematical formula is:

.. math::

   freq = A4_{freq} \times 2^{n_{steps}/12}

where :math:`A4_{freq}` is the frequency of the "A above middle C"
note, 440 Hz, and :math:`n_{steps}` is the number of semitone steps
from that A to the note we want (negative for lower notes).  This is
discussed in more detail at
``https://en.wikipedia.org/wiki/Musical_note#Note_frequency_(hertz)``

If we want to convert a note given in English notation into a
frequency, we can use a function like this:

.. code-block:: python
   :caption: Convert a note specification (which consists of octave, note, and sharp_or_flat) into the frequency of that note.

   import math

   def note2freq(octave, note, sharp_or_flat):
       """Takes a note specification and returns the frequency of that note.
       If note is 'rest' then we return a frequency of zero."""
       ## refer to https://en.wikipedia.org/wiki/Musical_note#Note_frequency_(hertz)
       if note == 'rest':
           freq = 0
       else:
           A4_freq = 440           # A above middle C
           # number of semitone steps from A4 to this note
           n_steps = note2steps(octave, note, sharp_or_flat)
           freq = A4_freq * math.pow(2, n_steps/12.0)
       return freq

This function relies on another function, ``note2steps()``, which is
too long to put here.  Instead we link to a full music generating
program, :download:`generate_music.py`, which you can study and
modify.

You can run ``generate_music.py``, save its output to a file, and
play it to your speaker with:

::

   chmod +x generate_music.py
   ./generate_music.py > popcorn.dat
   play popcorn.dat

File formats
============

The ``.dat`` files we have seen are in the simplest possible format.
They are not very expressive and they would become *huge* if we had a
long signal.  Even those 2-second files were much too big.

We will explore .dat, .au, .aiff, .mp3, .ogg, .webm, .wav, .flac,
discussing how each one came about.

https://en.wikipedia.org/wiki/Timeline_of_audio_formats

https://en.wikipedia.org/wiki/Data_compression#Audio

:numref:`sec-spectrograms-for-standard-acoustic-files`

Converting our ascii music ``.dat`` files to other formats
==========================================================

Some of the file formats are very well defined: they can be decoded
and played by any program that knows the specification for that
format.  Sometimes there is even an international expert panel which
proposes and maintains the specification.

There have been oddities associated with this process: due to an
oversight, the mp3 standard group allowed the mp3 format to involve a
*patented* algorithm, which for a long time made the format unusable
by free software.  (The patent has expired now.)

The ascii ``.dat`` files we have been using here are not one of those
well-specified formats.  As far as we can tell, they are only used by
the programs in the ``sox`` (sound exchange) software suite: ``rec``,
``play``, and ``sox``.

On the other hand these ascii files are *extremely* useful for us: we
can understand them, plot them, and write programs that read and
write them.  Our ``play_freq.py`` and ``generate_music.py`` programs
generate this format with no effort at all.

We would now like to convert our output file ``popcorn.dat``
(generated in :numref:`sec-from-notes-to-frequencies`) into the more
standard ``.flac`` or ``.mp3`` formats.

The ``sox`` utility will get us out of the non-standard ``.dat``
format by turning it into a ``.aif`` file.  From there we can use the
``ffmpeg`` program to convert it into dozens of other formats.  For
example:

::

   ./generate_music.py > popcorn.dat
   sox popcorn.dat popcorn.aif
   ffmpeg -i popcorn.aif popcorn.flac
   ffmpeg -i popcorn.aif popcorn.mp3
   ls -lsh popcorn.*

Here is the output I get from listing those music files in their
various formats:

::

   5.9M -rw-rw-r-- 1 markgalassi markgalassi 5.9M Jan 14 13:26 popcorn.aif
    45M -rw-rw-r-- 1 markgalassi markgalassi  45M Jan 14 13:26 popcorn.dat
   1.1M -rw-rw-r-- 1 markgalassi markgalassi 1.1M Jan 14 13:26 popcorn.flac
   252K -rw-rw-r-- 1 markgalassi markgalassi 251K Jan 14 13:27 popcorn.mp3

This gives a really interesting look at the effect of using these
various file formats.  The original ``popcorn.dat`` file is 45
megabytes in size (this should strike you as way too big).  Once you
convert it to the 1988 vintage *audio interchange file format* (aif)
file ``popcorn.aif`` it is down to about 6 megabytes.  The modern
*free lossless audio codec* (flac) format takes 1.1 megabytes, and if
you are willing to lose a small amount of musical quality with the
"lossy" *mp3* format you can get it down to a quarter of a megabyte.

You could now play the flac or mp3 file using a music or video
program.
A quick way from the command line is to run:

::

   vlc popcorn.flac

Effects filters
===============

https://linuxgazette.net/issue73/chung.html