For an invention so often celebrated as revolutionary, or feared to be, the MP3 has some surprisingly distant origins. “New media,” in this case, was built on decades-old ideas and technologies. In MP3: The Meaning of a Format, McGill professor Jonathan Sterne follows the genealogy of the MP3 from primitive 19th-century telephony to the “social non-place” of BitTorrent. Fluidly thorough, Sterne compresses vast and circuitous relationships into his study of communication, keeping a B- science student like this author engaged even as Heidegger references give way to diagrammed auditory patterns. His book is, in part, a narrative of the MP3 before it was the MP3, of all the stray data that make up its song.
ca. 3500–3350 BCE
Humanity invents the wheel, tangibly demonstrating the principle of compression for the first time. In MP3, Sterne notes: “It is no accident that so many media technologies are built around spinning mechanisms. A roll of film or tape takes up less room than if it is stretched out from end to end. We could say the same for compact discs and DVDs, the platters of a hard drive, the curves of a record, the spools of paper in teletypes and telegraphs, and the spinning hands of analog clocks.”
1915–1924
Driven to “maximiz[e] profits in a context of regulated price” and drawing on the new discipline of psychoacoustics, a monopolistic AT&T gradually quadruples its phone lines’ capacity by studying the limits of human hearing and filtering out surplus frequencies. Sterne coins the term “perceptual technics” to describe this process of systematic optimization; the resulting perceptual capital can be used both to wring out additional value and to enable a particular innovation to function. AT&T’s Bell Labs presaged developers of the later MP3 format, which, to use the author’s words, codes audio from a CD source “in terms of a dynamically changing relationship between the signal content and the ear’s masking behavior.” Translation: MP3s reproduce only the sound they think we need to hear.
1929
Ernest Glen Wever and Charles W. Bray, two psychologists at Princeton University, remove the cerebrum of a cat and transform its head into a living telephone. Though the experiment’s specific findings were later disproved, as Sterne notes, it still anticipated later auditory research and “the conceptual collapse of life and machine” that is cybernetics. In a virtuosic sequence, Sterne traces a network of implications from Wever and Bray’s action: the animal-rights response, then and now, to scientific vivisection; the disturbing political undertones of a broadcasting device inside one’s brain; and bizarrely feline historical analogues, whether the cruel “cat piano” that an Italian prince received in 1650 or Napster’s “Kittyhead” icon, a brand meant to display rebellion that unconsciously evokes that earlier anonymous cat, “sewn into a system that it cannot comprehend.”
1948
Claude Shannon (pictured above) publishes his major article “A Mathematical Theory of Communication” in the Bell System Technical Journal, thereby founding the field of information theory. In 1937, as a 21-year-old master’s student, Shannon had applied Boolean algebra to electrical switches, arguing that such binary values could form the nervous system of a “logic machine,” and established a basis for digital computing. Later, after being hired by Bell Labs, he applied these ideas to telephony and began formulating information theory. As Sterne explains, an integral part of that work was Shannon’s focus on noise in a given system, or “entropy,” and the amount of data that can be unnoticeably discarded at any time, the principle underlying the method of “lossy” compression used to encode MP3s. His goal was “maximum efficiency,” what Sterne describes as “satisfactory symmetry between the moment of encoding and the moment of reception.”
1970s
Building on the familiar psychoacoustic concept of “masking” (disguising a particular sound by syncing it with a louder one on a similar frequency), various researchers working independently develop a usable process of perceptual coding, or continuous masking, perhaps the most direct antecedent of the MP3. Sterne spends a chunk of this section exploring why it took so long for perceptual coding to be made viable, and quotes from a wryly revealing interview with Bell Labs alumnus JJ Johnston: “You understand, there were no CDs at this point [1984-85]. The way I got data was I took an LP, played it on my turntable and then into my cassette deck, carried my cassette deck into work and got a 12 bit A-to-D to spool it onto disk. It took about two hours to get about 10 seconds’ worth of music. So, I didn’t have a big repertoire to work with.”
The final chapters of MP3 catch up with the titular format’s lifetime, as Sterne considers the industrial politics of its 1988 standardization (via the Moving Picture Experts Group, or MPEG, whose very name suggests the afterthought that MP3 originally was), potential cultural biases implicit in the codecs’ preliminary listening tests, music-as-ubiquitous-thing, and what might come next. He makes a strong case against radical utopian narratives of copyright infringement and corporate doomsaying alike: “pirates,” for all their notoriety/cred, often end up raiding the record-label section of a floating conglomerate while gilding the one that manufactures MP3 players. But going into greater detail would seem, well, piratical.