Digital representations of sound (audio, music, voice,...)


As distinct from analog representations. Similar to representation of images, i.e.

Images: real world -> technology -> representation -> eyes -> brain
Sound: real world -> technology -> representation -> ears -> brain

Many different applications with different requirements, e.g.
  • Music - classical, rock, jazz, ...
  • Online radio (talk, music) - e.g. WFIU, BBC World Service
  • Cellphones, phones, and other duplex voice
  • Two-way and emergency communications
  • Detecting Earthquakes
  • Voice recognition

Many different environments, e.g.
  • Compact Disc
  • Downloadable file
  • Streaming on the web
  • Real time transmission over wireless or wired networks



Sound is essentially a wave disturbance of the air. A pure note is a sine wave with a frequency and an amplitude: the higher the frequency, the higher the note that is perceived; the higher the amplitude, the louder the sound. You can look at the figures on this page to see a pure sine wave and how it corresponds to air compressions.
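To make the frequency and amplitude of a pure note concrete, here is a minimal sketch that computes points on a sine wave. The function name sine_samples and the chosen values (440 Hz, amplitude 0.5, 44,100 points per second, 10 ms) are assumptions made for illustration only.

```python
import math

def sine_samples(frequency_hz, amplitude, sample_rate_hz, duration_s):
    """Return values of amplitude * sin(2*pi*f*t) at evenly spaced times t."""
    count = round(sample_rate_hz * duration_s)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(count)
    ]

# A 440 Hz tone at half of full scale, 10 ms long.
tone = sine_samples(440, 0.5, 44100, 0.01)

# Same pitch, twice the amplitude: perceived as louder.
louder = sine_samples(440, 1.0, 44100, 0.01)

# Same amplitude, twice the frequency: perceived as a higher note.
higher = sine_samples(880, 0.5, 44100, 0.01)
```

Changing frequency_hz changes how quickly the values oscillate, while changing amplitude only scales them.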

Different instruments (including voices, etc) produce different waveforms which define the sound of the instrument. When multiple instruments are used, complex waveforms arise. See Online Piano. This page has a few examples of waveforms for different instruments.

You can also try out the spectrum analyzer at http://www.relisoft.com/freeware/freq.html

Note that these waves are continuous, or analog. To use them on a computer, we have to digitize them. To explore this, we will go through the Wikipedia page. A few things to note:

  1. Analog-Digital conversion can be done in hardware or software
  2. Sampling occurs on the x (time) axis; the higher the sampling rate, the better the recreation of the waveform (e.g. CD audio is sampled at 44.1 kHz); the sampling rate limits the bandwidth (i.e. the range of frequencies that can be represented).
  3. The highest frequency that can be represented is half the sampling rate (the Nyquist frequency). You can play around with this demonstration to see how the sampling rate affects the reproduction of the original frequency (you can vary both the original frequency and the sampling frequency and study the effects).
  4. The number of bits used to discretize the y (amplitude) axis, known as quantization (e.g. "16 bit audio" on CDs uses 16 bits, giving 65,536 values), affects the "dynamic range" of the amplitude.
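The Nyquist limit in point 3 can be sketched in a few lines. This is an illustration under simple assumptions (a single pure tone, ideal sampling); the helper names nyquist_frequency, apparent_frequency and sample_cosine are made up for this example.

```python
import math

def nyquist_frequency(sample_rate_hz):
    """Highest frequency that can be represented at this sampling rate."""
    return sample_rate_hz / 2

def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency a sampled pure tone appears to have (folding at Nyquist)."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

def sample_cosine(tone_hz, sample_rate_hz, count):
    """Take `count` samples of a unit-amplitude cosine at tone_hz."""
    return [
        math.cos(2 * math.pi * tone_hz * n / sample_rate_hz)
        for n in range(count)
    ]

cd_nyquist = nyquist_frequency(44100)        # 22050.0 Hz
below = apparent_frequency(1000, 44100)      # 1000 Hz, represented correctly
above = apparent_frequency(30000, 44100)     # 14100 Hz, an alias

# The samples of the 30 kHz tone match those of a 14.1 kHz tone,
# so after sampling at 44.1 kHz the two cannot be told apart.
too_high = sample_cosine(30000, 44100, 50)
alias = sample_cosine(14100, 44100, 50)
```

A tone above half the sampling rate is not simply lost: it shows up as a lower frequency, which is why the input is normally filtered before sampling.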

Remember...

Sampling occurs on the x-axis, and the sampling rate determines the bandwidth and the maximum frequency sound that can be represented (Nyquist frequency)

Quantization occurs on the y-axis, and determines the number of distinct amplitude (or volume) values that can be represented, and therefore the dynamic range.
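As a rough sketch of quantization, the code below maps an amplitude between -1.0 and 1.0 to the nearest integer level for a given number of bits. The signed-integer mapping and the function names are assumptions chosen for illustration, and the decibel figure uses the common rule of thumb 20 * log10(number of levels), about 6 dB per bit.

```python
import math

def level_count(bits):
    """Number of distinct amplitude values available with this many bits."""
    return 2 ** bits

def quantize(sample, bits):
    """Map a sample in [-1.0, 1.0] to the nearest signed integer level."""
    max_level = 2 ** (bits - 1) - 1      # 32767 for 16 bits
    min_level = -(2 ** (bits - 1))       # -32768 for 16 bits
    level = round(sample * max_level)
    # Values outside [-1.0, 1.0] are clipped to the available range.
    return max(min_level, min(max_level, level))

def dynamic_range_db(bits):
    """Approximate ratio, in decibels, of the largest to the smallest step."""
    return 20 * math.log10(level_count(bits))

cd_levels = level_count(16)          # 65536
cd_range = dynamic_range_db(16)      # roughly 96 dB
full_scale = quantize(1.0, 16)       # 32767
```

Fewer bits means coarser steps, so quiet detail is rounded away; more bits means finer steps and a wider dynamic range.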

Human audible range: The human audible range of frequencies is roughly 20 Hz to 20 kHz. One reason "CD quality" is good is that sampling at 44.1 kHz gives a Nyquist frequency of 22.05 kHz, which is enough to capture the sounds the human ear can hear. As we get older, we lose our ability to hear the higher frequencies first. Check your own audible range here! (just for fun, not a scientific test).

All of this is a necessary precursor to the particular format we use to store (or stream!) sound. This problem is similar to images: i.e. a quality vs space tradeoff, with options for lossy or lossless compression. Note there is a difference between a codec (the system used for encoding the samples) and the file format (the implementation of a codec in a file). The standard way of representing the digitized audio (which lies behind the codec) is called Pulse Code Modulation (PCM).
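To illustrate the difference between the encoded samples and the file that holds them, here is a small sketch using Python's standard struct and wave modules. It packs samples as signed 16-bit PCM and then wraps the same bytes in a WAV file held in memory. WAV is used here only as one example of a file format (an assumption, not something these notes prescribe), and the function names are hypothetical.

```python
import io
import struct
import wave

def pcm16_bytes(samples):
    """Pack floats in [-1.0, 1.0] as signed 16-bit little-endian PCM."""
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)

def wav_bytes(samples, sample_rate_hz=44100):
    """Wrap mono 16-bit PCM data in a WAV file, returned as bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)              # mono
        wav_file.setsampwidth(2)              # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(pcm16_bytes(samples))
    return buffer.getvalue()

raw = pcm16_bytes([0.0, 1.0, -1.0])     # just the encoded samples
wav_data = wav_bytes([0.0] * 10)        # the same kind of data plus a header
```

The raw bytes carry no information about sample rate or channel count; the file format adds a header that records those parameters so a player can interpret the samples.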



For details of audio formats, see the Wikipedia page.

Examples:
Compact Discs have the following parameters for audio stored on them:
Sample rate: 44.1 kHz
Channels: 2 (stereo)
Bits per sample, per channel: 16
Levels per sample: 65,536
Total data rate (Mb/s): 1.4112

Compact Discs sample the audio 44,100 times per second. The total information needed for 1 second of audio for 2 channels is therefore 44,100 x 2 x 16 = 1,411,200 bits.
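The same arithmetic can be written as a short calculation, which makes it easy to try other parameters. The function name pcm_bit_rate is an assumption for this sketch, and the per-minute figure simply extends the CD numbers above.

```python
def pcm_bit_rate(sample_rate_hz, channels, bits_per_sample):
    """Bits needed for one second of uncompressed PCM audio."""
    return sample_rate_hz * channels * bits_per_sample

cd_rate = pcm_bit_rate(44100, 2, 16)       # 1,411,200 bits per second
cd_rate_mbps = cd_rate / 1_000_000         # 1.4112 Mb/s
cd_bytes_per_minute = cd_rate * 60 // 8    # 10,584,000 bytes per minute
```

At roughly 10.6 million bytes per minute, uncompressed CD audio is large, which is one motivation for the lossy and lossless compression options mentioned earlier.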

Digital representation of video


Sources

  • Video cameras (digital and analog)
  • Webcams
  • Computer-generated
  • Analog / digital TV



See the ins and outs of video compression

A Video Codec specifies how a video is compressed and decompressed

A Video Format specifies how a video is stored or transmitted (one format might support several codecs)



Common computer / internet video codecs


Video formats


Adobe Flash
