This is the second post in our Life of a Call series by Dialpad co-founder and VP of Telephony and Mobile, John Rector. Previously, John worked at Google as a Senior Software Engineer, building the telephony backend for Google Voice.
Today we cover the key concepts of how traditional telephones work.
As stated in our last post, you're sharing a lot of the telephone infrastructure with everyone else on the network. So when you pick up the phone, how does the network know that you want to use it? To put it another way, how do you set up and tear down a call?
SS7, or Signaling System 7, is the answer to this. SS7 exchanges the control information associated with the call (caller number, callee number, billing information, etc.).
A key feature of SS7 is the use of Common Channel Signaling, or CCS. With this method, the control information travels on a dedicated signaling channel that is kept separate from the voice channel carrying the conversation.
SS7 is also related to the development of touch-tone phones, which use dual-tone multi-frequency (DTMF) signaling. These tones are, of course, what you hear when you press the number keys on your phone. Each key you press emits a combination of one low-frequency and one high-frequency sine wave.
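To make the two-tone idea concrete, here is a minimal Python sketch that looks up the frequency pair for a key and sums the two sine waves. The frequency grid is the standard DTMF layout; the function names, the 0.1 second duration, and the 8000 samples-per-second rate are assumptions chosen for illustration only.

```python
import math

# Standard DTMF grid (Hz): each key sits at one row (low) and one column (high).
ROW_FREQS = [697, 770, 852, 941]
COL_FREQS = [1209, 1336, 1477]
KEYPAD = ["123", "456", "789", "*0#"]


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair in Hz for a single keypad key."""
    if len(key) != 1:
        raise ValueError(f"expected one key, got {key!r}")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return ROW_FREQS[row], COL_FREQS[col]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key, duration_s=0.1, sample_rate=8000):
    """Build the tone for a key as the sum of its two sine waves."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * n / sample_rate)
        for n in range(count)
    ]


low, high = dtmf_frequencies("5")
print(f"Key 5 plays {low} Hz and {high} Hz together")
```

Because every key is identified by two simultaneous frequencies, the receiving end only has to detect which pair is present, not time a train of pulses.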
For the user, the touch-tone interface allows for faster dialing than the rotary phone. From the design and engineering perspective, touch-tone phones are more resistant to component failure and degradation than rotary phones, which have to send out pulse signals at a well-defined rate to the switching infrastructure.
Digital vs Analog
A final key feature developed in the 1970s is the analog-to-digital conversion of voice. Waves in the ocean, temperature gradients, and the volume of your voice are all continuous, analog signals. Early telephones transmitted your voice as an analog signal over electrical wires, resistors, and capacitors. Of course, there were no computer chips at the time.
But the problem with analog signals is that they pick up interference and degrade as they are transmitted across the line. Digital signals degrade too, but as long as the noise stays within certain thresholds, the original signal can be regenerated. Noise in analog signals is additive and harder to correct.
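The regeneration idea can be sketched in a few lines of Python. This is a toy model, not how any particular piece of telephone equipment works: the bit pattern and noise values are made up, and the 0.5 decision threshold is an assumption for a signal that uses levels 0 and 1.

```python
def add_noise(signal, noise):
    """Model the line by adding a noise value to every transmitted level."""
    return [level + n for level, n in zip(signal, noise)]


def regenerate(received, threshold=0.5):
    """Snap each received level back to a clean 0 or 1."""
    return [1 if level > threshold else 0 for level in received]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
# Made-up noise; every value is smaller than 0.5 in magnitude.
noise = [0.2, -0.1, -0.3, 0.4, 0.3, -0.2, -0.4, 0.1]

received = add_noise(bits, noise)   # distorted levels such as 1.2 or 0.3
restored = regenerate(received)     # clean 0s and 1s again

print(restored == bits)
```

As long as the noise never pushes a level across the threshold, the restored bits match the originals exactly. An analog receiver has no such decision step, so whatever noise was added stays in the signal.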
How Digitization Works
To digitize a continuous analog audio signal, the system samples it at a fixed rate to create a list of values. Each value is also quantized, or constrained to a defined set of values. For example, 8.24764 is rounded to 8.25.
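Here is a minimal Python sketch of those two steps, sampling and quantizing. The 1 kHz test tone, its amplitude of 10, and the step size of 0.01 are all assumptions picked so that the rounding matches the 8.24764 example; real telephone codecs use a different set of quantization levels.

```python
import math


def sample(signal, sample_rate, duration_s):
    """Read the continuous signal at evenly spaced points in time."""
    count = round(sample_rate * duration_s)
    return [signal(n / sample_rate) for n in range(count)]


def quantize(value, steps_per_unit=100):
    """Constrain a value to the nearest allowed level (here, multiples of 0.01)."""
    return round(value * steps_per_unit) / steps_per_unit


def tone(t):
    """Hypothetical input: a 1 kHz sine wave with amplitude 10."""
    return 10 * math.sin(2 * math.pi * 1000 * t)


samples = sample(tone, sample_rate=8000, duration_s=0.001)
quantized = [quantize(v) for v in samples]

print(quantize(8.24764))
print(quantized)
```

The output of `sample` is the list of values described above, and `quantize` is what turns each arbitrary reading into one of a finite set of levels that can be stored or transmitted.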
How fast should the sampling rate be in order to recreate the voice signal from a list of values? That question is answered by the Nyquist-Shannon sampling theorem, a very important topic in signal processing. Put concisely, your sampling rate must be at least TWICE the maximum frequency you want to hear.
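A short Python sketch shows what goes wrong when the rule is broken. With a sampling rate of 8000 samples per second (an assumption for this example), a 3000 Hz tone is within the limit and a 5000 Hz tone is not. The function name and the choice of 16 samples are illustrative only.

```python
import math


def sample_tone(freq_hz, sample_rate, count):
    """Sample a cosine wave of the given frequency `count` times."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


SAMPLE_RATE = 8000  # supports frequencies up to 4000 Hz

in_range = sample_tone(3000, SAMPLE_RATE, 16)
too_high = sample_tone(5000, SAMPLE_RATE, 16)

# The two lists of samples come out the same, so the receiver cannot tell them apart.
identical = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(in_range, too_high))
print(identical)
```

The 5000 Hz tone produces the same samples as the 3000 Hz tone, an effect known as aliasing. Once that happens, no amount of processing on the receiving side can recover the original frequency.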
But what is the maximum frequency that we want to hear? Let's make a quick aside first. Humans can hear frequencies from 20 Hz to 20,000 Hz, but the frequencies that matter most for understanding the human voice only span roughly 300 Hz to 3400 Hz. In classic telephony, we extend this nominal voice range out to 4 kHz, since the audio filters used in processing have a gradual rather than a steep drop-off. The extra room also protects against interference from signals transmitted on adjacent frequency bands.
So putting the two concepts together (a maximum frequency of 4 kHz, sampled at twice that rate), your raw audio signal is sampled 8000 times a second, producing 8000 discrete values for every second of audio. This list of discrete values is what's actually transmitted over the PSTN. Once the signal reaches the other end, a computer chip on the other person's phone rebuilds the signal because it knows what the original sampling rate was.
The term HD Audio, also called Wideband Voice, refers to expanding the traditional 300-3400 Hz range of PSTN calls to 50-7000 Hz. This greater range lets more harmonics of the human voice come through, which makes for clearer sound.
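The same sampling arithmetic applies to both cases. The sketch below, with a helper name invented for this example, computes the minimum sampling rate for the 4 kHz telephone band and for the 7 kHz top of the wideband range.

```python
def minimum_sample_rate(max_frequency_hz):
    """Smallest sampling rate that satisfies the twice-the-maximum-frequency rule."""
    return 2 * max_frequency_hz


bands = {
    "traditional PSTN (extended to 4 kHz)": 4000,
    "wideband voice (up to 7 kHz)": 7000,
}

rates = {name: minimum_sample_rate(top) for name, top in bands.items()}

for name, rate in rates.items():
    print(f"{name}: at least {rate} samples per second")
```

Wideband audio therefore needs at least 14,000 samples per second. In practice wideband codecs commonly sample at 16,000 per second, which leaves filter headroom in the same way the 4 kHz band does for traditional calls.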
This ends Part 2 of our series. In Part 3, we'll discuss the internet and the separation of data and voice.