- Waveform algorithms (coders) Waveform algorithms have the following functions and characteristics:
- Use predictive differential methods to reduce bandwidth
- Affect voice quality as bandwidth is reduced (the lower the bit rate, the greater the quality impact)
- Do not take advantage of speech characteristics
- Examples include G.711 and G.726
- Source algorithms (coders) Source algorithms have the following functions and characteristics:
- Vocoders take advantage of speech characteristics.
- Bandwidth reduction occurs by sending linear-filter settings.
- Codebooks store specific predictive waveshapes of human speech. The coder matches incoming speech against these waveshapes and transmits a short code for each phrase; the receiver looks up the coded phrase in its own codebook and reproduces the stored waveshape (a toy lookup is sketched after this list).
- Examples include G.728 and G.729
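The receive-side lookup described in the codebook item above can be made concrete with a toy sketch. The three-entry codebook and its eight-sample waveshapes below are entirely hypothetical; a real vocoder codebook holds many entries derived from speech modeling.

```python
# Toy receiver codebook: transmitted index -> stored predictive waveshape.
# The entries are invented purely for illustration.
codebook = {
    0: [0, 3, 5, 3, 0, -3, -5, -3],
    1: [0, 6, 9, 6, 0, -6, -9, -6],
    2: [1, 1, 1, 1, -1, -1, -1, -1],
}

def decode(indices):
    """Rebuild a sample stream by looking up each received code."""
    samples = []
    for code in indices:
        samples.extend(codebook[code])
    return samples

# The sender transmits only the short codes; the receiver expands them.
print(decode([1, 0, 2]))
```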
The following three common voice compression techniques are standardized by the ITU-T:
- PCM Amplitude of the voice signal is sampled and quantized 8000 times per second. Each sample is then represented by one octet (8 bits) and transmitted. Before quantization, either a-law or µ-law companding is applied to achieve a more uniform signal-to-noise ratio across loud and quiet speech (a sketch of the companding curve follows this list).
- ADPCM The difference between the current sample and its predicted value (based on past samples) is encoded and transmitted. Each difference is represented by 2, 3, 4, or 5 bits. This method reduces the bandwidth requirement at the expense of signal quality.
- CELP An excitation value and a set of linear-predictive filter settings are transmitted. The filter settings are transmitted less frequently than excitation values and are sent on an as-needed basis.
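As a rough sketch of the companding step mentioned in the PCM item, the continuous µ-law curve maps a normalized linear sample onto a compressed value before quantization. The constant µ = 255 matches G.711 µ-law; the test amplitudes are chosen arbitrarily for illustration.

```python
import math

MU = 255  # G.711 mu-law companding constant

def mulaw_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] onto the mu-law companding curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

# Quiet samples are stretched relative to loud ones, so quantizing the
# companded value keeps the signal-to-noise ratio roughly uniform.
for x in (0.01, 0.1, 0.5, 1.0):
    print(f"linear {x:4.2f} -> companded {mulaw_compress(x):.3f}")
```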
Table 2-4 describes the CODECs and compression standards.
A common type of waveform encoding is pulse code modulation (PCM). Standard PCM is known as ITU standard G.711, which requires 64,000 bits per second of bandwidth to transport the voice payload (that is, not including any overhead), as shown in Figure 2-30.
Figure 2-30 shows that PCM requires 1 polarity bit, 3 segment bits, and 4 step bits, which equals 8 bits per sample. The Nyquist Theorem requires sampling at twice the highest voice frequency (2 * 4 kHz), or 8000 samples per second; therefore, you can figure the required bandwidth as follows:
8 bits * 8000 samples per second = 64,000 bits per second
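The 8-bit sample layout and the bandwidth arithmetic can be sketched in code. The packing routine below is a simplified illustration of the polarity/segment/step structure only; it is not a bit-exact G.711 encoder.

```python
SAMPLE_RATE = 8000          # samples per second (2 x 4 kHz, per Nyquist)

def pack_sample(linear: int) -> int:
    """Pack a linear sample into 1 polarity bit, 3 segment bits, 4 step bits.

    Simplified segment logic for illustration; not bit-exact G.711.
    """
    polarity = 0x80 if linear < 0 else 0x00
    magnitude = min(abs(linear), 0x0FFF)           # clamp to 12 bits
    segment = max(magnitude.bit_length() - 5, 0)   # 0..7, each range doubles
    shift = 1 if segment == 0 else segment
    step = (magnitude >> shift) & 0x0F             # 4-bit step within segment
    return polarity | (segment << 4) | step

bits_per_sample = 8
print(bits_per_sample * SAMPLE_RATE)   # 64000 bits per second of voice payload
```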
Adaptive differential pulse code modulation (ADPCM) coders, like other waveform coders, encode analog voice signals into digital signals, but they adaptively predict each new sample by looking at the immediate past. This adaptive feature reduces the number of bits per second that the PCM method requires to encode voice signals. ADPCM does this by taking 8000 samples per second of the analog voice and turning them into linear PCM samples. ADPCM then calculates the predicted value of the next sample, based on the immediate past sample, and encodes the difference. The ADPCM process generates 4-bit words, thereby generating 16 possible bit patterns.
The ADPCM algorithm from the Consultative Committee for International Telegraph and Telephone (CCITT) transmits all 16 possible bit patterns. The ADPCM algorithm from the American National Standards Institute (ANSI) uses 15 of the 16 possible bit patterns. The ANSI ADPCM algorithm does not generate a 0000 pattern.
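A toy differential encoder makes the predict-and-encode idea concrete. The step-adaptation rule and constants below are invented for illustration; G.726 uses a far more elaborate adaptive predictor and quantizer.

```python
def adpcm_encode(samples, step=16):
    """Toy adaptive differential encoder: one 4-bit word per integer sample.

    Predicts each sample from the previous reconstruction, quantizes the
    difference into a sign bit plus 3 magnitude bits, and adapts the step
    size. Illustrative only; not the G.726 algorithm.
    """
    predicted = 0
    codes = []
    for s in samples:
        diff = s - predicted
        sign = 0x8 if diff < 0 else 0x0
        magnitude = min(abs(diff) // step, 7)                     # 3 magnitude bits
        codes.append(sign | magnitude)                            # one of 16 patterns
        predicted += (-magnitude if sign else magnitude) * step
        step = max(4, step * 2 if magnitude >= 6 else step - 1)   # adapt step size
    return codes

print(adpcm_encode([0, 40, 110, 200, 180, 90, 10, -60]))
```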
The ITU standards for compression are as follows (a quick arithmetic check of these rates appears after the list):
- G.711 rate: 64 kbps = (2 * 4 kHz) * 8 bits/sample
- G.726 rate: 32 kbps = (2 * 4 kHz) * 4 bits/sample
- G.726 rate: 24 kbps = (2 * 4 kHz) * 3 bits/sample
- G.726 rate: 16 kbps = (2 * 4 kHz) * 2 bits/sample
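The listed rates follow directly from the 8000-sample-per-second Nyquist rate and the number of bits carried per sample, as this quick check shows (the dictionary labels are informal names, not formal standard designations):

```python
NYQUIST_SAMPLE_RATE = 2 * 4000   # 2 x 4 kHz voice band = 8000 samples/second

bits_per_sample = {"G.711": 8, "G.726-32": 4, "G.726-24": 3, "G.726-16": 2}

for codec, bits in bits_per_sample.items():
    rate_kbps = NYQUIST_SAMPLE_RATE * bits / 1000
    print(f"{codec}: {rate_kbps:g} kbps")   # 64, 32, 24, and 16 kbps
```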
Code excited linear prediction (CELP) compression transforms analog voice as follows (a simplified codebook-search sketch follows these steps):
1. The input to the coder is converted from an 8-bit to a 16-bit linear PCM sample.
2. A codebook uses feedback to continuously learn and predict the voice waveform.
3. The coder is excited (that is, begins its lookup process) by a white noise generator.
4. The mathematical result is sent to the far-end decoder for synthesis and generation of the voice waveform.
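A drastically simplified codebook search illustrates steps 2 through 4: choose the stored excitation vector (and gain) that best matches the target, and transmit only the index and gain. The five-sample vectors and the plain squared-error criterion are invented for illustration; real CELP coders search the excitation codebook through a linear-prediction synthesis filter.

```python
def best_codebook_match(target, codebook):
    """Toy CELP-style search: pick the excitation vector and gain that
    minimize squared error against the target, then send only the index
    and gain. Illustrative only; not an ITU CELP algorithm.
    """
    best = (None, 0.0, float("inf"))                 # (index, gain, error)
    for index, entry in enumerate(codebook):
        energy = sum(e * e for e in entry)
        gain = sum(t * e for t, e in zip(target, entry)) / energy if energy else 0.0
        error = sum((t - gain * e) ** 2 for t, e in zip(target, entry))
        if error < best[2]:
            best = (index, gain, error)
    return best[0], best[1]

# Hypothetical five-sample target and a tiny excitation codebook.
codebook = [[1, 0, -1, 0, 1], [1, 1, 0, -1, -1], [0, 1, 1, 1, 0]]
print(best_codebook_match([2.0, 1.8, 0.1, -1.9, -2.2], codebook))
```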
Two forms of CELP include Low-Delay CELP (LDCELP) and Conjugate Structure Algebraic CELP (CS-ACELP). LDCELP is similar to CS-ACELP, except for the following:
- LDCELP uses a smaller codebook and operates at 16 kbps to minimize delay, or look-ahead, from 2 to 5 ms, while CS-ACELP minimizes bandwidth requirements (8 kbps) at the expense of increased delay (10 ms).
- The 10-bit code word is produced from every five speech samples from the 8 kHz input with no look-ahead.
- Four of these 10-bit code words are called a subframe; they take approximately 2.5 ms to encode. (CS-ACELP, by comparison, uses eight 10-bit code words.)
- Two of these subframes are combined into a 5-ms block for transmission.
CS-ACELP is a variation of CELP that performs these functions (the resulting frame timing and bit rates are sketched after this list):
- Codes 80-byte frames (80 PCM samples), which take approximately 10 ms to buffer and process.
- Adds a look-ahead of 5 ms. A look-ahead is a coding mechanism that continuously analyzes, learns, and predicts the next waveshape.
- Adds noise reduction and pitch-synthesis filtering to processing requirements.
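The timing figures above translate directly into the two coders' bit rates. The arithmetic below is a quick sketch; the 80-bit (10-byte) G.729 frame payload is a standard figure stated here as an assumption, since the text above gives only the frame duration and the 8-kbps rate.

```python
SAMPLE_RATE = 8000                       # 8-kHz input, samples per second

# LDCELP (G.728): one 10-bit code word for every 5 input samples.
ldcelp_rate = 10 / (5 / SAMPLE_RATE)     # bits per second
print(ldcelp_rate)                       # 16000.0 -> 16 kbps

# CS-ACELP (G.729): 80 input samples are buffered per frame; at 8 kbps
# each 10-ms frame carries 80 bits (10 bytes) of coded speech.
frame_ms = 80 / SAMPLE_RATE * 1000
print(frame_ms, 8000 * frame_ms / 1000)  # 10.0 ms, 80.0 bits per frame
```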
Cisco VoIP environments typically leverage the benefits of G.729 when transmitting voice traffic over the IP WAN. These benefits include the ability to minimize bandwidth demands while maintaining an acceptable level of voice quality. Several variants of G.729 exist.





