A Comparative Analysis of Digital Audio Encoders: LPC, CELP, and MELP, Evaluating Quality and Complexity of Transmitted Content
Subject Areas: Majlesi Journal of Telecommunication Devices
Saeed Talati 1, Pouria Etezadifar 2, Mohammad Reza Hassani Ahangar 3, Mahdi Molazade 4
1 - Ph.D. Candidate of Electrical Engineering, Imam Hossein University
2 - Assistant Professor, Faculty of Electrical Engineering Department, Imam Hossein University, Tehran, Iran.
3 - Faculty of Electrical Engineering Department, Imam Hossein University, Tehran, Iran
4 - Assistant Professor, Faculty of Electrical Engineering Department, Imam Hossein University, Tehran, Iran
Keywords: Quality, Complexity, LPC, CELP, MELP.
Abstract:
This article compares the quality and complexity of the LPC, CELP, and MELP standard audio encoders. These standards are based on linear prediction and are used in speech processing. They are powerful, high-quality speech coding methods that provide accurate estimates of speech parameters and are widely used in commercial (mobile) and military (NATO) communications. The encoders were compared for male and female voices under four acoustic conditions (Quiet, Q-H250 (quiet audio recorded through the microphone), Office, and MCE) and two channel error conditions (1% and 0.5%). The simulation results show that the complexity of MELP is higher than that of LPC and CELP in terms of both processor and memory requirements. The MELP analyzer accounts for 72% of the coder's total processing time. The additional memory is needed for the vector quantization tables that MELP uses for the line spectral frequencies (LSFs) and the Fourier magnitudes. According to the quality comparison test using the MOS index, MELP has the highest score, followed by CELP and LPC.
Majlesi Journal of Telecommunication Devices Vol. 13, No. 1, March 2024
Saeed Talati1, Pouriya Etezadifar2, Mohammad Reza Hassani Ahangar3, Mahdi Molazade4
1- PhD Candidate, Faculty of Electrical Engineering Department, Imam Hossein University, Tehran, Iran.
Email: Saeed.Talati@ihu.ac.ir
2- Assistant Professor, Faculty of Electrical Engineering Department, Imam Hossein University, Tehran, Iran.
Email: petezadifar@ihu.ac.ir (Corresponding Author)
3- Professor, Faculty of Electrical Engineering Department, Imam Hossein University, Tehran, Iran.
Email: MRHassani@ihu.ac.ir
4- Assistant Professor, Faculty of Electrical Engineering Department, Imam Hossein University, Tehran, Iran.
Email: mmollazade@ihu.ac.ir
1. Introduction
Linear Predictive Coding (LPC) was first presented in 1966, and the method was completed in 1978 [1]. LPC is one of the most common speech coding methods and encodes speech at 2400 bps.
LPC is a powerful speech analysis method that provides accurate estimates of speech parameters. It assumes that speech-like signals are produced by a sound source (a periodic buzz for voiced sounds, with added hiss and pops for unvoiced sounds) that is shaped by the resonances of the vocal tract. Although simple, this model is a close approximation of how real speech is produced.
Code-Excited Linear Prediction (CELP) was presented in 1985 [2]. CELP is a linear predictive speech coding algorithm that encodes speech at 4800 bits per second. It provides high quality and is used in the MPEG-4 Audio speech coder.
2. LPC
LPC models speech as the output of a sound source (a buzz for voiced sounds, noise for unvoiced sounds) filtered by the resonances of the vocal tract, which are called formants. LPC analyzes the speech signal by estimating the formants, removing their effects from the speech signal, and estimating the intensity and frequency of the remaining buzz. The process of removing the formants is called inverse filtering, and the signal that remains after the filtered modeled signal has been subtracted is called the residue. Because the speech signal varies with time, this process is carried out on short segments of the signal, which are called frames. In general, LPC processes speech at 30 to 50 frames per second.
Because LPC is often used to transmit spectral envelope information, it must be tolerant of transmission errors. Transmitting the filter coefficients directly is undesirable because they are very sensitive to errors: a very small error can distort the entire spectrum. For this reason the frame carries reflection coefficients (the RC entries in Table 1) rather than the filter coefficients themselves.
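The analysis steps described in this section can be illustrated with a short sketch. It is not the LPC-10 standard itself: it assumes the autocorrelation method with the Levinson-Durbin recursion (a common textbook choice) to estimate the predictor, returns the reflection coefficients alongside the filter coefficients, and applies the inverse filter to obtain the residue. The function names, the 2000-sample frame, and the order-2 test signal are illustrative assumptions.

```python
import random


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.

    Returns (a, k, err): filter coefficients (a[0] == 1), reflection
    coefficients, and the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + ki * prev[i - j]
        a[i] = ki
        k.append(ki)
        err *= 1.0 - ki * ki
    return a, k, err


def inverse_filter(frame, a):
    """Residue e[n] = sum_j a[j] * x[n - j]; samples before the frame are 0."""
    return [sum(a[j] * frame[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(frame))]


def make_test_frame(n_samples=2000, seed=0):
    """Hypothetical test signal: white noise shaped by a known order-2 filter."""
    rng = random.Random(seed)
    x = []
    for n in range(n_samples):
        past1 = x[n - 1] if n >= 1 else 0.0
        past2 = x[n - 2] if n >= 2 else 0.0
        x.append(1.3 * past1 - 0.4 * past2 + rng.gauss(0.0, 1.0))
    return x


frame = make_test_frame()
a, k, err = levinson_durbin(autocorrelation(frame, 2), 2)
residue = inverse_filter(frame, a)
print("filter coefficients:", [round(c, 3) for c in a])
print("reflection coefficients:", [round(c, 3) for c in k])
print("residue/signal energy:",
      sum(e * e for e in residue) / sum(s * s for s in frame))
```

With the autocorrelation method every reflection coefficient has magnitude below 1, so a quantized or slightly corrupted value still describes a stable synthesis filter. This bounded range is one reason reflection coefficients tolerate errors better than the filter coefficients.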
2.1. Bit Allocation
The bits of an LPC frame are allocated as shown in Table 1 [4].
Table 1. Bit Allocation of LPC Encoder.
Bit | Voiced | Unvoiced | Bit | Voiced | Unvoiced | Bit | Voiced | Unvoiced |
1 | RC(1)-0 | RC(1)-0 | 19 | RC(3)-3 | RC(3)-3 | 37 | RC(8)-1 | R-6* |
2 | RC(2)-0 | RC(2)-0 | 20 | RC(4)-2 | RC(4)-2 | 38 | RC(5)-1 | RC(1)-6* |
3 | RC(3)-0 | RC(3)-0 | 21 | R-3 | R-3 | 39 | RC(6)-1 | RC(2)-6* |
4 | P-0 | P-0 | 22 | RC(1)-4 | RC(1)-4 | 40 | RC(7)-2 | RC(3)-7* |
5 | R-0 | R-0 | 23 | RC(2)-3 | RC(2)-3 | 41 | RC(9)-0 | RC(4)-6* |
6 | RC(1)-1 | RC(1)-1 | 24 | RC(3)-4 | RC(3)-4 | 42 | P-5 | P-5 |
7 | RC(2)-1 | RC(2)-1 | 25 | RC(4)-3 | RC(4)-3 | 43 | RC(5)-2 | RC(1)-7* |
8 | RC(3)-1 | RC(3)-1 | 26 | R-4 | R-4 | 44 | RC(6)-2 | RC(2)-7* |
9 | P-1 | P-1 | 27 | P-3 | P-3 | 45 | RC(10)-1 | Unused |
10 | R-1 | R-1 | 28 | RC(2)-4 | RC(2)-4 | 46 | RC(8)-2 | R-7* |
11 | RC(1)-2 | RC(1)-2 | 29 | RC(7)-0 | RC(3)-5* | 47 | P-6 | P-6 |
12 | RC(4)-0 | RC(4)-0 | 30 | RC(8)-0 | R-5* | 48 | RC(9)-1 | RC(4)-7* |
13 | RC(3)-2 | RC(3)-2 | 31 | P-4 | P-4 | 49 | RC(5)-3 | RC(1)-8* |
14 | R-2 | R-2 | 32 | RC(4)-4 | RC(4)-4 | 50 | RC(6)-3 | RC(2)-8* |
15 | P-2 | P-2 | 33 | RC(5)-0 | RC(1)-5* | 51 | RC(7)-3 | RC(3)-8* |
16 | RC(4)-1 | RC(4)-1 | 34 | RC(6)-0 | RC(2)-5* | 52 | RC(9)-2 | RC(4)-8* |
17 | RC(1)-3 | RC(1)-3 | 35 | RC(7)-1 | RC(3)-6* | 53 | RC(8)-3 | R-3* |
18 | RC(2)-2 | RC(2)-2 | 36 | RC(10)-0 | RC(4)-5* | 54 | Synch. | Synch. |
P = Pitch
R = RMS Amplitude
* = Error Control Bit
Bit 0 = least significant bit of data
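To show how such a bit allocation is used, the following sketch scatters parameter bits into frame positions and gathers them back. It is a hedged illustration only: it covers just the first 18 bit positions of the voiced column of Table 1, so each parameter is represented by its low-order bits, and the names `VOICED_LAYOUT_1_TO_18`, `pack`, and `unpack` are hypothetical helpers, not part of any standard.

```python
# First 18 bit positions of a voiced LPC frame, taken from Table 1.
# "NAME-i" means bit i of parameter NAME (bit 0 is the least significant).
VOICED_LAYOUT_1_TO_18 = [
    "RC(1)-0", "RC(2)-0", "RC(3)-0", "P-0", "R-0", "RC(1)-1",
    "RC(2)-1", "RC(3)-1", "P-1", "R-1", "RC(1)-2", "RC(4)-0",
    "RC(3)-2", "R-2", "P-2", "RC(4)-1", "RC(1)-3", "RC(2)-2",
]


def pack(values, layout):
    """Place each parameter bit at its frame position (frame bit 1 first)."""
    bits = []
    for entry in layout:
        name, _, index = entry.rpartition("-")
        bits.append((values[name] >> int(index)) & 1)
    return bits


def unpack(bits, layout):
    """Rebuild the integer parameter values from the interleaved frame bits."""
    values = {}
    for bit, entry in zip(bits, layout):
        name, _, index = entry.rpartition("-")
        values[name] = values.get(name, 0) | (bit << int(index))
    return values


# Illustrative (made-up) quantizer indices, restricted to the bits covered above.
example = {"RC(1)": 0b1011, "RC(2)": 0b101, "RC(3)": 0b110,
           "RC(4)": 0b01, "P": 0b101, "R": 0b011}
frame_bits = pack(example, VOICED_LAYOUT_1_TO_18)
print(frame_bits)
print(unpack(frame_bits, VOICED_LAYOUT_1_TO_18) == example)
```

Interleaving the bits of different parameters in this way means that a short burst of channel errors is spread over several parameters instead of destroying one of them completely.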
3. CELP
CELP is essentially an Analysis-by-Synthesis (AbS) method, meaning that the coding (analysis) is performed by perceptually optimizing the decoded signal (synthesis) in a closed loop. Because of its high complexity, CELP was initially an impractical proposition. However, many ways to speed up the coding process have since been found, and CELP has become a practical reality [5].
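The closed-loop idea can be sketched as an exhaustive codebook search: every candidate excitation is passed through the synthesis filter, scaled by its best gain, and compared with the target. This is a simplified illustration under stated assumptions, not the standardized coder: the perceptual weighting filter and the adaptive (pitch) codebook are omitted, the codebook is random, and `synthesize` and `search_codebook` are hypothetical names.

```python
import random


def synthesize(excitation, a):
    """All-pole synthesis filter 1/A(z), A(z) = 1 + a[1] z^-1 + ... (a[0] == 1)."""
    out = []
    for n, x in enumerate(excitation):
        feedback = sum(a[j] * out[n - j] for j in range(1, len(a)) if n - j >= 0)
        out.append(x - feedback)
    return out


def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the entry whose synthesized output,
    scaled by its optimal gain, is closest to the target in squared error."""
    best = None
    for index, code in enumerate(codebook):
        synth = synthesize(code, a)
        energy = sum(s * s for s in synth)
        if energy == 0.0:
            continue
        gain = sum(t * s for t, s in zip(target, synth)) / energy
        error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best


# Illustrative setup: 16 random code vectors of 40 samples, first-order filter.
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(16)]
a = [1.0, -0.9]
target = [0.7 * s for s in synthesize(codebook[5], a)]
print(search_codebook(target, codebook, a))
```

The cost of the search grows with the codebook size times the subframe length, which is the source of the high complexity mentioned above. Only the winning index and gain need to be transmitted, which corresponds to the CI and CG fields of the CELP frame.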
3.1. Bit Allocation
The bits of a CELP frame are allocated as shown in Table 2 [4].
Table 2. Bit Allocation of CELP Encoder.
Bit | Description | Bit | Description | Bit | Description | Bit | Description | Bit | Description | Bit | Description |
1 | PG(4)-4 | 25 | PG(3)-1 | 49 | LSP 1-2 | 73 | PD(1)-4 | 97 | PG(1)-2 | 121 | LSP 7-2 |
2 | PD(3)-4 | 26 | PD(4)-5 | 50 | PG(3)-2 | 74 | CG(3)-2 | 98 | CG(3)-4 | 122 | CI(4)-2 |
3 | LSP 1-1 | 27 | CG(1)-3 | 51 | HP-1 | 75 | LSP 7-1 | 99 | LSP 10-2 | 123 | PD(1)-1 |
4 | CG(2)-4 | 28 | CI(3)-5 | 52 | PD(3)-1 | 76 | CI(2)-7 | 100 | CI(4)-5 | 124 | PG(2)-4 |
5 | CI(3)-3 | 29 | LSP 7-0 | 53 | CG(4)-3 | 77 | CI(3)-0 | 101 | CI(2)-0 | 125 | CG(3)-3 |
6 | CI(1)-8 | 30 | CI(2)-1 | 54 | LSP 8-1 | 78 | PD(2)-5 | 102 | PD(1)-2 | 126 | LSP 3-1 |
7 | PD(4)-0 | 31 | PD(3)-7 | 55 | PG(3)-0 | 79 | LSP 4-1 | 103 | LSP 5-1 | 127 | CI(1)-7 |
8 | LSP 8-0 | 32 | CI(1)-0 | 56 | CI(2)-8 | 80 | CG(1)-0 | 104 | SP-0 | 128 | PD(3)-2 |
9 | PG(2)-3 | 33 | PG(4)-0 | 57 | PD(4)-1 | 81 | PG(4)-3 | 105 | PG(4)-2 | 129 | CI(2)-6 |
10 | CG(3)-0 | 34 | LSP 4-3 | 58 | CI(4)-0 | 82 | LSP 9-1 | 106 | CG(2)-3 | 130 | LSP 9-2 |
11 | PD(1)-5 | 35 | CG(3)-1 | 59 | LSP 3-2 | 83 | PD(3)-6 | 107 | LSP 2-1 | 131 | PG(4)-1 |
12 | LSP 3-3 | 36 | CI(1)-5 | 60 | PG(2)-0 | 84 | CI(1)-4 | 108 | PD(4)-4 | 132 | CG(1)-1 |
13 | CI(2)-3 | 37 | PD(2)-0 | 61 | PD(1)-6 | 85 | CG(2)-1 | 109 | CI(1)-2 | 133 | PD(2)-4 |
14 | CI(4)-4 | 38 | CI(4)-1 | 62 | CG(2)-0 | 86 | LSP 6-2 | 110 | PG(2)-1 | 134 | HP-3 |
15 | PD(2)-1 | 39 | LSP 9-0 | 63 | CI(3)-6 | 87 | CI(4)-3 | 111 | CI(3)-7 | 135 | LSP 6-0 |
16 | LSP 10-0 | 40 | CI(3)-8 | 64 | LSP 10-1 | 88 | PG(2)-2 | 112 | LSP 4-0 | 136 | PG(3)-3 |
17 | PG(1)-3 | 41 | PG(1)-4 | 65 | PG(1)-1 | 89 | PD(4)-3 | 113 | CI(2)-5 | 137 | CI(4)-6 |
18 | CG(4)-0 | 42 | CG(2)-2 | 66 | CI(4)-7 | 90 | LSP 1-0 | 114 | PD(1)-7 | 138 | PD(1)-0 |
19 | LSP 5-2 | 43 | PD(1)-3 | 67 | PD(3)-3 | 91 | CG(4)-2 | 115 | PG(1)-0 | 139 | LSP 2-3 |
20 | PD(3)-0 | 44 | LSP 6-1 | 68 | CG(1)-2 | 92 | LSP 8-2 | 116 | CG(4)-4 | 140 | CG(4)-1 |
21 | HP-0 | 45 | CI(3)-4 | 69 | LSP 5-3 | 93 | CI(2)-4 | 117 | LSP 5-0 | 141 | CI(3)-2 |
22 | CI(1)-1 | 46 | CI(2)-2 | 70 | CI(1)-6 | 94 | HP-2 | 118 | PD(4)-2 | 142 | LSP 4-2 |
23 | CI(4)-8 | 47 | CG(1)-4 | 71 | LSP 2-0 | 95 | PD(2)-2 | 119 | CI(1)-3 | 143 | PD(3)-5 |
24 | LSP 2-2 | 48 | PD(2)-3 | 72 | PG(3)-4 | 96 | LSP 3-0 | 120 | CI(3)-1 | 144 | SY |
PG(n)-i = Adaptive Code Gain
PD(n)-i = Adaptive Code Index
LSP j-i = Line Spectral Parameter (LSP), where j = LSP number
CG(n)-i = Fixed, Stochastically-derived Code Gain
CI(n)-i = Fixed, Stochastically-derived Code Index
SP = Expansion Bit
HP-i = Parity
SY = Synchronization Bit
Note: i = bit number, with 0 being the least significant bit; n = subframe number
4. MELP
Mixed Excitation Linear Prediction (MELP) is based on the traditional LPC model and uses additional features such as mixed excitation, aperiodic pulses, adaptive spectral enhancement, a pulse dispersion filter, and Fourier magnitude modeling to improve performance. These features allow the encoder to match the characteristics of the input speech more closely [6].
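The first two of these features can be illustrated with a deliberately simplified sketch. It assumes a single full-band voicing strength, whereas MELP mixes pulse and noise separately in several frequency bands through bandpass filters, and it omits the spectral enhancement, pulse dispersion, and Fourier magnitude stages. The function name, its parameters, and the 180-sample frame with a 40-sample pitch period are illustrative assumptions.

```python
import random


def mixed_excitation(n_samples, pitch_period, voicing, jitter=0.0, seed=0):
    """Blend a pulse train and white noise into one excitation signal.

    voicing = 1.0 gives pure pulses, voicing = 0.0 gives pure noise.
    jitter > 0 perturbs each pitch period by up to +/- that fraction,
    which produces aperiodic pulses.
    """
    rng = random.Random(seed)
    pulses = [0.0] * n_samples
    position = 0.0
    while position < n_samples:
        pulses[int(position)] = 1.0
        step = pitch_period * (1.0 + rng.uniform(-jitter, jitter))
        position += max(1.0, step)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return [voicing * p + (1.0 - voicing) * w for p, w in zip(pulses, noise)]


voiced = mixed_excitation(180, 40, voicing=1.0)
mixed = mixed_excitation(180, 40, voicing=0.6, jitter=0.25)
print([i for i, x in enumerate(voiced) if x != 0])
print(len(mixed))
```

A plain LPC coder has to choose between the two extremes (pulses or noise) for each frame; allowing intermediate mixtures is what lets the excitation follow partially voiced speech.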
4.1. Bit Allocation
The bits of a MELP frame are allocated as shown in Table 3 [6].
Table 3. Bit Allocation of MELP Encoder.
Bit | Voiced | Unvoiced | Bit | Voiced | Unvoiced | Bit | Voiced | Unvoiced |
1 | G(2)-1 | G(2)-1 | 19 | LSF(1)-7 | LSF(1)-7 | 37 | G(1)-1 | G(1)-1 |
2 | BP-1 | FEC(1)-1 | 20 | LSF(4)-6 | LSF(4)-6 | 38 | BP-3 | FEC(1)-3 |
3 | P-1 | P-1 | 21 | P-4 | P-4 | 39 | BP-2 | FEC(1)-2 |
4 | LSF(2)-1 | LSF(2)-1 | 22 | LSF(1)-6 | LSF(1)-6 | 40 | LSF(2)-2 | LSF(2)-2 |
5 | LSF(3)-1 | LSF(3)-1 | 23 | LSF(1)-5 | LSF(1)-5 | 41 | LSF(3)-4 | LSF(3)-4 |
6 | G(2)-4 | G(2)-4 | 24 | LSF(2)-6 | LSF(2)-6 | 42 | LSF(2)-3 | LSF(2)-3 |
7 | G(2)-5 | G(2)-5 | 25 | BP-4 | FEC(1)-4 | 43 | LSF(3)-3 | LSF(3)-3 |
8 | LSF(3)-6 | LSF(3)-6 | 26 | LSF(1)-4 | LSF(1)-4 | 44 | LSF(3)-2 | LSF(3)-2 |
9 | G(2)-2 | G(2)-2 | 27 | LSF(1)-3 | LSF(1)-3 | 45 | LSF(4)-4 | LSF(4)-4 |
10 | G(2)-3 | G(2)-3 | 28 | LSF(2)-5 | LSF(2)-5 | 46 | LSF(4)-3 | LSF(4)-3 |
11 | P-5 | P-5 | 29 | LSF(4)-5 | LSF(4)-5 | 47 | AF | FEC(4)-3 |
12 | LSF(3)-5 | LSF(3)-5 | 30 | FM-1 | FEC(4)-1 | 48 | LSF(4)-2 | LSF(4)-2 |
13 | P-6 | P-6 | 31 | LSF(1)-2 | LSF(1)-2 | 49 | FM-5 | FEC(3)-3 |
14 | P-2 | P-2 | 32 | LSF(2)-4 | LSF(2)-4 | 50 | FM-4 | FEC(3)-2 |
15 | P-3 | P-3 | 33 | FM-8 | FEC(2)-3 | 51 | FM-3 | FEC(3)-1 |
16 | LSF(4)-1 | LSF(4)-1 | 34 | FM-7 | FEC(2)-2 | 52 | FM-2 | FEC(4)-2 |
17 | P-7 | P-7 | 35 | FM-6 | FEC(2)-1 | 53 | G(1)-3 | G(1)-3 |
18 | LSF(1)-1 | LSF(1)-1 | 36 | G(1)-2 | G(1)-2 | 54 | SYNC | SYNC |
G = Gain
FEC = Forward Error Correction Parity Bits
BP = Bandpass Voicing
P = Pitch voicing
LSF = Line Spectral Frequencies
AF = Aperiodic Flag
FM = Fourier Magnitude
Note: Bit 1 = least significant bit of data set
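The frame sizes in Tables 1 to 3 can be related to the bit rates with a small calculation. The sketch below takes the bits per frame from the last bit number of each table and the nominal bit rates from the text (2400 bps for LPC, 4800 bps for CELP) and from the title of [8] (2400 bps for MELP). The frame durations are derived values, not figures stated in this article.

```python
# Bits per frame: last bit number in Tables 1, 2 and 3.
# Bit rates: nominal values quoted in the text (LPC, CELP) and in [8] (MELP).
CODERS = {
    "LPC": {"bits_per_frame": 54, "bit_rate": 2400},
    "CELP": {"bits_per_frame": 144, "bit_rate": 4800},
    "MELP": {"bits_per_frame": 54, "bit_rate": 2400},
}


def frames_per_second(name):
    coder = CODERS[name]
    return coder["bit_rate"] / coder["bits_per_frame"]


def frame_duration_ms(name):
    coder = CODERS[name]
    return 1000.0 * coder["bits_per_frame"] / coder["bit_rate"]


for name in CODERS:
    print(name, round(frames_per_second(name), 2), "frames/s,",
          round(frame_duration_ms(name), 2), "ms per frame")
```

For LPC the resulting frame rate falls inside the range of 30 to 50 frames per second given in Section 2.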
5. Comparison
The LPC, CELP, and MELP audio encoder standards were reviewed in the previous sections. To conclude which one performs better, their performance must be compared in terms of quality, intelligibility, communicability, recognizability, and complexity for two types of speech (male and female).
5.1. Quality
For quality testing, we use the MOS under benign noise conditions [7]. Quality testing is often used to supplement or replace intelligibility testing. It provides a picture of the listeners' subjective opinions of the signal delivered by the communication systems or processed by the algorithms under test.
The MOS test results for the LPC, CELP, and MELP audio encoders are shown in Figs. 1 to 4. The test was carried out in four acoustic noise conditions and two channel conditions. The acoustic conditions are: Quiet, a noise-free environment; Q-H250, audio recorded in a quiet environment through the H250 microphone; Office, recorded in a modern office; and MCE, a mobile command environment [8]. The two channel error conditions are a 1% random bit error channel and a 0.5% random block error channel, where a block error contains 50% errors in a 35 ms block.
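The two channel error conditions used in the MOS test (a 1% random bit error channel and a 0.5% random block error channel) can be mimicked with a short simulation. This is a sketch under assumptions: the 0.5% figure is interpreted here as the probability that a 35 ms block is hit, the block length of 84 bits assumes a 2400 bps stream, and the function names are hypothetical.

```python
import random

BIT_RATE = 2400                                   # bps, assumed stream rate
BITS_PER_BLOCK = int(round(0.035 * BIT_RATE))     # 35 ms block -> 84 bits


def random_bit_errors(bits, error_rate, rng):
    """Flip each bit independently with probability error_rate."""
    return [b ^ 1 if rng.random() < error_rate else b for b in bits]


def block_errors(bits, block_len, block_rate, in_block_rate, rng):
    """Hit each block with probability block_rate; inside a hit block,
    flip each bit with probability in_block_rate."""
    out = list(bits)
    for start in range(0, len(out), block_len):
        if rng.random() < block_rate:
            for i in range(start, min(start + block_len, len(out))):
                if rng.random() < in_block_rate:
                    out[i] ^= 1
    return out


rng = random.Random(0)
stream = [0] * (BITS_PER_BLOCK * 1000)
bit_channel = random_bit_errors(stream, 0.01, rng)
block_channel = block_errors(stream, BITS_PER_BLOCK, 0.005, 0.5, rng)
print("bit error channel, fraction flipped:", sum(bit_channel) / len(stream))
print("block error channel, fraction flipped:", sum(block_channel) / len(stream))
```

The two channels stress a coder differently: random bit errors touch almost every frame lightly, while block errors leave most frames intact and corrupt a few of them heavily.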
Fig. 1. MOS Test Result for LPC.
Fig. 2. MOS Test Result for CELP.
Fig. 3. MOS Test Result for MELP.
Note: MOS = Mean Opinion Score.
Fig. 4. MOS Comparison.
The relative ranking of the coders is easily seen in Figure 4. In all environments, MELP shows the highest MOS, followed by CELP and then LPC.
The MELP and LPC coders scored higher overall for male speakers than for female speakers. Only in the 0.5% block error condition did the female MELP score exceed the male score, but this difference was within the standard error. The CELP coder, in contrast, scored higher overall for female speakers than for male speakers. This difference was especially pronounced in the office environment. CELP in the Q-H250 condition also showed significantly higher scores for female speakers [8]. Table 4 shows the simulation results obtained for each of the standards in the different conditions.
Table 4. MOS simulation results.
Coder | Quiet | Q-H250 | Office | MCE | 1% | 0.5% |
MELP | 3.30 | 3.16 | 2.95 | 2.57 | 2.07 | 3.50 |
CELP | 3.16 | 3.01 | 2.87 | 2.38 | 1.96 | 3.08 |
LPC | 2.20 | 1.98 | 2.08 | 1.09 | 0.98 | 2.31 |
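The ranking can be checked directly from the values in Table 4. The sketch below only restates the table as data and sorts the coders per condition. The per-coder average across conditions is an added convenience figure, not a result reported in this article.

```python
CONDITIONS = ["Quiet", "Q-H250", "Office", "MCE", "1%", "0.5%"]

# MOS values copied from Table 4, in the order of CONDITIONS.
MOS = {
    "MELP": [3.30, 3.16, 2.95, 2.57, 2.07, 3.50],
    "CELP": [3.16, 3.01, 2.87, 2.38, 1.96, 3.08],
    "LPC": [2.20, 1.98, 2.08, 1.09, 0.98, 2.31],
}


def ranking(condition):
    """Coders sorted from highest to lowest MOS in one condition."""
    column = CONDITIONS.index(condition)
    return sorted(MOS, key=lambda coder: MOS[coder][column], reverse=True)


def mean_mos(coder):
    """Unweighted average of a coder's MOS over the six conditions."""
    return sum(MOS[coder]) / len(MOS[coder])


for condition in CONDITIONS:
    print(condition, ranking(condition))
for coder in MOS:
    print(coder, round(mean_mos(coder), 3))
```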
6. Complexity
Complexity was measured in terms of MIPS, read-only memory (ROM), and random-access memory (RAM).
Table 5. Complexity Comparison [8].
Coder | MIPS | RAM | ROM |
MELP | 20.43 | 98.2K | 128K |
CELP | 17.0 | 14.8K | 128K |
LPC | 8.7 | 12.93K | 128K |
As Table 5 shows, the complexity of MELP exceeds that of both LPC and CELP in processor and memory requirements. The MELP analyzer accounts for 72% of the coder's total processing time. The additional memory is needed for the vector quantization tables that MELP uses for both the line spectral frequencies (LSFs) and the Fourier magnitudes [8].
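The figures in Table 5 can be turned into relative costs with a few lines of arithmetic. The sketch below copies the MIPS and RAM columns of Table 5 and applies the 72% analyzer share quoted above. The split into analyzer and synthesizer MIPS is a derived estimate, and it assumes that the remaining 28% is spent in the synthesizer.

```python
# MIPS and RAM (in K) copied from Table 5.
MIPS = {"MELP": 20.43, "CELP": 17.0, "LPC": 8.7}
RAM_K = {"MELP": 98.2, "CELP": 14.8, "LPC": 12.93}

MELP_ANALYZER_SHARE = 0.72   # fraction of MELP processing spent in analysis


def ratio(table, coder, reference):
    """How many times larger a coder's figure is than the reference coder's."""
    return table[coder] / table[reference]


melp_analyzer_mips = MELP_ANALYZER_SHARE * MIPS["MELP"]
melp_synthesizer_mips = MIPS["MELP"] - melp_analyzer_mips

print("MELP vs LPC, MIPS ratio:", round(ratio(MIPS, "MELP", "LPC"), 2))
print("MELP vs CELP, MIPS ratio:", round(ratio(MIPS, "MELP", "CELP"), 2))
print("MELP vs CELP, RAM ratio:", round(ratio(RAM_K, "MELP", "CELP"), 2))
print("MELP analyzer MIPS (derived):", round(melp_analyzer_mips, 2))
print("MELP synthesizer MIPS (derived):", round(melp_synthesizer_mips, 2))
```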
7. Support
This article is adapted from Saeed Talati's doctoral thesis at Imam Hossein Comprehensive University, entitled "Recognition of digital audio steganography in LPC10, CELP, and MELP audio encoder standards".
8. Conclusion
In this article, the standard audio encoders LPC, CELP, and MELP are compared in terms of quality and complexity. These are powerful speech coding standards that are widely used in mobile, commercial, and military communications (including as an official NATO standard). The quality comparison across the different conditions is given in Figure 4 and Table 4: using the MOS index, MELP has the highest score, followed by CELP and LPC. The complexity comparison is given in Table 5. The results show that the complexity of MELP is higher than that of LPC and CELP in terms of both processor and memory requirements. The MELP analyzer accounts for 72% of the coder's total processing time. The additional memory is needed for the vector quantization tables that MELP uses for the line spectral frequencies (LSFs) and the Fourier magnitudes.
REFERENCES
[1] Bishnu Atal, "The History of Linear Prediction". IEEE Signal Processing Magazine, vol. 23, no. 2, pp. 154-161, March 2006.
[2] Bishnu Atal and Manfred Schroeder. "Predictive coding of speech signals and subjective error criteria". ICASSP '78. IEEE International Conference on Acoustics, Speech, and Signal Processing. 3: 573–576, 1978.
[3] A. V. McCree and T. P. Barnwell, "A mixed excitation LPC vocoder model for low bit rate speech coding". IEEE Transactions on Speech and Audio Processing, vol. 3, no. 4, pp. 242-250, July 1995.
[4] J. J. D. van Schalkwyk, D. J. Joubert and J. G. van der Linde, "Linear predictive speech coding at 2400 b/s," in Transactions of the South African Institute of Electrical Engineers, vol. 84, no. 3, pp. 146-152, June 1993.
[5] M. Schroeder and B. Atal, "Code-excited linear prediction (CELP): High-quality speech at very low bit rates," ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing, 1985, pp. 937-940, doi: 10.1109/ICASSP.1985.1168147.
[6] Weiran Lin, Soo Ngee Koh and Xiao Lin, "Mixed excitation linear prediction coding of wideband speech at 8 kbps," 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), 2000, pp. II1137-II1140 vol.2, doi: 10.1109/ICASSP.2000.859165.
[7] J.D. Tardelli, E.W. Kreamer, “Vocoder Intelligibility and Quality Test Methods”, IEEE International Conference on Acoustics, Speech, and Signal Processing, Atlanta, Georgia, USA, 1996.
[8] M. A. Kohler, "A comparison of the new 2400 bps MELP Federal Standard with other standard coders," 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1997, pp. 1587-1590 vol.2, doi: 10.1109/ICASSP.1997.596256.
[9] Saeed Talati, Pouriya Etezadifar. (2020). “Providing an Optimal Way to Increase the Security of Data Transfer Using Watermarking in Digital Audio Signals”, MJTD, vol. 10, no. 1.
[10] Hashemi, Seyed & Barati, Shahrokh & Talati, S. & Noori, H. (2016). “A genetic algorithm approach to optimal placement of switching and protective equipment on a distribution network”. Journal of Engineering and Applied Sciences. 11. 1395-1400.
[11] Hashemi, Seyed & Abyari, M. & Barati, Shahrokh & Tahmasebi, Sanaz & Talati, S. (2016). “A proposed method to controller parameter soft tuning as accommodation FTC after unknown input observer FDI”. Journal of Engineering and Applied Sciences. 11. 2818-2829.
[12] S. Talati, A. Rahmati, and H. Heidari. (2019) “Investigating the Effect of Voltage Controlled Oscillator Delay on the Stability of Phase Lock Loops”, MJTD, vol. 8, no. 2, pp. 57-61.
[13] Talati, S., & Alavi, S. M. (2020). “Radar Systems Deception using Cross-eye Technique”. Majlesi Journal of Mechatronic Systems, 9(3), 19-21.
[14] Saeed Talati, mohamadreza Hasani Ahangar (2020) “Analysis, Simulation and Optimization of LVQ Neural Network Algorithm and Comparison with SOM”, MJTD, vol. 10, no. 1.
[15] Talati, S., & Hassani Ahangar. M. R. (2020) “Combining Principal Component Analysis Methods and Self-Organized and Vector Learning Neural Networks for Radar Data”, Majlesi Journal of Telecommunication Devices, 9(2), 65-69.
[16] Hassani Ahangar, M. R., Talati, S., Rahmati, A., & Heidari, H. (2020). “The Use of Electronic Warfare and Information Signaling in Network-based Warfare”. Majlesi Journal of Telecommunication Devices, 9(2), 93-97.
[17] Aslinezhad, M., Mahmoudi, O., & Talati, S. (2020). “Blind Detection of Channel Parameters Using Combination of the Gaussian Elimination and Interleaving”. Majlesi Journal of Mechatronic Systems, 9(4), 59-67.
[18] Talati, S., & Amjadi, A. (2020). “Design and Simulation of a Novel Photonic Crystal Fiber with a Low Dispersion Coefficient in the Terahertz Band”. Majlesi Journal of Mechatronic Systems, 9(2), 23-28.
[19] Talati, Saeed, Hassani Ahangar, Mohammad Reza. (2021). “Radar Data Processing Using a Combination of Principal Component Analysis Methods and Self-Organized and Digitizing Learning Vector Neural Networks”, Electronic and Cyber Defense, 9 (2), pp. 1-7.
[20] Talati, S., Alavi, S. M., & Akbarzade, H. (2021). “Investigating the Ambiguity of Ghosts in Radar and Examining the Diagnosis and Ways to Deal with it”. Majlesi Journal of Mechatronic Systems, 10(2).
[21] Etezadifar, P., & Talati, S. (2021). “Analysis and Investigation of Disturbance in Radar Systems Using New Techniques of Electronic Attack”. Majlesi Journal of Telecommunication Devices, 10(2), 55-59.
[22] Saeed. Talati, Behzad. Ebadi, Houman. Akbarzade “Determining of the fault location in distribution systems in presence of distributed generation resources using the original post phasors”. QUID 2017, pp. 1806-1812, Special Issue No.1- ISSN: 1692-343X, Medellín-Colombia. April 2017.
[23] Talati, Saeed, Akbari Thani, Milad, Hassani Ahangar, Mohammad Reza. (2020). “Detection of Radar Targets Using GMDH Deep Neural Network”, Radar Journal, 8 (1), pp. 65-74.
[24] Talati, S., Abdollahi, R., Soltaninia, V., & Ayat, M. (2021). “A New Emitter Localization Technique Using Airborne Direction Finder Sensor”. Majlesi Journal of Mechatronic Systems, 10(4), 5-16.
[25] O. Sharifi-Tehrani, "Design, Simulation and Fabrication of Microstrip Hairpin and Interdigital BPF for 2.25 GHz Unlicensed Band," Majlesi Journal of Telecommunication Devices, vol. 6, no. 4, 2017.
[26] O. Sharifi-Tehrani and S. Talati. (2017) “PPU Adaptive LMS Algorithm, a Hardware-Efficient Approach; a Review on”, Majlesi Journal of Mechatronic Systems, vol. 6, no. 1.
[27] O. Sharifi-Tehrani, "Hardware Design of Image Channel Denoiser for FPGA Embedded Systems," Przegląd Elektrotechniczny, vol. 88, no. 3b, pp. 165-167, 2012.
[28] O. Sharifi-Tehrani, A. Sadeghi, and S. M. J. Razavi, "Design and Simulation of IFF/ATC Antenna for Unmanned Aerial Vehicle," Majlesi Journal of Mechatronic Systems, vol. 6, no. 1, pp. 1-4, 2017.
[29] O. S. Tehrani, M. Ashourian, and P. Moallem, "An FPGA-based implementation of fixed-point standard-LMS algorithm with low resource utilization and fast convergence," International Review on Computers and Software, vol. 5, no. 4, pp. 436-444, 2010.
[30] O. Sharifi-Tehrani, "Novel hardware-efficient design of LMS-based adaptive FIR filter utilizing Finite State Machine and Block-RAM," Przeglad Elektrotechniczny, vol. 87, no. 7, pp. 240-244, 2011.
[31] O. Sharifi-Tehrani, M. F. Sabahi, and M. R. Danee, "Low-Complexity Framework for GNSS Jamming and Spoofing Detection on Moving Platforms," IET Radar, Sonar & Navigation, vol. 14, no. 12, pp. 2027-2038, 2020.
[32] M. Ashourian and O. Sharifi-Tehrani, "Application of semi-circle law and Wigner spiked-model in GPS jamming confronting," Signal, Image and Video Processing, pp. 1-8, 2022.
[33] O. Sharifi-Tehrani, M. F. Sabahi, and M. Danaee, "Null broadened–deepened array antenna beamforming for GNSS jamming mitigation in moving platforms," ICT Express, vol. 8, no. 2, pp. 161-165, 2022.
[34] O. Sharifi-Tehrani, H. Lashgarian, M. Soleymanzade, and M. H. Ghasemian, "Futurology of Electronic Warfare Systems for IR. IRAN's Fast Crafts," Majlesi Journal of Telecommunication Devices, vol. 8, no. 2, 2019.
[35] O. Sharifi-Tehrani, A. Sadeghi, and S. M. J. Razavi, "Design and Simulation of IFF/ATC Antenna for Unmanned Aerial Vehicle," Majlesi Journal of Mechatronic Systems, vol. 6, no. 1, pp. 1-4, 2017.
[36] O. Sharifi-Tehrani, M. F. Sabahi, and M. R. Danee, "GNSS Jamming Detection of UAV Ground Control Station Using Random Matrix Theory," ICT Express, vol. In Press, 2020.
[37] O. Sharifi-Tehrani, "Novel hardware-efficient design of LMS-based adaptive FIR filter utilizing Finite State Machine and Block-RAM," Przeglad Elektrotechniczny, vol. 87, no. 7, pp. 240-244, 2011.
[38] H. Pourghassem, O. Sharifi-Tehrani, and M. Nejati, "A novel weapon detection algorithm in X-ray dual-energy images based on connected component analysis and shape features," Australian Journal of Basic and Applied Sciences, vol. 5, pp. 300-307, 2011.
[39] O. S. Tehrani, M. Ashourian, and P. Moallem, "Fpga implementation of a channel noise canceller for image transmission," in Machine Vision and Image Processing (MVIP), 2010 6th Iranian, 2010, pp. 1-6: IEEE.
[40] Ghazali, S. M., Baleghi, Y. “Pedestrian Detection in Infrared Outdoor Images Based on Atmospheric Situation Estimation” Journal of AI and Data Mining, 2019; 7(1): 1-16. doi: 10.22044/jadm.2018.5742.1696
[41] Talati, S., Ghazali, S. M., Hassani Ahangar, M., & Alavi, S. M. (2021). “Analysis and Evaluation of Increasing the Throughput of Processors by Eliminating the Lobe’s Disorder” Majlesi Journal of Telecommunication Devices, 10(3), 119-123. https://doi.org/10.52547/mjtd.10.3.119
[42] Seyed Morteza Ghazali, Jalil Mazloum, Yasser Baleghi. “Modified binary salp swarm algorithm in EEG signal classification for epilepsy seizure detection” Biomedical Signal Processing and Control. Volume 78, September 2022, 103858.
[43] EtezadiFar. P., Talati. S., Hassani Ahangar. M.R., Molazade. M., “Investigation of Steganography Methods in Audio Standard Coders: LPC, CELP, MELP” Majlesi Journal of Telecommunication Devices, 12(1), in press, 2023.
Note: MIPS = million instructions per second.
Paper type: Research paper
DOI: 10.30486/MJTD.1402.1106218
Received: 7 November 2023; revised: 15 December 2023; accepted: 28 January 2024; published: 1 March 2024
How to cite this paper: S. Talati, P. Etezadifar, M. R. Hassani Ahangar, M. Molazade, “A Comparative Analysis of Digital Audio Encoders: LPC, CELP, and MELP, Evaluating Quality and Complexity of Transmitted Content”, Majlesi Journal of Telecommunication Devices, Vol. 13, No. 1, pp. 27-35, 2024.