Comparative Evaluation of Speaker Recognition Systems Based on the LPC, CC and MFCC Algorithms

Authors

DOI:

https://doi.org/10.36561/ING.17.6

Keywords:

Speaker recognition systems, Crowd noise, MFCC algorithm, CC algorithm, LPC algorithm

Abstract

This paper evaluates speaker recognition systems based on the LPC (Linear Predictive Coding), CC (Cepstral Coefficients) and MFCC (Mel Frequency Cepstral Coefficients) algorithms, which are used to extract voice parameters. The evaluation follows an experimental quantitative methodology: it measures how performance changes when the input signal is exposed to two kinds of noise (crowd noise and Gaussian noise) at different SNR levels, and compares the verification results for 2 speakers. Although the performance of every system degrades in noisy environments, each one has its own intrinsic level of robustness. The evaluation is intended as a reference for building speaker recognition systems that include voice enhancement stages to reduce noise.
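The abstract does not say how the noisy test signals were produced, so the following is only a minimal sketch of one common way to expose a signal to noise at a chosen SNR: the noise is scaled so that the ratio of speech power to noise power matches the target, then added sample by sample. The function names (`power`, `mix_at_snr`, `measured_snr_db`), the 8000 Hz sampling rate and the sine tone standing in for speech are all assumptions made for illustration and do not come from the paper.

```python
import math
import random


def power(signal):
    """Mean power (average squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it.

    Hypothetical helper: the paper does not describe its mixing procedure.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = power(speech)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")
    # SNR_dB = 10 * log10(P_speech / P_noise_scaled), solved for the noise gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean signal and its noisy version."""
    residual = [y - x for x, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


rng = random.Random(0)   # fixed seed so the run is repeatable
fs = 8000                # assumed sampling rate in Hz
# One second of a 200 Hz tone as a stand-in for a speech recording.
speech = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]
# White Gaussian noise; a crowd-noise recording could be passed in the same way.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(fs)]

for snr in (20, 10, 0):
    noisy = mix_at_snr(speech, gaussian, snr)
    print(snr, round(measured_snr_db(speech, noisy), 2))
```

In an evaluation of this kind, each noisy signal would then be passed through the LPC, CC or MFCC front end, and the verification results would be compared across SNR levels.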

Published

2019-11-29

How to Cite

[1]
Y. González, H. Juárez, O. Rocha, R. Hernández, and A. Bermúdez, “Comparative Evaluation of Speaker Recognition Systems Based on the LPC, CC and MFCC Algorithms”, Memoria investig. ing. (Facultad Ing., Univ. Montev.), no. 17, pp. 121–136, Nov. 2019.
