Web-Based Automatic Language Identification System

Authors

  • Mauricio M. Olvera, Angel Sánchez, and Larry H. Escobar

Keywords:

GMM, language identification, MFCC, SDC.

Abstract

Language Identification (LID) is the automated process of identifying which language is being spoken from a speech sample by an unknown speaker. In this work we present a web-based LID system that uses Shifted Delta Cepstral (SDC) features derived from Mel-Frequency Cepstral Coefficients (MFCC) to capture relevant acoustic information from speech signals, and Gaussian Mixture Models (GMM) as the classifier. Speech corpora covering four languages (English, Spanish, French and German) were built from recordings of audio media found on the Internet. The web implementation uses up-to-date web technologies, with GNU Octave running on the server side to perform the numerical computations. Results showed a system accuracy ranging from 72.5% to 80%, depending on the duration of the test speech segments.
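
The SDC-plus-GMM pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, which ran in GNU Octave; Python is used here only for illustration. The abstract does not state the SDC parameters, so the defaults below (shift d=1, block spacing p=3, k=7 blocks, a configuration commonly seen in the LID literature) are assumptions. The function names `sdc`, `gmm_log_likelihood` and `identify`, the diagonal-covariance form of the GMM, and the clamping of edge frames are likewise hypothetical choices.

```python
import math


def sdc(cepstra, d=1, p=3, k=7):
    """Stack k delta vectors per frame: c(t + i*p + d) - c(t + i*p - d).

    cepstra is a list of per-frame coefficient vectors (e.g. MFCCs).
    The defaults d=1, p=3, k=7 are assumed, not taken from the paper.
    """
    last = len(cepstra) - 1

    def frame(t):
        # Assumed edge handling: clamp to the nearest valid frame.
        return cepstra[min(max(t, 0), last)]

    features = []
    for t in range(len(cepstra)):
        stacked = []
        for i in range(k):
            ahead = frame(t + i * p + d)
            behind = frame(t + i * p - d)
            stacked.extend(a - b for a, b in zip(ahead, behind))
        features.append(stacked)
    return features


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comps = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xj, mj, vj in zip(x, mu, var):
                ll -= 0.5 * (math.log(2.0 * math.pi * vj) + (xj - mj) ** 2 / vj)
            comps.append(ll)
        # Log-sum-exp over mixture components for numerical stability.
        peak = max(comps)
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)


def identify(frames, models):
    """Return the language whose GMM scores the feature frames highest.

    models maps a language name to a (weights, means, variances) tuple.
    """
    return max(models, key=lambda lang: gmm_log_likelihood(frames, *models[lang]))
```

In such a setup, one GMM would be trained per language on SDC features from that language's corpus, and a test segment would be assigned to the language with the highest average log-likelihood. Training (for example with expectation-maximization) is omitted from the sketch.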

Published

01.09.2016

How to Cite

Web-Based Automatic Language Identification System. (2016). International Journal of Information and Electronics Engineering, 6(5), 304-307. https://www.ijiee.org/index.php/ijiee/article/view/360