Abstract: Current studies on spoken document retrieval (SDR) concentrate on building robust systems that reduce the impact of automatic speech recognition (ASR) errors on retrieval performance. In this paper, we propose an SDR system whose main goal is to reduce the effect of ASR transcription errors on retrieval performance. An ASR system is employed to convert Malay spoken broadcast news to text. The performance of unsupervised learning is evaluated on the Malay broadcast news using the Apriori algorithm.
Index Terms: Spoken document retrieval, unsupervised learning, Apriori algorithm, broadcast news segmentation.
Zainab Ali Khalaf was with Basra University, Basra, Iraq. She is now with the Department of Computer Sciences, Universiti Sains Malaysia (USM), Penang, Malaysia (e-mail: email@example.com).
Tan Tien Ping is with the Department of Computer Sciences, Universiti Sains Malaysia (USM), Penang, Malaysia (e-mail: firstname.lastname@example.org).
Cite: Zainab Ali Khalaf and Tan Tien Ping, "MAHIR System: Unsupervised Segmentation for Malay Spoken Broadcast News Stories," International Journal of Information and Electronics Engineering, vol. 5, no. 3, pp. 211-215, 2015.