Abstract—Detecting common motifs among proteins with low sequence identity provides important clues to protein function and helps classify unknown proteins into the proper families. Motif identification is therefore essential for annotating database proteins that share less than 30% homology. In the present work, we detect conserved regions in protein sequences using digital signal processing methods, namely the discrete Fourier transform (DFT) and the wavelet transform, with a ten-bit numerical representation of amino acids based on physico-chemical properties. The resulting ten-bit representation of each residue in the protein sequence correlates significantly with its biological activity. Conserved motifs are identified from peak regions in the DFT spectrum and the wavelet spectrum. The new ten-bit numerical representation gives better results with the wavelet transform than with the DFT. We use the wavelet transform to decompose protein sequences represented numerically by different indices, such as positive charge, negative charge, polarity, charge, medium volume, small volume, aliphatic, aromatic chain, and alicyclic character of the amino acids. The decomposed signals are then plotted to identify similar regions across all the proteins. The results indicate that the wavelet transform with a ten-bit binary representation of physico-chemical properties is a promising approach for conserved motif detection. The proposed techniques are not only fast but also give a better interpretation of conserved motifs in protein sequences.
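The pipeline summarized above (binary property indicator, then DFT or wavelet decomposition) can be illustrated with a minimal sketch. It is not the authors' implementation: the residue set used for the positive-charge indicator (K, R, H), the choice of the Haar wavelet, and all function names are assumptions made for illustration only, since the abstract does not specify them.

```python
import cmath
import math

# Assumption: residues treated as positively charged. The paper's actual
# ten-bit property table is not given in the abstract.
POSITIVE = set("KRH")

def indicator(seq, members):
    """Map a protein sequence to a 0/1 signal for one property bit."""
    return [1.0 if aa in members else 0.0 for aa in seq.upper()]

def dft_power(signal):
    """Power spectrum |X[k]|^2 of a real signal via a direct DFT."""
    n = len(signal)
    power = []
    for k in range(n):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal))
        power.append(abs(s) ** 2)
    return power

def haar_step(signal):
    """One level of the Haar wavelet transform: (approximation, detail).

    An odd-length input is padded by repeating its last sample.
    """
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])
    r = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / r for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / r for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Repeated Haar steps; returns the final approximation and all details."""
    approx, details = list(signal), []
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

if __name__ == "__main__":
    seq = "MKRHAAGKLLDE"  # hypothetical sequence, for illustration only
    sig = indicator(seq, POSITIVE)
    print(dft_power(sig))
    print(haar_decompose(sig, 2))
```

In such a sketch, each of the ten property bits would yield its own indicator signal, and the decomposed signals of several proteins would be compared to locate regions with similar patterns.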
Index Terms—Conserved motif, discrete Fourier transform, physico-chemical properties, wavelet transform.
J. K. Meher is with the Department of Computer Science and Engg, Vikash College of Engg for Women, Bargarh, Odisha, India (e-mail: email@example.com).
M. K. Raval is with the Department of Chemistry, Gangadhar Meher College, Sambalpur.
P. K. Meher is with the Department of Embedded Systems, Institute for Infocomm Research, Singapore.
G. N. Dash is with the Department of Physics, Sambalpur University, India.
Cite: J. K. Meher, M. K. Raval, P. K. Meher, and G. N. Dash, "Wavelet Transform for Detection of Conserved Motifs in Protein Sequences with Ten Bit Physico-Chemical Properties," International Journal of Information and Electronics Engineering, vol. 2, no. 2, pp. 200-204, 2012.