Abstract
The position-specific scoring matrix (PSSM) has been widely used for identifying protein functional sites. However, it is 20-dimensional and contains many redundant features. The Kidera factors were reported to contain information relating to almost all physical properties of amino acids, but they require appropriate weighting coefficients to express those properties. We developed a novel method, named KSPSSMpred, which integrates the PSSM and the Kidera factors into a 10-dimensional matrix (KSPSSM) for ligand-binding site prediction. Flavin adenine dinucleotide (FAD) was chosen as a representative ligand for this study. When compared with five other feature-based methods on a benchmark dataset, KSPSSMpred performed the best. This study demonstrates that KSPSSM is an effective feature extraction method that can enrich the PSSM with information relating to 188 physical properties of residues and halve the number of feature dimensions without losing the information contained in the PSSM.
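The abstract does not say how the PSSM and the Kidera factors are combined, so the following is only a minimal sketch of one plausible reading: the per-residue PSSM rows (20 scores each) are multiplied by a 20 x 10 table holding the ten Kidera factors of each amino acid, which gives 10 features per residue. The function name `project_pssm`, the use of a plain matrix product, and the random placeholder numbers are all assumptions made for illustration; they are not taken from the paper, and the placeholder numbers are not the published Kidera factor values.

```python
import random

N_AMINO_ACIDS = 20   # columns of a standard PSSM
N_KIDERA = 10        # number of Kidera factors per amino acid


def project_pssm(pssm, factors):
    """Multiply an L x 20 PSSM by a 20 x 10 factor table, giving L x 10.

    Hypothetical helper: the matrix product is an assumed way of
    combining the two sources, not the paper's confirmed formula.
    """
    if any(len(row) != N_AMINO_ACIDS for row in pssm):
        raise ValueError("every PSSM row must have 20 scores")
    if len(factors) != N_AMINO_ACIDS or any(len(f) != N_KIDERA for f in factors):
        raise ValueError("factor table must be 20 x 10")
    return [
        [sum(row[a] * factors[a][k] for a in range(N_AMINO_ACIDS))
         for k in range(N_KIDERA)]
        for row in pssm
    ]


# Placeholder data only: random numbers stand in for real PSSM scores and
# for the published Kidera factor values, which are not reproduced here.
rng = random.Random(0)
toy_pssm = [[rng.randint(-5, 5) for _ in range(N_AMINO_ACIDS)] for _ in range(7)]
toy_factors = [[rng.uniform(-2, 2) for _ in range(N_KIDERA)]
               for _ in range(N_AMINO_ACIDS)]

toy_kspssm = project_pssm(toy_pssm, toy_factors)
print(len(toy_kspssm), len(toy_kspssm[0]))  # 7 10
```

Under this reading, each residue's 20 substitution scores act as weights on the amino-acid property vectors, which is one way the weighting coefficients mentioned above could be supplied. Any windowing, normalization, or classifier used by KSPSSMpred is outside the scope of this sketch.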
Original language | English |
---|---|
Pages (from-to) | 70-84 |
Number of pages | 15 |
Journal | International Journal of Data Mining and Bioinformatics |
Volume | 12 |
Issue number | 1 |
Publication status | Published - 2015 |
Keywords
- Kidera factors
- Ligand-binding site
- Position specific scoring matrix
ASJC Scopus subject areas
- Information Systems
- Biochemistry, Genetics and Molecular Biology (all)
- Library and Information Sciences