Fig. 1 | Journal of Biological Research-Thessaloniki

Fig. 1

From: PredPhos: an ensemble framework for structure-based prediction of phosphorylation sites

The framework of PredPhos. Phosphorylation sites in the training set were mapped to Protein Data Bank (PDB) entries using BLAST. Each residue is encoded with 51 site features, 51 Euclidean neighborhood features and 51 Voronoi neighborhood features. Feature selection proceeds in two steps. In the first step, a random forest algorithm ranks the features in descending order of Z-score, and the top 80 features are selected. The second step is a wrapper-based feature selection: features are evaluated by tenfold cross-validation with the SVM algorithm, and redundant features are removed by sequential backward elimination. Finally, an ensemble of n classifiers is built using different subsets, and the final result is determined by majority vote among the outputs of the n classifiers.
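
The selection and voting steps in the caption can be illustrated with a minimal Python sketch. It is not the PredPhos implementation: the function names `select_top_k`, `backward_elimination` and `majority_vote` are hypothetical, the `evaluate` callable stands in for the tenfold cross-validated SVM score, and both the stopping rule and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence


def select_top_k(z_scores: Dict[str, float], k: int) -> List[str]:
    """Rank features by Z-score in descending order and keep the top k."""
    ranked = sorted(z_scores, key=lambda name: z_scores[name], reverse=True)
    return ranked[:k]


def backward_elimination(features: Sequence[str],
                         evaluate: Callable[[List[str]], float]) -> List[str]:
    """Greedy sequential backward elimination (assumed stopping rule).

    At each round, drop the one feature whose removal gives the best score.
    Stop as soon as every possible removal lowers the score.
    `evaluate` is a caller-supplied scoring function (hypothetical stand-in
    for a cross-validated classifier score).
    """
    current = list(features)
    best = evaluate(current)
    while len(current) > 1:
        candidates = []
        for f in current:
            subset = [g for g in current if g != f]
            candidates.append((evaluate(subset), subset))
        score, subset = max(candidates, key=lambda c: c[0])
        if score < best:
            break  # every removal hurts, so keep the current set
        best, current = score, subset
    return current


def majority_vote(votes: Sequence[int]) -> int:
    """Return the label chosen by most classifiers.

    Ties are broken in favour of the smallest label (an assumption).
    """
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)
```

In this sketch, `select_top_k(z_scores, 80)` would correspond to the first selection step, `backward_elimination` to the wrapper step, and `majority_vote` would be applied per residue to the n classifier outputs.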
