|Title:||Sequence-based prediction of enzyme thermostability through bioinformatics algorithms|
|Citation:||Current Bioinformatics, 2010; 5(3):195-203|
|Publisher:||Bentham Science Publishers Ltd|
|Authors:||Mansour Ebrahimi and Esmaeil Ebrahimie|
|Abstract:||Predicting the thermostability of a biomolecule from its sequence is one of the major challenges of protein engineering, and developing tools to screen thermostable mutants is of great interest. Here we used various screening, clustering, decision tree and generalized rule induction models to search for patterns of thermostability. Arg was solely found as the N-terminal amino acid in proteins at temperatures higher than 70 °C. Fifty-four protein features were identified as important by feature selection, and the number of peer groups (anomaly index 2.12) declined from 7 to 2 when only the selected features were used; K-Means and TwoStep clusters did not change with or without feature selection filtering. Tree depths of the decision tree models ranged from 14 branches (C5.0 with 10-fold cross-validation and feature selection) to 4 (CHAID); C5.0 performed best and QUEST worst. Feature selection made no significant difference to the performance of the decision tree models, but it significantly reduced the number of peer groups in the clustering models (p < 0.05). The frequency of Gln was the most important feature in the decision tree rules, and it appeared in the antecedent of all association rules. The importance of Gln in protein thermostability is discussed in this paper.|
|Keywords:||Bioinformatics; modeling; protein; thermostability|
|Rights:||© 2010 Bentham Science Publishers Ltd.|
|Appears in Collections:||Animal and Veterinary Sciences publications|
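
The abstract names sequence-derived features such as the frequency of Gln and the identity of the N-terminal amino acid, but this record does not say how they were computed. As a minimal sketch only, assuming each protein is given as a plain one-letter amino acid string, features of that kind could be derived as follows. The function name `sequence_features` and the feature keys are hypothetical and are not taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (Q = Gln, R = Arg).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_features(seq):
    """Return per-residue frequencies and the N-terminal residue of a sequence.

    Illustrative helper only; the paper's actual feature set (54 selected
    features) is not described in this record.
    """
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    # Fraction of the sequence made up by each standard amino acid.
    features = {f"freq_{aa}": counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS}
    # First residue of the chain, e.g. "R" for Arg.
    features["n_terminal"] = seq[0]
    return features


# Toy sequence, not real data: 3 of its 7 residues are Gln (Q).
example = sequence_features("RQQAQLK")
print(example["freq_Q"], example["n_terminal"])
```

A table of such feature dictionaries, one per protein, is the kind of input that clustering or decision tree models like those named in the abstract would typically consume.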