Multiple Sequence Alignment: Comparison and Evaluation
Date
2012
Abstract
The present investigation was undertaken to compare and evaluate eleven Multiple Sequence
Alignment (MSA) tools, namely Bali-phy, ClustalW, DCA, Dialign, Kalign, MAFFT, Multalin,
MUSCLE, PRALINE, PROBCONS and T-COFFEE, and to present a choice to the users of MSA
regarding the optimal program. Nineteen sequences each of cytochrome C and hexokinase were
aligned by these tools using their default parameters. A comparison of the features of the MSA tools
showed that Linux was the most favoured operating system and FASTA the most favoured input
as well as output format. PRALINE was the only web-only tool and Bali-phy the only stand-alone-only
tool, while the remaining tools were available in both stand-alone and web-based forms. The MSA programs
were based on different alignment algorithms and scoring matrices. The alignments produced were used to
generate phylogenetic trees. Comparison of these phylogenetic trees with the reference phylogenetic
tree, i.e. the NCBI Taxonomy Common Tree, showed that while the tree obtained from the T-COFFEE
alignment was nearest to the NCBI Taxonomy Common Tree, none of the MSA tools produced
consistent phylogenetic trees. The Multalin tool gave the highest diversity of
position scores on the Scorecons server for both cytochrome C and hexokinase sequences, while T-COFFEE,
MUSCLE, Kalign and MAFFT gave low scores. The benchmark database,
BAliBASE, was used in the current study for cytochrome C sequences. The MSA
programs PRALINE, MAFFT, MUSCLE and PROBCONS gave high sum-of-pairs (SP) scores for both divergent
sequences and sub-families, while T-COFFEE gave a low score. The inconsistent phylogenetic trees and
the discrepancy between the Scorecons and BAliBASE scores of the various MSA tools may be attributed to
the use of highly divergent sequences, ranging from bacteria to humans, in the current study.
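
For readers unfamiliar with the SP score reported against BAliBASE, the following is a minimal sketch of the usual sum-of-pairs idea: the fraction of residue pairs aligned in the reference alignment that are also aligned in the test alignment. It is an illustration only and not the scoring program used in the study. The function names `aligned_pairs` and `sp_score` and the toy sequences are hypothetical, and the sketch assumes both alignments list the same sequences in the same order with `-` as the gap character.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share a column.

    Each residue is identified as (sequence index, residue index),
    where the residue index counts non-gap characters from the left.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows must have the same length")

    counters = [0] * len(alignment)  # next residue index per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != gap:
                present.append((seq_idx, counters[seq_idx]))
                counters[seq_idx] += 1
        # every pair of residues in this column counts as aligned
        pairs.update(combinations(present, 2))
    return pairs


def sp_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        raise ValueError("reference alignment has no aligned residue pairs")
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Toy data (hypothetical, not taken from the study).
reference = ["AC-GT", "ACDGT", "A-DGT"]
test = ["ACG-T", "ACDGT", "AD-GT"]
print(f"SP score: {sp_score(test, reference):.3f}")
```

A score of 1.0 means the test alignment reproduces every aligned pair in the reference. Benchmark scoring programs may restrict the calculation to annotated core regions, which this sketch does not do.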
Keywords
Enzymes, Cytochromes, Coffee, Fungi, Proteins, Plant habit, Fruits, Bioinformatics, Bacteria, Sets