Improving the efficiency of dot-matrix similarity searches through use of an oligomer table.
AUTHOR(S)
Fristensky, B
ABSTRACT
Dot-matrix sequence similarity searches can be greatly sped up through use of a table listing all locations of short oligomers in one of the sequences to find potential similarities with a second sequence. The algorithm described finds similarities between two sequences of lengths M and N, comparing L residues at a time, with an efficiency of L × M × N / S^k, where S is the alphabet size and k is the length of the oligomer. For nucleic acids, in which S = 4, use of a tetranucleotide table (k = 4) results in an efficiency of L × M × N / 256. The simplicity of the approach allows for a straightforward calculation of the level of similarities expected to be found for given search parameters. Furthermore, the storage required is minimal, allowing for even large sequences to be compared on small microcomputers. Theoretical considerations regarding the use of this search are discussed.
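The idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the function names, the choice to anchor each L-residue window at the start of the shared oligomer, and the `min_matches` threshold are all assumptions made for illustration. It also assumes the window length is at least k and that residues are uniformly distributed when estimating the expected number of oligomer matches.

```python
from collections import defaultdict


def build_oligomer_table(seq, k):
    """Map every k-length oligomer to the positions where it starts in seq."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]].append(i)
    return table


def dot_matrix_hits(seq_a, seq_b, k=4, window=10, min_matches=8):
    """Return (i, j, matches) for windows starting at seq_a[i] and seq_b[j].

    Only cell pairs that share an identical k-mer are examined, so roughly
    1 / S**k of the M x N cells receive the full window comparison.
    Anchoring the window at the k-mer start is an illustrative assumption.
    """
    table = build_oligomer_table(seq_a, k)
    hits = []
    for j in range(len(seq_b) - window + 1):
        # Look up where the k-mer at position j of seq_b occurs in seq_a.
        for i in table.get(seq_b[j:j + k], ()):
            if i + window > len(seq_a):
                continue  # window would run past the end of seq_a
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches >= min_matches:
                hits.append((i, j, matches))
    return hits


def expected_seed_matches(m, n, k, alphabet_size=4):
    """Expected count of identical k-mer pairs under a uniform random model."""
    return (m - k + 1) * (n - k + 1) / alphabet_size ** k


if __name__ == "__main__":
    a = "ACGTACGTTTGACC"
    b = "GGACGTACGTAA"
    print(dot_matrix_hits(a, b, k=4, window=8, min_matches=8))
    print(expected_seed_matches(len(a), len(b), k=4))
```

The table is built once for one sequence, and the second sequence is scanned a single time, so storage grows with the length of the indexed sequence rather than with M × N.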
ARTICLE ACCESS
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=339447
RELATED DOCUMENTS
- Livelihood benefits of small improvements in the life table.
- Homology Induction: the use of machine learning to improve sequence similarity searches.
- A method for measuring the non-random bias of a codon usage table.
- Improving quality of drug use through hospital directorates.
- Protein sequence similarity searches using patterns as seeds.