Prediction of splice junctions in mRNA sequences.

AUTOR(ES)
RESUMO

A general method based on the statistical technique of discriminant analysis is developed to distinguish boundaries of coding and non-coding regions in nucleic acid sequences. In particular, the method is applied to the prediction of splicing sites in messenger RNA precursors. Information used for discrimination includes consensus sequence patterns around splice junctions, free energy of snRNA and mRNA base pairing, and statistical differences between coding and non-coding regions such as periodic appearance of specific bases in coding regions reflecting the non-random usage of degenerate codons. Given the reading frame of an exon (but not the exon/intron boundaries), the method will predict the following exon, namely, the intron to be excised out. When applied to human sequences in the GenBank database, the method correctly identified 80% of true splice junctions.

Documentos Relacionados