Efficient discovery of loosely structured motifs in biological data
FASSETTI, Fabio;GRECO, Gianluigi;TERRACINA, Giorgio
2006-01-01
Abstract
In the last few years, the completion of human genome sequencing has raised a wide range of challenging new issues in raw data analysis. In particular, the discovery of information implicitly encoded in biological sequences is assuming a prominent role in identifying genetic diseases and in deciphering biological mechanisms. This information is usually represented by patterns that occur frequently in the sequences, also called motifs. Motivated by biological observations, the class of structured motifs has received much attention. This paper contributes to this setting by providing an efficient algorithm for identifying novel classes of structured motifs, in which several kinds of "exceptions" (whose biological relevance has recently emerged in the literature) may be tolerated in pattern repetitions.