A software tool looks for patterns in the sequences of long RNA molecules that aren’t translated into proteins and predicts their function¹. The work could help clarify the role that these so-called long noncoding RNAs play in autism.
Long noncoding RNAs, which measure more than 200 nucleotides, are implicated in autism. Some of them regulate genes or signaling pathways associated with the condition. Scientists know the function of only a minority of them, however.
The new tool examines long noncoding RNA sequences one nucleotide at a time. It searches for repeated motifs, or patterns of three to eight RNA nucleotides. Then it calculates how often those motifs show up relative to the length of the RNA. This step helps correct for the variety of lengths of long noncoding RNAs and uncovers sequence similarities other tools may miss.
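The counting step described above can be illustrated with a minimal Python sketch. It is not the researchers' code: the function name `kmer_frequencies` and the toy sequences are hypothetical, and dividing each count by the sequence length is just one simple reading of "relative to the length of the RNA."

```python
from collections import Counter

def kmer_frequencies(seq, k):
    """Slide a window of k nucleotides along the sequence one position
    at a time, count each motif, and divide by the sequence length.
    (Hypothetical helper; the normalization is an assumption.)"""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input too
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n / len(seq) for kmer, n in counts.items()}

# Toy sequences: the same repeat, in a short and a tenfold longer RNA.
short_rna = "AUGAUGAUG"
long_rna = short_rna * 10

print(kmer_frequencies(short_rna, 3)["AUG"])
print(kmer_frequencies(long_rna, 3)["AUG"])
```

The raw count of the motif AUG differs tenfold between the two toy sequences (3 versus 30), but the length-normalized frequency is the same, which is the sense in which this step lets RNAs of very different lengths be compared.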
The software groups the RNAs based on the frequency of the motifs; shared motifs reveal ‘families’ of RNAs with similar functions.
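To make the grouping idea concrete, here is a rough Python sketch under stated assumptions. The article does not say how the software compares motif frequencies or forms groups, so the Pearson correlation, the 0.5 threshold, the greedy grouping rule, and the names `profile`, `pearson` and `group` are all illustrative choices rather than the tool's actual method. The four sequences are invented.

```python
from collections import Counter
from itertools import product
from math import sqrt

def profile(seq, k=3):
    """Length-normalized frequency of every possible k-mer, in a fixed order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] / len(seq) for p in product("ACGU", repeat=k)]

def pearson(x, y):
    """Pearson correlation between two equal-length frequency profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def group(rnas, threshold=0.5, k=3):
    """Greedy grouping (an assumption): each RNA joins the first group
    whose founding member it correlates with at or above the threshold."""
    profiles = {name: profile(seq, k) for name, seq in rnas.items()}
    groups = []
    for name in rnas:
        for members in groups:
            if pearson(profiles[name], profiles[members[0]]) >= threshold:
                members.append(name)
                break
        else:
            groups.append([name])
    return groups

# Invented toy sequences of different lengths.
rnas = {
    "rna_a": "AUGAUGAUGAUGCCAUGAUG",
    "rna_b": "AUGAUGAUGAUGAUGAUGAUGAUGGGAUGAUGAUG",
    "rna_c": "CCGGCCGGCCGGCCGGCCGG",
    "rna_d": "CCGGCCGGCCGGAACCGGCCGGCCGGCCGG",
}
print(group(rnas))  # [['rna_a', 'rna_b'], ['rna_c', 'rna_d']]
```

In this toy example the two AUG-rich sequences fall into one 'family' and the two CCGG-rich sequences into another, even though the members of each pair differ in length and would not align end to end.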
The researchers tested their software on 161 long noncoding RNAs with sequence similarities that other algorithms had already identified. Their tool classified these RNAs as well as or better than two other algorithms.
The software then identified shared motifs among long noncoding RNAs with no known sequence similarities and assigned functions to the resulting groups. It also indicated where in the cell a long noncoding RNA is active, offering still more clues to the RNA’s function. Researchers described the tool in the October issue of Nature Genetics.