3'-UTR SIRF: a database for identifying clusters of short interspersed repeats in 3' untranslated regions.
Short (~5 nucleotides) interspersed repeats regulate several aspects of post-transcriptional gene expression. Previously we developed an algorithm (REPFIND) that assigns P-values to all repeated motifs in a given nucleic acid sequence and reliably identifies clusters of short CAC-containing motifs required for mRNA localization in Xenopus oocytes. In order to facilitate the identification of genes possessing clusters of repeats that regulate post-transcriptional aspects of gene expression in mammalian genes, we used REPFIND to create a database of all repeated motifs in the 3' untranslated regions (UTR) of genes from the Mammalian Gene Collection (MGC). The MGC database includes seven vertebrate species: human, cow, rat, mouse and three non-mammalian vertebrate species. A web-based application was developed to search this database of repeated motifs to generate species-specific lists of genes containing specific classes of repeats in their 3'-UTRs. This computational tool is called 3'-UTR SIRF (Short Interspersed Repeat Finder), and it reveals that hundreds of human genes contain an abundance of short CAC-rich and CAG-rich repeats in their 3'-UTRs that are similar to those found in mRNAs localized to the neurites of neurons. We tested four candidate mRNAs for localization in rat hippocampal neurons by in situ hybridization. Our results show that two candidate CAC-rich (Syntaxin 1B and Tubulin beta4) and two candidate CAG-rich (Sec61alpha and Syntaxin 1A) mRNAs are localized to distal neurites, whereas two control mRNAs lacking repeated motifs in their 3'-UTR remain primarily in the cell body. Computational data generated with 3'-UTR SIRF indicate that hundreds of mammalian genes have an abundance of short CA-containing motifs that may direct mRNA localization in neurons. In situ hybridization shows that four candidate mRNAs are localized to distal neurites of cultured hippocampal neurons.
These data suggest that short CA-containing motifs may be part of a widely utilized genetic code that regulates mRNA localization in vertebrate cells. The use of 3'-UTR SIRF to search for new classes of motifs that regulate other aspects of gene expression should yield important information in future studies addressing cis-regulatory information located in 3'-UTRs.
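The abstract describes assigning P-values to repeated motifs in a sequence. The sketch below illustrates the general idea only and is not the REPFIND algorithm, whose statistic is not given here. It counts overlapping occurrences of a short motif and scores the count with a simple binomial tail. It assumes a uniform background (each base with probability 0.25) and treats positions as independent, which is only an approximation for self-overlapping motifs such as CAC. The function names `count_kmers`, `binomial_tail`, and `motif_enrichment` are hypothetical.

```python
from collections import Counter
from math import comb


def count_kmers(seq, k=3):
    """Count every overlapping k-mer in seq (uppercased, U treated as T)."""
    seq = seq.upper().replace("U", "T")
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def binomial_tail(n, x, p):
    """Return P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(x, n + 1))


def motif_enrichment(seq, motif):
    """Return (count, p_value) for motif in seq.

    Assumption (not from the paper): uniform base composition, so a
    motif of length k matches any given position with probability 0.25**k.
    """
    motif = motif.upper().replace("U", "T")
    k = len(motif)
    n = max(len(seq) - k + 1, 0)   # number of candidate start positions
    count = count_kmers(seq, k)[motif]
    return count, binomial_tail(n, count, 0.25 ** k)


if __name__ == "__main__":
    # Toy sequence for illustration; not a real 3'-UTR.
    count, p_value = motif_enrichment("CACCACGGCAC", "CAC")
    print(count, p_value)
```

A small P-value here only says that the motif occurs more often than the uniform background predicts. Detecting clusters, as described in the abstract, would also need to take the positions of the matches into account.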
PubMed ID: 17663765
PMC ID: PMC1973087
Article link: BMC Bioinformatics.
Genes referenced: nanos1 sec61a1 slc25a20 stx1a stx1b tubb4a
Article Images:
Figure 3. Localization of the Rat and Human Tubβ4 3'-UTRs in Xenopus oocytes. The 3'-UTR of rat or human Tubβ4 (Acc. # 82522352 and BC013683, respectively), and human Stx1B2 (Acc. # BC062298) were synthesized and labelled in vitro with Alexa-Fluor-546-UTP. These fluorescently labelled RNAs were then microinjected into stage II Xenopus oocytes. All three RNAs localize to the vegetal pole, which is oriented downwards in all panels. A fragment of the Xenopus β-globin gene (XβG) was used as a negative control for localization, whereas the mitochondrial cloud RNA localization element from the Xenopus Xcat-2 mRNA (MCLE) was used as a positive control. Note that the extent of Stx1B2 localization is higher than that of either Tubβ4 RNA. Arrows depict the localized RNA towards the vegetal pole and GV indicates the germinal vesicle (nucleus) in these cells, which are ~300 μm in diameter.