August 1, 2015
Xenbase: Core features, data acquisition, and data processing.
Xenbase, the Xenopus model organism database (www.xenbase.org), is a cloud-based, web-accessible resource that integrates the diverse genomic and biological data from Xenopus research. Xenopus frogs are one of the major vertebrate animal models used for biomedical research, and Xenbase is the central repository for the enormous amount of data generated using this model tetrapod. The goal of Xenbase is to accelerate discovery by enabling investigators to make novel connections between molecular pathways in Xenopus and human disease. Our relational database and user-friendly interface make these data easy to query and allow investigators to quickly interrogate and link different data types in ways that would otherwise be difficult, time consuming, or impossible. Xenbase also enhances the value of these data through high-quality gene expression curation and data integration, by providing bioinformatics tools optimized for Xenopus experiments, and by linking Xenopus data to other model organisms and to human data. Xenbase draws in data via pipelines that download data, parse the content, and save them into appropriate files and database tables. Furthermore, Xenbase makes these data accessible to the broader biomedical community by continually providing annotated data updates to organizations such as NCBI, UniProtKB, and Ensembl. Here, we describe our bioinformatics and genome-browsing tools, data acquisition and sharing, our community-submitted and literature curation pipelines, text-mining support, gene page features, and the curation of gene nomenclature and gene models.
FIG. 1. Xenbase content. Schematic representation of the major data types stored within Xenbase, and their relationships to each other.
FIG. 2. An example of a Xenbase Gene Page. In this example, data for the pax6 gene are displayed. Under the top navigation bar, a set of
“tabs” are visible, each of which will load specialized content when selected. Within the body of the Summary tab (the default), information
is organized into rows containing the same type of data, and columns that contain data from the same species or subgenome. Most tabs
indicate whether they contain content with either an icon or number, to avoid nonproductive selection of empty tabs.
FIG. 3. The Xenbase browser can load three different builds of both X. tropicalis (4.1, 7.1, and 8.0) and X. laevis (6.0, 7.2 [both J strain],
and WT1.0). Multiple gene models, their annotations, and a variety of epigenomic and ancillary data are available for most genome builds.
Many tracks launch pop-up windows with further information. In this view, a user has clicked on the morpholino feature labeled “cyp26a1
MO1.” This reagent was designed against X. laevis cyp26a, so a number of mismatches are present in the alignment to the active view,
which is X. tropicalis build 7.1. The stacked ChIP-seq tracks for p300 are displayed using Topoview (available at: http://flybase.org/static_
FIG. 4. Curation pipeline. The curation pipelines incorporate an integrated set of automatic, semiautomatic, and manual processes, in which as much data as possible from each publication is curated. In a final step, the article is entered into the “Xenbase Article Curation
Tracker” (XACT), an automatic set of Ruby and Perl scripts that generate a Google fusion table containing all the information on articles
imported each week. Curators access this table to monitor the status, content, and priority of every article in the database from the time of its
first entry to its complete curation.
FIG. 5. Literature page. In addition to features such as title, authors, and abstract, Xenbase publications also display lists of the genes cited
and of the antibodies and morpholinos referenced (with hyperlinks to these separate resource pages). All anatomical features (e.g., neural tube) identified by text-mining for XAO terms are hyperlinked, as are gene symbols (e.g., Rfx2). Author names are also linked to pages in our community module and, through these, to the author’s other publications.