Tools
A program for analysis of protein functional divergence and prediction of molecular mechanisms.
Rapid detection, classification and accurate alignment of up to a million or more related protein sequences.
Ergatis is a web-based utility for creating, running, and monitoring reusable computational analysis pipelines. It uses the Workflow engine to execute them and contains pre-built components for common bioinformatics analysis tasks. Ergatis is under active development at IGS and the J. Craig Venter Institute (JCVI).
IDEA - Interactive Display for Evolutionary Analyses
IDEA (Interactive Display for Evolutionary Analyses) provides a graphical interface for PAML (Phylogenetic Analysis by Maximum Likelihood, Ziheng Yang, 1997), a suite of programs for conducting molecular evolution analyses on nucleotide and amino-acid data. IDEA allows you to run either of the PAML programs codeml or baseml on a single dataset or on multiple datasets simultaneously. Among other things, codeml and baseml allow you to obtain maximum likelihood estimates of numbers of substitutions per branch and per site and to compare multiple models of molecular evolution given the data and a phylogenetic tree for the sequences. IDEA runs on Linux, Solaris and Mac OS X operating systems; it is designed to execute processes in parallel on a multiprocessor machine and can run on a computer grid with either SGE or Condor. IDEA is available free of charge from SourceForge.
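One common way to compare two nested models fitted by codeml or baseml is a likelihood ratio test on the log-likelihood (lnL) values that PAML reports. The sketch below is only an illustration of that arithmetic and is not part of IDEA or PAML: the function names are hypothetical, the lnL values in the usage note are made up, and the chi-square tail is computed with a closed form that only holds for an even number of degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For even df the tail has the closed form
    exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    so no external statistics library is needed.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p_value) for two nested models.

    lnl_null: log-likelihood of the simpler (null) model.
    lnl_alt:  log-likelihood of the more general model.
    df:       difference in the number of free parameters.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf_even_df(stat, df)
```

For example, with made-up values `likelihood_ratio_test(-1000.0, -995.0, 2)`, the statistic is 2 × 5 = 10 and the p-value is exp(-5), so the simpler model would be rejected at the 1% level. For an odd number of degrees of freedom, a statistics library would be needed instead of the closed form used here.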
Manatee is a web-based tool used to perform manual functional annotation. It has been specifically designed to optimize the ability of curators to evaluate all available sequence-based and experimental data to assign the best possible annotation to a given gene product. Manatee allows users to view, modify, and store annotation through interactions with an underlying relational database where all of the information is stored. Manatee supports the storage of multiple types of functional annotation including protein names, gene symbols, EC numbers, Gene Ontology terms, and associated supporting evidence. In addition, Manatee provides summary views of statistics and information from the genome as a whole.
PhyloTrac is a software package for exploration and analysis of phylogenetic diversity from PhyloChip data. PhyloTrac is capable of displaying data from multiple PhyloChip experiments in a variety of styles, including heatmap, time series/parallel coordinates, probe intensity display, phylogenetic tree, and textual spreadsheets. All views are fully synchronized and dynamic so that selection and filtering in one view is instantaneously reflected in the other views.
Sybil is a web-based tool for visualizing and mining comparative genomic data. Powered by a Chado relational database, Sybil provides a rich set of interfaces for browsing and analyzing data. Sybil has been implemented for a variety of organisms, both prokaryotes and eukaryotes. Most recently, a Sybil site named Strepneumo has been released to the public for comparison of complete Streptococcus pneumoniae genomes. Strepneumo promises to be an important tool in accelerating vaccine discovery in developing nations.
Sybil allows users to search for genes or gene clusters of interest and visualize their genomic context. The various displays provide multiple types of genomic comparisons for in-depth data mining, data interrogation from multiple angles, and generation of publication-ready figures. Sybil also gives users the ability to identify core and accessory genes from all or a subset of the available genomes. All of the views in Sybil are interactive and allow the user to browse the data seamlessly, for instance moving from a whole genome comparison to a local comparison to an individual gene report to an orthologous cluster report.
A Chado relational database forms the backend of Sybil and is populated with genome annotations and comparisons of two or more closely related genomes. The pre-computed comparisons include an all-vs-all BLASTP and a clustering of orthologous and paralogous genes. Paralogous gene clusters are defined based on a Jaccard similarity coefficient computed from BLASTP matches. Orthologous gene clusters are defined by single-linkage clustering of bidirectional best BLASTP hits between genes or paralogous clusters of genes from all pairs of genomes. Besides genome annotation and gene clusters, Sybil has been used to visualize repeats, small- and large-scale polymorphisms, and microarray data. Future development will focus on expanding the supported data types as well as improving usability and image customization.
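The clustering described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Sybil's actual implementation: the function names, the `(query, subject, score)` hit format, the use of each gene's set of BLASTP match partners as the input to the Jaccard coefficient, and the 0.6 threshold are all hypothetical choices made for the example. For brevity it also clusters individual genes only, and does not collapse paralogous clusters into single units before the ortholog step.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity coefficient |A & B| / |A | B| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def paralog_links(hits, genome_of, threshold=0.6):
    """Link same-genome genes whose sets of BLASTP matches overlap strongly.

    The threshold is an illustrative value, not one taken from Sybil.
    """
    matches = defaultdict(set)
    for query, subject, _score in hits:
        matches[query].add(subject)
    links = []
    for a, b in combinations(sorted(matches), 2):
        same_genome = genome_of[a] == genome_of[b]
        if same_genome and jaccard(matches[a], matches[b]) >= threshold:
            links.append((a, b))
    return links


def bidirectional_best_hits(hits, genome_of):
    """Return gene pairs from different genomes that are each other's best hit."""
    best = {}  # (query, subject genome) -> (score, subject)
    for query, subject, score in hits:
        if genome_of[query] == genome_of[subject]:
            continue  # within-genome hits are not ortholog candidates
        key = (query, genome_of[subject])
        if key not in best or score > best[key][0]:
            best[key] = (score, subject)

    pairs = set()
    for (query, _genome), (_score, subject) in best.items():
        back = best.get((subject, genome_of[query]))
        if back is not None and back[1] == query:
            pairs.add(tuple(sorted((query, subject))))
    return pairs


def single_linkage(items, links):
    """Group items so that any two linked items end up in the same cluster."""
    parent = {item: item for item in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for item in items:
        clusters[find(item)].add(item)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Made-up example: five genes from three genomes (A, B, C).
genome_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
hits = [
    ("a1", "b1", 200), ("b1", "a1", 198),
    ("b1", "c1", 180), ("c1", "b1", 182),
    ("a1", "b2", 40),
    ("a2", "b2", 150), ("b2", "a2", 150),
]
ortholog_clusters = single_linkage(
    sorted(genome_of), bidirectional_best_hits(hits, genome_of)
)
print(ortholog_clusters)
```

In this toy data, `a1` and `c1` have no direct hit to each other, but both are bidirectional best hits of `b1`, so single linkage places all three in one cluster. That transitive joining is the defining property of single-linkage clustering, and it is also why a single spurious best-hit pair can merge two otherwise separate clusters.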
Sybil is implemented in Perl and built on a tiered architecture that includes an API for retrieving data from Chado. The software architecture also includes utilities for rendering publication-quality images in SVG and PDF formats. Sybil is open source and freely available; documentation and demo databases are available for download.
Workflow is a Java-based, XML-driven workflow engine suite that can be used to build, execute, and monitor complex process pipelines. This tool serves as the execution engine for Ergatis. Workflow is under active development at IGS and the J. Craig Venter Institute (JCVI).