Analysis - Transcriptome Analysis
Prokaryotic Transcriptome Analysis Pipeline Overview
The prokaryotic transcriptome analysis pipeline at IGS is a comprehensive resource that performs several of the most common transcriptome analysis tasks. The pipeline consists of the following steps, tools and products:
- Alignment – The first step in the analysis is alignment of RNA-Seq reads to a reference genome using Bowtie. The resulting alignments are converted into BAM files using SAMtools. Outputs from this step include a table showing, for each sample, the total number of mapped reads, the percentage of mapped reads and the percentage of reads that fall in genic or intergenic regions. These statistics provide an overall view of how well reads from each sample aligned to the reference and can be used to identify possible problems with the data or sample.
- RPKM analysis – Using the BAM files for each sample, the RPKM (reads per kilobase per million mapped reads) values for each gene in the samples are calculated based on the gene annotation contained in the reference genome. Tab-delimited text files showing RPKM values for each gene in the reference genome are provided, along with the number of genes in the reference genome covered by RNA-Seq data. PDF plots are also generated to show the RPKM results in a graphical format.
- Differential expression analysis – To identify genes that have increased or decreased expression in one set of samples vs. another, differential expression analysis is performed using DESeq and edgeR on the read counts generated using HTSeq. The results for the sample comparisons are filtered and the output is stored in tab-delimited text files as well as Excel spreadsheets. The filters are based on the False Discovery Rate (FDR) or p-value and read counts. Several predefined cutoffs are applied in the standard output, but user-defined cutoffs can also be used. Along with the data files, clustering plots based on normalized read counts for conditions and samples are generated.
- Visualization – The Integrative Genomics Viewer (IGV) can be used for visualization of alignment, coverage and annotation. Upon request, an IGV instance can be created that allows the user to access their data without having to download the data files.
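As a rough illustration of two of the steps above, the following Python sketch computes RPKM from raw counts and applies an FDR and read-count filter to differential expression results. It is not the pipeline's own code. The function names, the dictionary field names (`fdr`, `read_count`), the cutoff values (0.05 and 10) and the example numbers are all assumptions made for illustration only; the pipeline's actual predefined cutoffs are not stated here.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads.

    RPKM = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def filter_de_results(rows, max_fdr=0.05, min_read_count=10):
    """Keep genes passing both cutoffs (cutoff values are hypothetical)."""
    return [
        row for row in rows
        if row["fdr"] <= max_fdr and row["read_count"] >= min_read_count
    ]


# Hypothetical example data, not real pipeline output.
counts = {"geneA": 500, "geneB": 0}          # reads assigned to each gene
lengths = {"geneA": 1000, "geneB": 2000}     # gene lengths in base pairs
total_mapped = 2_000_000                     # mapped reads in the sample

rpkm_values = {g: rpkm(counts[g], lengths[g], total_mapped) for g in counts}

# Number of genes with at least one read, i.e. covered by RNA-Seq data.
genes_covered = sum(1 for c in counts.values() if c > 0)

de_rows = [
    {"gene": "geneA", "fdr": 0.01, "read_count": 500},
    {"gene": "geneB", "fdr": 0.20, "read_count": 50},
    {"gene": "geneC", "fdr": 0.001, "read_count": 3},
]
significant = filter_de_results(de_rows)
```

With these made-up inputs, only `geneA` passes both filters: `geneB` fails the FDR cutoff and `geneC` fails the minimum read count. Requiring both conditions reflects the description above, where results are filtered on a significance measure together with read counts.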
Training workshops are available on this topic. Please click here for more information.