Auxiliary programs

CDD2MGS

Converts NCBI Conserved Domain hiMSAs into MAPGAPS input format. The hiMSAs can be downloaded from the NCBI CDD link below.

cdd2mgs_v1.3.tar

NCBI CDD hiMSAs

Use the link below to download NCBI Conserved Domain hiMSAs as a compressed tarball.

ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/hiMSA

AddPhylum

Adds kingdom and phylum taxonomic labels to NCBI nr protein sequences for use with GISMO, BPPS, and DARC. C++ source code and a simple example are included in the tarball.

addphylum_v1.4.tar

NCBI taxonomy dump ftp site

The taxdump.tar.gz and accession2taxid/prot.accession2taxid.gz files are required by the AddPhylum program.

ftp://ftp.ncbi.nih.gov/pub/taxonomy
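
To illustrate how the taxonomy dump supports phylum labeling, here is a minimal Python sketch. It is not the AddPhylum source, which is C++ and ships in the tarball. It assumes the standard layout of the nodes.dmp and names.dmp files inside taxdump.tar.gz: fields separated by a tab, a vertical bar, and a tab, with each line ending in a tab and a bar. The function names are hypothetical. The lookup from a protein accession to a taxid through prot.accession2taxid.gz is left out.

```python
def parse_dmp_line(line):
    """Split one line of an NCBI taxonomy *.dmp file into its fields."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]  # drop the line terminator
    return line.split("\t|\t")


def load_nodes(lines):
    """Map tax_id -> (parent_tax_id, rank) from nodes.dmp lines."""
    nodes = {}
    for line in lines:
        fields = parse_dmp_line(line)
        nodes[int(fields[0])] = (int(fields[1]), fields[2])
    return nodes


def load_scientific_names(lines):
    """Map tax_id -> scientific name from names.dmp lines."""
    names = {}
    for line in lines:
        fields = parse_dmp_line(line)
        if fields[3] == "scientific name":
            names[int(fields[0])] = fields[1]
    return names


def name_at_rank(tax_id, nodes, names, rank="phylum"):
    """Walk up the tree from tax_id; return the ancestor name at `rank`, or None."""
    seen = set()  # guards against the root, which is its own parent
    while tax_id in nodes and tax_id not in seen:
        seen.add(tax_id)
        parent, node_rank = nodes[tax_id]
        if node_rank == rank:
            return names.get(tax_id)
        tax_id = parent
    return None
```

Passing a different value for rank retrieves other levels of the lineage. The rank strings present in nodes.dmp depend on the dump release, so it is worth checking them against the downloaded file.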

NCBI non-redundant (nr) protein sequence ftp site

Download the FASTA-formatted nr.gz and pdbaa.gz files from this site for use as input to MAPGAPS and other programs.

ftp://ftp.ncbi.nih.gov/blast/db/FASTA/
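
The nr.gz file is very large, so it can help to inspect it without unpacking it first. The following Python sketch streams FASTA records straight from a gzip-compressed file using only the standard library. The function names are hypothetical, and the programs listed on this page are not required for it.

```python
import gzip


def read_fasta(handle):
    """Yield (defline, sequence) pairs from an open text handle in FASTA format."""
    defline, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if defline is not None:
                yield defline, "".join(chunks)
            defline, chunks = line[1:], []  # strip the leading '>'
        elif defline is not None and line:
            chunks.append(line)
    if defline is not None:
        yield defline, "".join(chunks)


def read_gzipped_fasta(path):
    """Stream records from a gzip-compressed FASTA file without unpacking it to disk."""
    with gzip.open(path, "rt") as handle:
        yield from read_fasta(handle)
```

Because both functions are generators, only one record is held in memory at a time.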

ConvertMSA 

Converts alignments from cma format to mFASTA, from mFASTA to cma, and from cma to rich text format (RTF). The RTF files are suitable for publication.

convert_msa_v1_0_1.tar

PurgeMSA

Merges concatenated cma files into one file and removes sequence fragments and redundant sequences. The link below provides the executable (beta version).

purge_msa.tar
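
The general idea of purging an alignment can be sketched in Python as follows. This is not PurgeMSA's algorithm, and the criteria it applies are not described here. As stand-ins, the sketch treats a sequence as a fragment when it has fewer residues than a fraction of the longest sequence, and as redundant when its ungapped residues exactly match an earlier sequence. The function name and the min_fraction parameter are hypothetical, and the input is a plain list of name and sequence pairs rather than a cma file.

```python
def purge(records, min_fraction=0.5):
    """Drop likely fragments and exact duplicate sequences.

    records: list of (name, aligned_sequence) pairs; '-' and '.' count as gaps.
    min_fraction: hypothetical cutoff; a sequence with fewer residues than this
    fraction of the longest sequence is treated as a fragment.
    """
    def residues(seq):
        return "".join(c for c in seq if c not in "-.").upper()

    longest = max((len(residues(seq)) for _, seq in records), default=0)
    kept, seen = [], set()
    for name, seq in records:
        ungapped = residues(seq)
        if len(ungapped) < min_fraction * longest:
            continue  # too short: treated as a fragment
        if ungapped in seen:
            continue  # same residues as an earlier sequence: redundant
        seen.add(ungapped)
        kept.append((name, seq))
    return kept
```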

GetPDB

Retrieves PDB coordinate files based on the FASTA defline identifiers in an input file, then creates PDB files with modeled hydrogens (e.g., 1abc_H.pdb), as required by DARC, SPARC, and our other programs. GetPDB requires Perl and the reduce program by Michael Word.

get_pdb.tar
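
The first step described for GetPDB, collecting identifiers from FASTA deflines, can be sketched in Python as follows. GetPDB itself is a Perl tool, and the exact defline format it expects is not stated here. The sketch assumes identifiers written as pdb|1ABC|A, and its function names are hypothetical. It only derives the PDB IDs and the matching output file names in the 1abc_H.pdb pattern; downloading coordinates and running reduce are left out.

```python
import re

# Assumed identifier style: "pdb|1ABC|A".
# A PDB ID is one digit followed by three alphanumeric characters.
PDB_ID = re.compile(r"pdb\|([0-9][A-Za-z0-9]{3})\|")


def pdb_ids_from_deflines(lines):
    """Return the unique PDB IDs (lowercase, first-seen order) found on FASTA deflines."""
    ids = []
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence lines carry no identifiers
        for match in PDB_ID.finditer(line):
            pdb_id = match.group(1).lower()
            if pdb_id not in ids:
                ids.append(pdb_id)
    return ids


def hydrogen_file_name(pdb_id):
    """File name for the hydrogen-added coordinates, following the 1abc_H.pdb pattern."""
    return pdb_id.lower() + "_H.pdb"
```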