Auxiliary programs
CDD2MGS
Converts NCBI Conserved Domain hiMSAs into MAPGAPS input format. The hiMSAs may be obtained from the NCBI CDD link below.
NCBI CDD hiMSAs
Download the NCBI Conserved Domain hiMSAs as a compressed tarball from the link below.
ftp://ftp.ncbi.nih.gov/pub/mmdb/cdd/hiMSA
AddPhylum
Adds kingdom and phylum taxonomic labels to NCBI nr protein sequences for use with GISMO, BPPS, and DARC. C++ source code and a simple example are included in the tarball.
NCBI taxonomy dump ftp site
The taxdump.tar.gz and accession2taxid/prot.accession2taxid.gz files are required by the AddPhylum program.
ftp://ftp.ncbi.nih.gov/pub/taxonomy
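As a rough illustration of what the prot.accession2taxid.gz file contains, the following Python sketch looks up taxonomy ids for a set of protein accessions. It is not part of AddPhylum and does not reproduce its logic. It assumes the tab-separated layout NCBI uses for this file (a header line followed by accession, accession.version, taxid, and gi columns). The function name and the sample record are invented for the example.

```python
import csv
import gzip
import io


def load_accession2taxid(handle, wanted):
    """Return {accession.version: taxid} for the accessions listed in `wanted`.

    `handle` is any text-mode file object over the tab-separated table.
    """
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line: accession, accession.version, taxid, gi
    found = {}
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or truncated lines
        accession_version = row[1]
        if accession_version in wanted:
            found[accession_version] = int(row[2])
    return found


# Invented two-record sample standing in for the real (very large) file.
SAMPLE = (
    "accession\taccession.version\ttaxid\tgi\n"
    "XXX00001\tXXX00001.1\t9606\t0\n"
    "XXX00002\tXXX00002.1\t562\t0\n"
)

if __name__ == "__main__":
    print(load_accession2taxid(io.StringIO(SAMPLE), {"XXX00001.1"}))
    # For the real download, open the compressed file in text mode instead:
    #   with gzip.open("prot.accession2taxid.gz", "rt") as fh:
    #       taxids = load_accession2taxid(fh, wanted)
```

Passing only the accessions of interest keeps memory use small, since the full table is far too large to load into a dictionary on most machines.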
NCBI non-redundant (nr) protein sequence ftp site
Download the FASTA-formatted nr.gz and pdbaa.gz files from this site for use as input to MAPGAPS and other programs.
ftp://ftp.ncbi.nih.gov/blast/db/FASTA/
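For readers who want to inspect the downloaded files before passing them to MAPGAPS, here is a minimal Python sketch that streams FASTA records one at a time without unpacking the archive to disk. The function name read_fasta and the sample records are invented for the example; none of the programs listed here require this step.

```python
import gzip
import io


def read_fasta(handle):
    """Yield (defline, sequence) pairs from a text-mode FASTA file object."""
    defline, chunks = None, []
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            if defline is not None:
                yield defline, "".join(chunks)
            defline, chunks = line[1:], []
        elif defline is not None:
            chunks.append(line.strip())
    if defline is not None:
        yield defline, "".join(chunks)  # emit the final record


# Invented sample records; real deflines in nr.gz are much longer.
SAMPLE = ">seq1 first example\nMKV\nLAA\n>seq2 second example\nGGS\n"

if __name__ == "__main__":
    for defline, sequence in read_fasta(io.StringIO(SAMPLE)):
        print(defline, len(sequence))
    # For the real download, open the compressed file in text mode instead:
    #   with gzip.open("nr.gz", "rt") as fh:
    #       for defline, sequence in read_fasta(fh):
    #           ...
```

Because read_fasta is a generator, only one record is held in memory at a time, which matters for a file the size of nr.gz.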
ConvertMSA
Converts alignments from cma format to mFASTA, from mFASTA to cma, and from cma to rich text format (rtf). The rtf files are suitable for publication.
PurgeMSA
Merges concatenated cma files into one file and removes sequence fragments and redundant sequences. Click the link to the right to obtain the executable (beta version).
GetPDB
Retrieves PDB coordinate files based on the FASTA defline identifiers in an input file, then creates PDB files with modeled hydrogens (e.g., 1abc_H.pdb), as required by DARC, SPARC, and our other programs. GetPDB requires Perl and the reduce program by Michael Word.
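The first step described above, pulling PDB identifiers out of FASTA deflines, can be sketched in Python as follows. This is not GetPDB's own code (GetPDB is Perl-based), and the defline layouts it recognizes are assumptions: an identifier at the start of the defline followed by an underscore and chain (for example ">1ABC_A ...") or one following a "pdb|" tag (for example ">gi|1|pdb|2XYZ|B ..."). The function names are invented, and the sketch neither downloads coordinates nor runs reduce.

```python
import re

# Assumed identifier shape: one digit followed by three alphanumerics,
# found at the start of the defline or right after a "pdb|" tag.
PDB_ID = re.compile(r"(?:^|pdb\|)([0-9][A-Za-z0-9]{3})(?=[_|\s]|$)")


def pdb_ids(deflines):
    """Return the unique lower-cased PDB identifiers, in order of first appearance."""
    seen = []
    for defline in deflines:
        text = defline.lstrip(">")
        for match in PDB_ID.finditer(text):
            pdb_id = match.group(1).lower()
            if pdb_id not in seen:
                seen.append(pdb_id)
    return seen


def hydrogen_file_name(pdb_id):
    """Name of the hydrogen-added coordinate file, following the 1abc_H.pdb pattern."""
    return f"{pdb_id}_H.pdb"


if __name__ == "__main__":
    # Invented deflines covering both assumed layouts plus a non-PDB entry.
    examples = [
        ">1ABC_A mol:protein",
        ">gi|1|pdb|2XYZ|B example protein",
        ">sp|P00000|EXAMPLE not a PDB entry",
        ">1ABC_B mol:protein",
    ]
    for pdb_id in pdb_ids(examples):
        print(pdb_id, hydrogen_file_name(pdb_id))
```

Collapsing chains to a single identifier reflects the fact that one coordinate file covers all chains of an entry. Whether GetPDB does the same is not stated here.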