CPTRA: Cross Platform Transcriptome Analysis

Download the package here

Brief description:

The CPTRA package analyzes transcriptome sequencing data from different sequencing platforms. It combines the advantages of 454, Illumina GAII, and other platforms, and performs sequence tag alignment, annotation, and expression quantification. A paper describing the package is under review at BMC Bioinformatics.

The package is written in Python. It runs on any computer with a Python interpreter installed (version >= 2.2) and requires no additional modules. It can be run as a standalone program from the command line or used as a library within other programs.

When run from the command line, the "-h" option displays all parameters:

  $ python CPTRA.py -h
  Usage: CPTRA.py arguments... (-h to print help message)

	-h, --help            show this help message and exit
	-t TAGFILE, --tag-file=TAGFILE
	                      Sequence tag data file (if analyzing Solexa data, must
	                      be in FASTQ format)
	-l LIBFILE, --lib-file=LIBFILE
	                      Reference cDNA sequence file (fasta format)
	-b BASES, --bases-used=BASES
	                      Number of bases of sequence tags to be used for
	-m MISMATCH, --mismatch-allowed=MISMATCH
	                      Number of mismatch bases allowed for tag-reference
	                      alignment (default 1)
	-g GAP, --gap-allowed=GAP
	                      Number of gaps allowed for tag mapping (default 0)
	-o OUTFILE, --out-file=OUTFILE
	                      General name of output file
	-G, --gene-ontology   Perform Gene Ontology functional analysis. Must also
	                      provide arguments to -A, -f options.
	-A GOANNOFILE, --go-annotation=GOANNOFILE
	                      GO annotation file for cDNA library entries. This file
	                      should only have two columns: 1. sequence name, 2. GO
	                      account name (one in each row).
	                      The directory where following files reside: term.txt,
	                      term2term.txt. They can be obtained from
	                      archive.geneontology.org/latest-full/ and should be of
	                      same version.
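
The two-column layout described for the -A annotation file can be illustrated with a short reader. This is a minimal sketch, not code from CPTRA: the function name read_go_annotation, the whitespace delimiter, the sample sequence names, and the use of GO term identifiers such as GO:0008150 in the second column are all assumptions, since the help text does not specify them. It is written for Python 3.

```python
import io


def read_go_annotation(handle):
    """Map each sequence name to the list of GO terms annotated to it.

    Assumption: columns are separated by whitespace (the help text does
    not name the delimiter). Column 1 is the sequence name, column 2 is
    one GO term; a sequence with several terms occupies several rows.
    """
    annotations = {}
    for line in handle:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed rows
        seq_name, go_term = fields
        annotations.setdefault(seq_name, []).append(go_term)
    return annotations


# Made-up example rows standing in for a real annotation file.
example = io.StringIO(
    "contig_001\tGO:0008150\n"
    "contig_001\tGO:0003674\n"
    "contig_002\tGO:0005575\n"
)
go_by_sequence = read_go_annotation(example)
print(go_by_sequence["contig_001"])
```

For a real file, pass an open file object in place of the io.StringIO example.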

Example usage

CPTRA currently supports Illumina GAII and iGentifier sequencing data. Switching between the two platforms requires a small change to the program: open the program file in a text editor, go to the bottom, uncomment the class instantiation line for the platform you are working with, and comment out the others.

$ python CPTRA.py -t tag_file.fastq -l cdna.fasta -m 1 -o job_out
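
To make the effect of "-m 1" with the default "-g 0" concrete, the sketch below shows what an ungapped alignment with a mismatch limit means in principle. It is only an illustration of the concept and is not CPTRA's actual alignment code, which is not described here; the function names and the toy sequences are hypothetical.

```python
def count_mismatches(tag, window):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for a, b in zip(tag, window) if a != b)


def find_hits(tag, reference, max_mismatch=1):
    """Return (start, mismatches) for every ungapped placement of `tag`
    on `reference` that has at most `max_mismatch` differing bases."""
    hits = []
    for start in range(len(reference) - len(tag) + 1):
        window = reference[start:start + len(tag)]
        mismatches = count_mismatches(tag, window)
        if mismatches <= max_mismatch:
            hits.append((start, mismatches))
    return hits


# Toy data: the tag differs from the reference by one base at two places.
reference = "ACGTACGTTAGC"
tag = "ACGA"
hits = find_hits(tag, reference, max_mismatch=1)
print(hits)
```

With a limit of 0 the same tag would find no placement at all, which is why allowing a small number of mismatches helps when tags contain sequencing errors.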

$ python CPTRA.py -t tag_file.fasta -l cdna.fasta -m 2 -o job_out -G -A cdna.GOanno -f /path/to/termFiles/
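
When CPTRA is driven from another script, the command lines above can be assembled programmatically. The following is a rough sketch that uses only the options listed in the help message; the helper name build_cptra_command and its parameter names are hypothetical, and it assumes that CPTRA.py is in the working directory and that "python" on the PATH is an interpreter version CPTRA supports.

```python
import subprocess


def build_cptra_command(tag_file, lib_file, out_name, mismatch=1,
                        go_annotation=None, go_term_dir=None,
                        python="python", script="CPTRA.py"):
    """Assemble the argument list for one CPTRA run."""
    command = [python, script,
               "-t", tag_file, "-l", lib_file,
               "-m", str(mismatch), "-o", out_name]
    if go_annotation is not None or go_term_dir is not None:
        # The help text says -G must be accompanied by both -A and -f.
        if go_annotation is None or go_term_dir is None:
            raise ValueError("GO analysis needs both -A and -f arguments")
        command += ["-G", "-A", go_annotation, "-f", go_term_dir]
    return command


command = build_cptra_command(
    "tag_file.fasta", "cdna.fasta", "job_out", mismatch=2,
    go_annotation="cdna.GOanno", go_term_dir="/path/to/termFiles/")
print(" ".join(command))

# To actually launch the run, uncomment the next line:
# subprocess.call(command)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.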