Brief description:
The CPTRA package analyzes transcriptome sequencing data from different sequencing platforms. It combines the advantages of 454, Illumina GAII, and other platforms, and performs sequence tag alignment, annotation, and expression quantification. A paper describing the package is under review at BMC Bioinformatics.
The package is written in Python. It runs on any computer with a Python interpreter installed (version >= 2.2) and requires no additional modules. It can be run as a standalone program from the command line, or used as a library within other programs.
When run from the command line, the -h option displays all available parameters:
$ python CPTRA.py -h
Usage: CPTRA.py arguments... (-h to print help message)

Options:
  -h, --help            show this help message and exit
  -t TAGFILE, --tag-file=TAGFILE
                        Sequence tag data file (if analyzing Solexa data,
                        must be in FASTQ format)
  -l LIBFILE, --lib-file=LIBFILE
                        Reference cDNA sequence file (fasta format)
  -b BASES, --bases-used=BASES
                        Number of bases of sequence tags to be used for
                        analysis
  -m MISMATCH, --mismatch-allowed=MISMATCH
                        Number of mismatch bases allowed for tag-reference
                        alignment (default 1)
  -g GAP, --gap-allowed=GAP
                        Number of gaps allowed for tag mapping (default 0)
  -o OUTFILE, --out-file=OUTFILE
                        General name of output file
  -G, --gene-ontology   Perform Gene Ontology functional analysis. Must also
                        provide arguments to -A, -f options.
  -A GOANNOFILE, --go-annotation=GOANNOFILE
                        GO annotation file for cDNA library entries. This
                        file should only have two columns: 1. sequence name,
                        2. GO account name (one in each row).
  -f TERMFILEDIR, --term-files=TERMFILEDIR
                        The directory where following files reside:
                        term.txt, term2term.txt. They can be obtained from
                        archive.geneontology.org/latest-full/ and should be
                        of same version.
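The -A option expects a two-column file, so it can help to check the file before starting a long run. The following is a minimal sketch of such a check, not part of CPTRA itself. It assumes the columns are separated by whitespace (tab or space), which the help text does not specify, and the helper name check_go_annotation and the sample sequence names are hypothetical. It is written for a current Python interpreter.

```python
import io


def check_go_annotation(handle):
    """Return (line_number, message) pairs for rows that do not have two columns.

    Assumption: columns are whitespace-separated; the CPTRA help text only
    says the file has two columns (sequence name, GO name).
    """
    problems = []
    for number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 2:
            problems.append((number, "expected 2 columns, found %d" % len(fields)))
    return problems


# Hypothetical sample content: one sequence name and one GO entry per row.
sample = io.StringIO(
    "cdna_0001\tGO:0003677\n"
    "cdna_0001\tGO:0006355\n"
    "cdna_0002 GO:0005634 extra\n"
)
print(check_go_annotation(sample))
```

For a real file, pass an open file object instead of the in-memory sample, for example check_go_annotation(open("cdna.GOanno")).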
Example usage
CPTRA currently supports Illumina GAII and iGentifier sequencing data. Switching between the two platforms requires a small change to the program: open the program file in a text editor, go to the bottom of the file, uncomment the class instantiation line for the sequencing platform you are working with, and comment out the others.
$ python CPTRA.py -t tag_file.fastq -l cdna.fasta -m 1 -o job_out
$ python CPTRA.py -t tag_file.fasta -l cdna.fasta -m 2 -o job_out -G -A cdna.GOanno -f /path/to/termFiles/
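When CPTRA is driven from another script, the command lines above can be assembled programmatically. The sketch below is one possible way to do that, using only the options listed in the help message. The function build_cptra_command is a hypothetical helper, not part of the package, and it assumes that CPTRA.py is in the working directory and that "python" on the PATH is an interpreter CPTRA runs under.

```python
def build_cptra_command(tag_file, lib_file, out_file, mismatch=1, gap=0,
                        bases=None, go_annotation=None, term_dir=None):
    """Build the CPTRA command line as an argument list.

    Hypothetical helper: it only arranges the documented options
    (-t, -l, -m, -g, -o, -b, -G, -A, -f) and does not run anything.
    """
    command = ["python", "CPTRA.py", "-t", tag_file, "-l", lib_file,
               "-m", str(mismatch), "-g", str(gap), "-o", out_file]
    if bases is not None:
        command += ["-b", str(bases)]
    # The help text says -G must be accompanied by both -A and -f.
    if go_annotation is not None or term_dir is not None:
        if go_annotation is None or term_dir is None:
            raise ValueError("-G needs both -A (GO annotation file) "
                             "and -f (term file directory)")
        command += ["-G", "-A", go_annotation, "-f", term_dir]
    return command


print(" ".join(build_cptra_command("tag_file.fastq", "cdna.fasta", "job_out")))
print(" ".join(build_cptra_command("tag_file.fasta", "cdna.fasta", "job_out",
                                   mismatch=2, go_annotation="cdna.GOanno",
                                   term_dir="/path/to/termFiles/")))
```

The returned list can be passed to subprocess.call to start the run. Building a list rather than a single string avoids quoting problems when file names contain spaces.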