iteres

iteres provides several modules for transposable element analysis. The stat module computes subfamily-level alignment statistics from ChIP-based sequencing data, such as ChIP-seq and MeDIP-seq. The cpgstat module serves the same purpose for MRE-seq data.

Example run

stat

A sample BAM file is provided here. Download it, together with the chromosome size file, repeat size file and rmsk.txt file from Download, then run the following command:

iteres stat hg19_lite.size subfam.size rmsk.txt sample.bam

iteres prints progress messages to the screen:

* Parsing the rmsk file
* Total 5298130 repeats found.
* Parsing the SAM/BAM file
* Processed read ends: 12548044
* Writing stats and Wig file
* Generating bigWig files
* Preparing report file
* Done, time used 44 seconds.

Note: the chromosome size file does not contain supercontigs, so you will see some warning messages.
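To see in advance which sequences are likely to trigger these warnings, you can compare the sequence names in the alignment header with those in the chromosome size file. The following is a minimal Python sketch, not part of iteres. It assumes a two-column chromosome size file (name, size) and a SAM header with standard @SQ lines (for example, as printed by samtools view -H). The function names and sample data are made up for illustration.

```python
def read_chrom_names(size_lines):
    """Collect sequence names from 'name<whitespace>size' lines."""
    names = set()
    for line in size_lines:
        fields = line.split()
        if len(fields) >= 2:
            names.add(fields[0])
    return names


def header_sequence_names(sam_header_text):
    """Return the SN value of every @SQ line in a SAM header, in order."""
    names = []
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def missing_sequences(sam_header_text, size_lines):
    """List header sequences that are absent from the chromosome size file."""
    known = read_chrom_names(size_lines)
    return [n for n in header_sequence_names(sam_header_text) if n not in known]


# Illustrative data only: a size file without supercontigs and a header with one.
sizes = ["chr1\t249250621", "chr2\t243199373"]
header = (
    "@HD\tVN:1.0\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:249250621\n"
    "@SQ\tSN:chr2\tLN:243199373\n"
    "@SQ\tSN:chr1_gl000191_random\tLN:106433\n"
)
print(missing_sequences(header, sizes))
```

In this sketch, any name reported as missing is a candidate for a warning. Whether iteres warns per sequence or per read is not stated here.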

This generates the following files:

sample.iteres.report	a simple report file containing the mapping statistics
sample.iteres.unique.bigWig	read density on repeat consensus from uniquely mapped reads
sample.iteres.bigWig	read density on repeat consensus from all mapped reads
sample.iteres.class.stat	statistics of repeats at class level, such as RPM/RPKM from uniquely or all mapped reads
sample.iteres.family.stat	statistics of repeats at family level, such as RPM/RPKM from uniquely or all mapped reads
sample.iteres.subfamily.stat	statistics of repeats at subfamily level, such as RPM/RPKM from uniquely or all mapped reads
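For reference, RPM (reads per million) and RPKM (reads per kilobase per million) are conventionally computed as in the sketch below. These are the standard definitions, not code taken from iteres. Which read total iteres uses as the denominator depends on the -N and -U options, and the length it uses for RPKM is not stated here. The function names and numbers are illustrative.

```python
def rpm(read_count, total_reads):
    """Reads per million: read count scaled by the read total in millions."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return read_count * 1e6 / total_reads


def rpkm(read_count, total_reads, length_bp):
    """Reads per kilobase per million: RPM scaled by feature length in kb."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return rpm(read_count, total_reads) * 1e3 / length_bp


# Illustrative numbers: 500 reads on a 2,000 bp feature, 10 million reads in total.
print(rpm(500, 10_000_000))          # 50.0
print(rpkm(500, 10_000_000, 2_000))  # 25.0
```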

Running iteres without any parameters prints a list of available modules.

Module parameters

stat

$ iteres stat

Obtain alignment statistics for each repeat subfamily, family and class.

Usage:   iteres stat [options]    

Options: -S       input is SAM [off]
         -Q       unique reads mapping Quality threshold [10]
         -c       coverage threshold for overlapping [0.0001]
         -N       normalized by number of (0: reads in repeats, 1: non-redundant reads, 2: mapped reads, 3: total reads) [0])
         -U       unique reads normalized by number of (0: unique mapped reads in repeats, 1: unique mapped reads, 2: total reads) [0])
         -R       remove redundant reads [off]
         -T       treat 1 paired-end read as 2 single-end reads [off]
         -D       discard if only one end mapped in a paired end reads [off]
         -w       keep the wiggle file [off]
         -B       output bed file of mapped reads [off]
         -V       output bed file of unique mapped reads [off]
         -C       Add 'chr' string as prefix of reference sequence [off]
         -E       extend reads to represent fragment [150], specify 0 if want no extension
         -I       Insert length threshold [500]
         -o       output prefix [basename of input without extension]
         -h       help message
         -?       help message
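As an illustration of combining these options, the Python sketch below assembles a stat command line. It assumes that options go before the four input files, in the same order as in the example run. The helper name build_stat_command, its keyword arguments and the chosen values are hypothetical, and only flags from the list above are used. The sketch prints the command rather than running it.

```python
import shlex


def build_stat_command(chrom_sizes, repeat_sizes, rmsk, alignment,
                       min_mapq=None, remove_redundant=False, prefix=None):
    """Build an 'iteres stat' argument list (hypothetical helper)."""
    cmd = ["iteres", "stat"]
    if min_mapq is not None:
        cmd += ["-Q", str(min_mapq)]   # mapping quality threshold for unique reads
    if remove_redundant:
        cmd.append("-R")               # remove redundant reads
    if prefix is not None:
        cmd += ["-o", prefix]          # output prefix
    # Positional inputs, in the order used by the example run.
    cmd += [chrom_sizes, repeat_sizes, rmsk, alignment]
    return cmd


cmd = build_stat_command(
    "hg19_lite.size", "subfam.size", "rmsk.txt", "sample.bam",
    min_mapq=20, remove_redundant=True, prefix="sample_q20",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once iteres is installed and the input files are in place.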

filter

$ iteres filter

Obtain alignment statistics of individual loci of each repeat subfamily, family or class.

Usage:   iteres filter [options]    

Options: -S       input is SAM [off]
         -Q       mapping Quality threshold [10]
         -g       coverage threshold for overlapping [0.0001]
         -N       normalized by number of (0: non-redundant unique mapped reads, 1: unique reads, 2: mapped reads, 3: total reads) [0])
         -n       use repName (subfamily) as filter [null]
         -f       use repFamily as filter [null]
         -c       use repClass as filter [null]
         -t       only output repeats have more than [1] reads mapped
         -r       output the list of reads [off]
         -R       remove redundant reads [off]
         -T       treat 1 paired-end read as 2 single-end reads [off]
         -D       discard if only one end mapped in a paired end reads [off]
         -C       Add 'chr' string as prefix of reference sequence [off]
         -E       extend reads to represent fragment [150], specify 0 if want no extension
         -I       Insert length threshold [500]
         -o       output prefix [basename of input without extension]
         -h       help message
         -?       help message
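The -n, -f and -c options restrict the output to repeats with a given repName, repFamily or repClass. The Python sketch below illustrates that kind of selection on rmsk.txt rows. It assumes the standard UCSC rmsk table layout (17 tab-separated columns, with repName, repClass and repFamily in columns 11 to 13) and exact string matching. It is not the iteres implementation, and the function names and rows are made up.

```python
from collections import namedtuple

Repeat = namedtuple("Repeat", "chrom start end strand name rep_class family")


def parse_rmsk_line(line):
    """Parse one row, assuming the UCSC rmsk column order."""
    f = line.rstrip("\n").split("\t")
    return Repeat(f[5], int(f[6]), int(f[7]), f[9], f[10], f[11], f[12])


def select_repeats(lines, name=None, family=None, rep_class=None):
    """Keep repeats that match every filter that is not None."""
    selected = []
    for line in lines:
        if not line.strip():
            continue
        rep = parse_rmsk_line(line)
        if name is not None and rep.name != name:
            continue
        if family is not None and rep.family != family:
            continue
        if rep_class is not None and rep.rep_class != rep_class:
            continue
        selected.append(rep)
    return selected


def make_row(chrom, start, end, strand, name, rep_class, family, row_id):
    """Build an illustrative 17-column row; unused columns get placeholders."""
    return "\t".join(["585", "0", "0", "0", "0", chrom, str(start), str(end),
                      "0", strand, name, rep_class, family, "0", "0", "0",
                      str(row_id)])


rows = [
    make_row("chr1", 10000, 10468, "+", "(CCCTAA)n", "Simple_repeat", "Simple_repeat", 1),
    make_row("chr1", 11503, 11675, "-", "L1MC", "LINE", "L1", 2),
    make_row("chr1", 15264, 15355, "-", "MIR3", "SINE", "MIR", 3),
    make_row("chr1", 26790, 27053, "+", "AluSp", "SINE", "Alu", 4),
]

for rep in select_repeats(rows, rep_class="SINE"):
    print(rep.chrom, rep.start, rep.end, rep.name)
```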

nearby (Removed since version 0.3.1)

$ iteres nearby

Obtain nearby genes from locations listed in a bed file by querying UCSC database.

Usage:   iteres nearby [options] 

Options: -d       database to query [hg19]
         -n       output how many genes each direction [1]
         -o       output prefix [basename of input without extension]
         -h       help message
         -?       help message

Note: the bed file should contain at least 3 fields which were [chr] [start] [end]
      also you need to have an internet connection

cpgstat

$ iteres cpgstat

obtain CpG statistics for each repeat subfamily, family and class.

Usage:   iteres cpgstat [options]    

Options: -w       keep the wiggle file [off]
         -o       output prefix [basename of input without extension]
         -h       help message
         -?       help message

cpgfilter

$ iteres cpgfilter

obtain CpG statistics for each repeat locus.

Usage:   iteres cpgfilter [options]    

Options: -n       use repName (subfamily) as filter [null]
         -f       use repFamily as filter [null]
         -c       use repClass as filter [null]
         -t       only output repeats have more than [0] CpG score
         -o       output prefix [basename of input without extension]
         -h       help message
         -?       help message

Access Alignment Statistics of Transposable Elements
