Intermediate Methylation Detection Algorithm (iMet)
We developed a maximum scoring segment algorithm to identify regions of overlapping MeDIP-seq and MRE-seq signals. Given normalized MeDIP-seq and MRE-seq read densities across all CpGs, the algorithm traced through each CpG sequentially, comparing read counts from both assays. An arbitrary score proportional to the read density was increased when the signals overlapped and decreased when they did not, and an additional penalty proportional to the distance between CpGs was assigned. When the score returned to zero at some distance following the initialization of an IM (intermediate methylation) region, the end point of the region was defined as the position with the highest score following the start site.
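The scan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in iMet.c: the function name find_im_regions, the reward (the smaller of the two densities), the fixed mismatch_penalty, and the gap_penalty value are all hypothetical choices, since the text does not give the actual scoring constants. Only the 50 bp minimum region length comes from the instructions below.

```python
from typing import List, Tuple

Region = Tuple[int, int, float]  # (start, stop, score)


def _close_region(regions, positions, start, best, best_score, min_length):
    """Record the region from `start` to the best-scoring CpG if it is long enough."""
    length = positions[best] - positions[start] + 1
    if length >= min_length:
        regions.append((positions[start], positions[best], best_score))


def find_im_regions(
    positions: List[int],
    medip: List[float],
    mre: List[float],
    gap_penalty: float = 0.01,      # assumed penalty per bp between CpGs
    mismatch_penalty: float = 1.0,  # assumed penalty when signals do not overlap
    min_length: int = 50,           # minimum region length in bp
) -> List[Region]:
    """Scan CpGs in order and return candidate regions as (start, stop, score)."""
    regions: List[Region] = []
    start = best = None
    score = best_score = 0.0
    for i, (pos, m, r) in enumerate(zip(positions, medip, mre)):
        overlap = m > 0 and r > 0
        if start is not None:
            # Penalize the distance travelled since the previous CpG.
            score -= gap_penalty * (pos - positions[i - 1])
            # Reward overlapping signal, penalize non-overlapping signal.
            score += min(m, r) if overlap else -mismatch_penalty
            if score > best_score:
                best_score, best = score, i
            if score > 0:
                continue
            # Score fell back to zero: end the region at its best-scoring CpG.
            _close_region(regions, positions, start, best, best_score, min_length)
            start = None
        if overlap:
            # A CpG with both signals present opens a new region.
            start = best = i
            score = best_score = min(m, r)
    if start is not None:
        _close_region(regions, positions, start, best, best_score, min_length)
    return regions
```

The key design point is that the region end is not the CpG where the score reaches zero but the earlier CpG where the running score peaked, so trailing CpGs without overlapping signal are trimmed from the region.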

The tool used for identifying IM regions can be downloaded from here.
This tool was developed by Xin Zhou and is maintained by GiNell Elliott.
Instructions for Running iMet Tools
The download file above contains 3 tools for processing MeDIP-Seq and MRE-Seq data, a directory of example data, and a script that will run the pipeline on the example data using a single command. See details on each item below.
List of download contents:
README.txt
iMet.c
medipBedGraph2cpg.c
mreBed2cpg.c
run_iMet_Example.sh
Example_Data:
    chr20.chromSize.txt
    chr20.CpG_sites.bed
    chr20.Breast_Luminal_Epithelial_Cells.Donor1.MeDIP-Seq.bedGraph
    chr20.Breast_Luminal_Epithelial_Cells.Donor1.MRE-Seq.bed
    chr20.Breast_Luminal_Epithelial_Cells.Donor2.MeDIP-Seq.bedGraph
    chr20.Breast_Luminal_Epithelial_Cells.Donor2.MRE-Seq.bed
    chr20.Fetal_Brain.Donor7.MeDIP-Seq.bedGraph
    chr20.Fetal_Brain.Donor7.MRE-Seq.bed
Generate Results Using Example Data
- In the iMet download package, run the file run_iMet_Example.sh from the command line (you may first need to make the file executable).
$ chmod +x run_iMet_Example.sh #makes file executable
$ ./run_iMet_Example.sh
The script will run the iMet pipeline on MeDIP-Seq and MRE-Seq data from chromosome 20 for 3 different samples. See below for details on each tool.
- For each sample, the pipeline will produce 4 output files in the Example_Data directory with the following suffixes: .MeDIP-Seq.cpg, .MRE-Seq.cpg, .raw.IM, .filtered.IM
- .MeDIP-Seq.cpg : a four-column bedgraph file with MeDIP-Seq read counts at CpGs only
chr20 60425 60427 4
chr20 60431 60433 4
chr20 60550 60552 4
chr20 60577 60579 3
- .MRE-Seq.cpg : a four-column bedgraph file with MRE-Seq read counts at CpGs only
chr20 60425 60427 1
chr20 60431 60433 1
chr20 64322 64324 1
chr20 64376 64378 16
chr20 64380 64382 16
- .raw.IM : the complete output of IM regions before post-processing filters (columns: chromosome, start, stop, IM region score, region length, position of each CpG)
chr20 62318464 62318656 31.979994 193 62318656,62318643,62318632,62318629,62318621,62318611,62318607,62318605,62318594,62318587,62318556,62318535,62318516,62318481,62318471,62318468,62318464,
chr20 61745800 61745991 29.289997 192 61745991,61745962,61745942,61745885,61745883,61745856,61745839,61745836,61745822,61745818,61745800,
chr20 61266856 61267004 27.120003 149 61267004,61266976,61266943,61266938,61266936,61266924,61266912,61266896,61266871,61266856,
chr20 4836317 4836439 23.380001 123 4836439,4836409,4836366,4836358,4836353,4836346,4836337,4836334,4836324,4836317,
- .filtered.IM : a four-column file with IM regions filtered by a score cutoff of 8 (columns: chromosome, start, stop, IM region score). The score cutoff was determined by comparison to randomized data.
chr20 62318464 62318656 31.979994
chr20 61745800 61745991 29.289997
chr20 61266856 61267004 27.120003
chr20 4836317 4836439 23.380001
- [Figure: Example data output as viewed on the UCSC Genome Browser at the 3 imprint control regions found on chromosome 20]
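The step from .raw.IM to .filtered.IM described above amounts to keeping the first four columns of each region whose score passes the cutoff of 8. A minimal Python sketch follows; the function name filter_im_regions is hypothetical, and treating the cutoff as inclusive (score >= 8) and writing tab-separated output are assumptions, since the text does not specify either.

```python
def filter_im_regions(raw_lines, cutoff=8.0):
    """Keep regions scoring at least `cutoff` and return four-column rows."""
    kept = []
    for line in raw_lines:
        fields = line.split()
        # Skip blank, comment, or malformed lines.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        chrom, start, stop, score = fields[:4]
        # Assumption: the cutoff is inclusive.
        if float(score) >= cutoff:
            kept.append("\t".join((chrom, start, stop, score)))
    return kept
```

For example, passing the lines of a .raw.IM file (such as `open(path)`) returns the chromosome, start, stop, and score of every region that passes the cutoff, dropping the region length and CpG position columns.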
Requirements for Running iMet Tools
- System Requirements
iMet requires a machine with at least 32 GB of memory
- Preliminary Files (any sample normalization should be done prior to running iMet tools)
- MeDIP bedGraph files
#Chromosome Start Stop ReadCount
chr20 60366 60367 1
chr20 60367 60388 2
chr20 60388 60404 3
chr20 60404 60558 4
- MRE bed files (requires 6 columns with strand information in the 6th column)
chr20 60423 60467 SOLEXA12_4:6:59:1299:1253 0 -
chr20 64314 64359 SOLEXA12_4:6:78:387:56 0 +
chr20 64338 64413 SOLEXA6_80:2:73:1557:1486 0 +
chr20 64381 64454 SOLEXA11_1:3:62:1196:347 0 +
- Human chromosome size file with two tab-separated columns
#Chromosome Size(bp)
chr20 63025520
- Human CpG coordinate file--a bed-style formatted file with the start and stop coordinates of each human CpG
chr20 60178 60180
chr20 60425 60427
chr20 60431 60433
chr20 60550 60552
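Because the MRE bed files must have 6 columns with strand information in the 6th, it can be worth checking an input file before running the tools. The Python sketch below is one way to do that; check_mre_bed_line is a hypothetical helper, not part of the iMet package, and the requirement that start be smaller than stop is an assumption based on the usual bed convention.

```python
def check_mre_bed_line(line):
    """Return None if the line looks like a valid MRE bed row, else a problem description."""
    fields = line.split()
    if len(fields) < 6:
        return "expected 6 columns, found %d" % len(fields)
    try:
        start, stop = int(fields[1]), int(fields[2])
    except ValueError:
        return "start and stop must be integers"
    # Assumption: start < stop, as in standard bed files.
    if start >= stop:
        return "start must be smaller than stop"
    if fields[5] not in ("+", "-"):
        return "6th column must be a strand (+ or -)"
    return None
```

Running it over every line of a file and reporting the first non-None result gives a quick sanity check before the conversion step.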
Running iMet Tools
- Run medipBedGraph2cpg on MeDIP-seq bedGraph file
- Purpose:
Converts whole-genome MeDIP-seq read counts to CpG-only read counts (this reduces the size of the file so it can be used as input to the iMet program)
- Usage
./medipBedGraph2cpg <chromosome size file> <CpG coordinates file> <input MeDIP bedGraph file> <output file of data at CpG sites>
- Notes:
Remember to compile before running: cc medipBedGraph2cpg.c -o medipBedGraph2cpg
- Example
$ ./medipBedGraph2cpg chr20.chromSize.txt chr20.CpG_sites.bed chr20.Breast_Luminal_Epithelial_Cells.Donor1.MeDIP-Seq.bedGraph chr20.Breast_Luminal_Epithelial_Cells.Donor1.MeDIP-Seq.cpg
- Run mreBed2cpg on MRE bed file
- Purpose:
Converts whole-genome MRE-seq read counts to CpG-only read counts for iMet input
- Usage
./mreBed2cpg <chromosome size file> <CpG coordinates file> <input filtered MRE bed file> <output file of data at CpG sites>
- Notes
To compile: cc mreBed2cpg.c -o mreBed2cpg
- Example
$ ./mreBed2cpg chr20.chromSize.txt chr20.CpG_sites.bed chr20.Breast_Luminal_Epithelial_Cells.Donor1.MRE-Seq.bed chr20.Breast_Luminal_Epithelial_Cells.Donor1.MRE-Seq.cpg
- Run iMet on output files from steps 1 and 2
- Purpose
Compares MeDIP-Seq and MRE-Seq data at individual CpGs to define regions of intermediate methylation
- Usage
./iMet <chromosome size file> <CpG coordinates file> <MeDIP data generated by medipBedGraph2cpg> <MRE data generated by mreBed2cpg> <output file of putative intermediately methylated regions>
- Notes
To compile (must use -lm flag): cc iMet.c -o iMet -lm
Parameters can be adjusted by directly editing the code; for instance, the minimum region length is 50 bp by default.
- Example
$ ./iMet chr20.chromSize.txt chr20.CpG_sites.bed chr20.Breast_Luminal_Epithelial_Cells.Donor1.MeDIP-Seq.cpg chr20.Breast_Luminal_Epithelial_Cells.Donor1.MRE-Seq.cpg chr20.Breast_Luminal_Epithelial_Cells.Donor1.raw.IM
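To make the CpG conversion in step 1 more concrete, the following Python sketch assigns each CpG the read count of the bedGraph interval that covers its start coordinate and omits CpGs with no coverage. This rule is inferred from the example rows shown earlier and is an assumption; the actual logic in medipBedGraph2cpg.c may differ. The function name bedgraph_to_cpg is hypothetical, and the sketch assumes a single chromosome with sorted, non-overlapping bedGraph intervals.

```python
import bisect


def bedgraph_to_cpg(bedgraph, cpgs):
    """Map bedGraph coverage onto CpG sites.

    bedgraph: sorted, non-overlapping (start, stop, count) tuples.
    cpgs: (start, stop) tuples for each CpG.
    Returns (start, stop, count) for CpGs with a nonzero count.
    """
    starts = [interval[0] for interval in bedgraph]
    out = []
    for cpg_start, cpg_stop in cpgs:
        # Find the last interval starting at or before the CpG start.
        k = bisect.bisect_right(starts, cpg_start) - 1
        if k < 0:
            continue
        iv_start, iv_stop, count = bedgraph[k]
        # bedGraph intervals are half-open: [start, stop).
        if iv_start <= cpg_start < iv_stop and count > 0:
            out.append((cpg_start, cpg_stop, count))
    return out
```

With the four example bedGraph rows and four example CpG coordinates from the requirements section as input, this returns the CpGs at 60425, 60431, and 60550, each with a count of 4, and skips the uncovered CpG at 60178, which agrees with the first three rows of the example .MeDIP-Seq.cpg output.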