Methylation refers to the process by which methyl groups are added to the DNA molecule, modifying its function 1. Methylation plays an important role in the control of gene expression. It does this in two main ways: by physically preventing transcription of the gene, or by initiating the recruitment of chromatin remodelling proteins 2. DNA methylation has been associated with multiple biological processes, such as aging 3 and cellular development and differentiation 4, and can also be affected by exercise and diet 5. It has also been associated with many different diseases, including cancer, multiple sclerosis 6 and cardiovascular disease 7.
Image modified from the cover of Cell, April 25 2013, Volume 153
Using methylation data
Because DNA methylation plays a crucial role in the regulation of gene expression, understanding how, why, when, and where DNA is methylated is a fundamental research question. Around the world, researchers are starting to ask these types of questions. For example:
One of our users, Prof. Graham Ball, heads up a bioinformatics research group at Nottingham Trent University that is interested in the use of methylation data as a part of an integrated '-omics' approach. They have developed algorithms for the mining of high-dimensional datasets. Using feature cross comparison they can look for concordance between different datasets for the same disease state, and using inference techniques they can identify interactions between features.
"The use of methylation data provides one piece of the puzzle at a systems biology level."
DNA methylation is also an important area of study in cancer biology, as it has been shown that the epigenome of cancer cells is often dramatically different from that of the healthy tissues they originate from. This is highlighted by a study published earlier this year in Nature Communications, which showed that DNA methylation patterns can be used to discriminate between high-risk ovarian cancers and non-serious ones, a finding which may help in developing cancer-preventative strategies!
Furthermore, some researchers are specifically developing tools to support investigation of DNA methylation. One Repositive user, Jose L. Oliver, leads the Computational Epigenomics and Bioinformatics Group at Granada University (Spain), which develops and applies software and database tools to explore the epigenome, mainly whole genome methylomes and microRNA. One of these tools is NGSmethDB, which I will describe in more detail below.
However, while epigenomics has huge potential to redefine the way we understand and treat disease, technological limitations in sequencing DNA modifications slow their exploitation as biomarkers and as prognostic and diagnostic tools. There is therefore a growing drive to develop new technologies to study DNA methylation.
Cambridge Epigenetix (CEGX) is developing technologies that enable researchers to decipher DNA methylation and hydroxymethylation and that support the growth of epigenetic research in academic and clinical settings. Their technologies not only allow researchers to quantify 5mC and 5hmC at single-base resolution, which was previously impossible with traditional bisulfite conversion methods, but also generate NGS epigenomic data of unparalleled quality. By developing integrated workflows that bypass the roadblocks faced by researchers today, CEGX technologies standardise epigenetics analysis to enable groundbreaking discoveries.
"Thanks to these innovations, 5hmC, similar to 5mC, is increasingly applied as a prognostic biomarker in disease advancement and treatment."
Methylation data resources
The amount of methylation research is ever-increasing, and a lot of the resulting data is being released publicly. Alongside the methylation datasets that can be found in the large international NCBI and EBI repositories, which you can browse here, there are also many data sources that focus specifically on methylation data. These include:
Cancer Methylome System (Browse), which was developed by the Computational Biology and Bioinformatics Division at UTHSCSA, Texas, contains genome-wide methylation data from endometrial and breast cancers, and some normal samples. This data was acquired using the methyl-CpG binding domain-based capture followed by sequencing (MBDCap-seq) protocol.
The database of human DNA Methylation and Cancer - MethyCancer 8 (Browse) hosts data on DNA methylation, cancer-related genes and mutations, and cancer information from public resources, and the CpG Island (CGI) clones derived from large-scale sequencing. It was developed by the Behavioral Genetics Center in the Chinese Academy of Sciences.
The NGSmethDB (Browse) is a database dedicated to the storage, browsing and data mining of whole-genome, single-cytosine resolution methylomes for the best-assembled eukaryotic genomes. It stores whole genome methylomes generated from short-read datasets from bisulfite sequencing (WGBS) projects, and includes two precompiled datasets: 1) methylation segments and 2) differentially methylated cytosines. By applying a unique bioinformatics protocol to all the datasets it enables comparisons to be drawn between all the samples for all their variables. Furthermore, program-driven access to NGSmethDB from the user desktop is supported, enabling methylation comparative analyses against NGSmethDB data without needing to upload private data to a public server.
The Department of Psychiatry of Columbia University and the New York State Psychiatric Institute have assembled a collection of genome-wide DNA methylation profiles for human brains. MethylomeDB (Browse) contains non-psychiatric, schizophrenic, and depression methylation profiles from regions including the dorsolateral prefrontal cortex, ventral prefrontal cortex, and auditory cortex. The DNA methylation profiles were generated by the Methylation Mapping Analysis by Paired-end Sequencing (Methyl-MAPS) method and analysed with the Methyl-Analyzer software package, and cover over 80% of CpG dinucleotides at single-CpG resolution.
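To make the idea of single-cytosine methylation levels and differentially methylated cytosines more concrete, here is a minimal Python sketch. It is an illustration only and is not the protocol used by NGSmethDB or any other resource listed above. The class CytosineCall, the function differentially_methylated and the thresholds min_coverage and min_delta are hypothetical names and values chosen for this example. It assumes each cytosine comes with counts of reads that support a methylated or an unmethylated call, and it uses a simple difference threshold where a real analysis would use a statistical test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CytosineCall:
    """Read counts at one cytosine (hypothetical record layout)."""
    chrom: str
    pos: int
    methylated: int    # reads supporting a methylated cytosine
    unmethylated: int  # reads supporting an unmethylated cytosine

    @property
    def coverage(self) -> int:
        return self.methylated + self.unmethylated

    @property
    def level(self) -> float:
        # Fraction of reads that support methylation; NaN if no reads.
        if self.coverage == 0:
            return float("nan")
        return self.methylated / self.coverage


def differentially_methylated(sample_a, sample_b,
                              min_coverage=10, min_delta=0.25):
    """Return (chrom, pos, delta) for cytosines covered in both samples
    whose methylation levels differ by at least min_delta.

    Naive threshold-based sketch; both thresholds are assumptions.
    """
    index_b = {(c.chrom, c.pos): c for c in sample_b}
    hits = []
    for a in sample_a:
        b = index_b.get((a.chrom, a.pos))
        if b is None:
            continue  # position not present in the second sample
        if a.coverage < min_coverage or b.coverage < min_coverage:
            continue  # too few reads for a reliable level
        delta = a.level - b.level
        if abs(delta) >= min_delta:
            hits.append((a.chrom, a.pos, delta))
    return hits
```

Given two lists of CytosineCall records for the same genome, differentially_methylated returns the positions where the two samples disagree by at least the chosen margin. Positions with low coverage in either sample are skipped rather than reported.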
View the Methylation Data collection
These are only a few of the multiple resources now becoming available for researchers to gain access to DNA methylation data.
Stay tuned for more posts on what resources are out there for other genomic assay types, technologies, rare diseases and common diseases!
For more details about the resources discussed above and how to access their data, sign up to Repositive.
Related Blog Posts
A Guided Tour on Downloading and Accessing Repositive Methylome Data