Machine learning approaches to epigenomic research

New methods in epigenomics

Researchers have created new methods to analyse epigenomic data – information about how DNA packaging controls gene expression.

Health

Epigenomics is the study of DNA organisation at the genomic level. It is a young field, and researchers are still working out how to extract useful information from a sudden overabundance of genomic data.

The EU-funded EPIGENE INFORMATICS (Machine learning approaches to epigenomic research) initiative aimed to create bioinformatics tools that can extract useful information from the vast and complex data sets generated by modern biological methods. In particular, researchers wanted a framework for statistical testing on these data sets.

EPIGENE INFORMATICS created two new methods to compare genetic sequencing profiles from two techniques: ChIP-Seq (chromatin immunoprecipitation followed by sequencing) and BS-Seq (bisulphite sequencing). These techniques are used to study the proteins and epigenomic changes associated with a specific DNA sequence. The new methods allow researchers to compare such profiles and identify statistically significant differences between them.

The team tested the new methods by studying H3K4me3, a common epigenomic mark, and Cfp1, the protein responsible for introducing H3K4me3 to the genome. Project research showed that Cfp1 has more than 1 600 potential target regions, and linked the H3K4me3 mark to changes in gene expression. These results indicate that the first method, MMDiff, can identify biologically relevant epigenomic changes.

The second method, M3D, detects changes in methylation patterns (a common epigenomic change) across a genome. It compared favourably with existing tools designed for the same task.

Bringing robust statistical analysis to bear on genomic data will help researchers begin to make sense of it. This should lead to an improved understanding of human biology, with far-reaching benefits for medicine and human health.


Epigenomics, gene expression, genomic data, bioinformatics, genetic sequencing