CORDIS - EU research results

Deep Transfer: Generalizing Across Domains

Final Report Summary - DEEP TRANSFER (Deep Transfer: Generalizing Across Domains)

This project lies in the field of machine learning, where the goal is to devise algorithms that learn models from data. For example, consider the task of labeling an abnormality on a mammogram as benign or malignant. Here, the data consist of a radiologist's description of an abnormality and its known label. After learning, the model can be used to make predictions about future abnormalities (that is, abnormalities whose label is unknown). The quality of the model can be assessed by measuring its predictive accuracy. Typically, the quality of a learned model depends strongly on the amount of data available for training: more data results in more accurate models. Unfortunately, for any single target task it may be difficult to acquire sufficient data, because data acquisition can be time-consuming (e.g. annotating documents), monetarily expensive (e.g. genetic testing), or physically invasive (e.g. collecting a tissue sample), or because the data simply do not exist in sufficient quantities (e.g. data about rare diseases). Furthermore, even when data for a related problem are available, most current learning algorithms can only improve with experience on a single problem, not across multiple different tasks.
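The setup described above can be sketched in a few lines of code. This is a toy illustration only: the two features (mass size, margin irregularity) and all data values are invented, and a simple nearest-neighbor classifier stands in for whatever model a real system would learn.

```python
# Toy sketch of supervised learning: train on labeled examples, then
# measure predictive accuracy on held-out examples with unknown labels.
# Features and data are hypothetical, purely for illustration.

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(train, x, k=3):
    """Predict a label by majority vote among the k nearest training examples."""
    neighbors = sorted(train, key=lambda ex: distance(ex[0], x))[:k]
    labels = [label for _, label in neighbors]
    return max(set(labels), key=labels.count)

# Hypothetical training data: (mass_size_mm, margin_irregularity) -> label.
train = [((5.0, 0.1), "benign"), ((6.0, 0.2), "benign"),
         ((7.0, 0.15), "benign"), ((18.0, 0.8), "malignant"),
         ((22.0, 0.9), "malignant"), ((25.0, 0.95), "malignant")]

# Held-out examples estimate predictive accuracy on unseen abnormalities.
test = [((6.5, 0.12), "benign"), ((20.0, 0.85), "malignant")]
accuracy = sum(knn_predict(train, x) == y for x, y in test) / len(test)
print(accuracy)
```

The point of the sketch is the workflow, not the classifier: with too few labeled examples, any such model's held-out accuracy degrades, which is the data-scarcity problem transfer learning targets.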

Transfer learning addresses this problem by allowing an algorithm to consider data from a source task in addition to data from the target problem when learning a model. To date, most transfer occurs between closely related domains, that is, domains described by the same predicates, variables, objects, etc. This project tackles what we call Deep Transfer: the ability to transfer knowledge across entirely different domains (i.e. domains described by different predicates, objects, properties, etc.). Computationally, the missing link is the ability of a learning algorithm to discover structural regularities that apply to many different domains, irrespective of their superficial descriptions. Conceptually, deep transfer offers a fundamentally different and novel paradigm for acquiring experience: exploiting data from other, possibly very different, tasks.

This project has significantly advanced the state of the art in deep transfer through the development of a novel framework for this setting. First, our approach uses data from a source domain to identify important structural regularities, such as transitivity or homophily. Then, when learning a model in a new domain, the learner is biased towards reusing regularities that have proven helpful in modeling the source domain. On an algorithmic level, our approach confers two important advantages over existing techniques for deep transfer. First, it is based on a generative model of the world, which enables the transfer to occur in a more principled, well-founded manner, whereas previous approaches were more ad hoc. Second, it is able to transfer a wider array of patterns, as well as information about each pattern's usefulness across domains. Empirically, we have evaluated our approach on three real-world data sets: a protein-protein interaction data set about yeast, a Web domain about computer science departments, and a collection of Twitter data. We found that our approach yields significant improvements in both accuracy and run time compared to both other transfer learning approaches and learning from scratch. Furthermore, our algorithm identified and transferred important regularities such as symmetry and homophily between different domains.
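The core idea, scoring structural regularities in a source domain and reusing those scores as a bias in a target domain, can be sketched as follows. This is a deliberately simplified illustration, not the released TODTLER implementation; the relation data and the two regularities checked (symmetry and transitivity) are toy examples.

```python
# Simplified sketch of deep transfer: measure how strongly structural
# regularities hold in a source relation, then treat those scores as
# prior weights when searching for a model in a different target domain.
# Illustration only; not the project's actual algorithm.

def symmetry_score(edges):
    """Fraction of facts (a, b) for which the reverse fact (b, a) also holds."""
    es = set(edges)
    return sum((b, a) in es for a, b in es) / len(es)

def transitivity_score(edges):
    """Fraction of two-step paths a->b->c (a != c) that are closed by a->c."""
    es = set(edges)
    paths = [(a, c) for a, b in es for b2, c in es if b == b2 and a != c]
    if not paths:
        return 0.0
    return sum(p in es for p in paths) / len(paths)

# Hypothetical source domain: protein-protein interactions, a symmetric relation.
source = [("p1", "p2"), ("p2", "p1"), ("p2", "p3"), ("p3", "p2")]

# Regularities that score well in the source receive a higher prior weight
# when the learner evaluates candidate models in the target domain.
prior = {"symmetry": symmetry_score(source),
         "transitivity": transitivity_score(source)}
print(prior)
```

Note that the scores are domain-independent: they refer to the shape of the relation, not to the predicate names, which is what lets them carry over to superficially unrelated target domains.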

Several challenges encountered during this research led to contributions in three other areas of machine learning: probabilistic model learning, statistical relational learning, and empirical evaluation of learned models. First, we have developed several state-of-the-art structure learning algorithms for Markov networks, a type of probabilistic model. Our approaches offer significant improvements in accuracy and training time over previous methods. Second, we developed a framework for learning Markov logic networks from data; Markov logic networks are a widely used statistical relational learning formalism and the one we used in this project. Third, we have developed the first approach to learning models in relational domains that contain both continuous and discrete variables; all previous approaches were restricted to modeling discrete variables only. Finally, we made two important advances for evaluating learned models. One, we uncovered a previously unknown property of precision-recall curves, a widely used performance metric in bioinformatics, machine learning, information retrieval, and many other areas, that has important methodological implications. Two, we showed how to compute a wide variety of evaluation metrics from partially labeled data, which will enable the evaluation of machine learning methods in many additional settings.
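For context on the evaluation work above, the following sketch shows how a precision-recall curve is computed from a model's ranked predictions. The scores and labels are invented toy data; a real evaluation would use a model's actual outputs.

```python
# Minimal sketch of building a precision-recall curve: sort predictions
# by score, sweep a threshold down the ranking, and record (recall,
# precision) at each step. Toy data, for illustration only.

def pr_curve(scores, labels):
    """Return (recall, precision) points, thresholding after each ranked example."""
    pairs = sorted(zip(scores, labels), reverse=True)
    total_pos = sum(labels)
    tp = fp = 0
    points = []
    for _, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points

# Hypothetical model scores and true binary labels (1 = positive class).
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
print(pr_curve(scores, labels))
```

Unlike ROC curves, the achievable region of PR space depends on the class skew (the ratio of positives to negatives), which is one reason subtle methodological properties of PR curves matter in practice.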

The research funded by this project has had impact at several different levels. Its primary impact is the technical advances that we have made, namely the various ways in which we have advanced the state of the art, as described in the previous two paragraphs. A second impact is that we have made code for several of our systems, as well as some of the data used in our experiments, publicly available. This promotes reproducibility of our results and allows others to benefit from our progress. A third impact is that machine learning is becoming increasingly ubiquitous, and our findings can be used to help solve important problems in other domains. For example, our collaborators have applied one of the algorithms we developed to the task of analyzing disease interactions. Hence, it seems possible that this work will be useful for researchers in other domains and may even result in novel scientific discoveries.

Finally, on a personal level, this funding had a great impact on me as it helped me transition from a post-doctoral position at the University of Washington to a tenure-track faculty position at KU Leuven. The support from this award was crucial in building up my research group and publication record, and contributed to me being awarded tenure at KU Leuven in June of 2015.

The project website: http://dtai.cs.kuleuven.be/research/projects/deeptransfer

All publications can be found online: http://people.cs.kuleuven.be/jesse.davis/

Furthermore, the following websites contain publicly available software packages related to the project:

https://dtai.cs.kuleuven.be/software/todtler/

https://dtai.cs.kuleuven.be/software/gssl/

https://dtai.cs.kuleuven.be/software/llm/