Periodic Reporting for period 2 - IMAGINE (IMAGINE – Informing Multi-modal lAnguage Generation wIth world kNowledgE)
Reporting period: 2021-03-14 to 2022-03-13
An important reason vision and language (V&L) models do not generalise well is that the data they are trained on is biased, and that they cannot efficiently "understand" and utilise human-curated knowledge available in structured knowledge graphs. In the IMAGINE project, my main goal is to incorporate world knowledge into the training of state-of-the-art V&L models so that they generalise better to unseen or rare cases, and so that we can mitigate issues related to bias and unfairness. I investigate ways to make V&L models transparently connect to knowledge graphs, and whether doing so leads to less bias and better generalisation. I am also devising better datasets to train V&L models, and better benchmarks to evaluate these models, as orthogonal strategies to gauge their capabilities.
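To make the idea of transparently connecting a V&L model to a knowledge graph concrete, here is a minimal sketch: given an input caption, retrieve the knowledge-graph node whose gloss is most similar, so that its gloss can be exposed to the model as extra world knowledge. The toy glosses and the use of Sentence-BERT are illustrative assumptions, not the project's actual pipeline.

```python
# Minimal sketch of grounding an input caption in a knowledge graph by
# retrieving the most similar node gloss. The toy graph and the choice of
# Sentence-BERT below are illustrative assumptions, not the IMAGINE pipeline.
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy knowledge-graph nodes: node id -> gloss (hypothetical examples).
KG_GLOSSES = {
    "Q144": "domestic animal, member of the genus Canis",
    "Q146": "small domesticated carnivorous mammal",
    "Q3736": "large plant-eating mammal with a trunk",
}

model = SentenceTransformer("all-MiniLM-L6-v2")

def retrieve_node(caption: str) -> str:
    """Return the id of the KG node whose gloss best matches the caption."""
    ids = list(KG_GLOSSES)
    embs = model.encode([caption] + [KG_GLOSSES[i] for i in ids])
    caption_emb, gloss_embs = embs[0], embs[1:]
    # Cosine similarity between the caption and every gloss.
    sims = gloss_embs @ caption_emb / (
        np.linalg.norm(gloss_embs, axis=1) * np.linalg.norm(caption_emb)
    )
    return ids[int(np.argmax(sims))]

caption = "a dog catching a frisbee in the park"
node_id = retrieve_node(caption)
# The retrieved gloss can then be appended to the V&L model input.
print(node_id, "->", KG_GLOSSES[node_id])
```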
Throughout 2020 I supervised three Master's students at NYU on a project involving representation learning for multi-modal knowledge graphs (Huang et al., 2022). Throughout 2020 I also activated my research networks and worked on different projects:
- I proposed using knowledge graphs to improve multilingual language models (with colleagues in Rome and Finland), which led to a paper published at NAACL 2021 (Calixto et al., 2021).
- I investigated the linguistic capabilities of pretrained V&L models (with colleagues in Malta and Germany), which led to a paper published at the MMSR 2021 workshop (Parcalabescu et al., 2021).
- I co-authored a review article on multilingual/multimodal natural language generation, published in the Journal of Artificial Intelligence Research (JAIR) in 2022 with colleagues in the COST Action Multi3Generation (Erdem et al., 2022).
- I published the VisualSem V&L dataset at the Multilingual Representation Learning workshop (Alberts et al., 2021). VisualSem includes text and images whose concepts are part of a knowledge graph (e.g. Wikipedia), and it is designed to support V&L model training and evaluation.
- I conducted a survey of the current landscape of NLP applications and resources for mental health and mental disorders (with colleagues in the USA and Brazil), which was published as a book chapter (Calixto et al., 2022).
- Finally, I co-authored a paper accepted at the Annual Meeting of the ACL 2022, where we propose a benchmark to systematically assess the fine-grained linguistic capabilities and knowledge of pretrained V&L models (Parcalabescu et al., 2022).
I have published multiple papers/articles relevant to the IMAGINE project: one at ACL 2019, one in the Machine Translation journal, two at AACL 2020, one at NAACL 2021, one at MMSR 2021, one at MRL 2021, one at EAMT 2022, and one at ACL 2022. I also have one pre-print whose publication venue has not yet been decided. I have presented my work at meet-ups and invited talks, and co-organised the Representation Learning for NLP 2021 workshop, a high-impact workshop in my area. In addition to presenting my work at conferences, I recently gave (or will give) the following invited talks: Dublin Machine Learning meet-up (September 2020), KU Leuven NLP Symposium (December 2020), Cardiff University (March 2021), Probabll lab at the University of Amsterdam (March 2021), RGCL Machine Learning and Deep Learning Seminar Series (June 2021), the Helsinki NLP Research Seminar in Language Technology (June 2021), KUIS AI at Koç University (December 2021), and the Informatics Institute NLP Lab at the Federal University of Goiás (April 2022).
VisualSem is available for research purposes at https://github.com/iacercalixto/visualsem. Together with colleagues, I have finalised the code to train a model that learns unsupervised multi-modal knowledge base (KB) representations (Huang et al., 2022). This model uses VisualSem and is available at https://github.com/iacercalixto/visualsem-kg.
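As an illustration of how the released dataset might be used, the following minimal sketch loads VisualSem's node data and inspects one node's glosses and associated images. The file path and the record fields ("glosses", "images") are assumptions about the released JSON layout, not a documented API; adjust them to the actual repository contents.

```python
# Minimal sketch: inspecting VisualSem nodes after cloning
# https://github.com/iacercalixto/visualsem. The file path and record
# fields below are assumptions about the released JSON format, not a
# documented API; adjust to the actual repository layout.
import json
from pathlib import Path

NODES_FILE = Path("visualsem/dataset/visualsem_nodes.json")  # hypothetical path

def load_nodes(path: Path) -> dict:
    """Load the knowledge-graph nodes as a {node_id: record} mapping."""
    with path.open(encoding="utf-8") as f:
        return json.load(f)

if __name__ == "__main__":
    nodes = load_nodes(NODES_FILE)
    node_id, record = next(iter(nodes.items()))
    # Each node is assumed to carry multilingual glosses and image file names.
    print(node_id)
    print(record.get("glosses", [])[:3])
    print(record.get("images", [])[:3])
```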