Custom-Made Ontology Based Data Access

Periodic Reporting for period 4 - CODA (Custom-Made Ontology Based Data Access)

Reporting period: 2020-02-01 to 2020-07-31

We are living in an era where an unprecedented amount of data is
available to companies, scientists, and individuals. Making use of
this data is becoming increasingly crucial for economic success and
for scientific progress, and even for making informed decisions in
everyday life. However, a large amount of data from many different
sources can be very difficult to handle and to interpret and, in
particular, such data will often be highly incomplete and very
heterogeneous in representation. How can we provide support for
dealing with this situation? Imagine that a computer program which
processes data has at its disposal a large amount of encyclopedic
domain knowledge, of the kind provided by Wikipedia for human
users. This is the purpose of an ontology and indeed ontologies have
proved to be very useful for supporting the processing of incomplete
and heterogeneous data. But the convenience comes at a price:
processing the data and the ontology at the same time is
computationally expensive. The CODA project sets out to provide an
ultimately fine-grained analysis of the computational costs of
ontology-mediated querying (also known as ontology-based data
access, OBDA), identifying and studying "islands of tractability", that is,
classes of ontologies which can be processed efficiently and have other
desirable computational properties. The overall aim of this approach is
to enable ontology engineers to design and customize ontologies that
strike an ideal balance between the knowledge provided and efficient
processing, in this way developing ontology-mediated querying to its
full potential.

We have studied various islands of tractability in ontology-mediated
querying both from a theoretical and from a practical angle. Many of
these islands pertain to computational complexity and to rewritability
into practically relevant query languages. We have made significant
progress for a wide range of ontology languages, including
inexpressive and expressive description logics, existential rules, and
expressive first-order fragments such as the guarded fragment. Our
results draw a fairly complete picture of the complexity of ontology-
mediated querying, on a very fine-grained level. The recognition of
and interest in our results is witnessed by prestigious best paper
awards at conferences such as PODS and IJCAI and invited talks about
the topics of CODA at leading conferences such as STACS and ICDT.

In data complexity, the islands of tractability that we have
considered include (but are not limited to) the complexity classes
AC0, L, NL, PTime, and coNP, rewritability into first-order logic (FO),
into linear Datalog, into monadic and unrestricted Datalog, as well as
the rewritability of ontology-mediated queries (OMQs) based on
conjunctive queries (CQs) into OMQs based on instance queries
(IQs). This opens up various routes towards practically efficient
implementation of ontology-mediated querying based on existing
systems.
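
To make the notion of rewritability more concrete, the following is a
minimal sketch in Python, under strong simplifying assumptions. It is
not the project's method and not the Grind system. The ontology
consists only of atomic concept inclusions, the query is a single
instance query, and all names (ONTOLOGY, DATA, rewrite, and so on) are
hypothetical and chosen for illustration. The sketch rewrites the
query into a union of atomic queries that can be evaluated directly
over the incomplete data, and compares the result with the answers
obtained by first saturating the data with the ontology.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy ontology: pairs (sub, sup) meaning "every sub is a sup".
ONTOLOGY: Set[Tuple[str, str]] = {
    ("Professor", "Researcher"),
    ("Researcher", "Employee"),
}

# Hypothetical incomplete data: concept name -> individuals asserted for it.
DATA: Dict[str, Set[str]] = {
    "Professor": {"ann"},
    "Researcher": {"bob"},
    "Employee": {"eve"},
}


def rewrite(concept: str, ontology: Set[Tuple[str, str]]) -> List[str]:
    """Rewrite the instance query concept(x) into a union of atomic queries.

    Returns the sorted list of all concepts that the ontology places below
    `concept`, including `concept` itself.
    """
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in ontology:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return sorted(found)


def evaluate_rewriting(disjuncts: List[str], data: Dict[str, Set[str]]) -> Set[str]:
    """Evaluate the union of atomic queries over the data, ignoring the ontology."""
    answers: Set[str] = set()
    for concept in disjuncts:
        answers |= data.get(concept, set())
    return answers


def answers_by_saturation(
    concept: str, ontology: Set[Tuple[str, str]], data: Dict[str, Set[str]]
) -> Set[str]:
    """Reference semantics: apply all inclusions to a copy of the data until
    nothing changes, then look up the queried concept."""
    saturated = {name: set(inds) for name, inds in data.items()}
    changed = True
    while changed:
        changed = False
        for sub, sup in ontology:
            missing = saturated.get(sub, set()) - saturated.get(sup, set())
            if missing:
                saturated.setdefault(sup, set()).update(missing)
                changed = True
    return saturated.get(concept, set())


if __name__ == "__main__":
    disjuncts = rewrite("Employee", ONTOLOGY)
    print(disjuncts)
    print(sorted(evaluate_rewriting(disjuncts, DATA)))
    print(sorted(answers_by_saturation("Employee", ONTOLOGY, DATA)))
```

In this toy setting, the rewritten query depends only on the ontology
and the original query, not on the data, so it could in principle be
handed to an existing database system. For the more expressive
ontology languages studied in the project, such a rewriting does not
always exist, which is why deciding rewritability is a problem of its
own.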

We have also pioneered studying the complexity of OMQs from the
viewpoint of parameterized complexity. For Horn DLs and existential
rules, we have established a tight connection to treewidth and
analyzed the interplay of ontology and treewidth. For certain
important cases, this approach also allowed us to characterize the
important island of PTime combined complexity.

We have made strong progress in many directions, obtaining, amongst
many other results, natural characterizations of the complexity and
rewritability of OMQs, complexity results for deciding whether an OMQ
belongs to one of the above islands, and practically efficient
algorithms. For FO-rewritability, we have pushed the state of the art
from inexpressive Horn description logics (EL ontologies with IQs) via
expressive ones (such as Horn-ALCIF with CQs) to important classes of
existential rules (such as frontier-guarded rules). In important
cases, we have developed practically efficient algorithms that we then
implemented in a reasoning system called Grind, which is available for
public use and has already experienced uptake in applications. We
have also pushed the state of the art regarding FO-rewritability for
expressive description logics from IQs to CQs, which corresponds to
the non-trivial transition from CSPs to MMSNP and required a
substantial development of the theory of "meta-questions for MMSNP"
such as containment, FO-rewritability, and equivalence to a CSP.
These are also fundamental topics from the perspective of constraint
satisfaction.

We have obtained a complete characterization of complexity and
rewritability for an important OMQ language (EL with IQs), a kind of
"ultimate result" for that language that was beyond our
expectations. We have also exhibited a very rich picture of islands of
tractability within the class of OMQs where the ontology is formulated
in the guarded fragment. Moreover, our work has essentially provided a
complete solution of CQ-to-IQ-rewritability for expressive description
logics, solving a problem that was open for a decade. Apart from these
"main strands of research", we have also obtained results on querying
the unary negation fragment of FO extended with regular path
expressions, ontology-mediated querying with regular path queries, the
inseparability of description logic ontologies in terms of queries,
tractable weighted model counting in extensions of the two-variable
fragment of FO as a foundation for ontology-mediated querying of
probabilistic data, the one-dimensional fragment of FO, which gives
rise to description logics with relations of higher arity, and new
kinds of distributed automata that provide a novel computational model
for CSPs and OMQs.

Before CODA, the knowledge about islands of tractability in
ontology-mediated querying was rather incomplete and
scattered. In the project, we were able to attain a rather holistic
understanding of many such islands. Our results have considerably
pushed the state of the art in several directions, as laid out in more
detail above and in the scientific reporting. In summary, the CODA
project has advanced well beyond what was known before the
project. In the relevant research communities, the results obtained in
the project are widely known and have been received very positively.