Periodic Reporting for period 4 - CompDB (The Computational Database for Real World Awareness)
Reporting period: 2021-12-01 to 2022-05-31
We made significant progress on a new system architecture, addressing many challenges in efficient compilation and language integration. One of the goals of this project is the seamless integration of high-level data processing, specified in a programming language, with traditional database query support. This poses many technical challenges, including, somewhat surprisingly, compile time: when a complex algorithm is specified and later executed on a very efficient parallel execution engine, the compile time can exceed the actual execution time. This turned out to be problematic for interactive use cases, so we developed a new compilation framework that adaptively compiles the different parts of the execution plan depending on usage. The code is initially compiled with a very cheap compiler that is optimized for compile time and uses a new linear-time register allocator; more expensive compilation modes are then used to improve the initial code only when the observed execution times and the cost model predict that the more expensive compilation will be beneficial. This allows for very efficient execution of “cheap” queries (i.e., queries that might be structurally complex but touch comparatively little data), while complex analytical queries still benefit from the full power of an optimizing compiler backend.
Extensive work on algebraic optimization led to an improved query optimization component, which is essential for handling large and complex analytical queries; previous approaches were unable to find solutions for large queries. The optimization framework we developed can handle all classes of queries, including queries with cross products and hyper-edges, which is important for handling arbitrary analytical queries.
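The adaptive compilation decision described above can be illustrated with a small sketch. This is not the project's actual cost model: the mode names, the compile times, the speedup factor, and the function `should_upgrade` are all hypothetical values chosen for illustration. The sketch only shows the general idea that a more expensive compilation mode is worthwhile when its one-off compile cost plus the faster remaining execution is predicted to beat staying with the cheaply compiled code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompileMode:
    """A compilation mode with assumed (illustrative) cost parameters."""
    name: str
    compile_seconds: float  # assumed one-off cost of compiling in this mode
    speedup: float          # assumed throughput relative to the cheap mode


# Hypothetical numbers, not measurements from the system described here.
CHEAP = CompileMode("cheap", compile_seconds=0.001, speedup=1.0)
OPTIMIZED = CompileMode("optimized", compile_seconds=0.050, speedup=3.0)


def should_upgrade(tuples_remaining: int,
                   observed_tuples_per_second: float,
                   target: CompileMode = OPTIMIZED) -> bool:
    """Return True if recompiling in `target` mode is predicted to finish sooner.

    `observed_tuples_per_second` is the throughput measured while running
    the cheaply compiled code.
    """
    if observed_tuples_per_second <= 0:
        return False  # no observation yet, so keep the cheap code
    # Predicted remaining time if we keep the cheap code.
    stay = tuples_remaining / observed_tuples_per_second
    # Predicted remaining time if we pay for the expensive compilation first.
    switch = target.compile_seconds + tuples_remaining / (
        observed_tuples_per_second * target.speedup)
    return switch < stay


# A query touching little data stays on the cheap code ...
small_query = should_upgrade(1_000, 1_000_000.0)
# ... while a query with a lot of remaining work is recompiled.
large_query = should_upgrade(10_000_000, 1_000_000.0)
```

Under these assumed parameters, the small query finishes in about a millisecond on the cheap code, so spending 50 ms on optimization would not pay off, whereas the large query saves several seconds by switching.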
We also integrated user-defined operators into the query execution workflow, which can be used as building blocks for executing high-level execution logic.
Our work on complex analytical processing using user-defined logic offers a much richer and more powerful interface for expressing application logic, and has been accepted at PVLDB.
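To make the idea of user-defined operators as building blocks more concrete, the following is a minimal sketch, assuming a simple pull-based pipeline built from Python generators. It does not reflect the interface of the system described here; the operator names `scan`, `select`, and `user_defined`, as well as the example logic `running_total`, are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator

Row = Dict[str, int]


def scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Relational-style source operator: emit stored rows one by one."""
    yield from rows


def select(child: Iterator[Row],
           predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Relational-style filter operator."""
    for row in child:
        if predicate(row):
            yield row


def user_defined(child: Iterator[Row],
                 logic: Callable[[Iterator[Row]], Iterator[Row]]) -> Iterator[Row]:
    """Hypothetical user-defined operator: it consumes and produces rows
    like any other operator, so it can sit anywhere in the plan."""
    return logic(child)


def running_total(rows: Iterator[Row]) -> Iterator[Row]:
    """Example application logic: attach a running sum of `amount`."""
    total = 0
    for row in rows:
        total += row["amount"]
        yield {**row, "running_total": total}


data = [{"id": 1, "amount": 5}, {"id": 2, "amount": -2}, {"id": 3, "amount": 7}]

# The user-defined logic is composed with ordinary operators in one plan.
plan = user_defined(select(scan(data), lambda r: r["amount"] > 0), running_total)
result = list(plan)
```

Because the user-defined operator has the same rows-in, rows-out shape as the built-in ones, it composes freely with filters and scans in this sketch.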
The overall system has been used for experiments by several other groups and has demonstrated excellent performance in many different application scenarios.