Periodic Reporting for period 1 - MALOT (Managing Mobility Data Quality for Location of Things)
Reporting period: 2020-04-15 to 2022-04-14
The Location of Things (LoT) has spread widely across Europe and the world, forming an open and interconnected ecosystem that greatly expands the application potential of fundamental social services such as transportation planning, logistics, and security control. However, these downstream services can deliver their social value only once the problem of managing data quality efficiently and effectively under a decentralized, dynamic, and heterogeneous architecture is resolved.
To address this problem, the project aims to develop a collection of modular techniques that adapt to the decentralized, dynamic, and heterogeneous LoT environment in order to evaluate and improve mobility data quality. The project objectives are as follows.
- We aim to design a reliable methodology for continuously modeling mobility data quality for dynamic and heterogeneous computing nodes in a decentralized architecture.
- We aim to devise efficient and effective decentralized algorithms for enhancing the quality of heterogeneous mobility data generated by LoT nodes.
- We aim to propose an efficient and cost-effective mechanism to continuously coordinate multiple data quality management processes on heterogeneous LoT nodes.
Within the proposed framework, we studied data quality modeling techniques that adapt to the decentralized, dynamic, and heterogeneous computing environment. Specifically, taking a data statistics task (quantile computation) as a case study, we analyzed and modeled the relationship between data errors and processing latencies across a set of decentralized and heterogeneous computing nodes. In a second task, positioning data cleaning, we proposed a generalized model for capturing and mitigating the uncertainty of mobility data (positioning data) in the LoT.
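The trade-off between data error and processing latency in quantile computation can be illustrated with a small sketch. It is not the project's model: it assumes a single node that estimates a quantile from a uniform random sample, so that a smaller sample stands in for lower latency and typically gives a larger rank error. The function names, the synthetic speed readings, and the sample sizes are all hypothetical.

```python
import random
from bisect import bisect_right


def exact_quantile(values, q):
    """Return the q-quantile (nearest-rank style) of a non-empty list."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, max(0, int(q * len(ordered))))
    return ordered[idx]


def sampled_quantile(values, q, sample_size, rng):
    """Estimate the q-quantile from a uniform sample drawn without replacement."""
    sample = rng.sample(values, min(sample_size, len(values)))
    return exact_quantile(sample, q)


def rank_error(values, estimate, q):
    """Absolute gap between the estimate's rank fraction and the target q."""
    ordered = sorted(values)
    rank_fraction = bisect_right(ordered, estimate) / len(ordered)
    return abs(rank_fraction - q)


# Synthetic readings (assumed data, for illustration only).
rng = random.Random(0)
speeds = [rng.gauss(50.0, 10.0) for _ in range(10_000)]

# Smaller samples are cheaper to process but usually less accurate.
for size in (100, 1_000, 10_000):
    estimate = sampled_quantile(speeds, 0.9, size, rng)
    print(size, round(rank_error(speeds, estimate, 0.9), 4))
```

Rank error is used here because it is independent of the value scale. A model of the kind described above would additionally relate the sample size, or another processing budget, to the latency measured on each node.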
Building on the proposed data quality modeling techniques, we studied specific data quality enhancement algorithms. First, we studied missing data imputation for wireless positioning data, modeling the spatiotemporal dependencies both within a mobility data sequence and between multiple mobility data sequences. Second, we studied decentralized, continuous proximity-based outlier detection in an edge computing setting.
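As a point of reference for proximity-based outlier detection, the following minimal sketch uses the common distance-based definition: a point is an outlier if fewer than a given number of other points in the current window lie within a given radius. It runs on a single node with a brute-force neighbor count, so it illustrates the outlier definition only and not the decentralized algorithm studied in the project. The names `find_outliers` and `monitor` and all parameters are hypothetical.

```python
from collections import deque
from math import dist


def find_outliers(window, radius, min_neighbors):
    """Return points with fewer than min_neighbors other points within radius."""
    points = list(window)
    flagged = []
    for i, point in enumerate(points):
        neighbors = sum(
            1
            for j, other in enumerate(points)
            if j != i and dist(point, other) <= radius
        )
        if neighbors < min_neighbors:
            flagged.append(point)
    return flagged


def monitor(stream, window_size, radius, min_neighbors):
    """Yield the current outliers after each arrival in a count-based sliding window."""
    window = deque(maxlen=window_size)  # oldest point is evicted automatically
    for point in stream:
        window.append(point)
        yield find_outliers(window, radius, min_neighbors)


# Three nearby positions and one far away (assumed data, for illustration only).
positions = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
print(find_outliers(positions, radius=0.5, min_neighbors=2))
```

The brute-force count costs quadratic time per window. A continuous, edge-based variant would need to maintain neighbor counts incrementally and exchange information between nodes, which this sketch does not attempt.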
To measure and improve the efficiency of planning data quality tasks in the LoT, we proposed an edge-resident task coordination mechanism for optimizing resource usage. This mechanism has proven effective when applied to spatiotemporal quantile monitoring. As supporting work, we also implemented a testbed for deploying and evaluating data quality management algorithms in the decentralized, dynamic, and heterogeneous LoT environment.
To disseminate the project results online, we have created a project website (http://msca-malot.github.io/), a GitHub organization (https://github.com/msca-malot), and a Twitter account (https://twitter.com/MalotMsca). These portals will continue to be used to expand the impact of the project's current and subsequent outcomes. Some project outcomes have also been presented at academic and networking events, including the top-tier conferences VLDB 2022 and SIGMOD 2022, invited talks at HUST, SUSTech, BUPT, etc., and seminars at AAU and RUC.
So far, the project has:
1) established a general picture of managing mobility data quality in the decentralized, dynamic, and heterogeneous LoT environment,
2) proposed effective mobility data quality modeling techniques on the dimensions of data uncertainty, data errors, and processing latency,
3) gained insights on designing efficient and robust algorithms for mobility data quality enhancement tasks including missing data imputation and outlier detection, and
4) proposed a novel, edge computing based mechanism for coordinating decentralized and heterogeneous computing nodes on mobility data quality management.
The current deliverables of the project, which focus on representative computing tasks including data statistics, outlier detection, and data repair, enable a new paradigm of using edge node coordination for mobility data quality management in a decentralized and heterogeneous architecture.
From a long-term perspective, this new paradigm has the potential to be applied to a wide variety of data-centric tasks in the LoT (and, more generally, the Internet of Things), supporting reliable, responsive, and resilient data management in the presence of decentralized, dynamic, and heterogeneous computing nodes. This should facilitate the development of many IoT-enabled applications such as traffic planning, logistics, and air quality monitoring, which contribute directly to human well-being and continued social progress.