Managing Mobility Data Quality for Location of Things

Periodic Reporting for period 1 - MALOT (Managing Mobility Data Quality for Location of Things)

Reporting period: 2020-04-15 to 2022-04-14

As an instance of the Internet of Things (IoT), Location of Things (LoT) represents a paradigm that connects and integrates various location-sensing “things” for smart location-based services at scale. To guarantee a good user experience, location-based services pose strict requirements on the quality of the mobility data generated by LoT. However, LoT consists of a massive number of decentralized, dynamic, and heterogeneous computing nodes, which makes existing data quality management approaches inapplicable. In particular, many approaches assume stand-alone data systems or simplex data systems with homogeneous distributed nodes; many consider neither heterogeneous data sources nor data of different qualities in computations; and many others neglect the optimization of task assignment among dynamic and heterogeneous computing nodes.
LoT has spread widely across Europe and the world, and it forms an open and interconnected ecosystem that greatly expands the application potential of fundamental social services like transportation planning, logistics, and security control. However, these downstream services can deliver their full social value only once the problem of managing data quality efficiently and effectively under a decentralized, dynamic, and heterogeneous architecture is resolved.
Given this problem, the project aims for a collection of modular techniques that can adapt to the decentralized, dynamic, and heterogeneous LoT environment for evaluating and improving mobility data quality. The project objectives are as follows.
- We aim to design a reliable methodology for continuously modeling mobility data quality for dynamic and heterogeneous computing nodes in a decentralized architecture.
- We aim to devise efficient and effective decentralized algorithms for enhancing the quality of heterogeneous mobility data generated by LoT nodes.
- We aim to propose an efficient and cost-effective mechanism to continuously coordinate multiple data quality management processes on heterogeneous LoT nodes.
The fundamental problem of the project is how to define and assess the quality of LoT mobility data, given the unique characteristics of LoT. To this end, we conducted an in-depth analysis of the quality dimensions of mobility data and of the factors that affect these dimensions in the LoT context. Moreover, we performed an extensive study of the state-of-the-art techniques for mobility data quality management and low-quality mobility data exploitation, and based on that, we identified new opportunities for quality-aware data management and exploitation in the challenging LoT setting. Finally, we summarized the means to mitigate mobility data quality issues and proposed an integrated framework for mobility data quality management in the LoT.

Within the proposed framework, we studied data quality modeling techniques that can adapt to the decentralized, dynamic, and heterogeneous computing environment. Specifically, taking the data statistics task of quantile computation as a case study, we analyzed and modeled the relationship between data errors and processing latencies among a set of decentralized and heterogeneous computing nodes. In a second task, positioning data cleaning, we studied and proposed a generalized model for capturing and mitigating the uncertainty of mobility data (positioning data) in LoT.
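
To make the error versus latency trade-off in quantile computation concrete, here is a minimal Python sketch. It is not the project's model: it assumes, purely for illustration, that a node reduces its processing cost by computing the quantile on a random sample, and it measures the resulting data error as rank error. The function names and the sampling strategy are hypothetical.

```python
import random


def exact_quantile(values, q):
    """Return the q-quantile of values using a nearest-rank rule on sorted data."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


def sampled_quantile(values, q, sample_size, rng):
    """Estimate the q-quantile from a random sample (assumed stand-in for lower latency)."""
    sample = rng.sample(values, sample_size)
    return exact_quantile(sample, q)


def rank_error(values, estimate, q):
    """Absolute gap between the estimate's normalized rank in the full data and q."""
    rank = sum(1 for v in values if v < estimate) / len(values)
    return abs(rank - q)


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
    # Smaller samples cost less to process but tend to give larger rank errors.
    for size in (50, 500, 5_000):
        estimate = sampled_quantile(data, 0.9, size, rng)
        print(size, round(rank_error(data, estimate, 0.9), 4))
```

In this toy setting, the sample size plays the role of a latency budget, and the printed rank error shows how accuracy degrades as the budget shrinks.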

On top of the proposed data quality modeling techniques, we studied specific data quality enhancement algorithms. On the one hand, we studied missing data imputation for wireless positioning data, modeling the spatiotemporal dependencies both within a mobility data sequence and across multiple mobility data sequences. On the other hand, we studied decentralized, continuous proximity-based outlier detection in an edge computing fashion.
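
As an illustration of continuous proximity-based outlier detection, the sketch below shows the classic distance-based definition applied to a sliding window on a single node: a point is an outlier if fewer than `min_neighbors` other points in the current window lie within `radius` of it. This is a generic textbook formulation, not the project's decentralized algorithm; the class name and parameters are hypothetical.

```python
import math
from collections import deque


class SlidingWindowOutlierDetector:
    """Distance-based outlier detection over the most recent window_size points."""

    def __init__(self, window_size, radius, min_neighbors):
        # deque with maxlen evicts the oldest point automatically.
        self.window = deque(maxlen=window_size)
        self.radius = radius
        self.min_neighbors = min_neighbors

    def add(self, point):
        """Append a new (x, y) position to the window."""
        self.window.append(point)

    def outliers(self):
        """Return window points with fewer than min_neighbors points within radius."""
        points = list(self.window)
        result = []
        for i, p in enumerate(points):
            neighbors = sum(
                1
                for j, other in enumerate(points)
                if j != i and math.dist(p, other) <= self.radius
            )
            if neighbors < self.min_neighbors:
                result.append(p)
        return result
```

This naive version rescans the whole window on every query, which costs quadratic time in the window size. A continuous, edge-resident detector would need incremental neighbor maintenance, which the sketch leaves out.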

To measure and improve the efficiency of planning data quality tasks in LoT, we proposed an edge-resident task coordination mechanism for optimizing resource usage. This mechanism proved effective in the application of spatiotemporal quantile monitoring. As supporting work, we also implemented a testbed for deploying and evaluating data quality management algorithms in the decentralized, dynamic, and heterogeneous LoT environment.
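
The report does not describe the internals of the coordination mechanism. Purely as a generic illustration of assigning tasks across heterogeneous nodes, the following sketch uses a simple greedy heuristic: take the largest task first and place it on the node that would finish it earliest. The function, the task costs, and the node speeds are all hypothetical and are not taken from the project.

```python
def assign_tasks(task_costs, node_speeds):
    """Greedily assign tasks to heterogeneous nodes.

    task_costs: dict mapping task name to its work units (assumed known).
    node_speeds: dict mapping node name to work units processed per second.
    Returns (assignment, finish_time), both keyed by node name.
    """
    finish_time = {node: 0.0 for node in node_speeds}
    assignment = {node: [] for node in node_speeds}
    # Largest tasks first; ties broken by task name for determinism.
    ordered = sorted(task_costs.items(), key=lambda kv: (-kv[1], kv[0]))
    for task, cost in ordered:
        # Pick the node on which this task would complete earliest.
        best = min(
            node_speeds,
            key=lambda n: (finish_time[n] + cost / node_speeds[n], n),
        )
        finish_time[best] += cost / node_speeds[best]
        assignment[best].append(task)
    return assignment, finish_time
```

For example, `assign_tasks({"a": 4, "b": 2, "c": 2}, {"fast": 2.0, "slow": 1.0})` places tasks `a` and `c` on the faster node and `b` on the slower one. A real coordinator in a dynamic environment would also have to react to nodes joining, leaving, or changing load, which this static sketch ignores.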

To disseminate the project results online, we have created a project website (http://msca-malot.github.io/), a GitHub organization (https://github.com/msca-malot), and a Twitter account (https://twitter.com/MalotMsca). These channels will continue to be used to expand the impact of the project's current and subsequent outcomes. Some project outcomes have also been presented at academic and networking events, including the top-tier conferences VLDB 2022 and SIGMOD 2022, invited talks at HUST, SUSTech, BUPT, etc., and seminars at AAU and RUC.
In the project, we have
1) established a general picture of managing mobility data quality in the decentralized, dynamic, and heterogeneous LoT environment,
2) proposed effective mobility data quality modeling techniques on the dimensions of data uncertainty, data errors, and processing latency,
3) gained insights on designing efficient and robust algorithms for mobility data quality enhancement tasks including missing data imputation and outlier detection, and
4) proposed a novel, edge-computing-based mechanism for coordinating decentralized and heterogeneous computing nodes in mobility data quality management.

The current deliverables of the project, which focus on representative computing tasks including data statistics, outlier detection, and data repairing, enable a new paradigm of using edge node coordination for mobility data quality management in a decentralized and heterogeneous architecture.

From a long-term perspective, this new paradigm has the potential to be applied to a wide variety of data-centric tasks in the LoT (and generally, the Internet of Things), towards reliable, responsive, and resilient data management in the presence of decentralized, dynamic, and heterogeneous computing nodes. This should facilitate the development of many IoT-enabled applications like traffic planning, logistics, and air quality monitoring, which contribute immediately to human well-being and continuous social progress.
