Periodic Reporting for period 1 - KDD-CHASER (Knowledge Discovery in Data as Collaboration of Human and Software Actors)
Reporting period: 2018-02-01 to 2020-01-31
Analysing personal data to discover useful knowledge about the individuals concerned is usually viewed as something done by corporations with access to large quantities of customer data, such as Google or Facebook. Thus, when exploitation of personal data is viewed from the perspective of the individual, it is typically seen as something that needs to be regulated in order to ensure that corporations do not abuse the power that comes with the possession of data about people. The perspective that this data could also be something that the people themselves exploit for their own personal benefit tends to get overlooked; the significance of the collaboration concept studied in KDD-CHASER is that it would enable individuals to achieve this by working together with other individuals instead of handing control of their data over to a company.
The overall objectives of KDD-CHASER were to build a model of the process of collaborative data analysis, to develop a software platform to support this type of collaboration, and to demonstrate the viability of the process model and the software platform by running a trial. The successful execution of the trial shows that the collaboration process is feasible and that the software platform can be used to support it in a real-world environment, although the usability, stability and performance of the software still require substantial improvement. Furthermore, the results of a survey conducted among the participants of the trial suggest that there is interest in this type of collaboration and that many people could gain useful information about themselves through collaborative analysis of their personal data.
After the first version of the ontology had been designed and implemented, development of a software platform for collaborative analysis of personal data was started. The platform was designed to support finding and inviting collaborators, creating and sharing datasets, visualising analysis results and communicating with collaborators via text-based chat. Internally, the platform was designed to use the ontology to represent and store all information about the users of the platform and their collaborations with one another. To test the software platform and the collaboration process in practice, a trial was carried out. Twelve volunteers were recruited, asked to use wearable devices to record their sleep data for a period of approximately 2 months and given instructions for using the platform to share their data with a researcher (playing the role of data analysis expert) and to view the analysis results. At the end of the trial, the volunteers were invited to complete a survey to provide feedback on the software platform and the collaboration process. The results of the trial indicate that the process and software are fundamentally viable but suffer from issues that need to be addressed by further research.
The results of the project were disseminated by publishing peer-reviewed papers at international scientific conferences. Further publications are being prepared and are expected to be submitted for review in the first half of 2020. The software platform did not reach a sufficiently high technology readiness level during the project to be exploitable, but following the successful proof of concept, research funding has been applied for in order to continue the development of the ontology and the software.
The immediate impact of the results of KDD-CHASER is to lay the groundwork for more applied research in the area of collaborative analysis of personal data. The long-term impact of the work, assuming that the proof of concept is eventually transformed into a fully fledged product, is potentially highly significant. Empowering people to control their data and to refine it into knowledge would enable everyone to enhance their quality of life through the application of technologies that most people currently have limited access to, such as artificial intelligence. Given that activity and sleep data are highly relevant to health, better exploitation of such data for the benefit of the data owners would have a considerable positive social impact as well. Collaborative data analysis also has the potential to become a new area of profitable economic activity, with business opportunities for freelance data analysts and providers of collaboration platforms. Finally, collaboration with individual data owners may become a new way for researchers to obtain access to data, potentially boosting scientific research in any discipline where there is a use for personal data collected by the individuals themselves.