Periodic Reporting for period 1 - INVeST (INdividual Vascular SignaTure: A new machine learning tool to aid personalised management of risk for cardiovascular disease)
Reporting period: 2016-09-02 to 2018-09-01
Objectives: The main research objective of the INVeST project was to create a sound method for generating individual vascular signatures from measurements of retinal vascular function. These signatures are intended for the early prediction of an abnormal vascular response in cases of risk or pathology, and for personalised intervention to modify it.
- The learning tasks covered the teaching objectives listed in the proposal: time series modelling, statistical noise modelling, machine learning, data collection using DVA, personal data protection and financial project management.
- The fellow supervised a final-year computer science student in machine learning.
- Networking tasks consisted of trips to Vienna, Basel and Honolulu to discuss project-related issues.
- During the clinical trials the fellow ran a recruitment campaign, scheduled patients, operated the Dynamic Vessel Analyser and Vendy’s Endothelix, and helped collect various biomarkers.
The bulk of the work consisted of designing and implementing a software solution for the long-term storage, retrieval and analysis of the recorded measurement data. Collectively, this software package was named the Invest software suite (ISS).
ISS is a layered software package with the following components:
1. Invest database of retinal vascular function data
2. Middleware
3. Import / export tools
4. Research software
Invest database
The Invest database is at the core of our data storage solution, i.e. it is the central repository for all the primary data related to the project. It contains all data from the various retinal vessel analysis data sources in a single location as well as related medical background data (e.g. lifestyle information, cardiovascular, blood tests, endothelium), where collected.
The technical solution for the central Invest database involves an Ubuntu 16.04 LTS server, backed by RAID-Z storage. The database server is accessible only to two users. Physically the server is locked in the project coordinators' office.
Challenges:
Two database-related technical issues had to be overcome:
1. merging multiple Sybase databases into a single relational database and
2. extracting the measurement data from the retinal vessel analysis recordings
Merging multiple Sybase databases was challenging because of the following factors:
1. Outdated documentation of the used database structure.
2. The so-called moving target problem. Updates to the vendor's software brought breaking changes to its database structure.
3. Inconsistencies in data entry. Subtle changes in spelling or typos can lead to the same patient being recorded in multiple Sybase databases.
4. Lack of Sybase licenses.
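The data-entry inconsistency in item 3 can be illustrated with a minimal sketch of approximate record matching. This is not the project's actual merge procedure: the record fields (`source`, `name`, `birth_date`), the similarity threshold of 0.85 and the rule "same birth date plus similar name" are all assumptions chosen for illustration, and the example data is invented.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case a name and collapse punctuation and whitespace differences."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def same_patient(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Assumed rule: birth dates must match exactly and the
    normalised names must be similar enough."""
    if a["birth_date"] != b["birth_date"]:
        return False
    ratio = SequenceMatcher(None, normalise(a["name"]), normalise(b["name"])).ratio()
    return ratio >= threshold


def group_duplicates(records: list[dict]) -> list[list[dict]]:
    """Greedily cluster records that appear to describe the same patient."""
    groups: list[list[dict]] = []
    for record in records:
        for group in groups:
            if same_patient(group[0], record):
                group.append(record)
                break
        else:
            # No existing group matched, so this record starts a new one.
            groups.append([record])
    return groups


# Invented example records from two hypothetical source databases.
records = [
    {"source": "site_a", "name": "Smith, John", "birth_date": "1960-05-01"},
    {"source": "site_b", "name": "Smyth John", "birth_date": "1960-05-01"},
    {"source": "site_b", "name": "Jones, Mary", "birth_date": "1972-11-23"},
]
groups = group_duplicates(records)
```

In practice, candidate matches produced this way would still need manual review before records are merged, since a fixed threshold can both miss true duplicates and join distinct patients.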
Creating a complete multi-center database was out of scope of the project. The binary data format used for recording the measurements was undocumented, which made extracting the actual measurement data for research purposes extremely difficult.
Middleware
The middleware is an abstraction layer above the Invest database and provides access for the import/export tools. It consists of two projects, implemented in .NET:
1. InvestDb. This maps the database tables to .NET classes.
2. RVADecoder. This is a class library which extracts the data from RVA recordings.
The import/export tools
A set of four utility tools has been created to facilitate (1) importing data from the DVA equipment, (2) exporting data from the Invest database to CSV files, (3) importing CSV files into the Invest database, and (4) editing or adding medical background for patients.
These tools have been glued together with various Bash and Windows batch scripts to allow the seamless and safe movement of data (using SSH tunnels) between the different computers over the intranet: development computer / client computer -- database server -- DVA machine.
Research software
The research software is the largest part of the project and is responsible for the data analysis part of the research. It includes a technical solution consisting of a so-called thin layer for accessing the database, a caching mechanism which allows reusing past database queries, and most importantly data analysis and data visualisation tools. The research and exploration tools have been written in Matlab.
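The idea of a caching mechanism that reuses past database queries can be sketched as follows. The project's research tools were written in Matlab, so this Python version is only an illustration of the principle; the class name `QueryCache`, the table `measurement`, its columns and the use of an in-memory SQLite database as a stand-in for the Invest database are all assumptions.

```python
import hashlib
import json
import sqlite3
import tempfile
from pathlib import Path


class QueryCache:
    """Store query results on disk, keyed by a hash of the SQL text and parameters."""

    def __init__(self, connection: sqlite3.Connection, cache_dir: Path) -> None:
        self.connection = connection
        self.cache_dir = cache_dir
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.hits = 0
        self.misses = 0

    def _path_for(self, sql: str, params: tuple) -> Path:
        key = json.dumps([sql, list(params)])
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def fetch(self, sql: str, params: tuple = ()) -> list[list]:
        path = self._path_for(sql, params)
        if path.exists():
            # Same query and parameters seen before: reuse the stored result.
            self.hits += 1
            return json.loads(path.read_text(encoding="utf-8"))
        self.misses += 1
        rows = [list(row) for row in self.connection.execute(sql, params)]
        path.write_text(json.dumps(rows), encoding="utf-8")
        return rows


# Stand-in database with a hypothetical table layout and invented values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurement (patient_id INTEGER, dilation_pct REAL)")
conn.executemany("INSERT INTO measurement VALUES (?, ?)", [(1, 3.2), (1, 4.1), (2, 2.7)])

cache = QueryCache(conn, Path(tempfile.mkdtemp()))
query = "SELECT dilation_pct FROM measurement WHERE patient_id = ? ORDER BY dilation_pct"
first = cache.fetch(query, (1,))   # runs against the database
second = cache.fetch(query, (1,))  # served from the cache directory
```

This sketch never invalidates cached results, which is only reasonable when the underlying data changes rarely; a real cache would need a rule for discarding stale entries after new imports.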
Analysis and visualisation of the RVA data
Modelling the retinal vessel response has proved to be challenging because of the significant amount of noise from various sources present in the signal. We experimented with a large number of computational and mathematical methods, such as neural networks (including autoencoders), genetic programming and classical time series models (e.g. AR, ARMA, Box-Jenkins, output-error). Due to the high-dimensional nature of the flicker data, the challenge for visualisation was the choice of a dimensionality reduction algorithm that would allow all measurements to be mapped into 2D or 3D.
An essential step towards our final solution for modelling the personal flicker response of patients was the explicit modelling of the flicker signal. By including the flicker signal as a square wave function and modelling the vascular flicker response as a dynamic system, it was possible to create a 2D mapping that correlates very well with the distance from the golden average. We have therefore produced a sound method for accurately modelling individual flicker response.
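To make the idea of a square-wave input driving a dynamic system concrete, here is a minimal sketch under strong simplifying assumptions. It is not the model used in the project: the first-order discrete system y[t+1] = a*y[t] + b*u[t], the sampling scheme, the parameter values and the noise-free synthetic data are all chosen purely for illustration. The two fitted parameters (a, b) give one possible 2D coordinate per recording.

```python
def square_wave(n_samples: int, period: int, on_samples: int) -> list[float]:
    """Flicker stimulus: 1.0 while the flicker is on, 0.0 otherwise."""
    return [1.0 if (t % period) < on_samples else 0.0 for t in range(n_samples)]


def simulate(a: float, b: float, u: list[float]) -> list[float]:
    """Assumed first-order response y[t+1] = a*y[t] + b*u[t], with y[0] = 0."""
    y = [0.0]
    for t in range(len(u) - 1):
        y.append(a * y[t] + b * u[t])
    return y


def fit_first_order(u: list[float], y: list[float]) -> tuple[float, float]:
    """Least-squares estimate of (a, b), solved via the 2x2 normal equations."""
    steps = range(len(y) - 1)
    s_yy = sum(y[t] * y[t] for t in steps)
    s_yu = sum(y[t] * u[t] for t in steps)
    s_uu = sum(u[t] * u[t] for t in steps)
    r_y = sum(y[t + 1] * y[t] for t in steps)
    r_u = sum(y[t + 1] * u[t] for t in steps)
    det = s_yy * s_uu - s_yu * s_yu
    a = (r_y * s_uu - r_u * s_yu) / det
    b = (r_u * s_yy - r_y * s_yu) / det
    return a, b


# Synthetic, noise-free example: three flicker cycles of 100 samples,
# with the stimulus on for the first 20 samples of each cycle.
u = square_wave(n_samples=300, period=100, on_samples=20)
y = simulate(a=0.9, b=0.5, u=u)
a_hat, b_hat = fit_first_order(u, y)
signature_2d = (a_hat, b_hat)  # one 2D point per recording
```

Real recordings are noisy, as noted above, so a plain least-squares fit like this one would give biased estimates; an output-error formulation, one of the model classes mentioned earlier, handles measurement noise on the response more appropriately.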