The first 18 months focused on model and tool development.
Work Package 1: Model development and improvement. In particular, the following tasks have been performed:
- Set up the modeling infrastructure on AWS (understanding the developed models and their requirements, implementing the pipeline used to generate the data, releasing and testing it, and setting up the architecture for processing collected field data in sync with Timbtrack)
- Create a high-fidelity, up-to-date forest basemap and masks (SRI), displayed here:
https://ee-swiftt.projects.earthengine.app/view/foresttype
Publication: Salii, Y., Kuzin, V., Hohol, A., Kussul, N., & Yailymova, H. (2023, July). Machine learning models and technology for classification of forest on satellite data. In IEEE EUROCON, https://doi.org/10.1109/EUROCON56442.2023.10199006.
- Several data science models have been trained:
For insect outbreak: identification and collection of benchmark ground-truth maps of insect outbreaks (Czech Republic, September 2020, by DEFID2, and southeast France, October 2018, by SERTIT); exploration of what can be achieved with spectral data engineering, spectral vegetation indices, multi-sensor (Sentinel-1 and Sentinel-2) data and time-series data; training and accuracy evaluation of pixel-wise classification algorithms (Random Forest, XGBoost, SVM, MLP) and U-Net-based architectures (with attention, self-distillation, data fusion); preliminary analysis of the temporal transferability of the best trained models.
For windthrow: literature review (GLAD Alerts, GLAD-S2 Alerts, RADD Alerts); experiments with SAR on Australian windthrows; exploration of SAR processing methods; setup of the first model with temporal SAR and anomaly detection.
For wildfire: three machine learning and deep learning models were tested for fire risk prediction applied to forest assets. Ignition prediction was tested with data gathered from Spain, and weather forecast data from Italy were also tested.
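As an illustration of the "temporal SAR and anomaly detection" idea mentioned for windthrow, the sketch below flags a pixel when its latest backscatter value departs strongly from that pixel's own history. It is a minimal example under stated assumptions, not the project's model: the z-score rule, the threshold of 3 standard deviations, the function names and the input layout are all hypothetical and were chosen only to make the principle concrete.

```python
from statistics import mean, stdev


def backscatter_anomaly(history_db, latest_db, threshold=3.0, min_spread=1e-6):
    """Compare the latest SAR backscatter of one pixel with its own history.

    history_db: earlier backscatter values in dB (hypothetical input layout).
    latest_db:  newest observation in dB.
    threshold:  z-score magnitude treated as anomalous (assumed value).
    min_spread: floor on the standard deviation to avoid division by zero.
    Returns (z_score, is_anomaly).
    """
    if len(history_db) < 2:
        raise ValueError("need at least two historical observations")
    baseline = mean(history_db)
    spread = max(stdev(history_db), min_spread)
    z = (latest_db - baseline) / spread
    return z, abs(z) >= threshold


def flag_pixels(series_by_pixel, threshold=3.0):
    """Return the ids of pixels whose last value is anomalous.

    series_by_pixel maps a pixel id to a time-ordered list of dB values;
    the last element is taken as the newest acquisition.
    """
    flagged = []
    for pixel_id, series in series_by_pixel.items():
        _, is_anomaly = backscatter_anomaly(series[:-1], series[-1], threshold)
        if is_anomaly:
            flagged.append(pixel_id)
    return flagged


if __name__ == "__main__":
    demo = {
        "stable": [-8.0, -8.2, -7.9, -8.1, -8.0, -8.05],
        "dropped": [-8.0, -8.2, -7.9, -8.1, -8.0, -11.5],
    }
    print(flag_pixels(demo))
```

A per-pixel baseline is used here because backscatter levels differ between stands; a real pipeline would also need speckle filtering and seasonal correction, which this sketch leaves out.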
Work Package 2: Data Standardisation, Integration and Verification. In particular, the following tasks have been performed:
Infrastructure work has been carried out on the web and mobile platform to create a shared data lake. At specific intervals, a synchronization process is initiated to transfer data from the Timbtrack Database to the Shared Database. This ensures that the Shared Database contains both the real-time and historical data required by the AI models. The Wildsense API accesses the historical data stored in the Shared Database to perform model training and analysis.
- SWIFTT utilizes Google Cloud for secure online hosting of its database, ensuring confidentiality and integrity.
- Employing Sequelize ORM simplifies database operations and enables efficient creation of tables and relationships.
- MySQL 8.0 is chosen for storing historical data crucial for AI model training. Its efficient indexing and querying, along with ACID compliance, ensure data integrity.
- The integration of MySQL, Sequelize, and Google Cloud provides a robust infrastructure for AI model learning, enhancing risk detection and decision-making capabilities.
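The periodic synchronization described above can be sketched as an incremental copy driven by a timestamp watermark. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for MySQL 8.0 so the example runs anywhere, the project itself uses Sequelize rather than raw SQL, and the `measurements` table, its columns and the function names are hypothetical.

```python
import sqlite3

# Hypothetical table layout shared by both databases in this sketch.
SCHEMA = (
    "CREATE TABLE measurements ("
    "id INTEGER PRIMARY KEY, plot TEXT, value REAL, updated_at TEXT)"
)


def make_db():
    """Create an in-memory database with the demo schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def sync_new_rows(source, shared):
    """Copy rows changed in `source` since the last sync into `shared`.

    Timestamps are ISO-8601 strings, so plain string comparison orders them.
    Returns the number of rows copied.
    """
    # Watermark: newest timestamp already present in the shared copy.
    (watermark,) = shared.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM measurements"
    ).fetchone()
    rows = source.execute(
        "SELECT id, plot, value, updated_at FROM measurements "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Upsert so that a row edited in the source replaces its older copy.
    shared.executemany(
        "INSERT OR REPLACE INTO measurements (id, plot, value, updated_at) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    shared.commit()
    return len(rows)


if __name__ == "__main__":
    source, shared = make_db(), make_db()
    source.execute(
        "INSERT INTO measurements VALUES (1, 'A', 10.0, '2024-01-01T00:00:00')"
    )
    print(sync_new_rows(source, shared))  # first run copies the new row
    print(sync_new_rows(source, shared))  # nothing changed since then
```

A scheduler would call `sync_new_rows` at the chosen interval. The watermark approach assumes every insert or edit in the source updates `updated_at`; deletions would need separate handling.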
In addition, a forest damage dataset has been created for Ukraine, covering forest fires and bark beetle damage (excluding sanitary cutting and damage resulting from the war).
Publication: H. Yailymova, B. Yailymov, Y. Salii, V. Kuzin, A. Odruzhenko, S. Sydorenko, A. Shelestov, N. Kussul. A Multimodal Dataset for Forest Damage Detection and Machine Learning. 2024 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2024) (accepted).
Work Package 3: Testing and Feedback
- The field partners have been trained in data collection standards.
- A first data protocol was created and later updated to a second version.
- The first data collection took place in forests in Riga and Ukraine.