Big Data for 4D Global Urban Mapping – 10^16 Bytes from Social Media to EO Satellites

Periodic Reporting for period 3 - So2Sat (Big Data for 4D Global Urban Mapping – 10^16 Bytes from Social Media to EO Satellites)

Reporting period: 2020-05-01 to 2021-10-31

By 2050, around three quarters of the world’s population will live in cities. Despite increasing efforts, global urban mapping still lags behind the geometric, thematic and temporal resolutions of geo-information needed to address the challenges of this urbanization. Today, diverse sets of incomplete data exist. For example, Earth observation (EO) satellites reliably provide geodetically accurate, large-scale geo-information on cities on a routine basis from space, but data availability is limited by the resolutions and acquisition geometries of the sensors. As a complement, massive amounts of imagery, text messages and GIS data from open sources and social media provide a temporally quasi-seamless, spatially multi-perspective information basis, but with unknown and diverse quality. So2Sat aims at a joint exploitation of big data from social media and satellite observations for global urban mapping, targeting breakthroughs in 3D/4D urban modelling, infrastructure occupancy classification, and very high resolution population density mapping on a global scale, in order to revolutionize urban geographic research. In detail, the following methodological and application objectives will be addressed: improving urban-related information retrieval from EO satellite data, mining urban imagery and text messages from social media data, fusion of heterogeneous data sources, big data processing, and pilot application research on informal settlement classification and global population density estimation. The outcome of So2Sat will be the first and unique global and consistent spatial data set on the urban morphology (3D/4D) of settlements, and a multidisciplinary application derivative assessing population density.
The work performed from the beginning to the end of the third financial reporting period can be structured along the four major objectives:
MO1 Improving information retrieval: MO1 has three sub-objectives: 1) robust denoising of Earth observation images, 2) solving ill-conditioned and underdetermined problems, and 3) image fusion. For the first, we developed a nonlocal-means filter that significantly improves the signal-to-noise ratio (SNR) of TanDEM-X SAR images, which in turn improves the accuracy of 3D reconstruction from the filtered images. This filtering algorithm was integrated into our radar tomography algorithm, which is used for global 3D building reconstruction (Shi et al., 2019). The second sub-objective has also been achieved: we developed a compressive sensing-based radar tomographic algorithm (Shi et al., 2019) tailored to ill-conditioned and heavily underdetermined problems. Work on the last sub-objective is ongoing.
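
The nonlocal-means principle mentioned above can be illustrated with a minimal sketch. It is not the project's filter, which operates on SAR images; it only shows the generic idea on a one-dimensional real-valued signal: each sample is replaced by a weighted average of nearby samples, with weights based on how similar the surrounding patches are. The function name, the parameters (patch_radius, search_radius, h) and the toy data are assumptions made for illustration.

```python
import math


def nonlocal_means_1d(signal, patch_radius=1, search_radius=5, h=0.5):
    """Denoise a 1D real-valued signal with a basic nonlocal-means filter.

    Each sample becomes a weighted average of the samples in a search
    window. The weight of a candidate sample decays with the mean squared
    difference between the patch around it and the patch around the
    sample being filtered. All parameter names are illustrative.
    """
    n = len(signal)

    def patch(i):
        # Patch centred on index i; border samples are replicated.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        weight_sum = 0.0
        value_sum = 0.0
        lo = max(0, i - search_radius)
        hi = min(n, i + search_radius + 1)
        for j in range(lo, hi):
            p_j = patch(j)
            dist2 = sum((a - b) ** 2 for a, b in zip(p_i, p_j)) / len(p_i)
            w = math.exp(-dist2 / (h * h))  # similar patches get weight near 1
            weight_sum += w
            value_sum += w * signal[j]
        # weight_sum >= 1 because the sample itself contributes weight 1.
        out.append(value_sum / weight_sum)
    return out


# Toy data (hypothetical): a step edge with small perturbations.
noisy = [0.0, 0.1, -0.1, 0.05, 0.0, 1.0, 0.9, 1.1, 1.0, 0.95]
denoised = nonlocal_means_1d(noisy)
```

Because the weights depend on patch similarity rather than on distance alone, samples on the far side of the step contribute little, so in this toy case the edge is largely preserved while the flat regions are smoothed.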

MO2 Mining social media data: The three sub-objectives in MO2 are 1) finding feature representations of social media images, 2) finding efficient methods to update 3D building models, and 3) mining information from text messages. Regarding the first and third points, we developed stable processors to crawl Flickr images and tweets from the internet. So far, we have collected more than 25 million social media images and 1.5 billion tweets. Social media images from Flickr cover a broad variety of motifs, but only a small fraction of these images contain clear and useful information about individual buildings. Hence, for 2) we have developed algorithms to identify images that are relevant for our building type classification task. To extract information from text messages, we use Twitter data and apply natural language processing methods, including word embeddings and deep learning models. By integrating the resulting features in the spatial domain, we create semantic information, e.g. building functions. The study of social media images and tweets has been completed and published (Häberle et al., 2019; Kruspe et al., 2021), and is also being summarized in two PhD dissertations.
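
The idea of turning text messages into features and integrating them in the spatial domain can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's pipeline: the three-dimensional word vectors, the planar coordinates, the building identifiers and all function names are hypothetical, and plain averaging stands in for the deep learning models mentioned above.

```python
import math
from collections import defaultdict

# Hypothetical 3-dimensional word vectors; real embeddings would be learned.
EMBEDDINGS = {
    "coffee": [0.9, 0.1, 0.0],
    "lunch": [0.8, 0.2, 0.0],
    "meeting": [0.1, 0.9, 0.0],
    "office": [0.0, 1.0, 0.0],
}


def embed_text(text):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def nearest_building(point, buildings):
    """Return the id of the building whose centroid is closest to the point."""
    return min(buildings, key=lambda b: math.dist(point, buildings[b]))


def aggregate_by_building(tweets, buildings):
    """Average the tweet embeddings assigned to each building."""
    grouped = defaultdict(list)
    for text, location in tweets:
        vec = embed_text(text)
        if vec is not None:  # skip tweets without any known word
            grouped[nearest_building(location, buildings)].append(vec)
    return {b: [sum(col) / len(vecs) for col in zip(*vecs)]
            for b, vecs in grouped.items()}


# Toy data (hypothetical): building centroids in local planar coordinates.
buildings = {"A": (0.0, 0.0), "B": (100.0, 0.0)}
tweets = [
    ("coffee and lunch", (5.0, 2.0)),
    ("office meeting today", (98.0, -3.0)),
    ("hello world", (50.0, 0.0)),
]
features = aggregate_by_building(tweets, buildings)
```

The resulting per-building vectors could then serve as input to a classifier of building functions. Assigning each tweet to the nearest centroid is only one possible spatial aggregation rule.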

MO3 Optimal information fusion of heterogeneous data: In this MO, we tackle two types of data fusion challenges. One is the fusion of different types of EO images, and the other is the fusion of EO and social media data. Regarding the former, we have focused specifically on the fusion of synthetic aperture radar (SAR) data and multi-spectral imagery provided by the Sentinel-1 and Sentinel-2 missions, respectively. This led to an improvement in urban land cover classification accuracy over the use of a single sensor source. This research has been completed and the results were published (Zhu et al., 2021). Regarding the latter, we completed a study on the fusion of street view and aerial view optical images for settlement type classification, and drew conclusions on the best fusion strategy for such highly heterogeneous data (Hoffmann et al., 2019). A study on the fusion of social media images and tweets was also completed and is being summarized in a PhD dissertation. The study on the fusion of social media data and remote sensing data is ongoing.
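
Two common ways of fusing heterogeneous sources such as those described above are feature-level fusion (concatenating the features of both sources before classification) and decision-level fusion (combining the class probabilities predicted from each source separately). The sketch below illustrates both in their simplest form. It does not reproduce the strategies evaluated in the cited studies; the function names, the weighting scheme and all numbers are assumptions made for illustration.

```python
def fuse_features(features_a, features_b):
    """Feature-level fusion: concatenate two per-sample feature vectors."""
    return list(features_a) + list(features_b)


def fuse_decisions(prob_a, prob_b, weight_a=0.5):
    """Decision-level fusion: weighted average of two probability vectors."""
    if len(prob_a) != len(prob_b):
        raise ValueError("probability vectors must have the same length")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(prob_a, prob_b)]


def predict(probabilities):
    """Return the index of the most probable class."""
    return max(range(len(probabilities)), key=probabilities.__getitem__)


# Hypothetical per-pixel features from a SAR and a multi-spectral source.
sar = [0.12, 0.80]
optical = [0.30, 0.25, 0.10, 0.55]
stacked = fuse_features(sar, optical)  # input to a single joint classifier

# Hypothetical class probabilities from two single-source classifiers.
p_sar = [0.6, 0.3, 0.1]
p_optical = [0.2, 0.7, 0.1]
p_fused = fuse_decisions(p_sar, p_optical)
```

In this toy case the SAR-only classifier prefers class 0, while the fused probabilities prefer class 1, which shows how a second source can change the decision. Which strategy works best depends on the data and the task.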

MO4 Big data processing: In this MO, we have been continuously developing and improving our big data processing software on the supercomputer SuperMUC-NG of LRZ. This framework is the basis for our global 3D reconstruction and classification processing. We also made significant progress in developing a fast optimizer for 3D reconstruction using SAR tomography, which improved the computational speed by a factor of 20 (Shi et al., 2018). For the classification of global EO data, we have developed efficient CPU inference code, which is currently running on the supercomputer.
Triggered by the project's need for methods from the field of artificial intelligence (AI), the project group has become a leading group in the application of deep learning to Earth observation and has started to define the corresponding state of the art in the remote sensing community. By the end of the project, the group will have consolidated this position, and several innovative algorithms fine-tuned to both semantic (2D) and topographic (3D/4D) urban analysis will have been developed and made available to the public in the form of open access publications.
Mapping Global Urban Morphology from Space by Deep Learning
The coverage of our Global Building Footprint (G), compared with Google (R) and OSM (B)
First Impression of the Global 3D Urban Models from Earth Observation Data Science
So2Sat LCZ42: A Benchmark Dataset for Global Local Climate Zones Classification
First Impression of the So2Sat Global Urban Models (3D + Semantics)