Periodic Reporting for period 4 - QuAnGIS (Question-based Analysis of Geographic Information with Semantic Queries)
Reporting period: 2023-07-01 to 2024-08-31
This project addressed these issues by advancing Geographic Question Answering (GeoQA), a cutting-edge approach in Artificial Intelligence (AI) that handles spatial information through natural language questions. While current GeoQA methods excel at retrieving facts (e.g. “Which municipalities neighbor Amsterdam?”), they fall short when addressing geo-analytical questions whose answers are not explicitly available, such as “What is the average park density accessible for pedestrians in Amsterdam?” These analytical questions require indirect answers derived from complex workflows involving data transformation, spatial reasoning, and tool integration—processes central to GIS.
The project's primary objective was to develop methods for enabling automatic retrieval and synthesis of tools, workflows and data sources for answering geo-analytical questions. To achieve this, we:
- Developed a theory of interrogative spatial concepts to formalize geo-analytical questions.
- Created computational models that make such questions machine-interpretable.
- Designed semantic frameworks to describe geospatial tools and data in terms of the questions they answer.
By pursuing these goals, we provided analysts with tools to formulate questions conceptually, enabling the automated discovery of relevant data and workflows over the Web. This paradigm shift makes GIS more accessible to non-experts, fosters interdisciplinary applications, and enhances reproducibility and efficiency in spatial analysis. The project’s outcomes have significant implications for society. They lower the barriers to using spatial data in fields like public health, urban planning, and disaster management, empowering diverse stakeholders to address critical societal challenges. Furthermore, this research advances Geographic Information Science (GIScience) through concept and theory development and through the integration of AI and GIS, laying the foundation for innovative applications in spatial reasoning and AI-based knowledge generation. Ultimately, our work represents a milestone in democratizing GIS capabilities, understanding geographic information, and expanding the reach of spatial analysis across domains.
In Year 2, we developed a geo-analytic question grammar (WP2), capturing question patterns, formalizing them based on interrogative concepts, and enabling concept transformations. The grammar was tested on the corpus and online sources, forming the basis for a query interface.
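As a rough illustration of pattern-based question interpretation, the toy sketch below matches a single question pattern and extracts its parts. It is not the project's grammar: the regular expression, the `AGGREGATES` mapping, and the function name `parse_question` are assumptions made for this example only.

```python
import re

# Hypothetical mapping from question words to aggregation labels.
# These labels are illustrative assumptions, not the project's vocabulary.
AGGREGATES = {"average": "mean", "total": "sum", "maximum": "max"}

# One toy pattern: "What is the <aggregate> <measure> in <extent>?"
PATTERN = re.compile(
    r"^what is the (?P<aggregate>\w+) (?P<measure>[\w ]+?) in (?P<extent>[\w ]+)\?$",
    re.IGNORECASE,
)


def parse_question(question):
    """Return the parts of a question matching the toy pattern, else None."""
    m = PATTERN.match(question.strip())
    if m is None:
        return None
    agg = m.group("aggregate").lower()
    return {
        "aggregate": AGGREGATES.get(agg, agg),
        "measure": m.group("measure").lower(),
        "extent": m.group("extent"),
    }


parsed = parse_question("What is the average park density in Amsterdam?")
print(parsed)
```

A question that does not fit the pattern, such as the factual "Which municipalities neighbor Amsterdam?", simply returns `None` here. A real grammar would cover many more patterns and map them to interrogative concepts.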
We also developed a conceptual model of information components for geo-analytic QA (WP3/5). The Core Concept Data Types (CCD) ontology enables automatic workflow composition using tools and geodata, successfully tested in GIS workflow studies. Annotations support our data/tool repository (WP4).
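To give a feel for how type annotations on tools and data can drive automatic workflow composition, here is a minimal sketch that searches for a tool chain deriving a goal type from the available data types. The tool names, the type labels, and the `compose_workflow` function are hypothetical and far simpler than the CCD ontology; the search strategy (breadth-first) is also an assumption, not the project's method.

```python
from collections import deque

# Hypothetical tool catalogue: name -> (required input types, output type).
# Names and type labels are illustrative assumptions only.
TOOLS = {
    "select_parks": ({"LandUseRegions"}, "ParkRegions"),
    "compute_area": ({"ParkRegions"}, "ParkAreaAmount"),
    "divide_by_extent": ({"ParkAreaAmount", "CityExtent"}, "ParkDensity"),
}


def compose_workflow(available, goal):
    """Breadth-first search for a tool sequence that derives `goal`.

    Returns a list of tool names, or None if no sequence exists.
    """
    start = frozenset(available)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        types, plan = queue.popleft()
        if goal in types:
            return plan
        for name, (inputs, output) in TOOLS.items():
            # A tool is applicable when all its input types are present.
            if inputs <= types and output not in types:
                nxt = types | {output}
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [name]))
    return None


plan = compose_workflow({"LandUseRegions", "CityExtent"}, "ParkDensity")
print(plan)
```

If a required input type is missing (for example, only `CityExtent` is available), the function returns `None`, which corresponds to a question that cannot be answered with the annotated resources.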
The Core Concept Transformation (CCT) algebra was created to interpret geo-analytical questions as concept transformations for querying workflows, and we explored amounts in Geography. CCT is being implemented as a Python library.
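The idea of querying workflows by concept transformations can be sketched in a much simplified form: each workflow is summarized by the ordered concept types it passes through, and a query retrieves workflows in which the requested concepts occur in the same order. This is not the CCT library or its API; the workflow names, concept labels, and the functions `matches` and `retrieve` are assumptions for illustration.

```python
# Hypothetical workflow summaries: ordered concept types each workflow passes
# through. Names and labels are illustrative assumptions only.
WORKFLOWS = {
    "park_density": ["Object", "Region", "Amount", "Ratio"],
    "noise_exposure": ["Field", "Region", "Amount"],
}


def matches(query, trace):
    """True if the concepts in `query` occur in `trace` in the same order."""
    it = iter(trace)
    # `concept in it` advances the iterator, so order is enforced.
    return all(concept in it for concept in query)


def retrieve(query):
    """Return the names of all workflows whose trace matches the query."""
    return sorted(name for name, trace in WORKFLOWS.items() if matches(query, trace))


print(retrieve(["Region", "Ratio"]))
print(retrieve(["Region", "Amount"]))
```

An ordered-subsequence match is the simplest possible stand-in here; an algebra of typed transformations can express considerably more structure than a flat list of labels.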
In Year 2, User Study 1 (WP1.3) evaluated GIS workflow design with 40 participants. Despite delays due to the pandemic, this study provided valuable insights for automation and informed the gold standard.
In Year 3, User Study 2 (WP4.2) assessed the usability and interpretability of the GeoQA grammar and Blockly interface, leading to refinements based on participant feedback. User Study 3 (WP4.3) tested human analysts’ ability to recognize core concepts in geographic tasks, validating the CCD ontology and guiding its improvement.
In Year 4, User Study 4 (WP5.2) was intended to evaluate the full technology stack, but due to incomplete prototype development, we instead conducted a cognitive map interpretation survey, essential for the project’s progress.
A retrieval and query study (Steenbergen et al., 2023) also evaluated the technology stack, testing the ability of the GeoQA grammar, CCD ontology, and CCT algebra to retrieve workflows and data based on geo-analytical questions. Results confirmed the system’s effectiveness in automating workflow composition and data retrieval.
The detailed outcomes of these studies validate the GeoQA framework’s feasibility and practical utility in advancing geographic question-answering and spatial analytics.
- GeoAnQu Corpus: A groundbreaking dataset of geo-analytical questions, providing a gold standard for evaluation and empirical insights into geo-analytical questions.
- Integration of Core Concept Model in GeoQA: The core concept model (Kuhn, 2012) has been applied in GeoQA for semantic descriptions of geodata, automated GIS workflow composition, and grammar-based question interpretation.
- Core Concept Data Types (CCD) Ontology and Geographic Quantities: The CCD ontology (Scheider et al., 2020) describes geodata and tools, automating workflow composition and linking geo-analytical questions to executable workflows. Retrieval studies validated its effectiveness, and formal theories of geographic amounts (Top, 2022) were developed to extend it.
- Geo-Analytical Grammar and CCT Algebra: A grammar was developed (Xu et al., 2023) to translate geo-analytical questions into concept transformations via the Core Concept Transformation (CCT) algebra (Steenbergen et al., 2023), laying the groundwork for automated question answering with workflows.
- QuAnGIS Prototype: A prototype integrates these scientific developments into a GeoQA system, allowing users to ask geo-analytical questions via a Blockly interface to retrieve GIS workflow suggestions.
These achievements provide a solid foundation for advancing geo-analytical QA and automating knowledge generation in Geographic Information Science (GIScience).