Final Report Summary - CULTURALGEOSEMANTICS (Geosemantics for Cultural Heritage Documentation – Domain specific ontological modelling and implementation of a Cultural Geosemantic Information System based on ISO specifications)
Main results
The main results of the project concern ontology engineering. They comprise the creation of CRMgeo, a linking ontology between CIDOC CRM and GeoSPARQL, together with contributions to CRMarchaeo, a CIDOC CRM extension that represents common archaeological concepts for excavations, and to CRMscience, an extension that models scientific observation. The CRMgeo extension has been presented to the archaeological community and to ICOM-CIDOC working groups on various occasions (28th meeting of the CIDOC CRM SIG, Seventh World Archaeological Congress, CAA 2013), including a peer-reviewed paper at CAA 2013. CRMgeo is documented in detail in an ICS-FORTH Technical Report (March 2013), and its RDF implementation is published on the ICS-FORTH webpage. CRMgeo has proved to be a significant result, as it is already being applied in several EU-funded projects such as ARIADNE (www.ariadne-infrastructure.eu), IMARINE (www.i-marine.eu), InGeoCloudS (www.ingeoclouds.eu) and LifeWatch (www.lifewatch.eu). In 2014, CRMarchaeo and CRMscience were approved by ICOM-CIDOC to become recommendations.
Further results concern the integration of geoinformation and semantic technologies. The abovementioned extensions of the CIDOC CRM were applied to a use case scenario dealing with research on the history of mining activities in the Eastern Alps, which is the objective of the Research Centre HiMAT (http://www.uibk.ac.at/himat/index.html.en). A workflow was created to transform data from the Research Centre and other relevant data sources into RDF using the CIDOC CRM extensions. In the next step, a Cultural Geosemantic Information System prototype was designed based on the state-of-the-art open source software tools Arches and Parliament. This system allows cultural heritage data and geoinformation data to be integrated on the basis of ontologies and semantic technologies. The goal of the use case application was to assist the integration of archaeological prospection data from historic mining sites, excavation data and satellite data. Research results have been published in the International Journal of Heritage in the Digital Era.
As a continuation, a new project was proposed to the Austrian Science Fund and granted in October 2014. It aims at employing and improving the developed methodologies and technologies for co-referencing place names. This funding guarantees that research on the integration of Humanities data and geoinformation will continue after the end of the project in January 2015, in collaboration with the University of Southern California and the Getty Foundation, and it demonstrates the value of the project results.
Work carried out to achieve project objectives
The first objective, ontology engineering, was to build an extension to the CIDOC CRM in order to integrate OGC GeoSPARQL classes. Through intensive analysis applying the ontological criteria of substance, identity, existence and unity, the scientist in charge and the fellow realised that an extension to the CIDOC CRM linking to GeoSPARQL was necessary, because the concepts of the two standards do not really match and neither standard allows for expressing objectively where something is in a way that is robust against any change of spatial scale and time.
The second objective, a use-case study, was broken into three steps. The first consisted of acquiring and understanding relevant data through various meetings with members of the Research Centre HiMAT, which enabled the fellow to acquire the necessary knowledge of the underlying semantics for mapping their data. In the second step, investigating archaeological concepts, the fellow contributed significantly to the analysis of international archaeological concepts and methodologies in the ARIADNE Workshop on Excavation Methodologies, which led to the generation of CRMarchaeo. In parallel he contributed to CRMscience, a CRM extension for scientific observation that provides more general concepts for CRMarchaeo. For the third step, mapping example datasets to the CRM and its extensions, the KARMA tool (http://www.isi.edu/integration/karma/) was used because it provides a visual user interface for mapping datasets to ontologies and producing RDF triples, which allowed the final ingestion of data into the Metadata Repository of ICS-FORTH.
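To make the third step more concrete, the following is a minimal sketch of what mapping one tabular record to RDF triples can look like. It is not the KARMA workflow and does not reproduce the project's mappings: the record fields, the base URI http://example.org/himat/ and the choice of the CIDOC CRM class E27 Site are assumptions made purely for illustration.

```python
# Illustrative only: serialises one hypothetical source record as N-Triples.
# BASE and the record layout are assumptions, not taken from the project data.
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/himat/"  # hypothetical namespace


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record):
    """Map a dict with 'id' and 'name' keys to a type triple and a label triple."""
    subject = f"<{BASE}site/{record['id']}>"
    return [
        f"{subject} <{RDF_TYPE}> <{CRM}E27_Site> .",
        f'{subject} <{RDFS_LABEL}> "{escape_literal(record["name"])}" .',
    ]


if __name__ == "__main__":
    for line in record_to_ntriples({"id": "42", "name": "Example mining site"}):
        print(line)
```

A visual mapping tool generates statements of this kind from a user-defined model instead of hand-written code; the sketch only shows the shape of the output that is later ingested into a repository.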
The third objective, to design a Cultural Geosemantic Information System prototype, consisted of customizing the new Arches System of the Getty Conservation Institute and World Monuments Fund (http://archesproject.org/) as the GIS component of the prototype. Arches supports CIDOC CRM based data structures and has a state-of-the-art user interface. To apply GeoSPARQL queries to a semantic repository, the Parliament triple store (http://parliament.semwebcentral.org/) was used. The example data was ingested, and the repository infrastructure of FORTH-ISL was integrated with Parliament through the FORTH-ISL query manager.
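As an illustration of the kind of GeoSPARQL query such a repository can answer, the sketch below builds a query that selects features whose geometry lies within a bounding box. It only assembles the query text and contacts no endpoint; the graph pattern (geo:hasGeometry, geo:asWKT) follows the GeoSPARQL vocabulary, while the bounding box values and the variable names are assumptions and do not come from the project's data or its query manager.

```python
from string import Template

# Query skeleton using the standard GeoSPARQL vocabulary and the
# simple-features function geof:sfWithin. $polygon is filled in below.
GEOSPARQL_QUERY = Template("""\
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?place ?wkt WHERE {
  ?place geo:hasGeometry ?geom .
  ?geom geo:asWKT ?wkt .
  FILTER(geof:sfWithin(?wkt, "$polygon"^^geo:wktLiteral))
}
""")


def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed WKT polygon ring (longitude latitude order) for a bounding box."""
    ring = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first vertex to close the ring
    ]
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON(({coords}))"


def build_query(min_lon, min_lat, max_lon, max_lat):
    """Insert the bounding-box polygon into the query skeleton."""
    return GEOSPARQL_QUERY.substitute(
        polygon=bbox_to_wkt(min_lon, min_lat, max_lon, max_lat)
    )


if __name__ == "__main__":
    # Hypothetical bounding box, chosen only for demonstration.
    print(build_query(10, 46, 12, 48))
```

The resulting string could be sent to any GeoSPARQL-capable SPARQL endpoint; how the prototype actually routed queries to Parliament is not described here and is not implied by this sketch.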
The fourth objective, dissemination, was realized through the publication of a paper in the International Journal of Heritage in the Digital Era and various other presentations, project contributions and conference contributions. A second paper, focusing on the implementation work and intended for submission to a Geoinformation journal, is under preparation. In the course of the project the fellow co-supervised a master's thesis of a student at ICS-FORTH and a PhD thesis of another student at the University of Innsbruck. Another application for a multipartner project including the Information Sciences Institute of the University of Southern California, the Getty Conservation Institute and the University of Innsbruck was written; the project was approved and will be funded by the Austrian Science Fund.
Conclusions
The CRMgeo extension closes the gap between cultural heritage ontological representation and detailed geometric information as defined in OGC Standards. It thus enables the integration of, and joint reasoning with, causal topological information coming from the CIDOC CRM ontology and topological information derived from geometries. The ontological approach of CRMgeo differentiates between a phenomenal world, consisting of observable phenomena that occupy spacetime volumes, and a world described by information, which uses geometries to approximate real-world phenomenal places. This enhances the conceptual world defined by the OGC, which consists only of features and geometries. The new concepts of Phenomenal Spacetime Volumes, Phenomenal Places and Declarative Places allow for differentiated reasoning on the semantic properties of real-world phenomena with their spatiotemporal behaviour and on the information intended to represent these phenomena. CRMscience and CRMarchaeo provide the ontological classes to document the provenance of scientific knowledge, differentiating explicitly between observations, measurements and inferences. Formalising the differentiation of observation and inference in CRMscience is crucial for the objective interpretation and reinterpretation of archaeological data.
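The distinction between phenomenal and declarative places can be pictured as a small data structure. The sketch below is an informal reading of that distinction, not the CRMgeo RDF implementation: the Python class and attribute names, the "approximates" link and the sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenomenalPlace:
    """A place defined by a real-world phenomenon; its true extent is never given directly."""
    identifier: str
    defined_by: str  # the phenomenon that defines the place, e.g. an excavated feature


@dataclass(frozen=True)
class DeclarativePlace:
    """A place defined by information: a geometry declared by some source."""
    identifier: str
    wkt: str                       # the declared geometry
    source: str                    # who or what declared it (survey, map, GPS track)
    approximates: PhenomenalPlace  # the real-world place this geometry stands for


def declarations_for(place, declarations):
    """Collect every declared geometry that approximates the given phenomenal place."""
    return [d for d in declarations if d.approximates == place]


if __name__ == "__main__":
    # Hypothetical example: one real-world place, two competing geometries.
    shaft = PhenomenalPlace("place/1", "extent of a mine shaft")
    survey = DeclarativePlace("decl/1", "POINT(11.4 47.3)", "field survey", shaft)
    sketch = DeclarativePlace("decl/2", "POINT(11.5 47.3)", "historic map", shaft)
    for declaration in declarations_for(shaft, [survey, sketch]):
        print(declaration.source, declaration.wkt)
```

Keeping the two kinds of place apart lets several geometries of different origin and precision refer to the same real-world place without any of them being mistaken for the place itself.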
The application of these ontologies to a use case scenario from the area of mining history research posed first the challenge of mapping the information relevant for a specific research question and then that of converting the source data into RDF. Semantic technologies that help with these mappings and conversions are available but still limited (e.g. for URI generation). The representation of information in graphs instead of relational databases poses additional challenges for the design of user interfaces. The developed Cultural Geosemantic Information System prototype proposes a possible solution that makes use of ontologies by applying emerging semantic technologies in combination with professional Geoinformation systems for user interaction.
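URI generation is one place where a small amount of custom code is often needed. The sketch below shows one possible approach, deterministic URI minting from a source identifier, so that re-running a conversion yields the same URI for the same record. It is not the method used in the project; the base URI, the entity type names and the hashing scheme are all assumptions.

```python
import hashlib
import re
import unicodedata

BASE_URI = "http://example.org/resource/"  # hypothetical namespace


def slugify(label):
    """Reduce a label to lowercase ASCII words joined by hyphens."""
    ascii_label = (
        unicodedata.normalize("NFKD", label).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_label.lower()).strip("-")


def mint_uri(entity_type, source_id, label=""):
    """Build a repeatable URI from the entity type and the source record identifier.

    The short hash depends only on entity_type and source_id; the optional
    slug is added for readability.
    """
    key = f"{entity_type}|{source_id}".encode("utf-8")
    digest = hashlib.sha1(key).hexdigest()[:10]
    slug = slugify(label)
    suffix = f"{slug}-{digest}" if slug else digest
    return f"{BASE_URI}{entity_type}/{suffix}"


if __name__ == "__main__":
    # Hypothetical record identifier and label.
    print(mint_uri("site", "42", "Grube Süd"))
```

Because the slug is derived from the label, a later change of label would change the URI; where labels are unstable, omitting the label and keeping only the hash gives more durable identifiers.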
Potential impact
One significant impact is that essential concepts of CRMgeo have been adopted by the CRM-SIG for inclusion into the CIDOC CRM standard and will be integrated in the next regular update of ISO 21127. It is planned to propose CRMgeo to CIDOC in 2015 as a recommended extension for modelling spatiotemporal data. The use of CRMgeo by the German Archaeological Institute (DAI) to model temporal gazetteers should have another significant impact in the archaeological community, as should the use of the model by Franco Niccolucci of the University of Florence, who describes CRMgeo as a "paradigm change" in the way space and time are represented in archaeology. The work on CRMarchaeo and CRMscience, representing common scientific and archaeological concepts for excavation sites, will have an impact on the potential to integrate archaeological datasets from different archaeological schools, methodologies and times. The integration of Cultural Heritage and Geoinformation, developing and using ontologies with the application of semantic technologies, will have an impact on knowledge representation in these disciplines. It allows integration in global data networks like Linked Open Data, with the potential to connect huge resources that have never been related before. The integration of cultural heritage information in location-based tourism portals could also be a field of application with socio-economic impact. Another scenario is the application of the technologies within museum exhibitions to visualize the geographic background and relations of displayed objects based on an event-oriented ontology. The latter scenario will be further developed in the project funded by the Austrian Science Fund and applied to permanent exhibitions of the Smithsonian American Art Museum and a touring exhibition on the mining history of the Eastern Alps.