Periodic Reporting for period 2 - NEWORLDatA (Negotiating World Research Data: A science diplomacy study)
Reporting period: 2023-07-01 to 2024-12-31
Neworld@a aims to better understand global imbalances in the distribution of scientific data as they are now, and also to chart their historical legacy. To this end, we are undertaking a study that combines scientometric and historical approaches. The scientometrics study aims to provide quantitative and visual evidence of the current unevenness of the scientific data infrastructure through a combination of world maps and multi-layered network maps (see attached image). These maps display which regions of the world are more or less densely connected to the available scientific datasets and can therefore more or less easily use them in data-intensive research. This study is combined with a parallel analysis of international treaties facilitating data exchange, in order to establish whether existing international legal provisions for data sharing play a part in shaping these asymmetries, especially by accelerating exchanges between specific groups and regions in the Northern Hemisphere while slowing them down elsewhere. Both the scientometrics and the legal treaties studies then feed into a historical analysis of the origins and evolution of international data organizations, to better understand whether their historical trajectory has tended to exacerbate or remove these asymmetries in global data distribution. The final component of the project uses the scientometric and historical evidence amassed to consider what science policy provisions could counter the widening gap in data distribution at the global level.
The combined historical, legal treaties and scientometrics analyses should thus return a comprehensive picture of a problem of vital importance to global society and governance, namely the inequality in the distribution of scientific data resources across the planet. This problem has decisive repercussions for international scientific collaborations, and especially for what we now call 'science diplomacy', i.e. the use of these collaborations to build constructive relations between nations and address global societal challenges. It is our working hypothesis that an uneven distribution of research data resources can advance science in some world regions while hindering scientific development in others. In turn, a lack of capacity building in individual research communities can result in subordinating them to others with overwhelming data capacity. The repercussions for the scientific community and for society at large are compelling: unless these asymmetries are better understood and fully addressed, it will be impossible to govern the global scientific enterprise in truly transformative ways and so address global challenges. It is worth recalling the findings of the 2023 mid-term report on the implementation of the UN Sustainable Development Goals, which emphasized not only that we are far off track from achieving the 2030 SDG targets, but also that achieving them requires science and technology capacity building in ALL countries (https://sdgs.un.org/sites/default/files/2023-09/GSDR%202023%20Key%20Messages_1.pdf). As far as data are concerned, this means comprehensively re-thinking how data resources have been distributed over the last century and whether different distribution criteria should be developed.
The scientometrics study has enabled us to examine, from a quantitative viewpoint, the current features of these global data imbalances, especially the geographical distribution of datasets, access provisions, and the unevenness of their use across the global scientific enterprise. Using Re3Data, an internet-based repository of information on research datasets, we have mapped these asymmetries consistently, also applying social network analysis methods. The world mapping exercise has returned vital results, showing in particular that countries in Africa, Latin America and some Asian regions sit at the margins of the networks of access to research data: they access repositories less frequently and host fewer of them. Our ongoing exploration suggests that data asymmetries play a part in shaping the gap in scientific development between what we now call Global North and Global South countries. We have further extended these working hypotheses to the study of treaties enabling the exchange of scientific information and data sharing. This study could not rely on a single repository and therefore draws on a range of sources, but it mirrors the scientometrics study in showing that Global North countries enjoy a level of research data sharing, and of bilateral exchanges with selected Global South countries, far higher than that enjoyed by Global South countries.
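As a purely illustrative sketch of the kind of country-to-repository mapping described above, and not the project's actual pipeline, the Python example below builds a small two-mode (country and repository) network and computes, for each country, how many repositories it hosts and how many it is linked to. The records, the country labels A to D, and the function name `country_profiles` are all hypothetical placeholders; they are not Re3Data entries or project results, and the normalisation used (links divided by the total number of repositories) is only one possible choice.

```python
from collections import defaultdict

# Hypothetical toy records: (repository, hosting country, countries with access links).
# Placeholders only; these are NOT Re3Data entries or project findings.
RECORDS = [
    ("repo_1", "A", ["A", "B", "C"]),
    ("repo_2", "A", ["A", "B"]),
    ("repo_3", "B", ["A", "B"]),
    ("repo_4", "C", ["C", "D"]),
]


def country_profiles(records):
    """Summarise each country's position in a country-repository network.

    Returns a dict mapping country -> {"hosted", "linked", "degree_centrality"},
    where degree_centrality = linked repositories / all repositories.
    """
    hosted = defaultdict(int)   # number of repositories hosted per country
    linked = defaultdict(set)   # repositories each country hosts or accesses
    repos = set()

    for repo, host, users in records:
        repos.add(repo)
        hosted[host] += 1
        linked[host].add(repo)
        for country in users:
            linked[country].add(repo)

    total = len(repos)
    return {
        country: {
            "hosted": hosted.get(country, 0),
            "linked": len(linked_repos),
            "degree_centrality": len(linked_repos) / total,
        }
        for country, linked_repos in sorted(linked.items())
    }


if __name__ == "__main__":
    for country, profile in country_profiles(RECORDS).items():
        print(country, profile)
```

In this toy network, a country that hosts no repository and is linked to only one ends up with the lowest centrality, which is the kind of marginal position the mapping exercise is designed to make visible.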
The scientometrics and legal treaties analyses have provided the evidence needed to focus our attention on the historical determinants of these asymmetries in global data distribution. We have thus explored the archives documenting the evolution of international data organizations, with the ambition of finding out whether their histories display transitions that set global data distribution on a path of unevenness. We have focussed in particular on two data organizations operating within the chief international scientific coordination body of the 20th century, i.e. the International Council of Scientific Unions (now the International Science Council). These two organizations are the World Data System (WDS) and the Committee on Data for Science and Technology (CODATA). Although still ongoing, the study has already revealed that current data asymmetries stem from the way these organizations developed as collaborative enterprises, with features that paved the way for asymmetries to build up. From 1957 the WDS datasets were hosted only in a select number of scientifically developed countries, whereas from 1966 CODATA welcomed only a limited number of national representatives, mostly from the same countries. In turn, we are now seeking to demonstrate that the dynamics of international collaboration in data sharing are a key factor in shaping these asymmetries, with a few countries with pre-existing scientific infrastructures dominating international collaborative efforts in data sharing.
In order to do so, we will extend the historical examination to other datasets made available in the context of the United Nations because they are critical to world affairs, such as those in the marine sciences (the IODE data system of the Intergovernmental Oceanographic Commission) and nuclear energy (the INIS system of the International Atomic Energy Agency). We also wish to verify whether similar asymmetries exist in the sharing of, and access to, data on human health, human population and animal species. In particular, we expect to contribute new research on datasets such as those produced by the World Health Organization and the International Union for Conservation of Nature. Concurrently, we are examining how the Cold War affected these exchange dynamics. Our research appears to show that the Soviet Union was fundamentally complicit in the administration of asymmetrical data networks, being fully integrated in both WDS and CODATA while developing and operating a separate data network, Gosstandart, for socialist countries in a regime of cooperative antagonism. Much also remains to be ascertained about the role of the People's Republic of China, which was first isolated and then re-integrated in the 1980s, when it started playing a pivotal role in the WDS and became a powerful presence in international organizations devoted to research data.
By 2025 both sub-projects on scientometrics and legal treaties will reach an end, while the historical study of data organizations will continue for another year. Building on the analysis already under way, we expect the historical study to provide a comprehensive picture of the evolution of various interlocked systems of data exchange and to explain even more accurately how this evolution has produced asymmetries. In turn, our final task before ending the project will be to verify which policy provisions align with the scientometric and historical evidence unearthed. Here we expect the study to reveal the need for a 'levelling up' agenda in the global scientific enterprise that goes beyond the simple issue of funds redistribution (for the UK case see for instance: https://assets.publishing.service.gov.uk/media/601c0675d3bf7f70bc2e1f1d/CST_levelling_up_letter.pdf). We also envisage a re-thinking of the role of scientific infrastructures (including data). We are also likely to suggest revising current data provisions that place overwhelming emphasis on access, in recognition that while access to data is indeed a vital issue, it is not the only decisive one. In particular, unless data production and management are more evenly distributed geographically, datasets will always respond to the data-intensive research priorities of the groups that lead this production, no matter how accessible the data are. It is our preliminary understanding that greater re-distribution would in turn set different priorities for the definition of new datasets, more aligned to the priorities of countries undergoing processes of scientific development.