Project ID: 313082
Country: France

Periodic Report Summary 2 - RISIS (Research infrastructure for research and innovation policy studies)

Project Context and Objectives:
The objective of RISIS is to develop a distributed infrastructure serving researchers in the field of science and innovation studies.
The underlying rationale is both policy-based and research-based. On the policy side, policymakers have for at least 20 years been asking questions that existing indicators and their underlying databases cannot answer (this is well explained by Godin in his account of the limits of the production function and the input-output approach). On the research side, one interpretation lies in a deepened understanding of knowledge production and innovation processes and of the circulation of knowledge, and in a greater distance from the assumption that the country is the relevant unit of analysis.
The changing environment and the rise of the internet as a new communication infrastructure have further opened new possibilities for developing databases and analytical tools that improve our understanding, and for developing indicators that match the theoretical state of the art. More specifically, new indicators can keep the identity of actors and their fine-grained location (in metropolitan areas), the loss of which is one of the strongest limitations of the statistics-based approach to indicators of science, technology and innovation. Lepori, Barré et al. (2008) proposed calling these new indicators ‘positioning indicators’.

This has generated a wealth of developments and experimental databases, quite a few of them supported by the European Commission, and mostly as one-off efforts. A first objective is thus to stabilise and maintain these datasets over time and make them available to all interested researchers in Europe, but also abroad. We also treat the duration of the project as a ‘testing’ period, which will reveal how lasting the interest in these datasets is and help us determine which ones should be kept, while also giving us time to think about the conditions for their long-term maintenance.

However, the available data only partially cover the enduring theoretical and policy problems. We have identified several key long-standing issues that require specific efforts, and have thus proposed the development of new datasets on these issues. This is the second objective.

Many problems lie at the interface of existing datasets and thus depend on the ability to interface them and to generate a problem-based integration. This is the third objective, which requires that we attend to technical and substantive harmonisation.

However many datasets the project holds (currently 14), they will only partially address the needs of researchers for robust ‘positioning’ data. Our fourth objective is to develop platforms that will help researchers build their own datasets from a variety of sources, such as open data available on the web, administrative data, and specific project-oriented data collections, and provide them with instruments to organise and analyse these often large textual corpora.

These four objectives combined require that we develop an overall architecture for the infrastructure so that researchers can access, build, integrate and process data remotely. This is a fifth objective that, though not explicitly included in the project, is becoming central to the dynamics of the infrastructure.

Project Results:
The first two years were dedicated to the ‘opening’ of existing RISIS datasets and platforms to European researchers. This required intensive work because most datasets were still experimental. We chose to conduct this work collectively so as to ensure not only the quality and reliability of the individual datasets but also to start organising the technical, cognitive and legal harmonisation needed to foster single-point access and problem-based integration. All datasets are now open for visits through a single access point. We have had 30 visits since opening, although the flow only really started at the beginning of 2016. The selection process runs smoothly, with response times well below our initial target (answers now take less than 2 weeks). We also offer training courses, free of charge, to European researchers and doctoral candidates (15 courses so far, 10 planned in 2017).

The deployment of the CorText platform was faster than anticipated thanks to a dual strategy enabling early access via the Beta platform, before the opening of the enlarged RISIS version. Its use is far above our best expectations, with over 200 users in a single month (December 2016). SMS has just opened in a beta version (March 2017). We now aim to organise the integration of the two platforms while developing the common functions we need for the future fully-fledged RISIS-level infrastructure.

An important harmonisation effort is under way on the two key dimensions of positioning indicators: organisations and geographic information. This will result, at European level, in a set of actor-based registers. Tools for geocoding and geoclustering have been tailored to our needs and are now being applied to all RISIS datasets.

All research work packages (WPs) have started, and we are very optimistic about the new dataset on ‘fast growing mid-size’ firms, a major issue for European industry, and about the new repository of innovation and research policies (where an agreement is being reached with the OECD). Significant developments have taken place both on PhD careers, a very challenging new European-level platform, and on PROs, where we have chosen to progress step by step.

For communication with the community, we support the main meeting place for researchers, the ENID conference (Leiden in September 2014, Lugano in September 2015, Valencia for 2016 and Paris for 2017). Now that results are available, we have defined a full communication strategy and allocated responsibilities. One of the first impacts is the substantial evolution of our website.

Potential Impact:
We have now opened 10 datasets (with 2 more in the process of opening) and 1 platform (CORTEXT; the second one, SMS, is opening in March 2017). We now have a rich website and a central access point and process for using the datasets. Visits were slow to start, but their pace is increasing and is supported by our extensive training effort. Similarly, the use of our semantic analysis platform, CORTEXT, is far beyond our most optimistic expectations. We are also now devising joint activities with other EC research infrastructures to help them develop their own ‘positioning’.
Thanks to our extensive harmonisation efforts, we are finalising the enrichment of datasets at the organisational and geographical levels in order to open a completely new set of analysis possibilities. The first demonstrators showing the power of organisation-based and place-based integration of RISIS datasets were presented at the last RISIS annual week (January 2017).
The repository of policy evaluations has opened and is linked to the OECD-World Bank Innovation Policy Platform (IPP). Very soon the new dataset on the critical issue of mid-sized firms in Europe will complement our datasets on the internationalisation of large firms and on the development of venture fund-backed start-up firms.
These results highlight the potential of such an infrastructure, which not only serves our community but also has elements that serve other scientific communities more widely (as is, for instance, reflected in the very large demand for using our start-up firm dataset).
This has driven us to enter a second stage sooner than anticipated, namely progressively generalising remote access (rather than visits) and fostering the ability to combine multiple datasets in a problem-based way (through a variety of options). This requires that we develop an overall computer architecture for the project (what we call the RISIS Core Facility). This is ongoing, with the development and testing of key functional aspects (in particular user authentication and dashboards, the technical organisation of conditional access, and the central datastore).
In doing so, we think that we provide a genuinely new resource for our community, participate in the changing understanding of and balance between quantitative and qualitative approaches in social science research, and also offer a new model for organising and sharing data (combining the maintenance of key datasets, the development of shared field-specific resources, and the tailoring of platforms to field issues) that can apply to many fields in the social sciences.

