CORDIS - EU research results
Content archived on 2024-05-28

An Observatorium for Science in Society based in Social Models

Final Report Summary - SISOB (An Observatorium for Science in Society based in Social Models)

Executive Summary:
The SiSOB (An Observatorium for Science in Society based on Social Models) project was a three-year EU project conducted from January 2011 to December 2013 (36 months). The project brought together seven universities and research centres from Europe and Latin America with the goal of providing policy makers and their advisors with new tools for the ex ante and ex post analysis of science policy decisions. To date, the majority of studies have focused on the networks of academic, scientific, and industrial actors that produce scientific knowledge and artefacts (science production systems). The SiSOB project extended these analyses to the broader science-society system, which not only influences the decisions of science production systems and consumes their products, but also provides the socio-political context in which the science production system operates.

The main outcomes of the project can be summarized as follows:
• A “conceptual model” of the social impact of science. This model defines a shared vocabulary for use within the project and makes explicit the central assumptions that underlie the project. The conceptual model describes interactions in the knowledge production system in terms of entities (human and institutional actors, artefacts, contextual factors, knowledge, systems, and actors) and the relationships among these entities.
• A novel set of indicators to characterize the social networks underlying science production systems.
• Three case studies: researcher mobility, knowledge sharing, and peer review. The case studies contributed to the design of the SiSOB tools while providing novel insights that sometimes go beyond and contradict current knowledge.
• Novel software tools that can measure and predict the social impact of research. The project was based on the creation of complex models of actors related by artefacts, results, and relationships. These tools measure impact taking into account that "impact" essentially depends on interactions and the channels of social communication in which these interactions occur. To measure impact, the SiSOB project used traditional scientometric approaches in combination with structural methods based on the analysis of social and knowledge networks.

The SiSOB workplan was successfully completed on time and within its budget. The research and deliverables have been widely disseminated through classic media (journal and conference papers, a dissemination workshop, invited talks), social media (blogs, electronic open-source journals, social networks), and scientific publications. The tools developed by the project will soon be released as open-source software and are already in use by groups other than the SiSOB Consortium.

Website
For more information on the SiSOB project please visit the project website at: http://sisob.lcc.uma.es/

Project Context and Objectives:
Governments, funding agencies, and scientific journals seek to support research that will have a strong positive impact on society. They are interested in research leading to products, services, and processes that change and improve people’s lives or their understanding of the world. In other words, they want to support the modern equivalents of smallpox vaccine, penicillin, the internal combustion engine, or Darwin’s theory of evolution. To achieve this goal, or come close to achieving it, they need effective tools to measure and predict the impact of science on society.

Traditionally, the quality of research has been evaluated using peer review (pre-publication) and bibliometric indicators (post-publication). These approaches implicitly accept that the aim of research is to satisfy the values and needs of the research community rather than the values and needs of society as a whole. However, without effective tools for predicting and measuring the true social impact of research this remains a very limited view. Against this background, the strategic goal of the SiSOB project was to develop novel tools that can measure and predict not only the overall social impact of research, but also a specific aspect of this impact: the social appropriation of scientific knowledge generated by research. This refers to the way in which economic and social actors take research results and use them to produce new products, new services, and new ideas. During the SiSOB project, we observed how this process essentially depends on interactions between the different actors involved in the process, and on the channels of social communication on which these interactions depend.

To achieve the abovementioned goals, the SiSOB project pursued the following strategic objectives:
SO1: To create a general framework modelling the actors, relationships, communities, and emergent research social networks involved in producing scientific research, disseminating the research results, and translating these results into products, services, and socially important ideas.
SO2: To design and implement tools and indicators that can automatically collect, analyze, and visually represent data that describe the actors and their interactions.
SO3: To create data-driven models of specific actors, communities, and networks relevant to the three case studies mentioned above.
SO4: To use the tools and indicators developed by the project to collect and analyze data relevant to the three case studies.
SO5: To use the results from these studies to validate the methods and tools developed by the project.
SO6: To implement and release an open-source platform for the acquisition and analysis of social network data relevant to measuring the social impact of science, including data from communities and networks not included in the SiSOB case studies.

The SiSOB consortium consisted of a multidisciplinary team of researchers with backgrounds in the following areas: (i) the data-driven modelling of social interactions; (ii) the design of indicators for scientific productivity and social impact; and (iii) the design of software tools for use by social scientists and policy-makers.

The work plan was organized into ten work packages (WP) (see Table 1). WP1 and WP10 were dedicated to management activities (coordination, dissemination). The remaining work packages focused on research and development. WP2 (conceptual modelling) created the general conceptual framework used in the development of technical tools and the case studies. Work packages 3-6 developed the SiSOB software tools. Work packages 7-9 consisted of case studies on specific issues related to the social impact of science. The case studies informed the design of the software tools, which were then used by the case-study researchers to facilitate their work. Figure 1 shows the interactions between the work packages.


Project Results:
The goal of the SiSOB project was to provide analytical instruments that can eventually be applied as practical support tools for decision-makers and their advisors. Within the setting of the SiSOB project, we defined a decision-maker as an individual whose decisions influence and shape the production of scientific knowledge and the impact of science on society. Examples include laboratory heads, university presidents, editors of journals, reviewers, government and funding agency officials, among many others.

The social impact of science depends on a broad range of factors. Some of these factors are associated with the way scientific knowledge is produced, others with the way it is distributed to actors outside the knowledge production system, and yet others with the way it is received, exploited, and consumed. All of these factors involve social interactions between different actors (individuals, groups, institutions) working in different contexts or settings. These interactions constitute a complex social system whose dynamics unfold on a global scale over long periods of time.

The project created two sets of tools. The first set is oriented towards decision and policy-makers, whereas the second set is designed to be used by specialist scientometrists, or more generally, by researchers interested in the structure and dynamics of the science production system. The characteristics of these tools are strongly influenced by the considerations outlined in the design of the conceptual model of the project, which was developed in work package 2.

The tools allow users:
• To characterize and visualize relationships and information flows within a given scientific community (a laboratory, a university, a region, a country, a project, or a discipline), between individual researchers within this community, and between the community and specific external actors (e.g. decision-makers, opinion-makers, investors, etc) in the rest of society.
• To analyze bibliometric, social, and economic indicators relevant for the assessment of the proximal impact of a given decision and to compare relationships and information flows between different communities (e.g. different laboratories, universities, disciplines, etc).
• To analyze changes in these relationships and flows.
• To obtain a multidimensional model of how specific scientific communities operate within society.

The goal is not to evaluate different outcomes, but to provide decision- and policy-makers with information that can help them make their own evaluations.

The methodology followed by the SiSOB project combined theoretical studies with case studies using the iterative design process illustrated in Figure 2. In this process, the case studies informed the design of the tools, which were then used by the case-study researchers to facilitate their work.

The main scientific and technical outcomes of the project were as follows:

• A “conceptual model” of the social impact of science. This model defines a shared vocabulary for use within the project and makes explicit the central assumptions that underlie the project. The conceptual model describes interactions in the knowledge production system in terms of entities (human and institutional actors, artefacts, context factors, knowledge, systems, and actors) and the relationships among these entities. The three case studies operationalize the model in relation to researcher mobility, knowledge sharing, and peer review.
• A set of indicators useful in characterizing the social and information networks underlying the knowledge production system. The system of indicators, built upon state-of-the-art models of scientific communication (science mapping, science dynamics), provides rich quantitative and formal insights into structural features of mobility patterns, knowledge sharing at the interfaces of science and society, and peer review systems, enabling users to set up and run studies in the three pre-selected domains of the science system (Case Studies).
• The three case studies that explore specific aspects of the knowledge production system:
o Researcher mobility
o Knowledge sharing
o Peer review
• The SiSOB software. This is open-source software supporting quantitative studies of the social impact of science. Special emphasis is placed on the issues addressed by the case studies.



1. SISOB CONCEPTUAL MODEL
The SiSOB case studies focused on three aspects of the knowledge production system: mobility, knowledge sharing, and the role of the peer review system. A further goal was to develop concepts and tools that could also be applied outside the specific areas of research covered by the case studies.

The SiSOB conceptual model conceptualized the chain of events leading from discovery to impact as follows:
− Actors and knowledge production networks produce artefacts in a context that affects what they produce and the efficiency with which they produce it;
− Distribution actors and distribution networks distribute these artefacts to knowledge users;
− Knowledge users use the knowledge and directly or indirectly produce outcomes.

The entities of the SiSOB conceptual model are shown in Figure 3.
The general goal of SiSOB was thus to develop measurements of production networks and/or production contexts and/or distribution networks and/or distribution contexts and to relate these measurements to outcomes.
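
The chain from production through distribution to use can be pictured as a small data structure. The following Python sketch is only an illustration and not the project's actual schema: the class names, fields, role labels, and example entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    name: str
    role: str  # hypothetical labels: "producer", "distributor", "user"


@dataclass
class Artefact:
    title: str
    kind: str  # e.g. "paper", "patent", "prototype"
    context: str = ""  # production context, as free text
    producers: list = field(default_factory=list)
    distributors: list = field(default_factory=list)


@dataclass
class Outcome:
    description: str
    artefact: Artefact
    users: list = field(default_factory=list)


def trace(outcome):
    """Follow an outcome back through use, distribution, and production."""
    artefact = outcome.artefact
    return {
        "outcome": outcome.description,
        "used_by": [a.name for a in outcome.users],
        "distributed_by": [a.name for a in artefact.distributors],
        "artefact": artefact.title,
        "produced_by": [a.name for a in artefact.producers],
    }


# Invented example entities, for illustration only.
lab = Actor("Example Lab", "producer")
journal = Actor("Example Journal", "distributor")
firm = Actor("Example Firm", "user")
paper = Artefact("Example paper", "paper", context="university department",
                 producers=[lab], distributors=[journal])
product = Outcome("new product", paper, users=[firm])
print(trace(product))
```

Keeping the production context on the artefact and the users on the outcome mirrors the three steps listed above, so measurements of each step can later be related to outcomes.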

This “conceptual model” made explicit the central assumptions underlying the SiSOB project:
1. Scientific practices are embedded in a social and economic context that exerts a strong influence on the scientific endeavour and plays a key role in determining the way scientific knowledge is ultimately used.
2. The embedding of science in society depends on multiple overlapping relationships and information flows among a multitude of actors. These relationships can be usefully analyzed and visualized using Network Analysis tools.
3. The way in which scientific knowledge and artefacts are produced and used varies significantly between disciplines.
4. The mechanisms determining the impact of scientific knowledge and artefacts are non-linear and not only involve the quality and characteristics of the knowledge and the artefacts, but also the context in which they are received and used.
5. The outcomes of decisions by science policy-makers can be usefully divided into proximal outcomes, which have a direct causal link to a particular policy decision, and distal outcomes, which to a large extent depend on future developments. In principle, proximal outcomes can be predicted, whereas distal outcomes cannot.
6. Some scientific communities are far more productive and make a far greater contribution to society than others.

Scientific discoveries and inventions affect many different aspects of society. Different decision-makers will make different evaluations of the same outcomes. It is thus methodologically impossible to summarize the impact of scientific knowledge and inventions in any single indicator. The social impact of science depends on a broad range of factors, some associated with the way scientific knowledge is produced, others with the way it is distributed to actors outside the knowledge production system, and yet others with the way it is received, applied, exploited, and consumed. All these factors involve social relationships and information flows between actors (individuals, groups, institutions) working in different contexts and settings. These interactions and the context in which they take place constitute a complex social system whose dynamics unfold on a global scale over long periods of time. The SiSOB conceptual model describes these interactions in terms of entities and relationships (Figure 3). The reader should treat the entities defined at the bottom level of the hierarchy simply as examples. Although we have attempted to identify the main classes of actor, artefact, outcome and so on needed for the purposes of our studies, we do not claim to have produced an exhaustive list.

In all these studies, the human actors are researchers who externalize their knowledge in artefacts (papers, patents, prototypes, etc) and who belong to institutions (also considered as actors). The mobility study investigated how researcher mobility (between institutions, between academia and industry, between lower and higher prestige jobs) affects the personal productivity of researchers and ultimately affects the productivity of the institutions to which they belong. The knowledge-sharing study examined information flows between researchers and their institutions as indicated by the content of the artefacts they produce. Finally, the peer-review study focused on the way co-authorship networks, citation networks, and author-reviewer networks (the networks linking researchers who have reviewed each other’s papers) influence the way the peer review system functions and thus the productivity and reputation of researchers and their institutions.


2. SiSOB SOFTWARE
The SiSOB software consists of a web-based analysis workbench and a web-based crawling and data extraction tool that can be used as a stand-alone or in combination with the workbench. The workbench integrates the tools needed for analyses, as carried out in the case studies. It can set up and execute complete workflows starting from data import and preparation, passing through several types of filtering and analysis steps, and leading to results presented in the form of raw data (e.g. data tables or graph files) or as visualizations.
The tools have been developed as open-source code and are freely available to any research team, group, or institution interested in using them. All partners, as well as the external experts who participated in the SiSOB Evaluation Workshop, contributed input to the design of the tools.

* SiSOB WORKBENCH
The SiSOB Analysis Workbench was the main analysis tool developed during the SiSOB project. The workbench combines a web-based user interface for configuring analysis processes with server-side processing of the analyses. In terms of available analysis functionalities, although the main focus has been on the implementation of network analysis techniques and on statistical analysis capabilities, additional techniques can be easily integrated. Figure 4 shows the interface of the SiSOB workbench when used to define a case study as a workflow. Figure 5 shows an example of an output generated by the workbench.

* DATA EXTRACTOR
The data extractor is the stand-alone component of the SiSOB architecture in charge of retrieving information from data sources belonging to different categories:
(a) Structured data: data of known structure, e.g. data stored in XML files or in a relational database. Websites offering such data usually provide services (SOAP, REST, JSON, etc) for information retrieval and thus the information format is already known.
(b) Semi-structured data: well-structured data whose structure is not known to the data extractor. Data extraction is based on the recognition of the tags or fields within the data source.
(c) Non-structured data: unformatted data that requires Natural Language Processing (NLP) strategies to obtain information. Typical examples are scientists' personal web pages.
The extractor has been constructed as a web-based environment with a simple and intuitive interface. The extractor can be connected to other systems with a RESTful API. In the SiSOB workbench, the extractor is used to generate data for subsequent processing by data analysts.
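
The three source categories call for different extraction strategies. The sketch below illustrates the distinction using only the Python standard library. It is not the SiSOB extractor: the function names, the JSON field names, the CSS class names, and the use of a regular expression as a stand-in for NLP are all assumptions made for illustration.

```python
import json
import re
from html.parser import HTMLParser


def extract_structured(payload):
    """(a) Structured: the format is known in advance (here, assumed JSON fields)."""
    record = json.loads(payload)
    return {"name": record["name"], "affiliation": record.get("affiliation")}


class _FieldParser(HTMLParser):
    """Collect the text of elements whose class attribute names a wanted field."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.wanted:
            self.current = css_class

    def handle_data(self, data):
        if self.current and data.strip():
            self.fields[self.current] = data.strip()
            self.current = None


def extract_semi_structured(html, wanted=("name", "affiliation")):
    """(b) Semi-structured: recognise tags or fields inside the source."""
    parser = _FieldParser(wanted)
    parser.feed(html)
    return parser.fields


EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_unstructured(text):
    """(c) Non-structured: a regular expression stands in for a real NLP step."""
    return {"emails": EMAIL.findall(text)}


print(extract_structured('{"name": "Ada Example", "affiliation": "Example University"}'))
print(extract_semi_structured('<span class="name">Ada Example</span>'))
print(extract_unstructured("Contact: ada@example.org."))
```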

* SiSOB VISUALIZATIONS
The SiSOB project provides a set of visual metaphors to visualize the results of the different case studies. These metaphors are implemented as an API on top of D3.js (js/html5) and NVD3.js (js/html5). The visualization components are organized in two groups: statistical visualizations (Figure 7) and graph visualizations (Figure 8). These visualizations have been implemented as an individual module and can be used as part of the workbench or separately in other software applications.



3. SiSOB CASE STUDIES
In a project on the scale of SiSOB, it is not possible to study all the communities and social networks that contribute to the social appropriation of research knowledge. Thus, the SiSOB project focused on three “case studies”, each of which illustrates one of the ways society appropriates (or fails to appropriate) research knowledge.

3.1. Case Study 1. Researcher Mobility

The original goal of this case study was to support the development of the SiSOB project’s methodology by empirically investigating the relationship between researcher mobility, research activities, and social impact. The case study focused on the effects of researcher mobility on the performance of research activities measured according to several dimensions of social impact. The aim was to implement the methodologies developed in the project's conceptual model within the software development process to assess how researcher mobility affects research performance and impact.

A further aim was to produce statistical indicators on the relationship between mobility and research productivity and career development. The goal was to find objective patterns based on mobility and to identify "best practices" for the evolution of scientific communities.
We developed a methodology to collect data for life-course analysis in collaboration with the technical work-package partners. The methodology implemented was based on a Web Crawling Process plus data extraction based on natural language processing. A general web-search approach was employed using inputs on basic information on the researchers and their current institutional affiliation; career data were gathered with the Crawler and then processed by a set of computational tools based on natural language processing. The first process concentrated on collecting basic information such as personal sites, current valid emails, and CVs, whereas the second focused on detecting, organising, and analysing several categories of the CVs that were automatically collected.
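
To give a feel for the second process, the sketch below pulls career steps out of CV text and derives mobility events from them. It is a simplified illustration, not the project's NLP pipeline: the CV line format, the regular expression, and the function names are hypothetical.

```python
import re

# Hypothetical CV line format: "<start>-<end or 'present'> <position>, <institution>"
CAREER_LINE = re.compile(
    r"^\s*(?P<start>\d{4})\s*-\s*(?P<end>\d{4}|present)\s+"
    r"(?P<position>[^,]+),\s*(?P<institution>.+?)\s*$",
    re.IGNORECASE,
)


def extract_career_steps(cv_text):
    """Return the career steps found in the CV text, sorted by start year."""
    steps = []
    for line in cv_text.splitlines():
        match = CAREER_LINE.match(line)
        if match:
            end = match.group("end")
            steps.append({
                "start": int(match.group("start")),
                "end": None if end.lower() == "present" else int(end),
                "position": match.group("position").strip(),
                "institution": match.group("institution"),
            })
    return sorted(steps, key=lambda step: step["start"])


def mobility_events(steps):
    """Record an event whenever consecutive steps change institution."""
    return [
        (prev["institution"], curr["institution"], curr["start"])
        for prev, curr in zip(steps, steps[1:])
        if prev["institution"] != curr["institution"]
    ]


# Invented CV text, for illustration only.
cv = """Curriculum Vitae
2010-present Associate Professor, University B
2001-2005 PhD student, University A
2005-2010 Postdoctoral researcher, University B
"""
steps = extract_career_steps(cv)
print(mobility_events(steps))
```

Real CVs vary widely in layout, which is why the project relied on natural language processing rather than a single fixed pattern like the one assumed here.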

An econometric framework was developed for the analysis of researcher mobility. Figure 9 shows the cycle between researcher mobility and productivity as well as the feedback relationship. This feedback loop suggests that reverse causality issues need to be taken into account. A series of new indicators was developed to be tested and used in researcher mobility studies. The main variables of interest were research or research performance. The general performance of researchers was conceptualized in terms of publication productivity. We hypothesised that mobility affects the number and quality of publications produced by researchers as well as by institutions and regions. Thus, we developed a new set of indicators:

a) Mobility: On the vertical dimension, we consider mobility along ranked systems or hierarchies of positions and locations. Vertical directional mobility paths are described either as (strictly) monotonic paths, i.e. a continuously ascending or descending career, or as non-monotonic paths that exhibit a mixture of upward and downward steps, i.e. paths characteristic of more diverse trajectories.
b) Ranking: Regarding the Researcher Mobility case study it would be of interest to rank institutions according to capital availability (resources and peers). Quality-weighted publications per institution and field could provide such a measure. A system of time-variant rankings was constructed providing a means for registering significant career steps (e.g. through derived threshold values or scales for each ranking).
c) Thematic Mobility: Partially based on our own experience, we applied the most recent toolbox of science mapping, which can be used to thoroughly analyse and visualize a research profile and its evolution. A measure originally proposed to account for the degree of multidisciplinarity was re-interpreted as indicating the degree of thematic mobility, through accumulation and dynamic shift.
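
The vertical mobility indicator in (a), combined with the threshold idea in (b), can be sketched as a small classifier over the rank scores of successive career positions. This is an illustrative sketch only: the function name, the convention that higher scores mean higher-ranked positions, and the `threshold` parameter are assumptions, not the project's definitions.

```python
def classify_path(ranks, threshold=0):
    """Classify a career path from the rank score of each successive position.

    Assumes higher scores mean higher-ranked positions. A step only counts
    as a vertical move if it changes the score by more than `threshold`.
    """
    steps = [later - earlier for earlier, later in zip(ranks, ranks[1:])]
    ups = sum(1 for s in steps if s > threshold)
    downs = sum(1 for s in steps if s < -threshold)
    if ups and downs:
        return "non-monotonic"
    if not ups and not downs:
        return "no vertical mobility"
    direction = "ascending" if ups else "descending"
    strict = ups + downs == len(steps)  # strict: no lateral (equal-rank) step
    return f"{'strictly ' if strict else ''}monotonic {direction}"


print(classify_path([1, 2, 3]))
print(classify_path([3, 3, 1]))
print(classify_path([1, 3, 2]))
```

With a time-variant ranking, the scores passed in would be the rank of each institution at the time of the move rather than a fixed value.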

First results of a mixed effect of mobility on performance
Using a variety of indicators, data sources, and models, we measured the effect of mobility on performance in four studies undertaken during the project. In general, the results suggest that mobility is positively associated with scientific performance. However, this positive effect only occurs after a period of adjustment during which there may be a negative relationship. Furthermore, the positive effect of a mobility event may not endure. The different studies show that the effects depend not only on the type of mobility considered, but also on the measure for scientific performance. In addition, the analysis may return different results depending on whether mobile researchers are compared to immobile researchers or to their own pre-mobility performance. Figure 9 provides evidence for this.

• Geographic mobility
Our analysis showed that internationally mobile researchers publish higher-quality papers and that foreign researchers outperform native researchers. Among returnees, a positive effect can only be shown for researchers who went abroad for their post-doctoral research. However, return mobility never has a negative effect. We also considered the international postgraduate mobility of Japanese researchers and found it had a positive effect on promotion. Nevertheless, this effect decreased over time when pre-mobility characteristics were controlled in a matched sample of researchers. As reported in previous papers, the post-doctoral effect may be short-lived.

The analysis of the effect of geographic mobility on network building showed that foreign researchers and returnees are more likely to work with international co-authors and have a more diverse network. However, researchers going abroad for their PhD studies do not show higher levels of connectivity than researchers doing their PhD in their home country. Only academics moving abroad at a later stage of their career develop network benefits. In addition, in a matched sample of mobile and immobile early career academics in Japan, we found that mobile academics are promoted up to one year earlier than immobile academics. This positive mobility effect was only observed for later career mobility but not for post-doctoral mobility. Thus, international mobility is positive for several different performance indicators and mobility measures.

• Sector mobility
Sector mobility among academic scientists was considered as mobility from industry to academia in the setting of the United Kingdom. In this case, the results show that although mobility is followed by a decline in productivity in the first few years after the move, these researchers are more productive in the long run. Thus, intersector mobility enhances personal performance and may also contribute to the science system as a whole by providing new impulses.

• Career mobility
Since mobility cannot be separated from potential gains in prestige associated with the move, performance is largely driven by the prestige of the department. All the case studies used a measure of prestige in the analysis of mobility. One approach is to qualify mobility in terms of moves between countries with a different h-index. Thus, a move from a country with low scientific performance to a country with high scientific performance is considered an upward move. The analysis found a positive effect of the h-index of the country of origin on the quality of the focal paper of a foreign scientist. It also found a positive effect for upwardly mobile researchers (from a low- to a high-performing country). However, we also found a positive effect of downward mobility. Thus, mobile researchers always outperform their immobile peers.

We also found a positive effect of upward and downward mobility on the probability of experiencing promotion in a sample of Japanese biologists. However, the effect of downward mobility was slightly stronger, indicating that researchers move strategically to gain promotion. On the other hand, downwardly mobile researchers produce fewer publications after moving.

• Thematic mobility
We also presented a measure for thematic mobility based on overlay map techniques. The tool provided important information on the change of research focus over the individual's career. We also tested whether thematic mobility is important for scientific advancement in terms of the number of publications and found a strong positive effect. This suggests that academics with a higher degree of multidisciplinarity and an interest in new research areas are also more productive. The thematic mobility measure proved to be useful in the econometric analysis and as a predictor of scientific production.


The Mobility case study provides original evidence on mobility and challenges the commonly accepted policy view that mobility is beneficial and should be encouraged. Our results suggest a complex interaction between mobility and productivity, which only in certain circumstances may result in a positive impact of the former on the latter. Mobility is far from always being beneficial for individual researchers. In fact, mobility is associated with a short-term decrease in performance due to adjustment costs, and mobility to a lower-ranked department seems to result in decreased mid-term performance. Further research on the specifics of mobility is needed to assess the impact of mobility and inform policies on alternative forms of mobility, especially in Europe. These specifics include mobility associated with career progress, mobility to and from business, mobility to a foreign country, and the career period in which the mobility occurs. If our results are confirmed by future work, this would underline the need to rethink policies related to researcher mobility.



3.2. Case Study 2. Knowledge Sharing


The goal of this case study was to study knowledge flows between actors and the dynamics of actor networks in science production and science-society systems, with a particular focus on their social dimension and impact. We measured and described how knowledge is generated and how it spreads within and between scientific communities. On an observational level, we tracked networks and time series of knowledge artefacts (externalization of knowledge in the form of publications or patents) between individuals and between institutions. In addition to traditional bibliometric sources, we also used communications in the press and social media.
We mainly used Social Network Analysis (SNA) to analyze the abovementioned relationships. Different programs and tools are available for this type of analysis and in some cases we faced a tool discontinuity.
The Knowledge Sharing case study included several sub-studies covering the domains of nanotechnology and biotechnology in Germany, Spain, and the Ibero-American region.

Example 1. Nanotechnology: Co-Author Networks
This case study used bibliographic data from three excellence clusters (CENIDE, KIT, and NIM), obtained from SCOPUS, a large abstract and citation database of peer-reviewed literature. Interviews with CENIDE members were conducted during the study period to better interpret the network relationships and tendencies found in the data. Based on the bibliographic data, co-author networks were constructed for each excellence cluster from 1990 until 2013. Degree centrality and betweenness centrality were used as indicators to identify the different roles of researchers. Tracked over time, the network analysis revealed two specific insights: (1) the centrality gradients could identify promising young researchers (“shooting stars”); and (2) somewhat in contrast to intuitive assumptions, high betweenness was not necessarily associated with strong strategic or political influence in the network, but could also be the result of providing “enabling technologies” (in this case, the production of nano-particles) to a variety of experiments.
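
The two centrality measures can be illustrated on a toy co-author network. The sketch below uses only the Python standard library and computes betweenness with Brandes' algorithm. The author names and papers are invented, and the code is not taken from the SiSOB workbench. In the example, author "P" co-authors with otherwise unconnected groups, much like a provider of an enabling technology would.

```python
from collections import defaultdict, deque
from itertools import combinations


def coauthor_graph(papers):
    """Undirected co-author graph: two authors are linked if they share a paper."""
    graph = defaultdict(set)
    for authors in papers:
        for author in authors:
            graph[author]  # touch the key so single-author papers create a node
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Number of co-authors divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {v: (len(nbrs) / (n - 1) if n > 1 else 0.0) for v, nbrs in graph.items()}


def betweenness_centrality(graph):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        stack, preds = [], defaultdict(list)
        paths = dict.fromkeys(graph, 0)
        dist = dict.fromkeys(graph, -1)
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:  # breadth-first search that counts shortest paths
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies from the farthest nodes back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return {v: s / 2 for v, s in score.items()}  # each pair was counted twice


# Invented toy data: "P" links three otherwise separate groups.
papers = [["P", "A1", "A2"], ["P", "B1"], ["P", "C1"]]
graph = coauthor_graph(papers)
print(degree_centrality(graph)["P"], betweenness_centrality(graph)["P"])
```

Computing these values per year and looking at how they change for each author would give the kind of centrality gradient described above.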
Regarding Network Text Analysis, although the data were again collected from the excellence clusters CENIDE, KIT and NIM, these data were obtained from press releases, press items, and publication abstracts from 2008 to 2011. To obtain so-called topic maps, we conducted an analysis based on the approach by Leydesdorff & Hellsten (2009) that clusters network nodes by the different topics included in the network. Based on the extension of this approach, we compared the topics emerging in the public discourse (in so-called “P-Maps”) to those emerging in the scholarly discourse (so-called “S-Maps”).
The results show that the Leydesdorff approach based on word co-occurrences can detect a change in semantics when the topic moves from the scholarly to the public context. In the scholarly context, CENIDE is related to specific knowledge and basic research. In the press releases, CENIDE and nanotechnology are strongly related to events, offers, and the promotion of these offers. The press appears to be interested in the people behind research and the areas of application. Thus, informing the public behind research is important in terms of social impact.
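
The idea of comparing word associations across contexts can be sketched with binary word-by-document vectors and cosine similarity. This is a strong simplification and not the full Leydesdorff & Hellsten procedure: the choice of cosine as the normalization, the stopword list, the function names, and the toy texts are all assumptions made for illustration.

```python
import math
import re
from collections import defaultdict

STOPWORDS = frozenset({"the", "of", "and", "a", "in", "for", "to", "at", "on"})


def word_vectors(documents):
    """Map each word to the set of document indices in which it occurs."""
    occurs = defaultdict(set)
    for index, text in enumerate(documents):
        for word in set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS:
            occurs[word].add(index)
    return occurs


def cosine(docs_a, docs_b):
    """Cosine similarity between two binary word-by-document vectors."""
    return len(docs_a & docs_b) / math.sqrt(len(docs_a) * len(docs_b))


def associations(documents, target, top=3):
    """Words ranked by how similar their occurrence pattern is to the target's."""
    occurs = word_vectors(documents)
    if target not in occurs:
        return []
    scores = {w: cosine(occurs[target], docs) for w, docs in occurs.items() if w != target}
    return sorted(scores, key=lambda w: (-scores[w], w))[:top]


# Invented toy corpora, not project data.
scholarly = [
    "nanoparticles synthesis for catalysis",
    "nanoparticles characterization",
    "catalysis in fuel cells",
]
public = [
    "nanoparticles exhibition opens at the university",
    "exhibition on nanoparticles for schools",
    "university open day",
]
print(associations(scholarly, "nanoparticles", top=2))
print(associations(public, "nanoparticles", top=2))
```

In the toy data the same word is tied to methods in the scholarly texts and to events in the public texts, which is the kind of semantic shift the P-Map and S-Map comparison is meant to expose.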

Example 2. Analysis of High-Impact Research in the Biotechnology Field and Its Effects on Microblogging Social Media
The objective was to identify the impact of a research field on the Twitter social medium. As input, we used scientific article databases containing the most relevant scientific papers (i.e. Journal Citation Reports (JCR) from the Institute for Scientific Information (ISI) for the category “Biotechnology and Applied Microbiology” in 2011). The analysis of these data was based on the keywords that appear in these papers (co-occurrence of keywords). By analysing the networks of keywords, we obtained groups of highly related keywords (which we call themes) to identify what is being researched at a specific time. Within each theme, we used several techniques of network analysis to obtain the list of relevant sets of keywords in order to measure both their influence and the opinion generated by social media users. The latter measurement was carried out by analysing the comments about these sets of keywords posted on Twitter, which is one of the most relevant social media networks.
In order to obtain themes, we applied a community detection algorithm to the keyword network. A total of 21 communities (or themes) were obtained. We then selected the relevant keywords from each theme (most related keywords). In addition, the combination of relevant keywords leads to more specific results (we limited the size of these keyword sets to 2 to avoid a combinatorial explosion).
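The theme-extraction step can be illustrated with a small sketch. The report does not name the community detection algorithm that was used, so the code below substitutes a simple deterministic label propagation as a stand-in; the function names and the toy keyword lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_network(papers):
    """Weighted keyword network: weight = number of papers sharing both keywords."""
    weights = defaultdict(lambda: defaultdict(int))
    for keywords in papers:
        for a, b in combinations(sorted(set(keywords)), 2):
            weights[a][b] += 1
            weights[b][a] += 1
    return weights


def detect_themes(weights, max_rounds=100):
    """Stand-in community detection: deterministic label propagation.

    Each keyword repeatedly adopts the label carrying the largest total
    edge weight among its neighbours (ties go to the alphabetically
    smallest label) until no label changes.
    """
    labels = {k: k for k in weights}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(weights):
            votes = defaultdict(int)
            for nbr, w in weights[node].items():
                votes[labels[nbr]] += w
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[node]:
                labels[node], changed = best, True
        if not changed:
            break
    themes = defaultdict(list)
    for node, lab in labels.items():
        themes[lab].append(node)
    return sorted(sorted(members) for members in themes.values())


def search_terms(keywords):
    """Keyword sets of size 1 and 2, mirroring the size limit stated above."""
    keywords = sorted(keywords)
    return [(k,) for k in keywords] + list(combinations(keywords, 2))


# Hypothetical toy data: one keyword list per paper.
papers = [
    ["gene", "expression", "genome"],
    ["gene", "expression"],
    ["gene", "genome"],
    ["stem cell", "differentiation"],
    ["stem cell", "differentiation", "culture"],
    ["genome", "culture"],
]
themes = detect_themes(cooccurrence_network(papers))
print(themes)
print(search_terms(["gene", "genome"]))
```

Capping the keyword sets at size 2 keeps the number of search terms quadratic in the number of relevant keywords, which is the combinatorial concern mentioned above.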
Twitter is a microblogging service that allows users to send text messages (called tweets) of at most 140 characters. The messages also include metadata (author, entities, hashtags, etc.). Any element in the list of relevant sets of terms can be used as a search term. Our searches were limited to the year 2012. Once the sets of tweets had been preprocessed, they were passed to an automated machine-learning tool to be classified as ‘positive’, ‘negative’ or ‘neutral’ depending on their semantic background.
In our case study, Twitter users generally had a neutral opinion regarding the selected keywords in biotechnology. This is because many of the comments in the social network related to these searches either refer to the announcement of research results or are merely emotional comments. For example, in the case of Human genome or Gene & Expression, 12200 and 6900 tweets, respectively, were not classified as neutral. Moreover, there was a high rate of positive tweets regarding human genome, whereas Gene & Expression received more negative tweets than positive tweets.
From the scientometric point of view, a comparative analysis of inputs (scientific production in biotechnology in 2011) and outputs (impact in microblogging social media) yielded interesting results. During 2011, Genes was the most productive theme with more than 12000 papers published, whereas Cells was the best-known theme in the social media microblogs. Cells and DNA were the most influential themes regarding trending tweets. Cells, in particular, contained a keyword that produced many tweets in the social sphere: the keyword was Stem cells. An issue of interest is the theme UC & CD, as its average number of tweets per article was the highest in the analysis. The total number of trending tweets indicates that knowledge is well disseminated throughout the social media.
In summary, we developed a framework for linking scientific research and innovation fields with social media resources. Our approach detected themes by using well-known techniques for extracting communities from networks of keywords and characterized them by using social network analysis. We also constructed a co-citation network of articles that can be deployed over the keyword network. Themes and subsets of relevant keywords can be easily extracted to measure their impact in microblogging social media. The tool can also qualitatively measure this impact by using machine learning and text-mining tools that can classify text according to polar opinions (positive, negative, or neutral). This methodology could provide professionals with an overview to detect how users from massive social media networks have reacted (both qualitatively and quantitatively) to relevant research work in the field of biotechnology, thus allowing them to take appropriate actions.

Example 3. Monitoring Scientific Production in Regions: The Case of Biotechnology in Andalusia (Spain)
This case study presents a simple methodology to monitor lines of research in a region based on its scientific production over a period of time. These research lines were modelled as trajectories in order to be visualized as trajectory diagrams on the SISOB workbench. Firstly, highly related research topics were detected and themes produced. Next, the themes were linked based on common shared topics and trajectories were produced. Finally, the trajectories were visualised in an easy-to-use yet powerful diagram. For example, this methodology could help decision-makers to improve their decisions about how much investment these research lines should receive and where it should be invested.
This study took into account highly relevant papers published by at least one Andalusian institution in journals indexed in the category “Biotechnology and Microbiology” of the JCR/ISI from 2004 to 2010. From these data sets (one per year), a network based on the co-occurrence of keywords in papers was created for each year. Next, an algorithm to detect communities was applied to each network. These communities contained sets of highly correlated keywords forming themes. Finally, similar communities from consecutive years were linked to produce trajectories. The resulting plot of trajectories for our case study is depicted in Figure 7. Time is represented on the x-axis and the themes detected per year are represented as rectangles. The length of each rectangle represents the number of papers related to the theme (i.e. scientific production) in the corresponding year. The visual metaphor shows how themes develop over time.
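The linking step can be sketched as follows. The report does not state which similarity measure or threshold was used, so the Jaccard index over keyword sets and the threshold of 0.3 are assumptions made for illustration only, as are the function names and the toy themes.

```python
def jaccard(a, b):
    """Share of keywords that two themes have in common."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def build_trajectories(themes_by_year, threshold=0.3):
    """Chain each theme to its most similar theme of the following year.

    themes_by_year maps a year to a list of keyword sets. A link is kept
    only if the similarity reaches the (assumed) threshold. A trajectory
    is a list of (year, theme_index) pairs.
    """
    years = sorted(themes_by_year)
    successor, has_predecessor = {}, set()
    for year, next_year in zip(years, years[1:]):
        for i, theme in enumerate(themes_by_year[year]):
            scored = [(jaccard(theme, nxt), -j)
                      for j, nxt in enumerate(themes_by_year[next_year])]
            if not scored:
                continue
            sim, neg_j = max(scored)  # ties go to the lower index
            if sim >= threshold:
                successor[(year, i)] = (next_year, -neg_j)
                has_predecessor.add((next_year, -neg_j))
    trajectories = []
    for year in years:
        for i in range(len(themes_by_year[year])):
            node = (year, i)
            if node in has_predecessor:
                continue  # already part of a trajectory that started earlier
            path = [node]
            while path[-1] in successor:
                path.append(successor[path[-1]])
            trajectories.append(path)
    return trajectories


# Hypothetical toy data: themes detected per year.
themes_by_year = {
    2004: [{"growth", "bacteria", "models"}, {"degradation", "soil"}],
    2005: [{"growth", "bacteria", "escherichia coli"}, {"enzymes", "yeast"}],
    2006: [{"bacteria", "escherichia coli", "growth", "models"}],
}
for trajectory in build_trajectories(themes_by_year):
    print(trajectory)
```

Each resulting chain corresponds to one row of rectangles in a trajectory diagram; the rectangle lengths would come from the paper counts per theme, which this sketch does not model.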

This case study shows how analysts can extract deep insights hidden in data on scientific publications. For example, it could help decision-makers in the task of managing resources according to specific trends in scientific productivity in a region. In the case of Andalusia covered in this section, they could decide to increase investment in topics related to “Growth”, “Models”, “Escherichia Coli”, “Degradation”, and “Bacteria”.

Example 4. Trajectories and Role Patterns
This study analysed the characteristic behaviour of scientists in relation to knowledge evolution (role patterns) and the lifecycle of research topics (geographic analysis). The data used were stored in trajectories whose entries were built from CVN-Objects (GateDataExtractor) and a publication database. Three role patterns were identified based on the patterns described by Braun et al. (2001): a Newcomer has not published before the given year, but continues to publish after it; a Terminator has published before, but does not publish in the year following the given year; a Continuant is a scientist who has published before and after the given year. We introduced the following additional role patterns of interest: as an extension of the Newcomer pattern, the Trendsetter is a Newcomer to a new topic; the Diversifier is a scientist who works on a large number of topics over a short period of time; and the Roamer is a scientist who often changes institutes, that is, one with high geographic mobility.
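The three basic role patterns can be expressed as a small classifier. This is a sketch under stated assumptions: it only assigns a role to scientists who published in the given year, it checks the Terminator condition before the Continuant condition, and the function names are hypothetical. The count of institute changes is a simple proxy for the Roamer pattern; the report gives no threshold for "often".

```python
def role_in_year(publication_years, year):
    """Classify a scientist's role in a given year from their publication years."""
    years = set(publication_years)
    if year not in years:
        return None  # assumption: roles apply only to scientists active that year
    before = any(y < year for y in years)
    after = any(y > year for y in years)
    if not before and after:
        return "Newcomer"
    if before and (year + 1) not in years:
        return "Terminator"  # assumption: checked before Continuant
    if before and after:
        return "Continuant"
    return None  # published only in the given year


def institute_changes(affiliations):
    """Count institute changes in a list of (year, institute) records."""
    ordered = [institute for _, institute in sorted(affiliations)]
    return sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)


# Hypothetical toy data.
print(role_in_year([2005, 2006, 2007], 2005))
print(role_in_year([2003, 2004, 2005], 2005))
print(role_in_year([2003, 2005, 2006], 2005))
print(institute_changes([(2001, "UMA"), (2003, "UDE"), (2004, "UDE"), (2006, "UMA")]))
```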

To show the origin, geographical distribution, and movement of research topics, we built Institute-Institute networks based on single common topics and visualized them on a Google Maps geographic map.

Example 5. Biotechnology
This case study focussed on the excellence cluster REBIRTH of the University of Hannover. The aim was again to compare the topics within the press releases and the publication abstracts of the excellence cluster. We used Network Text Analysis and built Knowledge x Knowledge networks. We used the Betweenness centrality measure as an indicator for the most central topics/words. The results showed that the topics mentioned within the different spaces (scientific vs public) were quite different. For example, whereas the word “mice” (related to animal testing) was strongly represented within the scientific space in the publication abstracts, the word never appeared within the social space, that is, within the press releases. On the other hand, topics related to children's hearts could only be found within the social space.

Example 6. Main Path Analysis
Main Path Analysis (Hummon & Doreian 1989) is a network analysis technique for the scientometric study of scientific citations over a period of time. Its major application is the identification of key publications in the evolution of a scientific field taking into account the inherent temporal structure of the developmental process. Figure 13 shows an example of a Main Path Analysis outcome.

The larger main component displayed as a line starting at the lower left and ending at the upper right corresponds to the trend already identified for the CSCL (Computer Supported Collaborative Learning) community. It refers to the support and analysis of interaction and communication within computer-supported learning scenarios. The second smaller component starting at the middle left and ending at the lower right shows the second major area in the field of AIED (Artificial Intelligence in EDucation), which contains papers on intelligent tutoring systems that can be seen as a more traditional research domain within this area.

In general, the study shows that Main Path Analysis can definitely show trends at a level that is sufficiently abstract to be able to contrast and compare different communities.
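To make the technique more concrete, the sketch below computes one common variant of it. It weights each citation link by its search path count (SPC), the number of source-to-sink paths that pass through the link, and then selects the source-to-sink path with the highest total weight. The report does not say which weighting or search strategy was used in SiSOB, so both choices are assumptions, and the function name and toy citation data are hypothetical.

```python
from collections import defaultdict


def main_path(citations):
    """Global main path of an acyclic citation network.

    citations: iterable of (cited, citing) pairs, so that links point in
    the direction of knowledge flow. Returns (path, total weight, weights).
    """
    edges = sorted(set(citations))
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for cited, citing in edges:
        succ[cited].append(citing)
        pred[citing].append(cited)
        nodes.update((cited, citing))

    # Kahn's algorithm: order papers so that cited papers precede citing ones.
    indegree = {n: len(pred[n]) for n in nodes}
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for nxt in succ[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("citation network contains a cycle")

    # Number of paths reaching each paper from a source / leaving it to a sink.
    paths_in, paths_out = {}, {}
    for n in order:
        paths_in[n] = sum(paths_in[p] for p in pred[n]) or 1
    for n in reversed(order):
        paths_out[n] = sum(paths_out[s] for s in succ[n]) or 1
    spc = {(u, v): paths_in[u] * paths_out[v] for u, v in edges}

    # Dynamic programming: best-scoring continuation from every paper.
    best = {}
    for n in reversed(order):
        options = [(spc[(n, s)] + best[s][0], [n] + best[s][1]) for s in succ[n]]
        best[n] = max(options) if options else (0, [n])
    sources = [n for n in nodes if not pred[n]]
    score, path = max(best[s] for s in sources)
    return path, score, spc


# Hypothetical toy data: (cited paper, citing paper).
citations = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"),
             ("C", "E"), ("D", "F"), ("E", "F")]
path, score, spc = main_path(citations)
print(path, score)
```

Because the weights count whole paths rather than single citations, a paper lies on the main path when much of the later literature can be traced back through it, which is why the method surfaces key publications in the development of a field.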


3.3. Case Study 3. Peer Review

Research evaluation plays an essential role in determining the allocation of resources among researchers, institutions, and countries. It follows that efficient and fair evaluation is a precondition for effective resource allocation. However, the only way in which fairness can be demonstrated is through objective measurement. Thus, we developed a methodology for testing the fairness of evaluation processes, and pilot tested the methodology by comparing the enhanced peer review process adopted by Frontiers (a new open access publisher) and the classic process adopted in a series of computer science conferences that were supported by the WebConf conference management system. A prototype implementation of the methodology in an online workbench allows our methods to be applied to other peer review systems.
The main data sources of this case study were the Frontiers publication system and WebConf (a conference system from the University of Malaga). The other sources were co-authorship data for other publications by authors and reviewers cited in the two primary databases (data mined from the ISI Web of Knowledge, Google Scholar, SCOPUS, and DBLP).

- Frontiers. The Frontiers database includes details of all scientific papers submitted to Frontiers between June 25, 2007 and March 19, 2012. We used 4,550 papers in the final analysis. The majority of the papers in the database come from the life sciences and the majority of authors and reviewers come from Western Europe and North America. However, the database contains a substantial number of authors and reviewers from other parts of the world.

- WebConf. The WebConf database includes details of contributions submitted to seven computer science conferences held between 2002 and 2011. Three of the conferences were mainly attended by authors and reviewers from Spain or from Spain and Portugal. The other conferences in the dataset were international conferences attended by authors and researchers from a variety of countries. We recognize that this lack of homogeneity may have introduced bias into the sample.
There was considerable variation between the review process used by Frontiers and those used by the computer science conferences. The former uses an “enhanced” interactive process, whereas the latter use classic review procedures. Our methodology made it possible to compare the two systems.

All data were pre-processed prior to subsequent analysis. The pre-processing step involved the de-duplication and disambiguation of author and reviewer names, gender assignment, the disambiguation of institution names, automated assignment of countries and geographical regions to researchers and reviewers, automated assignment of rankings to universities and institutions, and anonymization of the data.

A first study investigated potential bias in the peer review process related to the gender of authors and reviewers and to the characteristics of their respective institutions (geographical region, main language, prestige). We found that in both systems the main factors affecting review scores were author gender, author region, author language, and the prestige of the author’s institution. Reviewers gave similar scores to papers, regardless of the gender of the authors and the characteristics of the institutions to which they belonged. However, there were significant differences between the scores given by reviewers from different regions. Statistical analysis of the interaction between author and reviewer gender, language, and university ranking showed no significant effects. However, the Frontiers and the WebConf peer review systems both showed significant biases for and against contributions from certain parts of the world. Our findings were supported by large data samples and were independent of modelling details. The successful detection of regional bias demonstrates the potential of our methodology.
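As an illustration of how a difference in review scores between two groups of authors can be tested, the sketch below runs an exact permutation test on the difference in mean scores. The report does not specify which statistical models were used, so this test is a stand-in chosen for its simplicity; the function name and the toy scores are hypothetical, and the exhaustive enumeration is only practical for small samples.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_test(scores_a, scores_b):
    """Two-sided exact permutation test on the difference in mean scores.

    Returns (observed difference, p-value). The p-value is the share of
    all possible regroupings of the pooled scores whose absolute mean
    difference is at least as large as the observed one.
    """
    pooled = list(scores_a) + list(scores_b)
    observed = mean(scores_a) - mean(scores_b)
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(scores_a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        diff = mean(group_a) - mean(group_b)
        total += 1
        # small tolerance guards against floating-point rounding
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / total


# Hypothetical toy data: review scores for papers from two author regions.
region_a = [4, 5, 5, 4, 5]
region_b = [2, 3, 2, 3, 3]
difference, p_value = exact_permutation_test(region_a, region_b)
print(difference, p_value)
```

A small p-value indicates that a score gap of this size would rarely arise if region labels were assigned at random. It does not by itself show bias, since the groups may differ in other respects, which is why the study also examined interactions between author and reviewer characteristics.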

A second study investigated potential bias due to “social network effects”, that is, effects related to the authors’ prestige (centrality) and their positions in co-authoring networks. We found a number of differences between the datasets. Author centrality had no influence on review scores in the Frontiers system, whereas there was a small but significant correlation between the two in the WebConf system. This is not necessarily a sign of bias: it is plausible that authors with high centrality will produce better papers than authors with low centrality. We found no relationship between author-reviewer distance and review results in either dataset. This suggests an absence of favouritism. In both systems, reviewers belonging to the same sub-community as an author gave higher review scores than reviewers belonging to different sub-communities. This suggests that in a fair review system the reviewers assigned to a paper should come both from the authors’ own communities and from outside these communities.

Potential Impact:
The main tangible output of the SiSOB project is a set of open-source tools and a general methodology for data-driven studies of mobility, knowledge sharing, and peer review based on models of actors related by artefacts, results, and relationships.

The results of the case studies and the tools and methods developed by the project are of potential interest to the following agents:

• Policy makers: Individuals with the power to influence or determine policies and practices at an international, national, regional, or local level. Policy makers can promote or restrict actions that have an influence on research itself or on society in general.
• Scientists and researchers: Individuals responsible for producing specific research outputs, such as papers, conference presentations, and patents, and for reviewing work by other scientists and researchers. In particular, computer scientists can obtain and, if needed, adapt the SiSOB software to new sources of data, functionalities of the workbench, or new analysis methods.
• Journalists: Media workers who report the work of scientists and researchers to “consumers” who do not/cannot access the primary sources.

IMPLICATIONS AND FUTURE WORK

The goal of the SiSOB project was to provide policy makers and their advisors with new tools to help them in the ex ante and ex post analysis of science policy decisions. The majority of studies to date have focused on the impact of networks of academic, scientific, and industrial actors that produce scientific knowledge and artefacts (science production systems). The SiSOB project extended these analyses to the broader science-society system, which not only influences the decisions of science production systems and consumes their products, but provides the socio-political context in which the science production system operates.

• The Researcher Mobility case study provides original evidence on mobility and challenges the commonly accepted policy view that mobility is beneficial and should be encouraged. Our results point to a complex interaction between mobility and productivity, which only in certain circumstances may result in mobility having a positive impact on productivity. Mobility is far from always being beneficial for individual researchers. In fact, mobility is associated with a short-term decrease in performance due to adjustment costs, and mobility to a lower-ranked department seems to result in decreased mid-term performance. Further research on the specificities of mobility is needed to assess the impact of mobility and to inform policies on alternative forms of mobility, especially in Europe. These specifics include mobility associated with career progress, mobility to and from business, mobility to a foreign country, and the career period in which the mobility occurs. If our results are confirmed by future work, this would call for a rethinking of policies related to researcher mobility.

• The studies developed in the Knowledge Sharing case study show the great potential of the methods explored in the studies and the workflows implemented in the workbench. One of the strengths of our approach to setting up cases as workflows in the workbench is the degree of freedom provided by such a workbench-based approach. Workflows, data sets (e.g. if an open-data policy exists), and results can all be shared between scientists and policy makers.

• The Peer Review case study developed a methodology for testing the fairness of evaluation processes and conducted a pilot test that compared the enhanced peer review process adopted by Frontiers (a new open access publisher) and the classic process adopted in a series of computer science conferences. To date, we have tested our methodology using just two peer review systems applied to scientific papers. An important goal of our future work will be to extend our methods to cover other forms of bias not investigated in this study and to replicate the study on other peer review systems. Frontiers has recently created a new team tasked with building on the work performed in the SiSOB project and applying the methods and results to the business aims of Frontiers.
• Although the SiSOB project is over, the analysis capabilities of the workbench will continue to be developed and enhanced. The intention of the partners is to publish the workbench as open-source software and open up the system to researchers and developers who were not involved in the project. A number of outside researchers have already tested the SiSOB data extractor and have expressed their intention to continue using the tool in the future. Some of the external experts involved in the evaluation workshop have expressed their interest in using the SiSOB system, in adding to it, and in introducing the tool to their respective communities.

CONCLUSIONS
The SiSOB project successfully completed its work on time and within its budget. All the partners participated intensively in the project, which was conducted in a highly collaborative and productive atmosphere.

The most important feature of the project was probably its iterative design process. The case studies provided requirements to the software developers, who in turn provided tools and workflows of immediate value to the researchers engaged in the case studies. The SiSOB workbench will soon be released as open-source software and is one of the project's most important outputs. It can be used by researchers outside the project to replicate the SiSOB case studies using different data sources. The SiSOB Data Extractor is already in use by groups from outside the SiSOB Consortium. It is noteworthy that the experts involved in the final SiSOB evaluation workshop showed strong interest in using our tools and methods in their own work.

Other important outputs were provided by the case studies, and several of the conclusions go against conventional wisdom. In particular, the results of the researcher mobility case study suggest that the effects of mobility may be less beneficial to the careers of researchers than is often believed. Similarly, the peer review case study suggests that some classic forms of bias in peer review (gender bias, country bias, language bias) may be less important than other studies have suggested. These findings are evidence of the potential of SiSOB tools and methodology.

The project has disseminated its results through many different channels, including social media, scientific publications, a very successful evaluation workshop, and an "informal" review meeting attended by independent external reviewers and the Project Officer. The early take-up of the project's software is encouraging evidence of the effectiveness of these efforts.

Of course, many issues remain open. The experience of the partners during the project confirms that it is extremely difficult to measure and predict the social impact of science. Nevertheless, it is feasible to measure and characterize the social dimensions of modern knowledge-production systems and to use the results to take better decisions. The SiSOB project shows how this can be achieved, although this potential should not be taken for granted. Some of the greatest difficulties experienced by the SiSOB partners concerned access to data, much of which is monopolized by commercial publishers whose license conditions do not allow automated data mining, even by researchers who are licensed to manually access the data. Without data, evidence-based policy-making remains a mirage. The SiSOB partners strongly support policy and regulatory initiatives designed to support "open data". We believe that with easier access to data, an ever more important role in policy-making will be played by methods and tools such as those developed in the SiSOB project.

REFERENCES

Braun, T., Glänzel, W., Schubert, A., 2001. Publication and cooperation patterns of the authors of neuroscience journals. Scientometrics 51, 499-510.

Fernandez-Zubieta, A., Geuna, A., Lawson, C., 2013. Researcher’s mobility and its impact on scientific productivity. LEI & BRICK Working Paper 6/2013, Turin.

Hummon, N.P., Doreian, P., 1989. Connectivity in a citation network: The development of DNA theory. Social Networks 11, 39-63.

Leydesdorff, L., Hellsten, I., 2006. Measuring the meaning of words in contexts: An automated analysis of controversies about ‘Monarch butterflies’, ‘Frankenfoods’, and ‘Stem cells’. Scientometrics 67, 231-258.

List of Websites:
Website http://sisob.lcc.uma.es

The SiSOB project made heavy use of a social media strategy to disseminate the project. All the partners and people interested in the SiSOB project actively and continuously participated in this effort, which was coordinated by the University of Malaga. As shown in Figure 13, the strategy was based on weekly posts on the project blog. The posts were then distributed over several social networks, social websites, and social media websites. The next step was to collect the feedback from all these social media and finally to publish the main results of the project and the opinions obtained from the social media in a journal which we edited.

Contact details

- UMA (Coordinator)
University of Malaga: Beatriz Barros (bbarros(at)lcc.uma.es)
- CICE
Consejería de Economía, Innovación, Ciencia y Empleo: Francisco Triguero (sguit.cice(at)juntadeandalucia.es)
- UDE
University of Duisburg-Essen: Ulrich Hoppe (hoppe(at)collide.info)
- MTA KSZI
Institute for Research Organization, Hungarian Academy of Sciences: Sandor Soos (soossand(at)caesar.elte.hu)
- FrontiersIn
Frontiers Research Foundation: Richard Walker (richard.walker(at)epfl.ch)
- FR
Fondazione Rosselli: Aldo Geuna (Aldo.Geuna(at)unito.it)
- RICYT
Red de Indicadores de Ciencia y Tecnología: Rodolfo Barrere (rbarrere(at)ricyt.org)
