Medical translation in the history of modern genomics

Periodic Reporting for period 4 - TRANSGENE (Medical translation in the history of modern genomics)

Reporting period: 2021-04-01 to 2022-03-31

This project explored the history of genomic science across three different species: the baker’s yeast S. cerevisiae, the pig S. scrofa and H. sapiens. Each species featured a concerted project to determine the sequence of its DNA molecule – its genome – at consecutive, yet overlapping time periods: 1989 to 1996 in yeast; 1990 to 2003 in human, and 2006 to 2012 in pig. By tackling the antecedents, development and consequences of these projects, as well as their interactions and surrounding environments, we uncovered a variety of practices and modes of organisation, all of them constitutive of the scientific field of genomics.

Uncovering the diversity of genomics is important because the only widely known episode of its history is the Human Genome Project. Existing and amply publicised accounts of this project have mobilised a success narrative of rapid compilation and unrestricted release of the DNA sequence data characteristic of the human genome. Yet according to these same accounts, there has been an ongoing translational gap between this large amount of publicly available data, and the medical and scientific goals to be fostered by the human genome sequence.

Existing evidence suggested that the narrative of the Human Genome Project sidelined several aspects of human genomics, as well as genomics research conducted in other species. Our project reconstructed these overlooked lineages in the history of genomics through a mixed-methods approach that used quantitative data as an input for historical research. Our quantitative data comprised details of almost 13.5 million yeast, human and pig DNA sequences submitted to public databases and over 29,000 articles describing those sequences in the scientific literature. By interpreting this dataset alongside qualitative evidence, we drew the following conclusions:

1) Publicly available datasets enable historians to portray genomics beyond well-known episodes and amply publicised accounts, such as those gravitating around the Human Genome Project.

2) The comprehensiveness of those datasets – including all the results of genomics research and not only those arising from well-funded and high-profile initiatives – represents an opportunity to retrieve actors and institutions that are largely forgotten today but were important in making the history of this field.

3) The ways these forgotten participants practiced and organised genomics show that the notion of a gap between DNA sequence data and practical research goals is only part of the history of this field: in a substantial share of the evidence we interpreted, the production of DNA sequences was inseparably entangled with the use of the resulting data for medical, agricultural or industrial research.

4) This entanglement between sequence production and use suggests that the affordances and limitations of a given genome depend on the communities that produced the underlying DNA data, and in particular on the necessities and motivations of the actors and institutions that make up these communities, which differ within and across species.

5) Our method of exploiting the historical potential of large datasets can be exported to other scientific fields or areas of human activity.
Since the start of the project in October 2016, ten articles have been published in top journals in the fields of History and Philosophy of Science (HPS) and Science and Technology Studies (STS), and three more are currently undergoing peer review. Five of the published articles are part of a special issue of Historical Studies in the Natural Sciences in which we describe our mixed-methods approach. We have also completed a monograph entitled A History of Genomics Across Species, Communities and Projects, scheduled to be published in 2022 by Palgrave Macmillan. All these publications are available as either gold or green open access.

The project has also produced a large dataset encompassing the DNA sequence submissions and publications we used as quantitative evidence for our research. The dataset is available without restrictions at the University of Edinburgh repository, and a peer-reviewed note describing its contents, structure and underlying data collection process has been published on the F1000 Research open access platform. The results of the project have also been presented at over 40 events, including top international academic conferences (History of Science Society and Society for Social Studies of Science), invited workshops convened by leading institutions in our field (such as the London School of Economics, which led the ERC-funded grant Narrative Science) and scientific or industrial gatherings (such as the UK Pig Breeders’ Roundtable and a meeting at the US National Human Genome Research Institute).

The project has also attracted the attention of institutions working on ethical aspects of life sciences research, as reflected by an invitation to contribute to a horizon-scanning workshop on food sustainability organised by the Nuffield Council on Bioethics. We have also participated in an Advanced Training Workshop of the Scottish Graduate School of Social Science addressing the challenge of designing mixed methods approaches and disseminated our conclusions among lay audiences at the Edinburgh Fringe Festival.

Toward the end of the project, we organised an international workshop on Trajectories of Big Data Platforms. The meeting addressed possible synergies between sociological, historical and philosophical studies of large data repositories in different domains including genomics, medicine, plant science, insurance, technological companies and the platform economy. It was conceived as a testbed to explore the applicability of our theoretical and methodological approach to other big data endeavours. Its proceedings are available open access at
The two most important deliverables of the project have been the research monograph and the special issue of Historical Studies in the Natural Sciences. The special issue proposes a novel methodology for conducting historical research with large datasets via the quantitative, visual and qualitative analysis of co-authorship networks. Through five interconnected articles, it also reflects on the experience and challenges of conducting history of science research in a large, interdisciplinary team.

The monograph uses the knowledge derived from our project to expand the historiographical boundaries of genomics. Entitled A History of Genomics Across Species, Communities and Projects, it proposes the new term ‘genomicists’ to highlight the diversity of communities involved in genomic science and their agency as historical actors. It constitutes the first comprehensive historical account of genomics across four decades (1980s to 2010s) and three different species: yeast, pig and Homo sapiens. It shows that the history of genomics is broader than the development of the Human Genome Project, and that what a genome is and does is inextricably entangled with the trajectories of the communities that produced it.