
Elites, networks, and power in modern urban China (1830-1949).

Periodic Reporting for period 1 - ENPMUC (Elites, networks, and power in modern urban China (1830-1949).)

Reporting period: 2018-09-01 to 2020-02-29

The project examines the transformation of elites in China over a century (1830-1949), an era of tremendous change. It intends to break through the existing limits on access to historical information embedded in complex sources of different kinds, which now form massive digital corpora. Understanding this historical process will provide keys for a significant revision of our understanding of what has shaped modern China up to the present. Our vision of modern China has been, and still is, tainted by the revolutionary experience and the interpretations that resulted from it. The approach we pursue to produce scalable data-rich history will serve to collect and deliver historical information at an unprecedented scale, to reshape the analysis of existing sources, and to develop the tools and techniques for the exploration and exploitation of massive historical corpora. The key objectives of the project include:
Analyzing urban elites in modern China at the level of actors rather than state institutions or community organizations.
Analyzing the vectors, patterns, and timelines of the involvement of elites in public action.
Investigating the process of transnationalization of urban elites.
Demonstrating the capacity to radically change the scale and quality of historical information.
Establishing long-term digital historical resources in the form of innovative databases for extended research by the broader scholarly community.
The implementation of the project in the last 18 months followed several interconnected tracks:
Acquisition of the first major corpora in English and Chinese (newspapers, but also directories, dictionaries, etc.) in digital format and creation of the documentary infrastructure to support access to, processing of, and preservation of these corpora (ExHist Database).
Recruitment of postdoctoral researchers (history, NLP) and specialists (data science, GIS) to constitute the core team of the project, around which a large circle of scholars in history, computing, and linguistics is collaborating.
Evaluation and selection of the instruments and methods to be applied to the corpora, and integration of these instruments and methods into a coherent infrastructure with well-defined workflows.
Training sessions in various advanced techniques (data visualization, R language, MCA, NLP) for the scholars involved in the project.
Creation of the beta version of our two major databases: Modern China Biographical Database and Modern China Geospatial Database, with a preliminary public interface (on-going and not made public yet).
Opening of a research blog, which we deemed sufficient in the initial phase of the project to publicize our actions and accomplishments (1,800 unique visitors and 2,556 visits in February 2020). We have just started designing a prototype for the web portal that will integrate all our instruments and resources.
Organization of two international workshops, one on biographical databases, one on elites and networks in China. The former served to bring together top-rate experts to discuss and assess existing biographical databases. The latter brought together historians of China on the specific theme of elites, knowledge, and power. An edited volume is currently being prepared by the P.I.
Participation in various conferences (Biographical Data in a Digital World 2019, DADH 2019, AAS 2020 [panel accepted, but conference cancelled due to coronavirus], EHSSC 2020 [postponed to 2021 for the same reason]) and seminars (Bristol University, Aix-Marseille University, Naples University [postponed to autumn 2020]).
At this stage (M18), it is too early to claim progress beyond the state of the art. Yet significant advances were made in several directions:
Collection of biographical data on 90,000 individuals, both Chinese and non-Chinese, under their various denominations (1 to 9 names), ready for inclusion in the Biographical database. This is a key element for the identification of historical figures in historical sources across languages.
Collection of all the biographical pages for Chinese individuals in the Chinese and English Wikipedia, to be processed for data extraction. This is a major methodological breakthrough at the level of both collection and data extraction.
Indexing of all the English and Chinese corpora (newspapers) to enable the identification and extraction of all named entities (persons, institutions, locations, etc.) at the level of articles, and initial construction of a graph tool for the exploration of data. For Chinese newspapers, the ENPMUC research team is the only one worldwide to have gained access to these resources with the capability to process them with advanced data mining techniques.
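To illustrate why recording each individual under several denominations matters for identification across languages, here is a minimal sketch of an alias index in Python. It does not describe the project's actual database or tooling: the `Person` record, the `P0001` identifier, the `normalize` rule, and the sample entry are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Person:
    """Hypothetical record: one individual with all known name variants."""
    person_id: str
    names: list = field(default_factory=list)


def normalize(name):
    """Assumed normalization: case-insensitive, collapsed whitespace."""
    return " ".join(name.casefold().split())


def build_alias_index(people):
    """Map every normalized name variant to the ids of people who bear it."""
    index = defaultdict(set)
    for person in people:
        for name in person.names:
            index[normalize(name)].add(person.person_id)
    return index


def lookup(index, name):
    """Return the ids matching a name as it appears in a source (may be empty)."""
    return index.get(normalize(name), set())


# Illustrative entry only, not taken from the project's data.
people = [
    Person("P0001", ["Sun Yat-sen", "Sun Wen", "Sun Zhongshan", "孫文", "孫中山"]),
]
index = build_alias_index(people)

print(lookup(index, "sun  WEN"))
print(lookup(index, "孫中山"))
print(lookup(index, "Unknown Name"))
```

A name found in an English newspaper and another found in a Chinese one resolve to the same identifier as long as both variants were recorded for that person. A set of identifiers is returned rather than a single one because distinct individuals may share a name, and such ambiguity would need to be resolved with additional context.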