Towards a theory of mathematical signs based on the automatic treatment of mathematical corpora

Periodic Reporting for period 1 - SemioMaths (Towards a theory of mathematical signs based on the automatic treatment of mathematical corpora)

Reporting period: 2019-10-01 to 2021-09-30

The philosophy and history of mathematics has recently moved towards understanding mathematical knowledge as a product of human practices rather than of purely abstract logic. This shift has made the analysis of mathematical language increasingly important. However, the available tools for analyzing mathematical texts lack a systematic methodology, largely because no solid connection exists between contemporary theories of language and the philosophy and history of mathematics. The SemioMaths project aims to fill this gap by introducing semiological insights and methods hitherto unexplored in the field, while establishing solid connections between the two domains. In particular, the project aims to propose a formal model for the automatic analysis of mathematical corpora, based on semiological theories and recent advances in formal and computational linguistics, that can provide relevant insights into the relation between mathematical knowledge and its historically determined textual expressions. Ultimately, the project seeks to improve our comprehension of mathematics as a human practice and of its role in society, including theoretical and practical concerns in mathematics education, scientific communication, and the broader social and cultural contexts in which mathematical knowledge is produced and used. In this way, the SemioMaths project intends to strengthen the connections between the formal and natural sciences and the humanities.
From a scientific perspective, the work performed over the course of the project focused on two main aspects of a formal analysis of language crucial for the analysis of mathematical textuality: segmentation, i.e. the identification of relevant textual units; and typing, i.e. the organization of such units into mutually dependent classes. The case of elementary arithmetic was chosen as the privileged case study for this approach. This work resulted in several scientific publications and the development of “semiolog”, a Python software package, released in alpha version for the use of the scientific community. The main insight provided concerns the relation between those two aspects of linguistic analysis, often treated independently in the current orientations of the field, and the proposition of a distributional approach to certain aspects of mathematical knowledge. Moreover, the conceptual consequences of this novel perspective offered a ground to carry out a critical approach to recent applications of artificial intelligence to the analysis of language, motivating a series of publications within the framework of this project. This analytical and critical work was also accompanied by a historical account of certain aspects of the problems addressed, motivating one scientific publication and informing the teaching activity associated with this project. A seminar was organized in the early stages of the project, as well as an international workshop near its conclusion. The results were disseminated in several academic events (seminars, workshops, conferences) and communicated to a broader audience in lectures, public debates, tutorials, and publications. Finally, the project motivated several international scientific collaborations.
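The two notions named above, segmentation and typing, can be illustrated with a toy sketch on elementary arithmetic. This is not the method or the API of the "semiolog" package, which is not described here; the function names `segment` and `type_units`, the boundary markers `^` and `$`, and the three-expression corpus are all assumptions made for illustration. Segmentation is reduced to splitting an expression into digit runs and single symbols, and typing to a crude distributional criterion: two units fall into the same class when they occur at least once between the same left and right neighbours.

```python
from collections import defaultdict


def segment(expression):
    """Split an arithmetic string into units: digit runs and single symbols."""
    units, current = [], ""
    for ch in expression:
        if ch.isdigit():
            current += ch              # extend the current digit run
        else:
            if current:
                units.append(current)  # close the digit run
                current = ""
            if not ch.isspace():
                units.append(ch)       # every other symbol is its own unit
    if current:
        units.append(current)
    return units


def type_units(corpus):
    """Group units into classes by shared (left, right) neighbour contexts."""
    by_context = defaultdict(set)
    for expression in corpus:
        # "^" and "$" are hypothetical markers for the expression boundaries.
        units = ["^"] + segment(expression) + ["$"]
        for left, unit, right in zip(units, units[1:], units[2:]):
            by_context[(left, right)].add(unit)

    parent = {}

    def find(u):
        # Union-find lookup with path compression.
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # Units observed in the same context are merged into one class.
    for units in by_context.values():
        first, *rest = sorted(units)
        find(first)
        for other in rest:
            parent[find(other)] = find(first)

    classes = defaultdict(set)
    for u in parent:
        classes[find(u)].add(u)
    return sorted(sorted(c) for c in classes.values())


corpus = ["1+2=3", "1*2=2", "3+2=5"]  # toy corpus, not project data
classes = type_units(corpus)
print(classes)
```

On this small corpus the numerals, the two operators, and the equals sign end up in three separate classes, purely from where they occur. The sketch also hints at why the two steps depend on each other: the classes obtained depend on how the text was cut into units in the first place.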
The SemioMaths project has introduced a novel perspective on the formal analysis of language in the context of mathematical textuality, by bringing together techniques from computational linguistics, natural language processing, and distributional approaches to language. The semi-empirical setting proposed through the computational treatment of corpora is a distinctive contribution to the field of philosophy and history of mathematics. Fostering the collaborative development of computational tools and methods in this field could have a significant impact, allowing for the development of more rigorous, systemic, and reliable theoretical frameworks. This has the potential to shape both scholarly and public understanding of mathematical practice and its history. The connection established between scientific communities from different horizons should not only encourage the adoption of novel methods in the field, but also motivate, in return, informed critical assessments of current trends in computer science, and AI in particular, where the societal stakes are higher than ever. The expected refinement of the proposed tools can contribute to making state-of-the-art black-box models more interpretable, at a time when such models are being deployed with practically no control over their possible societal consequences. Moreover, by providing a means of examining the underlying structures of mathematical discourse, the project creates opportunities to investigate the implications of linguistic and non-linguistic factors for the development of mathematical knowledge and to address a variety of socio-economic issues related to mathematics education and scientific communication. More generally, the successful conclusion of the project will contribute to reinforcing the place of the humanities in their relation to the formal and natural sciences.