The work within the project has focused on the following three themes:
Infrastructure for quantitative analyses and data collection
We collected saliva samples from 200 individuals, speakers of several indigenous languages in the Ucayali-Urubamba region. These samples have been processed for whole-genome sequencing. We furthermore collected cultural data on ca. 63 societies and grammatical data on ca. 100 languages.
Whereas the databases for genetics and biogeography follow established designs, the database designs for cultural and linguistic data were developed specifically for the project. They are based on low-level, fine-grained variables, which in turn are grouped into higher-level classifications. This design allows us to assess signals in the data at different levels of granularity.
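As a minimal sketch of such a two-level design, fine-grained variables can be mapped to higher-level classes so that the same records can be summarised at either level. The variable names, class names and values below are hypothetical placeholders, not the project's actual coding scheme.

```python
# Hypothetical mapping from fine-grained variables to higher-level classes.
VARIABLE_GROUPS = {
    "A1": "class_A",
    "A2": "class_A",
    "B1": "class_B",
    "B2": "class_B",
}


def summarise(record, groups=VARIABLE_GROUPS):
    """Return, per higher-level class, the share of coded variables that are present.

    `record` maps a variable name to 1 (present), 0 (absent) or None (not coded).
    """
    totals, present = {}, {}
    for variable, value in record.items():
        if value is None:
            continue  # uncoded variables do not count towards the share
        group = groups[variable]
        totals[group] = totals.get(group, 0) + 1
        present[group] = present.get(group, 0) + value
    return {group: present[group] / totals[group] for group in totals}


# Toy record for one society (illustrative values only).
society = {"A1": 1, "A2": 0, "B1": 1, "B2": None}
summary = summarise(society)
print(summary)
```

Keeping the raw variables and deriving the higher-level summary on demand means the grouping can be revised later without recoding the underlying data.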
To streamline the workflow, the project has developed an R package called glottospace (https://CRAN.R-project.org/package=glottospace). This package contains several analytical tools and visualization techniques that allow for the spatial analysis of linguistic and cultural data, based on distance measures.
In collaboration with a number of colleagues, we are furthermore developing a consensus map of language locations based on previously published maps, which we have digitized, georeferenced and annotated. This allows language locations to be represented as polygons, which reflects reality more accurately. The polygon data will also be integrated into the glottospace package. An accompanying paper is currently under review.
Approach and method development
No established framework exists for the analysis of multidisciplinary data to reconstruct the past. This part of the project has therefore focused on developing a framework for combining signals from different disciplines to reconstruct population history, in particular historical contact scenarios. The framework relies heavily on both new and established quantitative methods, but firmly embeds the signals that result from the quantitative analysis in the existing ethnohistorical literature. An important methodological tool is a set of distance measures of several types (between societies, between languages), which allows direct comparisons between the disciplines and the identification of (mis)matches. This part of the approach has also been included in the aforementioned glottospace package. Another recurring element in our approach is to test hypotheses proposed in the literature, which imply predictions about the patterns we would expect to find in the different disciplines.
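The idea of comparing disciplines through distances can be illustrated with a minimal sketch. It assumes that each group has one profile of categorical features per discipline, uses a simple mismatch proportion as the distance, and correlates the two sets of pairwise distances. The group labels, feature values and the choice of measure are illustrative assumptions, not the measures implemented in glottospace or the project's data.

```python
from itertools import combinations
from math import sqrt


def mismatch_distance(a, b):
    """Share of features, coded in both profiles, on which the profiles differ."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no features coded in both profiles")
    return sum(x != y for x, y in shared) / len(shared)


def pairwise(profiles):
    """Distance for every unordered pair of groups, keyed by the sorted name pair."""
    names = sorted(profiles)
    return {
        (p, q): mismatch_distance(profiles[p], profiles[q])
        for p, q in combinations(names, 2)
    }


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy profiles for three hypothetical groups (illustrative values only).
linguistic = {"G1": [1, 0, 1, 1], "G2": [1, 0, 0, 1], "G3": [0, 1, 0, 0]}
cultural = {"G1": [1, 1, 0], "G2": [1, 1, 1], "G3": [0, 0, 1]}

ling = pairwise(linguistic)
cult = pairwise(cultural)
pairs = sorted(ling)

# Overall agreement between the two disciplines.
r = pearson([ling[p] for p in pairs], [cult[p] for p in pairs])
print("correlation of distances:", round(r, 3))

# Pair-level gaps: large absolute values point to candidate mismatches.
gaps = {p: ling[p] - cult[p] for p in pairs}
for pair, gap in gaps.items():
    print(pair, round(gap, 3))
```

A high correlation suggests that the two disciplines tell a similar story, while pairs with a large gap are candidates for closer inspection against the ethnohistorical record. The correlation alone is not a significance test; since pairwise distances are not independent, a permutation procedure such as a Mantel test would be needed for that.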
Regional studies
Using the infrastructure and methodology described above, we have initiated a number of regional and thematic studies. These studies focus in different ways on uncovering patterns that are suggestive of past contact scenarios within the Upper Amazon and between Upper Amazonian groups and their Andean and Amazonian neighbors. These projects are described in more detail in the next section.