
Optimising big data from citizen science projects for biodiversity research


Boosting biodiversity research with citizen science

Data from citizen scientists is still used inefficiently in ecology and conservation, so a team of researchers built a new framework to harness it effectively.

Climate Change and Environment

Everyday citizens can be important sources of data for scientific research. Many people volunteer to help professional scientists by collecting results for field experiments or in the lab. Yet without adequate structures and systems for assimilating this data, it can lose much of its usefulness. In biodiversity research, interest in citizen science almost outstrips scientists' ability to use this input effectively. “The main challenge of citizen science data in ecology and conservation is that it is often unstructured. There is no systematic methodology followed by the collectors that ensures that all places are equally covered and that all taxa are sampled,” says Henrique Miguel Pereira, head of a research group at the German Centre for Integrative Biodiversity Research in Leipzig and OptimCS project coordinator. “This can cause bias in the data towards places closer to people or species that attract particular attention,” he explains. In the EU-funded OptimCS project, undertaken with the support of the Marie Skłodowska-Curie Actions programme, a team of researchers created a new workflow to maximise the information citizen scientists contribute to the collective knowledge of biodiversity.

Improving biodiversity citizen science

Most biodiversity citizen science data collected today is added each year to the Global Biodiversity Information Facility (GBIF), a database that compiles species observations from around the world. “Today, much of the research in biodiversity science uses citizen science data obtained through the GBIF to address ecological questions and calibrate models,” adds Pereira. “But citizen science data from species has applications that go beyond ecology.” For instance, photos of species classified by the community of observers on the social network and app iNaturalist have been used to train one of the best computer vision systems in the world, which automatically tells users which species is in a photo. In the OptimCS project, Pereira and his team used a range of approaches to maximise the information gained from citizen science observations. They wrote a series of articles exploring the topic, including one examining which species are over-represented in citizen science data. They found that large-bodied species are particularly popular with citizen scientists. In another paper, the team assessed how many citizen scientists need to visit a site before scientists can estimate the number of species of a taxonomic group at that site.
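To make the question of visits and species counts more concrete, here is a minimal sketch of how richness at a site might be estimated from repeated visits. It uses the Chao2 incidence-based estimator, a standard ecological technique chosen here purely for illustration; the article does not say which method the OptimCS team used. The function name `chao2_richness` and the sample data are hypothetical.

```python
from collections import Counter


def chao2_richness(visits):
    """Estimate total species richness at a site with the Chao2 estimator.

    `visits` is a list of collections, one per visit, each holding the
    species recorded on that visit. This is an illustrative assumption,
    not the method used in the OptimCS project.
    """
    m = len(visits)
    if m == 0:
        return 0.0

    # Count in how many visits each species was recorded (incidence frequency).
    incidence = Counter(species for visit in visits for species in set(visit))
    s_obs = len(incidence)
    q1 = sum(1 for count in incidence.values() if count == 1)  # seen in exactly one visit
    q2 = sum(1 for count in incidence.values() if count == 2)  # seen in exactly two visits

    correction = (m - 1) / m
    if q2 > 0:
        return s_obs + correction * q1 * q1 / (2 * q2)
    # Bias-corrected form, used when no species was seen in exactly two visits.
    return s_obs + correction * q1 * (q1 - 1) / 2


# Hypothetical records from four visits to one site.
visits = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"a", "b", "e"}]
estimate = chao2_richness(visits)
```

Many species seen on only one visit push the estimate well above the observed count, a sign that more visits are needed before the number of species at the site can be trusted.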

Incorporating algorithms into citizen science

The OptimCS project also developed new algorithms to speed up the processing of citizen science data. These algorithms can, for example, identify the most valuable sites in time and space for biodiversity sampling. “One uses models to look at what sites would provide more information if sampled, based on expected species richness or on information gaps for that site and surrounding region,” Pereira explains.
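The idea Pereira describes can be sketched as a simple scoring rule. The example below is a rough illustration under stated assumptions, not the project's algorithm: the `Site` class, the `information_gap` and `rank_sites` functions, the weighting scheme and the sample values are all hypothetical, and the sketch ignores the time dimension and the surrounding region that the quote mentions.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    expected_richness: float  # species expected at the site, e.g. from a model (assumed input)
    observed_richness: int    # species already recorded at the site


def information_gap(site):
    """Share of the expected species not yet recorded, clipped to [0, 1]."""
    if site.expected_richness <= 0:
        return 0.0
    gap = 1.0 - site.observed_richness / site.expected_richness
    return max(0.0, min(1.0, gap))


def rank_sites(sites, richness_weight=0.5):
    """Order sites from most to least valuable for the next sampling visit.

    The score mixes normalised expected richness with the information gap;
    `richness_weight` is a hypothetical tuning parameter.
    """
    max_richness = max((s.expected_richness for s in sites), default=0.0)

    def score(site):
        richness = site.expected_richness / max_richness if max_richness > 0 else 0.0
        return richness_weight * richness + (1 - richness_weight) * information_gap(site)

    return sorted(sites, key=score, reverse=True)


# Hypothetical sites: a well-surveyed park, a rich but undersampled wetland,
# and a species-poor field that nobody has visited.
sites = [Site("park", 40, 38), Site("wetland", 80, 20), Site("field", 20, 0)]
ranking = [site.name for site in rank_sites(sites)]
```

With equal weights, the undersampled wetland comes first and the well-surveyed park last. Setting `richness_weight` to 0 ranks purely by information gaps, which puts the unvisited field on top.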

Building on results from OptimCS

One notable finding of the project, based mostly on citizen science data, is that butterflies with a high affinity for urban environments have high thermal flexibility and a generalist life history. “We think this is because urban areas have large thermal ranges, from the heat island effect, for example, and miss many of the most specialised host plants,” notes Pereira. The team plans to continue the research, and has several new studies in the pipeline focusing on this topic.


OptimCS, biodiversity, citizen, science, algorithms, data
