CORDIS - EU research results

New opportunities to enhance or extend (mobile) web survey data and get better insights

Periodic Reporting for period 2 - WEB DATA OPP (New opportunities to enhance or extend (mobile) web survey data and get better insights)

Reporting period: 2021-08-01 to 2022-09-30

Survey research is the most frequently used data collection method in many disciplines. However, surveys suffer from several problems, in particular measurement errors, due to human memory limitations, people making mistakes, or unwillingness to make the effort required to answer properly. In addition, more and more people are connected everywhere, all the time. In this context, new measurement opportunities have arisen that can help reduce measurement errors and also enable the study of completely new research questions.
The main goal of this project is to investigate how new measurement opportunities can help produce more accurate and/or more complete insights, which, in turn, will help key actors (e.g. politicians, firms) make better decisions. This is a new research area about which very little is known, even though these opportunities already exist: some of the new data are already being collected massively (e.g. passive metered data).
Nevertheless, researchers do not yet know how to use these new data properly. They lack the tools and knowledge that would allow them to take advantage of the huge amount of information already collected and stored, and to integrate new measurement opportunities that a large part of the population in Europe currently uses in everyday life (voice input, visual data sharing) to make surveys easier and more natural for respondents, increasing data quality and respondents' satisfaction. New research is urgently needed to rethink the way surveys are implemented and to provide a framework for integrating the new measurement opportunities in a proper and beneficial way. This is what this project does, by considering four types of new measurement opportunities: voice data, visual data, metered data (obtained through a tracking application or "meter" installed by participants on their devices to register at least the URLs visited) and data collected in-the-moment (i.e. just after an event of interest has occurred). The project's key objective is to develop the tools to best implement these new measurement opportunities, to improve web survey data quality, and to produce knowledge that helps other researchers decide when and how web survey questions can be replaced by or combined with other data.
First, we developed three tools: WebdataVoice, for answering through dictation or voice recordings; WebdataVisual, for capturing visual data; and WebdataNow, for implementing in-the-moment surveys triggered by online behaviours and geolocation data.
Second, to provide a general background for the whole project, we listed the expected benefits and disadvantages of using different new data types.
Third, to help researchers decide whether to request visual data in the frame of web surveys: 1) we studied respondents' skills and willingness to share such data, the data's availability, and the burden associated with their creation and sharing; 2) we created a practical guide that discusses the main steps involved in asking for images in surveys; 3) we provided empirical evidence about the impact of asking respondents to answer with images on noncompliance, completion time, and survey experience, using data from an opt-in panel in Germany.
Fourth, since the potential benefits of the new data types can only materialize if people agree to participate, we explored the willingness to participate in 1) in-the-moment surveys triggered by online activities and 2) geolocation-based research, especially sharing geolocation data for the specific purpose of being invited to in-the-moment surveys. Conjoint experiments carried out on samples of opt-in panellists in Spain showed overall high levels of willingness. Moreover, even panellists willing to participate may fail to do so if they do not see the survey invitation in time; we therefore investigated the acceptance of different invitation methods. Because the way data-sharing activities are presented to participants may lead to different levels of willingness, we also investigated how different descriptions of the activities affect those levels.
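The logic of such a conjoint experiment can be sketched in a few lines of code. The sketch below is purely illustrative and uses invented attributes, effect sizes, and sample sizes (not the project's actual design or data): profiles combining an invitation method and an incentive are randomized, and because of that randomization a simple difference in mean stated willingness across levels of one attribute estimates that attribute's effect.

```python
import random
from statistics import mean

random.seed(42)

# Hypothetical attribute levels for a survey-invitation conjoint task.
INVITATION = ["email", "SMS", "push notification"]
INCENTIVE = [0.5, 1.0, 2.0]  # euros, invented for illustration

def simulate_response(invitation, incentive):
    """Return 1 if the (simulated) panellist says they would participate.

    The 'true' effects below are invented: push notifications and
    higher incentives raise willingness in this toy world.
    """
    prob = 0.55
    prob += {"email": 0.0, "SMS": 0.05, "push notification": 0.10}[invitation]
    prob += 0.05 * incentive
    return 1 if random.random() < prob else 0

# Each simulated panellist rates one randomized profile.
profiles = []
for _ in range(2000):
    inv = random.choice(INVITATION)
    inc = random.choice(INCENTIVE)
    profiles.append({"invitation": inv, "incentive": inc,
                     "willing": simulate_response(inv, inc)})

# Marginal mean willingness per invitation method: with fully
# randomized profiles, a difference in these means estimates the
# causal effect of the attribute level (the idea behind AMCE-style
# conjoint estimates).
for level in INVITATION:
    rows = [p["willing"] for p in profiles if p["invitation"] == level]
    print(f"{level}: {mean(rows):.2f} (n={len(rows)})")
```

In a real analysis the outcome would come from panellists' answers rather than a simulation, but the estimation step (comparing mean willingness across randomized attribute levels) is the same.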
Finally, inspired by the Total Survey Error framework, we developed a Total Error framework for digital traces collected with Meters (TEM), which describes the data generation and analysis process for metered data and documents the sources of bias and variance that may arise at each step of this process. We then showed how the TEM can be used to document, quantify and minimize error sources, using the TRI-POL project (https://www.upf.edu/web/tri-pol). We developed an approach to measure tracking undercoverage, as well as to simulate the bias it can introduce into statistics. We also tested how different design decisions affect the validity of measurements of news media exposure.
All the research carried out has already provided valuable results, which contribute to filling gaps in the previous literature. Very little was known before about the different data types of interest. Thus, the research done so far provides key evidence about the feasibility of using such data in the frame of web surveys, as well as about the main challenges and errors to be considered. Important practical recommendations have also been proposed based on our first results. For instance, our research on visual data suggests that the main factor explaining the relatively low participation when asking respondents to share visual data is the availability of such data. Thus, before asking for visual data, researchers need to think carefully about whether and to what extent such data might be available to respondents. Another example: the TEM framework, by identifying all the different error types for metered data, helps researchers plan their metered-data research. Moreover, we presented a case study showing how the TEM can be applied in real projects to identify, quantify and reduce metered data errors. This illustration provides a valuable example for other researchers of how the TEM can be used in practice.
The development of the WebdataVoice, WebdataVisual, and WebdataNow tools also represents a key contribution: although at least basic knowledge of web page development is advisable to set them up, they make collecting the new data types much more accessible.
However, there is still a lot to be done. In particular, we now need to use the new tools in the frame of surveys to see how they perform in practice. We will implement experiments for all four types of data to compare the participation and data quality of conventional surveys versus data collection that takes advantage of the new measurement opportunities. We will also consider the participants' satisfaction (did they like using these new tools?) and their evaluation of the tools (how easy or difficult were they to use?). We expect lower participation than for conventional surveys when proposing the new tools to samples of opt-in panels, whose members have all volunteered to answer conventional surveys; still, we expect relatively high participation. Moreover, we expect improvements both in data quality and in participants' satisfaction.
More and more of people's lives happen online