Periodic Reporting for period 2 - WEB DATA OPP (New opportunities to enhance or extend (mobile) web survey data and get better insights)
Reporting period: 2021-08-01 to 2022-09-30
The main goal of this project is to investigate how new measurement opportunities can help obtain more accurate and/or more complete insights, which, in turn, will help key actors (e.g. politicians, firms) make better decisions. This is a new research area in which very little is known, even though these opportunities already exist: some of the new data are already being collected on a massive scale (e.g. passive metered data).
Nevertheless, researchers do not yet know how to use these new data properly. They lack the tools and knowledge that would allow them to take advantage of the huge amount of information already collected and stored, and to integrate new measurement opportunities that a large part of the population in Europe already uses in everyday life (voice input, visual data sharing) to make surveys easier and more natural for respondents, increasing data quality and respondents' satisfaction. New research is urgently needed to rethink the way surveys are implemented and to provide a framework for integrating new measurement opportunities in a proper and beneficial way. This is what this project does, by considering four types of new measurement opportunities: voice data, visual data, metered data (obtained through a tracking application or "meter" that participants install on their devices to register at least the URLs visited) and data collected in-the-moment (i.e. just after an event of interest has occurred). This project's key objectives are to develop the tools to best implement these new measurement opportunities, to improve web survey data quality, and to produce knowledge that helps other researchers decide when and how web survey questions can be replaced by or combined with other data.
Second, to provide a general background for the whole project, we listed the expected benefits and disadvantages of using different new data types.
Third, to help researchers decide whether to request visual data in the frame of web surveys: 1) We studied respondents' skills and willingness to share such data, the availability of the data, and the burden associated with creating and sharing them. 2) We created a practical guide that discusses the main steps involved in asking for images in surveys. 3) We provided empirical evidence about the impact of asking respondents to answer with images on noncompliance, completion time, and survey experience, using data from an opt-in panel in Germany.
Fourth, since the potential benefits of the new data types can only materialize if people agree to participate, we explored the willingness to participate in 1) in-the-moment surveys triggered by online activities and 2) geolocation-based research, especially sharing geolocation data for the specific purpose of being invited to in-the-moment surveys. Conjoint experiments carried out on samples of opt-in panellists in Spain showed overall high levels of willingness. However, even panellists willing to participate may fail to do so if they do not see the survey invitation in time. Thus, we investigated the acceptance of different invitation methods. Because the way data sharing activities are presented to participants may lead to different levels of willingness, we also investigated how different descriptions of the activities affect these levels.
Finally, inspired by the Total Survey Error framework, we developed a Total Error framework for digital traces collected with Meters (TEM), which describes the data generation and analysis process for metered data and documents the sources of bias and variance that may arise in each step of this process. We then showed how the TEM can be used to document, quantify and minimize error sources, using the TRI-POL project (https://www.upf.edu/web/tri-pol). We developed an approach to measure tracking undercoverage, as well as to simulate the bias that it can introduce into statistics. We also tested how different design decisions affect the validity of the measurements of news media exposure.
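To give a rough idea of what simulating undercoverage bias can look like, the following Python sketch compares a statistic computed on all of a participant's devices with the same statistic computed only on the devices that are tracked. It is a minimal illustration under strong assumptions, not the approach developed in the project: the function name, the parameters (`devices_per_person`, `p_tracked`, `max_visits`) and the uniform distribution of visits are all hypothetical.

```python
import random


def simulate_undercoverage_bias(n_participants=1000, devices_per_person=3,
                                p_tracked=0.7, max_visits=10, seed=0):
    """Simulate the bias in mean visits per participant when some devices
    are not tracked. All parameters are illustrative assumptions."""
    rng = random.Random(seed)
    true_totals = []
    observed_totals = []
    for _ in range(n_participants):
        true_total = 0
        observed_total = 0
        for _ in range(devices_per_person):
            # Assumption: visits per device are uniform on 0..max_visits.
            visits = rng.randint(0, max_visits)
            true_total += visits
            # Assumption: each device is tracked independently
            # with probability p_tracked.
            if rng.random() < p_tracked:
                observed_total += visits
        true_totals.append(true_total)
        observed_totals.append(observed_total)
    true_mean = sum(true_totals) / n_participants
    observed_mean = sum(observed_totals) / n_participants
    bias = observed_mean - true_mean
    return {
        "true_mean": true_mean,
        "observed_mean": observed_mean,
        "bias": bias,
        "relative_bias": bias / true_mean,
    }


if __name__ == "__main__":
    for p in (1.0, 0.9, 0.7, 0.5):
        result = simulate_undercoverage_bias(p_tracked=p)
        print(p, round(result["relative_bias"], 3))
```

Under these assumptions, the observed mean can only fall below the true mean, and the relative bias is roughly minus the share of untracked devices. With real metered data, undercoverage is unlikely to be independent of behaviour, so the bias could differ considerably from this simple case.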
The development of the WebdataVoice, WebdataVisual, and WebdataNow tools also represents a key contribution. Although at least basic notions of web page development are highly advisable to set up these tools, they make the collection of the new data types much more accessible.
However, there is still a lot to be done. In particular, we now need to implement the new tools in the frame of surveys to see how they perform in practice. We will run experiments for all four types of data to compare the participation and data quality of conventional surveys with those of data collection that takes advantage of the new measurement opportunities. We will also consider the participants' satisfaction (did they like using these new tools?) and their evaluation of the tools (how easy or difficult were they to use?). We expect lower participation than for conventional surveys when proposing the new tools to samples from opt-in panels, where all participants have volunteered to answer conventional surveys. However, we still expect relatively high participation. Moreover, we expect improvements both in data quality and in participants' satisfaction.