Prof Jennifer Rodd
Department of Experimental Psychology, University College London
Popular version of paper 2aSC1, "Collecting experimental data online: How to maintain data quality when you can't see your participants"
Presented at the 180th ASA meeting
In early 2020 many researchers across the world had to close up their labs and head home to help prevent further spread of coronavirus.
If this pandemic had arrived a few years earlier, these restrictions on testing human volunteers in person would have resulted in a near-complete shutdown of behavioural research. Fortunately, the last 10 years have seen rapid advances in the software needed to conduct behavioural research online (e.g., Gorilla, jsPsych), and researchers now have access to well-regulated pools of paid participants (e.g., Prolific). This allowed the many researchers who had already switched to online data collection to continue collecting data throughout the pandemic. In addition, many lab-based researchers who may previously have been sceptical about online data collection made the switch to online experiments over the last year. Jo Evershed (Founder and CEO of Gorilla Experiment Builder) reports that the number of participants who completed a task online using Gorilla nearly tripled between the first quarter of 2020 and the same period in 2021.
But this rapid shift to online research is not without problems. Many researchers have well-founded concerns about the lack of experimental control that arises when we cannot directly observe our participants.
Based on 8 years of running behavioural research online, I encourage researchers to embrace online research, but argue that we must carefully adapt our research protocols to maintain high data quality.
I present a general framework for conducting online research. This framework requires researchers to specify explicitly how moving data collection online might negatively impact their data and undermine their theoretical conclusions:
- Where are participants doing the experiment? Somewhere noisy or distracting? Will this make data noisy or introduce systematic bias?
- What equipment are participants using? Slow internet connection? Small screen? Headphones or speakers? How might this impact results?
- Are participants who they say they are? Why might they lie about their age or language background? Does this matter?
- Can participants cheat on your task? By writing things down as they go, or looking up information on the internet?
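Answers to questions like these can be collected in a short pre-task questionnaire and checked automatically, so that risky sessions are flagged before any analysis. A minimal sketch of that idea in Python (the field names and thresholds here are hypothetical illustrations, not values from the paper):

```python
# Hypothetical pre-task screening: flag sessions whose self-reported
# environment or equipment could compromise data quality.

def flag_session(session):
    """Return a list of reasons this session is at risk (empty list = OK)."""
    flags = []
    if session.get("environment") == "noisy":
        flags.append("noisy environment")
    if session.get("audio_device") == "speakers":
        flags.append("no headphones")  # matters for auditory experiments
    if session.get("screen_width_px", 0) < 1024:
        flags.append("small screen")
    if not session.get("native_speaker", False):
        flags.append("language background mismatch")
    return flags

# One participant reports headphones in a quiet room; another reports
# speakers in a noisy cafe.
ok = flag_session({"environment": "quiet", "audio_device": "headphones",
                   "screen_width_px": 1440, "native_speaker": True})
risky = flag_session({"environment": "noisy", "audio_device": "speakers",
                      "screen_width_px": 1280, "native_speaker": True})
print(ok)     # → []
print(risky)  # → ['noisy environment', 'no headphones']
```

Which checks matter, and how strict they should be, depends entirely on the experiment: a headphone check is critical for speech perception tasks but irrelevant for a reading study.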
I encourage researchers to take a 'worst case' approach and assume that some of the data they collect will inevitably be of poor quality. The onus is on us to build in experiment-specific safeguards so that poor-quality data can be reliably identified and excluded from our analyses. Sometimes this can be achieved by pre-specifying performance criteria on existing tasks, but often it involves creating new tasks that provide critical information about our participants and their behaviour. These additional steps must be taken prior to data collection and can be time-consuming, but they are vital to maintaining the credibility of data obtained using online methods.
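To make the pre-specified performance criteria concrete, here is one way such exclusion rules might look in code. This is a sketch under assumed criteria (catch-trial accuracy and implausibly fast response times); the specific thresholds are hypothetical, not taken from the paper:

```python
# Hypothetical pre-registered exclusion criteria, fixed before data
# collection begins. All thresholds are illustrative.
MIN_CATCH_ACCURACY = 0.9      # must answer >= 90% of attention-check trials correctly
MIN_RT_MS = 200               # responses faster than this suggest key-mashing
MAX_FAST_RT_PROPORTION = 0.1  # tolerate at most 10% implausibly fast responses

def include_participant(catch_accuracy, reaction_times_ms):
    """Apply the pre-specified criteria; return True to keep this participant's data."""
    if catch_accuracy < MIN_CATCH_ACCURACY:
        return False
    too_fast = sum(rt < MIN_RT_MS for rt in reaction_times_ms)
    return too_fast / len(reaction_times_ms) <= MAX_FAST_RT_PROPORTION

print(include_participant(0.95, [450, 520, 610, 480]))  # → True
print(include_participant(0.95, [120, 95, 610, 110]))   # → False (too many fast responses)
print(include_participant(0.70, [450, 520, 610, 480]))  # → False (failed catch trials)
```

The point of writing the rules down (and ideally pre-registering them) before collecting data is that exclusions then cannot be tailored, even unintentionally, to favour a hoped-for result.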