Joshua Fullard from the Behavioural Science Group in the Warwick Business School at the University of Warwick discusses his research into consent when linking survey and administrative data. He explains why data linkage can be beneficial and describes methods to increase respondents' consent rates.
The rise of survey data and the importance of consent
Quantitative social scientists generally use two types of data:
- experimental data and
- observational data
Experimental data is data deliberately created under controlled conditions, often in a social science lab.
Observational data is data generated by an ongoing uncontrolled process and can take many different forms including administrative records (e.g., health or pensions records) and survey data.
Survey data includes everything from cross-sectional (e.g., Labour Force Survey) and longitudinal (e.g., Understanding Society) studies to customer satisfaction surveys (e.g., the questionnaire you might be emailed after buying insurance online).
Research conducted using survey data has not always been viewed favourably by social scientists, particularly economists, with many researchers focusing instead on administrative data.
Some of the apprehension around survey data includes concerns over measurement – self-reporting can be unreliable and some response formats might not be useful for quantitative interpretation – and concerns over sampling and sample sizes. In surveys, sample sizes are often small, which makes it challenging to identify modest differences and to generalise the results to a wider population.
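To give a rough sense of why small samples make modest differences hard to detect, the standard two-proportion sample-size calculation can be sketched as below (the proportions and thresholds are illustrative numbers of mine, not from the article):

```python
import math

def required_n_per_group(p1, p2):
    """Approximate sample size per group needed to detect a difference
    between two proportions with a two-sided z-test at alpha = 0.05
    and 80% power (equal group sizes, normal approximation)."""
    z_alpha = 1.96  # critical z for alpha = 0.05, two-sided
    z_beta = 0.84   # z for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# A modest 5-percentage-point gap (50% vs 55%) needs well over a
# thousand respondents per group; a large 20-point gap needs far fewer.
n_modest = required_n_per_group(0.50, 0.55)
n_large = required_n_per_group(0.50, 0.70)
```

A survey of a few hundred respondents can therefore pick up only fairly large effects, which is exactly the concern the paragraph above describes.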
Supported by developments in computing power, two of the main benefits of using administrative data are the large sample sizes – researchers can identify even modest effect sizes or changes in a small subgroup of the population – and fewer measurement errors (the difference between the measured value and the true value).
However, the use of survey data is becoming increasingly popular, driven by the fact that, thanks to the availability of easy-to-use survey software (e.g., Qualtrics) and a growing body of research on survey methods and measurement, social scientists have become increasingly involved in the design of surveys and survey questions. This has given social scientists more confidence that the underlying variables of interest are measured in ways that can be used for quantitative interpretation.
Linking Survey Data and Administrative Data
One of the most promising areas of social science research currently involves linking administrative data with survey data. Data linkage not only increases the research opportunities for social scientists, it also improves the cost-efficiency of data collection and reduces respondent burden.
The linkage of survey and administrative data can be used to supplement administrative data.
An example of this might be where administrative data doesn't contain information on subjective expectations or beliefs. This can be useful when it is empirically challenging to identify what the researcher is interested in using administrative data alone.
For instance, if a researcher is interested in investigating why young people from less affluent backgrounds are less likely to apply to university, using administrative data, it can be challenging to separate the role of different, often correlated, factors (e.g., family income, the requirements for university admission and the returns to education, and tastes for education) as any combination of these factors can conceivably be consistent with observed choices.
Using survey data on university-related subjective expectations combined with administrative data would allow the researcher to distinguish between factors (e.g., differences in knowledge about the returns to a degree).
Linkage can also enhance survey data. For instance, a researcher might be doing a study to investigate the relationship between health conditions and attitudes to risk. In this case it is easier, and more accurate, to link participants' health records to their survey responses than to ask them to report their entire medical history in the survey.
While data sources are becoming increasingly available to social scientists (e.g., social media data), linking administrative data to survey data often requires respondents' informed consent to the linkage.
In many countries obtaining consent is a legal requirement – in the UK this is covered by the Digital Economy Act. Since not all respondents give their consent, this may create a source of bias if the respondents who consent are systematically different from those who do not.
While there are empirical tools to correct for this bias, there is a growing literature in survey methodology that studies how consent rates differ by participants' characteristics and how researchers can design surveys to maximise consent rates.
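The article does not name which empirical tools; one common approach is inverse probability weighting, where each consenting respondent is weighted by the inverse of their subgroup's consent rate, so that under-consenting subgroups are not under-represented in the linked dataset. A minimal sketch, using hypothetical data and variable names:

```python
from collections import defaultdict

def consent_weights(respondents, group_key="sector", consent_key="consented"):
    """Inverse-probability weights for consent bias: each consenting
    respondent is weighted by 1 / (consent rate of their subgroup),
    so subgroups that consent less often count for more per person."""
    totals = defaultdict(int)
    consents = defaultdict(int)
    for r in respondents:
        totals[r[group_key]] += 1
        consents[r[group_key]] += r[consent_key]
    rates = {g: consents[g] / totals[g] for g in totals}
    # Only consenting respondents appear in the linked data; attach weights.
    return [
        {**r, "weight": 1 / rates[r[group_key]]}
        for r in respondents if r[consent_key]
    ]

# Hypothetical sample: independent-sector teachers consent less often,
# so each consenting one receives a larger weight.
sample = [
    {"sector": "state", "consented": 1},
    {"sector": "state", "consented": 1},
    {"sector": "state", "consented": 1},
    {"sector": "state", "consented": 0},
    {"sector": "independent", "consented": 1},
    {"sector": "independent", "consented": 0},
    {"sector": "independent", "consented": 0},
]
weighted = consent_weights(sample)
```

In practice the consent probability is usually modelled with a regression on many respondent characteristics rather than simple subgroup rates, but the re-weighting logic is the same.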
My ongoing research contributes to this literature.
In an earlier project I investigated the plausibility, and potential implications, of data linkage in the context of teachers, and the effect of a light-touch information intervention (additional information on the data linkage process) on the decision to consent to linking teachers' survey data with their employment records.
I found that teachers were highly willing to consent to data linkage (75 percent) – possibly due to teachers’ prosocial motivations. But I also found large differences in consent rates by ethnicity and sector – teachers from a non-white background and those who work in the independent sector are significantly less likely to consent.
As non-white teachers and those in the independent sector make up a small proportion of teachers in England, systematic differences in consent rates could introduce bias into a survey-administrative linked dataset.
Interestingly, I found that providing teachers with additional information about the data linkage process has a large, positive effect on the propensity of teachers who work in the independent sector to consent to data linkage. In a wider context these results demonstrate that an inexpensive information intervention can have a meaningful effect on consent to data linkage among subgroups who were otherwise less likely to consent.
Consent to data linkage and trust
One of the other results from my research on teachers was that teachers in schools that had a connection with the university the research team was based at were more likely to consent to data linkage. This result provided suggestive evidence that participants' trust in the research team plays a role in their decision to consent to data linkage (or not).
In a follow-up project, we find that participants' trust in the research team is positively correlated with their propensity to consent to data linkage – respondents who have a higher level of trust in the research team are more likely to consent. In addition, we find that Black respondents have significantly lower levels of trust in the research team compared to their white counterparts, which helps explain why ethnic minority groups generally have lower consent rates.
In this project we also investigate the effect of a light-touch information intervention (written and/or visual information about the research team) on respondents' trust in the research team and consent to data linkage. While we find that the additional information about the research team has a positive effect on respondents' trust, particularly among those from an ethnic minority background, it has no effect on respondents' consent rates.
While we were surprised that the information didn't improve consent rates, our interpretation of this result is that, although providing information about the research team improves respondents' trust in the research team, it is not specific or powerful enough to improve consent rates.
A more specific light-touch intervention – for example, information about the research team's experience with data linkage – or something more powerful, such as a video of the research team discussing the benefits of data linkage, seems like a promising area of future research.
Conclusion and next steps
We have learnt several things from this research.
Firstly, Black respondents have significantly lower levels of trust in the research team than our other respondents.
As participant trust plays an important role not only in consent to data linkage but in any study using human participants (e.g., the willingness to take part in a study), this can significantly hinder researchers' efforts to investigate subgroups of the population and/or generalise their findings. Therefore, it is important that social scientists work towards building a stronger relationship with marginalised communities.
Second, our research shows that very cheap and easy-to-implement light-touch information interventions can have a meaningful effect on survey participants.
This is especially important given that many popular interventions to improve survey response or consent rates (e.g., higher incentives, interviewer follow-ups and extra mailings) are expensive and inaccessible to many social scientists. Investigating different types of interventions, such as a video of the research team discussing the societal benefits of the research, and how they might influence different subgroups, seems like a promising area of future research.
About the author
Joshua Fullard is an Assistant Professor in the Behavioural Science Group in the Warwick Business School at the University of Warwick. His research agenda focuses on using experimental methods to investigate policy relevant questions in topics related to education inequalities, teachers and teacher labour markets and survey methodology. His work has gained national and international media coverage in a range of publications including: the Times, Telegraph, Guardian, Economist, Daily Mail and China Daily. Joshua received his PhD in Economics from the Institute for Social and Economic Research (ISER) and has previously worked as a lecturer in the Department of Economics at the University of Essex, a senior researcher at the Education Policy Institute and as a visiting research fellow at the ifo Center for the Economics of Education in Munich.