In this five-part mini-series, Joe Allen gets us thinking about the challenges and ethical implications of using Twitter data.
This is part three of a five-part series on the ease and ethics of utilising Twitter data, based on a talk I gave at the NCRM Research Methods e-Festival. In the last post, I explored the access that industry has to Twitter data and the ethics of that access.
In my view, Twitter data is personal data, and so any use of it should be subject to ethical review. Twitter as a company does not take responsibility for policing the ethics of Twitter data use, so in this post, I take a look at the ethical review process in UK academia and how it could help improve procedures for the ethical use of Twitter data.
Q7: When Twitter data is used in academic research, should we inform each user that their data is collected?
In a traditional academic study involving personal data, standard scholarly practice involves seeking informed consent from participants. We also give these participants the right to withdraw their data at any time.
However, when dealing with Twitter data, things are a lot more complicated. For one thing, the sheer volume of tweets that exist on the internet provides a baseline of anonymity. Without contacting all of these users, academic researchers cannot offer them the chance to give informed consent for their data to be used. Moreover, there’s a general consensus that users shouldn’t be contacted in this way, as shown in the Mentimeter responses from my audience below.
There’s also the problem that if we were to ask for consent and only use data from those who provide it, we would run the risk of introducing bias into our research results. And perhaps most importantly, we already know from my last post that contacting Twitter users is a breach of GDPR. So there are all sorts of tensions here when we consider if and how we can regulate the ethical use of Twitter data.
Q8: Should research with Twitter data require ethical review?
All UK universities require research studies that use personal data to be approved by an ethical review. Some universities already state that Twitter data is personal data and therefore requires ethical review, but not all do.
When I asked my audience (who were all academic researchers) what they thought about this question, most felt that social media data should require an ethical review, even though this isn’t yet common practice in all universities. Fewer disagreed, and just a couple were unsure.
Q9: All universities require that studies involving personal data undergo ethical review. Is the text from tweets personal data?
Personal data requires ethical review. A tweet is public but could contain data that is personal and hence requires protection. And it’s sometimes pretty easy to reidentify the author of a tweet simply by copying the contents of the tweet into Google. So is Twitter data ever really anonymous, and in how many cases is this high risk of reidentification problematic?
When I asked my academic audience about this, they recognised this conundrum. The risk of reidentification is high, but the data is public. As with so many data problems, the answer here is context dependent:
- Who wrote the tweet?
- When did they write it?
- Who is collecting the tweet?
- What does the collector intend to do with the tweet?
Moreover, who is responsible for determining whether the answers to these questions warrant access to the data? I’ll be tackling these questions in tomorrow’s blog, so tune in again then.
So far in this blog series, I’ve discussed a socially good example of data being “donated” and a more malicious example of data being “stolen”. In this post, I’ve started to unpick how we might begin to solve the problem of Twitter data use in the academic domain.
For me, the answer lies in good legislation. Current best practice for research data management and ethical review in academia gives us a robust process for obtaining and handling informed consent from research participants. It grants participants the right to withdraw their data if they wish, and it stipulates that personal data should not be held for any longer than the research project needs it. There is a lot we can learn from this when using social media data, and I hope that advancements can be made in the near future.
We know we can’t ask for consent to use social media data without making users uncomfortable or breaching GDPR, but we can learn from existing ethical review processes so that we don’t simply give blanket approval to all projects using social media data. At present, only a few universities expect an ethical review for the use of Twitter data, so encouraging more universities to follow suit would be a good step forwards in tackling the problems bound up with the ethics of Twitter data use.
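The two data-management principles mentioned above, honouring a request to withdraw data and not holding personal data for longer than the project needs, could be sketched in code roughly as follows. This is a minimal Python illustration only, not a procedure taken from any university's ethics policy: the `StoredTweet` record, the `apply_data_policy` function, the 90-day retention window and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class StoredTweet:
    """Hypothetical record for a tweet held by a research project."""
    tweet_id: str
    author_id: str
    text: str
    collected_on: date


def apply_data_policy(tweets, withdrawn_authors, today, retention_days):
    """Return only the tweets the project may still hold.

    A tweet is dropped if its author has asked for their data to be
    withdrawn, or if it was collected before the retention cutoff.
    """
    cutoff = today - timedelta(days=retention_days)
    return [
        t for t in tweets
        if t.author_id not in withdrawn_authors and t.collected_on >= cutoff
    ]


# Made-up sample data, for illustration only.
tweets = [
    StoredTweet("1", "alice", "example text", date(2021, 1, 10)),
    StoredTweet("2", "bob", "example text", date(2021, 6, 1)),
    StoredTweet("3", "carol", "example text", date(2021, 6, 15)),
]

kept = apply_data_policy(
    tweets,
    withdrawn_authors={"bob"},   # bob has withdrawn his data
    today=date(2021, 7, 1),
    retention_days=90,           # assumed retention window
)
print([t.tweet_id for t in kept])  # ['3']
```

In this sketch, tweet 1 is dropped because it is older than the assumed retention window, and tweet 2 is dropped because its author has withdrawn. A real project would also need to delete the underlying stored copies, not just filter them out at analysis time.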
In the next part of this series on the ethics of Twitter data, I will be discussing who is responsible for the distribution of Twitter data.
Check out the whole series as it becomes available below:
- Part 1: Should we use Twitter data in academia?
- Part 2: Should industry have access to Twitter data?
- Part 3: Is using Twitter data ethical?
- Part 4: Who is responsible for Twitter data?
- Part 5: What is next for Twitter data?
All data used in these blog posts are available on the UK Data Service GitHub.
Joseph Allen is a Research Associate at the UK Data Service, based at the Cathie Marsh Institute for Social Research at the University of Manchester. For the last year Joe has been focusing on making Twitter easier to use, whilst also exploring the ethics of this access.