In this blog post, originally published on JISC’s Research Data Management blog, Louise Corti and Richard Welpton from the UK Data Service (UKDS) talk about the services they provide to UK researchers.
This post is in response to the thread on the data management jiscmail list regarding handling and providing access to sensitive data. While the list vigorously discussed possible solutions, including setting up a new working group under RDA, I wanted to highlight tried and tested models already in existence in the UK, USA and Germany, where millions of pounds have been invested in developing and implementing a workable pathway for assessing, handling and accessing sensitive data. I would make a plea not to reinvent any wheels for data access solutions before examining whether these models are useful.
The Economic and Social Research Council, via the Department for Business, Innovation and Skills (BIS), has invested millions of pounds over the past five years: from 2010 in the Secure Data Service at Essex (now part of the UK Data Service), and from 2013 in the Administrative Data Research Network (ADRN), coordinated by the Administrative Data Service, also based at Essex. This funding has allowed us to develop the Safe People, Safe Projects, Safe Settings, Safe Outputs and Safe Data protocols (the 5 Safes) and to test and fully implement designated Approved and Accredited Researcher pathways that both UK government and the ESRC currently use for user approval and access to sensitive data.
Here is a bit more detail provided by our resident expert, Dr Richard Welpton. Please also note our forthcoming training course on September 17/18th for this community on these protocols.
The UK Data Archive has provided safe and secure access to confidential and sensitive microdata for nearly four years now. First under the Secure Data Service, and since October 2012 through the UK Data Service, we have provided secure remote access to data deemed too confidential for download. Our remote access solution, known as the Secure Lab, enables bona fide researchers who have completed the necessary steps to log in to a secure server based at the UK Data Archive at the University of Essex, and access data for analysis. Once the researcher’s analyses are completed, the statistical results are screened by experienced staff to ensure no link can be made between the results and the original data, thereby preserving the confidentiality of the data subjects. The results, when declared ‘safe’, are returned to the researcher.
This service builds upon international best practice of secure data access, established throughout the world by initiatives such as the UK Office for National Statistics Virtual Microdata Laboratory, and the NORC Data Enclave at University of Chicago. We operate the facility on five simple protocols:
Safe People
Only ‘trusted researchers’ from UK Higher Education Institutions and other ESRC-funded research institutes may access data through the Secure Lab. These are researchers whose interest in accessing the data is purely to serve the ‘public good’. To apply, researchers must register with the UK Data Service using their institutional credentials. They must then submit a full project application, describing clearly their research proposal, their data requirements, and a justification for accessing these data (explaining why less sensitive sources would not suffice). They must also complete an ‘individual’ application, in which they demonstrate their suitability for accessing and handling such data (for example, they must have prior experience of handling such data, or be supervised by a colleague who has). In addition, they must read and sign a User Agreement, which must also be counter-signed by a legal representative of their institution. Finally, they undertake a mandatory full-day training course, which covers the following topics: the legal and ethical responsibilities of accessing confidential/sensitive data; statistical disclosure control; and using the Secure Lab.
Safe Projects
We only allow projects that ‘serve the public good’ to be undertaken in the Secure Lab. Projects that would try to exploit confidential information, or to identify and exploit data subjects, are strictly prohibited. Researchers must explain how their research will benefit society when they apply.
Safe Settings
The UK Data Archive provides the Secure Lab, which is a secure facility for providing access to confidential/sensitive data. We use Citrix secure remote access technology, frequently used by the banking and military sectors and renowned for its robustness. In addition, the UK Data Archive is accredited to the ISO 27001 Information Security standard. This means that the Archive operates an Information Security Management System, creating a culture of information security, continuous improvement, and vigilance among our staff, who apply best-practice standards for handling confidential data. We also hire ‘ethical hackers’ every year to undertake penetration tests of our secure servers. The results of these tests, and the outcomes of our six-monthly ISO surveillance audits, are made available to the dozen or so Government Departments and other agencies that regularly supply us with confidential data to make available to researchers.
Safe Outputs
Only statistical outputs (results) that have been screened by staff, to ensure they cannot be used to identify the data subjects, can be released to the researcher. These typically include ‘descriptive statistics’ that have been sufficiently aggregated that identification is near enough impossible, and modelled output (regression coefficients etc.), which is inherently non-confidential.
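As a rough illustration of the kind of check that screening descriptive statistics can involve, the sketch below flags cells in a frequency table whose counts fall below a minimum threshold. The threshold of 10 and the function name `flag_small_cells` are assumptions made purely for illustration; they are not the Secure Lab's actual rules, and real output checking relies on the judgement of experienced staff rather than a single automated test.

```python
from collections import Counter

# Hypothetical threshold: an assumption for illustration, not a Secure Lab rule.
MIN_CELL_COUNT = 10


def flag_small_cells(records, keys, min_count=MIN_CELL_COUNT):
    """Return the cells of a frequency table whose counts fall below min_count.

    records: iterable of dicts, one per data subject
    keys: the variables to cross-tabulate
    """
    # Count how many records fall into each combination of the chosen variables.
    counts = Counter(tuple(rec[k] for k in keys) for rec in records)
    # Keep only the cells that are too sparsely populated to release as-is.
    return {cell: n for cell, n in counts.items() if n < min_count}


# Toy data: 12 subjects in one cell, 3 in another.
records = (
    [{"region": "East", "sector": "Retail"}] * 12
    + [{"region": "East", "sector": "Mining"}] * 3
)
unsafe = flag_small_cells(records, ["region", "sector"])
print(unsafe)  # {('East', 'Mining'): 3}
```

A table with flagged cells would normally be aggregated further (for example, by merging categories) before being considered for release.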
Safe Data
This is a misnomer. Because of the high standards described above, we are able to provide access to ‘unsafe’ data. These are data whose subjects are relatively easy to identify (such as business data), and data about individuals with sensitive variables (e.g. the sexual preferences of young people, children’s school results etc.). Of course, direct identifiers such as names and addresses have been removed, but the data are still confidential/sensitive, and are considered ‘personal’ under the Data Protection Act.
Working with others
We work extensively with other organisations to promote and practise safe access to confidential data. We regularly meet our partners at the HMRC Datalab, Ministry of Justice Datalab, and Office for National Statistics Virtual Microdata Laboratory, to share our experiences. The addition of the Administrative Data Research Network has led to new thinking on secure data access, and we work very closely with this network, regularly sharing policies and procedures to enable a consistent and joined-up approach. For example, we have recently finished developing a National Researcher Accreditation training course with our partners, which will run from April 2015. We also intend to create a working group to consider a consistent approach to Statistical Disclosure Control. And we regularly meet our counterparts in Germany and France: the recent EU-funded Data without Boundaries project has provided an opportunity to achieve consensus on secure access at a European level.
In our efforts to provide secure access to confidential data, we remain very willing to engage with new service providers. We believe we can all benefit from sharing best practice and learning from everybody’s experiences. Despite different data collections and contexts, there are many similarities in this field, and rather than reinventing the wheel, we encourage dialogue with existing services.
We are holding a two-day workshop in Manchester on 17/18th September which will explain how the 5 Safes work; if you have been tasked with setting up a secure research facility, this course is for you!
The event will include smaller ‘break-out’ sessions where you can interact with experienced service staff from the UK Data Service, HMRC, ONS, MoJ and ADRN. Here you will have the opportunity to raise any questions specific to your own work, plus there are optional 1-to-1 surgeries on day two where you can bring any specific topics or challenges to discuss. Find out more about this event here.