A new integrated approach: training researchers to use sensitive microdata

James Scott, Senior Support Officer at the Secure Lab, part of the UK Data Archive, a partner in the UK Data Service, updates us on the new Safe Use of Research data Environments (SURE) training, which has been designed for new users of Secure Lab and other Research Data Centres.

A more engaged approach to training

It’s often said that “if it ain’t broke, don’t fix it.” But what if you could make something perfectly serviceable even better? This was the thinking behind replacing our established Safe Researcher Certification Course with something more interactive and wider in scope.

It’s vital that users of the remote access UK Data Service Secure Lab share our views on data security and have a full understanding of the processes in place to ensure that the highly sensitive microdata held in the Lab are used in an appropriate, and therefore safe, way. The UK Data Service Secure Lab provides secure access to data that are too detailed, sensitive or confidential to be made available under the standard End User Licence or Special Licence. Our specialised staff apply statistical control techniques to ensure the delivery of safe statistical results. Data accessed in this way cannot be downloaded. Once researchers are specially trained, they analyse the data remotely from their institutional desktop or in our Safe Centre. We provide access to statistical and office software to make remote analysis and collaboration secure and convenient. Our security philosophy is based upon training and trust, leading-edge technology, licensing and legal frameworks, and security policies and penalties.

There are many legal and practical issues to consider when using these secure data and when thinking about the kinds of outputs that can be released into the wider world beyond Secure Lab, once statistical disclosure checks have taken place.

Although our previous course served its purpose well, we felt that researchers would benefit from a course that contained more classroom exercises and enhanced participation to help reinforce the important messages being imparted.

Collaborative and efficient

Another driving force behind the redevelopment of this course was the opportunity to collaborate with other secure access providers. Several Research Data Centres (RDCs) operate across the UK, all of them offering access to sensitive microdata, though usually from a physical location rather than the remote access model used by the UK Data Service. Each RDC provides broadly similar training to users, yet researchers were asked to train through one RDC and then, if they wanted to access data from further RDCs, to undertake a second, or even third, training course. Addressing this duplication was an opportunity for greater efficiency.

The new course was developed through collaboration between the UK Data Service, HM Revenue & Customs, the Office for National Statistics and the Administrative Data Research Network, with the aim of making accreditation transferable between these services. Not only does this reduce the training burden on researchers, who were previously required to travel to different locations to train for each service that they wished to use, it also makes for better use of resources at the RDCs, which are no longer compelled to train researchers who have already been trained elsewhere.

Each service now delivers the same core modules on the legal aspects of using the data and on statistical disclosure control, ensuring that all relevant material is covered, whoever the trainer. Of course, there may be some service-specific practicalities for each RDC, but these can be delivered in a concise manner upon the users’ first visit to each site. Secure Lab is somewhat unusual in that there is no specific location, so a short online service-specific module is being developed to take care of this. Researchers who have initially trained with another service will not need to travel a second time simply to complete this relatively short piece of Secure Lab-specific training.

Feedback on this new training has been very positive, and we’re continuing to refine the course based on participant feedback and the practicalities of delivery. We sincerely wish this to be the best course that it can be so that researchers using these data are equipped with the right knowledge and attitudes to be able to work with these highly sensitive data safely.

So what’s involved?

The course covers many facets of working with these data, including trust in researchers, the ‘five safes’ of the data security model, relevant legislation, sanctions for security breaches, and statistical disclosure control. Researchers are encouraged to think about how all of this relates to their own proposed research. Moreover, the course includes numerous examples of (fake) outputs that participants have to evaluate as ‘safe’ or ‘unsafe’ for release, identifying any problems observed.

The approach helps to cement the core messages and gives the researcher a good idea of what those checking for disclosure issues are looking for and why. This is hugely important: by fully understanding the various ways in which potentially disclosive information may be presented, researchers can avoid submitting outputs for release from Secure Lab that will be returned to them for changes before they can be passed as ‘safe’. The ‘to-ing and fro-ing’ involved when this happens takes valuable time from both researcher and support staff, and we hope that practical training in disclosure issues will reduce it. If the researcher performs an effective initial check for statistical disclosure control issues, then everyone benefits, as outputs are likely to be released sooner.
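One part of such an initial check can be sketched in code. The Python example below is a minimal illustration, not the Secure Lab procedure: it flags small cells in a frequency table, a common focus of statistical disclosure control. The threshold of 10, the function name flag_small_cells and the table contents are all assumptions made for illustration. Each service sets its own guidance, and under a principles-based approach a mechanical check like this supplements judgement about context rather than replacing it.

```python
from typing import Dict, List, Tuple

# Hypothetical threshold for illustration only; the post does not state one,
# and each service publishes its own guidance.
MIN_CELL_COUNT = 10

FrequencyTable = Dict[str, Dict[str, int]]


def flag_small_cells(
    table: FrequencyTable, threshold: int = MIN_CELL_COUNT
) -> List[Tuple[str, str, int]]:
    """Return (row, column, count) for every cell with 0 < count < threshold.

    Zero cells are not flagged here; whether a zero is disclosive depends on
    context and needs separate judgement.
    """
    flagged = []
    for row_label, row in table.items():
        for col_label, count in row.items():
            if 0 < count < threshold:
                flagged.append((row_label, col_label, count))
    return flagged


# Fabricated example table (not real data), in the spirit of the fake
# outputs used on the course.
example_table: FrequencyTable = {
    "Region A": {"Employed": 152, "Unemployed": 34},
    "Region B": {"Employed": 87, "Unemployed": 3},
}

flagged_cells = flag_small_cells(example_table)
for row_label, col_label, count in flagged_cells:
    print(f"{row_label} / {col_label}: count {count} is below {MIN_CELL_COUNT}")
```

A researcher might run a check of this kind before submitting an output, then decide whether to merge categories or suppress the flagged cells. Passing it does not make an output ‘safe’ on its own, since disclosure can also arise in other ways, for example through differences between related tables.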

Any researcher attending this course should leave with a greater understanding of the ‘principles-based’ approach to statistical disclosure control and an increased awareness of how to use these sensitive data as safely as possible, helping them to carry out their analyses in a safe and secure way. This can only be good news for everyone involved (data owners, RDCs, data subjects and researchers) and for the impact of the research, which is supported by its timely dissemination.

Useful links

Find out more about Secure Lab: https://www.ukdataservice.ac.uk/help/faq/securelab

ESSNet SDC: Guidelines for checking of output based on microdata research: https://neon.vb.cbs.nl/casc/ESSnet/guidelines_on_outputchecking.pdf

Data Impact blog
