Citing your Data with the Secure Lab

Deborah Wiltshire and James Scott from the UK Data Service outline why citing secondary data is so important for the research community, and what the Support Team at the UK Data Service Secure Lab is doing to help increase levels of citation.

It has long been the standard in all disciplines to cite the sources of information that we use in our research, whether that’s a book chapter, journal paper or another source.

Researchers are taught very early on in their studies about the importance of correctly citing an array of different sources, and about plagiarism and its consequences. Software such as TurnItIn is often employed by universities to help detect plagiarism, and key texts such as Cite Them Right set out very clearly how to cite.

However, what is often missing is clear instruction on how to cite secondary data such as those produced from social and business surveys, which has led to the very real problem of under-reporting of data in bibliographies.

So why should we cite data and why does it matter if we don’t?

We cite publications so that we can acknowledge information and ideas that are not our own and so that we can give due credit to the authors of that work. So why is secondary data different?

The answer is of course that it is not!

A dataset is a source of information that is not our own, and thus if we use a secondary data source then we should acknowledge it as such. If the failure to properly cite a publication is an academic breach, then failure to cite data should be considered in the same way.

Why is this so important?

In addition to acknowledging authorship or ownership, there is another fundamental reason why poor levels of data citation should concern the research community: impact. All parties in the research community must demonstrate their impact – individual researchers, research centres, data archives and the survey teams who collect and produce the data. Very simply – demonstrated impact equals continued funding!

Social surveys are a vital source of information – they inform policy and they highlight social inequalities and issues, but they are expensive and labour intensive to run.

One of the key ways that they can prove their impact is through demonstrating how many publications cite their data. And herein lies the problem: if data is not correctly and consistently cited, the ability of the survey team to prove their impact is greatly impaired and future levels of funding may be affected as a result. In an economy marked by tightened budgets, it is now more important than ever to ensure that researchers are citing the data they use.

Failure to do so risks a diminishing of the rich array of data available to us.

The data catalogue page for each dataset held by the UK Data Archive already includes the correct citation text that users can simply copy & paste. But at the Secure Lab, we’re keen to support the research community in data citation in a number of additional ways:

  • When training future Secure Lab users we now specifically discuss the need to cite the data
  • Output requests – we have now added data citation as one of the minimum criteria
  • User Guide – this now includes guidance on where to find the citation text
  • Online – we have now added a tick-box reminder to the online output request process


Richard Pears and Graham Shields (2005) Cite them right: the essential guide to referencing and plagiarism. Newcastle upon Tyne: Pear Tree Books. This is produced for the social sciences, which use the Harvard referencing system.
