It’s not such ‘A Fair Way Off’ to process open data: Facing requirements on open access and the FAIR data principles.

Sebastian Netscher from GESIS – Leibniz-Institute for the Social Sciences explores the FAIR data principles and how they can support increasing transparency in research.

In the context of Open Access to research data – also referred to as Open Data – the FAIR data principles (Wilkinson et al. 2016) are increasingly being integrated into research projects (see, for example, the Guidelines on FAIR Data Management in Horizon 2020). In requiring that data be

findable (F),
accessible (A),
interoperable (I) and
reusable (R),

funders and journals expect that facilitating Open Data will (inter alia) increase transparency in research, foster researcher innovation and scientific cooperation (both international and cross-disciplinary), and ensure efficient use of public funding. However, the requirements for producing data that meet all of the FAIR data principles can be challenging for many researchers.

Much can be made clearer by remembering three simple statements about the FAIR data principles:

FAIR data is not a synonym for Open Data

While Open Data should be available to everyone and for all purposes, access to FAIR data can be restricted, e.g. due to legal issues and the obligation to protect personal information, and the data can still be FAIR. Moreover, FAIR does not prescribe what the data itself must look like.

FAIR data principles are best understood as a multi-dimensional continuum

“There is no such thing as ‘unfair’” (Mons et al. 2017). Instead, the FAIR data principles are best understood as a multi-dimensional continuum composed of four (more or less) independent FAIR facets. For example, findability is not a fixed feature of data. The ‘findability’ of research data can be improved, for instance, by mapping domain-specific standardised information about the data (i.e. metadata) to the metadata standards of other research domains, so that the metadata is distributed more widely.
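To make the idea of mapping metadata between standards more concrete, here is a minimal sketch in Python. It assumes a simple one-to-one crosswalk: the source field names on the left are hypothetical, and the target names are borrowed from common Dublin Core element names purely for illustration. Real crosswalks are usually maintained by archives and are rarely this tidy.

```python
# Hypothetical crosswalk: domain-specific field names (left, invented for
# this example) mapped to Dublin Core-style element names (right).
CROSSWALK = {
    "study_title": "title",
    "principal_investigator": "creator",
    "collection_year": "date",
    "access_conditions": "rights",
    "persistent_id": "identifier",
}


def map_metadata(record, crosswalk):
    """Rename the fields of a metadata record according to a crosswalk.

    Returns a tuple (mapped, unmapped): `mapped` holds the values under
    their target field names, `unmapped` lists source fields that have no
    counterpart in the crosswalk and would need manual attention.
    """
    mapped = {}
    unmapped = []
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            unmapped.append(field)
        else:
            mapped[target] = value
    return mapped, unmapped


# Illustrative record; all values are made up.
record = {
    "study_title": "Example Survey 2015",
    "principal_investigator": "Doe, Jane",
    "collection_year": "2015",
    "sampling_procedure": "Simple random sample",
}

mapped, unmapped = map_metadata(record, CROSSWALK)
```

Fields that cannot be mapped are reported rather than silently dropped, which mirrors the point that findability is a matter of degree: each additional field that can be mapped makes the data a little easier to discover in another domain's catalogues.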

FAIR is less a matter of the data itself than it is of the data’s metadata

While the FAIR data elements remain the “ultimate goal”, having “FAIR metadata is of very high value in its own right” (FORCE11). In other words, although the data itself might not be accessible, being able to find information online about the data, and about why access to it is restricted, achieves one aspect of FAIRness.
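As a rough illustration of metadata remaining useful when the data is restricted, consider the sketch below. The record structure, the field names, the `describes_access` helper and the placeholder identifier are all assumptions made for this example; they do not follow any particular metadata standard.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """Hypothetical, simplified metadata record for a dataset."""
    identifier: str          # placeholder persistent identifier
    title: str
    access_level: str        # "open" or "restricted" in this sketch
    access_reason: str = ""  # why access is restricted, if it is


def describes_access(record):
    """Return True if the record states its access conditions clearly.

    Open data needs no justification; restricted data should carry a
    non-empty explanation of why access is limited.
    """
    if record.access_level == "open":
        return True
    return bool(record.access_reason.strip())


# Restricted data whose metadata still tells a reader what to expect.
documented = MetadataRecord(
    identifier="doi:10.0000/example",  # not a real DOI
    title="Example Interview Study",
    access_level="restricted",
    access_reason="Contains personal information; available on request.",
)

# Restricted data with no explanation at all.
undocumented = MetadataRecord(
    identifier="doi:10.0000/example-2",  # not a real DOI
    title="Example Panel Study",
    access_level="restricted",
)
```

In this sketch both datasets are equally inaccessible, but only the first record lets a potential user find out why and what to do about it.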

From these three statements, we can draw two simple conclusions for supporting researchers who process Open Data on the basis of the FAIR data principles.

The first is that researchers should focus on processing shareable data that is as open as possible, enabling the widest possible user community to continue working with the data.

The second is that researchers whose aim is to make their data FAIR should rely on existing research infrastructures, i.e. data archives and repositories, and their standards and guidelines. In general such institutions:

assign a unique and persistent identifier to the data, making it citable, and increase findability by registering the data in online data catalogues,
manage data access in the long run,
facilitate interoperability (of metadata) by mapping and harvesting metadata, and
license data appropriately and thus support its reusability.

In conclusion, realising the FAIR data principles does not depend solely on individual researchers processing shareable data. It is the responsibility of the whole research community to make metadata and data as FAIR as possible in order to facilitate their reuse.

About the author

Dr. Sebastian Netscher – CESSDA Training at GESIS – Leibniz-Institute for the Social Sciences

Data Impact blog