Atlas of Longitudinal Datasets: A New Tool for Data Discovery

 

Georgina Brown and Siann Hirani introduce the Atlas of Longitudinal Datasets.

 

 

By following groups of people over time, longitudinal datasets are powerful research tools that can offer valuable insights into areas such as human development, population trends, health outcomes, and societal change.

However, longitudinal data are often underused because they can be difficult to discover and access. The Atlas of Longitudinal Datasets aims to change that: it is a free, user-friendly platform designed to increase the discoverability of longitudinal datasets from across the world.

The Atlas contains information about longitudinal datasets across a variety of topics and study designs. It includes key details about datasets such as a sample description, data types collected, data collection methods, lived experience involvement, and more!

Users can search for datasets, filter their searches to narrow down the number of datasets relevant to their needs, compare features and save their favourite datasets to return to later. Over 2,000 datasets worldwide are currently included on the Atlas, and more are being added daily!

Importantly, the Atlas also provides information on data sharing policies and directs users to relevant data access guidelines, sharing platforms, and contact points. This helps users find the data they need, supporting the discoverability and accessibility of longitudinal data.

Data accessibility in the UK and worldwide

Overall, 76% of the datasets on the Atlas are accessible. Of all datasets, 18% host data directly on their websites or through data sharing platforms, and 58% are accessible by contacting the study team or corresponding authors, whose details are usually found in publications.

However, data access methods differ depending on where the datasets are based.

Datasets based in the UK:

Pie chart showing how accessible datasets based in the UK are:
- accessible by contact with study team: 56%
- accessible via study website or data sharing platforms: 26%
- not accessible: 18%

Datasets based outside of the UK:

Pie chart showing how accessible datasets based outside the UK are:
- accessible by contact with study team: 58%
- accessible via study website or data sharing platforms: 17%
- not accessible: 25%

UK-based datasets are more likely to be accessible via study websites and data sharing platforms than non-UK-based datasets (26% versus 17%). Datasets based outside of the UK, on the other hand, rely slightly more on contact-based data access (58% versus 56%) and are more often not accessible (25% versus 18%).
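For readers who want to check the comparison themselves, here is a minimal sketch in Python that uses only the percentages from the two pie charts above. The dictionary keys and function names are our own labels for illustration, not fields or functions from the Atlas.

```python
# Percentages as reported in the two pie charts above.
# The keys are illustrative labels, not Atlas field names.
uk = {"contact": 56, "platform": 26, "not_accessible": 18}
non_uk = {"contact": 58, "platform": 17, "not_accessible": 25}


def accessible_share(breakdown):
    """Share of datasets reachable by any route (contact or platform)."""
    return breakdown["contact"] + breakdown["platform"]


def gap(category):
    """UK share minus non-UK share for one category, in percentage points."""
    return uk[category] - non_uk[category]


for name, breakdown in (("UK", uk), ("non-UK", non_uk)):
    # Each chart should account for all datasets in its group.
    assert sum(breakdown.values()) == 100
    print(f"{name}: {accessible_share(breakdown)}% accessible by some route")

print(f"Platform access gap: {gap('platform')} percentage points")
print(f"Not-accessible gap: {gap('not_accessible')} percentage points")
```

A positive gap means the category is more common among UK-based datasets, and a negative gap means it is more common among datasets based elsewhere.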

How do data custodians share their data?

Study websites or data sharing platforms

These may offer direct downloads of data or require user registration and login on Trusted Research Environments (TREs). Many datasets require interested data users to submit an application or research proposal, often found via the platform or study website, which is reviewed by the data custodians.

Access can also be restricted, such as for research use only. In some cases, access to data involves fees, which vary across datasets and data variables collected.

Contact with study team

Unlike datasets available through data access platforms or study websites, those that provide only an email contact for data access requests often make less information readily available, such as data sharing policies, eligibility criteria, or data dictionaries.

This lack of information can make it challenging to access the data or to ascertain the steps involved in requesting it and any associated fees. Providing only an email contact can also make data access particularly challenging when contact details are outdated or no longer monitored.

The UK Data Service: a key platform for data access

One of the data repositories that hosts many longitudinal datasets is the UK Data Service (UKDS). The UKDS provides access to economic, population, and social research data in the UK. Among these are several important UK longitudinal studies, such as the 1970 British Cohort Study (BCS70), Understanding Society, and the National Child Development Study (NCDS).

The UKDS outlines the access conditions for each dataset, simplifying the process of accessing data. After registering for an account, users can download or request access to data. Some datasets are only available to certain groups, such as researchers, and restrictions are specified for each dataset.

By enhancing data accessibility, platforms such as the UKDS are incredibly valuable.

Additional data sharing platforms used across the world include the Database of Genotypes and Phenotypes (dbGaP), the European Bioinformatics Institute, and the National Institute of Mental Health Data Archive (NDA).

Data access difficulties and improving data access

While it is fantastic that so many datasets on the Atlas are easily accessible, many are not, posing a barrier for those interested in collaboration or using existing data.

Unfortunately, around a quarter of all datasets on the Atlas are not accessible. Most often, this is because there is no information provided about data access, particularly for legacy datasets or studies that were completed a long time ago.

So how can we improve transparency and ease of data access?

Make more datasets available through repositories or TREs

Data repositories such as the UKDS provide clear guidelines on how to access data and a direct route to doing so, making it easier to use existing longitudinal data. This is especially valuable for datasets based outside the UK, which tend to be less accessible.

Provide data availability statements

If datasets cannot be shared on a repository, providing any information on data access is useful. Encouragingly, this is becoming more common with the increasing inclusion of data availability statements in academic articles.

Increase transparency

Providing more detailed information on data access eligibility and sharing terms is vital to improving the transparency and clarity of data sharing.

Conclusions

The Atlas of Longitudinal Datasets enables researchers, funders, governments and the public to discover a wide variety of existing longitudinal datasets from around the world, of different designs and topics.

Tools such as the Atlas and data repositories like the UKDS enable researchers to maximise the potential of these valuable longitudinal datasets.

However, while the level of access to datasets on the Atlas is encouraging, particularly in the UK, we need better strategies and services for improving data access.

Visit the Atlas of Longitudinal Datasets.


About the authors

Siann and Georgina are Research Assistants at the Social, Genetic and Developmental Psychiatry (SGDP) Centre, King’s College London. They have been working with the team on the Atlas of Longitudinal Datasets since January 2025.

 


Comment or question about this blog post?

Please email us!