As part of our continuing census series, James Reid introduces the opportunities, challenges and solutions when approaching different geographies in the UK censuses.
The UK decennial censuses of population are unique in their geographical coverage and depth – not least because of the fine geographical detail at which they capture primary socio-economic and demographic data.
Why ‘geography products’?
Census geography products (typically expressed as digital data for use in statistical and mapping software) supplement the census statistical information in published reference tables (the ‘aggregate’ data, which most people would consider ‘the census’, i.e. tables of numbers). They provide a wealth of associated datasets which describe, for example, the relation of one geographical area to another, and which cross-walk areal codes so that it is possible to traverse the census geography at different spatial scales, from street level to country level.
The core geography census products are effectively the areal ‘buckets’ into which the aggregate (population) count information can be poured for purposes of mapping, visualisation and statistical analyses.
Many users of the UK censuses regard this so-called ‘small area’ information as an unparalleled source for base-lining population characteristics. It is rich enough to support comparison within and between areas at a sufficiently localised scale for policy and planning interventions to be meaningfully expressed ‘on the ground’.
Local councils, for example, use the census information at small area level, inter alia, to forecast school enrolment populations, estimate demand for local health provision and inform infrastructure provision for housing demand.
The UK censuses of population are the only reliable national surveys that offer such a fine-grained and data-rich description at small area level – the mosaic of which covers the entire UK territorial area. These small areas comprise the individual jigsaw puzzle pieces that together provide the picture puzzle view of contemporary society and allow macro aspects of household and population change to be discerned, e.g. ageing population structure, the rise of single person households etc.
Not just one census
In many respects the census is thus a ‘bottom up’ exercise that has its physical expression as the ground truth at small area level.
What do we mean, though, by ‘small area’? It is a rather subjective term, and one that varies by context and over time. The UK censuses, whilst normally occurring simultaneously in all parts of the UK (except this time, when Scotland’s census will be conducted a year later!), are actually conducted separately by three different agencies – the responsible body in England and Wales is the Office for National Statistics (ONS), in Scotland the National Records of Scotland (NRS) and in Northern Ireland the Northern Ireland Statistics and Research Agency (NISRA).
As you might suspect, that means there are inevitably subtle differences in e.g. terminology – the smallest census geographical unit of recording in England & Wales and Scotland is the ‘Output Area’, whilst in Northern Ireland for 2011 the comparable geographical unit was referred to as the ‘Small Area’!
The challenge of standardisation
Each agency defines the census geographies used for its country, along with the methodology and thresholds it uses to derive them.
All, however, attempt to capture population characteristics at a geographical level which ensures the anonymity of individual census respondents. That has always been the case with the census, but in an era of Cambridge Analytica/Facebook privacy paranoia, the importance of securing people’s responses to quite personal questions cannot be stressed enough.
The methodology for the construction of these areas has varied over time, but in the last few censuses a fundamental objective has been to try to maximise consistency both across space and over time – the Output Area (and its equivalents) now provides the basic geographical unit for describing areal characteristics and is the ‘building block’ from which all higher geography types are derived.
The rationale for this is to provide a convenient mechanism by which new geographies can be constructed via aggregation, whilst at the same time allowing for flexibility in comparing areas and their changes over time – one of the outstanding features of UK geography (not just census geography) is its mutability over time, which complicates temporal analyses.
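To make the ‘building block’ idea concrete, here is a minimal sketch of aggregating Output Area counts into a higher geography through a code lookup. The area codes, the counts and the function name `aggregate_counts` are all made up for illustration; real lookups and counts would come from the agencies’ published products.

```python
from collections import defaultdict

# Hypothetical lookup: Output Area code -> parent (higher geography) code.
# These codes are invented for illustration and are not real census codes.
OA_TO_PARENT = {
    "OA001": "WARD_A",
    "OA002": "WARD_A",
    "OA003": "WARD_B",
}

# Hypothetical population counts per Output Area.
OA_POPULATION = {"OA001": 310, "OA002": 285, "OA003": 402}


def aggregate_counts(counts, lookup):
    """Sum building-block counts into their parent areas."""
    totals = defaultdict(int)
    for oa_code, count in counts.items():
        parent = lookup[oa_code]  # raises KeyError if an OA has no parent
        totals[parent] += count
    return dict(totals)


ward_population = aggregate_counts(OA_POPULATION, OA_TO_PARENT)
print(ward_population)  # {'WARD_A': 595, 'WARD_B': 402}
```

Swapping in a different lookup table is all it takes to build a different higher geography from the same Output Area counts, which is the flexibility that aggregation from a common building block is meant to give.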
A range of techniques has been proposed to try to counter the comparability problems that changing geographies present to analysts, from grid count based solutions to dasymetric and cartogram mapping techniques. In any case, cross comparison between censuses remains an ongoing topic of research. Any intercensal analysis must account for the fact that UK census geography and the definition of ‘small areas’ have varied over time, and that they represent as much a convenience mechanism for conducting the census as a means of definitively expressing geographical variation across space in a consistent and time comparable way.
Interested readers may wish to Google the term ‘MAUP’ (the modifiable areal unit problem), although a word of caution – that’s one for the geography nerds! The take-away message is that census users of small area data must temper their conclusions by recognising that there is always a degree of ‘apples and oranges’ in comparability, no matter how sophisticated the approaches adopted to redress the small area comparability problem.
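For the curious, a toy sketch of the MAUP (the modifiable areal unit problem) shows why the choice of boundaries matters. The four small areas, their counts and the two zonings below are entirely invented; the point is only that the same underlying data can tell two different stories depending on how it is grouped.

```python
# Hypothetical small areas: (population, residents aged 65+). Made-up numbers.
SMALL_AREAS = {"a": (100, 10), "b": (100, 50), "c": (100, 50), "d": (100, 10)}


def zone_rates(zoning):
    """Return the share of residents aged 65+ for each zone in a zoning."""
    rates = {}
    for zone, members in zoning.items():
        population = sum(SMALL_AREAS[m][0] for m in members)
        older = sum(SMALL_AREAS[m][1] for m in members)
        rates[zone] = older / population
    return rates


# Two alternative ways of grouping the same four small areas into two zones.
zoning_1 = {"Z1": ["a", "b"], "Z2": ["c", "d"]}
zoning_2 = {"Z1": ["a", "d"], "Z2": ["b", "c"]}

print(zone_rates(zoning_1))  # {'Z1': 0.3, 'Z2': 0.3}
print(zone_rates(zoning_2))  # {'Z1': 0.1, 'Z2': 0.5}
```

Under the first zoning the two zones look identical; under the second a sharp contrast appears, even though nothing about the underlying population has changed.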
Challenges and solutions
As the foregoing implies, UK census geographies present a range of issues and challenges to prospective users.
Firstly, the fact that there are three separate, independent but (mostly) cooperating census agencies adds to the complexities – each has its own principal census website, its own dissemination mechanisms, its own publishing time-scales and, more importantly, its own individualistic approach to data capture and output.
A significant part of what the UK Data Service census team does is to provide a single point of entry to all official census geography outputs (and for that matter all census products, not just the geographical ones). In so doing we commit significant effort to ingesting, managing, quality assuring and processing data from the various agencies.
An illustration of value adding by quality assuring is the issue of so-called ‘sliver’ polygons – often artefacts introduced at creation time due to the semi-automatic approaches used to generate Output Areas (OAs).
Slivers are effectively very small polygons enclosed within the larger parent polygon which represents the OA itself. Often not apparent except at very large scale, these small residual artefacts can become problematic as OAs are aggregated to produce higher geographies, and they can become visually intrusive when maps are produced based on them. They also produce incorrect polygon counts for area types, as the slivers are regarded as legitimate areal features in most GIS. We have developed an ingest workflow which automatically tests for slivers and removes them, so that the derivation of the higher geography types does not suffer from excessive polygon counts due to their inclusion.
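As a rough illustration of what a sliver test might look like, the sketch below flags polygons whose area falls under a threshold. This is not the ingest workflow itself, whose details are not described here: the area-only criterion, the `min_area` value, the function names and the coordinates are all assumptions made for the example.

```python
def polygon_area(ring):
    """Return the unsigned area of a simple polygon via the shoelace formula.

    `ring` is a list of (x, y) vertex tuples; the ring need not be closed.
    """
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def split_slivers(polygons, min_area):
    """Separate polygons into (kept, slivers) using an area threshold."""
    kept, slivers = [], []
    for ring in polygons:
        (slivers if polygon_area(ring) < min_area else kept).append(ring)
    return kept, slivers


# Made-up coordinates in metres: one plausible OA and one tiny artefact.
output_area = [(0, 0), (400, 0), (400, 300), (0, 300)]
artefact = [(100, 100), (100.5, 100), (100.5, 100.2)]

# min_area=1.0 (square metres) is an arbitrary threshold chosen for the example.
kept, slivers = split_slivers([output_area, artefact], min_area=1.0)
print(len(kept), len(slivers))  # 1 1
```

A production check would likely look at more than raw area (shape, for instance, or whether the polygon sits wholly inside a parent), but even this simple filter shows how removing slivers keeps polygon counts honest before OAs are aggregated.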
Additionally, the data are ‘value added’ by providing a functionally rich online tool suite as well as through the provision of additional ‘harmonised’ data products that provide UK wide census products for key census geographies.
The UK Data Service catalogue is the principal data-finding aid provided by us to allow casual users an intuitive entry point to locate the myriad of census outputs. Users more au fait with the range of census products and services available will likely short-cut direct to the various census tools themselves and use the bespoke filtering and search tools provided to refine their data requests.
Discovery is of course only the first step towards use.
Users face further challenges:
- format usability – can they deal with the format the data is delivered in?
- data volumes
- downstream tooling – does the user have the required software to use the data?
- and even with adequate metadata – can the user actually meaningfully interpret and use the data as supplied?
We offer a range of approaches to dealing with these issues which ultimately make the data more useful, save users time and do a lot of the unglamorous but essential data cleaning and verification for them.
As census day for England, Wales and Northern Ireland passes in 2021 (and in Scotland approaches in 2022), we will continue to provide the tools and support to ensure these quintessentially vital socio-economic data are made broadly available to all our users with the minimum of friction.
About the author
James Reid is Co-Investigator on the UK Data Service Census and oversees the delivery of the geography outputs of the census. James has over 25 years’ experience in the geospatial industry working in both the public sector and academia. He is a Board member of the Association for Geographic Information (Scotland) and sits on its GEMINI Working Group defining the UK’s formal geospatial metadata standard.