Professor David Martin has led the way in modelling populations in geographical information systems.
His career with the UK Data Service and its predecessors has been a long and rich one. In the first of two blog posts, we catch up with him to talk about his key achievements and contributions to the field.
Much of your career has been driven by the census – what is it that fascinates you about census data?
For me, the fascinating thing about the census has always been that it’s the only source of information that provides a really geographically-detailed picture of the UK population – the single most comprehensive source of data we have, in fact.
That matters, because it provides the denominators for everything else that we’re interested in.
We may be exploring detailed ethnicity, people’s journeys to work or migration patterns – and the data on all these topics (and many more) are there. It’s the comprehensive nature of the dataset that means we can do all these things with it in a completely unique way, which I find incredibly exciting.
How and why did you start working with census data?
I’m a geographer and have been fascinated with the census ever since I started: I can even remember sitting in my school library in the early 1980s, engrossed in an early census atlas of London (based on 1971 data!).
My PhD was about trying to build a mapping system for healthcare data when there weren’t really any established computer mapping tools around. I rapidly discovered that I’d need denominator data and it was going to have to come from the census.
‘Number of people with health events’, ‘number of people who speak another language’ or ‘number of people who are unemployed’ can’t be properly interpreted unless we have the denominator – the context within which the data lies.
Most of the time we need a reasonably sophisticated denominator. It’s not a great deal of help knowing the crude rate of a disease.
To plan an intervention, you would need data about age, sex etc. With that context, we can then standardise the data and understand how the disease is affecting people, for example, in more deprived areas or amongst people with particular ethnicities. Understandably, these are questions we’ve needed to ask during the pandemic.
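To illustrate why a crude rate alone can mislead, here is a minimal Python sketch comparing crude and directly age-standardised rates for two hypothetical areas. Every figure, the two age bands and the function names are invented for illustration; none of it is census data or a method attributed to the ONS.

```python
def crude_rate(cases, population):
    """Total cases divided by total population, ignoring age structure."""
    return sum(cases) / sum(population)


def directly_standardised_rate(cases, population, standard_population):
    """Apply each age band's observed rate to a shared standard population."""
    expected = sum(
        (c / p) * s
        for c, p, s in zip(cases, population, standard_population)
    )
    return expected / sum(standard_population)


# Hypothetical age bands: [younger, older]. The census supplies the denominators.
standard = [2000, 2000]  # assumed reference population

area_a_pop, area_a_cases = [1000, 3000], [10, 300]  # older population
area_b_pop, area_b_cases = [3000, 1000], [30, 100]  # younger population

for name, cases, pop in [
    ("A", area_a_cases, area_a_pop),
    ("B", area_b_cases, area_b_pop),
]:
    print(
        name,
        "crude:", round(crude_rate(cases, pop), 4),
        "standardised:", round(directly_standardised_rate(cases, pop, standard), 4),
    )
```

In this made-up example both areas have the same rate within each age band, so their crude rates differ only because area A has an older population. Standardising against a common age structure removes that difference, which is only possible when the age-specific denominators are available.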
For my PhD research, I had to discover a way to build the denominator I needed at a small area level. Consequently, my PhD shifted from the original topic and became focused on the best way of mapping the population. I’ve basically spent the last thirty-five years doing just that!
Along the way, I got involved with the Office for National Statistics (ONS) and began persuading them to use new, improved methods for census geography design.
During the 1990s, I built a prototype for a different set of small areas.
In the past, census areas were basically drawn on maps by hand, then computerised and filled in with enumerated data. There were various problems with the process, including ‘holes’ in the maps where data for small areas couldn’t be published as it risked breaking confidentiality for the people living there. It was a very hit-and-miss way of working.
I was invited to become a guest researcher at the ONS and suggested that Geographic Information Systems (GIS) – very much a new idea at the time – could be used to rethink the census small areas. I coded a prototype automated design system, which let them work with a whole host of constraints to devise new population geographies and characteristics.
Despite some nervousness from the ONS about using this new system, it was finally given the go-ahead for the 2001 census. And it worked.
For the first time we produced a set of ‘census output areas’. These new output areas were designed so that there were no suppressed areas in the outputs and so that individuals’ confidentiality would be maintained by design.
What I hadn’t anticipated at the time was how popular this approach would be – everyone was getting into GIS and wanted to work with small area data.
Confidentiality measures, however, meant that a lot of that data still couldn’t be published within these output areas, so the ONS had the idea of using the same approach for something a bit bigger – ‘super output areas’ – to marry the needs of both confidentiality and small area outputs.
The ONS have said that the output areas, recoded from my original version, were one of the most successful innovations from the 2001 Census.
That success has meant that they have continued to be used for the 2011 and 2021 censuses. The super output areas have even proved their worth when monitoring the pandemic.
And all starting from the prototype I produced in 1996!
That’s an impressive contribution to the field – you changed the geography of the census!
Yes, I suppose you could say that.
If you look at any map of England, Wales or Northern Ireland census data (not Scotland, as they have their own system), what you’ll see is effectively the boundaries that my work on small output areas generated (albeit with lots of work from others along the way!)
Talking of changing geography, you’ve been developing another approach, haven’t you?
During my PhD, I got very involved with building gridded models of population.
Using a grid mapping approach, the total population is reallocated into regular grid squares on the map. Squares covering unpopulated areas are left empty.
It’s a really simple idea, but quite hard to bring off.
What it offers, though, is much more realistic population densities in different contexts where people live, whether that be city centres or villages.
Gridded maps also incorporate genuinely unpopulated spaces, meaning there’s no risk of trying to attach rates of unemployment or deprivation to places where there aren’t people, like mountain sides or open countryside. For these reasons, gridded maps are better suited to studying change over time and to linkage with data from environmental models.
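The basic idea of reallocating population into regular squares can be sketched in a few lines of Python. This is a deliberately simplified stand-in that bins point-located population counts into cells; it is not the modelling method discussed in the interview, and the coordinates, counts, cell size and function name are all hypothetical.

```python
from collections import defaultdict


def grid_population(points, cell_size):
    """Sum population counts into square cells of side `cell_size`.

    `points` is an iterable of (x, y, population) tuples in projected
    coordinates such as metres. Cells that receive no population never
    appear in the result, so unpopulated space stays empty.
    """
    grid = defaultdict(float)
    for x, y, population in points:
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell] += population
    return dict(grid)


# Hypothetical population points: two close together, one further east.
points = [(120, 80, 300), (180, 40, 200), (1500, 900, 50)]
grid = grid_population(points, cell_size=1000)

for cell, population in sorted(grid.items()):
    density = population / (1000 * 1000)  # people per square metre
    print(cell, population, density)
```

Because every cell has the same area, the counts are directly comparable as densities, and the grid stays fixed even if administrative boundaries change between censuses.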
Quite a few countries around the world routinely think of their census data as a gridded dataset, which I find fascinating, as it shows that other countries think about their social data very differently to here.
Northern Ireland has produced gridded population data from the census since 1971. England and Wales did it in 1971 but it wasn’t continued, meaning there’s been an amazing loss of potential insights.
That said, the ONS have committed to doing some gridded data outputs from the 2021 Census – the first time in over fifty years that we might get some official gridded outputs for England!
In part two of this profile of Professor Martin, he discusses time-space population modelling, census advocacy and his plans after stepping back from the UK Data Service.
About the author
David Martin is a Professor of Geography at the University of Southampton and a Deputy Director of the UK Data Service.
He has been involved in several major data initiatives, including as Director of ESRC’s former Census Programme. His research in geographic information science led to development of the system of census output areas currently used in England, Wales and Northern Ireland. David was awarded an OBE in 2019 for services to Geography and Population Studies.