How (mis)representing ethnicity risks distorting understandings of, and responses to, inequalities through data

Hannah Manzur, from the Violence and Society Centre at City St George’s, examines how UK ethnicity data can (mis)represent lived identities.

Data is an immensely powerful tool for revealing and, when combined with the right motivation and action, reducing inequalities in society.

From evidencing the ethnicity pay gap to tracking changes in hate crime, nationally representative data makes inequalities visible by translating individual experiences into actionable evidence that can be used to effect change.

But what happens when data obscures, rather than reveals, inequalities? What is lost or misrepresented when we translate complex individuals and experiences into statistics? How truly representative is nationally representative data, and for whom?

Ethnic categorisation in national statistics may seem like more of a technical question for data collectors and statisticians, but it actually cuts to the root of a much wider debate about identity, representation, and equality.

For decades, UK national data on ethnicity has relied on grouping people into five main categories: White, Asian, Black, Mixed, and ‘Other’.

Conceptual debates on the meaning of ethnicity have long highlighted its multi-dimensionality and complexity. However, the way we categorise people’s ethnicity has remained largely unchanged for nearly fifty years, and studies of how effectively these categories measure ethnic inequalities are virtually non-existent.

In 2021, a petition to the UK Parliament calling for ‘Latinx/Hispanic’ to be included in the UK Census highlighted the importance of representation in national survey data, asserting, “We are not white, black, Asian, and certainly not ‘other’”.

While using ethnicity data in my PhD on violence and inequalities, I found myself increasingly questioning how well these categories captured people’s own self-identities and experiences.

I spoke to students who, like me, also felt that they didn’t quite ‘fit’ into these categories, despite the categories being the national metric for understanding ethnic differences.

It seemed that in an increasingly diverse society, the need for consistent, neat categories was leaving many people feeling misrepresented and alienated from the data and statistics that claimed to represent them.

The challenge of misrepresentation

In our paper, “(Mis)Representing Ethnicity in UK Government Statistics and Its Implications for Violence Inequalities,” we brought the challenge of misrepresentation to the data itself.

Using the Crime Survey for England and Wales (CSEW), we demonstrated that these issues were more than theoretical. They resulted in distinctive ethnic groups being miscategorised, conflated, and ‘Other’-ed.

In particular, our research challenged the idea of ‘Mixed’ as a cohesive ethnic group rather than a characteristic of ethnic identity, distinguished between different ‘Asian’ groups, and better revealed Arab/MENA and Latinx/Hispanic groups otherwise grouped as ‘Other’.

But more than just critiquing the current categories, we also offered an alternative solution that bridged statistical needs, conceptual critiques, and lived realities.

By comparing our new approach to the standardised ethnic categories, we tested whether the story of violence and ethnic inequalities changed based on how we defined ethnic groups. The story, we found, can change quite a lot.

We found that:

South Asian respondents reported greater fear of violence than East, South East, and Central Asian respondents (otherwise conflated as ‘Asian’).
‘Mixed’ ethnicities had significantly higher rates of victimisation than singular/non-mixed ethnicities.
Latinx/Hispanic (currently excluded as a category) and Arab/MENA respondents had higher victimisation and fear than the ‘White’ or ‘Other’ categories within which they are otherwise subsumed.

If the groups most affected by violence are hidden within broad or misrepresentative categories, they may also be missed in terms of how we respond to violence in policy and practice.
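To make the mechanics of this masking concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records are invented toy data (not CSEW figures), the labels "Subgroup X", "Broad A" and so on are placeholders rather than real ethnic categories, and `rate_by` is a hypothetical helper, not the method used in our paper. It simply shows how two subgroups with very different rates can average out to look identical to another group once they are merged into one broad category.

```python
from collections import defaultdict

# Invented toy records: (detailed_group, broad_group, reported_victimisation).
# Placeholder labels and made-up values, used only to show the arithmetic.
TOY_RESPONSES = (
    [("Subgroup X", "Broad A", True)] * 3
    + [("Subgroup X", "Broad A", False)] * 1
    + [("Subgroup Y", "Broad A", True)] * 1
    + [("Subgroup Y", "Broad A", False)] * 3
    + [("Subgroup Z", "Broad B", True)] * 2
    + [("Subgroup Z", "Broad B", False)] * 2
)


def rate_by(records, key_index):
    """Return the share of respondents reporting victimisation per label.

    key_index selects which label to group by:
    0 for the detailed group, 1 for the broad group.
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [victimised, total]
    for record in records:
        label = record[key_index]
        counts[label][0] += int(record[2])
        counts[label][1] += 1
    return {label: hits / total for label, (hits, total) in counts.items()}


broad_rates = rate_by(TOY_RESPONSES, 1)
detailed_rates = rate_by(TOY_RESPONSES, 0)

print("Broad categories:   ", broad_rates)
print("Detailed categories:", detailed_rates)
```

In this toy example the two broad categories have the same rate, so a comparison at that level would suggest no inequality, while the detailed breakdown shows one subgroup with a rate three times that of another. Real survey analysis would also need survey weights and uncertainty estimates, which this sketch leaves out.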

Our recommendations

Violence reduction strategies, victim support services, policing practices, and resource allocation all rely on accurate and representative evidence.

When data obscures rather than reveals inequalities in violence, violence reduction efforts can end up reproducing rather than reducing these same inequalities.

We therefore recommend several key changes in how we categorise ethnic groups and capture ethnic inequalities:

Introduce ‘Latinx/Hispanic’ as a response option for ethnicity.
Define ‘Mixed’ as a dimension of ethnicity, rather than a cohesive ethnic group.
Distinguish between ‘South Asian’ and East, South East, and Central Asian groups.
Review the conflation of ‘Arab’ with ‘Other’ ethnicities.
Introduce ‘write-in’ options for ‘Other’ ethnic groups.
Enable greater flexibility, usability, and accessibility of ethnicity data.
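As a rough illustration of what treating ‘Mixed’ as a dimension rather than a category could look like in a dataset, the sketch below stores every background a respondent selects, plus an optional write-in, and derives ‘mixed’ from that record. The class name `EthnicityResponse`, its fields, and the example values are all hypothetical and chosen for illustration; this is not the schema used by the CSEW, the ONS, or any other data producer.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EthnicityResponse:
    """Hypothetical survey record; names are illustrative assumptions."""

    # Every response option the respondent selected (one or more).
    backgrounds: List[str] = field(default_factory=list)
    # Free-text self-description for identities not offered as options.
    write_in: Optional[str] = None

    @property
    def is_mixed(self) -> bool:
        """'Mixed' is derived from the record, not stored as its own group."""
        return len(set(self.labels())) > 1

    def labels(self) -> List[str]:
        """All identities the respondent reported, including any write-in."""
        if self.write_in:
            return self.backgrounds + [self.write_in]
        return list(self.backgrounds)


single = EthnicityResponse(backgrounds=["Arab/MENA"])
mixed = EthnicityResponse(backgrounds=["South Asian", "Latinx/Hispanic"])
written = EthnicityResponse(write_in="Self-described identity")

print(single.is_mixed, mixed.is_mixed, written.labels())
```

Stored this way, an analyst can still collapse responses into broad groups when a comparison requires it, but the detail needed to distinguish groups is not lost at the point of collection.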

How is our work leading to impact?

In translating these recommendations into real change, we engaged with data producers who are reshaping how they design and implement ethnicity measures to capture inequalities.

As well as consultation with the Office for National Statistics (ONS) on future Census categories and inclusion in evidence to Parliamentary select committees, our work is also set to impact how national-level data on UK media diversity and representation is captured.

Diamond Demographics, an industry-wide monitoring system that collects diversity data on UK TV and media content, working with the BBC, ITV, Channel 4, Sky and other leading broadcasters, were particularly interested in our findings.

When redesigning how they collect data on ethnic representation in UK media, they not only plan to integrate our recommendations, but through conversations with our team, are pushing these recommendations to improve the specificity and capturing of ‘Mixed’ identities even further.

Diamond Demographics offers a promising case study for how research that challenges entrenched data practices can help transform national-level data in ways that better capture the diverse experiences of the people it claims to represent.

How we measure ethnicity matters because once a category becomes the basis for comparison, it starts to shape the stories we tell about inequalities. A statistic is never just a statistic; it is a representation of social reality, filtered through a set of decisions about how to count and classify people.

Revealing and reducing inequalities requires challenging the ways we represent people through collaborations between both data users and providers. Because how people are counted in national data becomes who counts in shaping policy and practice for tackling inequalities in society.

You can read about this research, co-authored by Dr Manzur, Dr Blom, and Dr Barbosa, in our open-access paper.

Meet the author

Dr Hannah Manzur is a Research Fellow at the Violence and Society Centre and works with the VISION Consortium on Violence, Health, and Inequalities.

Her research focuses on violence victimisation at the intersection of gender, ethnicity, and migrant-status, and her doctoral research investigated inequalities in knowledge production on violence through the Crime Survey for England and Wales.

She previously worked for the European Parliament and European Women’s Lobby as a policy advisor and consultant on gendered violence and inequalities.

Data Impact blog