Julia Kasmire has recently joined the UK Data Service to develop a training programme on New Forms of Data. But what exactly does that entail?
As a new addition to the UK Data Service, I would like to start by introducing myself. My name is Julia Kasmire and I am the Research Fellow leading the UK Data Service’s new training programme on New Forms of Data.
“But what is ‘New Forms of Data’?!?” I hear you cry.
You are right to do so.
New Forms of Data is the cool new way to talk about Big Data. There is every reason to hope that New Forms of Data will be a more inclusive concept (size is not everything, man!) but only time will tell.
What’s that? You want to know a bit more about me?
Sure. I have studied linguistics, the evolution of language and cognition, and the evolution of complex adaptive systems in the context of industrial transitions to sustainability. This means I have degrees in social sciences, interdisciplinary sciences, and engineering.
As a result, I tend to skip around between topics, get excited about shiny new interdisciplinary things, and confuse people by assuming that everyone is an academic jack-of-all-trades.
My plans for the New Forms of Data training team at the UK Data Service include a lot of work in various media on shiny, new, computation-intensive data methods. Through this, I am hoping to show social science researchers how to use data in a two-pronged attack. One prong is about insight and the other prong is about validation.
Let’s talk about insight.
I find that data (big data, tedious data, real-time data, all that jazz) may be the best way to gain insights about rapidly shifting, modern topics. These topics can be hard to approach with traditional methods, especially when the results are most useful with a tight turnaround.
For example, consider how the day-to-day movement of people changes in response to news stories about flooding, terror risks, contamination threats, toilet paper shortages, etc.
It would be incredibly useful to understand how people change (or don’t change) their behaviour and patterns of movement in these cases, especially for those creating policy around responsible reporting or attempting to provide useful safety messages.
However, such news stories often arise without prior notice, and the apparently psychic reaction time of some people makes traditional research methods impractical.
As an alternative, social scientists could use sources of movement data that are already being collected. Doing so requires knowing what data is available and how to acquire, prepare, join, subset, analyse and otherwise work with potentially vast quantities of arguably baffling data.
So, the first prong of my New Forms of Data training plans is about helping you gain the knowledge and skills to work with vast and baffling data.
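As a toy illustration of the "join" and "subset" steps mentioned above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the footfall counts, area labels and alert flags are invented, and the helper names `join_alerts` and `mean_count` are hypothetical. Real movement data would be far larger and messier.

```python
from statistics import mean

# Invented daily footfall counts per area (illustrative only, not real data).
footfall = [
    {"area": "A", "day": 1, "count": 120},
    {"area": "A", "day": 2, "count": 80},
    {"area": "B", "day": 1, "count": 200},
    {"area": "B", "day": 2, "count": 210},
]

# Invented (area, day) pairs on which a flood story was in the news.
alerts = {("A", 2)}


def join_alerts(rows, alert_keys):
    """Join step: attach an 'alert' flag to each footfall record."""
    return [
        {**row, "alert": (row["area"], row["day"]) in alert_keys}
        for row in rows
    ]


def mean_count(rows, alert):
    """Subset step: average footfall over records with the given flag."""
    subset = [row["count"] for row in rows if row["alert"] == alert]
    return mean(subset) if subset else None


joined = join_alerts(footfall, alerts)
print("mean footfall with alert:", mean_count(joined, True))
print("mean footfall without alert:", mean_count(joined, False))
```

A comparison this crude says nothing about cause and effect, but it shows the shape of the work: link one data source to another, pick out the records of interest, then summarise.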
What about validation? Always a tricky topic, made even trickier when your conclusions come from methods that are hard to inspect, replicate or see in action.
Fortunately, computational social science and data research methods can be shared in ways that allow other researchers to see your processes step by step, fully understand your method and replicate your results. Such validation works well in conjunction with more traditional social science methods, giving multiple, potentially similar or at least complementary, results from the different methods.
Not only does this validate the results of both methods, but it can reveal insights and issues that are not apparent when only one method is used.
What might training look like?
As for how these two prongs will be prung (is that a word?), I plan to run some familiar online training resources, including webinar series, online case studies and interactive learning modules in R Markdown or Jupyter notebooks.
I also plan on exploring some of the wacky new online tools that the youth use… Anyone interested in watching me live code on Twitch?
Ideally, the most popular topics will also be taught in face-to-face workshops, although this obviously depends on several factors that are uncertain at the time of writing.
When it becomes clear that it is safe to gather in groups again, we want to host learning workshops because most of us learn best when we have a chance to really roll up our sleeves, dig into a thorny problem, get a bit lost, ask for help, make some progress, make mistakes, decide some of those mistakes are actually pretty useful anyway, and then chat to people about what we have learnt. In the meantime, we hope to use the vast array of online communication, teaching and learning options to provide both prongs in a non-infectious environment!
Julia approaches this task as an interesting combination of thinking like a computer (essential for data sciences) and thinking like a human (essential for social sciences) in the context of complex adaptive systems. She is deeply committed to equality, diversity and inclusivity and is currently dabbling with stand-up comedy as a form of science communication.