Spotlight on… Reproducibility

Dr Julia Kasmire recently ran a reproducibility bootcamp through the NCRM. Here she reflects on the importance of reproducibility, the ground covered during the bootcamp, and the response of participants.

Reproducibility. Huh! What is it good for? Absolutely everything.

In only five short weeks of bootcamp, we covered a heck of a lot of reproducibility ground, ranging from high-level theory to practical recommendations. Many of the ideas were totally new to the participants, while others were familiar but were only now put into the context of reproducibility.

Something that struck me over the course of the bootcamp was that much of the material was incredibly useful for researchers, no matter their level of experience or whether they were working independently or in teams.

Thus, many of the topics we discussed, skills we explored and tools we examined were as applicable to early career researchers as they were to established academics and had the potential to improve the work of individual projects or collaborative efforts.

Importantly, I am not the only one to think this: participants reported that they could see how the new information might apply to a wide variety of their day-to-day tasks and how they might be able to develop their skills further over time.

Feedback from many of the participants also included the common lament that they had not yet been formally taught any of this and that opportunities to acquire this sort of practical knowledge and experience were limited. Even though so much of the material seemed useful, participants reported that some of it was completely novel to them prior to joining the bootcamp.

While I am always pleased to introduce people to helpful ideas, I do wish this information were more widely available and easily accessible, as there are few researchers out there who would not benefit from it in some way.

Even me!

I too was learning throughout the bootcamp as participants asked insightful questions, shared relevant experiences, and suggested new tools or approaches that I had not previously considered.

I honestly believe that researchers can only do their best when they are given the resources they need and the opportunities to explore them.

In this way, everyone can discover for themselves what works and have the time to reflect on how they can incorporate their new insights into their practice.

Transparency and curiosity

A huge part of reproducibility is about transparency; a piece of work cannot be reproduced if it is not clear:

how the work was done
why it was done in that way
and with what tools it was completed

However, such transparency can be deeply uncomfortable. Being open about the decisions made, steps taken, resources used, and contribution of collaborators enhances the reproducibility of the work but also opens it to being criticised.

As a result, transparency can feel like vulnerability.

Like all people, researchers tend to avoid vulnerability; they might do so out of a desire to achieve more, or because of a fear of failure, or even simply an enthusiasm to finish projects and move on to the next. It seems to me that the answer is curiosity. Researchers benefit from being curious about:

what others have done,
how others respond to their work, and
whether the way they have done something is the best way to do that.

When researchers are genuinely curious, that curiosity can open them up to a better understanding of the knowledge and practices of others, can drive them to seek out how their work is interpreted by others, and can motivate them to engage with how their work is incorporated into or contrasted with existing bodies of knowledge.

The value of curiosity obviously applies to the outputs of a researcher’s work, but it can also apply to the processes by which researchers work.

Curious researchers can apply their curiosity inwardly. For example, researchers would benefit from being curious and asking questions such as:

How do I capture my insights and ideas? How else might I capture them?
How do I document the decisions made over time on long term projects?
What is a fair distribution of work on collaborative projects and how are contributions tracked?
What tools are available to help me capture, document, or track the less visible parts of research work?

This curious reflection can also feel quite vulnerable because it entails being open to the idea that there are diverse ways to do things, that assumptions may not be accurate, and that current processes may not be equitable in the way opportunities, responsibilities or credit are shared across a team.

Reproducibility and pressure

We all feel pressured in modern research contexts: pressured to produce, pressured to portray confidence and authority, pressured to become an expert and gain respect, pressured to do it all faster and better than everyone else.

Unfortunately, such pressure might seem to allow no time for the meticulous, sustained and detailed effort that makes a project reproducible, nor does it seem compatible with the kind of curiosity and reflection that a transparent and reproducible project requires.

In fact, making research more reproducible can save time.

For example, if you make your research accessible then you can save time on emailing those who want a copy of your data or code, while if your work is transparent then you save time getting others up to speed on your methods and processes. And just think of all the time you save if work from others is equally accessible and transparent!

Likewise, curiosity and reflection can improve efficiency.

When you give yourself time to try out new processes, you may find that you can automate some of the things that you currently do manually, can improve the accuracy of things that you used to accept as unavoidable errors, or can simplify steps that you used to find difficult.

Equally, when researchers reflect properly on how they act and collaborate, they may find ways to make the work more equitable, which allows everyone to work to the best of their ability.

Although it was only five weeks, I hope that the reproducibility bootcamp provided time and space for participants to, however temporarily, reduce the pressure that was stopping them from engaging with reproducibility in a meaningful way.

What happened during the bootcamp?

We started off by exploring what reproducibility is and why scientific research currently finds itself in a crisis of reproducibility. After all, what is science when a key step of the scientific method (the one about allowing others to recreate your experiments) becomes impossible?

We also explored what factors and pressures may be driving the crisis of reproducibility and how various interested parties have so far attempted to address that crisis. Hint: making “soft” sciences more like biology and chemistry will not work since these too are caught up in the crisis of reproducibility.

Following that, we dove into collaboration and communication, with a focus on the implications for reproducibility.

While it may not seem obvious, how we communicate with each other is often not as clear or well-documented as we might think it is. That means that groups may be wasting time, working at cross-purposes, and failing to write down things that later turn out to be important for reproducibility.

We specifically explored the skills, tools and practices that will help collaborative research become more transparent, well-documented and reproducible so that researchers can collaborate with others (and with themselves in the future!).

We then discussed how to do the challenging work of documenting the mind, mental processes, workflows, and research habits. Although this is tricky, there are several useful ideas that can help researchers improve their skills in this area.

For example, exploring various tools to capture ideas (notebook by the bed, anyone?) seems obvious but is often overlooked in an environment where some people subconsciously believe that “I am clever and so will remember this brilliant idea without writing it down!”

Other overlooked skills include setting aside time to look through the captured ideas, reflecting critically on ideas to turn them from vague plans into concrete and scheduled steps organised by priority or urgency, and taking time to reflect on whether current work practices are effective or helpful. Documentation and organisation turn out to have very positive impacts on reproducibility as well as mental health, teamwork, and even career progression.

We discussed all things data in relation to reproducibility.

We explored esoteric theories about the shape of data; how many of you have ever sat down and really considered whether you should restructure your data differently for analysis or presentation or whether you should amplify the structure present within semi-unstructured data? Beyond the conceptual, we also discussed several ways to share data and the role of synthetic data to improve the reproducibility of projects with disclosure risks.

Finally, we discussed publication.

For traditional publication, this meant understanding how to use digital object identifiers and supplemental resources to improve reproducibility. For more contemporary publication approaches, we discussed interactive web documents and dynamic analysis tools. We even discussed the value of “far out there” ideas like academic stand-up comedy, live streaming your coding sessions, or interpretive dance.

Don’t worry; no one was forced to dance against their will.

Reflections

I am pleased that this five-week bootcamp was an opportunity, for me as well as the participants, to reflect on research practices and the reproducibility of those practices.

Further, by reflecting as a group, we gave ourselves permission to throttle back on the pressure, to learn from each other, to consider new ways to do things and to be curious.

Reproducibility is important and looks set to grow more important as society demands more transparency and evidence when making policy decisions and plans.

I hope to take forward what I have learned from running this five-week bootcamp. I want to run more such bootcamps, to run targeted training sessions that drill down into specific concepts, and to encourage researchers to take time to learn and explore relevant skills and tools.

I want to start a dialogue with the wider scientific community about the value of reproducibility, curiosity, and reflection.

I want to see researchers who are ok with being vulnerable and who are kind with the vulnerability of others.

About the author

Julia Kasmire researches and teaches on how to use new forms of data for social scientists with the UK Data Service and the Cathie Marsh Institute at the University of Manchester.

She approaches this task as an interesting combination of thinking like a computer (essential for data sciences) and thinking like a human (essential for social sciences) in the context of complex adaptive systems. She is deeply committed to equality, diversity and inclusivity and is currently dabbling with stand-up comedy as a form of science communication.


Data Impact blog