So, first things first, how would you define computational reproducibility?
Brian: “In general, computational reproducibility means you can run the same code with the same data, and easily get the same result. People often forget the ‘easily’ part! One of the goals I always aim for is to think about how I can make something as user friendly as possible. If you reduce the complexity, there is less chance of someone using the software incorrectly or in a way that’s unexpected.”
Kitty: “You could think of it like this: say you really liked a cake someone baked, and you wanted to bake the same cake. If they only told you the ingredients, that they mixed them, put them in a tin and baked them, you probably wouldn’t get the same cake, right? Whereas if they give you step-by-step instructions, tell you the brand and quantity of each ingredient, and whether the oven is fan assisted, then you’re much more likely to get the same cake. Computational reproducibility is the same kind of concept. When you publish a study, you want to share the data you used alongside it and describe the computational steps in enough detail that, if someone else were to use your data and follow the steps, they would get exactly the same results.”
Why did you decide to enter the competition last year?
B: “I had a preprint under review at the time and I’d just started my PhD at Imperial. The software I submitted for the competition was actually a project I had done at my previous job, where I was a bioinformatician at Icahn School of Medicine at Mount Sinai in New York City. I made this software to automate fine-mapping to identify causal variants within Genome-Wide Association Studies. So the timing made sense: I had already spent a lot of time making the pipeline reproducible and putting it into an R package. I saw the competition and figured it was a good opportunity to showcase my work. Also, I think it’s great that this competition values reproducibility. Everyone knows there is a reproducibility crisis in science, but people have different definitions, so it has become a bit of a buzzword. It’s helpful to have standards in place to make sure things are reproducible.”
K: “The timing made sense, as I’d just published a preprint where I had developed an R package. Also, I think that computational reproducibility should be standard practice for computational studies, so this competition was a great opportunity to promote that.”
What was the application process like?
B: “One of the appeals for me was that it didn’t require a lot of work, certainly less than submitting a grant or a paper! I thought that was great because we’re all busy people. It was nice to have a short and succinct application form that gave the judging panel what they needed without taking up too much time to prepare.”
K: “The process was really straightforward, and for anything I wasn’t sure about, I’d look at GitHub repositories of R packages that I liked and used to make sure I was including everything I should be.”