recomputation

Linux Distribution for Recomputable Experiments

I had a great time yesterday at the Workshop for Research Software Engineers put on by the Sustainable Software Institute and hosted at the Oxford e-Research Centre. While there, I had an interesting conversation with Ian Gent, writer of “The Recomputation Manifesto”. You should head over to the site and read some of Ian’s articles, particularly this one, as he’s put quite a lot of thought into the topic. The gist of the idea is as follows (apologies Ian for any misunderstandings, you should really go to his site after this one): 1) computational experiments are only valuable if they can be verified and validated, 2) in theory, it should be fairly easy to make computational science experiments, particularly small-scale computer science experiments, perfectly repeatable for all time, 3) in practice this is never/rarely done and reproduction/verification is really hard, 4) the best or possibly only way to accomplish this goal is to make sure that the entire environment is reproducible by packaging it in a virtual machine. I have a few thoughts of my own on how we can better accomplish these goals after the break. The implementation of these ideas should be fairly simple and basically add up to extending or developing a few system utilities and systematically archiving distributions and updates on a service like figshare, but I believe that the benefits to the cause of improving the reproducibility of computational experiments would be enormous.