One of the projects I worked on, CloVR, tried to help with the reproducibility problem by including the run for a pipeline in the output. When I worked on it, I don’t think we got to the point of automating reruns, but the information was there to do it if so desired.
But CloVR was for existing, standardized pipelines rather than exploration. For anyone who doesn’t know, bioinformatics is an incredibly interesting field and a lot of opportunities exist there.
I did some bioinformatics in college and would love to do some work in the industry. In fact, I follow a lot of bioinformatics projects online; I just haven’t figured out how to get into the field. Do you have any advice for starting?
In my experience, university professors often have projects that need skills that they and their students don’t have. I’ve had very good luck getting interesting contract work using a local university’s online classified ad system. My contracts were all for at least several weeks of work, but it’s probably possible to find smaller things if you want to work just one day a week and keep your day job.
I’ve thought about this stuff a lot over the years.
I talked at PyCon with Anthony Scopatz, the author of xonsh, a fancy shell that aims to seamlessly integrate bash and Python. One of the things he told me is that his true goal is to trick scientists into making reproducible analyses by giving them a tool that’s so useful that they can’t help but want to use it, while imbuing the tool with powerful history features similar to what’s described in the OP. He works in a different field than bioinformatics, but a lot of the problems translate, and it’s interesting to see a variety of people taking this approach to solve them.