“This paper shows that Gen outperforms state-of-the-art probabilistic programming systems, sometimes by multiple orders of magnitude”
One of my heuristics on these things is to read the abstract to see the problem, then look at the evaluation to see if the specifics in the middle are even worth reading. Sometimes the evaluation results are in the abstract itself, which is even more motivating. For this paper, I say you know your team is doing a good job when you can open by saying you beat the competition, sometimes by “orders of magnitude.” :)
It helps when that competition is some of your lab’s earlier work that’s really slow.
Oops, I missed that in my skim. Yeah, that usually borders on fraud in most situations. I’ll reserve judgement on this one since it’s not my area.
I think it’s still decent work. My advice with research in most communities: don’t go with the system in the paper; go with the system(s) all the papers want to show you they are better than.
I’d modify that to just look at them all, re-run the experiments, and go with whatever seems best. The value of your critique is a reminder to watch out for a type of bullshit comparison I rarely run into outside formal verification. Forgot about it. I’ll either watch out for it or reserve praise until I’ve checked who did the work.
Comparing all solutions is often very time-consuming. The heuristic is that the community already knows which solution works best. I’m not exaggerating: a proper evaluation of all the probabilistic programming systems out there is a serious engineering endeavour that could take on the order of months.
Many research results in computer science papers are still remarkably challenging to replicate.
An updated version presented at PLDI 2019 is also available. (Yes, it’s ACM, but it’s an open access article.)