1. 21

  2. 27

    What a delightful example of everything wrong with modern publication practices:

    In some cases, I won’t really read the rest of the paper if I’ve already decided it’s getting The Big SR.

    Yeah, God forbid something with a rough intro but an interesting proof make it into the common body of literature!

    Focusing the paper on the mundane implementation details, rather than the ideas.

    It’s precisely the mundane details that tend to be important in the real world. Especially in computer science, where people are unmotivated to provide useful experimental data or source code to duplicate their work, these mundane details are the only chance a later reader has of reproducing or learning from their work.

    If by simplifying the problem just a little bit, you render your beautiful design unnecessary, it might be time to work on a different problem.

    Taken to its conclusion, this sort of haughty dismissal does kill off a lot of interesting work. Sure, there is a lot of research that seems to be an answer in search of a problem, but that’s common to a lot of early engineering fields. Hell, quaternions were practically useless until aerospace and, later, computer animation.

    ~

    Some of the complaints (bad grammar, ignoring related work, and being overly verbose) are valid, but goddamn does the author sound like one of the folks who kill papers unfairly in peer review.

    I understand that it’s thankless work, but really that’s a problem that should be solved with the system (perhaps by, I don’t know, removing the gatekeepers).

    1. 18

      Yeah, God forbid something with a rough intro but an interesting proof make it into the common body of literature!

      There are already far more papers published every year than anyone could ever hope to read even the intros. If you’re submitting something for publication, show some respect for your readers.

      It’s precisely the mundane details that tend to be important in the real world. Especially in computer science, where people are unmotivated to provide useful experimental data or source code to duplicate their work, these mundane details are the only chance a later reader has of reproducing or learning from their work.

      You should include all the details needed to reproduce… in the appendix. Not in the paper itself.

      Taken to its conclusion, this sort of haughty dismissal does kill off a lot of interesting work. Sure, there is a lot of research that seems to be an answer in search of a problem, but that’s common to a lot of early engineering fields. Hell, quaternions were practically useless until aerospace and, later, computer animation.

      If the research is purely theoretical, that’s OK, but say so. Coming up with a completely contrived “application” doesn’t help anyone.

      1. 7

        Regarding the mundane details, some of them are very much inconsequential. Was it a Pentium running at 300 MHz or 350 MHz? Who cares? If the results are meaningful, I should be able to reproduce them on my 400 MHz Pentium. There’s a lot of this in ’90s-era papers: absurd detail in describing the hardware, but only pseudocode for the actual code.

        1. 7

          Counter argument: was it a 486SX or 486DX? Was it a Pentium III or Pentium 4? Those are meaningful differences.

          If they do not list their exact experimental setup, they are not doing reproducible science.

          1. 7

            I like the information in principle, but in practice, at least for papers intended to have a non-ephemeral lifespan, I rarely find it usable for reproduction. When I run into a paper that goes on for paragraphs about the finer points of their Sun UltraSPARC II cluster, what am I supposed to do with that information in 2016? Start buying parts on eBay to replicate their results? I do think they should at least briefly report what they ran the experiment on, but it would be much more useful to structure the experiment so that it isn’t tied to the details of a specific hardware platform, at least where the tying isn’t inherent to the problem (certain kinds of low-level OS research can’t avoid it).

            That’s been one side benefit of the move to the cloud. Lots of researchers now run their experiments there, and because you can’t really trust that the exact performance characteristics are reproducible, people try not to rely on them when possible. For example, in a search algorithm or neural-net training, count some hardware-independent measure of work (node expansions, training iterations, etc.) rather than wall-clock time; those numbers carry over between machines, and the results don’t end up depending on things like how loaded the cloud was when you ran the experiments.
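            The idea of counting algorithmic work instead of wall-clock time can be sketched in a few lines. This is a minimal, hypothetical example (the graph, names, and the BFS setting are mine, not from any paper discussed here): a search that reports node expansions, a cost metric that is the same on any machine.

            ```go
            package main

            import "fmt"

            // bfs reports whether goal is reachable from start, together with the
            // number of node expansions. The expansion count is a property of the
            // algorithm and the input, so it is identical on any machine, whereas
            // wall-clock time depends on the hardware and its current load.
            func bfs(graph map[string][]string, start, goal string) (bool, int) {
            	frontier := []string{start}
            	visited := map[string]bool{start: true}
            	expansions := 0
            	for len(frontier) > 0 {
            		node := frontier[0]
            		frontier = frontier[1:]
            		expansions++ // count work in algorithmic steps, not seconds
            		if node == goal {
            			return true, expansions
            		}
            		for _, next := range graph[node] {
            			if !visited[next] {
            				visited[next] = true
            				frontier = append(frontier, next)
            			}
            		}
            	}
            	return false, expansions
            }

            func main() {
            	graph := map[string][]string{"a": {"b", "c"}, "b": {"d"}, "c": {"d"}}
            	found, cost := bfs(graph, "a", "d")
            	fmt.Println(found, cost) // prints "true 4"
            }
            ```

            Reporting `cost` alongside (or instead of) timing makes the experiment comparable across a 486 and a loaded cloud VM alike.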

            1. 4

              I mostly disagree: it’s the time/space complexity that usually matters. A linear-time algorithm scales linearly on a 486SX or a 486DX. I think the primary exception is when you want to show a new optimization that requires particular instructions, but the vast majority of work that lists exact machine specifications is not in that area.

              From practical experience: if you leave the exact details out, another reviewer will complain that you should specify the machine exactly ;).

              1. 3

                A linear-time algorithm scales linearly on a 486SX or 486DX

                If you’re only interested in asymptotic complexity it’s not appropriate to put any kind of benchmark numbers into your paper at all. It is a fine and lovely thing to publish papers about asymptotic complexity.

                For papers where you want to present a technique that you think makes an improvement on running time on real-world hardware, I’m going to want to know roughly what the hardware that you used looks like.

                It’ll matter for applicability. Say I read a 10-year-old paper about how someone made a program go 40% faster by avoiding some syscalls. It’s easier to judge whether that’s applicable to the hardware on my desk right now if I know whether the benchmark was performed on a Pentium 4 (which had the slowest syscall implementation of any CPU I’ve ever heard of) or on an Itanium (which apparently had unusually fast system calls).

                1. 2

                  I think the primary exception is you want to show a new optimization that requires particular instructions.

                  So, the problem is that in many cases you can run into implementation factors that occur even without seeking particular instructions.

                  My example of the Pentium III and P4 is exactly such a case: one could present a hashing or crypto algorithm with better theoretical performance, and then during testing on a P4 find that it fails to be much faster than the previous generation’s work on the subject (which happened to use PIIIs). The reason, in this example, is the hardware barrel shifter that was on the Pentium III but not the P4; for shift-heavy workloads like crypto, you’d expect worse performance on the P4. However, you’d only know about this if the authors of both papers described their experimental setups.

                  There are similar issues with graphics cards, wonky hard drive controllers, and all sorts of things. On the whiteboard, sure, every algorithm is obvious, but you don’t have engineering happening until you care about implementation.

                2. 2

                  Or not? I’m not running any of those processors today. So should I just throw the paper out? If the science is useful, I should be able to reproduce it on my computer.

              2. 6

                I understand that it’s thankless work, but really that’s a problem that should be solved with the system (perhaps by, I don’t know, removing the gatekeepers).

                What gatekeepers? If you’re getting a paper peer reviewed, then you’ve already agreed to go through gatekeepers by definition. But there is nothing stopping you from publishing your work without going through a peer-review system, and plenty of places/people do that (unfortunately, they are usually cranks).

                I’m not actually sure what world you want to live in. Anywhere you submit to others’ views and opinions there is going to be a bunch of bullshit, and sure, that needs to be worked on, but I’m not sure what you want done. Removing fictional gatekeepers doesn’t make research better.

                1. 5

                  Let me rant a little about the system, because I’m afraid it will take a while to fix:

                  The scientific system is truly broken nowadays.

                  I believe some massive intervention is needed. Yes, you have the entitled reviewers, like OP, but you also have the circles of friends who don’t review at all.

                  But peer review is just the tip of the iceberg: the number of shitty papers is overwhelming, but everyone needs to publish something to justify the research funds received, or to get more grants. And the best way to get grant money is to invent some glorious BS and publish four content-free pages in a high-impact-factor journal. And fuck statistics and the scientific method; if you want to do research you need to publish more, so who cares about reproducing results. There’s not enough time anyway. And consider that there’s no international organization that really matters for judging misconduct, and fines and punishments are ludicrous. And if the authors want a paper retracted because the professor changed it and lied, it takes something like four years. And whistleblowers will never get hired in academia again. And the science is done by postdocs and grad students who work 70-hour weeks for $35k a year, with a good chance of being blackmailed over their visas if they don’t produce enough.

                  Sorry for the salt, but fuck science, guys. Until we manage to decouple it from money and fame, it’s only going to get worse.

                  1. 3

                    I understand that it’s thankless work, but really that’s a problem that should be solved with the system (perhaps by, I don’t know, removing the gatekeepers).

                    When reading through this I just kept thinking, “Why isn’t this on Stack Exchange? It would be similar to codereview.stackexchange.com.” Git would also be a reasonable way to handle this: the general public could submit merge requests for spelling and grammar, the real reviewers could comment on bad lines or submit merge requests with improvements, and you get a history/audit trail for free. If you use GitHub you can even do the whole thing from start to finish online without running anything locally.

                    1. 10

                      You can do all of that before submitting the paper. Your paper should be proofread before submission, not after.

                      1. 2

                        Well, it could be like an issue on a kanban board, progressing from multiple drafts to a public request for comments, real peer review, and finally publishing. Updates after that could even be issues.

                        1. 2

                          But that would require interest, cooperation, openness and most of all, agility. It is going to happen, but not sooner than in 20 years.

                          Right now, scientists are having a hard time establishing repositories for sharing raw data. Apparently, it’s quite hard when you are competing. Sharing, that is.

                          And, in some cases, the data are outright dangerous to publish directly. Czech linguists apparently have recordings of some elderly Roma talking about the time they murdered someone. Imagine publishing that while your colleagues work on their integration…

                      2. 5

                        It’s pretty common in some areas to first post a preprint on arXiv to solicit feedback. People are usually not interested in proofreading your paper, but if it’s in their area they may have questions about the work, requests for clarifications, complaints about its treatment of existing work, etc. In physics that kind of pre-peer-review has become close to standard, and papers only get submitted to journals further down the line, once they percolate through the arXiv a bit first.

                        1. 2

                          I wish it were possible. Maybe internally to a group, but globally it will not work without major reforms, because of money, fame, pride, distribution, different audiences, and much more: you need a precise author list if you want the money. You need to write in English if you want to reach out. Problems are usually subtle compared to common code samples, and they don’t necessarily have a beginning or an end.

                          Also, consider this: if science were really scientific, you would not have to use literature to describe your work; formulas and data would suffice (like in code). Sadly, very few fields actually let you do that, and those are usually very sparsely populated.

                          1. 4

                            formulas and data would suffice. (Like in code).

                            The comparison here would be code without comments or documentation. The formulas and data might be the important bit, but the words are also valuable so that the other people don’t have to reverse-engineer what the heck it is you’re on about.

                            1. 3

                              Sure. But technically, shouldn’t a good paper follow a very strict format, like a form or a code page? Or be a simple pull request against the state of the art?

                              Let’s say, for the sake of discussion, that I’m a young researcher and I have built a solar cell with +x efficiency versus the state of the art.

                              There’s not much more to it; I might have changed the materials, the preparation, or whatever. As a scientist I feel obliged to share my work. Ideally I should publish just an update to the state of the art: fill in a form/diff or something like a pull request.

                              Instead, first I have to spend a month wrapping it in crap. Then it might be rejected because the reviewer is too busy to unwrap it or doesn’t like my oratory skills.

                              I really don’t understand why we bother with artistic quality rather than rejecting only over novelty and plagiarism.

                              IMHO the ideal scientific journal would be somewhere between a git repo and a wiki. We could all be much more productive without the time wasted writing ~~poetry~~ papers.

                              1. 4

                                Writing is an important step in clarifying your own thoughts, and in subsequently communicating them to others. I emphatically disagree that time spent writing in prose of reasonable quality is a waste, or just some kind of peripheral artistic indulgence.

                                We’re not machines, but human beings! Literature is one of the tools we have built as part of the human process of science, engineering, and other professional fields: communicating with all the other humans.

                                1. 1

                                  One of my former colleagues was a very good writer; he had a lot of good publications. Of absolutely no real content whatsoever.

                                  I obviously enjoy reading great papers too, but you must admit that nowadays too often the data/content is subordinate to the prose.

                        2. -1

                          Do you think this is a serious post? I first read it as some kind of self-deprecating satire, pretending to be tone-deaf in its declaration of self-superiority. I mean, who would write like this in earnest? But one never knows…

                        3. 6

                          Honestly? Everything here seems completely reasonable to me. The simple truth of the matter is that there’s a lot more crankery than science and a lot more bad science than good science in the world. If you have something interesting and valuable to say, it’s worth taking the time to say it clearly and well.

                          1. 2

                            I found Simon PJ’s presentation on writing great research papers really nice:

                            http://research.microsoft.com/en-us/um/people/simonpj/papers/giving-a-talk/giving-a-talk.htm

                            It approaches this from a positive angle and is full of good advice.