1. 7

    I usually find that “math” doesn’t prove real-world facts. In fact I’d go so far as to say that it doesn’t prove anything beyond itself. When trying to “rate” generals in terms of numbers (from a source the author has admitted is limited - notice that Genghis Khan is vastly underrepresented), all you can do is visualize, compare and analyze data. On the one hand, one has to consider that we neither have all the data to accurately describe reality, nor is all the data we have trustworthy: let’s not forget, history is written by the winners - and what is written has no necessity to be true. On the other hand, “just” looking at numbers cuts out so many other factors which could have been vital to the course of history: weather, strategy, technology, economic issues like food distribution and production, social factors like elections, revolutions, coups d’état, the memory of previous wars, or bad omens from oracles. War has over and over proved not to be just a conflict determined by manpower - otherwise it would be truly democratic. Cities in Europe capitulated before the Mongols just because of what they had heard the Mongols did to cities that didn’t, even though their manpower might have been significantly lower. That’s what I call a good PR campaign. Or how can one say that, when Alexander’s army decided not to go further than India, this is somehow to be blamed on Alexander the Great himself? What about the Russian generals that forced Napoleon into the Russian winter and thereby destroyed his armies? How is that non-battle, negative choice factored into the statistics?

    Ultimately, I see this kind of project as a futile attempt at ahistorical and decontextualized ratings of generals, as if it were some kind of video game (interestingly enough, this could serve as a foundation for a critique of a modern video-game mentality of understanding the world and history). If his intention were merely to rate generals based on this WAR mechanism, with limited data, then fine. But the inferred jump from this rating to an absolute standard of “goodness” in being a general is not. I might even agree that Napoleon, Hannibal and Alexander the Great were good generals, but this is not a statistical matter, and I don’t believe it can be reduced to an issue described via numbers.

    1. 6

      I don’t agree that you can’t quantify this, but I do agree that this article is fundamentally bogus. The thing that bugs me about these kinds of overblown claims is that they poison the well for deeper analysis, since they cause people to write off the idea of using data. IMO, the problem here isn’t the use of data, it’s the unreasonably strong conclusion drawn from very weak data. Ironically, you can even see this in baseball if you look at pre-“sabermetric” analysis of baseball. Many of the stats people used to judge baseball players simply weren’t very good (in that they had low predictive power in terms of both future player performance and team win probability), which gave the anti-stats camp ammunition they could use to dismiss the idea of deeper analysis.

      It appears that the author has taken all factors that aren’t sheer numbers and found “wins above replacement” (WAR) for all of those factors combined. Some things that make WAR work for baseball that seem difficult to apply to warfare include:

      1. In baseball, there’s a well-defined concept of a “win” and regular season wins are equivalent. This is certainly not true in warfare, where a single “win” can sometimes reverse the effect of many losses.
      2. In baseball, runs scored are independent enough that you can predict wins from runs scored “well enough”, and the effect of players is independent enough that you can predict run generation and prevention “well enough”. You can apply this type of analysis to sports where players’ actions aren’t even close to being independent, but it’s much harder to do so. It’s not clear what “hidden” or “intermediate” variables in warfare can stand in for the equivalent variables in baseball. A naive application using intermediate variables like casualties seems likely to yield almost no value (I could be convinced otherwise, but I’d need to see an argument, not the blind application of a formula). If this analysis were serious and not clickbait, I’d expect the discussion of these variables and the justification that these variables are predictive to take up most of the space.
      3. There’s a large sample size. We have tons of data about baseball (and other sports). While we know about “lots” of battles, we don’t really have good data for many of them. Even for battles that have happened in the past decade, casualty estimates can vary by an order of magnitude depending on who’s doing the estimate. Even if everything else were perfect, you’d have a hard time doing sabermetric analysis of baseball if your error bars on the number of runs scored varied by an order of magnitude. Also note that, when comparing across eras in any sport, you have to adjust for how the environment was different in different eras; different adjustments can give very different rankings of players. If any kind of era adjustment was done in this article, it’s not mentioned and the adjustment isn’t justified. For many eras, I suspect we simply don’t have the data to do a data-driven adjustment, so any adjustment would either involve original research or intuition applied to case studies.
      4. Replacement level is fixed. There’s a debatable but relatively well-defined meaning for replacement level in baseball, but no such thing exists in warfare. As we’ve seen in baseball analysis, moving the replacement-level target around can drastically affect findings of who’s “best”, and since we have a variable that gives the person doing the analysis total freedom, we need some justification for why replacement level is set where it is if we want to be convinced that the result isn’t just a post hoc justification. Note that this applies to all of the variables used in (2) and their weights, so in this case the person writing the article had many degrees of freedom that they could tweak to tune the results arbitrarily, and there’s no justification for any of the “knob settings” in the entire article.
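
      The replacement-level point in (4) is easy to demonstrate with a toy calculation. This is a purely hypothetical sketch - the players, numbers, and value model are all made up - but it shows how the same two players swap places in the rankings depending solely on where the analyst sets the replacement baseline:

      ```c
      #include <stdio.h>

      /* Value above replacement for a player: playing time multiplied by how
         far their per-game rate sits above an arbitrary baseline. */
      static double value_above_replacement(int games, double rate, double baseline) {
          return games * (rate - baseline);
      }

      int main(void) {
          /* A durable, slightly-above-average player vs. a part-time star. */
          int games_a = 160; double rate_a = 0.55;
          int games_b = 80;  double rate_b = 0.70;
          double baselines[] = {0.30, 0.50};

          for (int i = 0; i < 2; i++) {
              double a = value_above_replacement(games_a, rate_a, baselines[i]);
              double b = value_above_replacement(games_b, rate_b, baselines[i]);
              printf("baseline %.2f: A=%.1f B=%.1f -> %s looks \"best\"\n",
                     baselines[i], a, b, a > b ? "A" : "B");
          }
          return 0;
      }
      ```

      With the baseline at 0.30, player A comes out ahead (40 vs. 32); at 0.50, player B does (8 vs. 16). Nothing about the players changed, only the knob.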

      I don’t think there’s anything inherently wrong with running the numbers and seeing what comes out. But the author then claims that this somehow “proves” that some generals are mathematically superior to others. As you say, this is not correct.

      Using a methodology that can’t answer the question being asked and then feeding in bogus data is quite common, so I don’t mean to pick on this article in particular for that reason. The thing I find most unfortunate about this is that I’ve seen this article passed around a lot, and AFAICT it’s been passed around precisely because the author makes a lot of claims that they can’t back up – an article titled “a preliminary investigation of ‘wins above replacement’ applied to warfare” probably wouldn’t go viral, wouldn’t have gotten linked to on Marginal Revolution, etc. This article appears to have been widely read not in spite of being clickbait, but because it’s clickbait. It uses the cachet of sabermetrics, as if sabermetrics were about having some formula and a term called “wins above replacement”, rather than about figuring out what we can “know” from data and what the limits of our knowledge are. As an article about who the “best” general is, it’s pure cargo culting.

      1. 1

        I don’t think there’s anything inherently wrong with running the numbers and seeing what comes out. But the author then claims that this somehow “proves” that some generals are mathematically superior to others. As you say, this is not correct.

        Well, that was basically my point ^^ My introductory part about “math not being able to prove anything” (in itself, that is - it can be used as a tool to model or calculate, no doubt) was maybe more provocative than it should have been. But yeah, there’s no moral issue or any other issue with doing calculations, nor do I think that anyone believes that, just in case anyone misunderstood the intention of my comment.

    1. 7

      It’s amazing to me that this was only 16 years ago.

      For windows 2000, the original build system was… a single team that approved and manually merged all checkins? Checking anything in to this team required approval, sometimes on paper!? The team approved ~100 changes per day and manually issued the appropriate commands to grab source control and build. That’s for 1400 devs! When the build broke, build activity stopped until the build was fixed.

      The initial code checkout for a dev took 1 week. Just syncing changes took 2 hours.

      This was all terrible, so they bought the source code to perforce and created a fork of it. With the fork, they were able to get the initial checkout down to 3 hours, and syncing changes down to 5 minutes. That was considered to be AMAZINGLY GOOD. One of the main points of the talk appears to be to describe this great new system, where it only takes 3 hours to do the equivalent of git clone.

      I also like the bullet point, “Abuse and intimidation gets way out of control; can’t keep calling people stupid and expect them to listen”, which is nested under “Sloppiness is not tolerated”, next to “Great idea, but very difficult to nurture as group grows”. It’s so hard to nurture the right amount of calling people stupid.

      1. 6

        On the whole I agree with the article. I have worked for 3 startups and been burned 3 times. I’m much happier with the mega-corps now. That said, I keep seeing these types of articles talking about $250K packages for a senior programmer and how you can work from anywhere with this kind of salary.

        This has not been my experience, and the Bureau of Labor Statistics data seems more apt, based on what I have experienced:

        http://www.bls.gov/ooh/computer-and-information-technology/computer-programmers.htm

        I’m well above the BLS median, but not even half of $250K. Is this really achievable as “the norm” from anywhere in the USA, and if so where are all of these opportunities that I’m so obviously missing?

        How much of that $250K is actual salary, and how much of it is “value of benefits”? This is the other piece that is a head scratcher, since the $250K always includes “value of benefits”. I feel like it is a bit of sleight of hand that hides the lower actual salary.

        Here is the BLS breakdown by state, which is also interesting: http://www.bls.gov/oes/current/oes151131.htm#st The California median is around $89K, and Washington seems to have the highest median, still only $115K.

        1. 3

          Note that the BLS definition of “Computer programmer” appears to be very low level:

          “…They turn the program designs created by software developers and engineers into instructions that a computer can follow.”

          It’s likely that many of us here who would call ourselves programmers might fit into another statistical bucket for the BLS, like “software developers” - median $97k, or even “Computer and Information Research Scientists”, median $108k. Honestly, even those seem low and so I assume they’re wrapping together some jobs that I wouldn’t consider equivalent.

          In general it seems really hard to tell how much of these stories of high compensation to believe without just asking your peers, something I’m always reluctant to do. Certainly having some idea of what sort of field these offers are being made for is useful - Dan’s article was helpful in pointing out that people in “hot fields” get gobs more money.

          It’s also not always totally clear what being “senior” means. I think I had “senior” in my title once, but I think it means different things at different places. :)

          1. 1

            How much of that $250K is actual salary and then how much of it is “value of benefits”?

            Zero. A mediocre compensation package for a senior engineer today is $150k salary, $100k/yr of equity that’s not quite as good as cash (but pretty close) and bonuses.

            1. 7

              It’s certainly not anywhere near that in New York or Boston. Perhaps some SV outliers.

              1. 2

                That’s what people I know at Google make in Madison, WI. I’ve heard that numbers in places with a similar cost of living (like Austin, TX) are similar. Numbers are often much higher in SV, of course.

              2. 1

                Roughly how many years of industry experience does a senior engineer at Google correspond to? (I know that years of experience is a horridly imperfect metric, but it can be useful for HR-type stuff.)

                1. 1

                  A decade, give or take a few.

                  1. 1

                    Three or Four or more

              1. 7

                Functional programming… I’d lean towards Maybe on this one, although this is arguably a No. Functional languages are still quite niche, but functional programming ideas are now mainstream, at least for the HN/reddit/twitter crowd.

                Also Java, so I’d call it a yes.

                Model checking is omnipresent in chip design. Microsoft’s driver verification tool has probably had more impact than all chip design tools combined.

                I don’t know where the author is coming from here. It’s hard to imagine we would achieve the amazing hardware we have today without tools to harness complexity. What does he mean?

                1. 1

                  Sure, tools in general, but probably not for formal methods, which is what’s being discussed in that section. Sorry if that was unclear :).

                1. 1

                  How do you actually do this? The “tactics” presented are unobjectionable (e.g. have a clear mission, have people take responsibility), but they seem more like strategy than tactics. How do you actually make those things happen?

                  I know a few folks at twitter, and a lot of ex-twitter folks. This is, of course, anecdotal, but the experience of the people I know is highly variable. Some people love it. Some people hate it. A few are somewhere in between. It’s certainly not a place I think of as stable for employees; among folks I know, the attrition rate is quite high.

                  Contrast that to facebook, where a lot of my friends joke that they must be brainwashing people because almost everyone talks about how much they love it there and how great it is. The attrition rate seems incredibly low considering how many employees there never have to work again.

                  I don’t think they have different strategies. If you talk to folks at FB, they’ll tell you that it’s important that teams have clear missions and that people take responsibility. But some difference in tactics causes those strategies to be more effective at FB than at twitter.

                  1. 4

                    I suspect the answer is that it’s more complex than one might think at first :-).

                    This interview with Butler Lampson is quite long, but if you’re interested in capability based computing, it’s probably worth reading in its entirety. In the interview, Lampson describes why capability based computing is exciting, and why so many great researchers worked on capability based systems in the early days (Jim Gray, Charles Simonyi, and a number of other well known folks). The conclusion that Lampson and others came to is that, as an engineering trade off, building an OS around capabilities isn’t worth it. It’s not a fundamentally terrible idea, but no one has figured out how to do it simply enough that it’s worth all of the extra complexity.

                    And then the question is, are capabilities a good tool for organizing the system? And I think our conclusion there was they work, we made it work, and they have some advantages because you get uniformity. By contrast, the way in which security works in UNIX, for example, they have these file descriptors, those are capabilities, but they also have a bunch of more ad hoc things. And on the whole, I think that works better. Some of them are more complicated, but you can tune some of the ad hoc things that need to have high performance better, specific requirements. I don’t know really how you would figure this out. You’d have to do some systematic experiments, and they would have to on a fairly large scale; and no one has ever attempted that kind of thing. Basically, I think our judgment was that there were easier ways to get to the same effect.

                      1. 27

                        What kind of evidence do you want? Google is pretty careful about drilling “communicate with care” into employees from day one. There are lots of ways they do this, but the takeaway is that most legally sensitive issues end up being discussed one on one, in person if possible, and over video chat or phone if not possible.

                        Having worked at Google in the past, I can think of a number of pretty bad stories, and in every case where I heard about what HR did, HR was very careful to act in a way that mitigated the risk of legal liability as much as possible. It’s extraordinarily unlikely that there’s an email lying around that’s a smoking gun that says something like “Welp, director Y clearly harassed person X, now what?”.

                        On the other side of things, I know multiple people who have avoided talking publicly about their stories because of the severe damage to their career they were afraid would be caused by the backlash. There’s a lot of downside to speaking out about stuff like this and not much upside.

                        1. 14

                          At the very least, Rod could have denied that she poured a drink on him before hanging up the phone. I know I can remember everybody who’s poured a drink on me. Sounds like they caught him off guard for a sec before he remembered his training.

                          1. 6

                            What kind of evidence do you want?

                            Corroboration from any of the dozen+ people who were present for the incident(s).

                            1. 13

                              Some of whom are likely still employed under this person, where speaking out may affect their careers?

                            2. [Comment from banned user removed]

                              1. 24

                                And yet, on the other end of the thread, you say - with just as much evidence - that she thinks she is at war with 3.5 billion people.

                                1. [Comment from banned user removed]

                                  1. 35

                                    We should all be at war with the patriarchy. The term refers to the exclusion of women from positions of power. I feel pretty confident that the culture I live in would be better off for everyone if it were more meritocratic.

                                2. 16

                                  You must be kidding. You think claims of sexual harassment (something which is well known to happen, far more than is talked about) are comparable to claims of alien abduction or reptile people?

                              2. 21

                                Assume for a minute (or an hour, or a day) that it happened just the way she tells it. So, what do you do after the internal reporting process failed and you kept it quiet? Also, harassment cases are rarely the ones where a colleague whips out the camera and starts filming.

                                1. [Comment from banned user removed]

                                  1. 27

                                    “Smashing the patriarchy” means smashing a system that unfairly benefits men, not men themselves.

                                    1. 21

                                      I’m pretty sure she will have a harder time finding a new job than her colleagues after that.

                                  2. 15

                                    “Dang, my smoke alarm is going off. That’s a pretty serious allegation, and what if it’s a false alarm? Better stay in bed until I see some REAL evidence of a fire.”

                                    1. 1

                                      += 10

                                    2. 6

                                      While I wouldn’t dismiss these stories out of hand (after all, it is pretty hard to prove offhand remarks that have passed before you realize what’s happened), I agree with your stance that we should approach them with critical caution.

                                      While sexual harassment is a real problem, and no doubt exists in the technology industry (and pretty much everywhere else), there have been many cases (outside of tech) where women falsely accused employers, coworkers or acquaintances of sexual misbehaviour of various natures.

                                      History also teaches us that if you give a certain issue public attention (such as these kinds of acts have been enjoying in recent months), many stories, both true and false, will come out of the woodwork.

                                      I’m not saying this one in specific is a false story though.

                                      1. 5

                                        there have been many cases (outside of tech) where women falsely accused employers, coworkers or acquaintances of sexual misbehaviour of various natures.

                                        There have also been many cases where women truly accused employers, coworkers and acquaintances of sexual harassment.

                                        And there have been many cases where women didn’t accuse employers, coworkers and acquaintances who had sexually harassed them.

                                        One guess which of these totally blows away the others in terms of frequency.

                                        1. 5

                                          Absolutely. I’m just saying we shouldn’t take every accusation at face value, innocent until proven guilty etc.

                                    1. 5

                                      Does anybody have any advice on how to effectively become a better writer? Yes, you want to write as much as possible, that’s a given, but can you get meaningful feedback from someone who’s much better than you at it? Is there quality coaching of any sort that you can get?

                                      1. 2

                                        My advice here is pretty standard, not just for writing, but for any field, but I think it’s standard because it works. Get good feedback and practice incorporating it. Don’t try to fix everything at once; add skills one at a time.

                                        In my experience, it takes a while to find people who will give you good feedback for free. Most people don’t come up with substantive writing critiques, but if you ask for advice on a lot of drafts, you’ll eventually find a few people who come up with meaningful critiques and not just spelling and grammar corrections.

                                        And of course you don’t have to only get free advice. If you’re a programmer, you can certainly afford to hire a professional editor to help you. I don’t want to post what’s basically an advertisement for an editor on lobsters, but if you’re interested in this, contact me privately (see my profile) and I can recommend a good editor.

                                        My process is to get good feedback about something (often a blog post), incorporate the feedback in the current thing, and then try to write something from scratch that incorporates one specific thing from that feedback. I then get feedback on that second thing to see if I was able to improve that one thing. When I’m able to consistently write something that doesn’t get the same critical feedback, I move on to another problem in my writing and try to eliminate it.

                                        I don’t claim that my writing is good, but when I look at how my writing has changed over the past year and a half, it’s certainly improved a lot. For a given level of effort (30 second email, 15 minute blog post, 3 hour blog post done in 3 sittings, etc.), my writing is a lot better. I’d say that my dashed off 15 minute blog posts are as “good” as my serious 2-4 revision blog posts were a year and a half ago, when I started this blogging/writing improvement experiment. Considering the time investment (a few hours a month), I’d say that improving my writing using this method is one of the highest ROI things I’ve done lately.

                                        1. 1

                                          Does anybody have any advice on how to effectively become a better writer?

                                          Yes.

                                          Yes, you want to write as much as possible, that’s a given,

                                          That is a given of course. Read too. Read as much as you can on as much as you can. It really is a shame that our modern lives are so strongly aligned towards any other form of recreation than reading, given reading’s relatively lengthy time requirements to get much out of it. Beware the shysters who claim they can “speed read” at some exorbitant number of words per minute. The act of reading requires more than merely pushing and popping groupings of words into and out of your brain’s mental queue as fast as possible.

                                          but can you get meaningful feedback from someone who’s much better than you at it?

                                          Absolutely. Editors exist for more reasons than merely finding typos and other grammatical errors.

                                          Is there quality coaching of any sort that you can get?

                                          There are several communities of critics, readers/writers, reviewers, editors, teachers and so on. In my experience, their eagerness to help is typically proportional to your own reciprocity in helping them in turn. That being said, finding an audience can still be difficult, given our desire to become ever more efficient schedulers of our limited (and in some cases decreasing) free time.

                                          1. 1

                                            I agree with the posted essay. Writing can bring great clarity and you can grep it.

                                            A few observations from good essays / blogs

                                            1. Put your personality into it.
                                            2. Give a strong purpose to each essay / blog.

                                            I consider zenhabbits.net to be a good representative of the points above.
                                            Words like “good” and “better” are, however, subjective.

                                            Hope that helps :)

                                            1. 1

                                              (link with typo correction: http://zenhabits.net)

                                          1. 2

                                            Minor point: calloc is required to check for overflow. The standard was clarified such that if it’s not possible for all indices up to nmemb to be valid, calloc must return null.
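
                                             For anyone who wants to see the difference, here’s a quick sketch. The exact failure mode of the naive multiplication depends on the platform’s SIZE_MAX, but any conforming calloc must return NULL when the element count times the element size overflows:

                                             ```c
                                             #include <stdint.h>
                                             #include <stdio.h>
                                             #include <stdlib.h>

                                             int main(void) {
                                                 /* Pick nmemb so that nmemb * size wraps around size_t:
                                                    (SIZE_MAX / 2 + 2) * 2 == 2 (mod SIZE_MAX + 1). */
                                                 size_t nmemb = SIZE_MAX / 2 + 2;
                                                 size_t size = 2;

                                                 /* calloc must detect the overflow and return NULL... */
                                                 void *p = calloc(nmemb, size);
                                                 printf("calloc: %s\n",
                                                        p == NULL ? "NULL (overflow detected)"
                                                                  : "non-NULL (non-conforming)");
                                                 free(p);

                                                 /* ...whereas the naive malloc(nmemb * size) would compute a
                                                    wrapped, tiny request and may happily "succeed". */
                                                 printf("naive product wraps to: %zu\n", nmemb * size); /* prints 2 */
                                                 return 0;
                                             }
                                             ```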

                                            1. 1

                                              Thanks for the correction! It’s really appreciated.

                                              I credited you as “tedu” in the acknowledgments. Let me know if you’d prefer to not be mentioned, or be mentioned with a different name/alias.

                                            1. 3

                                               The Rust Guide issue happens when you have certain web fonts installed; they get corrupted and not re-downloaded for some reason :/

                                              1. 2

                                                Thank you so much! Deleting all of my existing web fonts fixed the problem. I wonder why Chrome can’t/doesn’t checksum fonts, and why the problem didn’t affect Firefox.

                                                1. 3

                                                  Wooo!

                                                  Yeah, totally unsure. It’s a bummer.

                                              1. 2

                                                http://danluu.com/ Hardware, hardware/software co-design, and low-level software nonsense, plus whatever I’ve learned lately.

                                                 I’ve been experimenting with blogging by setting aside 30-90 minutes, writing a post in one sitting, and calling it done. That’s resulted in 5 posts in the past week (as opposed to my usual 1 post per month). I’m super interested in negative/critical feedback, since it’s hard for me to tell if I’m writing stuff that’s substantially less readable by not editing seriously.

                                                1. 5

                                                   I’m trying to find out whether there are good cache simulators for things that aren’t CPUs. I spent a chunk of last week experimenting with CPU cache eviction policies, and I’d like to be able to see how the results generalize to other domains without having to actually convince someone to switch production machines over.
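
                                                   The core of a trace-driven simulator is small, which is part of why it’s fun to experiment with. Here’s a toy sketch where everything is made up for illustration: a fully associative cache, LRU eviction, and a synthetic trace:

                                                   ```c
                                                   #include <stdio.h>
                                                   #include <string.h>

                                                   #define WAYS 4

                                                   static int cache[WAYS]; /* stored keys, -1 = empty */
                                                   static int age[WAYS];   /* logical timestamp of last use */

                                                   /* Returns 1 on a hit, 0 on a miss (evicting the LRU entry). */
                                                   static int cache_access(int key, int now) {
                                                       int victim = 0;
                                                       for (int i = 0; i < WAYS; i++) {
                                                           if (cache[i] == key) { age[i] = now; return 1; } /* hit */
                                                           if (age[i] < age[victim]) victim = i; /* track LRU slot */
                                                       }
                                                       cache[victim] = key; /* miss: evict least recently used */
                                                       age[victim] = now;
                                                       return 0;
                                                   }

                                                   int main(void) {
                                                       memset(cache, -1, sizeof cache);
                                                       memset(age, 0, sizeof age);
                                                       int trace[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3};
                                                       int n = sizeof trace / sizeof trace[0], hits = 0;
                                                       for (int i = 0; i < n; i++)
                                                           hits += cache_access(trace[i], i + 1);
                                                       printf("hits: %d/%d\n", hits, n); /* 4/10 for this trace */
                                                       return 0;
                                                   }
                                                   ```

                                                   Swapping in a different eviction policy only means changing how `victim` is chosen, which is what makes this structure convenient for comparing policies on traces from non-CPU domains too.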

                                                  I also want to get back to debugging a couple bugs found by a fuzzer I wrote. One has been unreproducible (a base library function randomly segfaults, but replaying the fuzzer with the same seed doesn’t hit the segfault), and I haven’t been able to get a test case less than about 5 minutes long for another (an exception is randomly not caught by a try/catch block that should catch all exceptions, but modifying the test case at all makes the bug go away), which is too long for me to really want to dig into the compiler to figure out what’s wrong.

                                                  I probably need to write some kind of test case reducer to really attack those bugs.

                                                  1. 4

                                                     I find cache eviction policies really interesting. I find it interesting how they’re somewhat like generational GC, and I wonder whether some of the algorithms from caches can be applied to GC (thinking of SRRIP and DRRIP). I guess the only difference is that a cache can evict something it will use again; that’s not an option with GC.

                                                    1. 3

                                                      Your eviction policy post was interesting! I’m not much of an expert here, but the first things I thought of were evaluations of disk buffer caches and network processor ASICs.

                                                      I don’t have any actual direct recommendations, but all the things I found in my quick googling just now were posted in SIGMETRICS, so that’s where I’d start.

                                                       There’s also multiprocessor cache coherence directories, although replacement policies there get complicated. I liked this paper about SCD (Scalable Coherence Directory) at HPCA ‘12 from Daniel Sanchez and Christos Kozyrakis - it has some interesting info about how the design constraints of coherence directories impact replacement, noting that their preferred cache design ends up selecting replacement candidates in a de facto random fashion. I haven’t thought about coherence directories in a few years, and that paper was pretty easy to follow - they did a nice job on the background (section 2).

                                                    1. 12

                                                      The author has an interesting point but deliberately biases the whole discussion by defining type systems such as Hindley-Milner as “human friendly” and traditional ones as “compiler friendly”, then circularly arguing that the ones he defines as human friendly are more friendly to humans.

                                                      If you’re going to employ biased language to make an argument you could just as easily describe Hindley-Milner as “mathematician friendly” and C-style ones as “programmer friendly”. But that would get us nowhere.

                                                      What I’d love to see is an empirical study of the effect that type systems have on programmer productivity, software reliability and code maintenance. Does anyone know if there are any studies out there like that?

                                                      1. 8

                                                        Yes, it turns out there are a lot of studies on this. After hearing some particularly strong claims about this sort of thing recently, I did a literature review and wrote up notes on each study, just for my own benefit.

                                                        The notes are way too long to post as a comment here (the notes are over 6000 words, i.e., longer than the average Steve Yegge blog post), and they’re actually pretty boring. Let me write a short summary. If anyone’s interested in the full results, I can clean them up and turn them into a blog post.

                                                        1. Most older controlled studies have a problem in their experimental methodology. Also, they often cover things most people probably don’t care about much anymore, e.g., is ANSI C safer than K&R C?

                                                        2. Some newer controlled studies are ok, but because of their setup (they compare exactly one thing to exactly one other thing) and the general lack of similar but not identical studies, it’s hard to tell how generalizable the results are. Also, the effect size these studies find is almost always very small, much smaller than the difference in productivity between programmers (who are usually selected from a population that you’d expect to have substantially smaller variance in productivity than the population of all programmers).

                                                        3. People rarely study what PL enthusiasts would consider “modern” languages, like Scala or Haskell, let alone languages like Idris. I don’t like that, but it’s understandable, since it’s probably easier to get funding to study languages that more people use.

                                                        4. There are a number of studies that mine data from open source repositories. They come up with some correlations, but none of them do anything to try to determine causation (e.g., instrumental variable analysis). Many of the correlations are quite odd, and suggest there are interesting confounding variables to discover, but people haven’t really dug into those.

                                                        For example, one study found the following in terms of safety/reliability of languages: (Perl, Ruby) > (Erlang, Java) > (Python, PHP, C), where > indicates more reliable and membership in a tuple indicates approximately equal reliability. Those weren’t the only languages studied; those particular results just jumped out at me. The authors of the study then proceeded to aggregate data across languages to determine the reliability of statically typed v. dynamically typed languages, weakly typed v. strongly typed, and so on and so forth.

                                                        That analysis would be interesting if the underlying data was sound, but it doesn’t pass the sniff test for me. Why are languages that are relatively similar (Perl/Ruby/Python) on opposite ends of the scale, whereas some languages that are very different end up with similar results? The answer to that is probably a sociological question that the authors of the study didn’t seem interested in.

                                                        This seems to echo the results of the controlled studies, in that the effect a language has on reliability must be relatively small for it to get swamped by whatever confounding variables exist.

                                                        5. There are lots of case studies, but they’re naturally uncontrolled and tend to be qualitative. Considering how little the other studies say, the case studies make for some of the most interesting reading, but they’re certainly not going to settle the debate.

                                                        1. 3

                                                          Anecdotally, SML’s type system is not particularly friendly to programmers (it’ll beat you over the head over and over again until you satisfy its exacting requirements) but extremely supportive of correctness; Java’s type system isn’t especially friendly to anyone due to its inflexibility, inexpressiveness, and extraordinary amount of required boilerplate; and Python’s type system (or lack thereof) is incredibly friendly to programmers (“sure, you can do that! Why not! Treat this int as a dict, I don’t care!”) but allows all sorts of bugs ranging from the blatantly obvious to the deviously subtle to make it into running code.

                                                          I’ve seen a couple of studies of programmer productivity in various languages/environments (not, IIRC, looking specifically at type systems), but none well-founded or conclusive enough for me to remember or recommend them.

                                                          1. 5

                                                            For me, it’s the lack of discussion that makes me unlikely to comment. When I look at my old comments here, close to half of them are the only comment on the article. Conversely, I literally can’t remember the last time I made a comment on HN that didn’t get a response. I’d much rather have a discussion than make a comment and get no reply.

                                                            Worse yet, the more substantive the comment, the less likely that other comments will show up. This is no different from HN or reddit, where fluffy articles get lots of comments, and technical articles are relative ghost towns. But, HN and reddit get enough total traffic that technical articles are still likely to have at least one interesting comment. Not so, here.

                                                            1. 11

                                                              I have the same problem. I wonder if we could improve the situation with comment-discovery mechanisms that are less obsessively focused on recency. As it stands today, if you’re commenting on an article more than a day old, it’s unlikely your comment will be read by enough people to get a reply. (This is far better than HN, where the time horizons are even shorter, and if you take too long thinking about your comment you get punished with a /deadlink when you go to post it, but it’s still bad.)

                                                              It might be useful, for example, if I could see a feed of the latest comments by people I think are interesting enough to follow — like a Planet aggregator, but for comments; or if it were easier to link to or even transclude relevant comments from the past. (For example, at the moment, the Your Threads page only lists the last few comments you’ve made; if you want to link to a comment even you yourself made a few months back, good luck finding your own comment. Maybe you can Google it. And finding other people’s related comments is even more difficult. Meanwhile lobste.rs has comment score data that it could totally use in a comment search engine, and Google can’t.)

                                                              Usenet did pretty well for some years with only a feed of the latest “comments”. You might even say that Usenet’s second-biggest problem was how well it did at fostering ongoing conversations. (Its biggest problem was vulnerability to spam.)

                                                              I would have liked to see the @dl comments before, and in fact I’m replying to one from two months ago (which is actually related to this topic) now, but without better discovery mechanisms I probably won’t see them. And I won’t see if anybody replies to my replies, either.

                                                              Another problem is that the user pages display the average score per article/comment; this score goes down if you post a comment that nobody reads, because it will always be scored 1. If you display scores on a web site, people will try to game them, and the way to win this game is to post very few comments — ideally short jokes or related links on recently-posted articles, and never questions betraying ignorance, corrections of errors, lengthy analysis, especially on articles that weren’t just posted, and especially when it’s not a top-level comment or a reply to a top-level comment; deep in a thread your chances are nil. (This comment’s score doesn’t fit my theory, though.) Compare how badly @dl (6.47) is beating @kragen (1.70), largely by virtue of posting an order of magnitude fewer comments. (Also @dl is probably smarter, more cogent, and better-looking than I am, but I think the score difference is primarily a result of the comment count.)

                                                              Maybe we should consider how to adopt some techniques from the popular sites that are at the opposite extreme: StackExchange and Wikipedia. SE automatically does some kind of full-text indexing thing to find “related questions” and answers, and interesting questions and answers continue gathering upvote karma (and sometimes further responses) for years, not hours. Wikipedia articles continue to garner interaction forever, even if that interaction is in the form of somebody correcting your spelling or punctuation (or, worse, deleting your contribution); and although there’s a “Recent Changes” page, the only people who read it are the vandalism patrol. Also, both SE and Wikipedia have social structures and software that reward participation, rather than punish it.

                                                              (Edited extensively.)

                                                              1. 3

                                                                One of the few ways (other than lack of traffic) in which I find lobsters worse than HN is that here, contentless comments and jokes float to the top, whereas at HN, lengthy analysis floats to the top. Not always and everywhere, but most of the time.

                                                                If I open an HN thread on a non-linkbait topic, and there’s a deep analysis anywhere in there, it’s likely to be the top comment, or at least be in the top thread. Here, those sorts of comments usually languish with 1 or 2 upvotes, while pithy comments that are clever, but not informative, are the most upvoted comments.

                                                                1. 4

                                                                  citation needed

                                                                  1. 1

                                                                    I did link to “short jokes and related links” that I had posted in the past which garnered an IMHO unreasonable number of upvotes, and a number of comments that I thought were better that ended up with a score of 1 or 2. Hopefully I’m not the worst offender here, so you should be able to find much better examples.

                                                                    1. 1

                                                                      See the examples cited in kragen’s post, for example. For that matter, see this thread, where I’m the only person to have upvoted kragen’s lengthy comment, where he spent the time to go through the archives and picked out a handful of examples.

                                                                      I could post five examples off the top of my head, but you might say I’m cherry picking examples. To try to get a representative sample, I clicked through every thread (that I haven’t hidden) on the first two pages that has at least two comments, and I couldn’t find any threads with a deep analysis; that’s a symptom of the problem.

                                                                      I’m not really inclined to spend more than the ten minutes I already have digging for more evidence to support my impression, for this thread which is, predictably, buried at the bottom of this topic that few people will read anyway. If your impression on this is different, that’s fine, and I suppose we’ll have to agree to disagree.

                                                                      1. 1

                                                                        Comment length does not equal quality, and people should not be upvoting comments just because the author took a long time to write it.

                                                                        kragen’s comment was an entire page long, and it was made on a story with only 12 points, 2 days after the story had already appeared.

                                                                        I’m not really inclined to spend more than the ten minutes I already have digging for more evidence to support my impression, for this thread which is, predictably, buried at the bottom of this topic that few people will read anyway.

                                                                        And that is my point. People don’t want to spend all that time reading long comments, especially in a thread that isn’t very interesting. That is why they don’t get upvoted.

                                                                        1. 3

                                                                          No, of course comment length does not equal quality; and I’m sure I’ve written long comments of low quality, too, for all that I try to write well. But comment length does correlate weakly positively with quality, and strongly negatively with score. Not only should people not be upvoting comments just because the author took a long time to write them, we shouldn’t be upvoting comments just because the author wrote them quickly and without thought — but the fact is, we do. I explained several plausible reasons why we do.

                                                                          Back to the original topic, though: why don’t people comment much? I don’t comment much because I have a poor chance of having a conversation as a result, and therefore having a chance to learn something.

                                                                  2. 2

                                                                    +1 for the feed of latest comments from a selected list, and other things made possible by ‘following’ people - like weighting upvotes by people I follow higher than general upvotes, etc.

                                                                    On HN, there are a few people whose comments I usually appreciate even on otherwise dumb comment threads, and there’s no good way there to surface those.

                                                                1. 5

                                                                  I agree with the general point about the ubiquity of networks, but I strongly disagree with the author’s “law” on the convergence to tiered-star topologies.

                                                                  Looking at early switched networks, key innovations were the non-blocking Clos topology in the 50s and then the Benes network in the 60s; these are still used today in many telephone switches.

                                                                  Looking at processor interconnect, we went from simple busses to 2-d mesh and torus networks (for ease of implementation). n-cube / hypercube networks came next because they’re the obvious next step, and minimizing the diameter is an “obvious” optimization target. A couple of papers came out showing that low-dimensional topologies were better given the constraints of the time, which motivated a move back towards 2-d and 3-d mesh and torus networks, though the degree has snuck back up as physical constraints have changed. And lately, fabrics with high node-degree have gone to butterfly and Clos networks.
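To make the diameter tradeoff concrete, here’s a quick sketch using the standard textbook formulas (my own illustration, not anything from the article): for the same node count, a hypercube has a much smaller diameter than a 2-d mesh or torus, but its node degree grows with log n, which is exactly the physical constraint (ports, wire length) that pushed designs back toward low-dimensional topologies.

```python
def mesh_diameter(side, dims):
    """Worst-case hop count on a dims-dimensional mesh with
    `side` nodes per dimension and no wraparound links."""
    return dims * (side - 1)

def torus_diameter(side, dims):
    """Wraparound links halve the worst-case distance per dimension."""
    return dims * (side // 2)

def hypercube_diameter(n_nodes):
    """An n-node hypercube (n a power of two) has log2(n) dimensions;
    its diameter equals the dimension count, since each hop can fix
    one differing bit of the destination address."""
    return n_nodes.bit_length() - 1

def hypercube_degree(n_nodes):
    """Each hypercube node needs one link per dimension."""
    return n_nodes.bit_length() - 1

# 64 nodes: an 8x8 mesh needs up to 14 hops and an 8x8 torus 8,
# while a 6-cube needs only 6 - but every 6-cube node needs 6 ports,
# versus 4 for the mesh/torus regardless of size.
```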

                                                                  This comment is long enough as-is, but suffice to say that you could make similar comments about other networked applications. It’s complicated, there are a lot of tradeoffs, and it’s not always (or even usually) the right decision to go with a star topology.

                                                                  I suppose this all depends on what you mean by tiered-star. IMO, this doesn’t look much like this, let alone something like this.

                                                                  1. 8

                                                                    I strongly agree that comments on why people don’t like things are preferable to downvotes. However, that only happens occasionally, and if you force people to comment to downvote, I suspect you’ll get something like you do when you force people to pick a downvote reason. People will make a comment that doesn’t contribute much, the same way that people pick an arbitrary reason to downvote something they don’t like.

                                                                    I have a longer analysis here, but the short version is here’s what I see on HN, which has no downvotes: There’s a high volume of new stories, which means that things have a very limited amount of time to hit the front page. Linkbait is highly likely to collect enough votes from /newest in that timespan. Technical articles aren’t likely to. The more technical the article, the worse the odds. When technical articles do make it to the front page, they often get enough upvotes to stay there for a long time because front page traffic is so much higher than /newest traffic.

                                                                    On reddit, which has downvotes, people very aggressively downvote content and most things are pretty much immediately pushed into the negative range where they’ll fall off the front page without much exposure. If something gets pushed up higher before it gets downvoted to oblivion, it can get enough momentum to stick for a long time. So, it basically works the same as HN, where linkbait has a much better probability of getting that first push than technical content.

                                                                    So, I think things are different, but in the limit the behavior will converge to be pretty similar either way.

                                                                    Of course, if you don’t like having lots of technical content, this isn’t a problem. And, considering what gets upvoted, it must be that people prefer to have less technical content and more fluffy content. Maybe the right thing to do is to give people what they want, even though that’s the opposite of my preference.

                                                                    1. 4

                                                                      There’s a high volume of new stories, which means that things have a very limited amount of time to hit the front page.

                                                                      That is exactly what /recent was designed to fix (which I just enabled again right before you posted your comment), since that was one of my complaints about HN when creating Lobsters.

                                                                      1. 1

                                                                        The underlying problem here is that we’re constructing conversation environments where fast-twitch reactions drown out slow, considered ones. There have been attempts to construct conversation environments where that’s not true, like the Rotisserie system (description in a book) and Wikipedia, but they have been only partly successful; this is perhaps partly because the matching law of operant conditioning means that rapid rewards, even small ones, are much more effective than slow rewards, partly because web sites that you interact with only occasionally don’t have a way to get your attention, and partly because the rapid back-and-forth of rapid reactions simply allows groups that interact frequently to be creative in ways that rarely-interacting groups aren’t. (Consider how many current popular memes come from 4chan.)

                                                                      1. 2

                                                                        I took a look at the AAUW study. It’s not really fair to say it makes “literally the opposite claim.”

                                                                        What it does is present some evidence showing a gender gap in earnings, broken out by college major. This is the chart that the post reproduces.

                                                                        Later, it presents a chart showing annual earnings 1 year after graduation, broken out by occupation, not major, and applying a bunch of corrective factors (which definitely seem debatable). This is Fig 8, which the post omits. In several fields, they find no significant gap for the cohort they studied, and Engineering is one of those fields.

                                                                        This post focuses on the first chart, but it’s the second part that the author of the original Qz article is referring to.

                                                                        1. 3

                                                                          I saw Fig 8, and I think it’s informative, but note that it’s normalizing for known factors, which includes factors that are partially caused by discrimination. If the original article had claimed that 2/3 of the gender pay gap was understood, I would have no problem with that.

                                                                          But the title claims that the gap doesn’t exist (that’s the part that’s literally the opposite), and the text is a combination of claiming that it doesn’t exist and that it isn’t a problem because it’s understood. However, as the authors of the AAUW study go through and account for various factors, they explicitly note that some of them are partially due to discrimination. To claim that there is no gap is false. To claim that the gap is not a problem because we’ve explained part of it (partly as discrimination) and don’t understand where the remaining 1/3 comes from is not technically wrong, but it strikes me as a strange claim to advance. My interpretation is that the original article makes much stronger claims than that.

                                                                        1. 10

                                                                          It’s interesting how much hostility there is around talking about this sort of thing. When I replied with a comment to the original post, pulling some numbers out of the studies, with a comment saying that the numbers didn’t support the conclusion, I was immediately downvoted (-1, troll).

                                                                          Now that this blog post has been publicized, the author of the post I’m responding to seems to be running a twitter smear campaign against me, with a series of personal attacks and an appeal to authority thrown in for good measure.

                                                                          I’d like to see a real discussion of the issues, but that’s not happening here and it’s difficult to see how it’s even possible.

                                                                          1. 2

                                                                            Sorry for side-tracking here:

                                                                            What is Engineering Technology, and should it be included with Engineering? Why are CS and IT lumped together? Why are Bio/PhysSci/Science Technology (same question as E.T.)/Math/AgSci all lumped together?

                                                                            I don’t know if these categorizations were yours or the original study’s.

                                                                            1. 1

                                                                              The categorization is from the original study. My post is actually pretty boring; there’s no synthesis or analysis, just quotes from the actual studies with some comments here and there.

                                                                              Good question about engineering tech; I hadn’t heard of it myself until I did grad school at a place that offered EE and EETech degrees. In my mind, engineering tech and engineering folks have pretty much the same skillset. With EETech, there’s more of a focus on the practical and less on the theoretical.

                                                                              For some strange reason, learning about solid state physics, combinatorics, and gauge fields made me a lot more employable, despite having little practical value in my professional life (combinatorics has occasionally been useful). Employers seem to prefer hiring folks with EE degrees to folks with EETech degrees, especially for higher level positions.

                                                                            2. 1

                                                                              I’m sorry to hear that, Dan. I’ve upvoted you. Definitely appreciate the actual statistics vs. linkbait crap other people post online.

                                                                            1. 9

                                                                              Post title: There is no gender gap in tech salaries.

                                                                              Actual result of main linked study: Looking only at people who graduated with a B.S. a year ago, there’s a 12% gap in “engineering and engineering technology” and a 23% gap in “computer and information sciences”. If you control for self-reported number of hours worked and type of employment, 2/3 of the gap disappears.

                                                                              Later, the blog post refers to a study indicating that the gap is “only” 14% when adjusting for hours worked and not 23%.

                                                                              The blog post then concludes with, “Despite strong evidence suggesting gender pay equality, there is still a general perception that women earn less than men do, and this perception is just one more factor discouraging women from entering the tech space.”

                                                                              To be fair, I didn’t bother reading the last two studies the author linked to, but they didn’t seem relevant. One is a study of part time workers only. From the abstract, the other is a study from the 80s that concludes that the gender pay gap is from cohort effects, which is directly contradicted by the first study.

                                                                              1. 7

                                                                                My thoughts on some of the suggestions here:

                                                                                • Making the downvoting usernames public will just cause more off-topic meta discussion than there is now, or personal attacks. I really dislike when all of the comments about a story are just nitpicking the tags or the voting or whatever. This was why I initially made the meta tag filtered by default, because I don’t like so much of the site discussion being just about the site discussion. Also, if a story has 10 upvotes and 2 people downvoted or flagged it but those tallies were never shown to anyone other than mods, there would be no complaining about it because a net score of 8 would be fine and no one would even know those 2 downvotes were cast. Making the tallies public seems to have just caused lots of bickering, and showing usernames would make it worse.

                                                                                • Removing the low quality option will just cause those using it now to pick some other reason. I think a “hide” option is needed so that people have a way of doing something more than ignoring things they don’t like, without affecting the score and hurting feelings.

                                                                                • I think that the downvote option should be removed completely and just replaced with a “flag” option that doesn’t alter the score, and add a “hide” option that just removes it from the hider’s view. I think people downvote articles they don’t want to see, when they should instead just be filtering out those tags or clicking “hide” to remove it from their view. I don’t believe many articles that people downvote are actually bad enough to warrant a “flagging” or making their score go negative but with no hide option, they choose to downvote.

                                                                                1. 2

                                                                                  I am in complete agreement.

                                                                                  I agree with those who believe showing the names of those who’ve voted will add some culpability and responsibility. There was a time when I believed that Lobste.rs could be mature about that sort of thing. Recently, my faith has been adjusted and I think it would end up in meta-hell (or worse).

                                                                                  1. 2

                                                                                    Thanks for taking this seriously. One of the worst things about the programming reddit is that people use downvote to indicate dislike, and that things quickly fall off the front page if downvoted.

                                                                                    Link bait gets enough upvotes to survive the torrent of downvotes that hits pretty much every new article, but technical articles rarely do, which heavily biases the top articles towards rants and away from posts with real technical content.