  2. 7

    I usually find that “math” doesn’t prove real-world facts. In fact, I’d go so far as to say that it doesn’t prove anything beyond itself. When trying to “rate” generals in terms of numbers (from a source the author has admitted is limited - notice that Genghis Khan is vastly underrepresented), all you can do is visualize, compare, and analyze data. On the one hand, one has to consider that we neither have all the data to accurately describe reality, nor is all the data trustworthy: let’s not forget, history is written by the winners - and what is written need not be true. On the other hand, “just” looking at numbers cuts out so many other factors which could have been vital to the course of history: weather, strategy, technology, economic issues like food distribution/production, social factors like elections, revolutions, coups d’état, the memory of previous wars, or bad omens from oracles. War has proved over and over not to be a conflict determined simply by manpower - otherwise it would be truly democratic. Cities in Europe capitulated before the Mongols just because of what they had heard the Mongols did to cities that didn’t, even though the Mongols’ manpower might have been significantly lower. That’s what I call a good PR campaign. Or how can one say that when Alexander’s army decided not to go further than India, this is somehow to be blamed on Alexander the Great himself? What about the Russian generals who forced Napoleon into the Russian winter and thereby destroyed his armies? How is that non-battle, that negative choice, factored into the statistics?

    Ultimately, I see this kind of project as a futile attempt at ahistorical and decontextualized ratings of generals, as if it were some kind of video game (interestingly enough, this could serve as a foundation for a critique of a modern video-game mentality of understanding the world and history). If his intention were merely to rate generals based on this WAR mechanism and with limited data, then fine. But the inferred jump from this rating to an absolute standard of “goodness” as a general is unwarranted. I might even agree that Napoleon, Hannibal, and Alexander the Great were good generals, but this is not a statistical matter, and I don’t believe it can be reduced to an issue described via numbers.

    1. 6

      I don’t agree that you can’t quantify this, but I do agree that this article is fundamentally bogus. The thing that bugs me about these kinds of overblown claims is that they poison the well for deeper analysis, since they cause people to write off the idea of using data. IMO, the problem here isn’t the use of data; it’s the unreasonably strong conclusion drawn from very weak data. Ironically, you can even see this in baseball if you look at pre-“sabermetric” analysis. Many of the stats people used to judge baseball players simply weren’t very good (in that they had low predictive power for both future player performance and team win probability), which gave the anti-stats camp ammunition they could use to dismiss the idea of deeper analysis.

      It appears that the author has taken all factors that aren’t sheer numbers and computed a single “wins above replacement” (WAR) figure for all of those factors combined. Some things that make WAR work for baseball, but which seem difficult to apply to warfare, include:

      1. In baseball, there’s a well-defined concept of a “win”, and regular-season wins are equivalent to one another. This is certainly not true in warfare, where a single “win” can sometimes reverse the effect of many losses.
      2. In baseball, runs scored are independent enough that you can predict wins from runs scored “well enough”, and the effect of players is independent enough that you can predict run generation and prevention “well enough”. You can apply this type of analysis to sports where players’ actions aren’t even close to being independent, but it’s much harder to do so. It’s not clear what “hidden” or “intermediate” variables in warfare can stand in for the equivalent variables in baseball. A naive application using intermediate variables like casualties seems likely to yield almost no value (I could be convinced otherwise, but I’d need to see an argument, not the blind application of a formula). If this analysis were serious and not clickbait, I’d expect the discussion of these variables and the justification that they’re predictive to take up most of the space.
      3. There’s a large sample size. We have tons of data about baseball (and other sports). While we know about “lots” of battles, we don’t really have good data for many of them. Even for battles that have happened in the past decade, casualty estimates can vary by an order of magnitude depending on who’s doing the estimate. Even if everything else were perfect, you’d have a hard time doing sabermetric analysis of baseball if your error bars on the number of runs scored varied by an order of magnitude. Also note that, when comparing across eras in any sport, you have to adjust for how the environment was different in different eras; different adjustments can give very different rankings of players. If any kind of era adjustment was done in this article, it’s not mentioned and the adjustment isn’t justified. For many eras, I suspect we simply don’t have the data to do a data-driven adjustment, so any adjustment would either involve original research or intuition applied to case studies.
      4. Replacement level is fixed. There’s a debatable but relatively well-defined meaning for replacement level in baseball, but no such thing exists in warfare. As we’ve seen in baseball analysis, moving the replacement-level target around can drastically affect findings of who’s “best”, and since we have a variable that gives the person doing the analysis total freedom, we need some justification for why replacement level is set where it is if we want to be convinced that the result isn’t just a post hoc justification (the sketch after this list makes this concrete). Note that the same issue applies to all of the variables used in (2) and their weights, so the person writing the article had many degrees of freedom they could tweak to tune the results arbitrarily, and there’s no justification for any of the “knob settings” in the entire article.
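
      To make the replacement-level point concrete, here’s a minimal sketch with entirely invented numbers (nothing below comes from the article’s data; the “generals” and their per-battle values are hypothetical). It shows how an otherwise fixed WAR-style calculation reorders its ranking as the baseline moves:

      ```python
      # Minimal sketch of point 4 (hypothetical numbers, not from the article):
      # a WAR-style score is "value above a baseline", so the ranking of
      # generals can depend entirely on where the replacement level is set.

      # Invented per-battle "value" contributed by four hypothetical generals.
      value = {
          "General A": [5.0, 5.0, 5.0, 5.0],      # long career of solid wins
          "General B": [12.0, -1.0, -1.0, -1.0],  # one huge win, otherwise below par
          "General C": [3.0, 3.0],                # decent, but a short career
          "General D": [8.0, 8.0],                # short career of big wins
      }

      def war_style_score(values, replacement_level):
          """Total value above the chosen replacement baseline, summed per battle."""
          return sum(v - replacement_level for v in values)

      for baseline in (0.0, 2.0, 4.0):
          ranked = sorted(value, key=lambda g: war_style_score(value[g], baseline),
                          reverse=True)
          scores = ", ".join(f"{g}: {war_style_score(value[g], baseline):+.1f}"
                             for g in ranked)
          print(f"replacement level {baseline}: {scores}")
      ```

      With the baseline at 0.0, General A leads and General D is second; at 4.0, D overtakes A and B falls to last. Because the baseline is subtracted once per battle, raising it penalizes long careers full of ordinary wins and favors short careers with a few big ones, so without a principled argument for where the bar sits, “who’s best” is effectively a free parameter.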

      I don’t think there’s anything inherently wrong with running the numbers and seeing what comes out. But the author then claims that this somehow “proves” that some generals are mathematically superior to others. As you say, this is not correct.

      Using a methodology that can’t answer the question being asked and then feeding in bogus data is quite common, so I don’t mean to pick on this article in particular for that reason. The thing I find most unfortunate is that I’ve seen this article passed around a lot, and AFAICT it’s been passed around precisely because the author makes a lot of claims they can’t back up - an article titled “a preliminary investigation of ‘wins above replacement’ applied to warfare” probably wouldn’t have gone viral, wouldn’t have gotten linked to on Marginal Revolution, etc. This article appears to have been widely read not in spite of being clickbait, but because it’s clickbait. It trades on the cachet of sabermetrics, as if sabermetrics were about having some formula and a term called “wins above replacement” rather than about figuring out what we can “know” from data and what the limits of our knowledge are. As an article about who the “best” general is, it’s pure cargo culting.

      1. 1

        > I don’t think there’s anything inherently wrong with running the numbers and seeing what comes out. But the author then claims that this somehow “proves” that some generals are mathematically superior to others. As you say, this is not correct.

        Well, that was basically my point ^^ My introductory remark about “math not being able to prove anything” (in itself - it can of course be used as a tool to model or calculate, no doubt) was maybe more provocative than it should have been. But yeah, there’s no moral issue or any other issue with doing calculations, nor do I think that anyone believes that, just in case anyone misunderstood the intention of my comment.

    2. 2

      In the Vietnam War, there was a focus on data above everything else that led to a technocratic approach to war. Such-and-such many killed; so-and-so many weapon caches destroyed. It led to the murder of innocent people and the falsification of metrics, so that generals sitting in the rear could fool themselves into thinking they were winning. In light of this, it seems like a really bad idea to try to apply Sabermetrics to this space.

      1. 1

        I really enjoyed this article because I’m a huge baseball fan, and seeing the use of sabermetrics outside of the sport is always interesting. People always go on about WAR and why it’s such a valuable stat, but to me it’s too general to really show what a player is good at. I know having >= 1 WAR is good, but I also know (and prefer) that seeing a slash line of .290/.320/.490 (BA/OBP/SLG) is good. A player with negative WAR can have this slash line. It’s not likely, but it is possible. The same goes for pitchers: if I see an ERA under 3 and a WHIP under 1, that individual can really toss the ball. Seeing something like 3 WAR makes me go “Oh, he’s generally good, he’s about three wins better than a replacement-level player at the same position, but does his changeup really drop and fool opposing batters? I just don’t know”.
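
        For readers who don’t follow baseball, a slash line is just three ratios over a player’s counting stats. A quick sketch using the standard definitions (the sample numbers are made up, not a real player’s season):

        ```python
        # Slash line (BA/OBP/SLG) from raw counting stats, standard definitions.
        # The sample numbers below are invented, not a real player's season.

        def slash_line(h, ab, bb, hbp, sf, doubles, triples, hr):
            singles = h - doubles - triples - hr
            ba = h / ab                                   # batting average: hits per at-bat
            obp = (h + bb + hbp) / (ab + bb + hbp + sf)   # on-base percentage
            tb = singles + 2 * doubles + 3 * triples + 4 * hr  # total bases
            slg = tb / ab                                 # slugging: total bases per at-bat
            return ba, obp, slg

        ba, obp, slg = slash_line(h=160, ab=550, bb=25, hbp=5, sf=5,
                                  doubles=35, triples=3, hr=22)
        print(f"{ba:.3f}/{obp:.3f}/{slg:.3f}")  # -> 0.291/0.325/0.485
        ```

        WAR, by contrast, also folds in baserunning, defense, and a positional adjustment, which is how a player with a respectable slash line can still end up below replacement - exactly the mismatch described above.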

        1. 2

          This makes me wonder how good a pitcher Napoleon would have been. There’s a famous anecdote of him leading his mates in school in an epic snowball fight…

        2. 0

          Also: he ate the Ziggy Piggy. The entire thing!