1. 15

  2. 12

    The comparison of markup languages at the end of the post should be interpreted with a specific use-case in mind, writing technical documentation, and an evaluation choice in mind, which is to not care very much about the usability/cost of writing the documentation (one writer, many readers). For example, the author finds roff “Very good”, but roff is rather horrible to use “in a hurry”, without good support from your text editor and some time reading the documentation. I think this is a perfectly fine use-case for markup formats, but it is only one among many.

    For me the value of markdown and markdown derivatives is: this makes it easy to write short ASCII texts on the web that use HTML features without hassle. Markdown is great for short comments, forum posts, and reasonably convenient for blog posts. I agree that it should not be used for “technical documentation”, as its very limited expressivity becomes an issue – for example it does not support footnotes or table of contents. I have used AsciiDoc for this purpose, which I find more suitable (for example my ocamlbuild manual started in Markdown and is now an AsciiDoc document).

    One thing that is a bit frustrating as a language designer is that there is no obvious reason why we couldn’t get all the nice things at once: a convenient syntax like Markdown, without the very limited expressiveness. Intuitively one would think that building a more expressive/powerful markup language on top of Markdown (possibly with minor changes) should be easy, and AsciiDoc is playing in the same niche (so are Pandoc’s extensions, org-mode, etc.). But then it becomes very difficult to get people to agree on a feature set, and to have wide support among the many websites that support Markdown as a lowest common denominator today, so these more expressive languages become less portable across tools and places. Having built-in rendering support in Github for example is a criterion of choice even for a technical documentation language – because it means that you can always point your users to the most up-to-date version of the document, directly in the version-control system, without having to come up with an additional rendering-and-release pipeline.

    1. 8

      This is… I don’t know, I’d call it funny I guess? I should not be surprised, I have been watching people (esp. those of a programmer mindset) misunderstand Markdown for years.

      The reality is right in the name: Markdown is not a markup language. Certainly not in the sense of any of the other ones the article juxtaposes it to.

      Instead, Markdown is a shorthand for HTML. You’re really just writing (and reading) HTML.

      It’s non-independent by its very premise. And that’s why it has so few constructs. Those two choices are explicit design decisions which are predicated on each other: it doesn’t need to do everything HTML can do, only the things you write over and over and over and over again when you author in raw HTML. Everything else, you just use HTML tags for.

      That’s why its code spans are so purportedly horrible: because if you want <code>, you can just write <code>. The problem with <code> and friends is that you need to escape lots of characters that come up frequently in code, because they are part of the HTML syntax. That’s why Markdown’s backticks and code blocks exist and why they disallow formatting: because that means they don’t need any quoting mechanism. That makes them easier to write for the simple stuff you want a lot of the time. If you could embed formatting in them, you’d lose the entire reason these constructs exist in the first place. Instead, if you want to format your code snippets/blocks, just do what you’d have done in the absence of Markdown: use <code>. That’s why Markdown is non-independent: on purpose.
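
      To make the escaping burden concrete, here is a minimal Python sketch of what hand-writing a <code> element involves. The helper name `as_code_element` and the sample snippet are made up for illustration; only `html.escape` comes from the standard library.

      ```python
      import html

      def as_code_element(source: str) -> str:
          """Wrap source in <code>, escaping the characters HTML treats as syntax."""
          # quote=False escapes only &, < and >, which is enough for element content.
          return "<code>" + html.escape(source, quote=False) + "</code>"

      # Hypothetical one-line C fragment, typed as-is between Markdown backticks.
      snippet = "if (a < b && b > c) x = &y;"

      print(as_code_element(snippet))
      # prints: <code>if (a &lt; b &amp;&amp; b &gt; c) x = &amp;y;</code>
      ```

      With backticks the author types the snippet verbatim; every entity in the printed line is something the author would otherwise have to write by hand.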

      That approach also mostly avoids the problems with ambiguity in the emphasis markup. If you need precision – write it in HTML tags! That may seem like a copout – and to some extent, yes it is –, but consider: would you confront human readers with **bold***italic***bold**? At least, if you expected them all to interpret that the same way? If you wanted a human to understand that clearly, you’d write it with a heavier syntax that makes the nesting explicit. Which is what HTML is. Which is why Markdown lets you write HTML when that’s clearer. (So, to an extent, the lack of clear definition is a result of laziness: if it works for the simple cases and HTML is there for the complex ones… why bother, right? I consider that a copout; just one that the design could afford.)

      And so almost all of the weaknesses Ingo rants about are reasonably deft choices if you think of Markdown as “a stenographic form of HTML” rather than some kind of self-contained markup language in the vein of the other ones he lists. (But lack of standardisation really is a problem. And line breaks by whitespace are… a coherent choice within the aims of Markdown, but nevertheless a terrible one. Lastly, while I don’t think I agree on the semantic confusion issue, it’s at least debatable. (Aside from those issues with it, I have one of my own, namely list renumbering – a superb help in ~1% of cases but also a massive pain in the foot ~10% of the time.))

      All of which also means lots of people process Markdown input wrong. The right way to deal with it is to convert it to HTML, and then process that HTML just as you would process any ol’ fully-general HTML document. You want to whitelist certain constructs? You use an HTML sanitiser. You want to generate another format from Markdown? You write an HTML-to-YourFormat converter. Etc etc.
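
      As a rough sketch of the second half of that pipeline, the Python below filters already-rendered HTML against an allowlist, using only the standard library’s `html.parser`. Everything named here (`AllowlistSanitiser`, `sanitise`, the tag and attribute sets) is an assumption made for illustration, and the Markdown-to-HTML step is taken to have happened already in whatever converter you use. It is a toy: it does not check URL schemes, so a real deployment should use a maintained sanitiser.

      ```python
      from html import escape
      from html.parser import HTMLParser

      # Hypothetical allowlist; adjust to whatever constructs you want to permit.
      ALLOWED_TAGS = {"p", "em", "strong", "code", "pre", "a", "ul", "ol", "li", "blockquote"}
      ALLOWED_ATTRS = {"a": {"href"}}

      class AllowlistSanitiser(HTMLParser):
          """Re-emit only allowlisted tags and attributes; keep all text, escaped."""

          def __init__(self):
              super().__init__(convert_charrefs=True)
              self.out = []

          def handle_starttag(self, tag, attrs):
              if tag not in ALLOWED_TAGS:
                  return  # drop the tag itself, but not the text inside it
              allowed = ALLOWED_ATTRS.get(tag, set())
              kept = [(k, v) for k, v in attrs if k in allowed and v is not None]
              rendered = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in kept)
              self.out.append(f"<{tag}{rendered}>")

          def handle_endtag(self, tag):
              if tag in ALLOWED_TAGS:
                  self.out.append(f"</{tag}>")

          def handle_data(self, data):
              # Character references were decoded by the parser; escape them again.
              self.out.append(escape(data, quote=False))

      def sanitise(html_text: str) -> str:
          parser = AllowlistSanitiser()
          parser.feed(html_text)
          parser.close()  # flush any buffered text
          return "".join(parser.out)

      # Input stands in for the output of a Markdown converter, raw HTML included.
      rendered = '<p>Hi <span class="x">there</span> <a href="https://example.org" onclick="x()">link</a></p>'
      print(sanitise(rendered))
      ```

      The point of the design is that the filter never needs to know the input started life as Markdown, so raw HTML typed by the author gets the same treatment as HTML produced from the shorthand.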

      Which, btw, also means that if you want to convert from another format to Markdown, then strictly speaking it’s totally valid to just output HTML that doesn’t use any Markdown shorthand. (Of course, because most people implement Markdown processing wrong, that document will be useless in lots of places which purport to accept Markdown.)

      In a sense, Markdown is how a designer or linguist might respond to the task of writing HTML (emphasise the common stuff, blur insignificant distinctions) rather than how a programmer or computer scientist would design a standalone markup language (carefully make all constructs equally plausible and orthogonally composable in every context). And it’s a pretty damn good stab at its purpose.

      And so it’s a problem that Markdown’s adoption has prompted many people to try to use it as though it were (a lot) more like a self-contained markup language… whence, cue the linked rant. So while I think Ingo has fundamentally misunderstood Markdown and missed its point, I guess I still believe he has good reason to rant.

      Edit: fixed my confused attribution of Ingo’s positions to pitrh.

      1. 3

        Markdown is very close to writing plain text. Writing HTML just like you would write plain text may have a lot of flaws, but it is comfortable.

        Whenever you want to write enhanced plain text (blog posts, comments) Markdown is great. Let’s only use it in these cases and we will have no problem. :)

        1. 3

          (pitrh is not ingo.)

        2. 5

          This is really more a rant about markdown than a feature announcement.

          1. 6

            Yeah it’s a rant, but an informative one from someone with years of practical experience implementing markup and text processing. A rant from someone who loves markdown over anything else would probably not be as informative.

            It is also biased, which I am sure Ingo would have no problem admitting :)

            1. 3

              Yes, I liked it, didn’t mean it in a rude way. It’s just the title that didn’t really fit the description and that threw me off. To me, rant sounds like a neutral word.

              1. 3

                It is part of Lobsters submission guidelines to “not editorialize titles”, which is why, I suppose, the original title was kept unchanged.

          2. 2

            As a commenter on that page already mentioned, where’s CommonMark? Admittedly does not solve all the issues, but at least it’s a currently maintained standard.

            As for the *** vs ** issue (bold and italics), I always preferred Textile’s use of * for bold and _ for italics. I wrote many an email (1980s) where that was how I marked up the text. The rest of Textile, however, doesn’t work for me, though mdoc certainly seems of the same style, i.e., use of short prefixes for that line to indicate its semantic or formatting meaning.

            Finally, as an author who used Leanpub to format books, you can only go so far using Markdown (even if it’s an extension, i.e., Kramdown) before you start tearing your hair out and wishing for direct LaTeX support. This is one of the reasons the Leanpub folks created their own markup language, Markua, but that was too late for me.

            1. 1

              Now we just have to wait a bit until this reaches the portable mandoc implementation. Or switch to OpenBSD! :)