1. 18

Author here. After noticing a large number of XSS exploits in sites that render Markdown, I decided to create a library where you have to explicitly opt out of security, instead of security being opt-in as it is in many Markdown implementations.

  2. 4

    Hey, primary maintainer of cmark-gfm here! I like what you’ve done here a bunch. I think safe-by-default is almost always the correct decision. I might consider changing cmark-gfm to be likewise, though I’m concerned about diverging from upstream in this subtle way.

    1. 4

      Hi, that would be absolutely amazing.

      If the result of my little project is that you change the standard cmark-gfm implementation to be safe by default, I would consider it a greater success than I could ever have imagined. Please keep me posted on this.

      The goal of the project is to provide a markdown renderer which is safe to use with user-provided input out of the box and where the risk of error is minimized.
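To make "safe by default, explicit opt-out" concrete, here is a minimal sketch in Python. It is not this library's or cmark-gfm's actual API: `render_inline`, the `unsafe` flag, and `SAFE_SCHEMES` are hypothetical names, and it only handles raw HTML and inline links rather than full Markdown.

```python
import html
import re
from urllib.parse import urlsplit

# Hypothetical allow-list of link schemes ("" means a relative URL).
SAFE_SCHEMES = {"http", "https", "mailto", ""}

# Tiny inline-link pattern: [label](url). Real Markdown links are more complex.
_LINK = re.compile(r"\[([^\]]*)\]\(([^)\s]*)\)")


def _is_safe_url(url: str) -> bool:
    # Drop control characters and spaces that browsers may ignore in schemes.
    cleaned = re.sub(r"[\x00-\x20]", "", url)
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        # Treat unparseable URLs as unsafe.
        return False
    return scheme in SAFE_SCHEMES


def render_inline(text: str, *, unsafe: bool = False) -> str:
    """Render inline links; raw HTML is escaped unless unsafe=True."""
    if not unsafe:
        # Safe default: raw HTML in the input becomes inert text.
        text = html.escape(text, quote=True)

    def link(match: re.Match) -> str:
        label, url = match.group(1), match.group(2)
        # Safe default: links with a disallowed scheme get an empty href.
        if not unsafe and not _is_safe_url(html.unescape(url)):
            url = ""
        return f'<a href="{url}">{label}</a>'

    return _LINK.sub(link, text)


print(render_inline("<b>hi</b> [x](javascript:evil)"))
```

The point of the design is that the caller who forgets to think about security gets escaped output, and passing raw HTML through requires typing `unsafe=True` at the call site, which is easy to spot in review.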

      1. 3

        Done! Thanks for the push to do this – it’ll cause a period of migration for downstream consumers, but I think it’s worth it.

        1. 2

          That’s very cool of you. I really appreciate the swift action and, as you say, I think it will be worth it in the long term.

          This is much like when YAML changed its standard implementations a little while back to be safe by default instead of unsafe by default, making the internet a little bit safer.

    2. 6

      I wish VFMD became more popular. It has a nice, exhaustive and deterministic spec, and potential for being well-specified with regards to worst case complexity.

      1. 3

        How is this different than, or related to, Common Markdown?

        1. 2

          It predates Common Markdown. It’s a real specification, it is unambiguous and exhaustive (I haven’t read CM, but from what I heard it’s more example-based, and still ambiguous and non-exhaustive). As a genuine spec, it’s kinda “declarative algorithm expressed in words”; it’s precise as to expected observed behaviors, but over that it doesn’t force a particular implementation (thus it’s not a “specification by reference implementation”). It comes with a huge test suite, starting from simple cases, up to and including various non-trivial cases, known to be ambiguous among various implementations.

          Also, I believe that with a few non-disruptive extra restrictions (an O(N) regexp engine, and a limit on how many lines a link can span) it could be given O(N·M) worst-case complexity (N being the number of lines in the text, M being the length of the longest line).

          However, it didn’t win the popularity contest, as there’s only one guy behind it, who’s not a “SE celebrity” or a good marketing/PR guy. But he did put in the extra effort to make it very readable and to maintain a clean and solid website for VFMD, still keeping it active after many years.

          1. 3

            It’s a real specification, it is unambiguous and exhaustive (I haven’t read CM, but from what I heard it’s more example-based, and still ambiguous and non-exhaustive).

            I think you are missing something, because this is exactly the goal of the CommonMark spec: it was created because of the ambiguity of Daring Fireball’s Markdown description.

            From the latest spec:

            […] John Gruber’s canonical description of Markdown’s syntax does not specify the syntax unambiguously. Here are some examples of questions it does not answer: […]

            or the website:

            John Gruber’s canonical description of Markdown’s syntax does not specify the syntax unambiguously. […]

            @akavel wrote:

            […] there’s only one guy behind it […]

            I think what made CommonMark so successful was that they’ve been very open to input from day one (there is a big community participation forum) and that they’ve released a lot of specification versions over a short period of time (considering how old the initial Markdown draft is).

            1. 1

              I searched some more, and what I found is a “commonmark vs vfmd” page on the vfmd wiki. It compares VFMD with CommonMark as of 2014, when the latter was emerging. The page links to a 2014 discussion on HN between vfmd’s author and one of CommonMark’s authors. I suppose the quote that stayed in my memory, which IIUC was valid at least at the time, was:

              The problem is that there’s no formal grammar and the spec of “Standard Markdown”, while being more specific than John Gruber’s, is still full of ambiguities.

              Some examples of ambiguities:

              […]

              Thing is, a specification-by-example like this would have to keep an ever-growing list of corner cases and give examples for each of them. […] Hence the need for a formal grammar, which is the shortest way of expressing something unambiguously. […] (Shameless plug: vfmd (http://www.vfmd.org/) is one such Markdown spec which specifies an unambiguous way to parse Markdown, with tests and a reference implementation.)

              (“Standard Markdown” was the original name of CommonMark, before John Gruber objected to this naming.) And John MacFarlane’s (CM co-author’s) reply further down basically confirmed this:

              Your comments (coming from someone who has actually tackled this surprisingly difficult task) are some of the most valuable we’ve received[…] We considered writing the spec in the state machine vein, but I advocated for the declarative style. It may be worth rethinking that and rewriting it, essentially spelling out the parsing algorithm.

              I haven’t tracked CommonMark since then, so I don’t know whether they processed this feedback eventually, or not. Whereas VFMD was ready then already. And it was not “less open”; it was just less popular, and its author was not an Internet celebrity.

              1. 1

                Ah, thank you for checking why you’ve remembered this that way - makes more sense now.

                “commonmark vs vfmd” page on vfmd wiki

                I tried out many of the examples in Pandoc. Some of the criticized behavior appears to have changed, some has not. Some of the examples are, I guess, a matter of personal preference with no “right” way to do it, no matter how much you discuss the topic.

                I haven’t tracked CommonMark since then, so I don’t know whether they processed this feedback eventually, or not.

                I can’t find any implementation in that vein, or anything that helps you implement CM besides reading the spec and source, but there are so many implementations of CommonMark now. But yes, it has acted on this feedback.

                Whereas VFMD was ready then already. And it was not “less open”; it was just less popular, and its author was not an Internet celebrity.

                There are claims that they were already working on CommonMark in 2012. And even if not, sometimes ideas, concepts, and projects that come later do eclipse someone’s work that never had the opportunity to take off.

                So I can’t tell you why VFMD hasn’t taken off while CommonMark, for whatever reasons, did. I think Roopesh Chander’s work on VFMD is stunning and his efforts should be praised…

                I don’t know who you are referring to as an “Internet celebrity”, whether you mean “John MacFarlane” or “Jeff Atwood”.

                Also, Pandoc’s first commits are from November 2009, long before VFMD, so I totally get why someone who built something like Pandoc would put effort into creating CommonMark.


                It predates Common Markdown. It’s a real specification, it is unambiguous and exhaustive (I haven’t read CM, but from what I heard it’s more example-based […]

                After this discussion I don’t see any proof for your claim and “from what I’ve heard in 2014” is really not helpful - other than confusing people with polarizing statements. Read both specs for a proper discussion. Both have examples.

                1. 2

                  I do understand now, thanks to this discussion, that CM may have changed more than I expected over this time. If I get back to working on Markdown parsing in the future, I will most certainly take CM seriously into consideration and comparison, to develop an opinion on it anew and with a fresh eye. I’m very happy to hear they may have improved so much. That said, for the time being I cannot invest my time in this comparison, and don’t have a need for it. But I will surely be more careful with my claims in this area from now on. Thank you very much for this again.

                  I generally agree with what you’ve written in this last post. As to Pandoc, the VFMD docs do mention and acknowledge the influence of John’s work quite clearly. By “Internet celebrity”, I mean Jeff Atwood, and I totally don’t mean this in a bad way. It is just a statement of fact, and some meditation on the importance of publicity and popularity for technology impact and adoption. It does make me somewhat sad that idealised “pure merit” is not enough to succeed, but that’s just how this world works, so I find it pointless to argue with that.

        2. 1

          Interesting, will have to look into that further.

        3. 2

          Very cool! Thanks for this, I’m starring it!