1. 8

    Regarding Table of Contents generation, what do you think about hxtoc(1), which is part of the HTML/XML utilities by the W3C?

    Also, I’ve had a similar experience of joyfully discovering CommonMark recently, but instead of using the parser you mention, I’ve taken up lowdown as my client of choice. I guess this is something it has in common with most C implementations of Markdown, but especially when compared to pandoc, it was fast. It took me a fraction of a second to generate a website, instead of a dozen seconds or more. So I wanted to ask what other clients you’ve looked into, for example discount, as another popular implementation.

    1. 5

      Hm, I’ve actually never heard of hxtoc, lowdown, or discount!

      I haven’t been using Markdown regularly for very long. I started using it more when I started the Oil blog in late 2016. Before that, I wrote a few longer documents in plain HTML, and some in Markdown.

      I had heard of pandoc, but never used it. I guess cmark was a natural fit for me because I was using markdown.pl in a bunch of shell scripts. So cmark pretty much drops right in. I know a lot of people use framework-ish static site generators, which include markdown. But I really only need markdown, since all the other functionality on my site is written with custom scripts.

      So I didn’t really do much research! I just felt that markdown.pl was “old and smelly” and I didn’t want to be a hypocrite :-) A pile of shell scripts is pretty unhygienic and potentially buggy, but that is what I aim to fix with Oil :)

      That said, a lot of the tools you mention do look like they follow the Unix philosophy, which is nice. I would like to hear more about those types of tools, so feel free to post them to lobste.rs :) Maybe I don’t hear about them because I’m not a BSD user?

      1. 4

        I had heard of pandoc, but never used it.

        It’s a nice tool, and not only for working with Markdown, but tons of other formats too. But Markdown is kind of its focus… If you look at its manual, you’ll find that it can be very finely tuned to match one’s preferences (such as enabling or disabling raw HTML, syntax highlighting, math support, BibLaTeX citations, special list and table formats, etc.). It even has output options that make it resemble other implementations like PHP Markdown Extra, GitHub-Flavored Markdown, MultiMarkdown and also markdown.pl! Furthermore, it’s written by John MacFarlane, who is one of the guys behind CommonMark itself. In fact, if you look at the cmark contributors, he seems to be the most active maintainer.

        I usually use pandoc to generate .epub files or to quickly generate a PDF document (version 2.0 supports multiple backends, besides LaTeX, such as troff/pdfroff and a few html2pdf engines). But as I’ve mentioned, it’s a bit slow, so I tend to not use it for simpler texts, like when I have to generate a static website.

        I know a lot of people use framework-ish static site generators, which include markdown.

        Yeah, personally I use zodiac, which uses AWK and a few shell script wrappers. You get to choose the converter, which takes some format in and pipes HTML out. It’s not ideal, but short of writing my own framework, it’s quite OK.

        Maybe I don’t hear about them because I’m not a BSD user?

        Nor am I, at least not most of the time. I learned about those HTML/XML utilities because someone mentioned them here on lobste.rs, and I was surprised to see how powerful they are, and yet how seemingly nobody knows about them. hxselect to query specific elements in a CSS fashion, hxclean as an automatic HTML corrector, hxpipe/hxunpipe to convert (and reconvert) HTML/XML to a format that can be more easily parsed by AWK/perl scripts – certainly not useless or niche tools.

        But I do have to admit that a BSD user influenced me on adopting lowdown, and since it fits my use-case, I stick by it. Nevertheless, I might take a look at cmark, since it seems interesting.

      2. 2

        Unfortunately, it looks like lowdown is a fork of hoedown which is a fork of sundown which was originally based on the markdown.pl implementation (with some optional extensions), and is most likely not CommonMark compliant. Pandoc is nice because it can convert between different formats, but it also has quite a few inconsistencies.

        One of the biggest reasons I like CommonMark is because it aims to be an extremely solid, consistent standard that makes markdown more sane. It would be nice to see more websites move towards CommonMark, but that’s probably a long shot.

        Definitely check out babelmark if you get a chance; it lets you test different markdown inputs against a bunch of different parsers. There are a bunch of example divergences on the babelmark FAQ. The sheer variety of outputs for some simple inputs is precisely why CommonMark is useful as a standard.

        1. 3

          Lowdown isn’t CommonMark conformant, although it has some bits in place. The spec for CommonMark is huge.

          If you’re a C hacker, it’s easy to dig into the source to add conformance bit by bit. See the parser in document.c and look for LOWDOWN_COMMONMARK to see where bits are already in place. The original sundown/hoedown parser has been considerably simplified in lowdown, so it’s much easier to get involved. I’d be thrilled to have somebody contribute more there!

          In the immediate future, my biggest interest is in getting an LCS implementation into the lowdown-diff algorithm. Right now it’s pretty ad hoc.
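
          For readers unfamiliar with LCS (longest common subsequence), here is a minimal textbook sketch in Python. It is not lowdown's code and says nothing about how lowdown-diff would integrate it; the function name `lcs` and the word-level inputs are only illustrative assumptions.

```python
def lcs(a, b):
    """Return one longest common subsequence of sequences a and b."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover one subsequence.
    out, i, j = [], 0, 0
    while i < rows and j < cols:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


old_words = "the quick brown fox".split()
new_words = "the brown lazy fox".split()
common = lcs(old_words, new_words)
```

          Anything in the old text that is not in `common` would be a deletion, and anything in the new text that is not in `common` would be an insertion. The table costs O(n·m) time and space, which is fine for short runs of words.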

          (Edit: I’m the author of lowdown.)

          1. 2

            One of the biggest reasons I like CommonMark is because it aims to be an extremely solid, consistent standard that makes markdown more sane. It would be nice to see more websites move towards CommonMark, but that’s probably a long shot.

            I guess I can agree with you when it comes to websites like Stack Overflow, GitHub and Lobsters having Markdown formatting for comments and other text inputs, but I really don’t see why full compliance should be a priority when choosing a tool for your own static blog generator. I mean, compliance is nice, no doubt, but as long as you don’t intentionally use ambiguous constructs and don’t over-format your texts to make them more complicated than they have to be, I guess that most Markdown implementations are fine in this regard. Speed, on the other hand, is a different question.

            1. 1

              Are you saying that CommonMark should be used for comments on websites, but not for your own blog?

              I would say the opposite. For short comments, the ambiguity in Markdown doesn’t seem to be a huge problem, and I am somewhat comfortable with just editing “until it works”. I don’t use very many constructs anyway – links, bold, bullet points, code, and block code are about it.

              But blogs are longer documents, and I think they have more lasting value than most Reddit comments. So although it probably wasn’t strictly necessary to switch to cmark, I like having my blog in a format with multiple implementations and a spec.

              1. 3

                At least in my opinion, it’s useful everywhere, but more so for comments, because it removes differences in implementations. Oftentimes the people using a static site generator are developers and can at least understand differences between implementations.

                That being said, I lost count of how many bugs at Bitbucket were filed against the markdown parser because the library used resolves differences by following what markdown.pl does. I still remember differences in bbcode parsing between different forums - moving to a better standard format like markdown has been a step in the right direction… I think CommonMark is the next step in the right direction.

                1. 1

                  The point has already been brought up, but I just want to stress it again. You will probably have a feeling for how your markup parser works anyway, and you will write accordingly. If your parser is CommonMark compliant, that’s nice, but it really isn’t the crucial point.

                  On the other hand, especially if one likes to write longer comments and uses a bit more than the basic markdown constructs on websites, having a standard to rely on does seem to me to offer an advantage, since you don’t necessarily know what parser is running in the background. And if you don’t really use markdown, it doesn’t harm you after all.

          1. 4

            I have had servers at BSWS, where OpenBSD is fully supported, for many years now. No problems, and (can I name names?) staff are often found at BSD conferences. Suffice to say, can’t do much better. :) I’ve also recently started to use RootBSD due to some requirements for hosting in the US. Again, OpenBSD. Would highly recommend both, though I’ve only been working a few months with the latter. Both of these are for production systems. (Edited for tense.)

            1. 1

              If this is for a web application, I wrote kwebapp for exactly this situation. It can be fairly easily extended to emit Java, although its focus now is on BCHS applications.

              1. 1

                I’ll take a look, thanks!

              1. 2

                To date, I’ve been using Shotwell for managing photos and ufraw (via gimp) for the editing. (See example output.) With gphoto2 for import. It’s… pretty clunky, but I’ve been using ufraw for a while. Looks like it’s time to try switching again—anything to get out of shotwell’s limitations!

                1. 5

                  You may already be familiar with this subreddit:

                  https://www.reddit.com/r/FOSSPhotography/

                  But I’m mentioning it in case anyone else isn’t.

                1. 3

                  What about python, pandoc, flask, and so on is lightweight? Has the world gone mad?

                  1. 2

                    I like how BCHS keeps evolving.

                    Am I correct that there is no concept of templates at this point?

                    1. 3

                      Most of my own usage is in exporting JSON, then doing page manipulation in the browser. But since this is all in C, you should just be able to link to any templating engine library. (As noted below, kcgi does have a simple khttp_template(3), but it’s just token-pasting and nothing more sophisticated.)
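
                      To make "token-pasting and nothing more" concrete, here is a rough Python sketch of the idea. It is not kcgi's API: the `@@key@@` delimiter style, the function name `fill_template`, and the choice to leave unknown keys untouched are all assumptions made for illustration.

```python
import re

# Assumed delimiter style: a key name wrapped in double at-signs.
KEY = re.compile(r"@@([A-Za-z0-9_-]+)@@")


def fill_template(text, values):
    """Replace each @@key@@ with values[key]; unknown keys are left as-is."""
    def paste(match):
        key = match.group(1)
        return values.get(key, match.group(0))
    return KEY.sub(paste, text)


page = fill_template(
    "<h1>@@title@@</h1><p>@@missing@@</p>",
    {"title": "Hello"},
)
```

                      There are no loops, conditionals, or escaping rules here, which is the point: anything more sophisticated would come from a separate templating library.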

                      1. 2

                        From the khttp_template(3) man page (part of kcgi):

                        DESCRIPTION
                             The khttp_template, khttp_templatex, khttp_template_buf,
                             khttp_templatex_buf, khttp_template_fd, and khttp_templatex_fd functions
                             comprise a template system for a kcgi(3) context allocated with
                             khttp_parse(3).  They may only be called after khttp_body(3), else
                             behaviour is undefined
                        
                      1. 1

                        (Minor typo: “rooted at thd node”. Might want to fix that.)

                        1. 2

                          Thanks, noted! (Will push when document is next updated.)

                        1. 1

                          This is quite neat. One question from someone who didn’t compile the code and play with it: the “XML diff” algorithm BULD seems almost insensitive to ordering, but the order of text matters a lot (and classic diff - and your merge algorithm - are very linear comparisons.) Does the algorithm “behave” once you start moving blocks?

                          Thanks for sharing!

                          1. 2

                            BULD works on ordered trees—that was one of the reasons it was chosen. And it indeed supports the “move” concept in the edit script. In the lowdown implementation (specifically, in the merging algorithm), moves are made into insert/delete simply for the sake of readability of the output. It’s straightforward to extend the API to have “moved from” and “moved to” bits. Then have a little link in the output. Maybe in later versions…
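
                            As a rough illustration of that last step, the sketch below rewrites "move" operations as a delete plus an insert. The tuple layout of the edit script and the name `flatten_moves` are hypothetical; lowdown's real data structures are in C and look different.

```python
def flatten_moves(script):
    """Rewrite each ("move", node, src, dst) as a delete at src and an insert at dst."""
    out = []
    for op in script:
        if op[0] == "move":
            _, node, src, dst = op
            out.append(("delete", node, src))
            out.append(("insert", node, dst))
        else:
            out.append(op)
    return out


script = [
    ("insert", "para-3", 2),
    ("move", "para-1", 0, 4),
]
flat = flatten_moves(script)
```

                            Keeping the source and destination positions on the resulting pair is what would later allow "moved from" and "moved to" markers to be linked in the output.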

                          1. 3

                            Is there a visual example of the lowdown-diff output? I was disappointed not to see a ready comparison of the output versus the diff and wdiff examples in the introduction.

                            EDIT: Apparently I missed the line near the top:

                            For a quick example of this functionality, see diff.diff.html, which shows the difference between this document and a [fabricated] earlier version.

                            1. 3

                              Thanks—I’ll put in an in-line exemplar as well when next editing.

                            1. 1

                              Can we not start people off with inline javascript? CSP is a thing.

                              <!DOCTYPE html>
                              <html>
                                <head>
                                  <meta charset="utf-8" /> 
                                  <title></title>
                                  <script>
                                    function init() {
                                      var e = document.getElementById('foo');
                                      var txt;
                                      if (null !== e) {
                                        txt = document.createTextNode('hello, world');
                                        e.appendChild(txt);
                                      }
                                    }
                                    document.addEventListener('DOMContentLoaded', init);
                                  </script>
                                </head>
                                <body>
                                  <span id="foo"></span>
                                </body>
                              </html> 
                              
                              
                              1. 1

                                I like your example. I also see value in what you are saying. I’m actually pretty new to JS myself. Thanks for the tip!

                              1. 3

                                lowdown author here. I use lowdown on Mac OS (has a homebrew) and Linux, and of course on OpenBSD. I don’t know how to distribute on Windows. (The source uses a very straightforward Makefile and C without dependencies, if that helps?)

                                The next release (currently on the GitHub, but not stabilised for release) features almost-totally-completely re-written internals to support nicer PDF output.

                                1. 1

                                  Does lowdown support linking a CSS file when generating the standalone HTML? When I run it with the -s flag, it includes the title and author, but not the other MMD metadata.

                                  1. 2

                                    As noted here, you mean? I’ll include that in the next release. Thanks!

                                    1. 1

                                      Yes, thanks!

                                1. 2

                                  I feel much better reading this article as a joke.

                                  1. 1

                                    I didn’t know Dexter could index our databases. Do I e-mail him queries or just pester him at the next BSD conference…?

                                    1. 3

                                      Every markdown article deserves a link to Ingo’s comments on Markdown.

                                      1. 1

                                        This might deserve its own top level post :)

                                        1. 4
                                      1. 2

                                        A more interesting question is that of password reset. Does one store a random password that’s e-mailed (and/or texted, though this might incur additional cost to the operator) to the user? The details matter: can one reset a password when the temporary one has already been issued? For how long is the temporary one valid? Can the existing one still be used? If the existing one is used, is the temporary one wiped out? Should there be intermediary “security questions” before issuance of a temporary token (as per the OWASP recommendation)?
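
                                        One possible set of answers, sketched in Python purely to make the questions concrete. Every choice here is an assumption, not a recommendation from the article: a 15-minute lifetime, a new token replacing any pending one, only a hash being stored, and the token being wiped on any redemption attempt. The names `issue_token`, `redeem_token`, and the dict-based `store` are hypothetical.

```python
import hashlib
import secrets
import time

TOKEN_TTL = 15 * 60  # seconds; an arbitrary illustrative lifetime


def issue_token(store, user, now=None):
    """Create a one-time token, keep only its hash, and return the token to send."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode()).hexdigest()
    store[user] = (digest, now + TOKEN_TTL)  # replaces any earlier pending token
    return token


def redeem_token(store, user, token, now=None):
    """Return True for a valid, unexpired token; the stored entry is removed either way."""
    now = time.time() if now is None else now
    digest, expires = store.pop(user, (None, 0))
    if digest is None or now > expires:
        return False
    candidate = hashlib.sha256(token.encode()).hexdigest()
    return secrets.compare_digest(candidate, digest)
```

                                        Whether the existing password keeps working while a token is pending is left outside this sketch; it only covers the temporary token's own lifecycle.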

                                          1. 1

                                            Also an excellent article, you should submit it separately!

                                        1. 1

                                          It seems like lots of these hoops can be removed when OpenBSD implements the pledge directory whitelist feature. Can anyone elaborate as to why that feature was added to the function signature, but not implemented for such a long time?

                                          1. 7

                                            Changing the prototype for a function, especially a syscall, is involved and annoying. Adding the extra argument let us play with the implementation a bit without constantly changing code everywhere. The whitelist plan is taking longer than expected to come together, and it’s probably going to look a little different in final form.

                                            1. 1

                                              Thanks, I wondered if it was performance issues that hadn’t been resolved. It’s a feature I’m looking forward to.

                                              1. 4

                                                Making sure it can’t be abused as a performance bottleneck is also part of it.

                                                The feature came about in part because I wanted to sandbox Firefox. The objective is to prevent Firefox from reading my .ssh directory, or my maildir, and uploading it to some rogue site. But sometimes people like uploading files on purpose. So we’d whitelist the ~/uploads directory or something. But some open questions are things like do we kill the whole browser just because somebody clicked the wrong file? How do users know which files can be uploaded and which can’t? So that stalled a bit. At least for that design, we knew it would work for Firefox, but the problem was really at the other end. What does the feature look like at the frontend of Firefox?

                                                There weren’t a lot of use cases in base. For the most part, base utilities can do a pure privdrop approach. Some exceptions can use a privsep approach instead. It was hard to make a list of exactly which programs would benefit from any given design.

                                                1. 4

                                                  Are you trying to sandbox the content process, or the parent process as well? Our goal (which will be mostly realized for Firefox 56) is that the content process shouldn’t need access to any of the filesystem (modulo its own resources directory and a few other things), when it needs to do something like upload a file for an <input type="file"> that’s orchestrated by the parent process.

                                                  If you’re interested in chatting more about how OpenBSD’s pledge could fit into the existing sandbox architecture, feel free to reach out (work email in my profile).

                                                  1. 2

                                                    Well, I’m off team sandbox for now, but the general outline is to orchestrate everything in the parent. That generally works well with pledge, with fd passing.

                                                  2. 2

                                                    Could there be, say, a separate process for only the file selection window, which passes a file descriptor for the chosen file over a Unix socket to the main sandboxed process?

                                                    Or maybe the file opener actually reads the file and passes raw bytes to the main sandboxed process?
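
                                                    A minimal sketch of the first variant, in Python, assuming a Unix system and Python 3.9 or later (for `socket.send_fds` and `socket.recv_fds`). Both roles run in one process over a socket pair only to keep the example short; in the real design they would be separate processes, and the names `pick_and_pass` and `demo` are made up for illustration.

```python
import os
import socket
import tempfile


def pick_and_pass(path):
    """Open `path` in the picker role and hand the descriptor to the sandbox role."""
    picker_end, sandbox_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with picker_end, sandbox_end:
        # Picker side: the only code that calls open() on a user-chosen path.
        fd = os.open(path, os.O_RDONLY)
        try:
            socket.send_fds(picker_end, [b"file"], [fd])
        finally:
            os.close(fd)  # the receiver gets its own duplicate of the descriptor

        # Sandbox side: receives a descriptor, never a path.
        _msg, fds, _flags, _addr = socket.recv_fds(sandbox_end, 1024, 1)
        with os.fdopen(fds[0], "rb") as handle:
            return handle.read()


def demo():
    """Create a throwaway file and pass it through the socket pair."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "upload.txt")
        with open(path, "wb") as out:
            out.write(b"chosen by the user")
        return pick_and_pass(path)
```

                                                    The receiving side never needs filesystem access of its own, which is the property that matters for the sandbox.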

                                                    1. 3

                                                      You can do that, although in the case of a browser some extra precautions are needed as well. I want to stop the browser from uploading my files. But the browser also uploads files deliberately. So whether it’s in the render process or somewhere else, there exists code for the purpose of file uploading.

                                                      In theory, there could be a whitelist in the browser. Only allow files in these directories to be uploaded. But such whitelists, much like the same origin policy, are subject to bypass and confusion. And we could say that files can only be uploaded in response to user action. But there’s this whole javascript runtime dedicated to creating fake events that look a lot like user action.

                                                      The concern isn’t just that there’s a buffer overflow in some image library or whatever. The purposefully written and existing upload code might be activated against our wishes as well. This was in the time of the pdf.js vuln. Actually, the original pledge (nee tame) whitelist code was added two weeks after this. https://blog.mozilla.org/security/2015/08/06/firefox-exploit-found-in-the-wild/

                                                      1. 2

                                                        I was thinking more:

                                                        • the browser process doesn’t have the ability to open a file to upload it,
                                                        • but it does have the ability to pop an ‘upload a file?’ window
                                                        • and the ‘upload a file?’ window will open a file iff you, the user, pick one
                                                        • and the ‘upload a file?’ window can send those bytes to the browser

                                                        so now for a miscreant to upload your files requires suborning the ‘upload a file’ program via the browser, which could be more difficult because it could have a quite small attack surface exposed to the browser process

                                                      2. 1

                                                        this definitely seems like the “clean” solution, but wouldn’t that involve working on the GUI toolkits involved?

                                                      3. 1

                                                        I see, I wanted it to allow my programs to freely read and write a database, config files and logs in a specific directory but nothing else. Firefox is definitely an interesting use case.

                                                        1. 1

                                                          But some open questions are things like do we kill the whole browser just because somebody clicked the wrong file?

                                                          Seems like the right answer is namespaces. You can’t click on the wrong file if it isn’t listed.

                                                    2. 1

                                                      It wouldn’t help. As mentioned, SQLite makes no guarantee as to which files are going to be used between versions of SQLite. You can guess, but sometimes a file only pops up in response to a given workload, or a given database mode (which can change, like switching to WAL, during run-time). One possibility is to white-list all possible files. But of course, then the developer must tie an application to a particular version of SQLite. And of course, one can also direct SQLite to change the directory in which some of these files live.

                                                      …hence the hoops.
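
                                                      A small Python sketch of the effect, using the standard `sqlite3` module. It lists the directory before and after switching a database to WAL mode at run-time. The file name `app.db` and the function name are arbitrary, and the exact set of side files is up to SQLite, which is precisely the problem.

```python
import os
import sqlite3
import tempfile


def files_before_and_after_wal():
    """Return (files before WAL, reported journal mode, files after WAL)."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "app.db")
        # isolation_level=None keeps the connection in autocommit mode.
        conn = sqlite3.connect(path, isolation_level=None)
        try:
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1)")
            before = sorted(os.listdir(directory))

            # Switch journal modes while the application is running.
            mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
            conn.execute("INSERT INTO t VALUES (2)")
            after = sorted(os.listdir(directory))
        finally:
            conn.close()
        return before, mode, after


before, mode, after = files_before_and_after_wal()
```

                                                      Comparing `before` and `after` shows extra files next to the database once WAL is active, so a whitelist of individual file names fixed at start-up would already be out of date.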

                                                      1. 1

                                                        But you could whitelist a directory

                                                    1. 1

                                                      I’ve tried this kind of thing a few times and it’s never been as much fun as just using a framework. Not worth it.

                                                      1. 4

                                                        I’m not quite sure if this is meant as a joke or not.

                                                        On the one hand, I totally see a resurgence of “let’s just use 4 basic operating system functionalities and not invent every part of it” (which, by the way, I fully subscribe to; most web pages don’t need technical innovation and could live with a lot fewer moving parts).

                                                        On the other hand, there’s stuff like this:

                                                        Anybody can write crappy, bug-ridden and insecure code. In any language. Good news: with C, it’s even easier! So familiarise yourself with common (and common-sense) pitfalls noted in the SEI CERT C coding standard and comp.lang.c FAQ.

                                                        and the example feels a bit… well… overly simple. :D

                                                        http://www.learnbchs.org/easy.html

                                                        (I mean, in which sense is just writing raw HTTP to the output stream less magical than using a specific library for it?)

                                                        1. 5

                                                          As linked: tutorial 1, tutorial 2. Especially see the latter.

                                                          1. 3

                                                            Hi Kristaps! Good stuff.

                                                            Just a heads-up, https://kristaps.bsd.lv/ksql/ksql.3.html links to other ksql man pages on man.openbsd.org which don’t exist there.

                                                            If you are still looking for sqlite wrapper API ideas, SVN has an internal sqlite wrapper that I found fairly easy to get used to: https://svn.apache.org/viewvc/subversion/trunk/subversion/include/private/svn_sqlite.h (Though it may not be as minimalist as you would like :)

                                                            1. 2

                                                              (I’ll fix the manpage links—thanks!) I’d seen the SVN version. Any way you can break that out as a separate library (may I suggest “stspsql”?) so other folks can use it? Missed you all at bsdcan!

                                                        1. 14

                                                          My favourite part about BCHS is watching folks come out of the woodwork to argue about C.

                                                          1. 4

                                                            Plotting libraries are beasts—the APIs necessarily juggle between getting structured (or parameterised) data plotted and full-out drawing. Where users purport to need the first, but inevitably ask for the second. And that’s not even to mention the backend, which is where most of the weight comes from: PDF, PNG, JPG, etc. So a plotting library is usually a shim atop a drawing library, which is a shim atop various output format libraries (libpng, etc.). Each inheriting the common denominator of the layer below.

                                                            In the non-chroot world, I’ve used a lot of gnuplot and grap and plotutils; but when building dynamic graphs into a chrooted C system (not sure about C++), I ended up rolling my own, kplot, after being overwhelmed by plplot. I’d heard of Qt Charts but discounted it due to (1) being C++ and (2) the requirement of Qt, which was too much for my own small needs.

                                                            At the end of the day, I’m convinced that the most important thing about a plotting library is which default colours are used. Oh—and fonts. (On the topic of beasts…)

                                                            1. 2

                                                              Absolutely! Adding Qt to a small project is overkill just to have a few charts. On the other hand, if you are doing prototypes for the desktop and need simple and easy chart capabilities in your program, or if you are developing GUIs with Qt already and need some plotting done really quickly, then madplotlib can really be a game changer. kplot seems like a lot of fun; I will certainly take a closer look, thanks!