1. 29
  2. 14

    Raw roff markup is alien and mysterious to modern sensibilities, and a Markdown-family language based on the manpage document model rather than the HTML document model is a cool idea.

    Unfortunately, just as Markdown is geared for presentational HTML rather than semantic (there’s no easy way to add a class to a block of markup, for example), it seems scdoc is geared for presentational man(7) output rather than semantic mdoc(7). I can’t really fault the author; I’m not even sure what a semantic Markdown-alike could look like. But it wasn’t too hard for me to learn mdoc(7) myself, and I’ll probably stick with it.

    1. 3

      These are good points. I deliberately chose not to expose to the user any control over the low-level roff output, and also deliberately chose not to add semantic components because the output is only ever going to be man pages. I’m of the opinion that using several specialized tools (for each facet of your documentation, be it man pages or HTML pages or PDFs) is better than attempting to fit one octopus-shaped peg into several holes, and this principle feeds directly into scdoc’s design.

      That being said, it’s totally valid to hold other viewpoints. For some people mdoc may be better. For myself, it’s not.

    2. 5

      This is neat! I’ve been looking for a man page generator. I also use asciidoc and have had toolchain issues with it, particularly on macOS.

      I perused the source code a bit since I was curious about the combination of “no dependencies” and “UTF-8 support.” What, exactly, do you mean by UTF-8 support?

      I see you hand rolled your own UTF-8 handling, but what I don’t quite understand is why you did it in the first place. What I mean is that your parser only seems to care about ASCII, and you never actually take advantage of UTF-8 itself with one obvious exception: if the data you read isn’t valid UTF-8, then your parser sensibly gives up. Is there some other aspect of UTF-8 you’re using? Perhaps I’ve skimmed your code too quickly and missed it.

      I do see that you’re using the various POSIX char functions such as isdigit and isalnum, but those operate based on POSIX locale support, and aren’t, as far as I know, aware of the various Unicode definitions of those functions. Moreover, the documentation for those functions states that it is UB to pass a value that cannot be represented in an unsigned char.

      I’m not a C expert, so I could be missing something pretty basic!

      1. 3

        Hey, thanks for your feedback!

        I perused the source code a bit since I was curious about the combination of “no dependencies” and “UTF-8 support.” What, exactly, do you mean by UTF-8 support?

        Yeah, supporting it wasn’t too hard. I probably don’t have to explicitly handle it, but I prefer to enforce all input files must be UTF-8 and all output files must be UTF-8 rather than leave wiggle room. One intentional design decision of scdoc is that it is very strict - it will error out if you try to write #header instead of # header, for example. Enforcing UTF-8 is another form of strictness that ensures all scdoc files have a baseline of sanity.
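To make "the parser gives up on invalid UTF-8" concrete, here is a minimal sketch of a strict decoder in C. It is not taken from the scdoc source; the names `utf8_decode_strict` and `UTF8_INVALID` are assumptions made for illustration. It rejects stray continuation bytes, truncated sequences, overlong encodings, surrogates, and values above U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for malformed input (not a valid code point). */
#define UTF8_INVALID UINT32_MAX

/* Decode one code point from s[0..len). On success, store the number of
 * bytes consumed in *used and return the code point. On malformed input,
 * return UTF8_INVALID and leave *used untouched. */
static uint32_t utf8_decode_strict(const unsigned char *s, size_t len,
		size_t *used) {
	uint32_t cp, min;
	size_t n;
	if (len == 0) {
		return UTF8_INVALID;
	}
	if (s[0] < 0x80) {
		cp = s[0]; n = 1; min = 0;
	} else if ((s[0] & 0xE0) == 0xC0) {
		cp = s[0] & 0x1F; n = 2; min = 0x80;
	} else if ((s[0] & 0xF0) == 0xE0) {
		cp = s[0] & 0x0F; n = 3; min = 0x800;
	} else if ((s[0] & 0xF8) == 0xF0) {
		cp = s[0] & 0x07; n = 4; min = 0x10000;
	} else {
		return UTF8_INVALID; /* stray continuation or invalid lead byte */
	}
	if (len < n) {
		return UTF8_INVALID; /* sequence is cut off */
	}
	for (size_t i = 1; i < n; i++) {
		if ((s[i] & 0xC0) != 0x80) {
			return UTF8_INVALID; /* expected a continuation byte */
		}
		cp = (cp << 6) | (s[i] & 0x3F);
	}
	if (cp < min) {
		return UTF8_INVALID; /* overlong encoding */
	}
	if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
		return UTF8_INVALID; /* surrogate or out of Unicode range */
	}
	*used = n;
	return cp;
}
```

A caller would loop over the input, advance by `*used` after each success, and abort with an error on the first `UTF8_INVALID`, which is the "baseline of sanity" behaviour described above.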

        I do see that you’re using the various POSIX char functions such as isdigit and isalnum, but those operate based on POSIX locale support, and aren’t, as far as I know, aware of the various Unicode definitions of those functions. Moreover, the documentation for those functions states that it is UB to pass a value that cannot be represented in an unsigned char.

        I should probably enforce that characters I feed into this are <0x80. Good catch, filed a bug: https://todo.sr.ht/~sircmpwn/scdoc/13
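The `<0x80` guard mentioned here could look roughly like the sketch below. The helper names (`is_ascii`, `cp_isdigit`, `cp_isalnum`) are hypothetical and not from the scdoc source; the point is only that a decoded code point is range-checked before it ever reaches a `<ctype.h>` function, whose argument must be `EOF` or representable as `unsigned char`.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not taken from the scdoc source. */

/* True only for code points in the ASCII range. */
static bool is_ascii(uint32_t cp) {
	return cp < 0x80;
}

/* isdigit() is only defined for EOF and values representable as
 * unsigned char, so anything outside ASCII is rejected before the call. */
static bool cp_isdigit(uint32_t cp) {
	return is_ascii(cp) && isdigit((int)cp) != 0;
}

/* Same guard for isalnum(). */
static bool cp_isalnum(uint32_t cp) {
	return is_ascii(cp) && isalnum((int)cp) != 0;
}
```

With this in place, a non-ASCII digit such as U+0663 is simply reported as "not a digit" instead of invoking undefined behaviour. Treating such characters as non-digits is a design choice of the sketch, not something the thread specifies.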

        1. 2

          Ah, yeah, that makes sense. Starting with strict validation on UTF-8 is smart. :-)

      2. 5

        I’m going to be looking very hard at stealing this for Myrddin. I may also add an implementation that generates HTML so that I can have documentation for both the website and manpages generated from one source.

        Tosses another log on the too many things to do fire

        1. 5

          This is very interesting!

          I was looking for something like this for Jehanne (my dream is a tool that can be read in source form, with minimal syntax).

          Here is an example of what scdoc source looks like.

          1. 4

            I wish I could upvote more than once. Very neat tool.

            1. 6

              Why not just directly write man(7), which is all this tool produces? Or use the existing perlpod, pandoc, docbook, lowdown, rst2man, or any other tool doing exactly the same thing from diverse formats?

              Because I’m sure the world needs more opaque, un-indexable manpages.

              (Edit: to clarify, use mdoc(7).)

              1. 5

                Author here. Did you even read the blog post? I answered all of these questions.

                perlpod is built on a mountain of perl, and pandoc on a mountain of haskell. lowdown is a Markdown implementation, and Markdown and roff are mutually exclusive. RST and roff are mutually exclusive. I spoke about docbook directly in my article (via asciidoc, which is a docbook frontend). I also directly addressed mdoc.

                Man pages are already being indexed. If you search the web for “man [anything]” you’ll find numerous websites which scrape packages and convert the roff into HTML.

                1. 1

                  Thanks for your hack. It’s a good candidate for a port in my little os.

                  A couple of questions:

                  • have you considered dropping the bold markers around man page refs, since you already have the parentheses to identify the reference?
                  • also section titles have conventional names: what about omitting the starting sharp to mark them as titles?
                  • what about definition lists? (I know they are an HTML thing, but they can be useful to describe options for example)
                  • I know tables are the most difficult format to express in a readable source form, but what alternatives did you consider and why did you discard them?

                  And btw… Thanks again!

                  1. 2

                    Glad you like it!

                    have you considered dropping the bold markers around man page refs, since you already have the parentheses to identify the reference?

                    This is an interesting thought. https://todo.sr.ht/~sircmpwn/scdoc/12

                    also section titles have conventional names: what about omitting the starting sharp to mark them as titles?

                    I’m not fond of this idea. Given that lots of man pages will need to have section titles which fall outside of the conventional names, and that I want all headers to look the same, this isn’t the best design imo.

                    what about definition lists? (I know they are an HTML thing, but they can be useful to describe options for example)

                    man pages do “definition lists” with borderless tables, which are possible to write with scdoc like this

                    |[ *topic*
                    :[ definition
                    |  *topic*
                    :  definition
                    # etc
                    

                    I know tables are the most difficult format to express in a readable source form, but what alternatives did you consider and why did you discard them?

                    The main approach I’ve seen elsewhere is trying to use something resembling ascii art to make tables look like tables in the source document. I’ve never been fond of this because you then have to do annoying edits when updating the table to keep all of the artsy shit intact, which in addition to being just plain annoying can also bloat your diffs, lead to more frequent merge conflicts, etc.

                    An alternative some formats have used is to make aligning your columns optional, but still using an artsy-fartsy kind of style. I figure that if you’re going to make aligning the columns optional you no longer have any reason to require a verbose format like that. So I invented something more concise.

                    Also, the troff preprocessor used for tables supports column alignment specifiers and various border styles, which I wanted to expose to the user in a concise way. Other plaintext table formats often have this feature, but never concisely.

                    1. 1

                      man pages do “definition lists” with borderless tables

                      Do you think you could render something like this with scdoc in a source-readable way http://man7.org/linux/man-pages/man8/parted.8.html (see section OPTIONS and COMMAND)?

                      The main approach I’ve seen elsewhere is trying to use something resembling ascii art to make tables look like tables in the source document.

                      Actually, that was what I was thinking about. You raise a good point, but my counterargument is that manual pages are (hopefully) read more often than they are written. But I admit that my goal is people using cat to read manual pages by default, so I can see how, in a more conventional system using troff, people most often read a rendered page, and thus the annoyance is pointless. OTOH, it should be relatively easy to write a tool that takes an scdoc document as input and outputs another scdoc document where tables are automatically aligned, removing the annoyance of aligning the cells while writing.

                      Having said that, I find your table syntax nice.
                      I wonder if one could nest tables (I mean put a table in a cell). Also, you organize the table by rows, but given the format, some tables might benefit from being organized by column.

                      1. 2

                        Do you think you could render something like this with scdoc in a source-readable way http://man7.org/linux/man-pages/man8/parted.8.html (see section OPTIONS and COMMAND)?

                        You don’t actually even need tables for this. scdoc preserves your indent. https://sr.ht/I0g7.txt

                        I wonder if one could nest tables (I mean put a table in a cell). Also, you organize the table by rows, but given the format, some tables might benefit from being organized by column.

                        I think nested tables is a WONTFIX. Also not sold on column-oriented tables. IMO man pages should be careful to keep their tables fairly narrow to stay within 80 characters.

                        1. 1

                          Wow, that’s really readable!

                          Fine for nested tables. Just to be sure I explained what I meant by column-oriented (that just like nested tables might or might not be a good idea): suppose you want to create something like

                          English    Italian    Swahili
                          Hello!     Ciao!      Habari?
                          Tour       Viaggio    Safari
                          Lion       Leone      Simba
                          

                          You might prefer a syntax like

                          |[ English
                          :[ Hello!
                          :[ Tour
                          :[ Lion
                          |[ Italian
                          :[ Ciao!
                          :[ Viaggio
                          :[ Leone
                          |[ Swahili
                          :[ Habari?
                          :[ Safari
                          :[ Simba
                          

                          Or even, for such a simple table (that I don’t know if actually exists in a man page, so…), you could put each column (or row) in the same line:

                          |[ English :[ Hello! :[ Tour :[ Lion
                          |[ Italian :[ Ciao! :[ Viaggio :[ Leone
                          |[ Swahili :[ Habari? :[ Safari :[ Simba
                          

                          (that a tool could easily turn into:

                          |[ English :[ Hello!  :[ Tour    :[ Lion
                          |[ Italian :[ Ciao!   :[ Viaggio :[ Leone
                          |[ Swahili :[ Habari? :[ Safari  :[ Simba
                          

                          )
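The alignment step suggested above can be sketched in a few lines of C. Note the assumptions: the one-line `|[ a :[ b :[ c` row form is the proposal from this comment, not syntax that scdoc accepts, and `align_table`, `cell_len`, and `MAX_COLS` are hypothetical names. Widths are counted in bytes, so non-ASCII cells would need a proper display-width function.

```c
#include <stdio.h>
#include <string.h>

#define MAX_COLS 16 /* arbitrary limit for this sketch */

/* Return the length of the cell starting at p with trailing spaces
 * trimmed. *next is set to the start of the following cell, or NULL
 * if this was the last cell on the row. */
static size_t cell_len(const char *p, const char **next) {
	const char *sep = strstr(p, " :[ ");
	const char *end = sep ? sep : p + strlen(p);
	*next = sep ? sep + 4 : NULL;
	while (end > p && end[-1] == ' ') {
		end--;
	}
	return (size_t)(end - p);
}

/* Write an aligned copy of rows[0..nrows) into out. Returns 0 on
 * success, -1 on malformed rows, too many columns, or a full buffer. */
static int align_table(const char *const *rows, size_t nrows,
		char *out, size_t outsz) {
	size_t width[MAX_COLS] = {0};
	size_t off = 0;
	if (outsz == 0) {
		return -1;
	}
	out[0] = '\0';
	/* Pass 1: find the widest cell in every column. */
	for (size_t r = 0; r < nrows; r++) {
		if (strncmp(rows[r], "|[ ", 3) != 0) {
			return -1;
		}
		const char *p = rows[r] + 3;
		for (size_t c = 0; p != NULL; c++) {
			const char *next;
			if (c == MAX_COLS) {
				return -1;
			}
			size_t n = cell_len(p, &next);
			if (n > width[c]) {
				width[c] = n;
			}
			p = next;
		}
	}
	/* Pass 2: re-emit each row, padding every cell except the last. */
	for (size_t r = 0; r < nrows; r++) {
		const char *p = rows[r] + 3;
		for (size_t c = 0; p != NULL; c++) {
			const char *next;
			size_t n = cell_len(p, &next);
			const char *lead = c == 0 ? "|[ " : "";
			int w = next != NULL
				? snprintf(out + off, outsz - off, "%s%-*.*s :[ ",
					lead, (int)width[c], (int)n, p)
				: snprintf(out + off, outsz - off, "%s%.*s\n",
					lead, (int)n, p);
			if (w < 0 || (size_t)w >= outsz - off) {
				return -1;
			}
			off += (size_t)w;
			p = next;
		}
	}
	return 0;
}
```

Feeding it the three unaligned rows from the example should reproduce the aligned version shown just above. Because trailing spaces are trimmed before measuring, running it on already aligned input leaves the table unchanged.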

                          Ok… now I’ve really annoyed you enough for a single night… good work!

                2. 5

                  Because you cannot have progress without research.

                  Now troff is not readable in source form.
                  This is better in this regard. You are right about indexing, but the project has a very short log. I guess we can talk about it with the author, and see what he thinks about that.

                  Maybe he likes the idea and adds it. Or he doesn’t, and will not add it.
                  You will always be able to fork it and fine-tune it to your needs.

                  I’m grateful to hackers who challenge the status quo.

                  1. 4

                    While mdoc(7) is great (thanks for that!), I think your questions are answered on the page. I think lowdown is probably the closest to what u/SirCmpwn was aiming for (no dependencies, man output), maybe they hadn’t seen it?

                    Man formatting is inscrutable to the un-trained eye (most people), and we need to acknowledge the popularity of markdown is related to its ease of reading/writing.

                    1. 4

                      I think your questions are answered on the page. I think lowdown is probably the closest to what u/SirCmpwn was aiming for (no dependencies, man output), maybe they hadn’t seen it?

                      groff (as installed on every Linux distribution that uses groff for man pages, which is basically all of them, and macOS) has had native support for mdoc for at least a decade. If you install an mdoc man page and then man $thepage, you get exactly what you expect.

                  2. 2

                    mdocml is small and has minimal dependencies, but it has runtime dependencies - you need it installed to read the man pages it generates. This is Bad.

                    mdoc is part of the system. I guess not on Linux??

                    1. 3

                      mdoc is part of the system on Linux too.

                      1. 3

                        Depends on the Linux.

                        1. 1

                          Do you have any particular distribution in mind where it isn’t?

                      2. 1

                        Guess what? There is life outside Unix! :-D