1. 22
  1.  

  2. 10

    I thought this was going to be along the lines of “What color is your function” but it’s really its own deal. I find the colors a little hard to keep straight. In the end, the author concludes that there’s a spectrum, which is part of why it’s hard to keep straight: the different kinds of markup all bleed into each other. Still, this is a really interesting line of thought, and I’m interested in reading what’s next in the series.

    One thought is that things like JSX show that there’s interest in describing applications as XML-like trees of markup.

    Another thought is that when you do the CSS for a webpage, there are basically two different categories of CSS: the content well vs everything else. For most of the page, you try to use semantic tags and style things with classes, so a call to action might have an H2 that gets styled as small text or whatever. But the content well is totally different, and now you can’t use classes and you’re stuck with a handful of basic HTML rich text tags that you’re styling. It’s a pretty severe shift, and I think it definitely relates to switching from one “color” of markup to another at the boundary.

    1. 5

      Thanks for the kind words :)

      I agree the colors are slippery. I had a hard time making a fairly abstract discussion a little more visual/tangible. This was like draft 5 of this color idea, and this color idea was itself part of the third broad framework for approaching the whole discussion. I had to make peace with shipping it, but it’d be satisfying to see someone synthesize it better.

      When I outlined the posts in this framework back in Spring, I thought it would be the 5th post (covering just 3 colors–and structurally hewing much closer to “What color is your function”) in a series that started with https://lobste.rs/s/rgqg1v/what_functions_why_functions. Alas, writing the series and the first few drafts of this post made me want to call out and discuss more than just the first 3 types (and restructure the post because I think I stretched the comparison too thin).

      The main reason I’ve held on to the color conceit despite this trouble is that I can imagine highlighting tools that help humans ~see these distinctions in their markup, which feels like a good first step to drilling down on problems with reusing something like a documentation corpus.

      I agree that describing user interfaces with XML-like markup has been fruitful despite the separation-of-concerns problems created by the specifics of how we’ve stumbled into it (at least in the web stack).

      1. 4

        … in the web stack.

        I, too, thought the article was interesting.

        I do hope you’ll also discuss other ecosystems; e.g. reStructuredText can be used as a “cooler” (and extensible) Markdown, and LaTeX has a separate document class for major presentation… plus a mix of pure-layout and nearly-pure-structure commands (e.g. \textbf vs. \label). Something like Photoshop is quite far towards a “warm” extreme (but keeping e.g. layers separate means that it’s not as “warm” as e.g. Paint). Etc.

        1. 4

          I dithered a bit about how many examples I needed to annotate to illustrate the concept, but the next few on my list are reST, mdoc, and LaTeX. (Though my LaTeX knowledge is mainly looking in from the outside; would you be open to a DM for thoughts or feedback at some point?)

          Interesting point about Photoshop. I’ve mulled both imperative local style and labeled style approaches in Word and InDesign, but I guess my text/documentation-centric focus has shaped how I have been extending these ideas out into multiple media without really reflecting on more traditional graphic design/illustration/etc. (even though they’re obviously germane).

          1. 6

            LaTeX is a bit weird because it’s not really a distinct thing. TeX is a Turing-complete imperative language that is designed for producing typeset output. LaTeX is an attempt to build an eDSL on top of TeX that gives semantic markup. There are several problems derived from the fact that TeX is a programming language, not a markup language. They also have exciting scoping rules, where some things affect the current ambient state, rather than describing their arguments. You can write something in italics as either \textit{this is italic} or {\itshape this is italic}. The definitions of the markup language are inline and can be interleaved with the programming language and so you get various things that you might call red markup in your taxonomy but are actually infrared (or something): they’re not markup describing how something appears, they’re commands describing how to make something appear a specific way. These are interleaved with the clean semantic markup.

            This means that it’s impossible to parse LaTeX as a markup language, in the general case. For example, if you see \foo{bar}, this might be:

            • Semantic markup saying that bar has some property.
            • Presentation markup saying that bar should be typeset in some way.
            • A marker that you should start a bar-typed region that requires some explicit markup to terminate it.
            • A macro that redefines another macro that is used later on.
            • Something that invokes some arbitrary code.

            It’s possible to be disciplined and separate your yellow and blue markup from other things in LaTeX. I’ve done this in a couple of books, where the only LaTeX macros that I use in the .tex files for each chapter describe semantics (e.g. this is a heading, this is a keyword, this is an abbreviation and its expansion, this is a code listing pulling in lines n to m from file x). Doing this let me also parse my markup with something that generated semantic XHTML markup for ePub versions. That isn’t really writing LaTeX though, it’s writing a custom markup language in TeX that happens to include a subset of LaTeX.
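
            A minimal sketch of that kind of disciplined subset, not the actual tooling used for those books: it assumes a hypothetical whitelist of semantic macros (\heading, \keyword, \abbr are made-up names) whose arguments contain no nested braces, and maps each one to XHTML. Anything outside the whitelist is rejected rather than guessed at, which is what makes the subset parseable as markup at all.

```python
import html
import re

# Hypothetical semantic macros (assumed names, not from any real document class).
# Each entry is (number of arguments, XHTML template).
SEMANTIC_MACROS = {
    "heading": (1, "<h2>{0}</h2>"),
    "keyword": (1, '<span class="keyword">{0}</span>'),
    "abbr": (2, '<abbr title="{1}">{0}</abbr>'),
}

# Matches \name{arg} or \name{arg1}{arg2}; arguments may not contain braces.
MACRO_RE = re.compile(r"\\([A-Za-z]+)\{([^{}]*)\}(?:\{([^{}]*)\})?")


def to_xhtml(source: str) -> str:
    """Convert text that uses only the whitelisted macros into XHTML."""
    out = []
    pos = 0
    for match in MACRO_RE.finditer(source):
        # Plain text between macros is escaped and copied through.
        out.append(html.escape(source[pos:match.start()]))
        name = match.group(1)
        if name not in SEMANTIC_MACROS:
            raise ValueError("non-semantic macro: \\" + name)
        arity, template = SEMANTIC_MACROS[name]
        args = [html.escape(a) for a in match.groups()[1:] if a is not None]
        if len(args) != arity:
            raise ValueError("wrong argument count for \\" + name)
        out.append(template.format(*args))
        pos = match.end()
    out.append(html.escape(source[pos:]))
    return "".join(out)
```

            Presentational or stateful commands such as \textit or \itshape are deliberately absent from the table, so a chapter that uses them fails loudly instead of producing half-converted output.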

            1. 2

              Great context. This sounds hard-won :)

              There’s a lot of “you can do this with existing languages if you’re willing to make up a custom language, use just a subset of an existing one, or both” in this space.

              The definitions of the markup language are inline and can be interleaved with the programming language and so you get various things that you might call red markup in your taxonomy but are actually infrared (or something): they’re not markup describing how something appears, they’re commands describing how to make something appear a specific way. These are interleaved with the clean semantic markup.

              I guess this corresponds with the “procedural” type in the Coombs et al taxonomy. I’ll have to chew on whether to fold it in or call it out-of-scope.

              1. 3

                I wish SILE had more traction. It implements a load of the algorithms from the TeX papers for typesetting (including the ones that the authors of TeX wanted to implement but which required more than 1 MiB of RAM and so were totally infeasible back then). It has a much cleaner separation here. It consumes some markup (this is pluggable and it has a couple of options out of the box), which is a pure markup language. It then uses Lua for defining how these are mapped to typesetting concepts. Both the input and output are better than TeX, but LaTeX has had decades of people building useful features on top. TikZ is my favourite example here: it’s a graphics package built entirely in TeX, which is a completely ludicrous substrate for it, but it’s so useful. Porting it over to SILE would be almost as much work as writing SILE was in the first place (probably less work than implementing it in TeX in the first place, but that’s sunk cost).

            2. 3

              would you be open to a DM for thoughts or feedback at some point?

              I’m afraid that wouldn’t be a very good idea, since I’m rather intermittently available for the next couple of weeks (months?). I’m sorry - I do very much want to read what you have to say, but I can’t realistically commit to anything. Sorry!

              1. 3

                No worries–exactly the kind of reason I asked ahead for :)

      2. 4

        The colour analogy really worked for me here.

        Red, yellow, blue, primary colours, that’s cute.

        Purple? Oh, because it’s between red and blue. Makes sense.

        Magenta? Uh, like purple but different? This could get tricky.

        Green between yellow and blue? Uh, OK. Chartreuse? Probably too cute. Orange? Ugh, my head’s beginning to hurt.

        Oh, a colour wheel? Oh. Oh! I get it!

        Well done!

        1. 3

          I like this, especially banishing the word ‘semantic.’ My mental framework is a little different. I treat the different colors as different language games we play with a text (see Wittgenstein). From this point of view, semantics refers to the cues for actions in the language game. Syntax refers to how those cues are encoded. But there isn’t just one language game.

          In a similar vein, I think that this is why REST is pretty much describing systems for humans to interact with. How to make a computer figure out and play language games is pretty much unexplored at this point. It would be really interesting to have tools that let a goal-directed program attempt to follow rules in novel language games.

          1. 2

            I love this! One of the best blog posts I’ve read this year. I’ve often had half-baked thoughts along the same lines, but didn’t have the words to express them. From the title I guessed it was going to be about block vs. inline, but this is much more interesting. In several projects I’ve tried to build up an optimal markup language, usually by extending Markdown with pandoc filters, and wanted things to be “semantic” while at the same time feeling there is more to it than a binary semantic/not semantic. Looking forward to the next post!

            1. 2

              I’ve had the sense that I have something exciting by the tail throughout this project, but it’s proven really hard to coax out of my intuition and into language. I felt very dumb for quite a few weeks while I was struggling with this post, so this is a relief. Thank you.

            2. 2

              I’m not sure I agree with your colors for <pre> and <code>. I would like to use <code> for presenting source code, but it’s defined as an in-line level tag, and not a block level tag. So if I were to use <code> to present code, I would have to wrap each line, and ensure each line was forced to the next line (possibly with a <br>?) Because of those, I use <pre>, defined as a block level tag, to present source code (without an excessive amount of markup).

              I might also be abusing tags. For instance, when I quote email, the email headers are presented in a <dl>, with the header name as a <dt> and the value as <dd>. Might not be semantically correct, but it’s either that, or a bunch of <div>s and <span>s to apply formatting to. I do use <span> to denote foreign words, because neither <i> nor <em> are quite right for the task.
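
              A rough sketch of that header-to-<dl> mapping, under the assumption that the quoted email is available as raw RFC 822 style text; the function name headers_to_dl is made up for illustration and is not the commenter's actual code.

```python
import html
from email import message_from_string


def headers_to_dl(raw_message: str) -> str:
    """Render the headers of a raw email as an HTML description list."""
    msg = message_from_string(raw_message)
    rows = []
    for name, value in msg.items():
        # Header name becomes the term, header value the description.
        rows.append(f"  <dt>{html.escape(name)}</dt>")
        rows.append(f"  <dd>{html.escape(value)}</dd>")
    return "<dl>\n" + "\n".join(rows) + "\n</dl>"
```

              Each name/value pair lands in one <dt>/<dd> pair, so no extra <div> or <span> wrappers are needed just to hang formatting on.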

              As an aside, I would like to read your blog, but your site navigation is, to put it lightly, difficult to grok. Other than manually pulling down the RSS feed, is there any way to navigate your blog on the site itself?

              1. 3

                Because of those, I use <pre>, defined as a block level tag, to present source code

                For code blocks, you should use <code> inside <pre>.
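
                One way to read that advice, sketched here with a made-up helper name (code_block): a single <code> element wraps the whole listing inside <pre>, so line breaks are preserved by <pre> and only the characters that HTML treats specially need escaping.

```python
import html


def code_block(source: str) -> str:
    """Wrap a source listing in <pre><code>, escaping HTML-special characters."""
    # quote=False leaves quotation marks alone; they are safe in element content.
    return "<pre><code>" + html.escape(source, quote=False) + "</code></pre>"
```

                For example, code_block("if a < b:\n    print(a)") keeps the newline and indentation intact and turns the < into &lt;.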

                I do use <span> to denote foreign words, because neither <i> nor <em> are quite right for the task.

                <i> is the correct element for foreign technical terms, if that’s what you mean. https://developer.mozilla.org/en-US/docs/Web/HTML/Element/i

                1. 1

                  I’m not sure I agree with your colors for <pre> and <code>. I would like to use <code> for presenting source code, but it’s defined as an in-line level tag, and not a block level tag. So if I were to use <code> to present code, I would have to wrap each line, and ensure each line was forced to the next line (possibly with a <br>?) Because of those, I use <pre>, defined as a block level tag, to present source code (without an excessive amount of markup).

                  It’s roughly right, as I use it in the code example, but I realize the example doesn’t disambiguate block/inline elements and I do muddy the water when I say that blue markup “might indicate that a block of content is code.” I’m speaking generally of what a blue/ontological element can be, and you’re absolutely right that it doesn’t square with the weird world HTML forces on us when it expects something like:

                  <pre>
                      <code>line 1</code>
                      <code>line 2</code>
                      ...
                  </pre>
                  

                  I might also be abusing tags… Might not be semantically correct, but it’s either that, or …

                  I’m calling this abuse in the sense that we’re using tags for their presentational effects rather than their intended purpose, but I would stop shy of saying it’s wrong of us to abuse them. For example, the details element literally does something that you can’t, AFAIK, use straight CSS to get your browser to do for you. It’s inviting abuse.

                  I lay it out more explicitly in a separate post linked from this one, semantic: the 8-letter s-word, but “semantic HTML” is all about begging markup authors to use tags/labels in a way that is consistent with what user-agents want. So, it’s abuse from their perspective–but I feel like it’s a pretty obvious/natural outcome of a system in which they’ve tried to offload a lot of cognitive dissonance on markup authors by expecting us to use markup that more or less by definition can’t truly align with the semantics of our content as we see them.

                  From a linguistics perspective, I guess they’re prescriptivists and I see myself as on the side of descriptivists. So the goal for now is just making it more evident how the toolchain’s expectations are molding what we can express.

                  As an aside, I would like to read your blog, but your site navigation is, to put it lightly, difficult to grok. Other than manually pulling down the RSS feed, is there any way to navigate your blog on the site itself?

                  Fair :)

                  It’s interactive, like a MUD, but it’s less obvious on blog post pages because the input cursor isn’t active. You can type help or look (or just l) at the bottom of the post page to see other posts on the same ~topical sub-blog, and type hub (or visit https://t-ravis.com/room/post/) to see all posts.

                2. 2

                  What markup? There’s not even a <noscript> if you disable JavaScript. Then it criticizes Web Components for the JS requirement, but at least that’s building an interactive component. As a cherry on top, the main content is in <div id="terminal" class="terminal terminal-modern">, which is ironic because you get no content when accessing it via a TUI browser. This is a blog–its purpose is to provide static information and it shouldn’t require code execution.

                  1. 3

                    Since you so kindly reminded me what my interactive personal site is and isn’t, I’ve added a barebones noscript on the post pages.

                    1. 2

                      I disabled JavaScript on the page just to see what would happen, and I get a blank page. Nothing shows up unless I have JavaScript enabled.

                      1. 1

                        Cached copy, I think. (I’ve tested it in Safari, Chrome, and Lynx.)

                      2. 1

                        Ideally it shouldn’t need JS, but that is a good start to at least give a user some context. The audience for a blog like this involves technical people, and a lot of those folks take their browser security seriously, disabling JS by default, or they’re reading from a TUI RSS reader or doing some sort of scraping. You want to retain these users.

                        1. 1

                          Ideally you shouldn’t need to share an uncharitable dismissal of the post (on a forum that discourages these) just because you don’t like the site. All you had to do is ask for noscript support.

                          1. 1

                            It’s not dismissal if it’s advice on accessibility and technical user retention. But you said you took a step in the right direction, which is appreciated.