1. 4

    This is progressing towards the rediscovery of regular expressions.

    1. 4

      Indeed. If anyone finds this article interesting and doesn’t know regular expressions (“regexes”) yet, I recommend reading regular-expressions.info/tutorial.html. When I learned regexes from that site I found it to be well-written. The site plugs the author’s own regex-testing tool in between explanations, but you can just use regex101.com, which is free and equally powerful.

      Here’s an example of using a regex in Python to extract text within square brackets:

      import re
      string = "123[4567]890"
      re.search(r'\[(.*)\]', string).group(1)  # evaluates to '4567'
      
      # You could also write the regex with whitespace for readability:
      # re.search(r'\[ (.*) \]', string, re.X).group(1)
      

      Regexes have some advantages over the extract DSL defined in the article. They support powerful features such as extracting multiple parts of the text with one search. They are supported by all major programming languages. Most text editors let you use them to search your text. They are also very concise to type. However, they have flaws of their own, particularly how hard they can be to read. So though regexes are useful to learn, they are not the ultimate way of extracting parts of text.
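
      As an example of extracting multiple parts with one search, here is a quick sketch building on the snippet above (the second bracketed chunk is made up for illustration):

      import re
      string = "123[4567]890[abc]"
      m = re.search(r'\[(\d+)\].*\[(\w+)\]', string)
      m.group(1)  # evaluates to '4567'
      m.group(2)  # evaluates to 'abc'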

      Here are some projects that aim to improve on regexes (but are much less widespread):

      • Regexes in the Raku language. Raku, formerly known as Perl 6, breaks compatibility with Perl 5’s regex syntax (the same syntax used by most regex engines) in an attempt to make regexes more readable.
      • Egg expressions, or eggexes, are a syntactic wrapper for regexes that will be built into the Oil shell.
      1. 2

        I’d prefer r'\[(.*?)\]' or r'\[([^]]*)\]' to avoid multiple square brackets in the string matching more than expected. Also, in newer versions of Python, you can use [1] instead of .group(1)
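
        A quick illustration of the difference, using the subscript form (the input string here is made up):

        import re
        s = "a[1]b[2]c"
        re.search(r'\[(.*)\]', s)[1]     # greedy: evaluates to '1]b[2'
        re.search(r'\[(.*?)\]', s)[1]    # lazy: evaluates to '1'
        re.search(r'\[([^]]*)\]', s)[1]  # negated class: evaluates to '1'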

        https://www.rexegg.com/ is another great site for learning regexp. And https://github.com/aloisdg/awesome-regex is a good collection of resources for tools, libraries, regexp collections, etc.

        1. 2

          And Parse in Red is also a nice alternative to regexes.

        2. 3

          Perhaps we can coin a new aphorism! Greenspun’s Tenth Zawinski’s Law: Any sufficiently complicated Lisp text processing program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of regular expressions.

          Edit: Or perhaps ‘Every Lisp program attempts to expand until it contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of regular expressions. Programs which cannot so expand are replaced by those which can.’

          1. 2

            Friendly advice: lead your texts with a concise sample render.

            1. 3

              Gotta admit, Dumbdown is kinda clean

              1. 21

                The article never mentions what is, in my humble opinion, the most important part of good logging practice: structured logging. Without it you end up with weird regexes or other hacks trying to parse your log messages.

                1. 4

                  As a sibling post notes, if you use structured logging you’re mostly throwing away the idea that the entries must be easily parsable by a human. If that’s the case, and we’ll need a custom method of displaying the structured logs in a human-friendly way, I believe we should forgo plain text altogether and gain the benefits of logging directly to binary.

                  1. 5

                    You can do human readable structured logging if you use key="value" formats inside text messages. Some people still prefer json, but there is a middle ground.
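
                    A minimal sketch of that middle ground in Python, using the standard logging module (the logger, event and field names are made up):

                    import logging

                    logging.basicConfig(format="%(asctime)s %(levelname)s %(message)s", level=logging.INFO)
                    log = logging.getLogger("payments")

                    def kv(**fields):
                        # render context as key="value" pairs appended to the human-readable message
                        return " ".join(f'{k}="{v}"' for k, v in fields.items())

                    log.info("charge failed %s", kv(user_id=42, amount="9.99", currency="EUR"))
                    # 2021-01-22 10:15:03,123 INFO charge failed user_id="42" amount="9.99" currency="EUR"

                    The line stays readable when tailing, and the key="value" pairs can still be pulled back out mechanically when needed.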

                    1. 2

                      If you need just key=value, that’s not really structured in my opinion.

                      1. 4

                        Why not?

                        1. 2

                          Because the amount of information added by this format would be infinitesimal over a line-based logger with manual tokenization. The reason why you’d want a structured logger is to allow proper context to a message. Unless you’re working with simple cases, the structure that would offer such context is more than one level deep.

                          1. 3

                            Hmm, definitely not.

                            Structured logging is about decorating log events with just enough of a schema to make them machine-parseable, so that searching, aggregating, filtering, etc. can be more than a crapshoot. Deeply nested events significantly increase the complexity of that schema, and therefore the requirements of the consumer.

                            By default, structured logs should be flat key/value pairs. It gets you the benefits of richer parseability, without giving up the ability to grep.

                  2. 2

                    Excellent point. That’s become such second nature to me by now, that I forgot to even mention it!

                    1. 1

                      I’m surprised it wasn’t mentioned, but the larger advantage of passing a logger around to constructors is the ability to then have nested named loggers, such as

                      Battery.ChargingStatus.FileReader: Failed to open file { file: "/tmp/battery charge", error: ... }
                      Battery.ChargingStatus: Failed to access status logs, skipping report
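
                      In Python’s standard logging module, for example, dotted logger names already give you this hierarchy: children inherit the parent’s level and handlers, so one call can quiet or un-quiet a whole subtree (a small sketch reusing the names above):

                      import logging

                      logging.basicConfig(format="%(name)s: %(message)s")
                      logging.getLogger("Battery").setLevel(logging.ERROR)  # applies to the whole subtree

                      reader_log = logging.getLogger("Battery.ChargingStatus.FileReader")
                      reader_log.error("Failed to open file %r", "/tmp/battery charge")
                      # Battery.ChargingStatus.FileReader: Failed to open file '/tmp/battery charge'
                      reader_log.info("suppressed by the parent's ERROR level")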
                      
                      1. 1

                        On top of that, a structured logger, if implemented properly, can often be faster and can operate at more granular levels (as the other comments pointed out, sometimes you do want to turn on some logs on the fly at some locations, not all logs at all locations).

                        1. 1

                          I love structured logging, with one caveat: the raw messages emitted (let’s assume JSON) are harder for me to scan when tailing directly (which I usually only do locally, as we have better log-querying tools in the cloud), in contrast to a semi-structured, simple key-value format. Do you all use a different format than JSON? Or a tool that transforms structured logs into something more human-friendly, e.g. with different log levels displayed in appropriate colors and JSON syntax characters de-emphasized, for local tailing?
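
                          For what it’s worth, the local-tailing problem is small enough to solve with a few lines of Python; here is a sketch of a pretty-printer reading JSON lines on stdin (the level/msg field names and the colours are assumptions, not any particular logger’s schema):

                          import json, sys

                          COLORS = {"error": "\033[31m", "warn": "\033[33m", "info": "\033[32m"}
                          RESET = "\033[0m"

                          for raw in sys.stdin:
                              try:
                                  event = json.loads(raw)
                              except ValueError:
                                  print(raw, end="")  # pass non-JSON lines through untouched
                                  continue
                              level = str(event.pop("level", "info")).lower()
                              msg = event.pop("msg", "")
                              rest = " ".join(f"{k}={v}" for k, v in event.items())
                              print(f"{COLORS.get(level, '')}{level:<5}{RESET} {msg} {rest}")

                          Something like tail -f app.log | python prettylog.py (hypothetical file names) then keeps raw JSON on disk while giving colored, level-highlighted output in the terminal.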

                          1. 5

                            At Joyent, we used the Bunyan format. Each line in the file was a separate JSON object with standard properties, some mandatory and some optional, and freeform additional properties. We shipped a tool, bunyan, that was capable of acting as a filter that would render different human readable views of the JSON. For example, you would often run something like:

                            tail -F $(svcs -L manatee) | bunyan -o short
                            

                            It also had some rudimentary filtering options, as well as a relatively novel mode that would, instead of reading from a file or standard input, use DTrace probes for the different log levels to let you dynamically listen for DEBUG and TRACE events even when those were not ordinarily present in the log files. The DTrace mode could target a particular process, or even all processes on the system that emitted Bunyan logs.

                            1. 1

                              Hi, what were the required fields? Was it just a unique request ID? Thanks for sharing about bunyan. Even though it’s been out for a while I was unaware of it.

                            2. 5

                              Do you all use a different format than JSON? Or a tool that transforms structured logs into something more human-friendly, e.g. with different log levels displayed in appropriate colors and JSON syntax characters de-emphasized, for local tailing?

                              We use JSON and the only tools I use are grep and jq. And although I am pretty much still a novice with these two, I found that with the power of shell piping I can do almost anything I want. Sometimes I reach for the Kibana web interface, get seriously confused and then go back to the command line to figure out how to do it there.

                              I wrote a simple tutorial for the process just a couple of weeks ago.

                              1. 1

                                Agreed. jq is a really nice tool. It made the decision to transition to using JSON for logging very easy.

                                1. 1

                                  If you rely on external tools to be able to make sense of your logs, why not go all the way, gain the speed and size benefits that binary logs would bring, and write your own log pager? I feel like the systemd folks had the right idea even when everyone was making fun of them.

                                  1. 3

                                    I don’t think the average employer would be happy subsidizing an employee writing a log pager instead of implementing something that would bring a tangible result to the business. The potential money saved by using binary logs probably doesn’t outweigh the new subscriptions/increased profits from churning out more features.

                                    1. 1

                                      To me that sounds like an excuse. The world is not made up of only software that is beholden to the almighty shareholder.

                                      1. 1

                                        I mean, yes, if you’re developing something in your personal time, go bananas on what you implement.

                                        But I also know my manager would look at me funny and ask why I’m not just shoving everything into CloudWatch/<cloud logging service>

                                    2. 2

                                      I’m sure most problems with systemd journals are fixable, but they’ve left a very bad taste in my mouth for two main reasons: if stuff gets deleted from under them they apparently never recover (my services continue to say something like “journal was rotated” until I restart them), and inspecting journals is incredibly slow. I’m talking orders of magnitude slower than plain log files. This is at its worst (I often have time to make a cup of tea) when piping the output into grep or, as journalctl already does by default, less, which means every byte has to be formatted by journalctl and copied only to be skipped over by its recipient. But it’s still pretty bad (I have time to complain on IRC about the wait) when giving journalctl filters that reduce the final output down to a few thousand lines, which makes me suspect that there are other, less fundamental issues.

                                      I should note that I’m using spinning disks and the logs I’m talking about are tens to hundreds of GB over a few months. I feel like that situation’s not abnormal.

                                      1. 1

                                        If you rely on external tools to be able to make sense of your logs, why not go all the way, gain the speed and size benefits that binary logs would bring, and write your own log pager?

                                        It’s hard to imagine a case at work where I could justify writing my own log pager.
                                        Here are some of the reasons I would avoid doing so:

                                        • Logs are an incidental detail to the application.
                                        • Logs are well understood; I can apply a logging library without issues.
                                        • My application isn’t a beautiful and unique snowflake. I should use the same logging mechanisms and libraries as our other applications unless I can justify doing something different.
                                        • JSON is boring, has a specification, substantial library support, tooling, etc.
                                        • Specifying, documenting, and testing a custom format is a lot of work.
                                        • Engineering time is limited; I try to focus my efforts on tasks that only I can complete.
                                        1. 2

                                          Logs are an incidental detail to the application.

                                          I think this is trivially disproved by observing that if the logs stop working for your service, that is (hopefully!) a page-able event.

                                          Logs are a cross-cutting concern, but as essential as any other piece of operational telemetry.

                                          1. 1

                                            Logs are a cross-cutting concern, but as essential as any other piece of operational telemetry.

                                            I rely heavily on logging for the services I support but the applications I wrote for work have only error reporting. They are used by a small audience and problems are rare; I might get a crash report every 18 months or so.

                                            1. 1

                                              Ah, yeah, I presume the context here is services.

                                    3. 3

                                      Don’t use JSON, use logfmt.

                                      1. 1

                                        Yes! Logfmt is the good stuff. But it’s only semi-structured. Why not use JSON and a tool to transform to logfmt (with nested data elided probably) when needing to scan as a human?

                                        1. 1

                                          Logfmt is fully structured; it just doesn’t support nesting, and that’s an important feature! Structured logs should be flat.
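
                                          And flat key=value stays machine-parseable with almost no machinery; a Python sketch (quoted values with embedded escapes would need more care):

                                          import re

                                          line = 'level=info msg="user logged in" user_id=42'
                                          pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', line)
                                          fields = {k: v.strip('"') for k, v in pairs}
                                          # fields == {'level': 'info', 'msg': 'user logged in', 'user_id': '42'}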

                                  1. 19

                                    I work on porting software to IBM i; that’s a platform you might know better as OS/400. It’s popular with businesses, usually in the retail/financial/logistics/etc spaces. Those kinds of shops are usually pretty isolated from trends in the tech industry, but they’re everywhere. Chances are you’ve worked with or seen them and never really thought of it.

                                    Most of the software I target runs in the AIX compat layer; AIX itself is technically POSIX compliant, but it really stretches the boundaries of compliance. All that AIX stuff is PowerPC; the CPUs are actually relevant/competitive. Actual native software is even weirder and is basically EBCDIC WebAssembly, to tl;dr it.

                                    1. 2

                                      AS/400 is a neat system. Very high uptimes.

                                        Also, I was always impressed with AIX on RS/6000 or HP PA-RISC hardware (HP-UX was not so good).

                                      1. 2

                                        Do you work for IBM? What type of system do you run that on?

                                        Floodgap’s main server has been AIX (first on an Apple Network Server 500 and now on a POWER6) since its first existence, and I used to do work on a workstation with 3.2.5. There’s also a ThinkPad “800” and 860 around here. However, IBM’s kind of hostile to us AIX hobbyists and I dislike having to dig out an HMC to do any reconfiguration with the LPAR. And IBM i (and OS/400) are worse, given that the entire system is one big vendor lock-in.

                                        1. 4

                                          I don’t work for IBM. The box I use is a hosted LPAR on a POWER9.

                                          I never got into AIX except as a faster way to cross-compile; smit is a poor substitute for real administration on a 5250 (It’s still better than HP-UX though.). One dirty secret is as much as IBM wants you to use the HMC, you don’t really have to for single systems; due to the screaming of i users who don’t want to use VIOS, let alone an HMC, you can totally do basic administration without an HMC.

                                          I don’t know if there’s AIX hobbyists, but I’m involved with a community for i hobbyists.

                                          1. 1

                                            If there’s not an AIX hobby club, then let it begin with me. (Jokes aside, someone used to call themselves the “MCA Mafia.”) But how would you reconfigure RAM allocation and so forth? On this POWER6, ASMI doesn’t really have any options for that.

                                            smit happens, but smit is definitely better than sam, I agree!

                                            1. 1

                                              Oh, the Ardent Tool of Capitalism?

                                              It’s been a long while since I looked at ASMI. If you’re running i as the dom1, then i can actually act as a mini-VIOS, with some limitations (i.e no SEAs, you have to bridge virtual ports to something).

                                      1. 1

                                        How inexpensive is this non-volatile ‘ram-like’ memory these days?

                                        1. 1

                                           It isn’t cheap yet, but I think there’s little doubt PMEM is the future. It’s like seeing the transition to 64-bit and SSDs.

                                          1. 1

                                             Cheaper than flash SSDs, gigabyte for gigabyte; and obviously SSDs are cheaper than RAM, or else instead of having a few hundred gig of SSDs holding our swapfiles, we’d have a few hundred gig of RAM and no swapfiles.

                                            The thing is that they’re byte-by-byte rewritable. You don’t need that in a disk; in fact, you need to wrap it in a tonne of extra logic to hide it away, since disks work on a sector-by-sector or block-by-block basis. So it makes 3D Xpoint less competitive in the SSD space.

                                        1. 4

                                          (Not my language; one I’m contributing some small things to:)

                                          “Small” language because:

                                          • no modifiers
                                          • no exceptions
                                          • no null pointers
                                          • no implicit imports
                                          • no static members
                                          • no collection literals
                                          • no implicit conversions
                                          • no method overloading

                                          More on not repeating existing mistakes:

                                          • <> for generics is wrong and broken, therefore use []
                                          • Type ident works poorly with generics, therefore use ident: Type
                                          • fields + methods + properties failed, instead avoid leaking method/field implementation details to callers

                                           Pretty much the only thing I plan to spend language complexity on is merging if-statements, pattern matching and the ternary operator into a unified condition syntax.

                                           That lets me get rid of three moderately complex features in favor of one that is only slightly more complex (but vastly less complex than the sum of the complexity of all three).

                                          1. 9

                                            I don’t think you mentioned the name of the language :)

                                            1. 4

                                              Likely

                                              https://github.com/dinfuehr/dora

                                              Clicked on some links and that’s something this user has contributed to recently on Github.

                                            2. 1

                                              That’s very cool! Do you have any thoughts on a unified function syntax? i.e., in most languages you can say f(x) = x + 5, or you can say f = x -> x + 5… function and lambda syntax are inconsistent!

                                              1. 1

                                                Yes: I would love to make it consistent, but I currently see no solution to that.

                                              2. 1

                                                I totally agree on your syntax points.

                                                What are your thoughts on the fields + methods + properties … solutions?

                                                1. 1

                                                  Basically there are two ways to “pay” for data: storage or computation.

                                                   So you only ever need two ways to express them, i.e. let and fun.

                                                   The core idea is that (unlike in C) let and fun already disambiguate these possibilities way better than a () after the name – so get rid of that.

                                                  Consider this:

                                                  class Person(let name: String)
                                                      fun firstName: String = name.split(" ").get(0)
                                                      fun lastName: String = name.split(" ").get(1)
                                                  

                                                  One field, two methods. Used like this:

                                                  somePerson.name
                                                  somePerson.firstName
                                                  somePerson.lastName
                                                  

                                                  Imagine some time later the class gets refactored to this:

                                                  class Person(let firstName: String, let lastName: String)
                                                      fun fullName: String = firstName + " " + lastName
                                                  

                                                  … and the callers don’t need to change at all!

                                              1. 2

                                                I like it. How does the routing work though? When I did a cursory look at tailscale, the docs indicated that you had to manually specify certain routes. The particular case I was interested in is the one shown on the homepage. In that example, you may have to route through a jump box and two VPNs.

                                                 Regardless, I think this stuff is really neat. This is setting a new bar for what people should expect.

                                                1. 2

                                                  Ah I think you are confusing subnet routing with the default behavior. If you look at that diagram closer you can see that it’s what you have without Tailscale. Click on the “with Tailscale” and you can see the point to point connections in action.

                                                1. 1

                                                  This is a good presentation. He doesn’t really cover the video display system of bhyve (vnc) but he demos it nicely at one point.

                                                  1. 3

                                                    Great article. It’s worth noting that this is actually exactly what the Cocoa JSON serialisation looks like: It can serialise NSArrays and NSDictionarys containing NSStrings, NSNumbers, and NSNull. If you can transform your model into that set of things, then it’s trivial.

                                                     The older and more general NSCoding protocol works in a slightly different way. Objects implement -encodeWithCoder: methods that require them to call methods on an NSCoder to store key-value pairs of a fairly small set of types (e.g. primitive types, arrays of primitives, a few common structures, and pointers to other objects that must also implement the NSCoding protocol), and an -initWithCoder: constructor that calls the corresponding getters on the serialisation object to initialise itself. This is very general and completely decouples the interchange format from the serialisation mechanism. There’s a secure variant of the protocol that allows callers to restrict the set of classes that can be instantiated as a result of any given object deserialisation.

                                                    If performance is not your primary concern then this design is great. If it is, JSON is not the right thing for you, but something like FlatBuffers, which generates objects that let you directly use the wire format byte buffers without any copying, is the right approach. Because this defines the objects from the wire format, it’s far harder to make into a generic thing that can be applied to any object, but it can also be used in a more generic serialisation interface if you are willing to give that up.

                                                    1. 1

                                                       Do Cocoa frameworks still support the old non-XML plist format?

                                                      1. 1

                                                         For reading, yes; I don’t know if they can still write it, though some tools still use it for human interaction (e.g. defaults read). The plconvert tool will read it, but not write it.

                                                         GNUstep extended the old plist format to support everything that the modern formats support (dates and so on); Apple didn’t. They deprecated the old format rather than extend it. The old plist format also didn’t have any version metadata, so they couldn’t extend it in a cleanly compatible way. The binary and XML formats both do.

                                                    1. 7

                                                       This is really exciting, especially at this price point. The datasheet is really excellent. A lot of products in this space, even if they have similar features on paper, usually have crappy, badly documented toolchains, and often don’t have very good or detailed datasheets.

                                                       Considering it has a USB PHY onboard, I expect this will quickly become the go-to board for keyboard DIY-ers. Personally, I wonder if it would be possible to get it to show up as a keyboard and a mouse at the same time. That could be really useful for building a low-cost KVM (the “proper” kind that keeps a keyboard/mouse device attached to all the clients at once, rather than physically changing the circuit paths of the USB connection when the input changes).

                                                      I’m especially interested to see what will happen with the “PIO state machines”, discussed in Chapter 3 of the data sheet. I imagine people will find all kinds of nifty use cases for these.

                                                      1. 4

                                                        I wonder if it would be possible to get it to show up as a keyboard and a mouse at the same time

                                                        Why wouldn’t it be possible? E.g. LOGITacker on the nRF52 shows up as four endpoints – CDC-ACM console, mouse, keyboard, custom raw HID. If you have low-level USB access, you can do whatever you want. Having multiple endpoints is actually pretty common.

                                                        this will quickly become the go-to board for keyboard DIY-ers

                                                        Yeah, probably, due to the power of branding. But honestly I would stick to STM32 for things that don’t need wireless.

                                                        1. 1

                                                          I have not personally worked with the STM32. Most of my experience is with AVR. I am curious why you would prefer the STM32?

                                                          Indeed as you say, it appears to support all 15 endpoints specified by full speed, so presumably it could identify as a composite device with two different HID endpoints. I have not worked very much with USB at such a low level though.

                                                          1. 3

                                                            I have done it on the raspberry pi zero. It emulates a keyboard, mouse, network interface, and a block storage device all at the same time. I believe it should definitely be possible with this new chip. Also, I think you are spot on about the keyboard aspect. I am personally looking at using it for a homebrew mechanical keyboard project.

                                                            1. 1

                                                              I don’t suppose you would happen to have the source code available?

                                                              1. 3

                                                                My keyboard emulation is public: https://github.com/drudru/kb-key/blob/master/src/main.cpp

                                                                The USB OTG magic happens when you configure the OS to create the proper devices.

                                                                A good Google search to get you started is: “pi zero gadget mode”

                                                                Also, check out the PiKVM project on github

                                                                good luck 👍

                                                                1. 1

                                                                  Awesome. Thank you for the link!

                                                            2. 1

                                                              STM32 has a big ecosystem already, it’s like, the most popular brand of Cortex-M microcontrollers.

                                                              Sure, the Raspberry chip will have a decent ecosystem soon-ish too, but it will take time before there’s good bootloaders, support in RTOSes, full support in OpenOCD and probe-rs, etc. etc. But like, if you look past the branding, the only advantage it has is the PIO thing, which sure is cool, but probably not required for your next project :)

                                                          1. 10

                                                             That datasheet is a thing of beauty. Things are written in plain English, diagrams are in color, and there are inline code snippet examples in both asm and C, most of which link to a file in a GitHub repo that shows how they fit into a larger program.

                                                            Seriously that is so many orders of magnitude better than any datasheet I’ve ever had to read before.

                                                            1. 1

                                                              I totally agree. They really spent some quality time on it. I think this is going to help them sell a lot of these.

                                                            2. 2

                                                               This looks even better than TI datasheets (which I found to be very readable for a software guy).

                                                            1. 4

                                                              PIO is interesting, I wonder how it compares to ESP32’s ULP.

                                                              1. 1

                                                                Thx for mentioning this. When I glanced over the blog post, I totally missed this important point. The PIO is actually really interesting from what I have read so far.

                                                              1. 3

                                                                 Is this for Optane only, or are there other NVRAM-technology DIMMs available to consumers?

                                                                1. 1

                                                                  There are enclosures for traditional DIMMs with backup battery. Not sure if they provide the same “API”, though.

                                                                1. 1

                                                                  This is nice to see. I was just thinking about Self last night out of the blue. Although a lot of people were very excited about prototypes in the mid 90s, it just didn’t seem to catch on.

                                                                  I think the biggest problem with Self was that it used to only run on Sparc hardware.

                                                                  1. 9

                                                                     I have to say this is one of those lobsters links and comments where I discover something completely new that opens up new perspectives. Lately they have been rare for me, so this post is all the more valuable.

                                                                    1. 8

                                                                      Self is very significant. Historically it was extremely influential even though it wasn’t widely used — sort of like the Velvet Underground whom “nobody listened to, but everyone who did started a band.”

                                                                      • It pioneered prototypes in OOP, later adopted of course by JavaScript.
                                                                      • The extremely dynamic nature of the language appeared to make it inefficient, but the JIT compiler introduced features like monomorphization and dynamic recompiling that made it much faster than it had any right to be. After Self’s creators moved to Sun, they applied those same techniques to the HotSpot JVM, and of course all modern JavaScript VMs use them.
                                                                      • The oddball visual environment was AFAIK the first GUI to apply techniques from the animation world, like distorting objects to emphasize the sense of motion. Later on these were adopted in systems like the iOS UI.
                                                                      1. 4

                                                                        The people who created the V8 JavaScript VM at Google as well as Urs Holzle (first Google Fellow)… actually worked on the Self project.

                                                                         Dave Ungar was most recently working on Swift at Apple. I don’t recall him saying anything about pushing the kind of dynamism that Self had into the Swift runtime. I think he was more interested in the IDE experience. Things like Swift playground.

                                                                      2. 4
                                                                        1. 1

                                                                          This is really nice, you explained some points better than I did.

                                                                      1. 1

                                                                        What if a drive returns bad data?

                                                                        1. 1

                                                                          One small correction: _N is not a prefix.

                                                                          1. 1

                                                                            Indeed it isn’t. Thanks, fixing now.

                                                                          1. 4

                                                                            Nice article, simple and pleasant to read.

                                                                            1. 1

                                                                               Thanks. Was researching the topic of HTTP/3, so I thought other people might be interested in an overview.

                                                                              1. 1

                                                                                I have only seen HTTP/2 push used for Apple’s APNS protocol.

                                                                              1. 2

                                                                                Nice review. One thing not covered is the whole ‘how to detect QUIC compatibility’ on initial connection. For example there is talk of using DNS for this.

                                                                                  1. 0

                                                                                     Last time I checked, the server had to send an alt-svc header via HTTP/2 or HTTP/1. For reference, HTTP/2 upgrade happens via TLS NPN or ALPN.

                                                                                    1. 1

                                                                                      That’s how it works, but DNS is also an option.