1. 6

    You don’t need to specify the compile line if it’s a C or C++ file:

    foo : $(OBJS)
    

    is enough. Here’s the Makefile (GNUMake, excluding dependencies) for a 150,000+ line project I have:

    %.a :
    	$(AR) $(ARFLAGS) $@ $?
    
    all: viola/viola
    
    libIMG/libIMG.a     : $(patsubst %.c,%.o,$(wildcard libIMG/*.c))
    libXPA/src/libxpa.a : $(patsubst %.c,%.o,$(wildcard libXPA/src/*.c))
    libStyle/libStyle.a : $(patsubst %.c,%.o,$(wildcard libStyle/*.c))
    libWWW/libWWW.a     : $(patsubst %.c,%.o,$(wildcard libWWW/*.c))
    viola/viola         : $(patsubst %.c,%.o,$(wildcard viola/*.c))	\
    		libIMG/libIMG.a		\
    		libXPA/src/libxpa.a	\
    		libStyle/libStyle.a	\
    		libWWW/libWWW.a
    

    I have a rule to automatically make the dependencies.

    1. 3

      foo: $(OBJS) with no command is not enough for non-GNU make.

      1. 9

        Ah. I haven’t used a non-GNUMake in … 20 years?

        1. 2

          indeed ! it is way better to use a portable make, rather than write portable makefiles :)

          1.  

            In that case I’ll use NetBSD make ;)

      2. 1

        for < 10k line projects i tend to just do ‘cc *.c’ :P usually it is fast enough

        1. 1

          Meanwhile, I have a 2.5kloc project where a full recompile takes 30 seconds, so incremental compilation is kind of necessary :p C++ is slooow.

        2.  

          Is this a revived ViolaWWW?

          1.  

            Somewhat. It’s one of those “I want to do something but I don’t know what” type projects where I clean up the code to get a clean compile (no warnings—I still have a ways to go). It works on a 32-bit system, but crashes horribly on 64-bit systems because of the systemic belief that sizeof(int)==sizeof(long)==sizeof(void *).
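
            To make that 32-bit assumption concrete, here’s a tiny illustration of my own (not from the Viola sources) of the kind of code that appears to work while int, long and pointers are all 32 bits, but silently truncates on an LP64 system where long and pointers are 64 bits:

            /* Hypothetical example, not from Viola: code that assumes
               sizeof(int) == sizeof(long) == sizeof(void *). */
            #include <stdio.h>

            int main(void)
            {
                /* On ILP32 all three sizes are 4; on LP64 int stays 4 while long
                   and pointers grow to 8, breaking code that assumes they match. */
                printf("int=%zu long=%zu ptr=%zu\n",
                       sizeof(int), sizeof(long), sizeof(void *));

                int  value   = 42;
                int  stashed = (int)(long)&value;    /* pointer squeezed into an int:
                                                        the upper 32 bits of the address
                                                        are discarded on a 64-bit system */
                int *back    = (int *)(long)stashed; /* "restoring" it yields a bogus
                                                        pointer; dereferencing it is the
                                                        kind of thing that crashes
                                                        horribly on 64-bit systems */
                printf("orig=%p back=%p\n", (void *)&value, (void *)back);
                return 0;
            }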

        1. 1

          Nice! I’ll have to make an Emacs template for when I first open a Makefile to insert that boilerplate. I’d also add the code to generate the header file dependency tree using the correct g++ flags.

          1. 5

            Here’s the rule I use (GNUMake) to do the dependencies:

            depend:
            	makedepend -Y -- $(CFLAGS) -- $(shell find . -name '*.c') 2>/dev/null
            

    This will exclude the system header files, but include all local header files. Remove the -Y to include system headers.

            1.  

              Better than this crufty make command I’ve been carrying around for probably 20 years!

              MAKEDEPS=@echo Making $*.d; \ 
                  set -e; rm -f $@; \ 
                  $(CXX) -MM $(CXXFLAGS) $< > $@.$$$$; \ 
                  sed 's,\(.*$(*F)\)\.o[ :]*,$(*D)/\1.o $@ : ,g' < $@.$$$$ > $@; \ 
                  rm -f $@.$$$$ 
              
              %.d: %.cpp 
                  $(MAKEDEPS)
              
              1.  

                Make already gets such overwhelmingly negative press so let’s not leave this as-is lest others think this is how things have to be. Here is mine:

                %.d: %.cpp
                    $(CXX) -MM -MT "$(basename $@).d $(basename $@).o" $< > $@
                
                1.  

                  Nice! I’m not even sure where I got mine, just a case of “leave it alone it works”.

              2.  

                On BSD:

                depend:
                	mkdep $(CFLAGS) $(SRCS)
                
            1. 7

              I would rather have seen the HardenedBSD code just get merged back into FreeBSD. I’m sure there are loads of reasons, but I’ve never managed to see them; their website doesn’t make that clear. I imagine it’s mostly for non-technical reasons.

              That said, it’s great that HardenedBSD is now set up to live longer, and I hope it has a great future, as it sits in a niche that only OpenBSD really occupies, and it’s great to see some competition/diversity in this space!

              1. 13

                Originally, that’s what HardenedBSD was meant for: simply a place for Oliver and me to collaborate on our clean-room reimplementation of grsecurity to FreeBSD. All features were to be upstreamed. However, it took us two years in our attempt to upstream ASLR. That attempt failed and resulted in a lot of burnout with the upstreaming process.

                HardenedBSD still does attempt the upstreaming of a few things here and there, but usually more simplistic things: We contributed a lot to the new bectl jail command. We’ve hardened a couple aspects of bhyve, even giving it the ability to work in a jailed environment.

                The picture looks a bit different today. HardenedBSD now aims to give the FreeBSD community more choices. Given grsecurity’s/PaX’s inspiring history of pissing off exploit authors, HardenedBSD will continue to align itself with grsecurity where possible. We hope to perform a clean-room reimplementation of all publicly documented grsecurity features. And that’s only the start. :)

                edit[0]: grammar

                1. 6

                  I’m sorry if this is a bad place to ask, but would you mind giving the pitch for using HardenedBSD over OpenBSD?

                  1. 19

                    I view any OS as simply a tool. HardenedBSD’s goal isn’t to “win users over.” Rather, it’s to perform a clean-room reimplementation of grsecurity. By using HardenedBSD, you get all the amazing features of FreeBSD (ZFS, DTrace, Jails, bhyve, Capsicum, etc.) with state-of-the-art and robust exploit mitigations. We’re the only operating system that applies non-Cross-DSO CFI across the entire base operating system. We’re actively working on Cross-DSO CFI support.

                    I think OpenBSD is doing interesting things with regards to security research, but OpenBSD has fundamental paradigms that may not be compatible with grsecurity’s. For example: by default, creating an RWX memory mapping with mmap(2) is not allowed on either HardenedBSD or OpenBSD. However, HardenedBSD takes this one step further: if a mapping has ever been writable, it can never be marked executable (and vice-versa).

                    On HardenedBSD:

                    void *mapping = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE | PROT_EXEC, ...); /* The mapping is created, but RW, not RWX. */
                    mprotect(mapping, getpagesize(), PROT_READ | PROT_EXEC); /* <- this will explicitly fail */
                    
                    munmap(mapping, getpagesize());
                    
                    mapping = mmap(NULL, getpagesize(), PROT_READ | PROT_EXEC, ...); /* <- Totally cool */
                    mprotect(mapping, getpagesize(), PROT_READ | PROT_WRITE); /* <- this will explicitly fail */
                    

                    It’s the protection around mprotect(2) that OpenBSD lacks. Theo’s disinclined to implement such a protection, because users will need to toggle a flag on a per-binary basis for those applications that violate the above example (web browsers like Firefox and Chromium being the most notable examples). OpenBSD implemented WX_NEEDED relatively recently, so my thought is that users could use the WX_NEEDED toggle to disable the extra mprotect restriction. But, not many OpenBSD folk like that idea. For more information on exactly how our implementation works, please look at the section in the HardenedBSD Handbook on our PaX NOEXEC implementation.

                    I cannot stress strongly enough that the above example wasn’t given to be argumentative. Rather, I wanted to give an example of diverging core beliefs. I have a lot of respect for the OpenBSD community.

                    Even though I’m the co-founder of HardenedBSD, I’m not going to say “everyone should use HardenedBSD exclusively!” Instead, use the right tool for the job. HardenedBSD fits 99% of the work I do. I have Win10 and Linux VMs for those few things not possible in HardenedBSD (or any of the BSDs).

                    1. 3

                      So how will JITs work on HardenedBSD? Is the sequence:

                      mmap(PROT_WRITE);
                      // write data
                      mprotect(PROT_EXEC);
                      

                      allowed?

                      1. 5

                        By default, migrating a memory mapping from writable to executable is disallowed (and vice-versa).

                        HardenedBSD provides a utility that users can use to tell the OS “I’d like to disable exploit mitigation just for this particular application.” Take a look at the section I linked to in the comment above.

                    2. 9

                      Just to expand on the different-philosophies point: OpenBSD would never bring ZFS, Bluetooth, etc. into the OS, something HardenedBSD does.

                      OpenBSD has a focus on minimalism, which is great from a maintainability and security perspective. Sometimes that means you miss out on things that could make your life easier. That said, OpenBSD still has a lot going for it. I run both, depending on need.

                      If I remember right, just the ZFS sources by themselves are larger than the entire OpenBSD kernel sources, which gives ZFS a LOT of attack surface. That’s not to say ZFS isn’t awesome, it totally is, but if you don’t need ZFS for a particular compute job, not including it gives you a lot smaller surface for bad people to attack.

                      1. 5

                        If I remember right, just the ZFS sources by themselves are larger than the entire OpenBSD kernel sources, which gives ZFS a LOT of attack surface.

                        I would find a fork of HardenedBSD without ZFS (and perhaps DTrace) very interesting. :)

                        1. 3

                          Why fork? Just don’t load the kernel modules…

                          1. 4

                            There have been quite a number of changes to the kernel to accommodate ZFS. It’d be interesting to see whether the kernel could be made simpler with ZFS fully removed.

                            1. 1

                              You may want to take a look at dragonflybsd then.

                        2. 4

                          Besides being large, I think what makes me slightly wary of ZFS is that it also has a large interface with the rest of the system, and was originally developed in tandem with Solaris/Illumos design and data structures. So any OS that diverges from Solaris in big or small ways requires some porting or abstraction layer, which can result in bugs even when the original code was correct. Here’s a good writeup of such an issue from ZFS-On-Linux.

                  1. 2

                    I remember dialing into BBSes at 300 baud back in the late 80s. I found 1200 much better as I could read at that speed; 2400 was a bit too fast to keep up with. Downloading files took forever though.

                    The guy needs to set his TTY session to send a CRLF at the end of each line instead of just the LF. A stty onlcr should work.

                    1. 3

                      I host my own, and have since 1998 (back when I was wrangling servers for an ISP/web hosting company). I’ve had the current IP for my email host for probably 10 to 15 years now, so it’s clean. I’m also the only user, so no spam goes out.

                      I currently run Postfix with a greylist daemon. I check email directly on the server using mutt (which I’ve noticed lowers my outgoing email spam score by a lot). I’ve configured SPF, and I have a valid PTR record for my server. Except for some rare hiccups (mostly a decade ago with AOL, Yahoo and Google—go figure) I have had no issues. I attribute this more to longevity and a slow boil [1] than anything else.

                      [1] In the frog sense. I’ve had to keep up over the years; I have avoided having to start from scratch today.

                      1. 1

                        I’m curious, where do you get your own public IPv4 address?

                        1. 1

                          I have a virtual server at a data center from the company I used to work for.

                          1. 1

                            I understand. Where I live, public IPv4 addresses are uncommon, and even where they’re available, the entities behind them aren’t trustworthy.

                      1. 4

                        Hopefully they only hide www. when it is exactly at the start of the domain name, leaving duplicates and domains in the middle (like notriddle.www.github.io and www.www.lobste.rs) alone.

                        1. 43

                          How about just leaving the whole thing alone? URI/URLs are external identifiers. You don’t change someone’s name because it’s confusing. Such an arrogant move from Google.

                          1. 11

                            Because we’re Google. We don’t have to care; we know better than you.

                            1. 3

                              Eventually the URL bar will be so confusing and arbitrary that users will just have to search Google for everything.

                              1. 5

                                Which is, of course, Google’s plan and intent all along. Wouldn’t surprise me if they are aiming to remove URLs from the omnibar completely at some point.

                            2. 3

                              It’s the same with Safari on Mac - they hide not only the subdomain but everything else from the URL root onwards too. Dreadful, and the single worst (/only really bad) thing about Safari’s UI.

                              1. 3

                                You don’t change someone’s name because it’s confusing

                                That’s why they’re going to try to make it a standard.
                                They will probably also want to limit the ports that you can use with the www subdomain, or at least propose that some be hidden, like 8080

                                1. 2

                                  Perhaps everyone should now move to w3.* or web.* names just to push back! Serious suggestion.

                                2. 1

                                  Indeed, but I still think it is completely unnecessary and I don’t get how this “simplifies” anything

                                1. 15

                                  This started out as a total rant about the current state of the web, insecurity and the lack of proper rigidity in specifications. I decided not to post it while I was all riled up. The next day I rewrote it in its current form. It’s still a bit one-sided as I’m still having trouble understanding their reasoning. I vainly hope they’ll either give me a more coherent explanation why they dropped the formal grammar, or actually fix it.

                                  1. 15

                                    The formal grammar doesn’t reflect reality. The browsers started diverging from it years ago, as did the server authors. Sad, but true of many, many similar specifications. The WHATWG spec is a descriptive spec, not a prescriptive one: it was very carefully reverse engineered from real behaviours.

                                    1. 8

                                      You can model that, too. Specs trying to model C with undefined behavior, or protocol operation with failure modes, just add the extra stuff in there somewhere. Preferably outside of the clean, core, expected functioning. You still get the benefits of a formal spec. You just have to cover more ground in it. It’s also good to do spec-based test generation, run against all those clients, servers, or whatever, to test the spec itself for accuracy.

                                      1. 1

                                        … that’s exactly what these modern bizarro algorithmic descriptions of parsers are—rigorous descriptions of real behaviors that have been standardized. “Just add the extra stuff” and this is what you get.

                                        It sounds like by a “formal spec” you mean a “more declarative and less algorithmic” spec, which definitely seems worthwhile. But be clear about what you want and how it’s different from what people have been forced to do by necessity in order to keep the web running.

                                        1. 1

                                          By formal spec, I mean formal specification: a precise, mathematical/logical statement of the standard. A combo of English and a formal spec (especially an executable one) would remove ambiguities, highlight complexities, and aid correct implementation.

                                          Certain formal languages also support automatic test generation from specs. That becomes a validation suite for implementations. A formal spec also allows for verified implementations, whether partly or fully.

                                          1. 2

                                            I am exceedingly familiar with what a formal specification is. I am pretty sure you are confused about the difference between rigor and a declarative style—the two are entirely orthogonal. It is possible to specify something in an algorithmic style and to be entirely unambiguous, highlight complexities, aid correct implementation, and support automatic test generation; moreover, this has been done and is done extremely often—industry doesn’t use (or doesn’t get to use) parser generators all the time.

                                            1. 1

                                              Ok, good you know it. It’s totally possible I’m confused on rigor. I’ve seen it used in a few different ways. How do you define it?

                                              1. 2

                                                Sorry for the delay, renting a car :(

                                                I would define rigor as using mathematics where possible, and extremely precise prose when necessary, to remove ambiguity, like you pointed out. Concretely, rigor is easier to achieve when the language you are writing in is well defined.

                                                If you write using mathematical notation, you get the advantage of centuries of development in precision—you don’t have to redefine what a cross product or set minus or continuity are, for example, which would be very painful to do in prose.

                                                Specs try to achieve the same thing by using formalized and often stilted language and relying on explicit references to other specs. Because mathematicians have had a much longer time to make their formalisms more elegant (and to discover where definitions were ambiguous—IIRC Cauchy messed up his definition of convergence and no one spotted the error for a decade!) specs are often a lot clunkier.

                                                For an example of clunkiness, look at the Page Visibility API. It’s an incredibly simple API, but even then the spec is kind of painful to read. Sorry I can’t link to the specific section, my phone won’t let me. https://www.w3.org/TR/page-visibility/#visibility-states-and-the-visibilitystate-enum

                                                Separately, for an example of formal methods that looks more algorithmic than you might normally expect, see NetKAT, which is a pretty recent language for programming switches. https://www.cs.cornell.edu/~jnfoster/papers/frenetic-netkat.pdf

                                                Certainly web spec authors have a long way to go until they can commonly use formalisms that are as nice as NetKAT’s. But they still have rigor, just within the clunky restrictions imposed by having to write in prose.

                                    2. 5

                                      I have to parse sip: and tel: URLs (RFC-3261 and RFC-3966) for work. I started with the formal grammar specified in the RFCs (and use LPeg for the parsing) and even then, it took several iterations with the code to get it working against real-world data (coming from the freaking Monopolistic Phone Company of all places!). I swear the RFCs were written by people who never saw a phone number in their life. Or were wildly optimistic. Or both. I don’t know.

                                      1. 8

                                        I may hazard a guess… I watched the inception of WHATWG and used to follow their progress over several years, so I have a general feeling of what they’re trying to do in the world.

                                        WHATWG was born as an antithesis to W3C’s effort to enforce a strict XHTML on the Web. XHTML appealed to developers, both of Web content and of user agents, because, honestly, who doesn’t want a more formal, simpler specification? The problem was that the world at large is not rigid and poorly lends itself to formal specifications. WHATWG realized that and attempted to simply describe the Web in all its ugliness, complete with reverse-engineered error handling of non-cooperative browsers. They succeeded.

                                        So I could imagine the reasoning for dropping the formal specification comes down to admitting that it can’t be done in a fashion compatible with the world. Sure, developers would prefer to have ABNF for URLs, but users prefer browsers where all URLs work. Sorry :-(

                                        1. 3

                                          This is my understanding too, but you still need to nail down some sort of “minimally acceptable” syntax for URLs to prevent further divergence and to guide new implementations.

                                      1. 4

                                        The logo looks like the severed head of a Lego minifig who is missing an eye.

                                        1. 4

                                          Just once, I would like to see one of these TDD proponents tackle an actual hard problem, such as writing a device driver (which includes an interrupt handler). I’d love to see how TDD would work in such a situation.

                                          1. 1

                                            I’m not a TDD proponent, and I’ve never written device drivers, but I tend to test the impact of asynchronous external events (e.g. interrupts) using generative testing; e.g. using QuickCheck to generate a sequence of plausible or pathological actions as part of the test’s inputs, and firing them during the test’s execution of the code. This can use dummy or mock I/O actions, since we’re mostly testing the logic; or we can run in a standalone simulator/interpreter/VM.

                                            I believe that this is quite common for testing networked systems (where the “actions” are incoming packets or requests, rather than interrupts).
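
                                            Here’s a hand-rolled sketch of that idea in plain C (this is not QuickCheck, and the device model is invented purely for illustration): generate random event sequences, including interrupt-like ones, replay them against the logic under test, and check an invariant after every step.

                                            #include <assert.h>
                                            #include <stdio.h>
                                            #include <stdlib.h>

                                            enum event { EV_SUBMIT, EV_IRQ_DONE, EV_IRQ_SPURIOUS, EV_TIMEOUT };

                                            struct dev { int busy; int queued; int completed; };

                                            /* Logic under test: how the toy "driver" reacts to submissions and interrupts. */
                                            static void handle(struct dev *d, enum event ev) {
                                                switch (ev) {
                                                case EV_SUBMIT:
                                                    if (d->busy) d->queued++;
                                                    else d->busy = 1;
                                                    break;
                                                case EV_IRQ_DONE:
                                                    if (!d->busy) break;              /* completion while idle is ignored */
                                                    d->completed++;
                                                    if (d->queued > 0) d->queued--;   /* start the next queued request    */
                                                    else d->busy = 0;
                                                    break;
                                                case EV_IRQ_SPURIOUS:                 /* must be harmless                 */
                                                case EV_TIMEOUT:
                                                    break;
                                                }
                                            }

                                            static void invariant(const struct dev *d) {
                                                assert(d->queued >= 0);
                                                assert(d->busy == 0 || d->busy == 1);
                                                assert(!(d->queued > 0 && !d->busy)); /* no queued work without an active one */
                                            }

                                            int main(void) {
                                                srand(12345);                         /* fixed seed so a failure reproduces */
                                                for (int run = 0; run < 1000; run++) {
                                                    struct dev d = {0};
                                                    for (int step = 0; step < 200; step++) {
                                                        handle(&d, (enum event)(rand() % 4));  /* pathological orderings too */
                                                        invariant(&d);
                                                    }
                                                }
                                                puts("1000 random event sequences survived");
                                                return 0;
                                            }

                                            A real driver test would swap the rand() stream for a proper generator with shrinking, and the struct for mocked device registers, but the shape is the same.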

                                            1. 1

                                              Not really about device drivers, but Mike Bland wrote a series on how to test the OpenSSL code that was involved in Heartbleed: https://martinfowler.com/articles/testing-culture.html I guess this subject matter could be considered harder than bog-standard business CRUD.

                                            1. 2

                                              I just switched to OpenBSD for e-mail using the following stack:

                                              Inbound: opensmtpd -> spampd(tag) -> opensmtpd -> clamsmtpd(tag) -> opensmtpd -> procmail -> dovecot(Maildir)
                                              Outbound: opensmtpd -> dkim_proxy -> opensmtpd(relay)

                                              I don’t use spamd/greylisting up front like a lot of tutorials suggest, but spampd (SpamAssassin) seems to get the majority of it.

                                              My old stack was similar, but used Postfix on openSUSE. I really like the opensmtpd configuration; loads simpler than Postfix. However, I wish it supported filters like the other MTAs do. It had filter support for a bit, but it was clunky and subsequently removed. That makes it difficult (impossible?) to run things like rspamd.

                                              1. 5

                                                rspamd has an MDA mode, so you can do like

                                                accept from any for local virtual { "@" => mike } deliver to mda "rspamc --mime --ucl --exec /usr/local/bin/dovecot-lda-mike" as mike
                                                

                                                and dovecot-lda-mike is

                                                #! /bin/sh
                                                exec /usr/local/libexec/dovecot/dovecot-lda -d mike
                                                

                                                smtpd is really really really good. For some reason the email software ecosystem is a mess of insane configs and horrible scripts, but my smtpd.conf is 12 lines and the only script I use (that rspamd one) is going to go away when filters come back. smtpd is so good I went with an MDA instead of a web app to handle photo uploads to my VPS. It’s one line in smtpd.conf and ~70 lines of python, and I don’t have to deal with fcgi or anything like that.

                                                1. 1

                                                  smtpd is so good I went with an MDA instead of a web app to handle photo uploads to my VPS

                                                  Oh that’s a clever idea. I’ve been using ssh (via termux) on my phone but that is so clumsy.

                                                2. 5

                                                  I do greylisting on my email server [1] and I’ve found that it reduces the incoming email by 50% up front—there are a lot of poorly written spam bots out there. Greylisting up front will reduce the load that your spam system will have to slog through, for very little cost.

                                                  [1] Yes, I run my own. Been doing it nearly 20 years now (well over 10 at its current location) so I have it easier than someone starting out now. Clean IP, full control over DNS (I run my own DNS server; I also have access to modify the PTR record if I need to) and it’s just me—no one else receives email from my server.

                                                  1. 2

                                                    I’m the author/presenter of the tutorial. If I may, I suggest looking at my talk this year at BSDCan: Fighting Spam at the Frontline: Using DNS, Log Files and Other Tools in the Fight Against Spam. In those slides I talk about using SPF records (spf_fetch, smtpctl spfwalk, spfwalk standalone) to whitelist IPs, and mining httpd and sshd logs for bad actors and actively blacklisting them.

                                                    For those who find blacklisting a terrifying idea, in the presentation I suggest configuring your firewall rules so that your whitelists always win. That way, if Google somehow gets added to your blacklists, the whitelist rule will ensure Gmail can still connect.

                                                    I also discuss ways to capture send-to domains and add them to your whitelists so you don’t have to wait hours for them to escape the greylists.

                                                    1. 1

                                                      I didn’t find SPF to be all that great, and it was nearly the same three years earlier. Even the RBLs were problematic, but that was three years ago.

                                                      As for greylisting, I currently hold them for 25 minutes, and that might be 20 minutes longer than absolutely required.

                                                    2. 1

                                                      Greylisting is the best. Back when my mailserver was just on a VPS it was the difference between spamd eating 100% CPU and a usable system.

                                                  1. 2

                                                    So I’m pretty familiar with how cool BeFS is, but what are the other advantages of BeOS over Unix?

                                                    1. 10

                                                      Here’s the 10,000ft view. Main advantage was performance through pervasive concurrency. My current box is Linux on an Intel Celeron. Its responsiveness is worse than BeOS was on a Pentium, due to inferior architecture. My favorite demonstration is this clip where they throw everything they can at the machine. Notice that (a) it doesn’t crash immediately vs OSes of the time and (b) graceful degradation/recovery. I still can’t do that on crap hardware with Ubuntu without lots of lagging or something.

                                                      1. 2

                                                        When I used to demo BeOS for folks in NYC, I would throw even more at it than what is in the demo. It was amazing how good BeOS was. I’ve never had a better day to day OS in terms of stability and responsiveness. Even when it was in beta and I had to kill off servers from time to time, they would pop right back up and everything would keep going.

                                                        1. 3

                                                          I rarely hear that something is better than the demo in practice. Wow. Its architecture is worth copying today. I don’t know how close HaikuOS is to the architecture and stability under load. They might be doing their own thing in some places.

                                                          Anyone copying or reusing its principles today also has tools like Chapel, Rust, and Pony that might make the end result even better in both performance and stability. QNX and BeOS were the two I wanted cloned the most into a FOSS desktop. I hate freezes, crashes and data loss. Aside from hardware failure, no reason we should have to deal with them any more.

                                                          1. 2

                                                            The relatively limited time that I got to use QNX was pretty nice. It was limited in some areas but from a stability and responsiveness standpoint, it was a joy to use.

                                                            1. 3

                                                              If you’re curious, John Nagle described how it balanced performance and stability here. He’s constantly encouraging a copy of its design on places like HN. I did find an architectural overview from the company itself in 1992.

                                                              EDIT: Another person on HN described what the desktop experience was like. That person’s main memory was how big compiles would slow down their main workstation but not the QNX desktop. Its real-time design, maybe built-in priorities, made sure the UI parts ran immediately despite heavy load from other processes. The compiles got paused just enough for whatever he was doing. That sounded cool given that one app can drag down my whole system or interrupt my typing to this day.

                                                              1. 3

                                                                Back in the mid-90s I was hired to port a bunch of Unix programs to QNX. What blew me away about QNX was the network transparency in the command line. I could run a program on my machine A, loading a file from B, piping the output to a program that lives on C but run it on D and pipe that output to a local printer hooked to E, all from the command line. My boss would regularly use the modem attached to my machine from his machine (in the office next to mine).

                                                                Now, this meant that all machines had to have the same accounts installed, and the inter-process message passing was done over Ethernet, not IP, so it was limited to a single segment. Such a setup might not fly that well in these more security-conscious days.

                                                                As far as speed goes, QNX was fast. I had friends that worked at a local company that sold commercial X servers and they said that the fastest X servers they had all ran on QNX.

                                                                1. 2

                                                                  That all sounds awesome. I wonder if it’s still that fast on something like Intel Core CPUs given how hardware has changed (e.g. CPU vs memory bottlenecks). Some benchmarks would be interesting against both Linux and L4-based systems.

                                                                  If it held up, then someone should definitely clone and improve on it. Alternatively, port its tricks to other kernel types or hypervisors.

                                                    1. 6

                                                      What? No links at all to these fabulous text only websites?

                                                      1. 2

                                                        You’re right! I can’t believe I forgot that “detail”. I will update the article when I get home. Sorry about that! Thanks for pointing this out.

                                                      1. 4

                                                        It’s a bit of a tough choice. With the current state of things, most users would see a massive improvement by switching from ISP DNS servers that admit to collecting and selling your data to Cloudflare, which has agreed to protect privacy.

                                                        In the end, you have to trust someone for your DNS. Mozilla could probably host it themselves, but they also don’t have the wide spread of server locations that a CDN company has.

                                                        1. 5

                                                          While I agree that you need to trust someone with your DNS, it shouldn’t be a specific app making that choice for you. A household, or even a user with multiple devices, benefits from their router caching DNS results for all of them; every app on every device doing this independently is foolish. If Mozilla wants to help users they can run an informational campaign; setting a precedent for apps each using their own DNS and circumventing what users have set for themselves is the worst solution.

                                                          1. 1

                                                            It isn’t ideal that Firefox is doing DNS in-app, but it’s the most realistic solution. They could try to get Microsoft, Apple and all Linux distros to change to DNS over HTTPS and maybe in 5 years we might all have it, or they could just do it themselves and we all have it in a few months. Once Firefox has started proving it works really well, then OS vendors will start adding it and Firefox can remove their own version, or distros will patch it to use the system DoH.

                                                            1. 6

                                                              They could try and get microsoft, apple and all linux distros to change to DNS over HTTPS

                                                              I don’t WANT DNS over HTTPS. I especially don’t want DNS over HTTP/2.0. There’s a lot of value in having protocols that are easy to implement, debug, and understand at a low level, and none of those families of protocols are that.

                                                              Add TLS, maybe – it’s also a horrendous mess, but since DNSCURVE seems to be dead, it may get enough traction. Cloudflare, if they really want, can do protocol sniffing on port 443. But please, let’s not make the house of card protocol stack that is the internet even more complex.

                                                              1. 8

                                                                DNS is “easy to implement, debug, and understand”? That’s news to me.

                                                                1. 5

                                                                  It’s for sure easier than when tunneled over HTTP2 > SSL > TCP, because that’s how DoH works. The payload of the data being transmitted over HTTP is actual binary DNS packets, so all this does is add complexity overhead.

                                                                  I’m not a big fan of DoH because of that and also because this means that by default intranet and development sites won’t be available any more to users and developers, invalidating an age-old concept of having private DNS.

                                                                  So either you now need to deploy customized browser packages, or tweak browser’s configs via group policy or equivalent functionality (if available), or expose your intranet names to public DNS which is a security downgrade from the status quo.

                                                                  1. 3

                                                                    It is when you have a decent library to encode/decode DNS packets, and UDP is nearly trivial to deal with compared to TCP (much less TLS).
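
                                                                    To make “nearly trivial” concrete, here’s a rough sketch of my own (not from this thread, with no error handling or timeouts) that hand-encodes a single A query for example.com and sends it straight over UDP port 53; the query ID, name and resolver address are arbitrary choices:

                                                                    #include <arpa/inet.h>
                                                                    #include <netinet/in.h>
                                                                    #include <stdint.h>
                                                                    #include <stdio.h>
                                                                    #include <string.h>
                                                                    #include <sys/socket.h>
                                                                    #include <unistd.h>

                                                                    /* append a 16-bit value in network byte order */
                                                                    static size_t put16(uint8_t *p, size_t off, uint16_t v) {
                                                                        v = htons(v);
                                                                        memcpy(p + off, &v, 2);
                                                                        return off + 2;
                                                                    }

                                                                    int main(void) {
                                                                        uint8_t q[512];
                                                                        size_t off = 0;

                                                                        /* 12-byte header: ID, flags (just RD), QDCOUNT=1, AN/NS/ARCOUNT=0 */
                                                                        off = put16(q, off, 0x1234);
                                                                        off = put16(q, off, 0x0100);
                                                                        off = put16(q, off, 1);
                                                                        off = put16(q, off, 0);
                                                                        off = put16(q, off, 0);
                                                                        off = put16(q, off, 0);

                                                                        /* QNAME is length-prefixed labels: 7"example" 3"com" 0 */
                                                                        const char *labels[] = { "example", "com" };
                                                                        for (int i = 0; i < 2; i++) {
                                                                            size_t len = strlen(labels[i]);
                                                                            q[off++] = (uint8_t)len;
                                                                            memcpy(q + off, labels[i], len);
                                                                            off += len;
                                                                        }
                                                                        q[off++] = 0;
                                                                        off = put16(q, off, 1);   /* QTYPE  = A  */
                                                                        off = put16(q, off, 1);   /* QCLASS = IN */

                                                                        int s = socket(AF_INET, SOCK_DGRAM, 0);
                                                                        struct sockaddr_in dst = { .sin_family = AF_INET, .sin_port = htons(53) };
                                                                        inet_pton(AF_INET, "1.1.1.1", &dst.sin_addr);   /* plain DNS, not DoH */
                                                                        sendto(s, q, off, 0, (struct sockaddr *)&dst, sizeof(dst));

                                                                        uint8_t resp[512];
                                                                        ssize_t n = recvfrom(s, resp, sizeof(resp), 0, NULL, NULL);  /* blocks */
                                                                        printf("%zd-byte response, rcode %d\n", n, n >= 4 ? (resp[3] & 0x0f) : -1);
                                                                        close(s);
                                                                        return 0;
                                                                    }

                                                                    The entire request is 29 bytes on the wire. Wrap those same bytes in HTTP/2 frames inside TLS and the debugging story gets a lot longer.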

                                                                  2. 0

                                                                    Stacking protocols makes things simpler. Instead of having to understand a massive protocol that sits on its own, you now only have to understand the layer that you are interested in. I haven’t looked into DNS but I can’t imagine it’s too simple. It’s incredibly trivial for me to experiment and develop with applications running on top of HTTP because all of the tools already exist for it and aren’t specific to DoH. You can also share software and libraries, so you only need one HTTP library for a lot of protocols instead of them all managing sending data over TCP.

                                                                    1. 6

                                                                      But the thing transmitted over HTTP is binary DNS packets. So when debugging you still need to know how DNS packets are built, but you now also have to deal with HTTP on top. Your HTTP libraries only give you a view into the HTTP part of the protocol stack but not into the DNS part, so when you need to debug that, you’re back to square one but also need your HTTP libraries

                                                                      1. 6

                                                                        And don’t forget that HTTP/2 is basically a binary version of HTTP, so now you have to do two translation steps! Also, because DoH is basically just the original DNS encoding, it only adds complexity. For instance, the spec itself points out that you have two levels of error handling: One of HTTP errors (let’s say a 502 because the server is overloaded) and one of DNS errors.

                                                                        It makes more sense to just encode DNS over TLS (without the unnecessary HTTP/2 stuff), or to completely ditch the regular DNS spec and use a different wire format based on JSON or XML over HTTP.

                                                                        1. 4

                                                                          And don’t forget that HTTP/2 is basically a binary version of HTTP

                                                                          If only it was that simple. There’s server push, multi-streaming, flow control, and a huge amount of other stuff on top of HTTP/2, which gives it a relatively huge attack surface compared to just using (potentially encrypted) UDP packets.

                                                                          1. 3

                                                                            Yeah, I forgot about all that extra stuff. It’s there (and thus can be exploited), even if it’s not strictly needed for DoH (I really like that acronym for this, BTW :P)

                                                                2. 1

                                                                  it shouldn’t be a specific app making that choice for you

                                                                  I think there is a disconnect here between what security researchers know to be true vs what most people / IT professionals think is true.

                                                                  Security, in this case privacy and data integrity, is best handled with the awareness of the application, not by trying to make it part of the network or infrastructure levels. That mostly doesn’t work.

                                                                  You can’t get any reasonable security guarantees from the vast majority of local network equipment / CPE. To provide any kind of privacy the application is the right security barrier, not your local network or isp.

                                                                  1. 3

                                                                    I agree that sensible defaults will increase security for the majority of users, and there is something to be said for one’s browser being the single most DNS-hungry app for that same majority.

                                                                    If it’s an option that one can simply override (which appears to be the case), then why not. It will improve things for lots of people, and those who choose to have the same type of security (dnscrypt/dnssec/future DNS improvements) on their host or router can do so.

                                                                    But I can’t help thinking it’s a bit of a duct-tape solution to bigger issues with DNS overall as a technology and the privacy concerns that it represents.

                                                              1. 4

                                                                @akkartik what are your thoughts on having many little languages floating around?

                                                                1. 9

                                                                  I see right through your little ploy to get me to say publicly what I’ve been arguing privately to you :) Ok, I’ll lay it out.

                                                                  Thanks for showing me this paper! I’d somehow never encountered it before. It’s a very clear exposition of a certain worldview and way of organizing systems. Arguably this worldview is as core to Unix as “do one thing and do it well”. But I feel this approach of constantly creating small languages at the drop of a hat has not aged well:

                                                                  • Things have gotten totally insane when it comes to the number of languages projects end up using. A line of Awk here, a line of Sed there, makefiles, config files, m4 files, Perl, the list goes on and on. A newcomer potentially may want to poke at any of these, and now (s)he may have to sit with a lengthy manpage for a single line of code. (Hello man perl with your 80+ parts.) I’m trying to find this egregious example in my notes, but I noticed a year or two ago that some core Ruby project has a build dependency on Python. Or vice versa? Something like that. The “sprawl” in the number of languages on a modern computer has gotten completely nuts.

                                                                    • I think vulnerabilities like Shellshock are catalyzing a growing awareness that every language you depend on is a potential security risk. A regular tool is fairly straightforward: you just have to make sure it doesn’t segfault, doesn’t clobber memory out of bounds, doesn’t email too many people, etc. Non-trivial but relatively narrow potential for harm. Introduce a new language, though, and suddenly it’s like you’ve added a wormhole into a whole new universe. You have to guard against problems with every possible combination of language features. That requires knowing about every possible language feature. So of course we don’t bother. We just throw up our arms and hope nothing bad happens. Which makes sense. I mean, do you want to learn about every bone-headed thing somebody threw into GNU make?!

                                                                  Languages for drawing pictures or filling out forms are totally fine. But that’s a narrower idea: “little languages to improve the lives of non-programmers”. When it comes to “little languages for programmers” the inmates are running the asylum.

                                                                  We’ve somehow decided that building a new language for programmers is something noble. Maybe quixotic, but high art. I think that’s exactly wrong. It’s low-brow. Building a language on top of a platform is the easy expedient way out, a way to avoid learning about what already exists on your platform. If existing languages on your platform make something hard, hack the existing languages to support it. That is the principled approach.

                                                                  1. 4

                                                                      I think the value of little languages comes not from what they let you do, but rather what they won’t let you do. That is, have they stayed little? Your examples such as Perl, Make, etc. are those languages that did not stay little, and hence are no longer as helpful (because one has to look at 80+ pages to understand the supposedly little language). I would argue that those that have stayed little are still very much useful and do not contribute to the problem you mentioned (e.g. grep, sed, troff, dc – although even these have been affected by feature creep in the GNU world).

                                                                    Languages for drawing pictures or filling out forms are totally fine. But that’s a narrower idea: “little languages to improve the lives of non-programmers”. When it comes to “little languages for programmers” the inmates are running the asylum.

                                                                    This I agree with. The little languages have little to do with non-programmers; As far as I am concerned, their utility is in the discipline they impose.

                                                                    1. 3

                                                                      On HN a counterpoint paper was posted. It argues that using embedded domain specific languages is more powerful, because you can then compose them as needed, or use the full power of the host language if appropriate.

                                                                      Both are valid approaches, however I think that if we subdivide the Little Languages the distinction becomes clearer:

                                                                      • languages for describing something (e.g. regular expression, format strings, graph .dot format, LaTeX math equations, etc.) that are usable both from standalone UNIX tools, and from inside programming languages
                                                                      • languages with a dedicated tool (awk, etc.) that are not widely available embedded inside other programming languages. Usually these languages allow you to perform some actions / transformations

                                                                      The former is accepted as “good” by both papers, in fact the re-implementation of awk in Scheme from the 2nd paper uses regular expressions.

                                                                      The latter is limited in expressiveness once you start using them for more than just ad-hoc transformations. However they do have an important property that contributes to their usefulness: you can easily combine them with pipes with programs written in any other language, albeit only as streams of raw data, not in a type-safe way.

                                                                      With the little language embedded inside a host language you get more powerful composition, however if the host language doesn’t match that of the rest of your project, then using it is more difficult.

                                                                      1. 3

                                                                        First, a bit of critique on Olin Shivers’ paper!

                                                                          • He attacks the little languages as ugly, idiosyncratic, and limited in expressiveness. While the first two are subjective, I think he misses the point when he says they are limited in expressiveness. That is sort of the point.
                                                                          • Second, he criticizes that a programmer has to implement an entire language including loops, conditionals, variables, and subroutines, and that these can lead to suboptimal design. Here again, in a little language, each of these structures such as variables, conditionals, and loops should not be included unless there is a very strong argument for its inclusion. The rest of the section (3) is more an attack on incorrectly designed little languages than on the concept of little languages per se. The same attacks can be leveled against his preferred approach of embedding a language inside a more expressive language.

                                                                        For me, the whole point of little languages has been the discipline they impose. They let me remove considerations of other aspects of the program, and focus on a small layer or stage at a time. It helps me compose many little stages to achieve the result I want in a very maintainable way. On the other hand, while embedding, as Shivers observes, the host language is always at hand, and the temptation for a bit of optimization is always present. Further, the host language does not always allow the precise construction one wants to use, and there is an impedance mismatch between the domain lingo and what the host language allows (as you also have observed). For example, see the section 5.1 on the quoted paper by Shivers.

                                                                        My experience has been that, programs written in the fashion prescribed by Shivers often end up much less readable than little languages with pipe line stages approach.

                                                                        1. 1

                                                                          That’s tantalizing. Do you have any examples of a large task built out of little stages, each written in its own language?

                                                                          1. 2

                                                                            My previous reply was a bit sparse. Since I have a deadline coming up, and this is the perfect time to write detailed posts in the internet, here goes :)

                                                                            In an earlier incarnation, I was an engineer at Sun Microsystems (before the Oracle takeover). I worked on the iPlanet[1] line of web and proxy servers, and among other things, I implemented the command line administration environment for these servers[2] called wadm. This was a customized TCL environment based on Jacl. We chose Jacl as the base after careful study, which looked at both where it was going to be used most (as an interactive shell environment), as well as its ease of extension. I prefer to think of wadm as its own little language above TCL because it had a small set of rules beyond TCL such as the ability to infer right options based on the current environment that made life a bit more simpler for administrators.

                                                                            At Sun, we had a very strong culture of testing, with a dedicated QA team that we worked closely with. Their expertise was the domain of web and proxy servers rather than programming. For testing wadm, I worked with the QA engineers to capture their knowledge as test cases (and to convert existing ad-hoc tests). When I looked at existing shell scripts, it struck me that most of the testing was simply invoke a command line and verify the output. Written out as a shell script, these may look ugly for a programmer because the scripts are often flat, with little loops or other abstractions. However, I have since come to regard them as a better style for the domain they are in. Unlike in general programming, for testing, one needs to make the tests as simple as possible, and loops and subroutines often make simple stuff more complicated than it is. Further, tests once written are almost never reused (as in, as part of a larger test case), but only rerun. Further, what we needed was a simple way to verify the output of commands based on some patterns, the return codes, and simple behavior such as response to specific requests, and contents of a few administration files. So, we created a testing tool called cat (command line automation tool) that essentially provided a simple way to run a command line and verify its result. This was very similar to expect[3]. It looked like this

                                                                            wadm> list-webapps --user=admin --port=[ADMIN_PORT] --password-file=admin.passwd --no-ssl
                                                                            /web-admin/
                                                                            /localhost/
                                                                            =0
                                                                            
                                                                            wadm> add-webapp --user=admin --port=[ADMIN_PORT] --password-file=admin.passwd --config=[HOSTNAME] --vs=[VIRTUAL_SERVER] --uri=[URI_PATH]
                                                                            =0 
                                                                            

                                                                            The =0 implies return code would be 0 i.e success. For matching, // represented a regular expression, “” represented a string, [] represented a shell glob etc. Ordering was not important, and all matches had to succeed. the names in square brackets were variables that were passed in from command line. If you look at our man pages, this is very similar to the format we used in the man pages and other docs.

                                                                            Wadm had two modes – stand alone, and as a script (other than the repl). For the script mode, the file containing wadm commands was simply interpreted as a TCL script by wadm interpreter when passed as a file input to the wadm command. For stand alone mode wadm accepted a sub command of the form wadm list-webapps --user=admin ... etc. which can be executed directly on the shell. The return codes (=0) are present only in stand alone mode, and do not exist in TCL mode where exceptions were used. With the test cases written in cat we could make it spit out either a TCL script containing the wadm commands, or a shell script containing stand alone commands (It could also directly interpret the language which was its most common mode of operation). The advantage of doing it this way was that it provided the QA engineers with domain knowledge an easy environment to function. The cat scripts were simple to read and maintain. They were static, and eschewed complexities such as loops, changing variable values, etc, and could handle what I assumed to be 80% of the testing scenarios. For the 80% of the remaining 20%, we provided simple loops and loop variables as a pre-processor step. If the features of cat were insufficient, engineers were welcome to write their test cases in any of perl, tcl, or shell (I did not see any such scripts during my time there). The scripts spat out by cat were easy to check and were often used as recipes for accomplishing particular tasks by other engineers. All this was designed and implemented in consultation with QA Engineers with their active input on what was important, and what was confusing.

                                                                            I would say that we had these stages in the end:

                                                                            1. The preprocessor that provides loops and loop variables.
                                                                            2. cat that provided command invocation and verification.
                                                                            3. wadm that provided a custom TCL+ environment.
                                                                            4. wadm used the JMX framework to call into the webserver admin instance. The admin instance also exposed a web interface for administration.

                                                                            We could instead have implemented the entire testing of the web server in Java. While that may have been possible, I believe that splitting it into stages, each with its own little language, was the better approach. Further, I think that keeping the little language cat simple (without subroutines, scopes, etc.) helped keep the scripts simple and understandable, with little cognitive overhead for its intended users.

                                                                            Of course, each stage existed on its own and had independent consumers. But I would say that the consumers at each stage could have chosen to use any of the more expressive languages above them, and chose not to.

                                                                            1: At the time I worked there, it was called the Sun Java System product line.

                                                                            2: There existed a few command lines for the previous versions, but we unified and regularized the command line.

                                                                            3: We could not use expect as Jacl at that time did not support it.

                                                                            1. 1

                                                                              Surely, this counts as a timeless example?

                                                                              1. 1

                                                                                I thought you were describing decomposing a problem into different stages, and then creating a separate little DSL for each stage. Bentley’s response to Knuth is just describing regular Unix pipes. Pipes are great, I use them all the time. But I thought you were describing something more :)

                                                                                1. 1

                                                                                  Ah! From your previous post

                                                                                  A line of Awk here, a line of Sed there, makefiles, config files, m4 files, Perl, the list goes on and on … If existing languages on your platform make something hard, hack the existing languages to support it. That is the principled approach.

                                                                                  I assumed that you were against that approach. Perhaps I misunderstood. (Indeed, as I re-read it, I see that I have misunderstood… my apologies.)

                                                                                  1. 1

                                                                                    Oh, Unix pipes are awesome. Particularly at the commandline. I’m just wondering (thinking aloud) if they’re the start of a slippery slope.

                                                                                    I found OP compelling in the first half when it talks about PIC and the form language. But I thought it went the wrong way when it conflated those phenomena with lex/yacc/make in the second half. Seems worth adding a little more structure to the taxonomy. There are little languages and little languages.

                                                                                    Languages are always interesting to think about. So even as I consciously try to loosen their grip on my imagination, I can’t help but continue to seek a more steelman defense for them.

                                                                        2. 2

                                                                          Hmm, I think you’re right. But the restrictions a language imposes have nothing to do with how little it is. Notice that Jon Bentley calls PIC a “big little language” in OP. Lex and yacc were tiny compared to their current size, and yet Jon Bentley’s description of them in OP is pretty complex.

                                                                          I’m skeptical that there’s ever such a thing as a “little language”. Things like config file parsers are little, maybe, but certainly by the time it starts looking like a language (as opposed to a file format) it’s well on its way to being not-little.

                                                                          Even if languages can be little, it seems clear that they’re inevitably doomed to grow larger. Lex and Yacc and certainly Make have not stood still all these years.

                                                                          So the title seems a misnomer. Size has nothing to do with it. Rust is not small, and yet it’s interesting precisely because of the new restrictions it imposes.

                                                                        3. 3

                                                                          I use LPeg. It’s a Lua module that implements Parsing Expression Grammars and in a way, it’s a domain specific language for parsing text. I know my coworkers don’t fully understand it [1] but I find parsing text via LPeg to be much easier than in plain Lua. Converting a name into its Soundex value is (in my opinion) trivial in LPeg. LPeg even comes with a sub-module to allow one to write BNF (here’s a JSON parser using that module). I find that easier to follow than just about any codebase you could present.

                                                                          So, where does LPeg fall? Is it another language? Or just an extension to Lua?

                                                                          I don’t think there’s an easy answer.

                                                                          [1] Then again, they have a hard time with Lua in general, which is weird, because they don’t mind Python, and if anything, Lua is simpler than Python. [2]

                                                                          [2] Most programmers I’ve encountered have a difficult time working with more than one or two languages, and it takes them a concerted effort to “switch” to a different language. I don’t have that issue—I can switch among languages quite easily. I wonder if this has something to do with your thoughts on little languages.

                                                                          1. 2

                                                                            I think you are talking about languages that are not little, with large attack surfaces. If a language has a lengthy man page, we are no longer speaking about the same thing.

                                                                            Small configuration DSLs (TOML, etc.), text-search DSLs (regex, jq, etc.), and the like are all marvelous examples of small languages.

                                                                            1. 1

                                                                              My response to vrthra addresses this. Jon Bentley’s examples aren’t all that little either.[1] And they have grown since, like all languages do.

                                                                              When you add a new language to your project you aren’t just decorating your living room with some acorns. You’re planting them. Prepare to see them grow.

                                                                              [1] In addition to the quote about “big little language”, notice the “fragment of the Lex description of PIC” at the start of page 718.

                                                                              1. 1

                                                                                What, so don’t create programming languages because they will inevitably grow? What makes languages different from any other interface? In my experience, interfaces also tend to grow unless carefully maintained.

                                                                                1. 2

                                                                                  No, that’s not what I mean. Absolutely create programming languages. I’d be the last to stop you. But also delete programming languages. Don’t just lazily add to the pile of shit, same as everybody else.

                                                                                  And yes, languages are exactly the same as any other interface. Both tend to grow unless carefully maintained. So maintain, dammit!

                                                                        1. 4

                                                                          LPEG, the Lua pattern-matching tool based on PEGs, translates patterns into programs that are interpreted by a parsing machine. Here’s the paper where they go into details about it (section 4): http://www.inf.puc-rio.br/~roberto/docs/peg.pdf

                                                                          1. 3

                                                                            I use LPeg a lot. One example: I created an LPeg expression that generates another LPeg expression to parse dates, based upon the format string used by strftime(). (Code).

                                                                          1. 3

                                                                            I did not create the language, but INRAC was, as far as I can determine, used for only two programs, only one of which I’ve been able to obtain (both do similar things). The better known of the two is Racter, which supposedly wrote The Policeman’s Beard is Half-Constructed. The language is … um … rather mind-blowing [1][2].

                                                                            The only “trick” I can think of is write the program in a language that makes sense for the problem, then write the compiler.

                                                                            [1]

                                                                            [2] The above is the most in-depth investigation into INRAC on the Internet, which is sad because I wrote the above and I’d still like to learn more, but without the actual compiler (or manual) it’s pretty difficult.

                                                                            1. 2

                                                                              I like this talk about destroying software more than I like this manifesto.

                                                                              1. 2

                                                                                $WORK: continue adding new features to some legacy code.

                                                                                $FUN: Need to get off my butt and actually release the Lua TLS module I wrote via LuaRocks. It should work with any version of libtls from version 2.3.0 on up.

                                                                                1. 4

                                                                                  A much better example, IMO, are these Markov-generated Tumblr posts, trained on Puppet documentation and a collection of H.P. Lovecraft stories.

                                                                                  1. 3

                                                                                    “The whippoorwills were piping wildly, and in a form capable of modifying the local system.”

                                                                                    1. 5

                                                                                      Poetic.

                                                                                      And this one looks like one of those quotes that become historic, even though almost no one who uses it knows what it means:

                                                                                      “Any reasonable number of resources can be specified in a way I can never hope to depict.”

                                                                                    2. 2

                                                                                      I like King James Programming. Example: “Exercise 3.63 addresses why we want a local variable rather than a simple map as in the days of Herod the king”

                                                                                      1. 4

                                                                                        hath it not been for the singular taste of old Unix, “new Unix” would not exist.

                                                                                        Truth.

                                                                                    1. 9

                                                                                      Most of what I want out of a website is unformatted text and jump links. Given that, the appropriate (already existing) technology to deliver it with the minimum of fuss is not HTML+CSS but gophermaps! I’d love to see people use gopher+gophermaps instead of HTML+HTTP when formatting isn’t necessary, though that’s an even bigger leap than getting people to avoid CMSes for that use case.

                                                                                      1. 2

                                                                                        Okay. gopher://i-logout.cz/1/en/bongusta/ and gopher://gopher.black/1/moku-pona are two feeds of gopher based blogs (aka phlogs). Read and enjoy.

                                                                                        1. 1

                                                                                          Thanks! I’ve read Alex Schroeder’s phlog, but I wasn’t aware of phlog aggregators.