Threads for hoistbypetard

  1. 4

    When talking about “staff” in this article, I do not mean the Staff software engineer role that is found at tech companies after senior. That is a different usage of the term.

    Talk about a buried lede for the kind of places this gets discussed.

    1. 2

      Indeed, I expected this to be about senior roles - but I’m sort of glad for the confusion because I don’t think I’d have clicked otherwise. Turns out I found this topic more interesting. While not a software developer / engineer, it turns out I’ve hopped between “line” and “staff” a number of times and it’s helping to clarify some of the things I’ve been thinking about in my latest career progression.

    1. 3

      This sounds like a thing which might be more convenient with some tooling support. Like you have a (partial) ordering over all .h files, the IDE knows about it, and if you type auto foo = std::make_unique<Bar>() then the IDE automatically inserts #includes for <bar.h> and also all the headers that <bar.h> depends on, in the right order so that everything works out.

      …at which point you’ve invented like half of a proper import system, but oh well.

      1. 4

        …at which point you’ve invented like half of a proper import system, but oh well.

        Maybe? A proper import system is, I think, an unsolved problem for languages with C++/Rust-style monomorphisation. Semantics-wise, Rust’s crate/module system is great (it’s my favorite feature apart from unsafe). But in terms of physical architecture (what the article talks about), it’s not so great.

        • There’s nothing analogous to pimpl/forward declaration, which significantly hamstrings separate compilation. C++ is better here.
        • Although parsing and typechecking of templates happens once, monomorphisation is repeated for every compilation unit, which bloats compile time and binary size in a big way.
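
        To make the pimpl/forward-declaration point concrete, here is a minimal sketch of the idiom in C++. All names are illustrative, and the header and implementation “files” are shown in one listing:

        ```cpp
        // widget.h -- public header: no implementation details leak out,
        // so clients of Widget don't recompile when Impl changes.
        #include <memory>

        class Widget {
        public:
            Widget();
            ~Widget();              // must be defined where Impl is complete
            int value() const;
        private:
            struct Impl;            // forward declaration only
            std::unique_ptr<Impl> impl_;
        };

        // widget.cpp -- only this file needs the heavy definitions.
        struct Widget::Impl {
            int value = 42;
        };

        Widget::Widget() : impl_(std::make_unique<Impl>()) {}
        Widget::~Widget() = default;
        int Widget::value() const { return impl_->value; }

        #include <cassert>
        int main() {
            Widget w;
            assert(w.value() == 42);
        }
        ```

        This is the separate-compilation lever the comment says Rust lacks: the public header exposes only an opaque pointer.
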
        1. 1

          analogous to pimpl/forward declaration

          Box<>‘d opaque types? I’ve seen multiple blog posts mentioning using this for mitigating dependency chains.

          Although parsing and…

          I miss the SPECIALIZE pragma from GHC Haskell. Your generic functions get a slow, fully polymorphic version generated (with an implicitly passed-around dictionary object holding typeclass method pointers), and then you can easily write out a list of SPECIALIZE pragmas to generate monomorphic copies for the specific types whose performance you really care about.

          This feels like it ought to be possible in principle to deduplicate monomorphisations happening in different compilation units with a mutex and a big hash table.
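
          That idea can be sketched in C++, assuming a process-wide cache keyed by mangled symbol name. Everything here is illustrative (a real compiler would store the generated IR, not a flag):

          ```cpp
          #include <mutex>
          #include <string>
          #include <unordered_map>

          // Toy sketch: compilation jobs ask a shared cache before
          // monomorphising "Template<ConcreteType>", so the work and
          // the emitted code happen only once per instantiation.
          class InstantiationCache {
          public:
              // Returns true if the caller should do the work (first requester).
              bool claim(const std::string& mangled_name) {
                  std::lock_guard<std::mutex> lock(mu_);
                  return done_.insert({mangled_name, true}).second;
              }
          private:
              std::mutex mu_;
              std::unordered_map<std::string, bool> done_;
          };

          #include <cassert>
          int main() {
              InstantiationCache cache;
              assert(cache.claim("Vec<int>::push"));   // first CU instantiates
              assert(!cache.claim("Vec<int>::push"));  // later CUs reuse it
              assert(cache.claim("Vec<float>::push")); // different type, new work
          }
          ```
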

          1. 1

            Box<>‘d opaque types? I’ve seen multiple blog posts mentioning using this for mitigating dependency chains.

            I don’t believe there’s a functional analogue to pimpl in Rust, but I’d need to see a specific example to argue why it isn’t.

            What you could do in Rust is introduce dynamic dispatch, but it has significantly different semantics, is rather heavyweight syntactically (it requires introducing single-implementation traits and a separate crate), and only marginally improves compilation time (the CU which “ties the knot” would still need to be recompiled, and you generally want to tie the knot for tests).

        2. 1

          Tooling increasingly supports modules, which require you to do the opposite thing: have a single header for each library, parse it once, serialise the AST, and lazily load the small subset that you need. This composes with additional tooling such as Sony’s ‘compilation database’ work that caches template instantiations and even IR for individual snippets.

          The approach advocated in this article imposes a much larger burden on the programmer and makes it very hard for tooling to improve the situation.

          1. 2

            This reminds me a lot of Robert Dewar’s paper on the GNAT compilation model, https://dl.acm.org/doi/abs/10.1145/197694.197708

            He ditched the traditional Ada library database and instead implemented Ada’s “with” dependency clauses in a manner similar to C’s #include, which made the compiler both simpler and faster.

            1. 1

              Interesting, thanks. I am vastly out of touch with what’s happened in C++ since 1998.

              1. 1

                In 2004 the approach advocated in the article paid off. And the larger burden was not quite enough of an ongoing thing to really hurt.

                Modules would be much nicer if the ecosystem support were there. (I’m kind of thankful not to need to know whether it is… I spend a lot less time with my C++ tooling in 2022 than I did in 2004.)

                And this:

                additional tooling such as Sony’s ‘compilation database’ work that caches template instantiations

                sounds like the stuff dreams are made of.

            1. 1

              I presided over an effort to do this for a large C++ codebase once, in the mid ‘00s. We PIMPL’d everything too.

              Our one exception to the rule was a header that was nothing but a set of macros which were used to forward declare smart pointers and linked lists.

              It cost us a full time intern for 3 months plus about 1/4 of a senior developer over that timeframe to keep it on the rails. It was really worth it. Once we were done, we were able to add Linux, OS X, FreeBSD and Solaris support to a stack that had previously been Windows-only. It was relatively easy to maintain the discipline once we got everything converted.

              The other thing I really wanted but never got for that codebase was pervasive use of precompiled headers. We only ever managed that on Windows. That would have been a tremendous reduction in compile time. I had to maintain gcc 2.95 support for way too long.

              1. 6

                I feel certain I’m missing something… I never cared for Heroku. It always seemed slow, made me think I had to jump through weird hoops, and never seemed to work very well for anything that needed more horsepower than github/gitlab’s “pages” type services. And their pricing always had too much uncertainty for me.

                Granted, I’m old, and I was old before heroku became a thing.

                But ever since bitbucket and github grew webhooks, I lost interest in figuring out Heroku.

                What am I missing? Am I just a grouch, or is there some magical thing I don’t see? Am I the jerk shaking my fist at dropbox, saying an FTP script is really just the same? Or am I CmdrTaco saying “No wireless. Less space than a Nomad. Lame.”? Or is it just lame?

                1. 5

                  By letting (and making) developers care only about the code they develop and nothing else, they empower productivity, because you simply can’t yak-shave or bikeshed your infra or your deployment process.

                  Am I the jerk shaking my fist at dropbox, saying an FTP script is really just the same?

                  Yes and you’d be very late to it.

                  1. 3

                    Yes and you’d be very late to it.

                    That’s what I was referencing :-)

                    I think I’m missing this, though:

                    By letting and making developers only care about the code they develop, not anything else, they empower productivity because you just can’t yak shave nor bikeshed your infra nor deployment process.

                    What was it about Heroku that enabled that in some distinctive way? I think I have that with gitlab pages for my static stuff and with linode for my dynamic stuff. I just push my code, and it deploys. And it’s been that way for a really long time…

                    I’m really not being facetious as I ask what I’m missing. Heroku’s developer experience, for me, has seemed worse than Linode or Digital Ocean. (I remember it being better than Joyent back in the day, but that’s not saying much.)

                    1. 2

                      I just push my code, and it deploys.

                      If you had this set up on your Linode or whatever, it’s probably because someone was inspired by the Heroku development flow and copied it to make it work on Linode. I suppose it’s possible something like this wired into git existed before Heroku, but if so it was pretty obscure given that Heroku is older than GitHub, and most people had never heard of git before GitHub.

                      (disclaimer: former Heroku employee here)

                      1. 3

                        it’s probably because someone

                        me

                        was inspired by the Heroku development flow and copied it to make it work on Linode

                        Only very indirectly, if so. I never had much exposure to Heroku, so I didn’t directly copy it. But push->deploy seemed like good horse sense to me. I started it with mercurial and only “made it so” with git about 4 years ago.

                        Since you’re a former Heroku employee, though… what did you see as your distinctive advantage? Was it just the binding between a release in source control and a deployment into production, or was it something else?

                        1. 3

                          Since you’re a former Heroku employee, though… what did you see as your distinctive advantage? Was it just the binding between a release in source control and a deployment into production, or was it something else?

                          As a frequent customer, it was just kind of the predictability. At any point within the last decade or so, it was about three steps to go from a working rails app locally to a working public rails app on heroku. Create the app, push, migrate the auto-provisioned Postgres. Need to backup your database? Two commands (capture & download). Need Redis? Click some buttons or one command. For a very significant subset of Rails apps even today it’s just that few steps.

                          1. 1

                            I don’t really know anything about the setup you’re referring to, so I can only compare it to what I personally had used prior to Heroku from 2004 to 2008, which was absolutely miserable. For the most part everything I deployed was completely manually provisioned; the closest to working automated deploys I ever got was using capistrano, which constantly broke.

                            Without knowing more about the timeline of the system you’re referring to, I have a strong suspicion it was indirectly inspired by Heroku. It seems obvious in retrospect, but as far as I know in 2008 the only extant push->deploy pipelines were very clunky and fragile buildbot installs that took days or weeks to set up.

                            The whole idea that a single VCS revision should correspond 1:1 with an immutable deployment artifact was probably the most fundamental breakthrough, but nearly everything in https://www.12factor.net/ was first introduced to me via learning about it while deploying to Heroku. (The sole exception being the bit about the process model of concurrency, which is absolutely not a good general principle and only makes sense in the context of certain scripting-language runtimes.)

                            1. 2

                              I was building out what we were using 2011-2013ish. So it seems likely that I was being influenced by people who knew Heroku even though it wasn’t really on my radar.

                              For us, it was an outgrowth of migrating from svn to hg. Prior to that, we had automated builds using tinderbox, but our stuff only got “deployed” by someone running an installer, and there were no internet-facing instances of our software.

                      2. 2

                        By letting and making developers only care about the code they develop

                        This was exactly why I never really liked the idea of it, even though the tech powering it always sounded really interesting. I think it’s important to have contextual and environmental understanding of what you’re doing, whatever that may be. Although I don’t like some of the architectural excesses or cultural elements of “DevOps”, I think having people know enough about what’s under the hood/behind the curtain to be aware of the operational implications of what they’re doing is crucial to building efficient systems that don’t throw away resources simply because the developer doesn’t care (and has been encouraged not to care) about anything but the code.

                        I’ve seen plenty of developers do exactly that, not even bothering to try to optimise poorly-performing systems because “let’s just throw another/bigger dyno at it, look how easy it is”, justifying it aggressively with Lean Startup quotes, apparently ignorant that the flip-side of “developer productivity at all costs” is “cloud providers influencing ‘culture’ to maximize their profits at the expense of the environment”. And I’ve seen it more on teams using Heroku than anywhere else, because of the opaque and low-granularity “dyno” resource division.

                        It could be that you can granularize it much more now than you could a few years ago (I haven’t looked at it for a while), and maybe even that you could then if you dug really deep into the documentation. But that was how it was, and how developers used (and were encouraged to use) it. To me it always seemed like the inability to squeeze every last drop of performance out of each unit was almost a design feature.

                    1. 5

                      A bit disappointing that this uses GoogleAds. Not sure what ads add to this website.

                      1. 1

                        Sorry, I have Ka-Block! and I didn’t notice any ads on the site.

                        1. 1

                          Well I didn’t either, I looked at the JS loaded to show these pages (with umatrix) and saw that googleads was included. :) I can understand the usage of some kind of tracking to improve the website (which it might be used for), but not sure overall of the point of googleAds here.

                          1. 4

                            but not sure overall of the point of googleAds here.

                            It’s similar to the point of ads on most pages: the people who publish the page are hoping to make some money.

                            1. 1

                              Indeed, although on these kinds of sites there usually aren’t any ads, so I was surprised.

                      1. 4

                        “As a user, you can force allow zooming”

                        Isn’t this problem solved, then?

                        1. 21

                          No. Just because there’s an option to enable it, that doesn’t mean disabling it should be encouraged. Not everyone knows about the option, for one thing.

                          1. 10

                            You’ve identified a web browser UI design problem, which can be solved by the probably-less-than-double-digits number of teams developing popular web browsers, rather than by asking tens of millions of web content creators to change their behavior.

                            1. 5

                              Perhaps browser makers can treat it like a potentially undesirable thing. Similar to “(site) wants to know your location. Allow/Block” or “(site) tried to open a pop up. [Open it]”

                              So: “(site) is trying to disable zooming. [Agree to Disable] [Never agree]” or similar.

                            2. 8

                              I think the better question is why can you disable this in the first place. It shouldn’t be possible to disable accessibility features, as website authors have time and time again proven to make the wrong decisions when given such capabilities.

                              1. 3

                                I mean, what’s an accessibility feature? Everything, roughly, is an accessibility feature for someone. CSS lets you set a font for your document. People with dyslexia may prefer to use a system font that is set as Dyslexie. Should it not be ok to provide a stylesheet that will override system preferences (unless the proper settings are chosen on the client)?

                                1. 3

                                  Slippery slope fallacies aren’t really productive. There’s a pretty clear definition of the usual accessibility features, such as being able to zoom in or meta data to aid screen readers. Developers should only be able to aid such features, not outright disable them.

                                  1. 6

                                    I think this is a misunderstanding of what “accessibility” means. It’s not about making things usable for a specific set of abilities and disabilities. It’s about making things usable for ALL users. Color, font, size, audio or visual modality, language, whatever. It’s all accessibility.

                                  2. 1

                                    https://xkcd.com/1172/

                                    (That said, I don’t understand why browsers let sites disable zoom at all.)

                                2. 6

                                  Hi. Partially blind user here - I, for one, can’t figure out how to do this in Safari on IOS.

                                  1. 3

                                    “Based on some quick tests by me and friendly people on Twitter, Safari seems to ignore maximum-scale=1 and user-scalable=no, which is great”

                                    I think what the author is asking for is already accomplished on Safari. If it isn’t, then the author has not made a clear ask to the millions of people they are speaking to.

                                    1. 4

                                      I am a web dev dilettante / newbie, so I will take your word for it. I just know that with my crazy-pants busted eyes, more and more web pages are becoming nearly impossible to view on mobile, or wildly difficult enough to be equivalent to impossible in any case :)

                                      1. 3

                                        And that is a huge accessibility problem. This zoom setting is a huge accessibility problem.

                                        My point is that the solution to this accessibility problem (and almost all accessibility problems) is to make the browser ignore this setting, not to ask tens of millions of fallible humans to update literally trillions of web pages.

                                        1. 3

                                          As another partially blind person, I fully agree with you. Expecting millions of developers and designers to be fully responsible for accessibility is just unrealistic; the platforms and development tools should be doing more to automatically take care of this. Maybe if the web wasn’t such a “wild west” environment where lots of developers roll their own implementations of things that should be standardized, then this wouldn’t be such a problem.

                                          1. 2

                                            Developers and designers do have to be responsible for accessibility. I’m not suggesting that we aren’t.

                                            But very often, the accessibility ask is either “Hey, Millions of people, don’t do this” or “Hey, three people, let me ignore it when millions of people do this”. And you’re much better off lobbying the three people that control the web browsers to either always, or via setting, ignore the problem.

                                            1. 1

                                              Agreed. Front end development is only 50% coding. The rest is design, encompassing UX, encompassing human factors, encompassing accessibility. You can’t apply an “I’m just a programmer” or “works on my machine” mindset when your code is running on someone else’s computer.

                                  1. 5

                                    I think the only way for something like this to get traction is to use a different header. Stuffing it in the existing User-Agent header instead of the current mess is impractical until you get a critical mass of servers to start replacing their old, pathological uses of User-Agent with uses of the URI. And many of the servers that are leaning on the old header are just the ones that won’t be making many changes.

                                    Using a new header has potential, though. If you could get all the browsers to start emitting it, once it became common you could start turning off the old one and filing bugs with sites that break. It’d still take ages before most everyday browsing could leave the old UA header turned off. But you might get there that way.

                                    1. 2

                                      Changing the header name is the easiest thing and I do not say no… But maybe it would end up like dual-stack IPv4 + IPv6, where transition to “IPv6-only” is quite rare.

                                      1. 2

                                        I came here to suggest this, too. There’s just too much legacy code out there that does UA detection, esp. ancient corporate and government websites. user-agent-uri or agent-uri or uri-user-agent would make it far more actionable.

                                        Maybe there’s an opportunity for some server action, too. Perhaps a server can respond to an initial request, or perhaps a .well-known URL, with something like user-agent-header containing user-agent or b/v for old style or user-agent-uri or uri for the new, URI way.

                                        It’d still take ages before most everyday browsing could leave the old UA header turned off.

                                        I’ll call it 15 years.

                                        1. 2

                                          It’d still take ages before most everyday browsing could leave the old UA header turned off.

                                          I’ll call it 15 years.

                                          Yeah. That might even be conservative. The first code that I personally wrote with IPv6 support, because everyone would definitely soon be using IPv6, is 24 years old now.

                                          That’s a big part of what makes me say a new header would help in this case, though. If we have to re-use the old header, ossification will keep most everyone from doing it. @franta: If we can use a new header, and this new header provides (as detailed in the proposal) better information than the old one did, you will get people using the new one if the browsers ship it. Because it’s more useful for them.

                                          And that gives you a chance to clear the low bar set by IPv6.

                                          1. 2

                                            The first code that I personally wrote with IPv6 support, because everyone would definitely soon be using IPv6, is 24 years old now.

                                            This is so depressing; what a frustrating failure IPv6 promotion has been. There seems to have been a big jump recently: this article from March 2022 shows Google says 33.96% while the current Google IPv6 adoption stat as of May 1, 2022 says 39.46%. It looks like there’s a 4-5% of total regular fluctuation, though, and we’re at a peak that is also an all-time high.

                                          2. 1

                                            What about this fallback logic?

                                            1. Check the User-Agent-URI header and if present, use its value and ignore User-Agent header.
                                            2. Check the User-Agent header and if it is valid URI, use it according to the User-Agent URI specification.
                                            3. Read the User-Agent header the old way (try to parse the mess).

                                            So brave browsers/users could put URI in the User-Agent header immediately, while conservative browsers/users could put URI in the new User-Agent-URI header and put the original mess in the User-Agent as everyone is used to.
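
                                            That fallback could be sketched in C++ roughly like this (the header values are passed in as strings, and is_uri is a hypothetical stand-in for real URI validation):

                                            ```cpp
                                            #include <string>

                                            // Stand-in for real URI validation.
                                            bool is_uri(const std::string& s) {
                                                return s.rfind("https://", 0) == 0 || s.rfind("http://", 0) == 0;
                                            }

                                            // Steps 1-3 of the proposed fallback, in order.
                                            std::string pick_agent(const std::string& ua_uri, const std::string& ua) {
                                                if (!ua_uri.empty()) return ua_uri;  // 1. new header wins
                                                if (is_uri(ua))      return ua;      // 2. brave: URI in old header
                                                return ua;                           // 3. parse the old mess
                                            }

                                            #include <cassert>
                                            int main() {
                                                assert(pick_agent("https://browser.example/1.0", "Mozilla/5.0 ...")
                                                       == "https://browser.example/1.0");
                                                assert(pick_agent("", "https://browser.example/1.0")
                                                       == "https://browser.example/1.0");
                                                assert(pick_agent("", "Mozilla/5.0 ...") == "Mozilla/5.0 ...");
                                            }
                                            ```
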

                                            1. 1

                                              I missed this reply before… I think this is a reasonable way to set up a transition that could work.

                                        1. 7

                                          A modern C++ CI system will also catch a lot of these, so using Go as the baseline feels a bit mean:

                                          • Resource leaks: RAII for all resources, warn on explicit mutex lock rather than things like std::unique_lock / std::lock_guard (one is lexically scoped, the other allows the lock ownership to be transferred). clang-tidy and friends can warn about many of these.
                                          • Unreleased mutexes are a special case of RAII
                                          • Missing switch cases are compiler warnings.
                                          • Option types. Rust is definitely better than C++ here. You can use std::optional and, I think, most implementations special case pointers so that they compile down to using nullptr as the not-present representation, but require explicit checks for access (though you’re dependent on something like the clang static analyser to notice things where you’re using the direct accessor without checking for validity first, rather than the value_or accessor). This is very much ‘you won’t see this in C++ code written with a good style guide’ vs ‘you can’t express this at all in Rust’.
                                          • Uninitialised local variables will warn but uninitialised struct fields are particularly painful in C++ and there are lots of ways of writing them that look correct but aren’t.
                                          • Unhandled explicit errors are not too much of a problem ([[nodiscard]] requires you to do something with the return, LLVM has some error templates that abort if you don’t check the error even on non-error results, though catching this at compile time is nicer than failing in the first test that uses the functionality). Unchecked exceptions are a disaster and are a big part of the reason why I prefer -fno-exceptions for C++ code. Win for Rust here.
                                          • Data races is a fun one because Rust’s Sync trait can be implemented in safe Rust only for immutable objects. Any shared mutable state requires unsafe, including standard-library types such as Arc. This, in turn, means that you’re reliant on this unsafe code being correct when composed with all other unsafe code in the same program. Still a nicer place to be than C++ though.
                                          • I’m not sure I understand the Hidden Streams argument so I can’t comment on it.
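
                                          As a sketch of the std::optional point (illustrative names, not from any particular codebase):

                                          ```cpp
                                          #include <optional>
                                          #include <string>

                                          // An explicit "maybe" instead of a sentinel; callers are nudged
                                          // (but not forced) to check before access.
                                          std::optional<std::string> find_user(int id) {
                                              if (id == 1) return "alice";
                                              return std::nullopt;   // absent, distinct from an empty string
                                          }

                                          #include <cassert>
                                          int main() {
                                              auto hit = find_user(1);
                                              assert(hit.has_value() && *hit == "alice");

                                              auto miss = find_user(2);
                                              assert(!miss.has_value());
                                              // value_or makes the fallback explicit at the call site:
                                              assert(miss.value_or("anonymous") == "anonymous");
                                              // *miss here would be undefined behaviour -- the gap where only
                                              // a static analyser (not the compiler) catches the misuse.
                                          }
                                          ```
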

                                          Rust definitely has some wins relative to modern C++. When we looked at this, our conclusion was:

                                          • Rust and modern C++ with static analysis required for CI prevent very similar levels of bugs.
                                          • C++ is often better in places where you’re doing intrinsically unsafe things (e.g. a memory allocator) because the Rust analysis tooling is much better for the safe subset of the language than the unsafe subset, whereas all C++ analysis tools are targeted at unsafe things.
                                          • Preventing developers from committing code that doesn’t compile to a repo is orders of magnitude easier than preventing them from deciding that they know better than a static analyser and marking real bugs as analyser false positives.

                                          The last one of these is really critical. C++ code can avoid these bugs, Rust code must avoid them unless you use unsafe. We assumed that avoiding unsafe was something code review would easily handle, though since I’ve learned that Facebook has millions of lines of unsafe Rust code I’m now far less confident in that claim.

                                          1. 2

                                            The last one of these is really critical. C++ code can avoid these bugs, Rust code must avoid them unless you use unsafe. We assumed that avoiding unsafe was something code review would easily handle, though since I’ve learned that Facebook has millions of lines of unsafe Rust code I’m now far less confident in that claim.

                                            This is what makes -Wall -Werror an attractive nuisance of sorts. Unfortunately, the projects I’ve worked on that would benefit most from those also had enough spurious warnings triggered by headers for dependencies that it was never practical to leave them enabled.

                                            1. 1

                                              Tip: use -isystem for dependency include paths. It tells the compiler those headers aren’t your code, so warnings from them are suppressed even under -Wall -Werror.

                                            2. 2

                                              You can use std::optional and, I think, most implementations special case pointers so that they compile down to using nullptr as the not-present representation.

                                              You mean references, not pointers, right? If this were done for pointers, then the “present NULL” and “absent” states would be indistinguishable.

                                              1. 2

                                                You mean references, not pointers, right?

                                                I meant pointers, but you’ve made me realise that I’m probably wrong. std::optional is not defined for T&.

                                                If this were done for pointers, then the “present NULL” and “absent” states would be indistinguishable.

                                                I had always assumed that std::optional<T*> x{nullptr} would give a not-present value but a quick test suggests that this is not the case. I’ve learned something today!
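
                                                The quick test can be reproduced in a few lines:

                                                ```cpp
                                                #include <cassert>
                                                #include <optional>

                                                int main() {
                                                    int x = 7;
                                                    std::optional<int*> present{&x};
                                                    std::optional<int*> null_but_present{nullptr};
                                                    std::optional<int*> absent{std::nullopt};

                                                    assert(present.has_value() && **present == 7);
                                                    // A held nullptr still counts as "present" -- the
                                                    // engaged/absent distinction is separate from the
                                                    // pointer's own nullness...
                                                    assert(null_but_present.has_value());
                                                    assert(*null_but_present == nullptr);
                                                    // ...so an implementation can't reuse nullptr as the
                                                    // absent state the way Rust's Option<&T> does.
                                                    assert(!absent.has_value());
                                                }
                                                ```
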

                                              2. 1

                                                Is the null-pointer optimization for optional pointers even legal in C++? Since there’s no notion of non-nullable pointers, it seems you can’t make the optimization safely: you have to be able to distinguish between “no value” and a present nullptr.

                                                Also, the thing with Rust is that when you’re implementing data structures, which is something Facebook will do a lot throughout their codebase, you often do need unsafe. The point is that you can wrap those unsafe blocks in safer abstractions, and then the calling code needs less review. That’s not to say unsound code never pops up, but having unsafe operations constrained to specific places is still really helpful.

                                              1. 3

                                                OK, so proposing to hold the PL implementation developers responsible for use-after-free bugs is so not cool.

                                                And I’m very sympathetic to experimentation in PL in general. Who doesn’t like more toys (ideas) and the chance to see how they play (fit) together?

                                                But I think we (as an industry) should not take very seriously any PL that doesn’t try to push the status quo with regards to: correctness, safety, speed, expressiveness and such. And we all need to seriously consider this when choosing to implement a library or application in a particular PL. If this component is going to be used by millions, then it is our moral obligation to choose one of the best tools for the job… and not a toy.

                                                To repeat: Toys are great! But serious work requires serious tools. If you aren’t using serious tools for your own work… why not?

                                                1. 5

                                                  OK, so proposing to hold the PL implementation developers responsible for use-after-free bugs is so not cool.

                                                  That interpretation is on Drew. If you view the entire thread where the comment is included, you’ll see the word “liable” is used in a moral, not legal, sense.

                                                  Here’s the article: https://lwn.net/Articles/893285/

                                                  Here’s the comment by mjg59: https://lwn.net/Articles/893346/

                                                  I think we’re at the point in history where anyone who writes a compiler that permits use-after-free should be held liable for anyone who manages to fuck up as a result of that.

                                                  A bit further along, Drew has convinced himself he’s gonna be charged for criminal negligence: https://lwn.net/Articles/893510/

                                                  1. 3

                                                    FWIW, while I read mjg59’s comment as hyperbole, it read like more of a legal sense than a moral one to me. It was the sentence after the one you quoted that steered me that way:

                                                    Security issues aren’t a matter of inconvenience now - we’re reached a level of pervasive tech that results in people literally dying as a result of memory unsafety.

                                                    Saying someone should be “held liable” for “people literally dying” sounds legal to me.

                                                    Your read sounds reasonable to me, but it was not where I landed at first when I read the comment about “people literally dying.”

                                                1. 7

                                                  As I recall, CGI was present very early on, definitely by 1995, and early websites definitely made use of it — obviously for form submission, but it was also sometimes used for serving pages.

                                                  There were also early servers, like Netscape’s, that ran their own custom server-side app code — I don’t know for sure but I suspect they had their own C-level plugin system for running handlers in-process to avoid the high overhead of CGI.

                                                  I’m still wondering why only PHP became available as an easy in-process scripting language. It’s not like you couldn’t build a similar system based on Python or Ruby or JS. Maybe it was the ubiquity of Apache, and the Apache developers not wanting to add another interpreter when “we already have PHP?”

                                                  1. 14

                                                    As mentioned in the article, there were other Apache modules providing similar functionality, such as mod_python. There were also CGI approaches with the same template-forward bent, such as Mason (which was Perl). If anyone was saying “why support another since we already have PHP?”, it was admins on shared hosting services. Each additional module was yet another security threat vector and another customer-service training burden.

                                                    1. 6

                                                      I was at a talk given by Rasmus Lerdorf (creator of PHP) once and he claimed it was because the PHP implementation was the most basic, limited version possible, and therefore it was very simple to isolate different users from each other. This made PHP very popular with cheap shared hosters. The Perl implementation, by contrast, was much more thorough and hooked (not sure what the correct term is) into the whole of Apache, and therefore it needed a dedicated server. Much more expensive.

                                                      1. 2

                                                        Yeah. Even though mod_php is a single module loaded into a single Apache instance, it was designed with some sandboxing options like safe_mode. Or you could use PHP CGI and isolate things even better (running as the user’s UID).

                                                        Other language hosting modules for Apache like mod_perl didn’t offer the same semantics. I also recall mod_perl being pretty oriented towards having access to the web server’s configuration file to set it up. People did use Perl before the rise of PHP, but most often via CGI (remember iKonboard?)

                                                        1. 3

                                                          mod_perl was more oriented toward exposing the apache extension API so that you could build apache modules in perl, as I remember it. It got used to write some cool web applications (slashcode springs to mind) that’d have been hard to write (at that scale) any other way at the time. But mod_php was a very different beast, just aiming to be a quick way to get PHP to work well without the overhead of CGI.

                                                          I agree with the article… there’s nothing now (other than PHP, which I still use now for the kind of pages you mention, the same way I did in the early ‘00s) that’s nearly as low-friction as PHP was back then to just add a couple of dynamic elements to your static pages.

                                                          1. 2

                                                            Yeah, I was at a small web hosting company in the late ’90s, early 2000s, and we used PHP CGI with our shared hosting.

                                                      2. 10

                                                        It’s not like you couldn’t build a similar system based on Python or Ruby or JS.

                                                        Not quite. The article touches on this, although not explicitly; you have to read a bit between the lines.

                                                        PHP allowed you to jump in and out of static and dynamic contexts like no other alternative. It still does this better than anything else. This was in the core of the language, with no need to install third-party libraries. It also included a MySQL client library in its core that worked out of the box. Essentially, it shipped with everything necessary for the typical setup. No need to fiddle with server setup.

                                                        The language was also arguably more approachable for beginners than Perl, with a multitude of simple data structures easily accessible through the infamous array() constructor. It also retained familiarity for C programmers, who were a big audience back then, while Python, for example, didn’t.

                                                        One thing I don’t agree with is the simplicity, or the deployment model. It’s only simple in the context of the old shared hosting reality. If you include setting up the server yourself, like we do nowadays, it is actually more cumbersome than a language that just allows you to fire up a socket listening on port 80 and serve text responses.

                                                        It’s how it was marketed and packaged that made all the difference.

                                                        1. 9

                                                          Yes, but it was “better” in the sense of “making it easy to do things that are ultimately a lousy idea”. It’s a bit better now, but I used it back then and I remember what it was like.

                                                          Convenience feature: register_globals was on by default. No thinking about nasty arrays, your query params are just variables. Too bad it let anyone destroy the security of all but the most defensively coded apps using nothing more than the address bar.

                                                          Convenience feature: MySQL client out of the box. Arguably the biggest contributor to MySQL’s success. Too bad it was a clumsy direct port of the C API that made it far easier to write insecure code than secure. A halfway decent DB layer came much, much later.

                                                          Convenience feature: fopen makes URLs look just like files. Free DoS amplification!

                                                          Convenience feature: “template-forward”, aka “my pages are full of business logic, my functions are full of echo, and if I move anything around all the HTML breaks”. Well, I guess you weren’t going to be doing much refactoring in the first place but now you’ve got another reason not to.

                                                          The deployment story was the thing back then. The idea that you signed up with your provider, you FTP’d a couple files to the server, and… look ma, I’m on the internet! No configuration, no restarting, no addr.sin_port = htons(80). It was the “serverless” of its day.

                                                          1. 21

                                                            Yes, but it was “better” in the sense of “making it easy to do things that are ultimately a lousy idea”. It’s a bit better now, but I used it back then and I remember what it was like.

                                                            It was better, in the sense of democratizing web development. I wouldn’t be here, a couple decades later, if not for PHP making it easy when I was starting out. The fact that we can critique what beginners produced with it, or the lack of grand unified design behind it, does not diminish that fact. PHP was the Geocities of dynamic web apps, and the fact that people now recognize how important and influential Geocities was in making “play around with building a web site” easy should naturally lead into recognizing how important and influential PHP was in making “play around with building a dynamic web app” easy.

                                                            1. 3

                                                              Author here, I couldn’t have put it better. “PHP was the Geocities of dynamic web apps” — this is a brilliant way to put it. In fact I’m now peeved I didn’t think of putting it like this in the article. I’m stealing this phrase for future use. :)

                                                            2. 2

                                                              Absolutely. And indeed, I saw those things totally widespread to their full extent in plenty of code bases. To add a bit of [dark] humor to the conversation, I even witnessed code that would use PHP’s templating capabilities to assemble PHP code that was fed to eval() on demand.

                                                              But I am really not sure you can do anything about bad programmers. No matter how much safety you put in place. It’s a similar situation with C, with people complaining about all the footguns.

                                                              Can you really blame a language for people doing things like throwing a string in an SQL query without escaping it? Or a number without asserting its type? I really don’t have a clear opinion here. Such things are really stupid. I’m not sure it is very productive to design technology driven by a constant mitigation of such things.

                                                              EDIT: re-reading your post. So much nostalgia. The crazy things that we had. Makes me giggle. Register globals or magic quotes were indeed… punk, for lack of a better word. Ubernostrum put it really well in a sister comment.

                                                              1. 4

                                                                But I am really not sure you can do anything about bad programmers. No matter how much safety you put in place. […] Can you really blame a language for people doing things like throwing a string in an SQL query without escaping it?

                                                                Since you mention magic quotes … there’s a terrible feature that could have been a good feature! There are systems that make good use of types and knowledge of the target language to do auto-escaping with reasonable usability and static guarantees, where just dropping the thing into the query does the secure thing 98% of the time and throws an “I couldn’t figure this out, please hint me or use a lower-level function” compile error the other 2%. PHP could have given developers that. Instead it gave developers an automatic data destroyer masquerading as a security feature, again, enabled by default. That’s the kind of thing that pisses me off.

                                                            3. 3

                                                              I definitely had a lot of fun making mildly dynamic websites in PHP as a teen, but I wouldn’t want to get back to that model.

                                                              They might have a style selector at the top of each page, causing a cookie to be set, and the server to serve a different stylesheet on every subsequent page load. Perhaps there is a random quote of the day at the bottom of each payload.

                                                              JS in modern browsers allows that kind of dynamicity very nicely, and it’s easy to make it degrade gracefully to just a static page. It will even continue to work if you save the page to your own computer. :)

                                                            4. 6

                                                              I’m still wondering why only PHP became available as an easy in-process scripting language. It’s not like you couldn’t build a similar system based on Python or Ruby or JS. Maybe it was the ubiquity of Apache, and the Apache developers not wanting to add another interpreter when “we already have PHP?”

                                                              I am someone who is, these days, primarily known for doing Python stuff. But back in the early 2000s I did everything I could in PHP and only dabbled in Perl a bit because I had some regular business from clients who were using it.

                                                              And I can say beyond doubt that PHP won, in that era, because of the ease it offered. Ease of writing — just mix little bits of logic in your HTML! — and ease of deployment via mod_php, which for the developer was far easier than messing around with CGI or CGI-ish-but-resident things people were messing with back then. There are other commenters in this thread who disagree because they don’t like the results that came of making things so easy (especially for beginning programmers who didn’t yet know “the right way” to organize code, etc.) or don’t like the way PHP sort of organically grew from its roots as one guy’s pile of helper scripts, but none of that invalidates the ease PHP offered back then or the eagerness of many people, myself included, to enjoy that easiness.

                                                              1. 4

                                                                mod_php was always externally developed from Apache and lived in PHP’s source tree.

                                                                1. 3

                                                                  The other options did exist. There were mod_perl and mod_python for in-process (JS wasn’t really a sensible server-side option at the time we’re talking about), mod_fastcgi and mod_lisp for better-than-CGI out-of-process (akin to uwsgi today), and various specialized mod_whatevers (like virgule) used by individual projects or companies. mod_perl probably ran a sizeable fraction of the commercial web at one point. But they didn’t take PHP’s niche, for various reasons, largely because they weren’t trying to.

                                                                  1. 2

                                                                    There was also the AOL webserver, which was scriptable with TCL. It looks like this was around in the early nineties, but perhaps it wasn’t open sourced yet at that point? That would definitely make it harder to gain momentum. Of course TCL was also a bit of an odd language. PHP still had the benefit of being a seamless “upgrade” from HTML - just add some logic here and there to your existing HTML files. That’s such a nice transition for people who never programmed before (and hell, even for people who had programmed before!).

                                                                    Later on, when Ruby on Rails became prominent (ca 2006), it was still not “easy” to run it. It could run with CGI, but that was way too slow. So you basically had to use FastCGI, but that was a bit of a pain to set up. Then, a company named Phusion released mod_passenger, which supposedly made running Ruby (and later, other languages like Python) as easy as mod_php. The company I worked for never ran it because we were already using FastCGI with lighttpd and didn’t want to go back to Apache with its baroque XML-like config syntax.

                                                                    1. 2

                                                                      I worked at a shared hosting company at the time of the PHP boom. It all boiled down to the safe mode. No other popular competitor (Perl/Python) had it.

                                                                      Looking back, it would have been fairly cheap to create a decent language for the back-end development that would have worked way better. PHP language developers were notoriously inept at the time. Everyone competent was busy using C, Java, Python and/or sneering at the PHP crowd, though.

                                                                      1. 1

                                                                        It’s not like you couldn’t build a similar system based on Python or Ruby or JS.

                                                                        There’s ERuby which was exactly this. But by then PHP was entrenched.

                                                                        I did a side project recently in ERuby and it was a pleasure to return to it after >10 years away.

                                                                      1. 4

                                                                        I’d be curious to know @pzuraq’s view of things like htmx, hotwire and alpine. (I’m not sure it’s fair to lump alpine in with the other two, but it’s useful with them.) Wherein you deploy a little bit of javascript to avoid authoring much of your application in javascript while still ending up with reactivity similar to vue/svelte/react/angular/etc. From where I sit, at least some, if not the majority of, interesting development feels like it’s moving that way.

                                                                        1. 4

                                                                          I think they’re interesting, for sure! I did include Astro in the full-stack side of things, I think it’s similar to what you describe. My experience overall is that these frameworks/techniques still have a lot of issues (look at GitHub’s issues with caching/state management for instance) and can be pretty tricky to use, but can work well for many types of use cases. If you’re optimizing for initial page load over everything else, they are probably a good bet.

                                                                          However, it’s hard to see how they would handle more complex use cases, e.g. Google docs or AirBnB style apps. And what excites me about the full-stack frameworks era is we seem to be getting close to the point where we won’t have to make that tradeoff anymore, between a JS-lite but ultimately inflexible solution, or a full-JS but very heavy solution. It seems like we’re approaching a sweet spot with tooling and techniques that allows us to use a one-size-fits-all solution instead, possibly. Maybe there will always be a niche for needs-to-be-as-fast-as-possible style frameworks, but for maybe 90% of apps I don’t think that is necessarily the best thing to optimize for.

                                                                          1. 1

                                                                            Astro didn’t jump out at me. That’s not one I’ve looked at so far.

                                                                            I don’t think I see hotwire/htmx as optimizing for needs-to-be-as-fast-as-possible. I see them as optimizing for development velocity. With, e.g., (rails and hotwire) or (django and htmx) you can crank out your first implementation with a very small team quickly but still have an application that “feels” the way users expect it to on the modern web. I feel like I’ve done things solo that way that would’ve needed a team of 3 if I needed to manage real frontend/backend separation.

                                                                            I agree that you couldn’t write google docs using these. I’m not so sure about AirBNB :). I think the improvement, especially in early-stage development velocity, might be pushing a swing back towards rails/django/laravel/rocket + htmx/hotwire/alpine for the 90% of apps that are not google docs. I’m not sure if it’s just a blip or if this is another era for your classification.

                                                                        1. 16

                                                                          Most of the embedded graphs don’t work since apparently the author needs to pay for a plotly subscription.

                                                                          But the Way Back Machine comes to the rescue! You can see them here: https://web.archive.org/web/20170606192003/https://input.club/the-problem-with-mechanical-switch-reviews/

                                                                          sidenote: While I hate the dead embeds being held hostage for money, I do love that the broken graphs return a 402 error. This is the only case I’ve actually seen a 402 response properly used! That HTTP status literally means “payment required”.

                                                                          1. 8

                                                                            Also, “Plot twist!” is a hilarious headline for an error page on “plotly”.

                                                                            1. 3

                                                                              I hope that once you do pay them, you get a message saying The plotly thickens.

                                                                            2. 4

                                                                              That’s crazy, they worked when I submitted it - I must have been one of the last, lucky few…

                                                                              1. 1

                                                                                sidenote: While I hate the dead embeds being held hostage for money, I do love that the broken graphs return a 402 error. This is the only case I’ve actually seen a 402 response properly used! That HTTP status literally means “payment required”.

                                                                                I love the use of the status code. And it’s the only time I’ve seen it properly used as well. Also, I was just looking at using the plotly python library to show some fairly mundane graphs. The documentation makes it look like there’s no chance that my usage could trigger a payment requirement, but now I’ll need to give that some extra scrutiny. I’d be pretty irate if that happened to me.

                                                                              1. 80

                                                                                Please take it easy in the comments. Yes, the primary author of the language is famous for posting incendiary, attention-grabbing rants. If you respond to this link with vitriol, even if you’re explaining how you dislike parts of this project’s design, you reward trolling and diminish our community.

                                                                                Remember there are always a few thousand readers for every commenter. If you’re disagreeing with someone, you’re more persuasive to readers when you make your points and let them judge than when you descend into bickering.

                                                                                1. 33

                                                                                  Yes, the primary author of the language is famous for posting incendiary, attention-grabbing rants.

                                                                                  Rants which, I might note, he apologized for, and is doing a lot less of as well.

                                                                                  1. 17

                                                                                    Do you have a link to the apology you’re talking about? I’ve steered clear of his projects for a while now because of his toxicity, but if he’s truly making an effort to change that, I may be willing to reconsider.

                                                                                    1. 16

                                                                                      (Disclaimer: I don’t know Drew personally. I have found several of his public comments abrasive. I also use and like sr.ht quite a bit.)

                                                                                      Just because I couldn’t remember when that toxic series of posts went up either, and couldn’t remember when his (apology or at least apology-adjacent) conciliatory comments went up, I used search engines to find the words “fuck” and “asshole” on his site. Those turned up both the earlier vitriolic posts and his statement of his ambitions to no longer make such posts. And they showed that he has largely avoided those two words in the past year, as far as the search engines are concerned. The posts where he did use them don’t direct them pointedly at people, IMO.

                                                                                      Sample size is small, obviously, but it looks like the stated determination to be less toxic has been accompanied by progress in that direction.

                                                                                      1. 13
                                                                                        1. 3

                                                                                          Thanks! I wish it was a little more prominent, but I’m glad someone was able to find it.

                                                                                        2. 4

                                                                                          I went looking for it, and… both the apology, as well as the angry rants which prompted him to apologize, have been deleted from his blog. (Unless I’m misremembering which status update the apology was written in the first place.)

                                                                                          1. 7

                                                                                            Thanks for looking! I’m assuming it was his Wayland rant, since I can’t find a link to it from his blog index, but I can confirm the page still exists. I even wrote a pretty large response to it on this site because it was in very poor taste. I’m avoiding linking to it directly because he seems to have taken it down on purpose, and I’m happy to respect his decision.

                                                                                            I appreciate that he’s taking steps in the right direction. I don’t think I’m quite ready to look past his behavior, but I’m still willing to reconsider if he continues in this direction. It’s a very good sign.

                                                                                            Less vitriol in the open source community and software in general is definitely a good thing.

                                                                                    1. 11

                                                                                      As someone who is rather new to languages like C (I only recently got into it by making a game with it), I have a few newbie questions:

                                                      • Why do people want to replace C? Is it for security reasons, or is C just old and outdated?

                                                      • What does Hare offer over C? They say that Hare is simpler than C, but I don’t understand exactly how. Same with Zig. Do they compile to C in the end, with these languages just making it easier for the user to write code?

                                                                                      That being said, I find it cool to see these languages popping up.

                                                                                      1. 33

                                                                                        Why do people want to replace C? Security reasons, or just old and outdated?

                                                                                        • #include <foo.h> includes all functions/constants into the current namespace, so you have no idea what module a function came from
                                                                                        • C’s macro system is very, very error prone and very easily abused, since it’s basically a glorified search-and-replace system that has no way to warn you of mistakes.
                                                                                        • There are no methods for structs, you basically create struct Foo and then have to name all the methods of that struct foo_do_stuff (instead of doing foo_var.do_stuff() like in other languages)
                                                                                        • C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).
                                                                                        • C’s standard library is really tiny, so you end up creating your own, which you carry around from project to project.
                                                                                        • C’s standard library isn’t really standard, a lot of stuff isn’t consistent across OS’s. (I have agreeable memories of that time I tried to get a simple 3kloc project from Linux running on Windows. The amount of hoops you have to jump through, tearing out functions that are Linux-only and replacing them with an ifdef mess to call Windows-only functions if you’re on compiling on Windows and the Linux versions otherwise…)
                                                                                        • C’s error handling is completely nonexistent. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (one for each possible returned error), and once you do that, the actual return value has to come back through a pointer argument.
                                                                                        • C has no anonymous functions. (Whether this matters really depends on your coding style.)
                                                                                        • Manual memory management without defer is a PITA and error-prone.
                                                                                        • Weird integer type system. long long, int, short, etc which have different bit widths on different arches/platforms. (Most C projects I know import stdint.h to get uint32_t and friends, or just have a typedef mess to use usize, u32, u16, etc.)

                                                                                        EDIT: As Forty-Bot noted, one of the biggest issues are null-terminated strings.

                                                                                        I could go on and on forever.

                                                                                        What does Hare offer over C?

                                                                                        It fixes a lot of the issues I mentioned earlier, as well as reducing footguns and implementation-defined behavior in general. See my blog post for a list.

                                                                                        They say that Hare is simpler than C, but I don’t understand exactly how.

                                                                                        It’s simpler than C because it comes without all the cruft and compromises that C has built up over the past 50 years. Additionally, it’s easier to code in Hare because, well, the language isn’t trying to screw you up every 10 lines. :^)

                                                                                        Same with Zig. Do they compile to C in the end, and these languages just make it easier for user to write code?

                                                                                        Zig and Hare both occupy the same niche as C (i.e., low-level manual memory managed systems language); they both compile to machine code. And yes, they make it a lot easier to write code.

                                                                                        1. 15

                                                                                          Thanks for the great reply, learned a lot! Gotta say I am way more interested in Hare and Zig now than I was before.

                                                                                          Hopefully they gain traction. :)

                                                                                          1. 15

                                                                                            #include <foo.h> includes all functions/constants into the current namespace, so you have no idea what module a function came from

                                                                                            This and your later point about not being able to associate methods with struct definitions are variations on the same point but it’s worth repeating: C has no mechanism for isolating namespaces. A C function is either static (confined to a single compilation unit) or completely global. Most shared library systems also give you a package-local form but anything that you’re exporting goes in a single flat namespace. This is also true of type and macro definitions. This is terrible for software engineering. Two libraries can easily define different macros with the same name and break compilation units that want to use both.

                                                                                            C++, at least, gives you namespaces for everything except macros.

                                                                                            C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).

                                                                                            The lack of type checking is really important here. A systems programming language is used to implement the most critical bits of the system. Type checks are incredibly important here, casting everything via void* has been the source of vast numbers of security vulnerabilities in C codebases. C++ templates avoid this.

                                                                                            C’s standard library is really tiny, so you end up creating your own in the process, which you end up carrying around from project to project.

                                                                                            This is less of an issue for systems programming, where a large standard library is also a problem because it implies dependencies on large features in the environment. In an embedded system or a kernel, I don’t want a standard library with file I/O. Actually, for most cloud programming I’d like a standard library that doesn’t assume the existence of a local filesystem as well. A bigger problem is that the library is not modular and layered. Rust’s no_std is a good step in the right direction here.

                                                                                            C’s error handling is completely nonexistent. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (one for each possible returned error), and once you do that, the actual return value has to come back through a pointer argument.

                                                                                            From libc, most errors are not returned, they’re signalled via the return and then stored in a global (now a thread-local) variable called errno. Yay. Option types for returns are really important for maintainable systems programming. C++ now has std::optional and std::variant in the standard library, other languages have union types as first-class citizens.

                                                                                            Manual memory management without defer is a PITA and error-prone.

                                                                                            defer isn’t great either because it doesn’t allow ownership transfer. You really need smart pointer types and then you hit the limitations of the C type system again (see: no generics, above). C++ and Rust both have a type system that can express smart pointers.

                                                                                            C has no anonymous functions. (Whether this matters really depends on your coding style.)

                                                                                            Anonymous functions are only really useful if they can capture things from the surrounding environment. That is only really useful in a language without GC if you have a notion of owning pointers that can manage the capture. A language with smart pointers allows you to implement this, C does not.

                                                                                            1. 6

                                                                                              defer isn’t great either because it doesn’t allow ownership transfer. You really need smart pointer types and then you hit the limitations of the C type system again (see: no generics, above). C++ and Rust both have a type system that can express smart pointers.

                                                                                              True. I’m more saying that defer is the baseline here; without it you need cleanup: labels, gotos, and synchronized function returns. It can get ugly fast.

                                                                                              Anonymous functions are only really useful if they can capture things from the surrounding environment. That is only really useful in a language without GC if you have a notion of owning pointers that can manage the capture. A language with smart pointers allows you to implement this, C does not.

                                                                                              I disagree; it depends on what you’re doing. I’m doing a roguelike in Zig right now, and I use anonymous functions quite extensively for item/weapon/armor/etc triggers, i.e., each game object has some unique anonymous functions tied to the object’s fields which can be called on certain events. Having closures would be nice, but honestly in this use case I didn’t really feel much of a need for them.

                                                                                            2. 3

                                                                                              Note that C does have “standard” answers to a lot of these.

                                                                                              C’s macro system is very, very error prone and very easily abused, since it’s basically a glorified search-and-replace system that has no way to warn you of mistakes.

                                                                                              The macro system is the #1 thing keeping C alive :)

                                                                                              There are no methods for structs, you basically create struct Foo and then have to name all the methods of that struct foo_do_stuff (instead of doing foo_var.do_stuff() like in other languages)

                                                                                              Aside from macro stuff, the typical way to address this is to use a struct of function pointers. So you’d create a wrapper like

                                                                                              void
                                                                                              do_stuff(struct foo *foo)
                                                                                              {
                                                                                                  foo->do_stuff(foo);
                                                                                              }
                                                                                              

                                                                                              C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).

                                                                                              Note that typically there is a “base class” which either all “subclasses” include as a member (and use offsetof to recover the subclass) or have a void * private data pointer. This doesn’t really escape the problem, however in practice I’ve never run into a bug where the wrong struct/method gets combined. This is because the above pattern ensures that the correct method gets called.

                                                                                              C’s error handling is completely nonexistent. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (one for each possible returned error), and once you do that, the actual return value has to come back through a pointer argument.

                                                                                              Well, there’s always errno… And if you control the address space you can always use the upper few addresses for error codes. That said, better syntax for multiple return values would probably go a long way.

                                                                                              C has no anonymous functions. (Whether this matters really depends on your coding style.)

                                                                                              IIRC gcc has them, but they require executable stacks :)

                                                                                              Manual memory management without defer is a PITA and error-prone.

                                                                                              Agree. I think you can do this with GCC extensions, but some sugar here would be nice.

                                                                                              Weird integer type system. long long, int, short, etc which have different bit widths on different arches/platforms. (Most C projects I know import stdint.h to get uint32_t and friends, or just have a typedef mess to use usize, u32, u16, etc.)

                                                                                              Arguably there should be fixed-width types, size_t, intptr_t, and regsize_t. Unfortunately, C lacks the last one, which is typically assumed to be long. Rust, for example, gets this even more wrong and lacks the last two (cf. the recent post on 129-bit pointers).


                                                                                              IMO you missed the most important part, which is that C strings are (by-and-large) nul-terminated. Having better syntax for carrying a length around with a pointer would go a long way to making string support better.

                                                                                            3. 9

                                                                                              Even in C’s domain, where C lacks nothing and is fine for what it is, I would criticize C for maybe 5 things, which I would consider the real criticism:

                                                                                              1. It has undefined behaviour, of the kind that has come to mean that the compiler may disobey the source code. It turns working code into broken code just by switching compiler or inlining some code that wasn’t inlined before. You can’t necessarily point at a piece of code and say it was always broken, because UB is a runtime phenomenon. Not reassuring for a supposedly lowlevel language.
                                                                                              2. Its operator precedence is wrong.
                                                                                              3. Integer promotion. Just why.
                                                                                              4. Signedness propagates the wrong way: Instead of the default type being signed (int) and comparison between signed and unsigned yielding unsigned, it should be the opposite: There should be a nat type (for natural number, effectively size_t), and comparison between signed and unsigned should yield signed.
                                                                                              5. char is signed. Nobody likes negative code points.
                                                                                              1. 6

                                                                                                the kind that has come to mean that the compiler may disobey the source code. It turns working code into broken code

                                                                                                I’m wary of this same tired argument cropping up again, so I’ll just state it this way: I disagree. Code that invokes undefined behavior is already broken; changing compiler can’t (except perhaps in very particular circumstances, which I don’t think you were referring to) introduce undefined behaviour; it can change the observable behaviour when UB is invoked.

                                                                                                A compiler can’t “disobey the source code” whilst conforming to the language standard. If the source code does something that doesn’t have defined semantics, that’s on the source code, not the compiler.

                                                                                                “It’s easy to accidentally invoke undefined behaviour in C” is a valid criticism, but “C compilers breaks code” is not.

                                                                                                You can’t necessarily point at a piece of code and say it was always broken

                                                                                                You certainly can in some instances. But sure, for example, if some piece of code dereferences a pointer and the value is set somewhere else, it could be undefined or not depending on whether the pointer is valid at the point it is dereferenced. So code might be “not broken” given certain constraints (eg that the pointer is valid), but not work properly if those constraints are violated, just like code in any language (although in C there’s a good chance the end result is UB, which is potentially more catastrophic).

                                                                                                I’m not saying C is a good language, just that I think this particular criticism is unfair. (Also I think your point 5 is wrong, char can be unsigned, it’s up to the implementation).

                                                                                                1. 7

                                                                                                  Thing is, it certainly feels like the compiler is disobeying the source code. Signed integer overflow? No problem pal, this is x86, that platform will wrap around just fine! Right? Riiight? Oops, nope: since the compiler pretends UB does not exist, it just deleted a security check that it deemed “dead code”, and now my hard drive has been encrypted by ransomware that exploited the resulting vulnerability.

                                                                                                  Though I agree with all the facts you laid out, and with the interpretation that UB means the program is already broken even if the generated binary didn’t propagate the error, Chandler Carruth pretending that UB does not invoke the nasal demons goes too far. Let’s not forget that UB means the compiler is allowed to cause your entire hard drive to be formatted, as ridiculous as it may sound. And sometimes it actually happens (as it did so many times with buffer overflow exploits).

                                                                                                  Sure, it’s not like the compiler is actually disobeying your source code. But since UB means “all bets are off”, and UB is not always easy to catch, the result is pretty close.

                                                                                                  1. 3

                                                                                                    Sure, it’s not like the compiler is actually disobeying your source code. But since UB means “all bets are off”, and UB is not always easy to catch, the result is pretty close.

                                                                                                    I feel like “disobeying the code” and “not doing what I intended it to do due to the code being wrong” are still two sufficiently different things that it’s worth distinguishing.

                                                                                                    1. 4

                                                                                                      Okay, it is worth distinguishing.

                                                                                                      But it is also worth noting that C is quite special. This UB business repeatedly violates the principle of least astonishement. Especially the modern interpretation, where compilers systematically assume UB does not exist and any code path that hits UB is considered “dead code”.

                                                                                                      The original intent of UB was much closer to implementation-defined behaviour. Signed integer overflow was originally UB because some platforms crashed or otherwise went bananas when it occurred. But the expectation was that on platforms that behave reasonably (like x86, which wraps around), we’d get the reasonable behaviour. Then compiler writers (or should I say their lawyers) noticed that, strictly speaking, the standard didn’t make that expectation explicit, and in the name of optimisation started to invoke nasal demons even on platforms that could have done the right thing.

                                                                                                      Sure the code is wrong. In many cases though, the standard is also wrong.

                                                                                                      1. 4

                                                                                                        I agree with some things but not others that you say, but these arguments have been hashed out many times before.

                                                                                                        Sure the code is wrong

                                                                                                        That’s the point I was making. Since we agree on that, and we agree that there are valid criticisms of C as a language (though we may differ on the specifics of those), let’s leave the rest. Peace.

                                                                                                  2. 4

                                                                                                    But why not have the compiler reject the code instead of silently compiling it wrong?

                                                                                                    1. 2

                                                                                                      It doesn’t compile it wrong. Code with no semantics can’t be compiled incorrectly. You’re making the exact same misrepresentation as in the post above that I responded to originally.

                                                                                                      1. 3

                                                                                                        Code with no semantics shouldn’t be able to be compiled at all.

                                                                                                        1. 1

                                                                                                          I’d almost agree, though I can think of some cases where such code could exist for a reason (and I’ll bet that such code exists in real code bases). In particular, hairy macro expansions etc which produce code that isn’t even executed (or won’t be executed in the case where it would be UB, at least) in order to make compile-time type-safety checks. IIRC there are a few such things used in the Linux kernel. There are probably plenty of other cases; there’s a lot of C code out there.

                                                                                                          In practice though, a lot of code that potentially exhibits UB only does so if certain constraints are violated (eg if a pointer is invalid, or if an integer is too large and will result in overflow at some operation), and the compiler can’t always tell that the constraints necessarily will be violated, so it generates code with the assumption that if the code is executed, then the constraints do hold. So if the larger body of code is wrong - the constraints are violated, that is - the behaviour is undefined.

                                                                                                          1. 1

                                                                                                            In particular, hairy macro expansions etc which produce code that isn’t even executed (or won’t be executed in the case where it would be UB

                                                                                                            That’s why it’s good to have a proper macro system that isn’t literally just find and replace.

                                                                                                            In practice though, a lot of code that potentially exhibits UB only does so if certain constraints are violated

                                                                                                            True, and I’m mostly talking about UB that can be detected at compile time, such as f(++x, ++x).

                                                                                                2. 6

                                                                                                  Contrary to what people are saying, C is just fine for what it is.

                                                                                                  People complain about the std library being tiny, but you basically have the operating system at your fingers, where C is a first class citizen.

                                                                                                  Then people complain C is not safe. Yes, that’s true, but with a set of best practices you can keep things under control.

                                                                                                  People complain you don’t have generics, you dont need them most of the time.

                                                                                                  Projects like nginx, SQLite and redis, not to mention the Nix world, prove that C is a perfectly fine language. Also, most of the popular Python libraries nowadays are written in C.

                                                                                                  1. 25

                                                                                                    Hi! I’d like to introduce you to Fish in a Barrel, a bot which publishes information about security vulnerabilities to Twitter, including statistics on how many of those vulnerabilities are due to memory unsafety. In general, memory unsafety is easy to avoid in languages which do not permit memory-unsafe operations, and nearly impossible to avoid in other languages. Because C is in the latter set, C is a regular and reliable source of security vulnerabilities.

                                                                                                    I understand your position; you believe that people are morally obligated to choose “a set of best practices” which limits usage of languages like C to supposedly-safe subsets. However, there are not many interesting subsets of C; at best, avoiding pointer arithmetic and casts is good, but little can be done about the inherent dangers of malloc() and free() (and free() and free() and …) Moreover, why not consider the act of choosing a language to be a practice? Then the choice of C can itself be critiqued as contrary to best practices.

                                                                                                    nginx is well-written, but Redis is not. SQLite is not written just in C, but also in several other languages combined, including SQL and TH1 (“test harness one”); this latter language is specifically for testing that SQLite behaves properly. All three have had memory-unsafety bugs. This suggests that even well-written C, or C in combination with other languages, is unsafe.

                                                                                                    Additionally, Nix is written in C++ and package definitions are written in shell. I prefer PyPy to CPython; both are written in a combination of C and Python, with CPython using more C and PyPy using more Python. I’m not sure where you were headed here; this sounds like a popularity-contest argument, but those are not meaningful in discussions about technical issues. Nonetheless, if it’s the only thing that motivates you, then consider this quote from the Google Chrome security team:

                                                                                                    Since “memory safety” bugs account for 70% of the exploitable security bugs, we aim to write new parts of Chrome in memory-safe languages.

                                                                                                    1. 2

                                                                                                      I am curious about your claim that Redis is not well-written? I’ve seen other folks online hold it up as an example of a well-written C codebase, at least in terms of readability.

                                                                                                      I understand that readable is not the same as secure, but I would like to understand where you are coming from on this.

                                                                                                      1. 1

                                                                                                        It’s 100% personal opinion.

                                                                                                    2. 9

                                                                                                      Projects like nginx, SQLite and redis, not to mention the Nix world, prove that C is a perfectly fine language.

                                                                                                      Ah yes, you can see the safety of high-quality C in practice:

                                                                                                      https://nginx.org/en/security_advisories.html https://www.cvedetails.com/vulnerability-list/vendor_id-18560/product_id-47087/Redislabs-Redis.html

                                                                                                      Including some fun RCEs, like CVE-2014-0133 or CVE-2016-8339.

                                                                                                      1. 2

                                                                                                        I also believe C will still have a place for a long time. I know I’m a newbie with it, but making a game with C (using Raylib) has been pretty fun. It’s simple and to the point… And I don’t mind making mistakes really; that’s how I learn best.

                                                                                                        But again it’s cool to see people creating new languages as alternatives.

                                                                                                      2. 4

                                                                                                        What does Hare offer over C?

                                                                                                        Here’s a list of ways that Drew says Hare improves over C:

                                                                                                        Hare makes a number of conservative improvements on C’s ideas, the biggest bet of which is the use of tagged unions. Here are a few other improvements:

                                                                                                        • A context-free grammar
                                                                                                        • Less weird type syntax
                                                                                                        • Language tooling in the stdlib
                                                                                                        • Built-in and semantically meaningful static and runtime assertions
                                                                                                        • A lightweight system for dependency resolution
                                                                                                        • defer for cleanup and error handling
                                                                                                        • An optional build system which you can replace with make and standard tools

                                                                                                        Even with these improvements, Hare manages to be a smaller, more conservative language than C, with our specification clocking in at less than 1/10th the size of C11, without sacrificing anything that you need to get things done in the systems programming world.

                                                                                                        It’s worth reading the whole piece. I only pasted his summary.

                                                                                                      1. 6

                                                                                                        I’ve been avoiding Brave because they’re not based on Gecko, but this is yet another thing starting to make me look that way.

                                                                                                        1. 4

                                                                                                          I love Mozilla and only use Firefox and Thunderbird, but it has been more and more difficult to hold this position when I see their management continuously trashing their employees and their users.

                                                                                                          On the other side, I’m uneasy with Brave, they’re trying to fight Google by using a browser engine written and maintained at ~80% (my guesstimate) by Google…

                                                                                                          1. 3

                                                                                                            I still don’t know how to feel about Brave. On one hand their privacy features seem miles ahead of the competition. But on the other hand there’s their crypto currency stuff I’d rather have no part in.

                                                                                                            I’ve recently warmed up to Chromium (and by proxy Chromium-based browsers like Brave) because its sandboxing is so clearly much better than Firefox’s (even with FF’s recent improvements - which I love to see, don’t get me wrong!).

                                                                                                            1. 5

                                                                                                              Yeah. Brave’s privacy is only rivaled by Librewolf. Thankfully, the cryptocurrency nonsense is opt-in. And yes, I’m hoping Firefox catches up with Chromium security, but it’s not there yet. But the strides they’ve made with RLBox are good to see.

                                                                                                              1. 3

                                                                                                                It’s opt-in, unless you make some content for which they’d like to collect donations on your behalf. When they opt in for you, it’s no longer opt-in.

                                                                                                                1. 2

                                                                                                                  Indeed, the RLBox improvements are good to see!

                                                                                                            1. 13

                                                                                                              I think more programmers should be using secret scanners but there weren’t any “no-brainer” solutions I could find, so I decided to build a new one. The core of secret scanning is running regex against a large number of files, and it turns out this is something ripgrep is excellent at. By leveraging the ripgrep library effectively secrets is able to scan files roughly 100x faster than other solutions I tested. This is my first Rust project and I was impressed with how quickly I was able to put something together that is also really fast. Let me know if you have any feedback!
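The core loop described above — regex over every file in a tree — can be sketched in a few lines. The patterns here are illustrative toy rules, not the project’s actual ruleset, and a real scanner (like secrets, built on the ripgrep library) does this far faster with parallel, SIMD-accelerated matching:

```python
import re
from pathlib import Path

# Illustrative rules only -- NOT the project's actual patterns.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(
        r"(?i)api[_-]?key\s*[:=]\s*['\"][0-9A-Za-z]{16,}['\"]"
    ),
}

def scan_file(path: Path):
    """Yield (rule_name, line_number, matched_line) for one file."""
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        return
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                yield name, lineno, line.strip()

def scan_tree(root: str):
    """Walk a directory and scan every regular file."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            for hit in scan_file(path):
                yield (path, *hit)
```

The hard parts a toy like this skips are exactly where the tooling earns its keep: a well-curated rule set with a low false-positive rate, and raw scanning throughput.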

                                                                                                              1. 7

                                                                                                                I appreciate that you put links to other similar projects in the README! It’s a small thing but really helps to encourage adoption of the idea, even if the implementation doesn’t meet specific requirements. That being said, this tool looks good for my use case and I’m definitely going to try it.

                                                                                                                1. 5

                                                                                                                  I like the secretsignore feature. Sometimes you want things that look like secrets in your tests, and not being able to accommodate that has made me avoid similar tools in the past.

                                                                                                                  1. 1

                                                                                                                    There’s also the git-secrets project (from AWS, first released in 2015) that’s also designed as a pre-commit hook.

                                                                                                                    (I used to work for AWS and used git-secrets, but never worked on git-secrets.)

                                                                                                                  1. 31

                                                                                                                    My initial, empathy-lacking response to this is “if you’re looking for sympathy, you can find it in the dictionary between shit and syphilis.”

                                                                                                                    Alternatively, “play stupid games, win stupid prizes.”

                                                                                                                    The title’s just wrong, though. Server side rendering was easy. PHP, JSP, jinja, django are all examples of that. The special kind of hell this post describes is just what happens when you try to run javascript on the server, which anyone who thought very much about it knew was a bad idea in the first place.

                                                                                                                    The absurd complexity this post is moaning about is 100% opt-in.

                                                                                                                    1. 8

                                                                                                                      Not looking for sympathy, just pointing out a problem I rarely see discussed.

                                                                                                                      I’ve also never built a serious app using these technologies and would do my best to discourage their usage in companies I work for, but that doesn’t prevent our clients (other dev teams) from doing so.

                                                                                                                      1. 9

                                                                                                        FWIW, based on the “soapy taste” and “while some would decry” commentary, I was not assuming that you were the one who’d personally opted in to this problem. But I stand by my assertion that SSR is the traditional, easy way to do things and that it requires opting into bad ideas to enjoy the pain your post describes.

                                                                                                                        1. 1

                                                                                                                          I see this discussed regularly under the name hydration. It’s the piece of web development that I don’t feel like I know an elegant approach to.

                                                                                                                      1. 3

                                                                                                        Just a heads up: it is important to firewall those IPv6 subnets used for IPv4 translation. That way you’ll avoid anyone directly accessing the application through a specially crafted IPv6 address. If the subnets were world-accessible, it would be possible to disguise yourself as any random IPv4 client address.

                                                                                                                        1. 3

                                                                                                          Having used mismatches in IPv4 vs IPv6 firewall rules to my advantage many times in penetration tests, this is a point I really appreciate. (Here’s a good old paper on it in case anyone is unfamiliar…)

                                                                                                                          In this case, though, I’m not sure I see the concern. These applications are being intentionally exposed, and there is no authentication or authorization happening on the proxy by design. The application on the back end is handling it all, so connecting directly instead of via the SNI proxy would not buy anything for an attacker.

                                                                                                                          Or am I misreading your point?

                                                                                                                          1. 1

                                                                                                                            My point is that the IPv4 address is encoded into the last 32 bits of the IPv6 server address. So if the application were to use an allowlist based on IP addresses (e.g.: only expose this login endpoint if from the office IP address), or a rate limiter, or you’d like to read the correct IP address in the logs, you’d make a decision based on an incorrect (and potentially maliciously crafted) IP address.

                                                                                                                            Let’s say the IPv6 address for the service in question is 2001:0db8:85a3::1 and that this is available in DNS as an AAAA record. Then let’s say the proxy would use the same publicly routable subnet for the IPv4-to-IPv6 conversion. If you were to enter the specially crafted IPv6 address in the browser, you’d make the application think you’re coming from a specific IPv4 address.

                                                                                                                            You’d want to firewall off everything but the public service address from all other than localhost.

                                                                                                                            1. 1

                                                                                                                              My point is that the IPv4 address is encoded into the last 32 bits of the IPv6 server address.

                                                                                                                              By my read, it’s encoded into the last 32 bits of the proxy-constructed IPv6 “client” address:

                                                                                                                              snid embeds the client’s IP address in the lower 32 bits of the source address which it uses to connect to the backend.

                                                                                                              which means the backend app would need to also ignore the first 32 bits in order for this to be an issue. I’d never discount this, because I’ve seen quite a bit of dangerously braindead behavior that’s adjacent to it. But the fact that the web applications described in the post also expect to receive unproxied IPv6 requests directly, according to the article:

                                                                                                                              The AAAA record for a webapp is the dedicated IPv6 address, and the A record is the shared IPv4 address. Thus, IPv6 clients connect directly to the webapp, while IPv4 clients are proxied via snid.

                                                                                                                              makes me think that what you’re describing shouldn’t be a concern in the context the author is posting about.

                                                                                                                              1. 2

                                                                                                                                Ah, okay. Good to get it cleared up – thanks :)

                                                                                                                                So to sum it up:

                                                                                                                                1. The proxy gets a request from an IPv4 address.
                                                                                                                                2. The proxy chooses an internal IPv6 address that isn’t publicly routable and encodes the IPv4 address in the chosen internal IPv6 address (last 32 bits of the chosen source IPv6 address).
                                                                                                                3. The application uses the reverse logic and extracts the client’s real IPv4 address.

                                                                                                                                I must say, that is a neat trick. Glad I now understand it correctly.
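The arithmetic of that mapping is easy to sketch with Python’s ipaddress module (the fd00::/96 prefix below is illustrative — in a real deployment it’s whatever non-routable subnet the proxy is configured to source its backend connections from):

```python
import ipaddress

# Illustrative internal prefix -- NOT a value snid prescribes.
PREFIX = ipaddress.IPv6Network("fd00:dead:beef::/96")

def encode(client_v4: str) -> ipaddress.IPv6Address:
    """Step 2: embed the IPv4 client in the low 32 bits of the IPv6 source address."""
    v4 = ipaddress.IPv4Address(client_v4)
    return ipaddress.IPv6Address(int(PREFIX.network_address) | int(v4))

def decode(source_v6: str) -> ipaddress.IPv4Address:
    """Step 3: the backend recovers the real IPv4 client address."""
    return ipaddress.IPv4Address(int(ipaddress.IPv6Address(source_v6)) & 0xFFFFFFFF)
```

This also makes the firewalling point upthread concrete: if PREFIX were publicly routable, anyone could hand-craft a source address inside it and show up at the backend as an arbitrary IPv4 client.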

                                                                                                                          2. 3

                                                                                                                            This is one of the big things that “NAT is not a security measure!” people forget about: yes it is.

                                                                                                                            1. 5

                                                                                                                              No it isn’t

                                                                                                                              Firewalls are a security measure.

                                                                                                                              1. 3

                                                                                                                                More specifically, NAT is not a security measure. A sensible firewall policy is a security measure. NAT deployments require a firewall, though not always one with a sensible policy. A lot of consumer NAT things let you designate a single internal IP that all inbound connections go to if there isn’t an explicit forwarding rule configured for that port, making that device just as insecure as one directly connected to the Internet, combined with all of the disadvantages of NAT. In contrast, a simple deny-all-inbound policy on a stateful firewall gives you all of the advantages of NAT and requires less powerful hardware to implement with the same number of connections / packet rates.
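For what it’s worth, the deny-all-inbound stateful policy described above is only a few lines of nftables (a hedged sketch, not a complete ruleset — interface handling, ICMPv6, and any intentional port openings are left out):

```
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept  # replies to outbound traffic
        iifname "lo" accept                  # loopback
    }
}
```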