1. 33

    While I think a website like this would make sense in a few years, right now I think GDPR is complicated, confusing, and scary enough to a lot of companies that they are going to make mistakes. I’d rather help them do it better than mock them.

    1. 15

      As one of the thousands of engineers who had to figure out how to shoehorn a six-month compliance project into a lean team’s packed roadmap, I concur. This wasn’t easy, not even at a company that respects user data to begin with. Lots of the jokes I’ve seen about GDPR right now just lessen my opinion of the teller.

      1. 23

        On the other hand, we’ve all had literally more than 2 years to work on said six-month compliance project, and the fact that so many companies put it off until the very end is the actual problem here, IMO.

        1. 4

          Not from my point of view – who cares if companies just woke up to GDPR two weeks ago, if I don’t use them for data processing? None of my actual pain came from that. But I definitely spent a lot of time working on GDPR when I’d rather have been building product, other deadlines slipped, things moved from To-Do to Backlog to Icebox because of this. We’re ready for GDPR, but that stung.

          1. 3

            I was essentially trying to say, “People like you don’t get to complain about it being hard to fit something into a certain time period when they had literally 4 times that amount of time to do it.” ^__^

            1. 3

              Well, if people like you (who didn’t even do the work) get to complain, then so do I! If someone tells me they’re gonna punch me in the face, then they punch me in the face, I still got punched in the face.

              1. 4

                I did our GDPR planning and work, and I’m so glad to see it in effect. The industry is finally gaining some standards. If you complain about having to give up a “rather have been building product” attitude, sometimes it’s time to own up that you care more about your own bottom line than doing the right thing.

                1. 1

                  Sometimes if you don’t build a product, GDPR compliance becomes irrelevant because you never get a company off the ground. As a one-person platform team until last September, I don’t regret how I prioritized it.

                2. 6

                  Well, if people like you (who didn’t even do the work) get to complain, then so do I!

                  I actually did do the work. But either way, complaining about it being a pain overall is just fine, because it is. On the other hand, explicitly complaining that because you had to do it in 6 months you had issues fitting it in, had other deadlines slip, and had to essentially kill other to-dos is a very different thing. If you’d used the extra 18 months, I bet you’d have had far fewer issues with other deadlines.

                  If someone tells me they’re gonna punch me in the face, then they punch me in the face, I still got punched in the face.

                  This analogy doesn’t even make sense in context…

                  1. 6

                    If you’d used the extra 18 months, I bet you’d have had far fewer issues with other deadlines.

                    I’ll totally remember this for next time.

        2. 25

          Well, I agree in general, but this article specifically highlights some cases of just plain being mean to your users. I’m okay with mocking those.

          1. 7

            I disagree. GDPR is expensive to get wrong so the companies aren’t sure what to expect. They are likely being conservative to protect themselves.

            1. 7

              They were not conservative in tracking users, and spending for tracking and spying on users was not expensive?

              As a user I don’t care about the woes of companies. They forced the lawmakers to create these laws, as they were operating a surveillance capitalism. They deserve the pain, the costs, and the fear.

              1. 1

                and spending for tracking and spying on users was not expensive?

                Tracking users is very cheap, that’s why everyone can and does do it. It’s just bits.

                As a user I don’t care about the woes of companies.

                Feel free not to use them, then. What I am saying is that GDPR is a new, large, and expansive law with a lot of unknowns. Even the regulators don’t really know what the ramifications will be. I’m not saying to let companies off from adhering to the law; I’m just saying that on the first day the world would probably benefit more from helping companies comply rather than mocking them.

                EDIT:

                To be specific, I think companies like FB, Google, Amazon, etc. should be expected to entirely comply with the law on day one. It’s the smaller companies living on thinner margins, which can’t necessarily afford the legal help those giants can, that I’d want to support rather than mock.

          2. 10

            It’s not like the GDPR was announced yesterday. It goes live tomorrow after a two-year onboarding period.

            If they haven’t got their act in order after two years, it’s reasonable to name and shame.

          1. 2

            mdocml is small and has minimal dependencies, but it has runtime dependencies - you need it installed to read the man pages it generates. This is Bad.

            mdoc is part of the system. I guess not on Linux??

            1. 3

              mdoc is part of the system on Linux too.

              1. 3

                Depends on the Linux.

                1. 1

                  Do you have any particular distribution in mind where it isn’t?

              2. 1

                Guess what? There is life outside Unix! :-D

              1. 8

                A couple notes on the article (specifically, the one it links to at the beginning, The Logical Disaster of Null).

                Null is a crutch. It’s a placeholder for I don’t know and didn’t want to think about it further

                I disagree. In C, at least, NULL is a preprocessor macro, not a special object, “which expands to an implementation-defined null pointer constant”. In most cases, it’s either 0, or ((void *)0). It has a very specific definition and that definition is used in many places with specific meaning (e.g., malloc returns NULL on an allocation failure). The phrase, “It’s a placeholder for I don’t know and didn’t want to think about it further”, seems to imply that it’s used by programmers who don’t understand their own code, which is a different problem altogether.
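
                  To make that concrete, here’s a minimal sketch (plain standard C, nothing assumed beyond the standard library) of NULL carrying that specific, documented meaning:

                  ```c
                  #include <stdio.h>
                  #include <stdlib.h>

                  int main(void) {
                      /* malloc returns NULL on allocation failure, so checking against
                       * NULL is a precise, documented test, not a vague "don't know" */
                      char *buf = malloc(64);
                      if (buf == NULL) {
                          fprintf(stderr, "allocation failed\n");
                          return 1;
                      }
                      buf[0] = '\0';
                      free(buf);
                      return 0;
                  }
                  ```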

                People make up what they think Null means, which is the problem.

                  I agree. However, again in C, this problem doesn’t really exist, since there are no objects, only primitive types. structs, for example, are just logical groupings of zero or more primitive types. I can imagine that, in object-oriented languages, the desire to create some sort of NULL object can result in an object that acts differently than non-NULL objects in exceptional cases, which would lead to inconsistency in the language.

                  Another article linked to in Logical Disaster of Null talks about how using NULL-terminated character arrays to represent strings was a mistake.

                Should the C language represent strings as an address + length tuple or just as the address with a magic character (NUL) marking the end?

                I would certainly choose the NULL-terminated character array representation. Why? Because I can easily just make a struct that has a non-NULL-terminated character array, and a value representing length. This way, I can choose my own way to represent strings. In other words, the NULL-terminated representation just provides flexibility.
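
                  For what it’s worth, the roll-your-own option described above might look something like this sketch; the lstring name and helper are made up for illustration:

                  ```c
                  #include <assert.h>
                  #include <string.h>

                  /* Hypothetical length-counted string built on plain C arrays,
                   * illustrating the "make your own representation" flexibility */
                  struct lstring {
                      size_t length;
                      const char *data;   /* not required to be NUL-terminated */
                  };

                  static struct lstring lstring_from(const char *s) {
                      struct lstring ls = { strlen(s), s };
                      return ls;
                  }

                  int main(void) {
                      struct lstring s = lstring_from("hello");
                      assert(s.length == 5);

                      /* unlike a NUL-terminated string, this can hold an embedded NUL */
                      struct lstring t = { 3, "a\0b" };
                      assert(t.length == 3 && t.data[1] == '\0');
                      return 0;
                  }
                  ```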

                1. 4

                  “On Multics C on the Honeywell DPS-8/M and 6180, the pointer value NULL is not 0, but -1|1.”

                  1. 3

                    The C Standard allows that. It basically states that, in the source code, a value of 0 in a pointer context is a null pointer and shall be converted to whatever value that represents in the local architecture. So that means on a Honeywell DPS-8/M, the code:

                    char *p = 0;
                    

                    is valid, and will set the value of p to be -1. This is done by the compiler. The name NULL is defined so that it stands out in source code. C++ has rejected NULL and you are expected to use the value 0 (I do not agree with this, but I don’t do C++ coding).
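
                      That conversion is easy to demonstrate: whatever bit pattern the architecture uses for a null pointer, these comparisons hold in standard C:

                      ```c
                      #include <assert.h>
                      #include <stddef.h>

                      int main(void) {
                          char *p = 0;     /* the integer constant 0 in pointer context is a null pointer constant */
                          char *q = NULL;  /* NULL expands to an implementation-defined null pointer constant */
                          assert(p == q);  /* both compare equal, whatever the machine representation is */
                          assert(!p);      /* a null pointer tests false in a boolean context */
                          return 0;
                      }
                      ```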

                    1. 2

                      I believe C++11 introduced the nullptr keyword which can mostly be used like NULL in C.

                      1. 1

                        Correct. Just for reference, from the 1989 standard:

                        “An integral constant expression with the value 0, or such an expression cast to type void * , is called a null pointer constant.”

                    2. 3

                      I would certainly choose the NULL-terminated character array representation. Why? Because I can easily just make a struct that has a non-NULL-terminated character array, and a value representing length. This way, I can choose my own way to represent strings. In other words, the NULL-terminated representation just provides flexibility.

                      That’s not a very convincing argument IMO since you can implement either of the options yourself no matter which one is supported by the stdlib, the choice of one doesn’t in any way impact the potential flexibility. On the other hand NULL-terminated strings are much more likely to cause major problems due to how extremely easy it is to accidentally clobber the NULL byte, which happens all the time in real-world code.
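
                        As a concrete example of how easily the terminator gets lost (a sketch, not from the article): strncpy does not write a terminator when the source fills the buffer:

                        ```c
                        #include <assert.h>
                        #include <stdio.h>
                        #include <string.h>

                        int main(void) {
                            /* strncpy copies at most n bytes and does NOT add a terminator
                             * when the source is n bytes or longer */
                            char buf[5];
                            strncpy(buf, "hello", sizeof buf);  /* buf holds 'h','e','l','l','o' -- no '\0' */
                            /* treating buf as a string here (e.g. strlen(buf)) is undefined behavior */

                            /* the usual fix: reserve room and terminate explicitly */
                            char safe[6];
                            strncpy(safe, "hello", sizeof safe - 1);
                            safe[sizeof safe - 1] = '\0';
                            assert(strcmp(safe, "hello") == 0);
                            puts(safe);  /* prints "hello" */
                            return 0;
                        }
                        ```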

                        And the language not supporting Pascal-style strings means that people would need to reach for one of a multitude of different and incompatible third-party libraries, then convince other people on the project that the extra dependency is worth it, and even then you need to be very careful when passing such strings to any other third-party functions that expect plain C strings.

                      1. 1

                          You make a good point. Both options for strings can be implemented. As for Pascal strings, it is nice that a string can contain a NUL character somewhere in the middle. I guess back in the day when C was being developed, Ritchie avoided Pascal-style strings partly because their length was capped at 255 characters (the traditional Pascal string used the first byte to contain the length). Nowadays, since computers have more memory, you could just use the first 4 bytes (for example) to represent string length, in which case, in C, it could just be written as struct string { int length; char *letters; }; or something like that.

                        From Ritchie: “C treats strings as arrays of characters conventionally terminated by a marker. Aside from one special rule about initialization by string literals, the semantics of strings are fully subsumed by more general rules governing all arrays, and as a result the language is simpler to describe and to translate than one incorporating the string as a unique data type.”

                    1. 28

                        After reading the article and many HN comments, I found the headline to be highly misleading as if they’re targeting Signal for their activities in fighting censorship. It’s actually more incidental. They’re targeting a fraudulent practice Signal is doing that violates terms of service. Signal is doing it for good reasons but others might not. Google and Amazon are trying to stop it wholesale. A proper headline might be “Several providers threaten to suspend anyone doing ‘domain fronting’ via hacks, including us.” An average person reading something like that would think it sounds entirely expected. A technical person, whether they like Signal or not, should also notice the MO is an operational inconsistency that shouldn’t exist in the first place.

                      So, they’re not doing a bad thing given the situation. They’re just an apathetic, greedy party in a business context fixing a technical problem that some good folks were using to help some other good folks deal with evil parties in specific countries. Sucks for those specific people that they did it but they’re not aiming at Signal to stop their good deeds. They’re just addressing an infrastructure problem that affects anyone hacking around with their service. Like they should.

                      I wish Signal folks the best finding another trick, though.

                      1. 16

                        I think the correct headline would be “AWS is fixing a bug allowing domain fronting and calling it Enhanced Domain Protections”. An analogous situation would be console homebrew people exploiting buffer overflows in Nintendo games. Of course Nintendo should fix them, and like you, I root for console homebrew people to find another one.

                        1. 3

                          That’s another good one. It’s just a bug in their services. Them not fixing it would be more questionable to me.

                        2. 9

                          I found the headline to be highly misleading as if they’re targeting Signal for their activities in fighting censorship. It’s actually more incidental.

                          And that’s why they immediately sent signal an email containing a threat to close the account immediately, instead of a regretful email telling them that this will stop working due to abuse prevention measures.

                          1. 1

                              In my experience that’s generally how they treat literally any issue.

                          2. 5

                            Signal is doing it for good reasons but others might not.

                              I’m failing to think of a way to use domain fronting for a bad reason, especially one where the provider being fronted is still happy to host the underlying service.

                            1. 4

                              There is nothing fraudulent about domain fronting. Show me one court anywhere in the world which has convicted someone of fraud for domain fronting. That’s a near-libelous claim.

                              Can you provide an example of a “bad reason” for domain fronting?

                              As the article points out, the timing of Amazon’s decision relative to the publicity about Signal’s use of domain fronting suggests that Signal is in fact the likely intended target of this change, not incidental fallout.

                              The headline is accurate. Your comment really mischaracterizes what is happening.

                              1. 3

                                  I meant it in the popular sense of lying while using something. Apparently, a lot of people agree its use isn’t what was intended, the domains supplied are certainly not theirs, and service providers might react negatively to that. It would probably be a contract-law matter, as a terms-of-use violation, if it went to court. I’m not arguing anything more than that on the legal side. I’m saying he was doing something deceptive that they didn’t want him to do with their services. Big companies rarely care about the good intentions behind that.

                                “the timing of Amazon’s decision relative to the publicity about Signal’s use of domain fronting suggests that Signal is in fact the likely intended target of this change”

                                  The article actually says he was bragging online in a way that reached highly-visible places like Hacker News about how he was tricking Amazon’s services for his purposes. Amazon employees regularly read these outlets, partly to collect feedback from customers. I see the cloud people on HN all the time saying they’ll forward complaints or ideas to people who can take action. With that, I totally expected Amazon employees to be reading articles about him faking domains through Amazon services. It’s equally unsurprising that it got to a decision-maker, technical or more of a layperson, who was worried about negative consequences. Then, knowing the problem and seeing a confession online by the Signal author, they took action against a party they knew was abusing the system.

                                  We can’t just assume a conspiracy against Signal, looking for everything they could use against it, with domain fronting being a lucky break for their evil plans, one they used against Signal while ignoring everyone else they knew broke the terms of service using hacker-like schemes. If you insist this was targeted, you’re ignoring claims in the article supporting my position:

                                  “A month later, we received 30-day advance notice from Google that they would be making internal changes to stop domain fronting from working entirely.”

                                  “a few days ago Amazon also announced what they are calling Enhanced Domain Protections for Amazon CloudFront Requests. It is a set of changes designed to prevent domain fronting from working entirely, across all of CloudFront.”

                                  It’s a known problem that they and Google were apparently wanting to deal with across the board, per his own article. Especially Google. They also have employees reading forums where Signal was bragging about exploiting the flaw for its purposes. I mean, what did you expect to happen? That risk-reducing, brand-conscious companies that want to deal with domain fronting would leave it on in general, or for Signal, just because that one party’s deceptions were for good reasons according to claims on their blog?

                                  Although I think that addresses it, I’m still adding one thing people in the cryptotech media bubble might not consider: the manager or low-level employee who made the decision might not even know what Signal is. Most IT people I’ve encouraged to try it have never heard of it. If you explain what it does, especially the part about getting things past governments, then that would just further worry the average risk manager. They’d want a brick wall between the company’s operations and whatever legal risks the third party is taking, to reduce their own liabilities.

                                  So, there are at least several ways employees could end up reacting this way, ranging from a general response to an abuse confession online, to one informed by a summary of how Signal dodges governments. And then, if it’s none of that normal stuff that happens every day at big firms, you might also consider Amazon targeting Signal specifically, with full knowledge of what they’re doing, plus secret, evil plans to help governments stop them. I haven’t gotten past the normal possibilities, though, with Amazon employees reading stuff online and freaking out being the most likely so far.

                                1. 3

                                  This rings true to me (particularly the middle-management banality-of-evil take), bar one nitpick:

                                  The article actually says he was bragging online in a way that reached highly-visible places like Hacker News about how he was tricking Amazon’s services for his purposes.

                                  How did you get that impression? The article states:

                                  We’re an open source project, so the commit switching from GAE to CloudFront was public. Someone saw the commit and submitted it to HN. That post became popular, and apparently people inside Amazon saw it too.

                                  I haven’t read the mentioned HN thread, but that hardly constitutes “bragging online”.

                                  1. 2

                                    I can’t remember why I originally said it. He usually blogs about his activities. I might have wrongly assumed they got it out of one of his technical write-ups or comments instead of a commit. If it was just a commit, then I apologize. Thanks for the catch regardless.

                              2. 3

                                “Service provider warns misbehaving customer to knock it off after repeated RFC violations.”

                                1. 1

                                      I actually think this ASCII style is quite nice, and most certainly more portable.

                                  1. 1

                                    More portable in what way? Practically speaking, the W3 page is much more portably accessible to me than the ASCII version. I very often read things like this on my phone when e.g. on my commute, and on the screen size on a phone there is atrocious line wrapping that makes the text hard to read, and the illustrations entirely unreadable, while the HTML version renders perfectly.

                                    1. 2

                                      More portable in what way?

                                          In the sense that you don’t need a browser to view it properly, but I guess you’re right that phones have a disadvantage here. On the other hand, it’s questionable why mobile browsers can’t format this simplest of formats properly.

                                1. 1

                                    Is there any link to the recent page now? Can’t find one anywhere.

                                  1. 5

                                    There really needs to be a federated github.

                                    1. 46

                                      Like… git ?

                                      1. 21

                                        So github but without the hub. May be on to something.

                                        1. 7

                                          Github is one of my favorite stories when I talk about how decentralized systems centralize.

                                          1. 7

                                              But did GitHub really centralize something decentralized? Git, as a VCS, is still decentralized: nearly everyone who seriously uses it has a git client on their computer, and a local repository for their projects. That part is still massively decentralized.

                                              GitHub as a code-sharing platform, which allows issues to be raised and discussed, patches/pull requests to be submitted, etc., didn’t previously exist in a decentralized manner. There seems to have always been some central point of reference, be it a website or just a mailing list. It’s not as if whole projects were just based around cc’ing email to one another all the time. How would new people have gotten involved if that were the case?

                                              The only thing I could see as centralising is the relative number of projects hosted on GitHub, but that isn’t really a system which can properly be described as “decentralized” or “centralized”…

                                            1. 4

                                                It’s the degree to which people are dependent on the value-adds that GitHub provides beyond git. It’s like a store having a POS that relies on communication with a central server. Sure, they can keep records on paper and do sales, but it’s not their normal course, so they don’t. This comment on HN sums it up: https://news.ycombinator.com/item?id=16124575

                                            2. 1

                                              Got any other examples?

                                              1. 3

                                                  Email would be a prominent one. Most people (and I can’t say I am innocent) use gmail, hotmail, yahoo mail, etc. I believe there is some general law that describes this trend in systems, which can then be applied to the analysis of different topics, for example matter gathering around other matter in physics, or money accumulating around organizations with more money, etc.

                                                  On the other side you have decentralized systems which didn’t really centralize significantly, for whatever reason, such as IRC, but which had a decrease in users over time, which I also find to be an interesting trend.

                                                1. 4

                                                    Many businesses run their own email server, and I don’t have to sign up to gmail to send a gmail user an email, but I do have to sign up to GitHub.

                                                  1. 1

                                                      A tendency towards centralisation doesn’t mean that no smaller email servers exist; I’m sorry if you misunderstood me there. But on the other hand, I have heard of quite a few examples where businesses just use gmail with a custom domain, so there’s that.

                                                      And it’s true that you don’t have to be on gmail to send an email to a hotmail server, for example, but most of the time, if a normal person were to set up their own mail server, all the major mail providers would automatically view this new host as suspicious and potentially harmful, and thus more probably redirect normal messages to spam. This wouldn’t be so common if the percentage distribution of mail servers weren’t so centralised.

                                                2. 1

                                                  Did a talk using them. This cuts to the chase: https://www.youtube.com/watch?v=MgbmGQVa4wc#t=11m35s

                                            3. 1

                                              Git has a web interface?

                                              1. 7

                                                … federation is about data/communications between servers.. but seeing as you asked, yes it does: https://manpages.debian.org/stretch/git-man/gitweb.1.en.html

                                                1. 10

                                                  To be fair, whjms did say “a federated github”. The main feature of GitHub is its web interface.

                                                  1. 2

                                                    Right, and there are literally dozens of git web interfaces. You can “federate” git and use whichever web ui you prefer.

                                                    1. 12

                                                      But you then miss out on issue tracking, PR tracking, stats, etc. I agree that Git itself provides a decentralized version control system. That’s the whole point. But a federated software development platform is not the same thing. I would personally be very interested to see a federated or otherwise decentralized issue tracking, PR tracking, etc platform.

                                                      EDIT: I should point out that any existing system on par with Gitea, Gogs, GitLab, etc could add ActivityPub support and instantly solve this problem.

                                                      1. 4

                                                        Doesn’t give you access to all the issues, PRs and comments though.

                                                        1. 4

                                                          git-appraise exists. Still waiting for the equivalent for issues to come along.

                                                          https://github.com/google/git-appraise

                                                          1. 4

                                                            huh git appraise is pretty cool.

                                                              I was going to suggest some kind of ActivityPub/OStatus system for comments, a bit like PeerTube does to manage comments. But a comment and issue system that is contained within the history of the project would be really interesting. Though it would make git repos take a lot more space for certain projects, no?

                                                            1. 3

                                                              I’d assume that those could potentially be compressed but yes. It’s definitely not ideal. https://www.fossil-scm.org/index.html/doc/tip/www/index.wiki

                                                                ^^^^ Unless I’m mistaken, Fossil also tracks that kind of stuff internally. I really like the idea that issues, PRs, and documentation could live in the same place, mostly on account of being able to “go back in time” and see, for a given version, what issues were open. Sounds useful.

                                                          2. 3

                                                            BugsEverywhere (https://gitlab.com/bugseverywhere/bugseverywhere), git-issues (https://github.com/duplys/git-issues), sit (https://github.com/sit-it/sit) all embed issues directly in the git repo.

                                                            Don’t blame the tool because you chose a service that relies on vendor lock-in.

                                                            1. 4

                                                              If I recall correctly the problem here is that to create an issue you need write access to the git repo.

                                                              Having issues separated out of the repositories can make it easier, if the web interface can federate between services, that’s even better. Similar to Mastodon.

                                                              1. 1

                                                                  There’s nothing to say that a web interface couldn’t provide the ability for others to submit issues.

                                                          3. 3

                                                            Right, and there are literally dozens of git web interfaces.

                                                              Literally dozens of git web interfaces the majority of developers either don’t know or care about. Those developers do use GitHub, for various reasons. When voronoipotato and LeoLamda say a “federated Github”, they mean the alternative needs to look like or work with Github well enough that those using Github, but ignoring the other stuff you mentioned, will switch over to it. I’m not sure what that would take, or if it’s even legal as far as copying appearance goes. It does sound like a more practical goal than telling those web developers that there are piles of git web interfaces out there.

                                                            1. 1

                                                                I’m going to respond to two points in reverse order, deliberately:

                                                              or care about.

                                                              Well, clearly the person I replied to does care about a git web interface that isn’t reliant on GitHub.com. Otherwise, why would they have replied?

                                                              Literally dozens of git web interfaces the majority of developers either don’t know [about]

                                                                Given the above: the official git project’s wiki has a whole page dedicated to tools that work with git, including web interfaces. That wiki page is result 5 in Google and result 3 in DuckDuckGo when searching for “git web interface”. If a developer wants a git web interface and can’t find that information for themselves, nothing you, or I, or a magic genie does will help them.

                                                      2. 5

                                                        It’s not built-in, but Gogs and Gitea are both pretty nice.

                                                        1. 2

                                                          Hard agree. I run a personal Gogs site and it’s awesome.

                                                    2. 7

                                                      It would be enough if people stopped putting all their stuff on github.

                                                      1. 8

                                                        It won’t happen for a while due to network effects. GitHub made it easy to get the benefits of a DVCS without directly dealing with one. Being a web app, it can be used on any device. Being free, it naturally pulls people in. There are also lots of write-ups on using it or solving problems that are a Google away due to its popularity. Any of these can be copied and improved on. The remaining problem is the huge amount of code already there.

                                                        The next solution won’t be able to copy that since it’s a rare event in general. Like SourceForge and Github did, it will have to create a compelling reason for massive amounts of people to move their code into it while intentionally sacrificing the benefits of their code being on Github specifically. I can’t begin to guess what that would take. I think those wanting no dependency on Github or alternatives will be targeting a niche market. It can still be a good one, though.

                                                        1. 2

                                                          I hear the ‘network effects’ story every time, but we are not mindless automatons who have to use GitHub because other people are doing it. I’m hosting the code for my open-source projects on a self-hosted GitLab server and I’m getting contributions from other people without problems. Maybe there would be more if the code were on GitHub, but being popular isn’t the most important thing for everyone.

                                                          1. 1

                                                            Just look at SourceForge: if everyone had to set up their own CVS/SVN server back in the day, do you think all those projects would have made it onto the internet?

                                                            Now we have a similar situation with git: if GitHub/Bitbucket/etc. didn’t exist, I’m sure most people would have stuck with SourceForge (or not bothered, if they had to self-host).

                                                            You can also look at Google Code to see the problem with not reaching critical mass (IMHO). There were some high-profile projects there, but then I’m sure execs said: why are we bothering to host 1% (a guess) of what is on GitHub?

                                                            1. 1

                                                              ‘Network effects’ doesn’t mean you’re mindless automatons. It means people are likely to jump on bandwagons. It also means that making it easy to connect people together, especially by removing friction, makes more of them do stuff together. The massive success of GitHub vs other interfaces argues my point for me.

                                                              “Maybe it would be more if the code was on github”

                                                              That’s what I was telling you, rephrased. Also, expanded to the average project, since some will get contributions and some won’t.

                                                          2. 4

                                                            Heck even I won’t move off of it until there is a superior alternative, sorry.

                                                          3. 3

                                                            I thought about a project along these lines a while ago. Something along the lines of cgit, which could offer a more or less clean and consistent UI and an easy-to-set-up backend, making federation viable in the first place. Ideally, it wouldn’t even need accounts; instead, email + GPG could be used, for example by including an external mailing list in the repo, with a few additional markup features such as internal linking and code highlighting. This “web app” would then effectively only serve as an aggregator of external information onto one site, making it even easier to federate the entire structure, since the data wouldn’t necessarily be bound to one server! If one were to be really evil, one could also use GitHub as a backend…

                                                            I thought about all of this for a while, but the big downsides from my perspective seemed to be: a lack of server reliability (which is sadly something we have come to accept with tools such as NPM and Go’s packaging); asynchronous updates could mess stuff up unless there were a central reference repo per project; and the social element in social coding could be hard to achieve. Think of stars, followings, likes, fork overviews, etc.: these are all factors which help projects and devs display their reputation, for better or for worse.

                                                            Personally, I’m a bit sceptical that something along these lines would manage to be genuinely attractive, at least for now.

                                                            1. 3

                                                              Lacks a web interface, but there are efforts to use ipfs for a storage backend.

                                                              https://github.com/cryptix/git-remote-ipfs

                                                              1. 3

                                                                I think there have been proposals for GitLab and Gitea/Gogs to implement federated pull requests. I would certainly love it, since I put most of my projects into my personal Gitea instance anyway. GitHub is merely a code mirror where people happen to be able to file issues.

                                                                1. 3

                                                                  I think this would honestly get the work done: federated pull requests and federated issue discussion.

                                                                  1. 1

                                                                    I’m personally a bit torn on whether a federated GitHub-like service should handle it like a fork, i.e., somebody opens an issue on their instance, you get a small notification, and you can follow the issue in your own repo.

                                                                    Or whether it should merely allow people to use my instance to file issues directly there, e.g. with OAuth or OpenID Connect. Probably something we’ll have to figure out in the process.

                                                                    1. 2

                                                                      Just make it work like GNU social/Mastodon: “username@server.com posted an issue on your repo.” You can block a server, have a whitelist, or let anyone in; the world is your oyster.
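That moderation model is easy to sketch. The function and parameter names below are hypothetical, purely for illustration of the block/allow idea:

```python
def accept_issue(sender, allowlist=None, blocklist=frozenset()):
    """Decide whether a federated issue from 'user@server' is let in."""
    server = sender.rsplit("@", 1)[-1]
    if server in blocklist:
        return False          # explicitly blocked server
    if allowlist is not None:
        return server in allowlist  # whitelist mode
    return True               # open to the world


# Open mode with one blocked server:
accept_issue("alice@mastodon.social")
accept_issue("bob@spam.example", blocklist={"spam.example"})
```

The same check could run on every incoming federated event, not just issues.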

                                                                  2. 1

                                                                    Would be nice if I could use my gitlab.com account to make MRs on other gitlab servers.

                                                                  3. 1

                                                                    I always thought it would be neat to try to implement this via Upspin, since it already provides identity, permissions, and a global (secure) namespace. Basically, my handwavy thoughts are: design what your “federated GitHub” repo looks like in terms of files. This becomes the API, or contract, for federation. Maybe certain files are really not files but essentially RPCs, implemented by a custom Upspin server. You have an issue directory, your actual git directory, and whatever else you feel is important for managing a software project on git, represented in a file tree. Now create a local stateless web interface that anyone can fire up (assuming you have an Upspin user), and you can browse the global Upspin filesystem, interact with repos, make pull requests, and file issues.

                                                                    I was thinking that centralized versions of this could exist, like GitHub, for usability for most users. In that case, users’ private keys are actually managed by the GitHub-like service itself, as a base case to achieve equal usability for the masses. The main difference is that the GitHub-like service exports all the important information via Upspin for others to interact with via their own clients.

                                                                  1. 1

                                                                    While this would likely help some people, it really feels more like “How to set up a basic vagrant development environment” than at all Erlang related to me, honestly.

                                                                      1. 1

                                                                        OK, thanks!

                                                                        1. 1

                                                                          The question is why it’s taking this long to just generate a new cert with the extra SAN…

                                                                          1. 4

                                                                            No one is paid to work on lobsters. If you know ansible and letsencrypt you should be able to help out.

                                                                            1. 1

                                                                              Well, I don’t really know how the current Let’s Encrypt cert was generated, but adding a SAN is literally just another argument. I did ask about it when it came up on IRC three weeks ago, but didn’t get a reply, and figured it would probably be fixed pretty quickly, so I completely forgot about it.

                                                                              1. 1

                                                                                It was manually created with certbot but, as noted in the bug, should probably be replaced with acmeclient to have far fewer moving parts, if nothing else.

                                                                                It’d be great to have someone who knows the topic well help the issue along in any capacity, if you have the spare attention.

                                                                                1. 1

                                                                                  I’ve done entirely too much work with acmeclient to automate certs for http://conj.io and some other properties I run. Will try and find time this weekend to take a run at this.

                                                                                  1. 1

                                                                                    That, or use dehydrated: certificates go in a text file, one certificate per line, with each domain separated by a space.
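For reference, dehydrated reads that list from a `domains.txt` file. A hypothetical entry covering the extra SAN discussed above might look like:

```text
# domains.txt: the first name on a line is the certificate's primary
# name; the remaining names become additional SANs on that certificate.
lobste.rs www.lobste.rs
```

Each line produces one certificate, so related names should share a line.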

                                                                          1. 1

                                                                            Don’t use WPA2, it may have been cracked, stick to something secure like WEP instead :)

                                                                            1. 3

                                                                              I’m honestly not sure if you’re serious or not, and that will probably bother me for a few minutes. Well played.

                                                                              1. 1

                                                                                Someone made a similar comment from a reddit thread so I thought I’d copy it here :)

                                                                              1. 4

                                                                                Ugh, I really dislike it when they don’t include the slides as shown to the audience in the video. It becomes much more annoying when they skip back and forward, and I also watch a lot of talks on my phone on public transit, which makes a lot of talks almost inaccessible, because you need the slides for context in most of them.

                                                                              1. 8

                                                                                I think the takeaway here is a) don’t treat every error on an HTTP request as an invalid token (I’m not familiar with the Github API, but I suppose it returns 401 Unauthorized correctly) and b) don’t delete important data, but flag it somehow.

                                                                                1. 5

                                                                                  It returns a 404, which is a bit annoying, since if you fat-finger your URL you’ll get the same response as when a token doesn’t exist.

                                                                                  https://developer.github.com/v3/oauth_authorizations/#check-an-authorization

                                                                                  Invalid tokens will return 404 NOT FOUND

                                                                                  I’ve since moved to using a pattern of wrapping all external requests in objects whose state we can explicitly check, instead of relying on native exceptions coming from underlying HTTP libraries. It makes things like checking the explicit status code in the face of a non-200 response easier.

                                                                                  I might write about that pattern in the future. Here’s the initial issue with some more links: https://github.com/codetriage/codetriage/issues/578
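The wrapping pattern described above might look something like this rough sketch. The `WrappedResponse` name and its properties are invented for illustration, not the actual CodeTriage code:

```python
import urllib.error
import urllib.request


class WrappedResponse:
    """Wraps the outcome of an external HTTP request so callers
    inspect explicit state instead of rescuing library exceptions."""

    def __init__(self, status, body=""):
        self.status = status
        self.body = body

    @property
    def ok(self):
        return 200 <= self.status < 300

    @property
    def client_error(self):
        return 400 <= self.status < 500

    @property
    def server_error(self):
        return self.status >= 500


def fetch(url):
    try:
        with urllib.request.urlopen(url) as resp:
            return WrappedResponse(resp.status, resp.read().decode())
    except urllib.error.HTTPError as e:
        # A non-2xx response becomes data, not a raised exception.
        return WrappedResponse(e.code, e.read().decode())
```

Callers then branch on `resp.client_error` vs `resp.server_error` explicitly, which is exactly the distinction the outage story hinges on.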

                                                                                  1. 3

                                                                                    Why not try to get issues, and if it fails with a 401, you know the token is bad? You can double check with the auth_is_valid method you’re using now…

                                                                                    1. 2

                                                                                      That’s a valid strategy.

                                                                                      Edit: I like it, I think this is the most technically correct way to move forwards.

                                                                                    2. 1

                                                                                      Did the Github API return a 404 Not Found instead of a 5xx during the outage?

                                                                                      1. 1

                                                                                        No clue.

                                                                                        1. 1

                                                                                          Then there’s your problem. Your request class throws RequestError on every non-2xx response, and auth_is_valid? thinks any RequestError means the token is invalid. In reality you should only take 4xx responses to mean the token is invalid – not 5xx responses, network layer errors, etc.
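A hedged sketch of that distinction, using names that mirror the description above (`RequestError`, `auth_is_valid`) but not the actual application code:

```python
class RequestError(Exception):
    """Raised by the request layer for any non-2xx response."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def auth_is_valid(check_token, token):
    """Only a 4xx means the token is bad. A 5xx (or network error)
    means 'unknown', so it must propagate rather than revoke the token."""
    try:
        check_token(token)
        return True
    except RequestError as e:
        if 400 <= e.status < 500:
            return False  # the token really is invalid
        raise  # outage or transient failure: don't delete anything
```

With this shape, a GitHub outage returning 5xx raises instead of silently reporting every token as invalid.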

                                                                                          1. 1

                                                                                            Yep, that’s what OP in the thread said. I mention it in the post as well.

                                                                                    3. 2

                                                                                      I think the takeaway is that programmers are stupid.

                                                                                      Programs shouldn’t delete/update anything, only insert. Views/triggers can update reconciled views so that if there’s a problem in the program (2) you can simply fix it and re-run the procedure.

                                                                                      If you do it this way, you can also get an audit trail for free.

                                                                                      If you do it this way, you can also scale horizontally for free if you can survive a certain amount of split/brain.

                                                                                      If you do it this way, you can also scale vertically cheaply, because inserts can be sharded/distributed.

                                                                                      If you don’t do it this way – this way which is obviously less work, faster and simpler and better engineered in every way, then you should know it’s because you don’t know how to solve this basic CRUD problem.

                                                                                      Of course, the stupid programmer responds with some kind of made-up justification, like saving disk space in an era where disk is basically free, or enterprise, or maybe this is something to do with unit tests or some other garbage. I’ve even heard a stupid programmer defend this crap because the unit tests need to be idempotent, and all I can think is this fucking nerd ate a dictionary and is taking it out on me.

                                                                                      I mean, look: I get it, everyone is stupid about something, but to believe that this is a specific, critical problem like having to do with 503 errors instead of a systemic chronic problem that boils down to a failure to actually think really makes it hard to discuss the kinds of solutions that might actually help.

                                                                                      With a 503 error, the solution is “try harder” or “create extra update columns” or whatever. But we can’t try harder all the time, so there’ll always be mistakes. Is this inevitable? Can business truly not figure out when software is going to be done?

                                                                                      On the other hand, if we’re just too fucking stupid to program, maybe we can work on trying to protect ourselves from ourselves. Write-only data is a massive part of my mantra, and I’m not so arrogant as to pretend it’s always been that way, but I know the only reason I do it is because I deleted a shit-tonne of customer data by accident and had the insight that I’m a fucking idiot.
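The insert-only scheme with a reconciling view can be sketched in a few lines of SQLite. The table, view, and column names are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Every change is an insert; nothing is ever updated or deleted,
-- so the table itself is the audit trail.
CREATE TABLE token_events (
    user_id     INTEGER NOT NULL,
    token       TEXT,
    recorded_at TEXT DEFAULT (datetime('now'))
);

-- The reconciled view exposes only each user's latest row.
CREATE VIEW current_tokens AS
SELECT user_id, token
FROM token_events t
WHERE rowid = (SELECT MAX(rowid) FROM token_events
               WHERE user_id = t.user_id);
""")

# "Updating" a token is just another insert; the old value survives.
conn.execute("INSERT INTO token_events (user_id, token) VALUES (1, 'abc')")
conn.execute("INSERT INTO token_events (user_id, token) VALUES (1, 'def')")
row = conn.execute(
    "SELECT token FROM current_tokens WHERE user_id = 1"
).fetchone()
```

If a bug ever writes bad rows, the view definition can be fixed and re-run over the untouched event history.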

                                                                                      1. 4

                                                                                        I agree with the general sentiment. It took me about 3 read-throughs to parse through all the “fucks” and “stupids”. I think there’s perhaps a more positive and less hyperbolic way to frame this.

                                                                                        Append-only data is a good option, and basically what I ended up doing in this case. It pays to know what data is critical and what isn’t. I referenced acts_as_paranoid, and it pretty much does what you’re talking about: it makes a table append-only, and when you modify a record it saves an older copy of that record. Tables can get HUGE, like really huge, as in the largest tables I’ve ever heard of.

                                                                                        /u/kyrias pointed out that large tables have a number of downsides such as being able to perform maintenance and making backups.

                                                                                        1. 2

                                                                                          You can do periodic data warehousing to keep the tables as arbitrarily small as you’d like, but that introduces the possibility of programmer error when doing the warehousing. It’s an easier problem to solve than making sure every destructive write is correct in every scenario, though.

                                                                                          1. 1

                                                                                            Tables can get HUGE, like really huge, as in the largest tables i’ve ever heard of

                                                                                            I have tables with trillions of rows in them, and while I don’t use MySQL most of the time, even MySQL can cope with that.

                                                                                            Some people try to do indexes, or they read a blog that told them to 1NF everything, and this gets them nowhere fast, so they’ll think it’s impossible to have multi-trillion-row tables, but if we instead invert our thinking and assume we have the wrong architecture, maybe we can find a better one.

                                                                                            /u/kyrias pointed out that large tables have a number of downsides such as being able to perform maintenance and making backups.

                                                                                            And as I responded: /u/kyrias probably has the wrong architecture.

                                                                                          2. 2

                                                                                            Of course, the stupid programmer responds with some kind of made up justification, like saving disk space in an era where disk is basically free

                                                                                            It’s not just about storage costs though. For instance at $WORK we have backups for all our databases, but if we for some reason would need to restore the biggest one from a backup it would take days where all our user-facing systems would be down, which would be catastrophic for the company.

                                                                                            1. 1

                                                                                              You must have the wrong architecture:

                                                                                              I fill about 3.5 TB of data every day, and it absolutely would not take days to recover my backups (I have to test this periodically due to audit).

                                                                                              Without knowing what you’re doing I can’t say, but something I might do differently: Insert-only data means it’s trivial to replicate my data into multiple (even geographically disparate) hot-hot systems.

                                                                                              If you do insert-only data from multiple split brains, it’s usually possible to get hot/cold easily, with the risk of losing (perhaps only temporarily) a few minutes of data in the event of catastrophe.

                                                                                            2. 0

                                                                                              Unfortunately, if you hold any EU user data, you will have to perform an actual delete when an EU user asks you to, if you want to be compliant. I like the idea of the persistence layer being an event log from which you construct views as necessary. I’ve heard that it’s possible to use this for almost everything by storing an association of random-id to person and then just deleting that association when asked, in order to stay compliant, but I haven’t actually looked into that carefully myself.
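The association trick mentioned above can be sketched as follows. This is a toy illustration with invented names, not legal advice; whether deleting only the mapping satisfies an erasure request depends on your regulator's guidance:

```python
import uuid

events = []      # immutable event log, keyed only by opaque pseudonyms
identities = {}  # pseudonym -> personal data; the only mutable mapping


def record(person, action):
    """Append an event, reusing the person's pseudonym if one exists."""
    pid = next((k for k, v in identities.items() if v == person), None)
    if pid is None:
        pid = uuid.uuid4().hex
        identities[pid] = person
    events.append((pid, action))


def erase(person):
    """Deleting the association leaves the log intact but anonymous."""
    for pid, who in list(identities.items()):
        if who == person:
            del identities[pid]


record("alice@example.com", "signup")
erase("alice@example.com")
```

After `erase`, the event row still exists, but nothing ties its pseudonym back to a person.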

                                                                                              1. 2

                                                                                                That’s not true. The ICO recognises there are technological reasons why “actual deletion” might not be performed (see page 4). Having a flag that blinds the business from using the data is sufficient.

                                                                                                1. 1

                                                                                                  Very cool. Thank you for sharing that. I was under the misconception that having someone in the company being capable of obtaining the data was sufficient to be a violation. It looks like the condition to be compliant is weaker than that.

                                                                                                  1. 2

                                                                                                    No problem. A big part of my day is GDPR-related at the moment, so I’m unexpectedly versed with this stuff.

                                                                                              2. 0

                                                                                                There’s actually a database out there that enforces the never-delete approach (together with some other very nice paradigms/features). Sadly it isn’t open source:

                                                                                                http://www.datomic.com/

                                                                                            1. 5

                                                                                              I wonder how many versions of this article with pretty much the same examples and code exist by now…

                                                                                              1. 4

                                                                                                Aside from the recent unroll.me hate, I think that’s just a mistake in their writing.

                                                                                                1. 2

                                                                                                  Mistake? AFAIK there isn’t a way to retrieve your password. https://unroll.me/a/login

                                                                                                  1. 3

                                                                                                    If they only use gmail/google oauth, they wouldn’t have a password. They’re just implying that the login uses oauth which uses your google account email/password.

                                                                                                    I used to have an unroll.me account, and don’t have a password recorded for them.

                                                                                                    1. 1

                                                                                                      I guess if they truly use OAuth for everything, then it’s fine.

                                                                                                      1. 1

                                                                                                        For Gmail, Outlook, and Yahoo they use OAuth to access your email. For AOL and iCloud they just need your password for IMAP access.

                                                                                                1. 1

                                                                                                  Hm, an annoyingly large portion of the text goes off the screen on my phone with no way to scroll over there, even in landscape mode and with “Request desktop site” enabled in chrome. Makes it rather annoying to read.

                                                                                                  1. 1

                                                                                                    I even have to go to landscape mode on my iPad to avoid the same issue, only to get lots of margin on both the left and right sides of the article. Not cool.

                                                                                                    1. 1

                                                                                                      … which makes the following rather funny:

                                                                                                      Dozuki makes documentation software for everything — from visual work instructions for manufacturing to product manuals that will make your customers love you.

                                                                                                    1. 5

                                                                                                      Part of the reason why it took us awhile to debug our issue was that we assumed that the stack trace we saw was accurate.

                                                                                                      Recall how it was compiled:

                                                                                                      clang -std=c99 -O3 -g -o inline_merge inline_merge.c
                                                                                                      

                                                                                                      As the GCC manual says with respect to combining -O with -g: “The shortcuts taken by optimized code may occasionally produce surprising results.”

                                                                                                      1. 4

                                                                                                        And relatedly from the clang man page:

                                                                                                        Note that Clang debug information works best at -O0.

                                                                                                        1. 1

                                                                                                          So one should recompile for the purpose of debugging?

                                                                                                          1. 7

                                                                                                            Yes.

                                                                                                            GCC has -Og, which turns on all optimizations that can’t affect debugging.

                                                                                                            1. 2

                                                                                                              Or design your software in such a way that the stack trace is not needed for debugging. Which is hard, for sure, but the current trend of depending on stack traces for everything (Java, Python, for example) is a bit too extreme IMO. For comparison, in OCaml I tend to use a result-type monad for things that can fail, at which point the compiler makes sure I do something with every error.
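A minimal sketch of that style (the function and error strings here are invented for illustration): the error becomes an ordinary value, and any match on the result has to handle both constructors.

```ocaml
(* A fallible parse returning a result instead of raising: the caller
   cannot get at the int without deciding what to do with the Error case. *)
let parse_port (s : string) : (int, string) result =
  match int_of_string_opt s with
  | Some p when p > 0 && p < 65536 -> Ok p
  | Some _ -> Error "port out of range"
  | None -> Error "not a number"

let describe s =
  match parse_port s with
  | Ok p -> Printf.sprintf "port %d" p
  | Error e -> Printf.sprintf "invalid: %s" e
```

Leaving the Error branch out of the match draws an inexhaustive-match warning from the compiler, which is the guarantee being described.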

                                                                                                        1. 10

                                                                                                          I’ve read many people say that dvorak was fine for the vim movement keys.

                                                                                                          And as for the keycaps, I’m not sure I see the problem, why not just use a blank keyboard and switch at will?

                                                                                                          1. 5

                                                                                                            Although I am in theory capable of typing without looking at the keys, in practice I do a lot of key stabbing, and a lot of one-handed typing as well. I’ve practiced this some in the dark, and it’s no fun. Definitely not interested in a blank keyboard.

                                                                                                            Anyway, same experience as the author. Learned dvorak because there were people who didn’t know dvorak, used it for a while, then found I had trouble using a qwerty keyboard. Now I just use qwerty full time, but go back and practice dvorak for a week or so at a time to maintain the skill in case I ever have a compelling reason to switch.

                                                                                                            I like dvorak for English, but find it substantially more annoying for code. And it’s a disaster for passwords. I usually set up hotkeys so I can quickly change on the fly depending on what and how much I’m typing.

                                                                                                            1. 2

                                                                                                              I love Dvorak for code! Having -_ and =+ much closer is so convenient.

                                                                                                              1. 1

                                                                                                                More than { [ ] }?

                                                                                                                1. 2

                                                                                                                  For sure, think about where it’s now positioned. Typing …) {… is so easy when ) and { are side by side. And for code that doesn’t use egyptian braces, )<enter>{ is easier for me too. When I hit enter with my pinky, and follow up with { with my middle finger, that’s natural. But trying to squeeze my middle finger into the QWERTY location for { while my pinky is still on enter totally sucks.

                                                                                                                  Meanwhile -_=+ are all typed in line with other words (i.e. variable names). And - and _ are frequently part of filenames and variables, so it’s great that they’re closest to the letter keys.

                                                                                                              2. 2

                                                                                                                I like dvorak for English, but find it substantially more annoying for code.

                                                                                                                Exactly! If I were a novelist I would probably just continue using Dvorak.

                                                                                                                1. 2

                                                                                                                  in practice I do a lot of key stabbing as well

                                                                                                                  I recently bought a laptop with a Swiss(?) keyboard layout. (It really is a monstrosity, with up to five characters on one key.) I thought I wouldn’t need to look at the keys at all and could just use my preferred keymap, but I’ve been caught out a few times. I’m just about used to it now, though.

                                                                                                                2. 4

                                                                                                                  When I am typing commands into a production machine I feel like it is only responsible of me to use a properly labelled keyboard.

                                                                                                                  This is really important when you’re on your last ssh password/smartcard PIN attempt, because you can go slow and look at what you’re doing.

                                                                                                                  1. 5

                                                                                                                    I got a blank keyboard, and I must admit that I still look at it from time to time. like for numbers, or b/v, u/i… I only do so when I start thinking “OMG this is a password, don’t get it wrong!”

                                                                                                                      Having a blank keyboard doesn’t stop you from looking at your hands. It only disappoints you when you do.

                                                                                                                    1. 5

                                                                                                                      As a happy Dvorak user I’d have to say there are better fixes to that problem. Copy it from your password manager? (You use one, right?) Type it into somewhere else, and cut and paste? Or use the keyboard viewer? (Ok that one is macOS specific, perhaps.)

                                                                                                                        Specifically re: “typing commands into prod machines” I don’t buy the argument. Commands generally don’t take effect until you hit Enter, and until then you’ve got all the time you need to review what you’ve typed. Some programs do prompt for yes/no without waiting for Enter, but the y and n keys don’t share locations between Dvorak and Qwerty anyway, so I don’t really see that as an issue either.

                                                                                                                      1. 2

                                                                                                                        Yes, the “production machines” argument is a strange one. I’d imagine it would only be an issue on a Windows system (if you’re logging in via ssh it’s immaterial) and then it would be fairly obvious quite quickly that the keyboard map is wrong. And if the keyboard map is wrong in the Dvorak vs QWERTY sense you’d quickly realise you’re typing gibberish. Or so I’d think?

                                                                                                                        Ignoring the whole issue of “you shouldn’t be logging in to a production machine to make changes”…

                                                                                                                      2. 1

                                                                                                                        In this case, I find the homing keys, reorient myself, and type whatever I need to type. (Or just use a password manager & paste). Haven’t mistyped a password in years, and I’m using Dvorak with blanks.

                                                                                                                        Homing keys are there for a reason.

                                                                                                                        Labels are only necessary when you don’t touch type. If you do, they serve no useful purpose.

                                                                                                                      3. 2

                                                                                                                        I’ve read many people say that dvorak was fine for the vim movement keys.

                                                                                                                          Dvorak is fine for Vim movement keys, but not nearly as nice as Qwerty.

                                                                                                                        And as for the keycaps, I’m not sure I see the problem, why not just use a blank keyboard and switch at will?

                                                                                                                        The problem is, when I’m entering a password or bash command sometimes I want to slow down and actually look at the keyboard while I’m typing. In sensitive production settings raw speed isn’t nearly as valuable as accuracy. A blank keyboard would not solve this problem :)

                                                                                                                        1. 6

                                                                                                                            Dvorak is fine for Vim movement keys, but not nearly as nice as Qwerty.

                                                                                                                            They actually work better with Dvorak for me, because the grouping feels more logical than on qwerty.

                                                                                                                          1. 1

                                                                                                                            Likewise: vertical and horizontal movement keys separated onto different hands rather than all on the one (and interspersed) works much better for me.

                                                                                                                          2. 2

                                                                                                                            I hate vim movement in QWERTY. I think it’s because I’m left handed, and Dvorak puts up/down on my left pointer and middle finger. For me, it’s really hard to manipulate all four directions with my right hand quickly.

                                                                                                                            1. 1

                                                                                                                              Would it make sense to use AOEU for motion then (or HTNS for right handed people)? I guess doing so may open a whole can of remapping worms though?

                                                                                                                              That won’t help with apps that don’t support remapping but which support vi-style motion though (as they’ll expect you to hit HJKL)…

                                                                                                                        1. 29

                                                                                                                          Hmm. I have just spent a week or two getting my mind around systemd, so I will add a few comments….

                                                                                                                            • Systemd is a big step forward on sysv init, and even a good step forward on upstart. Please don’t throw the baby out with the bathwater in trying to achieve what seem to be mostly political rather than technical aims. ie.

                                                                                                                          ** The degree of parallelism achieved by systemd does very good things to start up times. (Yes, that is a critical parameter, especially in the embedded world)

                                                                                                                          ** Socket activation is very nifty / useful.

                                                                                                                            ** There is a lot of learning that has gone into things like dbus: https://lwn.net/Articles/641277/ While there are things I really don’t like about dbus (cough, xml, cough)…. I respect the hard-earned experience encoded into it.

                                                                                                                          ** Systemd’s use of cgroups is actually a very very nifty feature in creating rock solid systems, systems that don’t go sluggish because a subsystem is rogue or leaky. (But I think we are all just learning to use it properly)

                                                                                                                          ** The thought and effort around “playing nice” with distro packaging systems via “drop in” directories is valuable. Yup, it adds complication, but packaging is real and you need a solution.

                                                                                                                          ** The thought and complication around generators to aid the transition from sysv to systemd is also vital. Nobody can upgrade tens of thousands of packages in one go.

                                                                                                                            TL;DR: Systemd actually gives us a lot of very very useful and important stuff. Any competing system with the faintest hope of wide adoption has a pretty high bar to meet.

                                                                                                                            The biggest sort of “WAT!?” moment for me around systemd is that it creates its own entirely new language… one that is remarkably weaker even than shell. And occasionally you find yourself explicitly invoking, yuck, shell, to get stuff done.

                                                                                                                          Personally I would have preferred it to be something like guile with some addons / helper macros.

                                                                                                                          1. 15

                                                                                                                              I actually agree with most of what you’ve said here; Systemd is definitely trying to solve some real problems, and I fully acknowledge that. The main problem I have with Systemd is the way it subsumes so much, and it’s pretty much all-or-nothing; combined with that, people do experience real problems with it, and I personally believe its design is too complicated, especially for such an essential part of the system. I’ll talk about it a bit more in my blog (along with lots of other things) at some stage, but in general the features you list are good features, and I hope to have Dinit support e.g. socket activation and cgroups (though as an optional rather than mandatory feature). On the other hand I am dead-set that there will never be a dbus-connection in the PID 1 process nor any XML-based protocol, and I’m already thinking about separating the PID 1 process from the service manager, etc.

                                                                                                                            1. 9

                                                                                                                              Please stick with human-readable logs too. :)

                                                                                                                              1. 6

                                                                                                                                Please don’t. It is a lot easier to turn machine-readable / binary logs to human-readable than the other way around, and machines will be processing and reading logs a lot more than humans.

                                                                                                                                1. 4

                                                                                                                                  Human-readable doesn’t mean freeform. It can be machine-readable too. At my last company, we logged everything as date, KV pairs, and only then freeform text. It had a natural mapping to JSON and protocol buffers after that.

                                                                                                                                  https://github.com/uber-go/zap This isn’t what we used, but the general idea.
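A minimal sketch of that shape (the field names here are invented for illustration): a timestamp, then key=value pairs, then freeform text. The line stays readable to a human while remaining trivially machine-parsable, and the pairs map naturally onto JSON keys.

```shell
# One log line: timestamp, key=value pairs, then freeform text.
line='2018-05-25T10:00:00Z level=error service=billing msg="card declined"'

# Still machine-readable with standard tools: extract the level field.
printf '%s\n' "$line" | grep -o 'level=[a-z]*' | cut -d= -f2
```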

                                                                                                                                  1. 3

                                                                                                                                    Yeah, you can do that. But then it becomes quite a bit harder to sign, encrypt, or index logs. I still maintain that going binary->human readable is more efficient, and practical, as long as computers do more processing on the logs than humans do.

                                                                                                                                    Mind you, I’m talking about storage. The logs should be reasonably easy for a human to process when emitted, and a mapping to a human-readable format is desirable. When stored, human-readability is, in my opinion, a mistake.

                                                                                                                                    1. 2

                                                                                                                                      You make good points. It’s funny, because I advocated hard for binary logs (and indeed stored many logs as protocol buffers on Kafka; only on the filesystem was it text) from systems at $dayjob-1, but when it comes to my own Linux system it’s a little harder for me to swallow. I suppose I’m looking at it from the perspective of an interactive user and not a fleet of Linux machines; on my own computer I like to be able to open my logs as standard text without needing to pipe it through a utility.

                                                                                                                                      I’ll concede the point though: binary logs do make a lot more sense as building blocks if they’re done right and have sufficient metadata to be better than the machine-readable text format. If it’s a binary log of just date + facility + level + text description, it may as well have been a formatted text log.

                                                                                                                                2. 2

                                                                                                                                  So long as they accumulate the same amount of useful info…. and are machine parsable, sure.

                                                                                                                                  journalctl spits out human readable or json or whatever.

                                                                                                                                  I suspect achieving near the same information density / speed as journalctl with plain old ASCII will be a hard ask.

                                                                                                                                  In my view I want both. Human and machine readable… how that is done is an implementation detail.

                                                                                                                                3. 4

                                                                                                                                  I’m sort of curious about which “subsume everything” bits are hurting you in particular.

                                                                                                                                  For example, subsuming the business of mounting is fairly necessary, since these days the order in which things get mounted relative to the order in which various services are run is pretty inexorable.

                                                                                                                                  I have doubts about how much of the networkd / resolved should be part of systemd…. except something that collaborates with the startup infrastructure is required. ie. I suspect your choices in dinit will be slightly harsh…. modding dinit to play nice with existing network managers or modding existing network managers to play nice with dinit or subsuming the function of network management or leaving fairly vital chunks of functionality undone and undoable.

                                                                                                                                  Especially in the world of hot plug devices and mobile data….. things get really really hairy.

                                                                                                                                  I am dead-set that there will never be a dbus-connection in the PID 1

                                                                                                                                  You still need a secure way of communicating with pid 1….

                                                                                                                                  That said, systemd process itself could perhaps be decomposed into more processes than it currently is.

                                                                                                                                  However as I hinted…. there are things that dbus gives you, like bounded trust between untrusting and untrustworthy programs, that are hard to achieve without reimplementing large chunks of dbus….

                                                                                                                                  …and then going through the long and painful process of learning from your mistakes that dbus has already gone through.

                                                                                                                                  Yes, I truly hate xml in there…. but you still need some security sensitive serialization mechanism in there.

                                                                                                                                  ie. Whatever framework you choose will still need to enforce the syntactic contract of the interface so that a untrusted and untrustworthy program cannot achieve a denial of service or escalation of privilege through abuse of a serialized interface.

                                                                                                                                  There are other things out there that do that (eg. protobuffers, cap’n’proto, …), but then you’re still in a world where desktops and bluetooth and network managers and so on need to be rewritten to use the new mechanism.

                                                                                                                                  1. 3

                                                                                                                                    For example, subsuming the business of mounting is fairly necessary, since these days the order in which things get mounted relative to the order in which various services are run is pretty inexorable.

                                                                                                                                    systemd’s handling of mounting is beyond broken. It’s impossible to get bind mounts to work successfully on boot, nfs mounts don’t work on boot unless you make systemd handle it with autofs and sacrifice a goat, and last week I had a broken mount that couldn’t be fixed. umount said there were open files, lsof said none were open. Had to reboot because killing systemd would kill the box anyway.

                                                                                                                                    It doesn’t even start MySQL reliably on boot either. Systemd is broken. Stop defending it.

                                                                                                                                    1. 3

                                                                                                                                      For example, subsuming the business of mounting is fairly necessary, since these days the order in which things get mounted relative to the order in which various services are run is pretty inexorable.

                                                                                                                                      There are a growing number of virtual filesystems that Linux systems expect or need to be mounted for full operation - /proc, /dev, /sys and cgroups all have their own - but these can all be mounted in the traditional way: by running ‘/bin/mount’ from a service. And because it’s a service, dependencies on it can be expressed. What Systemd does is understand the natural ordering imposed by mount paths as implicit dependencies between mount units, which is all well and good but which could also be expressed explicitly in service descriptions, either manually (how often do you really change your mount hierarchies…) or via an external tool. It doesn’t need to be part of the init system directly.
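A sketch of what that explicit form could look like as a service description (the syntax and names here are invented for illustration, not actual Dinit syntax):

```
# mount-data: runs /bin/mount as a one-shot scripted service; anything that
# needs /data declares an explicit dependency instead of relying on
# implicit mount-path ordering.
type = scripted
command = /bin/mount /data

# A hypothetical consumer service would then declare:
#   depends-on = mount-data
```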

                                                                                                                                      (Is it bad that systemd can do this? Not really; it is a feature. On the other hand, systemd’s complexity has I feel already gotten out of hand. Also, is this particular feature really giving that much real-world benefit? I’m not convinced).

                                                                                                                                      I suspect your choices in dinit will be slightly harsh…. modding dinit to play nice with existing network managers or modding existing network managers to play nice with dinit

                                                                                                                                      At this stage I want to believe there is another option: delegating Systemd API implementation to another daemon (which communicates with Dinit if and as it needs to). Of course such a daemon could be considered as part of Dinit anyway, so it’s a fine distinction - but I want to keep the lines between the components much clearer (than I feel they are in Systemd).

                                                                                                                                      I believe in many cases the services provided by parts of Systemd don’t actually need to be tied to the init system. Case in point, elogind has extracted the logind functionality from systemd and made it systemd-independent. Similarly there’s eudev, the Gentoo fork of the udev device node management daemon, which extracts it from systemd.

                                                                                                                                      You still need a secure way of communicating with pid 1…

                                                                                                                                      Right now, that’s via root-only unix socket, and I’d like to keep it that way. The moment unprivileged processes can talk to a privileged process, you have to worry about protocol flaws a lot more. The current protocol is compact and simple. More complicated behavior could be wrapped in another daemon with a more complex API, if necessary, but again, the boundary lines (is this init? is this service management? or is this something else?) can be kept clearer, I feel.

                                                                                                                                      Putting it another way, a lot of the parts of Systemd that required a user-accessible API just won’t be part of Dinit itself: they’ll be part of an optional package that communicates with Dinit only if it needs to, and only by a simple internal protocol. That way, boundaries between components are more clear, and problems (whether bugs or configuration issues) are easier to localise and resolve.

                                                                                                                                    2. 1

                                                                                                                                      On the other hand I am dead-set that there will never be a dbus-connection in the PID 1 process nor any XML-based protocol

                                                                                                                                      Comments like this makes me wonder what you actually know about D-Bus and what you think it uses XML for.

                                                                                                                                      1. 2

                                                                                                                                        I suppose you are hinting that I’ve somehow claimed D-Bus is/uses an XML-based protocol? Read the statement again…

                                                                                                                                        1. 1

                                                                                                                                          It certainly sounded like it anyway.

                                                                                                                                    3. 8

                                                                                                                                      Systemd solves (or attempts to) some actually existing problems, yes. It solves them from a purely Dev(Ops) perspective while completely ignoring that we use Linux-based systems in big part for how flexible they are. Systemd is a very big step towards making systems we use less transparent and simple in design. Thus, less flexible.

                                                                                                                                      And if you say that’s the point: systems need to get more uniform and less unique!.. then sure. I very decidedly don’t want to work in an industry that cripples itself like that.

                                                                                                                                      1. 8

                                                                                                                                        Hmm. I strongly disagree with that.

                                                                                                                                        As a simple example, in sysv your only “targets” were the 7 runlevels. Pretty crude.

                                                                                                                                        Alas, the sysv simplicity came at a huge cost: slow boots, since it was hard to parallelize, and Moore’s law has stopped giving us more clock cycles… it only gives us more cores these days.

                                                                                                                                        On my ubuntu xenial box I get:

                                                                                                                                        locate target | grep -E '^/(run|etc|lib)/.*.target$' | grep -v wants | wc
                                                                                                                                             61      61    2249

                                                                                                                                        (Including the 7 runlevels for backwards compatibility)

                                                                                                                                        ie. Much more flexibility.

                                                                                                                                        ie. You have much more flexibility than you ever had in sysv…. and if you need to drop into a whole world of shell (or whatever) flexibility…. nothing is stopping you.

                                                                                                                                        It’s actually very transparent…. the documentation is actually a darn sight better than sysv init ever had, and the source code is pretty readable. (Although at the user level I find I can get by mostly by looking at the .service files and guessing; a .service file is a lot easier to read than a sysv init script.)

                                                                                                                                        So my actual experience of wrangling systemd on a daily basis is it is more transparent and flexible than what we had before…..

                                                                                                                                        A bunch of the complexity is due to the need to transition from sysv/upstart to systemd.

                                                                                                                                        I can see on my box a huge amount of crud that can just be deleted once everything is converted.

                                                                                                                                        All the serious “Huh!? WTF!?” moments in the last few weeks have been around the mishmash of old and new.

                                                                                                                                        Seriously. It is simpler.

                                                                                                                                        That said, could dinit be even simpler?

                                                                                                                                        I don’t know.

                                                                                                                                        As I say, systemd has invented its own quarter-arsed language for the .unit files. Maybe if dinit used a real language…. (I call shell a half-arsed language.)

                                                                                                                                        1. 11

                                                                                                                                          You are comparing systemd to “sysv”. That’s a false dichotomy that was very aggressively pushed into every conversation about systemd. No. Those are not the only two choices.

                                                                                                                                          BTW, sysvinit is a dumb-ish init that can spawn processes and watch over them. We’ve been using it as more or less just a dumb init for the last decade or so. What you’re comparing systemd to is an amorphous, distro-specific blob of scripts, wrappers and helpers that actually did the work. Initscripts != sysvinit. Insserv != sysvinit.

                                                                                                                                          1. 4

                                                                                                                                            Ok, fair cop.

                                                                                                                                            I was using sysv as a hand-waving reference to the various flavours of /etc/init.d init scripts, including upstart, that Debian / Ubuntu had been using prior to systemd.

                                                                                                                                            My point is not to say systemd is the greatest and end point of creation… my point is it’s a substantial advance on what went before (in yocto / ubuntu / debian land) (other distros may have something better than that I haven’t experienced.)

                                                                                                                                            And I wasn’t seeing anything in the dinit aims and goals list yet that made me say, at the purely technical level, that the next step is on its way.

                                                                                                                                      2. 3

                                                                                                                                        Personally I would have preferred it to be something like guile with some addons / helper macros.

                                                                                                                                        So, https://www.gnu.org/software/shepherd/ ?

                                                                                                                                        Ah, no, you probably meant just the language within systemd. But adding systemd-like functionality to The Shepherd would do that. I think running things in containers is in, or will be, but maybe The Shepherd is too tangled up in GuixSD for many people’s use cases.

                                                                                                                                      1. 6

                                                                                                                                        I have been reading through this and I feel like POSIX MQs are really underutilized. Perhaps it’s because they don’t use the file API.

                                                                                                                                        Does anyone have thoughts on using POSIX MQs?

                                                                                                                                        1. 8

                                                                                                                                          One issue with them on Linux, at least, is that they have smallish default maximums: 8kb max message size, max 10 messages in flight at a time.

                                                                                                                                          The bigger issue though imo is that they fit in an awkward gap between stream abstractions like pipes or sockets, and full-featured message queue systems. Most people who want simple local IPC just use pipes or unix sockets (even if this requires a bit of DIY protocol around message delimiters), and people who want full-on queueing usually want queue state that persists over reboots, at least the option for network distribution, etc., so they use zeromq or similar.

                                                                                                                                          1. 4

                                                                                                                                            I’m looking at POSIX mqueues for better concurrency control than pipes but with less ceremony than sockets. Seems like that might be their sweet spot. Also, on FreeBSD, PIPE_BUF is way too small (512 bytes). I might whip up some test programs to see how well they go.

                                                                                                                                          2. 3

                                                                                                                                            I don’t think I’ve ever actually heard of them before, though I’m definitely planning on looking into using them for a few things now…

                                                                                                                                            1. 2

                                                                                                                                              They seem like a very nice tool. It’s interesting that they support priority natively, and can handle both non-blocking modes and blocking with a timeout. To clarify what mjn says, they have an 8kb default max size. Looks like the actual max size is somewhere around 16MB, which makes them more than big enough for my use cases.

                                                                                                                                              I wonder what the performance is like.

                                                                                                                                              1. 2

                                                                                                                                                Yeah, the max size (and number of messages) is configurable, but it’s a kernel parameter rather than something accessible from userspace (so the program can’t request the change itself, at least if it doesn’t run as root). Which is probably fine if you’re writing something fairly specialized, but it really reduces the number of cases I’d consider using them. I don’t want to write software where the install instructions have to tell users how to tweak their kernel parameters (and package managers don’t like to package those kinds of packages either).

                                                                                                                                                1. 2

                                                                                                                                                  That’s interesting. I wonder how that relates to containers/cgroups/etc. Can a docker container be spun up with a dedicated message size specific to that container?

                                                                                                                                                  [edit]

                                                                                                                                                  I think I partially answered my own question. It appears that POSIX message queues are part of the IPC namespace, what I’m not sure about is if /proc/sys/fs/mqueue/msgsize_max is per-container.

                                                                                                                                                  1. 2

                                                                                                                                                    Yeah, the max size (and number of messages) is configurable, but it’s a kernel parameter rather than something accessible from userspace

                                                                                                                                                    Is this the case? mq_open takes an mq_attr which lets one specify the size and max messages. There are upper bounds but they seem quite high from what I can gather.

                                                                                                                                                    1. 2

                                                                                                                                                      Of the various /proc/sys/fs/mqueue/ parameters: you can override msgsize_default and msg_default with the parameters to mq_open, but only up to the ceilings specified by msgsize_max and msg_max.

                                                                                                                                                      But on my Debian machine, the *_default and *_max parameters are the same, 8192 message size and 10 messages, so in practice you can’t actually request anything larger than the default, without tweaking settings in /proc. It’s possible other distributions ship different defaults; I’ve only checked Debian.

                                                                                                                                              1. 1

                                                                                                                                                This is just unbelievable, and if I hadn’t been following the author for many many years I don’t think I’d believe it. The fact that it also rick rolls you is …. I’m speechless.

                                                                                                                                                1. 1

                                                                                                                                                  Also check out PoC||GTFO. :D