1. 28

    I just can’t shake the feeling that Kubernetes is Google externalizing their training costs to the industry as a whole (and I feel the same applies to Go).

    1. 9

      Golang is great for general application development, IME. I like the culture of explicit error handling with thoughtful error messages, the culture of debugging with fast unit tests, when possible, and the culture of straightforward design. And interfaces are great for breaking development into components which can be developed in parallel. What don’t you like about it?

      1. 11

        It was initially the patronizing quote from Rob Pike that turned me off Go. I’m also not a fan of gofmt [1] (and I’m not a fan of opinionated software in general, unless I’m the one controlling the opinions [2]). I’m also unconvinced about the whole “unit testing” thing [5]. Also, it’s from Google [3]. I rarely mention it, because it goes against the current zeitgeist (especially at the Orange Site), and really, what can I do about it?

        [1] I’m sorry, but opening braces go on their own line. We aren’t developing on 24 line terminals anymore, so stop shoving your outdated opinions in my face.

        [2] And yes, I realize I’m being hypocritical here.

        [3] Google is (in my opinion, in case that’s not apparent) shoving what they want on the entire industry to a degree that Microsoft could only dream of. [4]

        [4] No, I’m not bitter. Really!

        [5] As an aside: up through late 2020, my department had a method of development that worked (and it did not involve anything resembling a “unit test”); in 10 years we only had two bugs get to production. In the past few months there’s been a management change and a drastic change in how we do development (Agile! Scrum! Unit tests über alles! We want push button testing!) and so far, we’ve had four bugs in production.

        Way to go!

        I should also note that my current manager retired, the other developer left for another job, and the QA engineer assigned to our team also left for another job (but has since come back because the job he moved to was worse, and we could really use him back in our office). So nearly the entire team was replaced back around December of 2020.

        1. 11

          I can’t even tell if this is a troll post or not.

          1. 1

            I can assure you that I’m not intentionally trolling, and those are my current feelings.

          2. 2

            I’m sorry, but opening braces go on their own line. We aren’t developing on 24 line terminals anymore, so stop shoving your outdated opinions in my face.

            I use a portrait monitor with a full-screen Emacs window for my programming, and I still find myself wishing for more vertical space when programming in curly-brace languages such as Go. And when I am stuck on a laptop screen I am delighted when working on a codebase which does not waste vertical space.

            Are you perhaps younger than I am, with very small fonts configured? I have found that as I age I need larger and larger fonts. Nothing grotesque yet, but I went from 9 to 12 to 14 and, in a few places, 16 points. All real 1/72” points, because I have my display settings configured that way. 18-year-old me would have thought I was ridiculous! Granted, you’ve been at your current employer at least 10 years, so I doubt you are 18 🙂

            I’m also unconvinced about the whole “unit testing” thing … my department had a method of development that worked (and it did not involve anything resembling a “unit test”); in 10 years we only had two bugs get to production. In the past few months there’s been a management change and a drastic change in how we do development (Agile! Scrum! Unit tests über alles! We want push button testing!) and so far, we’ve had four bugs in production.

            I suspect that the increase in bugs has to do with the change in process rather than the testing regime. Adding more tests on its own can only lead to more bugs if either incorrect tests flag correct behaviour as bugs (leading to buggy ‘bugfixes,’ or rework to fix the tests), or if correct tests for unimportant bugs lead to investing resources inefficiently, or if the increased emphasis leads to worse code architecture or rework rewriting old code to conform to the new architecture (I think I covered all the bases here). OTOH, changing development processes almost inevitably leads to poor outcomes in the short term: there is a learning curve; people and secondary processes must adapt &c.

            That is worth it if the long-term outcomes are sufficiently better. In the specific case of unit testing, I think it is worth it, especially in the long run and especially as team size increases. The trickiest thing about it in my experience has been getting the units right. I feel pretty confident about the right approach now, but … ask me in a decade!

            1. 2

              Are you perhaps younger than I am, with very small fonts configured?

              I don’t know, you didn’t give your age. I’m currently 52, and my coworkers (back when I was in the office) often complained about the small font size I use (and have used).

              I suspect that the increase in bugs has to do with the change in process rather than the testing regime.

              The code (and it’s several different programs that comprise the whole thing) was not written with unit testing in mind (even though it was initially written in 2010, it’s in C89/C++98, and the developer who wrote it didn’t believe in unit tests). We do have a regression test that tests end-to-end [1] but there are a few cases that as of right now require manual testing [2], which I (as a dev) can do, but generally QA does a more in-depth testing. And I (or rather, we devs did, before the major change) work closely with the QA engineer to coordinate testing.

              And that’s just the testing regime. The development regime is also being forcibly changed.

              [1] One program to generate the data required, and another program that runs the eight programs required (five of which aren’t being tested but need to be endpoints our stuff talks to) and runs through 15,800+ tests we have (it takes around two minutes). It’s gotten harder to add tests to it (the regression test is over five years old) due to the nature of how the cases are generated (automatically, and not all cases generated are technically “valid” in the sense we’ll see it in production).

              [2] Our business logic module queries two databases at the same time (via UDP—they’re DNS queries), so how does one automate the testing of result A returns before result B, result B returns before result A, A returns but B times out, B returns and A times out? The new manager wants “push button testing”.

              1. 1

                [2] Our business logic module queries two databases at the same time (via UDP—they’re DNS queries), so how does one automate the testing of result A returns before result B, result B returns before result A, A returns but B times out, B returns and A times out? The new manager wants “push button testing”

                Here are three options, but there are many others:

                1. Separate the networking code from the business logic, test the business logic
                2. Have the business logic send to a test server running on localhost, have it send back results ordered as needed
                3. Change the routing configuration or use netfilter to rewrite the requests to a test server, have it send back results ordered as needed.

                Re-ordering results from databases is a major part of what Jepsen does; you could take ideas from there too.
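
A minimal sketch of option 2, assuming Python rather than the C and Lua the thread's code base uses: two throwaway UDP responders on localhost, each with a configurable delay (or no reply at all), so a test can force each of the four orderings. The names `start_fake_server`, `query_both`, and `run_case` are hypothetical, the payloads are placeholder bytes rather than real DNS messages, and the delays and timeout are arbitrary values picked for illustration.

```python
import socket
import threading
import time


def start_fake_server(reply, delay):
    """Start a one-shot UDP responder on localhost.

    delay is the number of seconds to wait before answering;
    delay=None means the server swallows the query (simulated timeout).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    addr = sock.getsockname()

    def serve():
        _, client = sock.recvfrom(512)
        if delay is not None:
            time.sleep(delay)
            sock.sendto(reply, client)
        sock.close()

    threading.Thread(target=serve, daemon=True).start()
    return addr


def query_both(addr_a, addr_b, timeout):
    """Query both servers at once; return the replies in arrival order."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(b"query", addr_a)
    sock.sendto(b"query", addr_b)
    deadline = time.monotonic() + timeout
    replies = []
    while len(replies) < 2:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        sock.settimeout(remaining)
        try:
            data, _ = sock.recvfrom(512)
        except socket.timeout:
            break
        replies.append(data)
    sock.close()
    return replies


def run_case(delay_a, delay_b, timeout=0.5):
    """Run one scenario and report which replies arrived, in order."""
    addr_a = start_fake_server(b"A", delay_a)
    addr_b = start_fake_server(b"B", delay_b)
    return query_both(addr_a, addr_b, timeout)
```

Calling `run_case(0.0, 0.2)`, `run_case(0.2, 0.0)`, `run_case(0.0, None)`, and `run_case(None, 0.0)` covers A before B, B before A, B timing out, and A timing out. In a real harness the code under test would replace `query_both`, and only the fake servers would be kept.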

                1. 1
                  1. Even if that was possible (and I wish it was), I would still have to test the networking code to ensure it’s working, per the new regime.
                  2. That’s what I’m doing
                  3. I’m not sure I understand what you mean by “routing configuration”, but I do understand what “netfilter” is, and my response to that is—the new regime wants “push button testing,” and if there’s a way to automate that, then that is an option.
                  1. 1
                    1. Yes, of course the networking code would still need to be tested.

                      Ideally, the networking code would have its own unit tests. And, of course, unit tests don’t replace integration tests. Test pyramid and such.

                    2. 🚀

                    3. netfilter can be automated. It’s an API.

                    What’s push button testing?

                    1. 1

                      You want to test the program. You push a button. All the tests run. That’s it. Fully automated testing.
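
As a rough illustration of "push a button, all the tests run", combined with option 1 from earlier in the thread (business logic separated from the networking code), here is a sketch in Python. Everything in it is hypothetical: `merge_results`, the `TIMED_OUT` marker, and the fallback values "unknown" and "unrated" are invented for the example and are not the rules of the system being discussed. Arrival order and timeouts are passed in as plain data, so no sockets are involved.

```python
import unittest

TIMED_OUT = None  # stand-in for "this query never answered"


def merge_results(events):
    """Fold (source, value) events, given in arrival order, into one answer.

    Hypothetical rule: keep whatever arrived, fall back to defaults
    for any source that timed out.
    """
    answer = {"name": "unknown", "reputation": "unrated", "arrived": []}
    for source, value in events:
        if value is TIMED_OUT:
            continue
        answer[source] = value
        answer["arrived"].append(source)
    return answer


class MergeResultsTest(unittest.TestCase):
    def test_name_before_reputation(self):
        got = merge_results([("name", "Alice"), ("reputation", "good")])
        self.assertEqual(got["arrived"], ["name", "reputation"])

    def test_reputation_before_name(self):
        got = merge_results([("reputation", "good"), ("name", "Alice")])
        self.assertEqual(got["arrived"], ["reputation", "name"])

    def test_reputation_times_out(self):
        got = merge_results([("name", "Alice"), ("reputation", TIMED_OUT)])
        self.assertEqual(got["name"], "Alice")
        self.assertEqual(got["reputation"], "unrated")

    def test_name_times_out(self):
        got = merge_results([("reputation", "good"), ("name", TIMED_OUT)])
        self.assertEqual(got["name"], "unknown")
        self.assertEqual(got["reputation"], "good")


def run_all():
    """The 'button': load every test in the class and run it."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MergeResultsTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_all()` runs all four cases and returns a result object whose `wasSuccessful()` method reports the overall outcome. As noted in the thread, this does not remove the need to test the networking code itself.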

            2. 1

              Would love to hear about your prior development method. Did adopting the new practices have any upsides?

              1. 3

                First off, our stuff is a collection of components that work together. There are two front-end pieces (one for SS7 traffic, one for SIP traffic) that then talk to the back-end (that implements the business logic). The back-end makes parallel DNS queries [1] to get the required information, mucks with the data according to the business logic, then returns data to the front-ends to ultimately return the information back to the Oligarchic Cell Phone Companies. Since this process happens as a call is being placed we are on the Oligarchic Cell Phone Companies’ network, and we have some pretty short time constraints. And due to this, not only do we have some pretty severe SLAs, but any updates have to be approved 10 business days before deployment by said Oligarchic Cell Phone Companies. As a result, we might get four deployments per year [2].

                And the components are written in a combination of C89, C++98 [3], C99, and Lua [4].

                So, now that you have some background, our development process. We do trunk based development (all work done on one branch, for the most part). We do NOT have continuous deployment (as noted above). When working, we developers (which never numbered more than three) would do local testing, either with the regression test, or another tool that allows us to target a particular data configuration (based off the regression test, which starts eight programs, five of which are just needed for the components being tested). Why not test just the business logic? Said logic is spread throughout the back-end process, intermixed with all the I/O it does (it needs data from multiple sources, queried at the same time).

                Anyway, code is written, committed (main line), tested, fixed, committed (main line), repeat, until we feel it’s good. And the “tested” part not only includes us developers, but also QA at the same time. Once it’s deemed working (using both regression testing and manual testing), we then officially pass it over to QA, who walks it down the line from the QA servers, staging servers and finally (once we get permission from the Oligarchic Cell Phone Companies) into production, where not only devops is involved, but QA and the developer whose code is being installed (at 2:00 am Eastern, Tuesday, Wednesday or Thursday, never Monday or Friday).

                Due to the nature of what we are dealing with, testing at all is damn near impossible (or rather, hideously expensive, because getting actual cell phone traffic through the lab environment involves, well, being a phone company (which we aren’t), very expensive and hard to get equipment, and a very expensive and hard to get laboratory setup (that will meet FCC regulations, blah blah yada yada)) so we do the best we can. We can inject messages as if they were coming from cell phones, but it’s still not a real cell phone, so there is testing done during deployment into production.

                It’s been a 10 year process, and it has gotten better until this past December.

                Now it’s all Agile, scrum, stories, milestones, sprints, and unit testing über alles! As I told my new manager, why bother with a two week sprint when the Oligarchic Cell Phone Companies have a two year sprint? It’s not like we ever did continuous deployment. Could more testing be done automatically? I’m sure, but there are aspects that are very difficult to test automatically [5]. Also, more branch development. I wouldn’t mind this so much, except we’re using SVN (for reasons that are mostly historical at this point) and branching is … um … not as easy as in git. [6] And the new developer sent me diffs to ensure his work passes the tests. When I asked him why he didn’t check the new code in, he said he was told by the new manager not to, as it could “break the build.” But we’ve broken the build before this; all we do is just fix the code and check it in [8]. But no, no “breaking the build”, even though we don’t do continuous integration, nor continuous deployment, and what deployment process we do have locks the build number from Jenkins of what does get pushed (or considered “gold”).

                Is there any upside to the new regime? Well, I have rewritten the regression test (for the third time now) to include such features as “delay this response” and “did we not send a notification to this process”. I should note that this is code for us, not for our customer, which, need I remind people, is the Oligarchic Cell Phone Companies. If anyone is interested, I have spent June and July blogging about this (among other things).

                [1] Looking up NAPTR records to convert phone numbers to names, and another set to return the “reputation” of the phone number.

                [2] It took us five years to get one SIP header changed slightly by the Oligarchic Cell Phone Companies to add a bit more context to the call. Five years. Continuous deployment? What’s that?

                [3] The original development happened in 2010, and the only developer at the time was a) very conservative, b) didn’t believe in unit tests. The code is not written in a way to make it easy to unit test, at least, as how I understand unit testing.

                [4] A prototype I wrote to get my head around parsing SIP messages that got deployed to production without my knowing it by a previous manager who was convinced the company would go out of business if it wasn’t. This was six years ago. We’re still in business, and I don’t think we’re going out of business any time soon.

                [5] As I mentioned, we have multiple outstanding requests to various data sources, and other components that are notified on a “fire and forget” mechanism (UDP, but it’s all on the same segment) that the new regime wants to ensure get notified correctly. Think about that for a second: how do you prove a negative? That is, something that wasn’t supposed to happen (like a component not getting notified) didn’t happen?

                [6] I think we’re the only department left using SVN—the rest of the company has switched to git. Why are we still on SVN? 1) Because the Solaris [7] build servers aren’t configured to pull from git yet and 2) the only redeeming feature of SVN is the ability to checkout a subdirectory, which given the layout of our repository, and how devops want the build servers configured, is used extensively. I did look into using git submodules, but man, what a mess. It totally doesn’t work for us.

                [7] Oh, did I neglect to mention we’re still using Solaris because of SLAs? Because we are.

                [8] Usually, it’s Jenkins that breaks the build, not the code we checked in. Sometimes, the Jenkins checkout fails. Devops has to fix the build server [7] and try the call again.
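
On the "prove a negative" question in footnote [5]: one common compromise, sketched here in Python under assumptions not taken from the thread, is to have the test stand in for the component that should stay silent, listen for a bounded window, and treat silence as a pass. This only shows that nothing arrived within the window, not that nothing ever would. The names `make_listener`, `notify`, and `received_within` are hypothetical, as is the 0.2 second window.

```python
import socket


def make_listener():
    """Bind a UDP socket on localhost to stand in for the notified component."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    return sock


def notify(addr, payload):
    """Fire-and-forget notification: send one datagram and move on."""
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(payload, addr)
    sender.close()


def received_within(sock, window):
    """Return the datagram that arrives within `window` seconds, or None."""
    sock.settimeout(window)
    try:
        data, _ = sock.recvfrom(2048)
        return data
    except socket.timeout:
        return None
```

A test for "this component must not be notified" would run the scenario and then assert `received_within(listener, 0.2) is None`. The matching positive test calls `notify(listener.getsockname(), b"ping")` first and expects the payload back. The window length is a trade-off between test run time and the risk of missing a late notification.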

                1. 1

                  As a result, we might get four deployments per year [2]

                  AIUI most agile practices are to decrease cycle time and get faster feedback. If you can’t, though, then you can’t! Wrong practices for the wrong context.

                  I feel for you.

                  1. 1

                    Thank you! More grist for my “unit testing is fine in its place” mill.

                    Also: hiring new management is super risky.

          1. 1

            This is just a laptop with a proprietary parts system.

            Not sure if this advertisement belongs here.

            1. 8

              proprietary parts system

              That’s true for some of the parts I’m sure (due to necessity since the market doesn’t have a concept of “standardized laptop enclosures”), but the expansion cards are just internal USB C dongles. They’ve also released the CAD files for the expansion card housing, so people can make their own.

              1. 5

                Maybe, apart from the screen, expansion cards with ports on them, speakers, memory, storage, camera, microphone, plastic bit around the screen, wifi module.

                Nothing is stopping people from buying the same components or compatible components or even making new compatible components. If I am wrong and naive then please tell me why.

                1. 1

                  Proprietary as in “only used by the one company”, or proprietary as in “fees required for production of compatible devices”?

                  If the former, that’s how most good hardware standards start off - someone makes their version and shows it can work (and gains nontrivial marketshare), then others produce components that can match.

                  If the latter, well, that’s news to me.

                1. 2

                  Any news on when it will be possible to ship to Europe, namely Poland or Germany?

                    1. 25

                      It looks like the maintainer has resigned due to IRL harassment from channers: https://github.com/tenacityteam/tenacity/issues/99

                      1. 9

                        Based on their followup comment, seems like harassment is an understatement, they were assaulted. :((

                        1. 4

                          Assault with a knife, and an ongoing investigation by the Federal Criminal Police Office. So it doesn’t actually matter anymore what 4chan might have been bothered by; I hope they really catch those guys for good.

                          1. 6

                            It’s likely that they’ll catch and convict the one person who did the assault, and that nobody else will have any liability. I say this as somebody who’s followed the activities of hate groups for years.

                            1. 1

                              Well that would be at least something. If we can push such activity back to online harassment it’ll already be a win. Or rather: I wouldn’t be surprised if they can’t find out where/who it was and the charge is so light for some technical reasons that nothing actually happens. I think it’s fair to convict at least the person that was ready enough to start going at people with a knife, these are ticking bombs anyway in my experience. But yeah, it’s probably not the last “raid” of 4chan.

                          2. 2

                            Is the context for this preserved somewhere? The comment is replying to @alicemargatroid but I don’t see any posts from them in that issue. Seems like GitHub may have Optimized our Experience.

                        2. 4

                          As near as I can tell in the five minutes I’m willing to spend looking into this, the joke on 4chan was that the project should be named “Sneedacity”. Apparently “sneed” is some sort of meme, in-joke, or something. And instead of leaving it as just some joke comment people started “campaigning” to name it Sneedacity.

                          🤷

                          1. 5

                            Huh. I assumed it was a play on Au (gold) vs Sn (tin) with “ee” filled in to make it pronounceable.

                            1. 18

                              Your mind is operating on a slightly higher level than 4chan…

                              1. 4

                                I’m glad I’m at least operating on a different level, whether you want to call it higher or lower. Wow. When I first heard about the policy change announcements, I thought about grabbing the source from the last change set before the transition, tossing it into a git repo, and putting builds online for the platforms I use + Windows. ’Cause I use it regularly and generally build it from source for myself.

                                Now I’ll just keep building it for myself, and won’t jump into this fray. It’s not like I was really going to do any more maintenance than fixing the odd wx upgrade breakage anyway. Bleh

                            2. 4

                              Apparently it’s 4chan-speak for “special needs”, though with the amount of fake symbol-recontextualizing 4chan does I’m not sure if I believe it.

                              1. 4

                                Formerly Chuck’s.

                                1. 1

                                  Eh?

                                  1. 4
                                    1. 2

                                      “Sneed’s feed and seed” “Formerly Chuck’s”

                                  2. 4

                                    I believe a poll organised by the dev for the name was won by “sneedacity” and the dev refused to use the name therefore starting this situation.

                                    1. 14

                                      Sorry, but this reads as blaming the victim. Sure, the dev decided not to use the name, but an angry 4chan mob appearing in front of his place is way beyond any reasonable escalation.

                                      1. 9

                                        I am stating the facts as I know them, please read less into a single sentence.

                                        1. 6

                                          I don’t think it was intended like that at all: it was just establishing what happened exactly, not assigning blame. That’s how I read it anyway.

                                        2. 4

                                          The term “legitimately won” is basically meaningless when it comes to internet polls.

                                          1. 2

                                            Well yeah, but what I mean is that people came and voted and didn’t “hack” the result, whatever that is supposed to mean.

                                    1. 22

                                      I’m honestly appalled that such an ignorant article has been written by a former EU MEP. This article completely ignores the fact that the creation of Copilot’s model itself is a copyright infringement. You give Github a license to store and distribute your code from public repositories. You do not give Github permission to use it or create derivative works. And as Copilot’s model is created from various public code, it is a derivative of that code. Some may try to argue that training machine learning models is ‘fair use’, yet I doubt that you can claim that something which can regurgitate the entire meaningful portion of a file (example taken from Github’s own public dataset of exact generated code collisions) is not a derivative work.

                                      1. 13

                                        In many jurisdictions, as noted in the article, the “right to read is the right to mine” - that is the point. There is already an automatic exemption from copyright law for the purposes of computational analysis, and GitHub don’t need to get that permission from you, as long as they have the legal right to read the code (i.e. they didn’t obtain it illegally).

                                        This appears to be the case in the EU and Britain - https://www.gov.uk/guidance/exceptions-to-copyright - I’m not sure about the US.

                                        Something is not a derivative work in copyright law simply due to having a work as an “input” - you cannot simply argue “it is derived from” therefore “it is a derivative work”, because copyright law, not English language, defines what a “derivative work” is.

                                        For example, Markov chain analysis done on SICP is not infringing.

                                        Obviously, there are limits to this argument. If Copilot regurgitates a significant portion verbatim, e.g. 200 LOC, is that a derivative? If it is 1,000 lines where not one line matches, but it is essentially the same with just variables renamed, is that a derivative work? etc. I think the problem is that existing law doesn’t properly anticipate the kind of machine learning we are talking about here.
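
For readers unfamiliar with the Markov chain analysis mentioned above, here is a minimal word-level sketch in Python. It is an illustration only, not a description of how Copilot works, and the names `build_chain` and `generate` are hypothetical. The model stores nothing but which word followed which; whether its output resembles the input depends heavily on how much text it was built from.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, length, rng=random):
    """Walk the chain from `start`, picking a recorded successor each step."""
    out = [start]
    while len(out) < length and out[-1] in chain:
        out.append(rng.choice(chain[out[-1]]))
    return " ".join(out)
```

With a large, varied corpus each word has many recorded successors and the output is a shuffle of common word pairs. With a tiny corpus in which every word has exactly one successor, `generate` can only reproduce the input verbatim, which is the small-scale version of the regurgitation concern raised in this thread.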

                                        1. 3

                                          Dunno how it is in other countries, but in Lithuania, I can not find any exceptions to use my works without me agreeing to it that fit what Github has done. The closest one could be citation, but they do not comply with the requirement of mentioning my name and work from which the citation is taken.

                                          I gave them the license to reproduce, not to use or modify - these are two entirely different things. If they weren’t, then Github has the ability to use all AGPL’d code hosted on it without any problems, and that’s obviously wrong.

                                          There is no separate “mining” clause. That is not a term in copyright. Notice how research is quite explicitly “non-commercial” - and I very much doubt that what Github is doing with Copilot is non-commercial in nature.

                                          The fact that similar works were done previously doesn’t mean that they were legal. They might have been ignored by the copyright owners, but this one quite obviously isn’t.

                                          1. 8

                                            There is no separate “mining” clause. That is not a term in copyright. Notice how research is quite explicitly “non-commercial” - and I very much doubt that what Github is doing with Copilot is non-commercial in nature.

                                            Ms. Reda is referring to a copyright reform adopted at the EU level in 2019. This reform produced the DSM directive 2019/790, which is more commonly known for the regulations regarding upload filters. This directive contains a text and data mining copyright limitation in Art. 3 ff. The reason why you don’t see this limitation in Lithuanian law (yet) is probably that Lithuania has not yet transposed the DSM directive into its national law. This should probably follow soon, since Art. 29 mandates transposition into national law by June 7th, 2021. Germany has not yet completed the transposition either.

                                            That is, “text and data mining” now is a term in copyright. It is even legally defined on the EU level in Art. 2 Nr. 2 DSM directive.

                                            That being said, the text and data mining exception in Art. 3 ff. DSM directive does not – at first glance, I have only taken a cursory look – allow commercial use of the technique, but only permits research.

                                            1. 1

                                              Oh, huh, here it’s called an education and research exception and has been in law for way longer than that directive, and it doesn’t mention anything remotely translatable as mining. It didn’t even cross my mind that she could have been referring to that. I see that she pushed for that exception to be available for everyone, not only research and cultural heritage, but it is careless of her to mix up what she wants the law to be, and what the law is.

                                              Just as a preventative answer, no, Art. 4 of the DSM directive does not allow Github to do what it does either, as it applies to work that “has not been expressly reserved by their rightholders in an appropriate manner, such as machine-readable means in the case of content made publicly available online.”, and Github was free to get the content in an appropriate manner for machine learning. It is using the content for machine learning that infringes the code owners’ copyright.

                                            2. 5

                                              I gave them the license to reproduce, not to use or modify - these are two entirely different things. If they weren’t, then Github has the ability to use all AGPL’d code hosted on it without any problems, and that’s obviously wrong.

                                              Another important point is that the copyright owner is often a different person from the one who signed a contract with GitHub and uploaded the code there (git commit vs. git push). The uploader might agree to whatever terms and conditions, but the copyright owner’s rights must not be disrupted in any way.

                                              1. 3

                                                Nobody is required to accept terms of a software license. If they don’t agree to the license terms, then they don’t get additional rights granted in the license, but it doesn’t take away rights granted by the copyright law by default.

                                                Even if you licensed your code under “I forbid you from even looking at this!!!”, I can still look at it, and copy portions of it, parody it, create transformative works, use it for educational purposes, etc., as permitted by copyright law exceptions (details vary from country to country, but the gist is the same).

                                            3. 10

                                              Ms. Reda is a member of the Pirate Party, which is primarily focused on the intersection of tech and copyright. She has a lot of experience working on copyright-related legislation, including proposals specifically about text mining. She’s been a voice of reason when the link tax and upload filters were proposed. She’s probably the copyright expert in the EU parliament.

                                              So be careful when you call her ignorant and mistaken about basics of copyright. She may have drafted the laws you’re trying to explain to her.

                                              1. 16

                                                It is precisely because of her credentials that I am so appalled. I cannot in good conscience find this statement anything but ignorant.

                                                The directive about text mining very explicitly specifies that it applies only to “research institutions” and “for the purposes of scientific research”. Github and its Copilot don’t fall into that classification at all.

                                                1. 3

                                                  Indeed.

                                                  Even though my opinion of Copilot is near-instant revulsion, the basic idea is that information and code is being used to train a machine learning system.

                                                  This is analogous to a human reviewing and reading code, and learning how to do so from lots of examples. And someone going through higher ed school isn’t “owned” by the copyright owners of the books and code they read and review.

                                                  If Copilot is violating, so are humans who read. And that… that’s a very disturbing and disgusting precedent that I hope we don’t set.

                                                  1. 6

                                                    Copilot doesn’t infringe, but GitHub does, when they distribute Copilot’s output. Analogously to humans, humans who read do not infringe, but they do when they distribute.

                                                    1. 1

                                                      Why is it not the human that distributes Copilot’s output?

                                                      1. 1

                                                        Because Copilot first had to deliver the code to the human. Across the Internet.

                                                    2. 4

                                                      I don’t think that’s right. A human who learns doesn’t just parrot out pre-memorized code, and if they do they’re infringing on the copyright in that code.

                                                      1. 2

                                                        The real question, which I think people are missing, is whether learning itself is a derivative work.

                                                        How that learning happens can either be with a human, or with a machine learning algorithm. And with the squishiness and lack of insight with human brains, a human can claim they insightfully invented it, even if it was derived. The ML we’re seeing here is doing a rudimentary version of what a human would do.

                                                        If Copilot is ‘violating’, then humans can also be ‘violating’. And I believe that is a dangerous path, laying IP based claims on humans because they read something.

                                                        And as I said upthread, as much as I have a kneejerk that Copilot is bad, I don’t see how it could be infringing without also doing the same to humans.

                                                        And as an underlying idea: copyright itself is a busted concept. It worked in the time before mechanical and electrical duplication took hold at near-zero cost. Now? Not so much.

                                                        1. 3

                                                          I don’t agree with you that humans and Copilot learn in somewhat the same way.

                                                          The human may learn by rote memorization, but more likely, they are learning patterns and the why behind those patterns. Copilot also learns patterns, but there is no why in its “brain.” It is completely rote memorization of patterns.

                                                          The fact that humans learn the why is what makes us different and not infringing, while Copilot infringes.

                                                          1. 2

                                                            Computers learn syntax, humans learn syntax and semantics.

                                                            1. 1

                                                              Perfect way of putting it. Thank you.

                                                          2. 3

                                                            No, I don’t think that’s the real question. Copying is treated as an objective question (and I’m willing to be corrected by experts in copyright law), i.e. similarity or its lack determines copying regardless of intent to copy, unless the creation was independent.

                                                            But even if we address ourselves to that question, I don’t think machine learning is qualitatively similar to human learning. Shoving a bunch of data together into a numerical model to perform sequence prediction doesn’t equate to human invention, it’s a stochastic copying tool.

                                                        2. 3

                                                          It seems like it could be used to shirk the effort required for a clean room implementation. What if I trained the model on one and only one piece of code I didn’t like the license of, and then used the model to regurgitate it, can I then just stick my own license on it and claim it’s not derivative?

                                                        3. 2

                                                          Ms. Reda is a member of the Pirate Party

                                                          She left the Pirate Party years ago, after having installed a potential MEP “successor” who was unknown to almost everyone in the party; she subsequently published a video urging people not to vote for the Pirates because of him, as he was allegedly a sex offender (which was proven untrue months later).

                                                          1. 0

                                                            Why exactly do you think someone from the ‘pirate party’ would respect any sort of copyright? That sounds like they might be pretty biased against copyright…

                                                            1. 3

                                                              Despite a cheeky name, it’s a serious party. Check out their programme. Even if the party is biased against copyright monopolies, DRM, frivolous patents, etc. they still need expertise in how things work currently in order to effectively oppose them.

                                                          2. 3

                                                            Have you read the article?

                                                            She addresses these concerns directly. You might not agree but you claim she “ignores” this.

                                                            1. 1

                                                              And as Copilot’s model is created from various public code, it is a derivative of that code.

                                                              Depends on the legal system. I don’t know what happens if I am based in Europe but the guys doing this are in the USA. It probably just means that they can do whatever they want. The article makes a ton of claims about various legal aspects of all of this, but as far as I know Julia is not actually a lawyer, so I think we can ignore this article.

                                                              In Poland maybe this could be considered a “derivative work” but then work which was “inspired” by the original is not covered (so maybe the output of the network is inspired?) and then you have a separate section about databases so maybe this is a database in some weird way of understanding it? If you are not a lawyer I doubt you can properly analyse this. The article tries to analyse the legal aspect and a moral aspect at the same time while those are completely different things.

                                                            1. 16

                                                              The short code snippets that Copilot reproduces from training data are unlikely to reach the threshold of originality.

                                                              The thing is that the core logic of a tricky problem can very well be very little code. Take quicksort, for instance. It is a super-clever algorithm, yet not much code. Luckily quicksort is not under some license that hinders its use in any setting, but it could very well be. Just because it is only 10 lines does not mean it is not an invention that is copy-rightable. Code is very different from written language in that regard.
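To make the “not much code” point concrete, here is a minimal quicksort sketch in Python. It is an illustration written for this thread, not any particular licensed implementation; the function name `quicksort` and the list-building style are my own choices.

```python
def quicksort(items):
    """Return a new sorted list using a simple (not in-place) quicksort."""
    # Lists of length 0 or 1 are already sorted.
    if len(items) <= 1:
        return list(items)
    # Pick the middle element as the pivot (one of several common choices).
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Sort the two partitions recursively and stitch the result together.
    return quicksort(smaller) + equal + quicksort(larger)
```

That is the whole idea in about ten lines, which is why the question of how much originality such a snippet carries is not obvious.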

                                                              1. 16

                                                                Yeah the title is ignoring this important bit. The claim is precisely that Github Copilot can suggest code that DOES exceed the threshold of originality.

                                                                Based on machine learning systems I’ve used, it seems very plausible. There is no guarantee that individual pieces of the training data don’t get output verbatim.

                                                                And in fact I say that’s likely, not just plausible. Several years ago, I worked on a paper called Deep Learning with Differential Privacy that addresses the leakage from the training data -> model -> inferences (which seems to have nearly 2000 citations now). If such things were impossible then there’s no reason to do such research.

                                                                1. 2

                                                                  That was still a concern as of about two years ago, at least, because it was one of the topics I was contemplating for my bachelor’s thesis. The hypothesis I was presented with was that there (allegedly) is a smooth transition: the better your model, the more training data it leaks, and vice versa. Unfortunately, I chose a different topic, so I can’t go into detail here.

                                                                  1. 1

                                                                    There is no guarantee that individual pieces of the training data don’t get output verbatim.

                                                                    The funny thing is that humans do that too. They read something and unknowingly reproduce the same thing in writing as their own, without bad intent. I think there is a name for that effect, but I fail to find it atm.

                                                                    1. 5

                                                                      I’d say it depends on the scale. Sure in theory it’s possible for a human to type out 1000 lines of code that are identical to what they saw elsewhere, without realizing it, but vanishingly unlikely. They might do that for 10 lines, but not 1000 lines, and honestly not even 100.

                                                                      On the other hand it’s pretty easy to imagine a machine learning system doing that. Computers are fundamentally different than humans in that they can remember stuff exactly … It’s actually harder to make them remember stuff approximately, which is what humans do :)

                                                                  2. 8

                                                                    Quicksort is an algorithm, and isn’t covered under copyright in the first place. A specific implementation might be, but there’s a very limited number of ways you can implement quicksort and usually there’s just one “obvious” way, and that’s usually not covered under copyright either – and neither should it IMO. It would be severely debilitating as it would mean that the first person to come up with a piece of code to make a HTTP request or other trivial stuff would hold copyright over that. Open source code as we know it today would be nigh impossible.

                                                                    1. 6

                                                                      So that means the fast inverse square root example from the Quake engine source code can be copied by anyone w/o adhering to the GPL? It is just an algorithm after all. If that is truly the case, then the GPL is completely useless, since I can start copy & pasting GPL code w/o any repercussions, since it is “just an algorithm”.

                                                                      1. 9

                                                                        If your friend looks at the Quake example and describes it to you without actually telling you the code – by describing the algorithm – and you write it in a different language, you are definitely safe.

                                                                        If your friend copies the Quake engine code into a chat message and sends it to you, and you copy it into your codebase and change the variable names to match what you were doing, you are very probably in the wrong.

                                                                        Somewhere in between those two it gets fuzzy.

                                                                        1. 11

                                                                          It looks like my friend copilot is willing to paste the quake version of it directly into my editor verbatim, comments included, without telling me its provenance. If a human did that, it would be problematic.

                                                                        2. 4

                                                                          In the Quake example it copied the (fairly minor) comments too, which is perhaps a bit iffy but a minor detail. There is just one way to express this algorithm: if anyone were to implement this by just having the algorithm described to them but without actually having seen the code then the code would be pretty much identical.

                                                                          I’m not sure you’re appreciating the implications if it worked any differently. Patents on this sort of thing are already plenty controversial. Copyright would mean writing any software would have a huge potential for copyright suits and trolls; there’s loads of “only one obvious implementation” code like this, both low-level and high-level. What you’re arguing for would be much, much worse than software patents.

                                                                          People seem awfully focused on this Copilot thing at the moment. The Open Source/Free Software movement has spent decades fighting against expansion of copyright and other IP on this kind of thing. The main beneficiaries of such an expansion wouldn’t be authors of free software but Microsoft, copyright trolls, and other corporations.

                                                                          1. 2

                                                                            Semi-related, Carmack sorta regrets using the GPL license

                                                                            https://twitter.com/ID_AA_Carmack/status/1412271091393994754

                                                                            1. 5

                                                                              Semi-related, Carmack sorta regrets using the GPL license

                                                                              …in this specific instance.

                                                                              Quote from Carmack in that thread: “I’m still supportive of lots of GPL work, but I don’t think the restrictions helped in this particular case.”

                                                                              I’m not implying you meant to imply otherwise, I’m just adding some context so it’s clearer.

                                                                              1. 1

                                                                                That is fine, I am not the biggest GPL fan myself, yet if I use GPL code/software, I try to adhere to its license

                                                                          2. 5

                                                                            quicksort is not under some license that hinders its use in any setting, but it could very well be

                                                                            Please do not confuse patent law and copyright law. By referring to an algorithm you seem to be alluding to patent law, and yet you mixed the terms “invention” and “copy-rightable”. Please explain further what you mean, because as far as I know this is nonsense. As far as I know, “programs for computers” can’t be regarded as an invention and therefore can’t be patented under the European legal system; this is definitely the case in Poland. This is a separate concept from source code copyright.

                                                                            Maybe your comment is somehow relevant in the USA but I am suspicious.

                                                                          1. 5

                                                                            Ignoring all the other problems, the fact that the verification options are deliberately hidden in the UI is extremely harmful, and I’ve been ranting about it for years. Instead, they should make this a prominent feature and also make it easier to verify someone through a trusted third party that you have both verified. On top of that, the verification number is not a real hash: as far as I can tell, it is imperative to check the entire number and not just the beginning or the end of it. That is obviously a huge problem; personally, I only checked the first couple of groups of digits until I learned that they can stay the same. I think this is a serious problem as well.

                                                                            1. 5

                                                                              The reason some stay the same is because the safety numbers are just a concatenation of your hash and the other party’s, so the overall number is the same on both devices. That simplifies the UX because users have to do one comparison that works both ways instead of two that work one way. Clearly though the fact that you made this mistake means there’s more work needed to improve the UX.

                                                                              I do think the QR code scanning helps a lot with that, if you can use it.

                                                                              1. 2

                                                                                Either way, you can make this a single comparison, for example by always appending the larger number to the smaller number before hashing them. This is not an argument.
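A rough sketch of the two schemes being compared here, in Python. Everything in it is an assumption for illustration: the `fingerprint` helper, the use of SHA-256, and the 30-digit length are hypothetical and are not Signal's actual derivation. The first function sorts and concatenates per-party fingerprints, so one half stays the same when only the other party changes; the second orders the fingerprints and hashes them together, so the whole number changes.

```python
import hashlib


def fingerprint(identity_key: bytes) -> str:
    """Hypothetical per-party fingerprint: 30 decimal digits from a hash."""
    digest = hashlib.sha256(identity_key).digest()
    return str(int.from_bytes(digest, "big") % 10**30).zfill(30)


def safety_number_concat(key_a: bytes, key_b: bytes) -> str:
    """Sort the two fingerprints, then concatenate them.

    Both devices get the same 60 digits, but each party's half is
    unchanged no matter who the other party is.
    """
    first, second = sorted([fingerprint(key_a), fingerprint(key_b)])
    return first + second


def safety_number_hashed(key_a: bytes, key_b: bytes) -> str:
    """Order the fingerprints, then hash the ordered pair.

    Still symmetric, but changing either key changes the whole number.
    """
    first, second = sorted([fingerprint(key_a), fingerprint(key_b)])
    digest = hashlib.sha256((first + second).encode("ascii")).digest()
    return str(int.from_bytes(digest, "big") % 10**60).zfill(60)
```

With the concatenating version, a user who only compares the first few digit groups may be looking at their own half, which never changes between contacts. The hashed version avoids that while keeping the single comparison.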

                                                                            1. 9

                                                                              Global rate limit of 100 RPS by IP

                                                                              The bane of anyone using a VPN/connecting from a university campus etc.
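For illustration, a per-IP limit of this kind might look roughly like the fixed-window sketch below (Python; the class name, the window scheme, and the injected `now` timestamp are assumptions, not the service's real implementation). Because the counter is keyed on the IP address alone, everyone behind the same VPN exit or campus NAT draws from a single budget.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Hypothetical fixed-window rate limiter keyed by client IP."""

    def __init__(self, limit_per_second: int = 100):
        self.limit = limit_per_second
        # Maps (ip, whole-second window) -> requests seen in that window.
        # A real service would also expire old windows; omitted here.
        self.counts = defaultdict(int)

    def allow(self, ip: str, now: float) -> bool:
        """Return True if this request fits in the current one-second window."""
        window = (ip, int(now))
        if self.counts[window] >= self.limit:
            return False
        self.counts[window] += 1
        return True
```

If 150 different people behind one shared address each send a single request within the same second, only the first 100 are allowed, even though no individual did anything unusual.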

                                                                              1. 7

                                                                                This article actually made me consciously aware that I habitually hover my mouse cursor over a link before clicking it, to see what the pop-up at the bottom of the screen says is the real URL. I did this before I clicked the link, like I have apparently trained myself to do with any link over many years of browser usage, and immediately noted that the link was to a different page on this person’s site rather than the wikipedia article. I thought for a moment that the “real” trick might be faking the text in that pop-up, but that wasn’t it.

                                                                                1. 4

                                                                                  I do this as well, but you can bypass this with JavaScript: a user will see a different link on hover than the one they navigate to after clicking it.

                                                                                  1. 2

                                                                                    When I’m feeling particularly distrusting (i.e. by default, when not on a site I’m pretty confident won’t be engaging in such sleaziness), I often opt for a right-click (or perhaps even a shift-right-click as I recently learned can be used to bypass right-click interception fuckery), copy the link, and paste it into a new tab as an extra line of defense.

                                                                                1. 1

                                                                                  bit of an odd example because go test has a repeat flag; you can do go test -run TestWhatever -count=1000

                                                                                  1. 7

                                                                                    I find it much easier to deal with a trace which only contains the bad run.

                                                                                    1. 1

                                                                                      And how is what is written in the article different than a 3 line bash script with a loop in it? Why do I need rr? What advantage does rr bring to the table? What is it used for in this article that a bash script can’t solve? Those and many other questions are left unanswered I am afraid.

                                                                                      1. 2

                                                                                        Because it is actually reproducible, or at least closer:

                                                                                        Remember, you’re debugging the recorded trace deterministically; not a live, nondeterministic execution. The replayed execution’s address spaces, register contents, syscall data etc are exactly the same in every run.

                                                                                        1. 1

                                                                                          The rr website explains it pretty well https://rr-project.org/

                                                                                          You record a failure once, then debug the recording, deterministically, as many times as you want. The same execution is replayed every time.

                                                                                    1. 3

                                                                                      error processing BAE7DF4-DDF-3RG-5TY3E3RF456AS10: nil

                                                                                      So you are blaming poor logging or error handling on the type of identifier used? Sorry, but this point is nonsensical enough that I stopped reading right at the beginning of the article.

                                                                                      1. 23

                                                                                        I guess it’s better to leave on your own terms than to get your domain blocked or get kicked out.

                                                                                        The wording on the banner is far from friendly advice - I’d call it antagonistic and confrontational, hostile even.

                                                                                        BTW, the code itself was added last year in this commit.

                                                                                        Ironically, lobste.rs was created by /u/jcs as a response to HN’s heavy-handed moderation.

                                                                                        1. 40

                                                                                          His engagement with lobste.rs was much more polarising than burntsushi. The latter didn’t jump into comment sections to deliberately kick off a flame war that may not have otherwise occurred; the former did so deliberately and unashamedly. I heartily respect both their views but I can understand why they might be moderated differently.

                                                                                          1. 13

                                                                                            Thank you for saying this in a far more polite way than I was about to.

                                                                                            1. 8

                                                                                              And why would that result in banning the domain? Drew wasn’t even the one posting his blog posts here and they were always upvoted.

                                                                                              1. 11

                                                                                                Because many of his posts were explicit flamebait; look at the last two posts on that domain for instance.

                                                                                                1. 2

                                                                                                  Then clearly this community is not what the admin intended it to be before banning this domain, because the stories from that domain were routinely getting above 30 points, which is rare for most stories. It is time to shut this whole website down and just change it to be a private RSS feed of the admin.

                                                                                                  1. 3

                                                                                                    It’s an attempt to avoid the Repugnant Conclusion; the mere addition of a steady attractor of upvotes can degrade the quality of life for everybody else.

                                                                                                  2. 1

                                                                                                    Did you mean to include the one about a finger server and io_uring as one of the two? I found it interesting and informative.

                                                                                                    1. 5

                                                                                                      I meant what was submitted to Lobsters, which were the final straws.

                                                                                                      1. 2

                                                                                                        Thanks for the clarification. Not sure why I didn’t read it that way.

                                                                                                2. 1

                                                                                                  This was just an example - there’s more in the moderation log if you care to look.

                                                                                                3. 15

                                                                                                  Wow, this ban message from your second link:

                                                                                                  Please go be loudly disappointed in the entire world (and promote sourcehut) somewhere else.

                                                                                                  I really hope that this happened at the end of a process of attempting to politely engage, rather than as the immediate response. That reads like something from a burned-out moderator who needs to take a break.

                                                                                                  1. 26

                                                                                                    This was a sustained pattern of behavior over months.

                                                                                                    1. 2

                                                                                                      That reads like something from a burned-out moderator who needs to take a break.

                                                                                                      Pro tip: moderators are always burnt-out.

                                                                                                    2. 7

                                                                                                      There are two issues here:

                                                                                                      • banning the user
                                                                                                      • banning the domain

                                                                                                      The reason for banning the user account was reported by the admin as apparently rude comments/encouraging arguments/arguing? The comments were usually upvoted though as far as I remember so I think the decision was mostly arbitrary.

                                                                                                      The domain was blocked just because the admin banned the author from lobsters, not because there was something wrong with the content on that website. Drew wasn’t even the one posting his blog posts here.

                                                                                                      Therefore at least one of those decisions is nonsensical.

                                                                                                      You can try to create a website with semi-transparent moderation policies but that will never fix the standard power abuse by moderators like in this situation. The personal grievances usually win and no moderation log will fix this. The community enjoyed the content and @pushcx didn’t => the comments and the domain get nuked off the website.

                                                                                                      I tried to get an answer at least to why the domain was banned but of course I never did (in the name of transparency).

                                                                                                      1. 3

                                                                                                        The reason for banning the user account was reported by the admin as apparently rude comments/encouraging arguments/arguing? The comments were usually upvoted though as far as I remember so I think the decision was mostly arbitrary.

                                                                                                        The domain was blocked just because the admin banned the author from lobsters, not because there was something wrong with the content on that website. Drew wasn’t even the one posting his blog posts here.

                                                                                                        I disagree with your opinion that his behavior on the site was not rude, though I didn’t look closely at all of his posts so I can’t say for certain. What I do agree with is your point about the domain ban. The ban itself seemed unclear and arbitrary. Moreover, as you mentioned, a domain ban affects much more than just a user; it affects all content on that domain.

                                                                                                        1. 1

                                                                                                          Negative comments are deleted when users are banned or leave; you won’t find any of his egregious comments here.

                                                                                                      2. 7

                                                                                                        oh wow, Drew got banned ..

                                                                                                        I don’t like anyone getting banned for anything. I have a lot of respect for how much DeVault puts into his open source contributions and am envious he can live off of it. That being said, he banned me on Mastodon forever ago because I reposted an open letter a professor made during the eight of the 2020 US riots. We had a discussion over DMs and he blocked me in the end.

                                                                                                        The more I learn about some of the stuff he’s said and done, I realize I can still respect his work while still agreeing with all the others who’ve come to the conclusion his actions are often inflammatory or childish. I’m not surprised he’s banned. He left the Fediverse a few months back too.

                                                                                                        1. 12

                                                                                                          Yup. I was actually pretty interested in Sourcehut, but in the end I didn’t really want to use a service run by someone that hot-headed.

                                                                                                          1. 1

                                                                                                            because I reposted an open letter a professor made during the height of the 2020 US riots. We had a discussion over DMs and he blocked me in the end.

                                                                                                            What was the nature of the letter?

                                                                                                          2. 5

                                                                                                            For my sins I’m tracking every submission to lobste.rs.

                                                                                                            Here’s a gist with an extract of submissions matching ‘drewdevault’ in the URL. I consider a comments/score ratio above 1.25 “controversial”.

                                                                                                            Hopefully this can give a sampling of how Devault’s content was received by the community here.

                                                                                                          1. 1

                                                                                                            Oh, I thought we were talking about Babel and I was surprised that they have full time maintainers. http://babel.pocoo.org/en/latest/

                                                                                                            1. 9

                                                                                                              The tone of the letter makes it sound like a typical PR apology written by people not because they think what they did was wrong but because they had to do it due to the public opinion. Reads like a typical apology letter from a large corporation.

                                                                                                              1. 5

                                                                                                                Here is some kind of a follow up from those people: https://www-users.cs.umn.edu/~kjlu/papers/clarifications-hc.pdf

                                                                                                                If you have further concerns, please email us at kjlu@umn.edu

                                                                                                                Well guess what, I did. And I encourage everyone to do the same.

                                                                                                                1. 3

                                                                                                                  That’s interesting, apparently they “apologized” half a year ago for what they’d done, and yet here we are in 2021, with them trying it again.

                                                                                                                  1. 1

                                                                                                                    Are you sure this is from half a year ago? The document isn’t dated, and the timestamp in the directory listing is 2021-04-21 (yesterday).

                                                                                                                    (aside: always date your documents, people, no matter what they are or where they’re published).

                                                                                                                    1. 3

                                                                                                                      It’s at the top of the document @boreq linked:

                                                                                                                      December 15, 2020

                                                                                                                      1. 2

                                                                                                                        Oh, it looks like the Firefox PDF viewer doesn’t render the metadata; but if I click “view page info” it does say “Modified: 15 December 2020, 22:59:54 GMT+8”.

                                                                                                                1. 7

                                                                                                                  I am curious, is this illegal in some way? They are effectively on purpose introducing bugs or security holes into a ton of computer systems including ones that are run by various government agencies and they admit openly to doing it.

                                                                                                                  1. 7

                                                                                                                    Probably not illegal, but there is no evidence of ethics approval. Chances are they can’t get ethics on it.

                                                                                                                    I’ve spoken to a couple of academics about this case and they can’t quite believe someone is trying to pull this in the name of research.

                                                                                                                    Also, looking at the funding sources they cite, they seem pretty out of bounds on that front:

                                                                                                                    https://nsf.gov/awardsearch/showAward?AWD_ID=1931208 https://nsf.gov/awardsearch/showAward?AWD_ID=1815621

                                                                                                                    1. 5

                                                                                                                      I think it’s borderline. Pen-testing is legal, and it’s generally done “on the sly” but with management’s approval.

                                                                                                                      1. 18

                                                                                                                        I don’t think this is pen-testing, their code reached the stable trees supposedly. Once that happens they actually introduced bugs and security issues and potentially compromised various systems. This is not pen-testing anymore.

                                                                                                                        https://lore.kernel.org/linux-nfs/CADVatmNgU7t-Co84tSS6VW=3NcPu=17qyVyEEtVMVR_g51Ma6Q@mail.gmail.com/

                                                                                                                        1. 1

                                                                                                                          Whether their code reached stable trees is irrelevant to whether or not it’s pen-testing - you can just as easily imagine a pen-tester accidentally leaving a back-door in a system after their contract has expired. Criminal negligence? Yes. Evidence of an unethical practice in the first place? Not in the slightest.

                                                                                                                          Similarly, the researchers said that, as soon as one of their patches was accepted, they would immediately notify the tree maintainer. If they did that, and the maintainer was paying attention, the patch would never make it to a stable tree.

                                                                                                                          Whether something is ethical or not is completely unrelated to its outcome.

                                                                                                                        2. 2

                                                                                                                          Pentesting comes with contracts and project plans signed by both the tester(s) and the company main stakeholder(s). So, no it’s not at all the same.

                                                                                                                        3. 4

                                                                                                                          Probably not, opensource is “no warranty” all the way down.

                                                                                                                          1. 1

                                                                                                                            Almost certainly… For instance the following seems appropriate.

                                                                                                                            18 U.S. Code § 2154 - Production of defective war material, war premises, or war utilities

                                                                                                                            Whoever, when the United States is at war, or in times of national emergency as declared by the President or by the Congress, […] with reason to believe that his act may injure, interfere with, or obstruct the United States or any associate nation in preparing for or carrying on the war or defense activities, willfully makes, constructs, or causes to be made or constructed in a defective manner, or attempts to make, construct, or cause to be made or constructed in a defective manner any war material, war premises or war utilities, or any tool, implement, machine, utensil, or receptacle used or employed in making, producing, manufacturing, or repairing any such war material, war premises or war utilities, shall be fined under this title or imprisoned not more than thirty years, or both

                                                                                                                            Probably also various crimes relating to fraud…

                                                                                                                            1. 8

                                                                                                                              when the United States is at war,

                                                                                                                              Except it’s not, so this is not appropriate at all.

                                                                                                                              There’s no contract, no relationship, no agreement at all between an opensource contributor and the project they contribute to. At most there is some sort of contributor agreement that is usually in there only for handling patents. When someone submits a patch they’re making absolutely no legal promises as to the quality of said patch, and this propagates all the way to whoever uses the software. The licenses don’t say THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND for nothing. Sure, the US army or whatever might use Linux, but they do it at their own peril.

                                                                                                                              Now, they might get in trouble for being sketchy about the ethical approval and stuff, but that will only get them in professional trouble at most, like losing their jobs.

                                                                                                                              1. 3

                                                                                                                                You missed the second half of the disjunction

                                                                                                                                or in times of national emergency as declared by the President or by the Congress,

                                                                                                                                This clause is true… many times over https://en.m.wikipedia.org/wiki/List_of_national_emergencies_in_the_United_States

                                                                                                                                Edit: The US army does not do it at their own peril against actively malicious activities. Civil contracts do not override statutory law, rather the other way around.

                                                                                                                                1. 2

                                                                                                                                  Hmm, yeah, I stand corrected (partially, at least).

                                                                                                                                  However, the law you’re quoting says war stuff or stuff used to make war stuff. I’m not even sure software would qualify as stuff, as described in there. But yeah, I’m less sure they are not screwed now. Also, from the names, they might not be US citizens, which could make things worse.

                                                                                                                                  That said, I’m somewhat skeptical anyone would pursue this kind of legal action.

                                                                                                                                  1. 7

                                                                                                                                    The definition of what’s protected here is really broad. Is the Linux kernel used as a tool to help operate the telecommunications infrastructure for the company making uniforms for the military? If so it’s protected.

                                                                                                                                    It’s almost like it was written for actual times of war, not this nonsense of a constant 30 national emergencies going on. Blame congress.

                                                                                                                                    I agree it’s unlikely to be prosecuted, unless there is significant damage attributable to the act of sabotage (someone deploys some ransomware to a hospital that exploits something they did, for instance), or someone in power decides that the act of sabotage’s main purpose was actually sabotage, not getting papers… If it is prosecuted I also think it’s likely that they’ll find some more minor fraud-related crime to actually charge… I just found this one by googling “sabotage, us law”.

                                                                                                                                    1. 3

                                                                                                                                      There’s what the law says (or can be construed to say) and what a court will actually accept. I think a lawyer would have a hard time convincing a jury that a silly research paper was war sabotage.

                                                                                                                                      1. 3

                                                                                                                                        I wish I had your faith in the system. I think a lot of this stuff depends on whether prosecutors choose to make an example of the person. I can’t see that happening here; I very much doubt that the US federal government sees its own power threatened by this irresponsible research. However, if you look at the history, there are examples that I find similarly absurd which did lead to convictions. The differentiating factor seems to not be any genuine legal distinction, but simply whether prosecutors want to go all-out.

                                                                                                                                        Furthermore, the ones the public knows about are the ones that happened in regular courts. Decisions by FISA courts or by military tribunals do not receive the same scrutiny, and thus we must assume the injustice is even greater in those venues.

                                                                                                                                        1. 1

                                                                                                                                          I don’t deny that unjust laws are often enforced, despite jury trial, I just think that in this case it would be pretty unlikely for that to happen.

                                                                                                                                          I think the state/ruling class is more likely to abuse its power when it is threatened, embarrassed (journalists, whistleblowers (Wikileaks), minor hackers) or when there is the opportunity to harm an out-group or political opponent (e.g. non-dominant ethnic groups, leftist movements, sometimes extreme right-wing groups); and I don’t think any of those really apply here.

                                                                                                                                          1. 2

                                                                                                                                            I apologize for the belated reply. I do agree with all of that.

                                                                                                                                        2. 1

                                                                                                                                          Feels like a case ripe for independent reinvention of jury nullification.

                                                                                                                            1. 50

                                                                                                                              The paper has this to say (page 9):

                                                                                                                              Regarding potential human research concerns. This experiment studies issues with the patching process instead of individual behaviors, and we do not collect any personal information. We send the emails to the Linux community and seek their feedback. The experiment is not to blame any maintainers but to reveal issues in the process. The IRB of University of Minnesota reviewed the procedures of the experiment and determined that this is not human research. We obtained a formal IRB-exempt letter.

                                                                                                                              [..]

                                                                                                                              Honoring maintainer efforts. The OSS communities are understaffed, and maintainers are mainly volunteers. We respect OSS volunteers and honor their efforts. Unfortunately, this experiment will take certain time of maintainers in reviewing the patches. To minimize the efforts, (1) we make the minor patches as simple as possible (all of the three patches are less than 5 lines of code changes); (2) we find three real minor issues (i.e., missing an error message, a memory leak, and a refcount bug), and our patches will ultimately contribute to fixing them.

                                                                                                                              I’m not familiar with the generally accepted standards on these kinds of things, but this sounds rather iffy to me. I’m very far removed from academia, but I’ve participated in a few studies over the years, which were always just questionnaires or interviews, and even for those I had to sign a consent waiver. “It’s not human research because we don’t collect personal information” seems a bit strange.

                                                                                                                              Especially since the wording “we will have to report this, AGAIN, to your university” implies that this isn’t the first time this has happened, and that the kernel folks have explicitly objected to being subject to this research before this patch.

                                                                                                                              And trying to pass off these patches as being done in good faith with words like “slander” is an even worse look.

                                                                                                                              1. 78

                                                                                                                                They are experimenting on humans, involving these people in their research without notice or consent. As someone who is familiar with the generally accepted standards on these kinds of things, it’s pretty clear-cut abuse.

                                                                                                                                1. 18

                                                                                                                                  I would agree. Consent is absolutely essential but just one of many ethical concerns when doing research. I’ve seen simple usability studies be rejected due to lesser issues.

                                                                                                                                  It’s pretty clear this is abuse. The kernel team and maintainers feel strongly enough to ban the whole institution.

                                                                                                                                  1. 10

                                                                                                                                    Yeah, agreed. My guess is they misrepresented the research to the IRB.

                                                                                                                                    1. 3

                                                                                                                                      They are experimenting on humans

                                                                                                                                      This project claims to be targeted at the open-source review process, and seems to be as close to human experimentation as pentesting (which, when you do social engineering, also involves interacting with humans, often without their notice or consent) - which I’ve never heard anyone claim is “human experimentation”.

                                                                                                                                      1. 19

                                                                                                                                        A normal penetration testing gig is not academic research though. You need to separate between the two, and also hold one of them to a higher standard.

                                                                                                                                        1. 0

                                                                                                                                          A normal penetration testing gig is not academic research though. You need to separate between the two, and also hold one of them to a higher standard.

                                                                                                                                          This statement is so vague as to be almost meaningless. In what relevant ways is a professional penetration testing contract (or, more relevantly, the associated process) different from this particular research project? Which of the two should be held to a higher standard? Why? What does “held to a higher standard” even mean?

                                                                                                                                          Moreover, that claim doesn’t actually have anything to do with the comment I was replying to, which was claiming that this project was “experimenting on humans”. It doesn’t matter whether or not something is “research” or “industry” for the purposes of whether or not it’s “human experimentation” - either it is, or it isn’t.

                                                                                                                                          1. 18

                                                                                                                                            Resident pentester and ex-academia sysadmin checking in. I totally agree with @Foxboron and their statement is not vague nor meaningless. Generally in a penetration test I am following basic NIST 800-115 guidance for scoping and target selection and then supplement contractual expectations for my clients. I can absolutely tell you that the methodologies that are used by academia should be held to a higher standard in pretty much every regard I could possibly come up with. A penetration test does not create a custom methodology attempting to deal with outputting scientific and repeatable data.

                                                                                                                                            Let’s put it in real terms, I am hired to do a security assessment in a very fixed highly focused set of targets explicitly defined in contract by my client in an extremely fixed time line (often very short… like 2 weeks maximum and 5 day average). Guess what happens if social engineering is not in my contract? I don’t do it.

                                                                                                                                            1. 1

                                                                                                                                              Resident pentester and ex-academia sysadmin checking in.

                                                                                                                                              Note: this is worded like an appeal to authority, although you probably don’t mean it that way, so I’m not going to act like you are.

                                                                                                                                              I totally agree with @Foxboron and their statement is not vague nor meaningless.

                                                                                                                                              Those are two completely separate things, and neither is implied by the other.

                                                                                                                                              their statement is not vague nor meaningless.

                                                                                                                                              Not true - their statement contained none of the information you just provided, nor any other sort of concrete or actionable information - the statement “hold to a higher standard” is both vague and meaningless by itself…and it was by itself in that comment (or, obviously, there were other words - none of them relevant) - there was no other information.

                                                                                                                                              the methodologies that are used by academia should be held to a higher standard

                                                                                                                                              Now you’re mixing definitions of “higher standard” - GP and I were talking about human experimentation and ethics, while you seem to be discussing rigorousness and reproducibility of experiments (although it’s not clear, because “A penetration test does not create a custom methodology attempting to deal with outputting scientific and repeatable data” is slightly ambiguous).

                                                                                                                                              None of the above is relevant to the question of “was this a human experiment” and the closely-related one “is penetration testing a human experiment”. Evidence suggests “no” given that the term does not appear in that document, nor have I heard of any pentest being reviewed by an ethics review board, nor have I heard any mention of “human experimenting” in the security community (including when gray-hat and black-hat hackers and associated social engineering e.g. Kevin Mitnick are mentioned), nor are other similar, closer-to-human experimentation (e.g. A/B testing, which is far closer to actually experimenting on people) processes considered to be such - up until this specific case.

                                                                                                                                            2. 5

                                                                                                                                              if you’re an employee in an industry, you’re either informed of penetration testing activity, or you’ve at the very least tacitly agreed to it along with many other things that exist in employee handbooks as a condition of your employment.

                                                                                                                                              if a company did this to their employees without any warning, they’d be shitty too, but the possibility that this kind of underhanded behavior in research could taint the results and render the whole exercise unscientific is nonzero.

                                                                                                                                              either way, the goals are different. research seeks to further the verifiability and credibility of information. industry seeks to maximize profit. their priorities are fundamentally different.

                                                                                                                                              1. 1

                                                                                                                                                you’ve at the very least tacitly agreed to it along with many other things that exist in employee handbooks as a condition of your employment

                                                                                                                                                By this logic, you’ve also agreed to everything else in a massive, hundred-page long EULA that you click “I agree” on, as well as consent to be tracked by continuing to use a site that says that in a banner at the bottom, as well as consent to Google/companies using your data for whatever they want and/or selling it to whoever will buy.

                                                                                                                                                …and that’s ignoring whether or not companies that have pentesting done on them actually explicitly include that specific warning in your contract - “implicit” is not good enough, as then anyone can claim that, as a Linux kernel patch reviewer, you’re “implicitly agreeing that you may be exposed to the risk of social engineering for the purpose of getting bad code into the kernel”.

                                                                                                                                                the possibility that this kind of underhanded behavior in research could taint the results and render the whole exercise unscientific

                                                                                                                                                Like others, you’re mixing up the issue of whether the experiment was properly-designed with the issue of whether it was human experimentation. I’m not making any attempt to argue the former (because I know very little about how to do good science aside from “double-blind experiments yes, p-hacking no”), so I don’t know why you’re arguing against it in a reply to me.

                                                                                                                                                either way, the goals are different. research seeks to further the verifiability and credibility of information. industry seeks to maximize profit. their priorities are fundamentally different.

                                                                                                                                                I completely agree that the goals are different - but again, that’s irrelevant for determining whether or not something is “human experimentation”. Doesn’t matter what the motive is, experimenting on humans is experimenting on humans.

                                                                                                                                          2. 18

                                                                                                                                            This project claims to be targeted at the open-source review process, and seems to be as close to human experimentation as pentesting (which, when you do social engineering, also involves interacting with humans, often without their notice or consent) - which I’ve never heard anyone claim is “human experimentation”.

                                                                                                                                            I had a former colleague who once bragged about getting someone fired at his previous job during a pentesting exercise. He basically walked over to this frustrated employee at a bar and offered him a ton of money and a job in return for plugging a USB key into the network. He then reported it to senior management and the employee was fired. While that is an effective demonstration of a vulnerability in their organization, what he did was unethical under many moral frameworks.

                                                                                                                                            1. 2

                                                                                                                                              First, the researchers didn’t engage in any behavior remotely like this.

                                                                                                                                              Second, while indeed an example of pentesting, most pentesting is not like this.

                                                                                                                                              Third, the fact that it was “unethical under many moral frameworks” is irrelevant to what I’m arguing, which is that the study was not “human experimentation”. You can steal money from someone, which is also “unethical under many moral frameworks”, and yet still not be doing “human experimentation”.

                                                                                                                                            2. 3

                                                                                                                                              If there is a pentest contract, then there is consent, because consent is one of the pillars of contract law.

                                                                                                                                              1. 1

                                                                                                                                                That’s not an argument that pentesting is human experimentation in the first place.

                                                                                                                                          3. 42

                                                                                                                                            The statement from the UMinn IRB is in line with what I heard from the IRB at the University of Chicago after they experimented on me, who said:

                                                                                                                                            I asked about their use of any interactions, or use of information about any individuals, and they indicated that they have not and do not use any of the data from such reporting exchanges other than tallying (just reports in aggregate of total right vs. number wrong for any answers received through the public reporting–they said that much of the time there is no response as it is a public reporting system with no expectation of response) as they are not interested in studying responses, they just want to see if their tool works and then also provide feedback that they hope is helpful to developers. We also discussed that they have some future studies planned to specifically study individuals themselves, rather than the factual workings of a tool, that have or will have formal review.

                                                                                                                                            Because they claim they’re studying the tool, it’s OK to secretly experiment on random strangers without disclosure. Somehow I doubt they test new drugs by secretly dosing people and observing their reactions, but UChicago’s IRB was 100% OK with doing so to programmers. I don’t think these IRBs literally consider programmers sub-human, but it would be very inconvenient to accept that experimenting on strangers is inappropriate, so they only want to do so in places they’ve been forced to by historical abuse. I’d guess this will continue for years until some random person is very seriously harmed by being experimented on (loss of job/schooling, pushing someone unstable into self-harm, targeting someone famous outside of programming) and then over the next decade IRBs will start taking it seriously.

                                                                                                                                            One other approach that occurs to me is that the experimenters and IRBs claim they’re not experimenting on their subjects. That’s obviously bullshit because the point of the experiment is to see how the people respond to the treatment, but if we accept the lie it leaves an open question: what is the role played by the unwitting subject? Our responses are tallied, quoted, and otherwise incorporated into the results in the papers. I’m not especially familiar with academic publishing norms, but perhaps this makes us unacknowledged co-authors. So maybe another route to stopping experimentation like this would be things like claiming copyright over the papers, asking journals for the papers to be retracted until we’re credited, or asking the universities to open academic misconduct investigations over the theft of our work. I really don’t have the spare attention for this, but if other subjects wanted to start the ball rolling I’d be happy to sign on.

                                                                                                                                            1. 23

                                                                                                                                              I can kind of see where they’re coming from. If I want to research if car mechanics can reliably detect some fault, then sending a prepared car to 50 garages is probably okay, or at least a lot less iffy. This kind of (informal) research is actually fairly commonly done by consumer advocacy groups and the like. The difference is that the car mechanics will get paid for their work whereas the Linux devs and you didn’t.

                                                                                                                                              I’m gonna guess the IRBs probably aren’t too familiar with the dynamics here, although the researchers definitely were and should have known better.

                                                                                                                                              1. 18

                                                                                                                                                Here it’s more like keying someone’s car to see how long it takes them to get an insurance claim.

                                                                                                                                                1. 4

                                                                                                                                                  Am I misreading? I thought the MR was a patch designed to fix a potential problem, and the issue was

                                                                                                                                                  1. pushcx thought it wasn’t a good fix (making it a waste of time)
                                                                                                                                                  2. they didn’t disclose that it was an auto-generated PR.

                                                                                                                                                  Those are legitimate complaints, cf. https://blog.regehr.org/archives/2037, but from the analogies employed (drugs, dehumanization, car-keying), I have to double-check that I haven’t missed an aspect of the interaction that makes it worse than it seemed to me.

                                                                                                                                                  1. 2

                                                                                                                                                    We were talking about Linux devs/maintainers too, I commented on that part.

                                                                                                                                                    1. 1

                                                                                                                                                      Gotcha. I missed that “here” was meant to refer to the Linux case, not the Lobsters case from the thread.

                                                                                                                                                2. 1

                                                                                                                                                  Though there they are paying the mechanic.

                                                                                                                                                3. 18

                                                                                                                                                  IRB is a regulatory board that is there to make sure that researchers follow the (Common Rule)[https://www.hhs.gov/ohrp/regulations-and-policy/regulations/common-rule/index.html].

                                                                                                                                                  In general, any work that receives federal funding needs to comply with the federal guidelines for human subject research. All work involving human subjects (usually defined as research activities that involve interaction with humans) needs to be reviewed and approved by the institution’s IRB. These approvals fall within a continuum, from a full IRB review (which involves the researcher going to a committee and explaining their work and usually includes continued annual reviews) to a declaration of the work being exempt from IRB supervision (usually this happens when the work meets one of the 7 exemptions listed in the federal guidelines). The whole process is a little bit more involved; see for example all the charts (https://www.hhs.gov/ohrp/regulations-and-policy/decision-charts/index.html) to figure this out.

                                                                                                                                                  These rules do not cover research that doesn’t involve humans, such as research on technology tools. I think that there is currently a grey area where a researcher can claim that they are studying a tool and not the people interacting with the tool. It’s a lame excuse that probably goes around the spirit of the regulations and is probably unethical from a research standpoint. The data aggregation method or the data anonymization is usually a requirement for an exempt status and not a non-human research status.

                                                                                                                                                  The response that you received from IRB is not surprising, as they probably shouldn’t have approved the study as non-human research but now they are just protecting the institution from further harm rather than protecting you as a human subject in the research (which, by the way, is not their goal at this point).

                                                                                                                                                  One thing that sticks out to me about your experience is that you weren’t asked to give consent to participate in the research. That usually requires a full IRB review as informed consent is a requirement for (most) human subject research. Exempt research still needs informed consent unless it’s secondary data analysis of existing data (which your specific example doesn’t seem to be).

                                                                                                                                                  One way to quickly fix it is to contact the grant officer that oversees the federal program that is funding the research. A nice email can go a long way to force change and hit the researchers/university where they care the most. It should state that you were coerced to participate in the research study by simply doing your work (i.e., reviewing a patch submitted to a project that you lead) without being given the opportunity to provide prospective consent and without receiving compensation for your participation, and that the research team/university is refusing to remove your data even after you contacted them because they claim that the research doesn’t involve human subjects.

                                                                                                                                                  1. 7

                                                                                                                                                    Thanks for explaining more of the context and norms, I appreciate the introduction. Do you know how to find the grant officer or funding program?

                                                                                                                                                    1. 7

                                                                                                                                                      It depends on how “stalky” you want to be.

                                                                                                                                                      If NSF was the funder, they have a public search here: https://nsf.gov/awardsearch/

                                                                                                                                                      Most PIs also add a line about grants received to their CVs. You should be able to match the grant title to the research project.

                                                                                                                                                      If they have published a paper from that work, it should probably include an award number.

                                                                                                                                                      Once you have the award number, you can search the funder website for it and you should find a page with the funding information that includes the program officer/manager contact information.

                                                                                                                                                      1. 3

                                                                                                                                                        If they published a paper about it they likely included the grant ID number in the acknowledgements.

                                                                                                                                                        1. 1

                                                                                                                                                          You might have more luck reaching out to the sponsored programs office at their university, as opposed to first trying to contact an NSF program officer.

                                                                                                                                                      2. 4

                                                                                                                                                        How about something like a Computer Science External Review Board? Open source projects could sign up, and include a disclaimer that their project and community ban all research that hasn’t been approved. The approval process could be as simple as a GitHub issue the researcher has to open, and anyone in the community could review it.

                                                                                                                                                        It wouldn’t stop the really bad actors, but any IRB would have to explain why they allowed an experiment on subjects that explicitly refused consent.

                                                                                                                                                        [Edit] I felt sufficiently motivated, so I made a quick repo for the project. Suggestions welcome.

                                                                                                                                                        1. 7

                                                                                                                                                          I’m in favor of building our own review boards. It seems like an important step in our profession taking its responsibility seriously.

                                                                                                                                                          The single most important thing I’d say is, be sure to get the scope of the review right. I’ve looked into this before and one of the more important limitations on IRBs is that they aren’t allowed to consider the societal consequences of the research succeeding. They’re only allowed to consider harm to experimental subjects. My best guess is that it’s like that because that’s where activists in the 20th-century peace movement ran out of steam, but it’s a wild guess.

                                                                                                                                                          1. 4

                                                                                                                                                            At least in security, there are a lot of different Hacker Codes of Ethics floating around, which pen testers are generally expected to adhere to… I don’t think any of them cover this specific scenario though.

                                                                                                                                                            1. 2

                                                                                                                                                              any so-called “hacker code of ethics” in use by any for-profit entity places protection of that entity first and foremost before any other ethical consideration (including human rights) and would likely not apply in a research scenario.

                                                                                                                                                        2. 23

                                                                                                                                                          They are bending the rules for non-human research. One of the exceptions for non-human research is research on organizations, which my IRB defines as “Information gathering about organizations, including information about operations, budgets, etc. from organizational spokespersons or data sources. Does not include identifiable private information about individual members, employees, or staff of the organization.” Within this exception, you can talk with people about how the organization merges patches but not how they personally do that (for example). All the questions need to be about the organization and not the individual as part of the organization.

                                                                                                                                                          On the other hand, research involving human subjects is defined as any research activity that involves an “individual who is or becomes a participant in research, either:

                                                                                                                                                          • As a recipient of a test article (drug, biologic, or device); or
                                                                                                                                                          • As a control.”

                                                                                                                                                          So, this is how I interpret what they did.

                                                                                                                                                          The researchers submitted an IRB application saying that they just downloaded the kernel maintainer mailing lists and analyzed the review process. This doesn’t meet the requirements for IRB supervision because it’s either (1) secondary data analysis using publicly available data or (2) research on organizational practices of the OSS community after all identifiable information is removed.

                                                                                                                                                          Once they started emailing the list with bogus patches (as the maintainers allege), the research involved human subjects as these people received a test article (in the form of an email) and the researchers interacted with them during the review process. The maintainers processing the patch did not do so to provide information about their organization’s processes and did so in their own personal capacity (in other words, the researchers didn’t ask them how the OSS community processes this patch but asked them to process a patch themselves). The participants should have given consent to participate in the research and the risks of participating in it should have been disclosed, especially given the fact that missing a security bug and agreeing to merge it could be detrimental to someone’s reputation and future employability (that is, this would qualify for more than minimal risk for participants, requiring a full IRB review of the research design and process) with minimal benefits to them personally or to the organization as a whole (as it seems from the maintainers’ reaction to a new patch submission).

                                                                                                                                                          One way to design this experiment ethically would have been to email the maintainers and invite them to participate in a “lab based” patch review process where the research team would present them with “good” and “bad” patches and ask them whether they would have accepted them or not. This would happen after they were informed about the study and exercised their right to informed consent. I really don’t see how emailing random stuff out and seeing how people interact with it (with their full name attached to it and in full view of their peers and employers) can qualify as research with less than minimal risks and that doesn’t involve human subjects.

                                                                                                                                                          The other thing that rubs me the wrong way is that they sought (and supposedly received) retroactive IRB approval for this work. That wouldn’t fly with my IRB, as my IRB person would definitely rip me a new one for seeking retroactive IRB approval for work that is already done, data that was already collected, and a paper that is already written and submitted to a conference.

                                                                                                                                                          1. 6

                                                                                                                                                            You make excellent points.

                                                                                                                                                            1. IRB review has to happen before the study is started. For NIH, the grant application has to have the IRB approval - even before a single experiment is even funded to be done, let alone actually done.
                                                                                                                                                            2. I can see the value of doing a test “in the field” so as to get the natural state of the system. In a lab setting where the participants know they are being tested, various things will happen to skew results. The volunteer reviewers might be systematically different from the actual population of reviewers, the volunteers may be much more alert during the experiment and so on.

                                                                                                                                                            The issue with this study is that there was no serious thought given to what the ethical ramifications of this are.

                                                                                                                                                            If the pen tested system has not asked to be pen tested then this is basically a criminal act. Otherwise all bank robbers could use the “I was just testing the security system” defense.

                                                                                                                                                            1. 8

                                                                                                                                                              The same requirement for prior IRB approval is necessary for NSF grants (which the authors seem to have received). By what they write in the paper and my interpretation of the circumstances, they self-certified as conducting non-human research at the time of submitting the grant and only asked their IRB for confirmation after they wrote the paper.

                                                                                                                                                              Totally agree with the importance of “field experiment” work and that, sometimes, it is not possible to get prospective consent to participate in the research activities. However, the guidelines are clear on what activities fall within research activities that are exempt from prior consent. The only one that I think is applicable to this case is exception 3(ii):

                                                                                                                                                              (ii) For the purpose of this provision, benign behavioral interventions are brief in duration, harmless, painless, not physically invasive, not likely to have a significant adverse lasting impact on the subjects, and the investigator has no reason to think the subjects will find the interventions offensive or embarrassing. Provided all such criteria are met, examples of such benign behavioral interventions would include having the subjects play an online game, having them solve puzzles under various noise conditions, or having them decide how to allocate a nominal amount of received cash between themselves and someone else.

                                                                                                                                                              These usually cover “simple” psychology experiments involving mini games or economics games involving money.

                                                                                                                                                              In the case of this kernel patching experiment, it is clear that this experiment doesn’t meet this requirement, as participants have found this intervention offensive or embarrassing, to the point that they are banning the researchers’ institution from pushing patches to the kernel. Also, I am not sure that reviewing a patch is a “benign game”, as this is most likely the reviewers’ job. Plus, the patch review could have an adverse lasting impact on the subjects if they get asked to stop reviewing patches because they didn’t catch the security risk (e.g., being deemed incompetent).

                                                                                                                                                              Moreover, there is this follow up stipulation:

                                                                                                                                                              (iii) If the research involves deceiving the subjects regarding the nature or purposes of the research, this exemption is not applicable unless the subject authorizes the deception through a prospective agreement to participate in research in circumstances in which the subject is informed that he or she will be unaware of or misled regarding the nature or purposes of the research.

                                                                                                                                                              As their patch submission process was deceptive in nature, as they outline in the paper, exemption 3(ii) cannot apply to this work unless they notify maintainers that they will be participating in a deceptive research study about kernel patching.

                                                                                                                                                              That leaves the authors to either pursue full IRB review for their work (as a full IRB review can approve a deceptive research project if it deems it appropriate and the risk/benefit balance is in favor of the participants) or to self-certify as non-human subjects research and fix any problems later. They decided to go with the latter.

                                                                                                                                                          2. 35

                                                                                                                                                            We believe that an effective and immediate action would be to update the code of conduct of OSS, such as adding a term like “by submitting the patch, I agree to not intend to introduce bugs.”

                                                                                                                                                            I copied this from that paper. This is not research, anyone who writes a sentence like this with a straight face is a complete moron and is just mocking about. I hope all of this will be reported to their university.

                                                                                                                                                            1. 18

                                                                                                                                                              It’s not human research because we don’t collect personal information

                                                                                                                                                              I yelled bullshit so loud at this sentence that it woke up the neighbors’ dog.

                                                                                                                                                              1. 2

                                                                                                                                                                Yeah, that came from the “clarifiactions” which is garbage top to bottom. They should have apologized, accepted the consequences and left it at that. Here’s another thing they came up with in that PDF:

                                                                                                                                                                Suggestions to improving the patching process In the paper, we provide our suggestions to improve the patching process.

                                                                                                                                                                • OSS projects would be suggested to update the code of conduct, something like “By submitting the patch, I agree to not intend to introduce bugs”

                                                                                                                                                                i.e. people should say they won’t do exactly what we did.

                                                                                                                                                                They acted in bad faith, skirted IRB through incompetence (let’s assume incompetence and not malice) and then act surprised.

                                                                                                                                                              2. 14

                                                                                                                                                                Apparently they didn’t ask the IRB about the ethics of the research until the paper was already written: https://www-users.cs.umn.edu/~kjlu/papers/clarifications-hc.pdf

                                                                                                                                                                Throughout the study, we honestly did not think this is human research, so we did not apply for an IRB approval in the beginning. We apologize for the raised concerns. This is an important lesson we learned—Do not trust ourselves on determining human research; always refer to IRB whenever a study might be involving any human subjects in any form. We would like to thank the people who suggested us to talk to IRB after seeing the paper abstract.

                                                                                                                                                                1. 14

                                                                                                                                                                  I don’t approve of researchers YOLOing IRB protocols, but I also want this research done. I’m sure many people here are cynical/realistic enough that the results of this study aren’t surprising. “Of course you can get malicious code in the kernel. What sweet summer child thought otherwise?” But the industry as a whole proceeds largely as if that’s not the case (or you could say that most actors have no ability to do anything about the problem). Heighten the contradictions!

                                                                                                                                                                  There are some scary things in that thread. It sounds as if some of the malicious patches reached stable, which suggests that the author mostly failed by not being conservative enough in what they sent. Or for instance:

                                                                                                                                                                  Right, my guess is that many maintainers failed in the trap when they saw respectful address @umn.edu together with commit message saying about “new static analyzer tool”.

                                                                                                                                                                  1. 17

                                                                                                                                                                    I agree, while this is totally unethical, it’s very important to know how good the review processes are. If one curious grad student at one university is trying it, you know every government intelligence department is trying it.

                                                                                                                                                                    1. 8

                                                                                                                                                                      I entirely agree that we need research on this topic. There’s better ways of doing it though. If there aren’t better ways of doing it, then it’s the researcher’s job to invent them.

                                                                                                                                                                    2. 7

                                                                                                                                                                      It sounds as if some of the malicious patches reached stable

                                                                                                                                                                      Some patches from this University reached stable, but it’s not clear to me that those patches also introduced (intentional) vulnerabilities; the paper explicitly mentions the steps that they’re taking to ensure those patches don’t reach stable (I omitted that part, but it’s just before the part I cited).

                                                                                                                                                                      All umn.edu patches are being reverted, but at this point it’s mostly a matter of “we don’t trust these patches and will need additional review” rather than “they introduced security vulnerabilities”. A number of patches already have replies from maintainers indicating they’re genuine and should not be reverted.

                                                                                                                                                                      1. 5

                                                                                                                                                                        Yes, whether actual security holes reached stable or not is not completely clear to me (or apparently to maintainers!). I got that impression from the thread, but it’s a little hard to say.

                                                                                                                                                                        Since the supposed mechanism for keeping them from reaching stable is conscious effort on the part of the researchers to mitigate them, I think the point may still stand.

                                                                                                                                                                        1. 1

                                                                                                                                                                          It’s also hard to figure out what the case is, since there is no clear answer as to what the commits were and where they are.

                                                                                                                                                                      2. 4

                                                                                                                                                                        The Linux review process is so slow that it’s really common for downstream folks to grab under-review patches and run with them. It’s therefore incredibly irresponsible to put patches that you know introduce security vulnerabilities into this forum. Saying ‘oh, well, we were going to tell people before they were deployed’ is not an excuse and I’d expect it to be a pretty clear-cut violation of the Computer Misuse Act here and equivalent local laws elsewhere. That’s ignoring the fact that they were running experiments on people without their consent.

                                                                                                                                                                        I’m pretty appalled the Oakland accepted the paper for publication. I’ve seen paper rejected from there before because they didn’t have appropriate ethics review oversite.

                                                                                                                                                                    1. 9

                                                                                                                                                                      All I gathered from this blog post was “OpenSSL has incomprehensible error codes or the entire cert ecosystem is too complicated”.

                                                                                                                                                                      1. 20

                                                                                                                                                                        Correction: “OpenSSL has incomprehensible error codes AND the entire cert ecosystem is too complicated”

                                                                                                                                                                        I’m currently trying to figure out why connections between older stunnel/openssl versions and newer versions of the same software aren’t working. My current hypothesis is that the certificates used are “invalid” according to the newer versions, and because of this they refuse to use them as client certificates - but they do this silently, so the other end just sees a connection with no client certificate. Yum yum.

                                                                                                                                                                        1. 3

                                                                                                                                                                          While the cert ecosystem is complicated, openssl’s bad errors are what make it incomprehensible, I think. I’ve spent a fair amount of time debugging TLS in different situations. OpenSSL and stunnel were sufficiently opaque and hard to debug that we ended up replacing them entirely with a version written in Go, which has a TLS stack that actually gives half-reasonable error messages.

                                                                                                                                                                          1. 3

                                                                                                                                                                            Absolute shot in the dark but how long are your keys? OpenSSL recently started erroring out when asked to use short keys, and that messed me up for a while. 2048 bit minimum for RSA, don’t know about any of the others off the top of my head. My code didn’t fail silently, but I was using Python and for all I know I only ever saw error messages because of that. Feel free to message me if you hit a dead end or just want to chat, I can’t promise I can help but happy to try.

                                                                                                                                                                            1. 3

                                                                                                                                                                              That’s one possible problem, thanks for the suggestion! One part of the system uses 1024 bit RSA keys, I think.

                                                                                                                                                                              Finding out about this kind of requirement seems to be on the level of “oh, I saw a comment on a Stack Overflow post about something remotely related”… Perhaps I just don’t know where to look.
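
                                                                                                                                                                              One place where a limit like this is written down is OpenSSL’s “security level” setting, whose documentation lists a minimum RSA key size per level (2048 bits at level 2). Whether that is what rejects the 1024 bit keys here is an assumption, not something confirmed in this thread. A minimal Python sketch of the rule, using only the standard library; the helper names rsa_key_allowed and legacy_context are made up for illustration:

```python
import ssl

# Minimum RSA/DSA/DH key size per OpenSSL security level, as listed in the
# SSL_CTX_set_security_level manual page. Level 0 permits everything.
MIN_RSA_BITS = {0: 0, 1: 1024, 2: 2048, 3: 3072, 4: 7680, 5: 15360}


def rsa_key_allowed(bits: int, level: int) -> bool:
    """Return True if an RSA key of this size passes the given security level."""
    return bits >= MIN_RSA_BITS[level]


def legacy_context(level: int = 1) -> ssl.SSLContext:
    """Build a client context with a lowered security level.

    This weakens security and is meant for diagnosis only: if a connection
    works with this context but not with the default one, key or digest
    strength is a likely culprit.
    """
    ctx = ssl.create_default_context()
    # "@SECLEVEL=n" in the cipher string sets the OpenSSL security level.
    ctx.set_ciphers(f"DEFAULT@SECLEVEL={level}")
    return ctx
```

                                                                                                                                                                              If a handshake succeeds with legacy_context() but fails with the default context, that would point toward key size rather than some other certificate problem.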

                                                                                                                                                                          2. 2

                                                                                                                                                                            I actually ran into this problem last week and my takeaway was that Google’s server expects a server name indicator (SNI) in https requests; don’t know how familiar you are with TLS, but SNI can be sent by the client during negotiation to indicate which certificate the server should use. Handy for servers that host multiple domains and need to know which certificate to present before they receive a Host header. Anyway, if Google doesn’t get SNI it apparently falls back to a self-signed certificate that has this message buried in it, since it doesn’t know to use the www.google.com certificate or whatever.

                                                                                                                                                                            Edit: None of that actually justifies this outcome. Google’s doing something weird and nonstandard to draw attention to what it perceives as a defect (and probably 99% of the time, they’re right), because there’s no official way to raise the error they want to raise. How much of that is on Google and how much of that is on the ecosystem is debatable, but it creates a headache when the solution to “The server uses a self-signed certificate!” is “Send SNI in your client”, and also there’s no good way to look this up.

                                                                                                                                                                            1. 5

                                                                                                                                                                              In the age of cloud/CDNs everywhere, it’s safest to treat SNI as a hard requirement. Take Cloudfront as an example:

                                                                                                                                                                              % openssl s_client -connect cf.feitsui.com:443
                                                                                                                                                                              CONNECTED(00000006)
                                                                                                                                                                              4559363692:error:1400410B:SSL routines:CONNECT_CR_SRVR_HELLO:wrong version number:/AppleInternal/BuildRoot/Library/Caches/com.apple.xbs/Sources/libressl/libressl-47.140.1/libressl-2.8/ssl/ssl_pkt.c:386:
                                                                                                                                                                              ---
                                                                                                                                                                              no peer certificate available
                                                                                                                                                                              ---
                                                                                                                                                                              No client certificate CA names sent
                                                                                                                                                                              ---
                                                                                                                                                                              SSL handshake has read 5 bytes and written 0 bytes
                                                                                                                                                                              ---
                                                                                                                                                                              
                                                                                                                                                                              % openssl s_client -connect cf.feitsui.com:443 -servername cf.feitsui.com
                                                                                                                                                                              CONNECTED(00000006)
                                                                                                                                                                              depth=4 C = US, O = "Starfield Technologies, Inc.", OU = Starfield Class 2 Certification Authority
                                                                                                                                                                              verify return:1
                                                                                                                                                                              depth=3 C = US, ST = Arizona, L = Scottsdale, O = "Starfield Technologies, Inc.", CN = Starfield Services Root Certificate Authority - G2
                                                                                                                                                                              verify return:1
                                                                                                                                                                              depth=2 C = US, O = Amazon, CN = Amazon Root CA 1
                                                                                                                                                                              verify return:1
                                                                                                                                                                              depth=1 C = US, O = Amazon, OU = Server CA 1B, CN = Amazon
                                                                                                                                                                              verify return:1
                                                                                                                                                                              depth=0 CN = *.cloudping.cloud
                                                                                                                                                                              verify return:1
                                                                                                                                                                              
                                                                                                                                                                              (...)
                                                                                                                                                                              

                                                                                                                                                                              It’s only really ancient clients that don’t support SNI - think IE on XP and Android 1, maybe? As a result you find SNI is often a requirement or CDNs give the option to pay extra for the dedicated IP you need for non-SNI connections. I know Cloudfront charges $600 a month for dedicated IPs/SSL certificates, and I know others (Fastly, Cloudflare, etc.) charge as well.

                                                                                                                                                                              And your server would have to be dangerously old (think “pre-dating TLS”) to not support it.
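
                                                                                                                                                                              The difference between the two s_client runs above can also be reproduced from code. A minimal Python sketch, assuming only the standard library ssl module; the helper names make_context and peer_cert_der are made up for illustration, and verification is switched off in the no-SNI case because there is no name to check against:

```python
import socket
import ssl
from typing import Optional


def make_context(verify: bool = True) -> ssl.SSLContext:
    """Return a client context; verify=False mirrors a bare `openssl s_client`."""
    ctx = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def peer_cert_der(host: str, port: int = 443, send_sni: bool = True) -> Optional[bytes]:
    """Connect and return the server's leaf certificate in DER form."""
    ctx = make_context(verify=send_sni)
    # Passing server_hostname is what adds the SNI extension to the ClientHello;
    # passing None leaves it out, like s_client without -servername.
    name = host if send_sni else None
    with socket.create_connection((host, port), timeout=10) as raw:
        with ctx.wrap_socket(raw, server_hostname=name) as tls:
            return tls.getpeercert(binary_form=True)
```

                                                                                                                                                                              Against a host that depends on SNI, calling peer_cert_der(host, send_sni=False) would be expected either to fail during the handshake, as in the first run above, or to return a different (fallback) certificate than the send_sni=True call.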

                                                                                                                                                                              1. 3

                                                                                                                                                                                Android got SNI support around 2011 in versions 3 and later. Internet Explorer on Windows XP would have been the last holdout. I can’t imagine either of those can effectively use the internet today, especially given they’re both TLS 1.0 only clients and many servers require TLS 1.2 or later; at least anything under PCI-DSS scope.

                                                                                                                                                                          1. 23

                                                                                                                                                                            Openwrt all the way

                                                                                                                                                                            1. 2

                                                                                                                                                                              I’ve used openwrt in the past for single router/AP setups, but as far as I’m aware for larger properties it wouldn’t be enough, unless I’m misunderstanding something. Is it possible to use OpenWRT with multiple APs?

                                                                                                                                                                              1. 3

                                                                                                                                                                                It is possible, either as an 802.11s mesh or with a number of wired access points set up in bridge mode. I’m currently using the latter and it works fine.

                                                                                                                                                                              2. 2

                                                                                                                                                                                Same here, openwrt as the main router and a few dumb APs for wireless

                                                                                                                                                                                1. 3

                                                                                                                                                                                  What do you have for a dumb AP?

                                                                                                                                                                                  I’m in the market for something that I can broadcast two ssids (guest and home) and have them on separate vlans.

                                                                                                                                                                                2. 2

                                                                                                                                                                                  With what kind of hardware?

                                                                                                                                                                                  1. 3

                                                                                                                                                                                    Not the OP, but in my case a NetGear R7800. Does 802.11ac, has dual radios so you can run 2.4GHz & 5Ghz simultaneously. 4+1 gigabit ethernet ports with a half decent switch behind them that can do tagged vlans.

                                                                                                                                                                                    1. 1

                                                                                                                                                                                      I’m still using an old tplink archer c7. Probably gonna do an upgrade in the next year or so to get wifi 6. Pretty sure it was something like $80 back in 2014 or 2015.

                                                                                                                                                                                      1. 1

                                                                                                                                                                                        Not the OP, but I use a Linksys WRT1900ACS. A tad pricy, or was when I got it, but the wifi is good, has native support for OpenWRT, and and it’s fast enough to handle gigabit fiber.