1. 1

    Are there modern systems meeting these requirements?

    1. 1

      Maybe ClickHouse? https://clickhouse.com/docs/en/

      From commercial options: Vertica.

      I know, also, that KDB+ is used a lot in fintech.

      All of the systems I am aware of that support time series aggregations use a column-oriented storage model (not optimal when only a portion of a document/row is updated; for those workloads row-oriented storage works much better). Some use what they call a ‘hybrid’ storage model (I think SAP HANA is one of those).

      So if you search for ‘columnar oriented’ or ‘columnar compression’ + distributed – I am sure you will see a few options in this space.

      In terms of bringing the query to the data, rather than the data to the query: all of the systems (at least those I am aware of) that must operate on data larger than fits in RAM have to do it that way. They bring the query to each data shard and then merge the results.
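
      A minimal sketch of that “bring the query to the shard, then merge” pattern, in Python. The shard contents and the names `query_shard` and `scatter_gather` are hypothetical, made up for illustration; this is not how ClickHouse, Vertica, or KDB+ implement it, and a thread pool over in-memory lists only stands in for remote nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "shards": each holds (hour_bucket, views) rows.
SHARDS = [
    [("09:00", 3), ("10:00", 5)],
    [("09:00", 2), ("11:00", 7)],
    [("10:00", 1), ("11:00", 4)],
]


def query_shard(rows):
    """Runs next to the data and returns a small partial aggregate."""
    partial = Counter()
    for bucket, views in rows:
        partial[bucket] += views
    return partial


def scatter_gather(shards):
    """Send the same query to every shard, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(query_shard, shards))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts per key
    return dict(merged)


totals = scatter_gather(SHARDS)
```

      Only the small per-shard partial aggregates travel to the coordinator; the raw rows stay where they are stored.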

      In the old days, when the cost of sharding was high and RAM was limited, database engines would ‘spill’ data onto disk and use sorting techniques that keep little in RAM (like mergesort). I have not looked at this for a few years, so I am not sure how many of these ‘disk-centric’ data structures/algorithms are still in use.
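
      For anyone who has not seen the ‘spill and merge’ idea, here is a rough sketch of an external merge sort in Python. The function names and the `chunk_size` parameter are hypothetical, and real engines are far more careful about I/O, run sizes, and record formats; this only shows the shape of the technique.

```python
import heapq
import tempfile


def _spill(sorted_chunk):
    """Write one sorted run to a temporary file and rewind it for reading."""
    run = tempfile.TemporaryFile(mode="w+")
    run.writelines(f"{value}\n" for value in sorted_chunk)
    run.seek(0)
    return run


def _read_run(run):
    """Stream integers back from a spilled run, one line at a time."""
    for line in run:
        yield int(line)


def external_sort(values, chunk_size):
    """Sort ints while holding roughly chunk_size values in RAM at a time."""
    runs, chunk = [], []
    for value in values:
        chunk.append(value)
        if len(chunk) >= chunk_size:
            runs.append(_spill(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(_spill(sorted(chunk)))
    try:
        # heapq.merge streams the runs, keeping one pending value per run.
        yield from heapq.merge(*(_read_run(run) for run in runs))
    finally:
        for run in runs:
            run.close()


result = list(external_sort([9, 1, 8, 2, 7, 3, 6, 4, 5], chunk_size=3))
```

      Each chunk is sorted in memory, spilled to disk as a sorted run, and the runs are then merged in a single streaming pass.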

      1. 1

        This was a few years ago, but at a previous startup I worked at we used custom Cassandra tables for tracking time series analytics about ad views. Later on we switched to using Druid, and found it to be much better. It’s definitely quirky, but does a great job at tracking time series data. I’m sure there are other options now though.

      1. 1

        Why is the case so large but empty?

        1. 4

          There’s a lot of reasons. You need something big enough to hold both the innards and the various probes and wires for a logic analyser. Some modules start out as breadboard prototypes before being integrated on the main PCB, so you may need to house both the innards and a breadboard. During early development, you need buttons that are easy to reach and easy to press, with contacts that are easy to connect to signal generators, even if the actual product has small buttons, because you are going to press them damn buttons thousands of times and stress-test everything with a signal generator that presses buttons hundreds of times per second before you get IRQ handling right. That’s why the wheel is about as big as the whole iPod. Boards slip out easily when they’re held in place by duct tape (you can’t screw them in place because sometimes you need to flip them over for legit purposes, like reaching a test pad), and you want something that’s big enough to allow for some slippage without falling onto something metallic on someone’s desk and accidentally shorting the board.

          This is a pretty late prototype so some of these reasons are less obvious, but – obviously, depending on what route each development team takes and so on – earlier prototypes tend to make these things look far less empty. I’ve worked on things that ended up slightly larger than a VHS tape but started out in a box big enough to take up half my desk.

          Edit: just a few additional notes because I just know these are gonna pop up :-D.

          1. Dogfooding: yes, it’s a valid concern, but if you start doing that too early, all you get is angry developers who have to press tiny buttons while being careful not to accidentally short teenie-weenie 0402 SMDs with their fingers. Some development has to happen before you can dogfood some things. Managers who insist on doing it as early as technically possible aren’t visionary, just clueless.

          2. Obscuring hardware details from software teams and vice-versa: first of all, there’s only so much you can obscure before making development literally impossible – e.g. the people doing the UI need to know how big the screen is, not to mention that the people doing the hardware need to know how big the screen is and where the buttons are. If you give the former a box that has a spinwheel and four buttons and a screen yay big, it’s not that hard to put two and two together and get a basic set of specs.

          What the author most likely meant to say by this:

          It also has the Jobsian side-benefit of keeping the engineers in the dark about what the final device will look like.

          is IMHO that it had the side-benefit of allowing people to work on the software while minimising the chances of a leak about the case design. That’s obviously important for a company which uses leaks as a marketing tool.

          Second, while you can technically do this obscuring thing (albeit not too early in a project) it’s usually a very bad idea – it’s one of the ways that you get hardware which is ill-suited for the software it’s supposed to run, and slow software (that’s not even inefficient, it’s just optimised for the wrong hardware).

          A prototype like this one has the far more mundane benefit of allowing people to start working on the software way before the hardware is finished in all its details. Depending on what CPU you use, you can often start working on the software using a development board, long before the first hardware design draft, as long as you stick to the correct parts. This isn’t some Jobsian vision, it’s how embedded software has been written since practically forever.

          1. 1

            It also has the Jobsian side-benefit of keeping the engineers in the dark about what the final device will look like.

            I understood this to mean the engineers could work on hardware and software of the iPod without risk of them leaking what the final device would look like. It’s not clear if that was the intention of the large case, or a natural side effect.

            I also expect that developing on a prototype is much more productive and inexpensive. You could swap components more easily than if it were all glued together in a tiny metal case.

          1. 2

            I recently started using the “newsboat” reader for RSS feeds, and find it much less distracting. My primary feeds are mostly security bulletins, but it is much quicker and more efficient to find a newly posted bulletin and the required info via newsboat than on the actual website. I have become a fan, again, of RSS!

            1. 1

              I recently installed ttrss. I had been using rss2email for the arxiv which was nice since searching for names brought up both email and articles.

              In any case, I find RSS is helping reduce my compulsive checking. I wish that lobsters’ article ranking would somehow appear in the feed.

            1. 2

              Lotus Notes is also a product of the (late!) 80s. I’m too young to have used it, but it strikes me as a collaborative tool in the groupware space that fibery occupies now?

              1. 4

                I don’t know about Fibery (I just submitted this article b/c of hypertext and the fact the promo was minimal), but I’ve used Notes and I wouldn’t really call it hypertext. It kinda smashes two things together:

                • A distributed document database, with an easy to use UI (building/reporting/forms) on top of it, which it excelled at
                • A PIM/groupware thing on top of that, with a notoriously clumsy UI
                1. 1

                  Since I kinda forgot to say it and it’s important to the point I was trying to make here, and it’s too late to edit it in: Domino/Notes is multiplayer FileMaker/Access more than anything.

              1. 4

                The live demo is really great! The drawing gesture feels a lot like holding chalk. I can’t sketch as quick as I’d like, presumably because there’s plenty of motion blur.

                1. 1

                  In moving back to work and emptying my home office, I’ve been struck by how my books, CDs, DVDs, and video games are all physical artefacts from a particular pre-streaming age… So I have little from the past few years, despite having experienced a lot of culture. I imagine this is common? Do folks buy so many paper books? Records?

                  1. 1

                    I feel obliged to share the counter point to this. I’ve never had a place that I could call my own for any long amount of time (more than a year or two). In fact, a lot of the possessions I maintain are the things that I was able to grab and carry on my back at the last moment. I’m so glad that I never had to decide if I could grab the photo album or not, because one day I might not be able to and those memories are gone forever.

                  1. 3

                    This is great to see. I was at UChicago and remember Eugenia Cheng giving a talk about this. It profoundly affected my perspective on mathematics and the goal of mathematical proof… that mathematics is ultimately about more than just lists of true theorems.

                    Thurston’s On proof and progress in mathematics feels related. Or the “literate programming” conceit that programs are written first for human insight. I often wonder how all this connects to “proofgramming” and the interest in the formal verification of mathematics.

                    1. 2

                      See also “Social processes and proofs of theorems and programs” for an unduly pessimistic but insightful take on the implications of “social proof” in mathematics for formal verification of software.

                    1. 5

                      Why were these BGP updates being made in the first place? Must these updates be made periodically?

                      1. 10

                        We’ll have to wait and see if FB releases a postmortem. I’d suspect, based on outages I’ve seen at much smaller companies, someone made the wrong update, and then got locked out of the networking infrastructure because it was done remotely. This change might have passed review because it seemed unrelated to their primary BGP advertisements, but had a bug or fall through that affected way more than it should have.

                        There’s also the possibility it was intentional malice (I’ve heard some lead network engineers quit last week; might be rumors/complete BS or unrelated. I’m sure FB wouldn’t admit if it was).

                        I suspect if this was just a user/developer failure, FB will probably put additional safety procedures in place. They might add additional LTE connections to the data centers just for network engineers as backups. They might also require a network engineer with credentials to be in the physical data center and logged in, prior to major route changes.

                        1. 2

                          Surprised FB does not plonk someone down in the same city as any of their DCs so they don’t have to make the six-hour drive to the router.

                        2. 1

                          If you acquire a new set of IP addresses, or change the allocation of your existing IP addresses to different parts of your network so that the outside world needs to take a different route to get to them, then you’ll update your network peers via a BGP update.

                        1. 2

                          Is there something like thingiverse but for parametric models? Or just an “awesome list” of openscad code?

                          1. 1

                            Fun to see the Twiddler mentioned – I used one a couple decades ago! The Twiddler was connected to a computer being carried, but now the Twiddler and computer could be a single hand held device: a smartphone could be strapped to your hand with a matrix of keys on the side.

                            1. 2

                              I hadn’t realized that David Farmer was involved with this project.

                              I wonder if it has been converted to PreTeXt.

                              1. 2

                                Writing bootsector games is a lot of fun.

                                There is also http://olivier.poudade.free.fr/src/2048.asm which is the 2048 game in 2048 bits, i.e., 256 bytes.

                                1. 2

                                  I understand this. I use Phosh on Pinephone as my daily driver, but I cannot yet make an unqualified recommendation for it.

                                  What I can say is that if you’re willing to accept a reduced featureset and polish, a Linux-based OS for phones is doable today.

                                  I also have a mainstream mobile phone which I use for workflows not yet supported on Linux-based phone OSs, such as navigation and banking.

                                  1. 2

                                    Is banking not possible in the browser?

                                    1. 1

                                      My bank offers mobile deposits of paper checks with an android app, but afaik the bank website doesn’t provide mobile deposits.

                                      1. 1

                                        Ah fair point. A rare but useful need for me.

                                  1. 50

                                    The paper has this to say (page 9):

                                    Regarding potential human research concerns. This experiment studies issues with the patching process instead of individual behaviors, and we do not collect any personal information. We send the emails to the Linux community and seek their feedback. The experiment is not to blame any maintainers but to reveal issues in the process. The IRB of University of Minnesota reviewed the procedures of the experiment and determined that this is not human research. We obtained a formal IRB-exempt letter.


                                    Honoring maintainer efforts. The OSS communities are understaffed, and maintainers are mainly volunteers. We respect OSS volunteers and honor their efforts. Unfortunately, this experiment will take certain time of maintainers in reviewing the patches. To minimize the efforts, (1) we make the minor patches as simple as possible (all of the three patches are less than 5 lines of code changes); (2) we find three real minor issues (i.e., missing an error message, a memory leak, and a refcount bug), and our patches will ultimately contribute to fixing them.

                                    I’m not familiar with the generally accepted standards on these kinds of things, but this sounds rather iffy to me. I’m very far removed from academia, but I’ve participated in a few studies over the years, which were always just questionnaires or interviews, and even for those I had to sign a consent waiver. “It’s not human research because we don’t collect personal information” seems a bit strange.

                                    Especially since the wording “we will have to report this, AGAIN, to your university” implies that this isn’t the first time this has happened, and that the kernel folks have explicitly objected to being subject to this research before this patch.

                                    And trying to pass off these patches as being done in good faith with words like “slander” is an even worse look.

                                    1. 78

                                      They are experimenting on humans, involving these people in their research without notice or consent. As someone who is familiar with the generally accepted standards on these kinds of things, it’s pretty clear-cut abuse.

                                      1. 18

                                        I would agree. Consent is absolutely essential but just one of many ethical concerns when doing research. I’ve seen simple usability studies be rejected due to lesser issues.

                                        It’s pretty clear this is abuse: the kernel team and maintainers feel strongly enough to ban the whole institution.

                                        1. 10

                                          Yeah, agreed. My guess is they misrepresented the research to the IRB.

                                          1. 3

                                            They are experimenting on humans

                                            This project claims to be targeted at the open-source review process, and seems to be as close to human experimentation as pentesting (which, when you do social engineering, also involves interacting with humans, often without their notice or consent) - which I’ve never heard anyone claim is “human experimentation”.

                                            1. 19

                                              A normal penetration testing gig is not academic research though. You need to separate between the two, and also hold one of them to a higher standard.

                                              1. 0

                                                A normal penetration testing gig is not academic research though. You need to separate between the two, and also hold one of them to a higher standard.

                                                This statement is so vague as to be almost meaningless. In what relevant ways is a professional penetration testing contract (or, more relevantly, the associated process) different from this particular research project? Which of the two should be held to a higher standard? Why? What does “held to a higher standard” even mean?

                                                Moreover, that claim doesn’t actually have anything to do with the comment I was replying to, which was claiming that this project was “experimenting on humans”. It doesn’t matter whether or not something is “research” or “industry” for the purposes of whether or not it’s “human experimentation” - either it is, or it isn’t.

                                                1. 18

                                                  Resident pentester and ex-academia sysadmin checking in. I totally agree with @Foxboron and their statement is not vague nor meaningless. Generally in a penetration test I am following basic NIST 800-115 guidance for scoping and target selection, and then supplementing it with my clients’ contractual expectations. I can absolutely tell you that the methodologies that are used by academia should be held to a higher standard in pretty much every regard I could possibly come up with. A penetration test does not create a custom methodology attempting to deal with outputting scientific and repeatable data.

                                                  Let’s put it in real terms: I am hired to do a security assessment against a fixed, highly focused set of targets explicitly defined in the contract by my client, on an extremely fixed timeline (often very short… like 2 weeks maximum and 5 days on average). Guess what happens if social engineering is not in my contract? I don’t do it.

                                                  1. 1

                                                    Resident pentester and ex-academia sysadmin checking in.

                                                    Note: this is worded like an appeal to authority, although you probably don’t mean it that way, so I’m not going to act like you are.

                                                    I totally agree with @Foxboron and their statement is not vague nor meaningless.

                                                    Those are two completely separate things, and neither is implied by the other.

                                                    their statement is not vague nor meaningless.

                                                    Not true - their statement contained none of the information you just provided, nor any other sort of concrete or actionable information - the statement “hold to a higher standard” is both vague and meaningless by itself…and it was by itself in that comment (or, obviously, there were other words - none of them relevant) - there was no other information.

                                                    the methodologies that are used by academia should be held to a higher standard

                                                    Now you’re mixing definitions of “higher standard” - GP and I were talking about human experimentation and ethics, while you seem to be discussing rigorousness and reproducibility of experiments (although it’s not clear, because “A penetration test does not create a custom methodology attempting to deal with outputting scientific and repeatable data” is slightly ambiguous).

                                                    None of the above is relevant to the question of “was this a human experiment” and the closely-related one “is penetration testing a human experiment”. Evidence suggests “no” given that the term does not appear in that document, nor have I heard of any pentest being reviewed by an ethics review board, nor have I heard any mention of “human experimenting” in the security community (including when gray-hat and black-hat hackers and associated social engineering e.g. Kevin Mitnick are mentioned), nor are other similar, closer-to-human experimentation (e.g. A/B testing, which is far closer to actually experimenting on people) processes considered to be such - up until this specific case.

                                                  2. 5

                                                    if you’re an employee in an industry, you’re either informed of penetration testing activity, or you’ve at the very least tacitly agreed to it along with many other things that exist in employee handbooks as a condition of your employment.

                                                    if a company did this to their employees without any warning, they’d be shitty too, but the possibility that this kind of underhanded behavior in research could taint the results and render the whole exercise unscientific is nonzero.

                                                    either way, the goals are different. research seeks to further the verifiability and credibility of information. industry seeks to maximize profit. their priorities are fundamentally different.

                                                    1. 1

                                                      you’ve at the very least tacitly agreed to it along with many other things that exist in employee handbooks as a condition of your employment

                                                      By this logic, you’ve also agreed to everything else in a massive, hundred-page long EULA that you click “I agree” on, as well as consent to be tracked by continuing to use a site that says that in a banner at the bottom, as well as consent to Google/companies using your data for whatever they want and/or selling it to whoever will buy.

                                                      …and that’s ignoring whether or not companies that have pentesting done on them actually explicitly include that specific warning in your contract - “implicit” is not good enough, as then anyone can claim that, as a Linux kernel patch reviewer, you’re “implicitly agreeing that you may be exposed to the risk of social engineering for the purpose of getting bad code into the kernel”.

                                                      the possibility that this kind of underhanded behavior in research could taint the results and render the whole exercise unscientific

                                                      Like others, you’re mixing up the issue of whether the experiment was properly-designed with the issue of whether it was human experimentation. I’m not making any attempt to argue the former (because I know very little about how to do good science aside from “double-blind experiments yes, p-hacking no”), so I don’t know why you’re arguing against it in a reply to me.

                                                      either way, the goals are different. research seeks to further the verifiability and credibility of information. industry seeks to maximize profit. their priorities are fundamentally different.

                                                      I completely agree that the goals are different - but again, that’s irrelevant for determining whether or not something is “human experimentation”. Doesn’t matter what the motive is, experimenting on humans is experimenting on humans.

                                                2. 18

                                                  This project claims to be targeted at the open-source review process, and seems to be as close to human experimentation as pentesting (which, when you do social engineering, also involves interacting with humans, often without their notice or consent) - which I’ve never heard anyone claim is “human experimentation”.

                                                  I had a former colleague who once bragged about getting someone fired at his previous job during a pentesting exercise. He basically walked over to this frustrated employee at a bar and bribed him with a ton of money and a job offer in return for plugging a USB key into the network. He then reported it to senior management and the employee was fired. While that is an effective demonstration of a vulnerability in their organization, what he did was unethical under many moral frameworks.

                                                  1. 2

                                                    First, the researchers didn’t engage in any behavior remotely like this.

                                                    Second, while indeed an example of pentesting, most pentesting is not like this.

                                                    Third, the fact that it was “unethical under many moral frameworks” is irrelevant to what I’m arguing, which is that the study was not “human experimentation”. You can steal money from someone, which is also “unethical under many moral frameworks”, and yet still not be doing “human experimentation”.

                                                  2. 3

                                                    If there is a pentest contract, then there is consent, because consent is one of the pillars of contract law.

                                                    1. 1

                                                      That’s not an argument that pentesting is human experimentation in the first place.

                                                3. 42

                                                  The statement from the UMinn IRB is in line with what I heard from the IRB at the University of Chicago after they experimented on me, who said:

                                                  I asked about their use of any interactions, or use of information about any individuals, and they indicated that they have not and do not use any of the data from such reporting exchanges other than tallying (just reports in aggregate of total right vs. number wrong for any answers received through the public reporting–they said that much of the time there is no response as it is a public reporting system with no expectation of response) as they are not interested in studying responses, they just want to see if their tool works and then also provide feedback that they hope is helpful to developers. We also discussed that they have some future studies planned to specifically study individuals themselves, rather than the factual workings of a tool, that have or will have formal review.

                                                  Because they claim they’re studying the tool, it’s OK to secretly experiment on random strangers without disclosure. Somehow I doubt they test new drugs by secretly dosing people and observing their reactions, but UChicago’s IRB was 100% OK with doing so to programmers. I don’t think these IRBs literally consider programmers sub-human, but it would be very inconvenient to accept that experimenting on strangers is inappropriate, so they only want to do so in places they’ve been forced to by historical abuse. I’d guess this will continue for years until some random person is very seriously harmed by being experimented on (loss of job/schooling, pushing someone unstable into self-harm, targeting someone famous outside of programming) and then over the next decade IRBs will start taking it seriously.

                                                  One other approach that occurs to me is that the experimenters and IRBs claim they’re not experimenting on their subjects. That’s obviously bullshit because the point of the experiment is to see how the people respond to the treatment, but if we accept the lie it leaves an open question: what is the role played by the unwitting subject? Our responses are tallied, quoted, and otherwise incorporated into the results in the papers. I’m not especially familiar with academic publishing norms, but perhaps this makes us unacknowledged co-authors. So maybe another route to stopping experimentation like this would be things like claiming copyright over the papers, asking journals for the papers to be retracted until we’re credited, or asking the universities to open academic misconduct investigations over the theft of our work. I really don’t have the spare attention for this, but if other subjects wanted to start the ball rolling I’d be happy to sign on.

                                                  1. 23

                                                    I can kind of see where they’re coming from. If I want to research whether car mechanics can reliably detect some fault, then sending a prepared car to 50 garages is probably okay, or at least a lot less iffy. This kind of (informal) research is actually fairly commonly done by consumer advocacy groups and the like. The difference is that the car mechanics will get paid for their work, whereas the Linux devs and you didn’t.

                                                    I’m gonna guess the IRBs probably aren’t too familiar with the dynamics here, although the researchers definitely were and should have known better.

                                                    1. 18

                                                      Here it’s more like keying someone’s car to see how long it takes them to get an insurance claim.

                                                      1. 4

                                                        Am I misreading? I thought the MR was a patch designed to fix a potential problem, and the issue was

                                                        1. pushcx thought it wasn’t a good fix (making it a waste of time)
                                                        2. they didn’t disclose that it was an auto-generated PR.

                                                        Those are legitimate complaints, cf. https://blog.regehr.org/archives/2037, but from the analogies employed (drugs, dehumanization, car-keying), I have to double-check that I haven’t missed an aspect of the interaction that makes it worse than it seemed to me.

                                                        1. 2

                                                          We were talking about Linux devs/maintainers too, I commented on that part.

                                                          1. 1

                                                            Gotcha. I missed that “here” was meant to refer to the Linux case, not the Lobsters case from the thread.

                                                      2. 1

                                                        Though there they are paying the mechanic.

                                                      3. 18

                                                        IRB is a regulatory board that is there to make sure that researchers follow the (Common Rule)[https://www.hhs.gov/ohrp/regulations-and-policy/regulations/common-rule/index.html].

                                                        In general, any work that receives federal funding needs to comply with the federal guidelines for human subject research. All work involving human subjects (usually defined as research activities that involve interaction with humans) needs to be reviewed and approved by the institution’s IRB. These approvals fall within a continuum, from a full IRB review (which involves the researcher going to a committee and explaining their work and usually includes continued annual reviews) to a declaration of the work being exempt from IRB supervision (usually this happens when the work meets one of the 7 exemptions listed in the federal guidelines). The whole process is a little bit more involved; see for example all the charts (https://www.hhs.gov/ohrp/regulations-and-policy/decision-charts/index.html) to figure this out.

                                                        These rules do not cover research that doesn’t involve humans, such as research on technology tools. I think that there is currently a grey area where a researcher can claim that they are studying a tool and not the people interacting with the tool. It’s a lame excuse that probably goes around the spirit of the regulations and is probably unethical from a research standpoint. The data aggregation method or the data anonymization is usually a requirement for an exempt status and not a non-human research status.

                                                        The response that you received from IRB is not surprising, as they probably shouldn’t have approved the study as non-human research but now they are just protecting the institution from further harm rather than protecting you as a human subject in the research (which, by the way, is not their goal at this point).

                                                        One thing that sticks out to me about your experience is that you weren’t asked to give consent to participate in the research. That usually requires a full IRB review as informed consent is a requirement for (most) human subject research. Exempt research still needs informed consent unless it’s secondary data analysis of existing data (which your specific example doesn’t seem to be).

                                                        One way to quickly fix it is to contact the grant officer that oversees the federal program that is funding the research. A nice email can go a long way to force change and hit the researchers/university where they care the most. It should state that you were coerced to participate in the research study by simply doing your work (i.e., reviewing a patch submitted to a project that you lead), without being given the opportunity to provide prospective consent and without receiving compensation for your participation, and that the research team/university is refusing to remove your data even after you contacted them because they claim that the research doesn’t involve human subjects.

                                                        1. 7

                                                          Thanks for explaining more of the context and norms, I appreciate the introduction. Do you know how to find the grant officer or funding program?

                                                          1. 7

                                                            It depends on how “stalky” you want to be.

                                                            If NSF was the funder, they have a public search here: https://nsf.gov/awardsearch/

                                                            Most PIs also add a line about grants received to their CVs. You should be able to match the grant title to the research project.

                                                            If they have published a paper from that work, it should probably include an award number.

                                                            Once you have the award number, you can search the funder website for it and you should find a page with the funding information that includes the program officer/manager contact information.

                                                            1. 3

                                                              If they published a paper about it they likely included the grant ID number in the acknowledgements.

                                                              1. 1

                                                                You might have more luck reaching out to the sponsored programs office at their university, as opposed to first trying to contact an NSF program officer.

                                                            2. 4

                                                              How about something like a Computer Science - External Review Board? Open source projects could sign up, and include a disclaimer that their project and community ban all research that hasn’t been approved. The approval process could be as simple as a GitHub issue the researcher has to open, and anyone in the community could review it.

                                                              It wouldn’t stop the really bad actors, but any IRB would have to explain why they allowed an experiment on subjects that explicitly refused consent.

                                                              [Edit] I felt sufficiently motivated, so I made a quick repo for the project. Suggestions welcome.

                                                              1. 7

                                                                I’m in favor of building our own review boards. It seems like an important step in our profession taking its responsibility seriously.

                                                                The single most important thing I’d say is, be sure to get the scope of the review right. I’ve looked into this before and one of the more important limitations on IRBs is that they aren’t allowed to consider the societal consequences of the research succeeding. They’re only allowed to consider harm to experimental subjects. My best guess is that it’s like that because that’s where activists in the 20th-century peace movement ran out of steam, but it’s a wild guess.

                                                                1. 4

                                                                  At least in security, there are a lot of different Hacker Codes of Ethics floating around, which pen testers are generally expected to adhere to… I don’t think any of them cover this specific scenario though.

                                                                  1. 2

                                                                    any so-called “hacker code of ethics” in use by any for-profit entity places protection of that entity first and foremost before any other ethical consideration (including human rights) and would likely not apply in a research scenario.

                                                              2. 23

                                                                They are bending the rules for non-human research. One of the exceptions for non-human research is research on organizations, which my IRB defines as “Information gathering about organizations, including information about operations, budgets, etc. from organizational spokespersons or data sources. Does not include identifiable private information about individual members, employees, or staff of the organization.” Within this exception, you can talk with people about how the organization merges patches but not how they personally do that (for example). All the questions need to be about the organization and not the individual as part of the organization.

                                                                On the other hand, research involving human subjects is defined as any research activity that involves an “individual who is or becomes a participant in research, either:

                                                                • As a recipient of a test article (drug, biologic, or device); or
                                                                • As a control.”

                                                                So, this is how I interpret what they did.

                                                                The researchers submitted an IRB approval saying that they just downloaded the kernel maintainer mailing lists and analyzed the review process. This doesn’t meet the requirements for IRB supervision because it’s either (1) secondary data analysis using publicly available data or (2) research on organizational practices of the OSS community after all identifiable information is removed.

                                                                Once they started emailing the list with bogus patches (as the maintainers allege), the research involved human subjects as these people received a test article (in the form of an email) and the researchers interacted with them during the review process. The maintainers processing the patch did not do so to provide information about their organization’s processes and did so in their own personal capacity (In other words, they didn’t ask them how the OSS community processes this patch but asked them to process a patch themselves). The participants should have given consent to participate in the research and the risks of participating in it should have been disclosed, especially given the fact that missing a security bug and agreeing to merge it could be detrimental to someone’s reputation and future employability (that is, this would qualify for more than minimal risk for participants, requiring a full IRB review of the research design and process) with minimal benefits to them personally or to the organization as a whole (as it seems from the maintainers’ reaction to a new patch submission).

                                                                One way to design this experiment ethically would have been to email the maintainers and invite them to participate in a “lab based” patch review process where the research team would present them with “good” and “bad” patches and ask them whether they would have accepted them or not. This is after they were informed about the study and exercised their right to informed consent. I really don’t see how emailing random stuff out and seeing how people interact with it (with their full name attached to it and in full view of their peers and employers) can qualify as research with less than minimal risks and that doesn’t involve human subjects.

                                                                The other thing that rubs me the wrong way is that they sought (and supposedly received) retroactive IRB approval for this work. That wouldn’t fly with my IRB, as my IRB person would definitely rip me a new one for seeking retroactive IRB approval for work that is already done, data that was already collected, and a paper that is already written and submitted to a conference.

                                                                1. 6

                                                                  You make excellent points.

                                                                  1. IRB review has to happen before the study is started. For NIH, the grant application has to have the IRB approval - even before a single experiment is even funded to be done, let alone actually done.
                                                                  2. I can see the value of doing a test “in the field” so as to get the natural state of the system. In a lab setting where the participants know they are being tested, various things will happen to skew results. The volunteer reviewers might be systematically different from the actual population of reviewers, the volunteers may be much more alert during the experiment and so on.

                                                                  The issue with this study is that there was no serious thought given to what the ethical ramifications of this are.

                                                                  If the pen tested system has not asked to be pen tested then this is basically a criminal act. Otherwise all bank robbers could use the “I was just testing the security system” defense.

                                                                  1. 8

                                                                    The same requirement for prior IRB approval is necessary for NSF grants (which the authors seem to have received). By what they write in the paper and my interpretation of the circumstances, they self certified as conducting non-human research at time of submitting the grant and only asked their IRB for confirmation after they wrote the paper.

                                                                    Totally agree with the importance of “field experiment” work and that, sometimes, it is not possible to get prospective consent to participate in the research activities. However, the guidelines are clear on what activities fall within research activities that are exempt from prior consent. The only one that I think is applicable to this case is exception 3(ii):

                                                                    (ii) For the purpose of this provision, benign behavioral interventions are brief in duration, harmless, painless, not physically invasive, not likely to have a significant adverse lasting impact on the subjects, and the investigator has no reason to think the subjects will find the interventions offensive or embarrassing. Provided all such criteria are met, examples of such benign behavioral interventions would include having the subjects play an online game, having them solve puzzles under various noise conditions, or having them decide how to allocate a nominal amount of received cash between themselves and someone else.

                                                                    These usually cover “simple” psychology experiments involving mini games or economics games involving money.

                                                                    In the case of this kernel patching experiment, it is clear that this experiment doesn’t meet this requirement as participants have found this intervention offensive or embarrassing, to the point that they are banning the researchers’ institution from pushing patches to the kernel. Also, I am not sure if reviewing a patch is a “benign game” as this is the reviewers’ jobs, most likely. Plus, the patch review could have adverse lasting impact on the subject if they get asked to stop reviewing patches if they don’t catch the security risk (e.g., being deemed incompetent).

                                                                    Moreover, there is this follow up stipulation:

                                                                    (iii) If the research involves deceiving the subjects regarding the nature or purposes of the research, this exemption is not applicable unless the subject authorizes the deception through a prospective agreement to participate in research in circumstances in which the subject is informed that he or she will be unaware of or misled regarding the nature or purposes of the research.

                                                                    As their patch submission process was deceptive in nature, as they outline in the paper, exemption 3(ii) cannot apply to this work unless they notify maintainers that they will be participating in a deceptive research study about kernel patching.

                                                                    That leaves the authors to either pursue full IRB review for their work (as a full IRB review can approve a deceptive research project if it deems it appropriate and the risk/benefit balance is in favor of the participants) or to self-certify as non-human subjects research and fix any problems later. They decided to go with the latter.

                                                                2. 35

                                                                  We believe that an effective and immediate action would be to update the code of conduct of OSS, such as adding a term like “by submitting the patch, I agree to not intend to introduce bugs.”

                                                                  I copied this from that paper. This is not research, anyone who writes a sentence like this with a straight face is a complete moron and is just mocking about. I hope all of this will be reported to their university.

                                                                  1. 18

                                                                    It’s not human research because we don’t collect personal information

                                                                    I yelled bullshit so loud at this sentence that it woke up the neighbors’ dog.

                                                                    1. 2

                                                                      Yeah, that came from the “clarifications” which is garbage top to bottom. They should have apologized, accepted the consequences and left it at that. Here’s another thing they came up with in that PDF:

                                                                      Suggestions to improving the patching process In the paper, we provide our suggestions to improve the patching process.

                                                                      • OSS projects would be suggested to update the code of conduct, something like “By submitting the patch, I agree to not intend to introduce bugs”

                                                                      i.e. people should say they won’t do exactly what we did.

                                                                      They acted in bad faith, skirted IRB through incompetence (let’s assume incompetence and not malice) and then act surprised.

                                                                    2. 14

                                                                      Apparently they didn’t ask the IRB about the ethics of the research until the paper was already written: https://www-users.cs.umn.edu/~kjlu/papers/clarifications-hc.pdf

                                                                      Throughout the study, we honestly did not think this is human research, so we did not apply for an IRB approval in the beginning. We apologize for the raised concerns. This is an important lesson we learned—Do not trust ourselves on determining human research; always refer to IRB whenever a study might be involving any human subjects in any form. We would like to thank the people who suggested us to talk to IRB after seeing the paper abstract.

                                                                      1. 14

                                                                        I don’t approve of researchers YOLOing IRB protocols, but I also want this research done. I’m sure many people here are cynical/realistic enough that the results of this study aren’t surprising. “Of course you can get malicious code in the kernel. What sweet summer child thought otherwise?” But the industry as a whole proceeds largely as if that’s not the case (or you could say that most actors have no ability to do anything about the problem). Heighten the contradictions!

                                                                        There are some scary things in that thread. It sounds as if some of the malicious patches reached stable, which suggests that the author mostly failed by not being conservative enough in what they sent. Or for instance:

                                                                        Right, my guess is that many maintainers failed in the trap when they saw respectful address @umn.edu together with commit message saying about “new static analyzer tool”.

                                                                        1. 17

                                                                          I agree, while this is totally unethical, it’s very important to know how good the review processes are. If one curious grad student at one university is trying it, you know every government intelligence department is trying it.

                                                                          1. 8

                                                                            I entirely agree that we need research on this topic. There’s better ways of doing it though. If there aren’t better ways of doing it, then it’s the researcher’s job to invent them.

                                                                          2. 7

                                                                            It sounds as if some of the malicious patches reached stable

                                                                            Some patches from this University reached stable, but it’s not clear to me that those patches also introduced (intentional) vulnerabilities; the paper explicitly mentions the steps that they’re taking to ensure those patches don’t reach stable (I omitted that part, but it’s just before the part I cited).

                                                                            All umn.edu patches are being reverted, but at this point it’s mostly a matter of “we don’t trust these patches and will need additional review” rather than “they introduced security vulnerabilities”. A number of patches already have replies from maintainers indicating they’re genuine and should not be reverted.

                                                                            1. 5

                                                                              Yes, whether actual security holes reached stable or not is not completely clear to me (or apparently to maintainers!). I got that impression from the thread, but it’s a little hard to say.

                                                                              Since the supposed mechanism for keeping them from reaching stable is conscious effort on the part of the researchers to mitigate them, I think the point may still stand.

                                                                              1. 1

                                                                                It’s also hard to figure out what the case is since there is no clear answer what the commits were, and where they are.

                                                                            2. 4

                                                                              The Linux review process is so slow that it’s really common for downstream folks to grab under-review patches and run with them. It’s therefore incredibly irresponsible to put patches that you know introduce security vulnerabilities into this form. Saying ‘oh, well, we were going to tell people before they were deployed’ is not an excuse and I’d expect it to be a pretty clear-cut violation of the Computer Misuse Act here and equivalent local laws elsewhere. That’s ignoring the fact that they were running experiments on people without their consent.

                                                                              I’m pretty appalled that Oakland accepted the paper for publication. I’ve seen papers rejected from there before because they didn’t have appropriate ethics review oversight.

                                                                          1. 2

                                                                            I use an Iris currently with fairly aggressive tenting, with kailh box whites.

                                                                            I’m planning on building a corne next, with box jades. I’m also interested in a dactyl but hand-wiring seems hard.

                                                                            1. 1

                                                                              The wiring is why my Dactyl is not finished - I was thinking about making some single key kailh hot swap PCBs that could be mounted in the 3D printed case, but that is another project :~)

                                                                            1. 1

                                                                              Maybe just recycle them, they are way too outdated.

                                                                              1. 7

                                                                                “Outdated”? They’re Raspberry Pis, jeepers. You’re not expecting to run Crysis on them.

                                                                                I have a little home data center and all the computers in it are 5 to 10 years old. They work fine as servers.

                                                                                1. 1

                                                                                  Outdated compared to an rpi4 or even an rpi3. A 5-to-10-year-old x86 machine may still work fine (I am running a home lab with a 1st gen Xeon), but the first generation of rpi is way too slow to run most applications, due to RAM and I/O limitations.

                                                                                  1. 2

                                                                                    I ran nextcloud on gentoo on my original rpi. The stories of their inadequate resources to be useful are greatly exaggerated

                                                                                    1. 2

                                                                                      I mean, my gen 1 or whatever works great for a shell server (SOCKS proxy, IRC client, webhook callback testing) and random other stuff, like hanging a CO2 logger off of. It also used to host a Jabber chat service, although I now have that running on a different machine for reasons unrelated to performance.

                                                                                      There’s already enough of a problem with electronic waste. Unless the power draw difference is massive (10x?) then there’s no reason to get a new computer for basic stuff like this.

                                                                                      1. 1

                                                                                        I’m super curious why you have a CO2 logger at home?

                                                                                        1. 1

                                                                                          I think I originally bought it because I’d seen these studies on the impact of CO2 concentration on human cognition, suggesting that long meetings at work in closed meeting rooms literally made people stupider by the end of the meeting. (I’m still not convinced this is a thing.) And it turned out that I could get a USB CO2 meter for reasonably cheap on craigslist, so I bought one from a little old lady who had used it for her cannabis grow room.

                                                                                          It turned out that the office CO2 levels were quite good, only about 600 ppm, even in the conference rooms. My home, though, was around 1000 ppm, rising up to 2000 or even 3000 ppm in a closed bedroom overnight (with two adults). I used the 1000 ppm, the floor area, and the outdoor CO2 levels (450-ish) to calculate the air exchange rate for the main living area, which came out to… I actually don’t recall exactly, but it might have been 1/3 turnover per hour. Less than I thought.
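
                                                                                          For anyone who wants to redo that estimate, here is a minimal sketch of one way to do it, assuming a single well-mixed room at steady state where the occupants are the only CO2 source. Only the 1000 ppm and 450 ppm figures come from the comment above; the per-person CO2 output, floor area, and ceiling height are placeholder assumptions, so the result is illustrative and not the number I originally got.

```python
def air_changes_per_hour(indoor_ppm, outdoor_ppm, occupants,
                         co2_per_person_m3_h, floor_area_m2, ceiling_height_m):
    """Steady-state, well-mixed estimate of air changes per hour (ACH)."""
    # Convert the indoor/outdoor difference from ppm to a volume fraction.
    excess_fraction = (indoor_ppm - outdoor_ppm) * 1e-6
    if excess_fraction <= 0:
        raise ValueError("indoor CO2 must exceed outdoor CO2")
    # Mass balance at steady state:
    #   CO2 generated per hour = ventilation flow * excess concentration
    ventilation_m3_h = occupants * co2_per_person_m3_h / excess_fraction
    room_volume_m3 = floor_area_m2 * ceiling_height_m
    return ventilation_m3_h / room_volume_m3


# Indoor and outdoor ppm come from the comment; every other number is assumed.
ach = air_changes_per_hour(
    indoor_ppm=1000,
    outdoor_ppm=450,
    occupants=2,                # assumed
    co2_per_person_m3_h=0.018,  # assumed: roughly 18 L/h for a resting adult
    floor_area_m2=60,           # assumed
    ceiling_height_m=2.5,       # assumed
)
print(f"{ach:.2f} air changes per hour")
```

                                                                                          With those made-up inputs it lands a bit under half an air change per hour. The estimate is very sensitive to the assumed room volume and per-person CO2 output, so treat it as a rough order-of-magnitude check.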

                                                                                          Most recently I’ve hooked up the meter to my rpi in the basement and set up a logger script using https://github.com/heinemml/CO2Meter to see what it’s like down there. The basement tends to be fairly warm in winter, probably due to the boilers and pipes for the steam radiators, and it has occurred to me that it might be a nice place to set up a workstation that isn’t in my bedroom during these work-from-home times. The levels down there are shockingly low, around 500 ppm – there aren’t people breathing all over the place, but I guess it also means the boilers and hot water tanks are vented properly. (I would expect it to be colder if it were just due to the air exchange rate.) If the radon levels look good, it might be a good option for a workspace.
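
                                                                                          The logging side is nothing fancy. Here is a rough sketch of the shape of such a script, not the actual one: `read_ppm` is a hypothetical callable standing in for whatever the sensor library returns (I am deliberately not reproducing the CO2Meter API here), and the fake readings are made up so the example runs without hardware.

```python
import csv
import io
import time
from datetime import datetime, timezone


def log_readings(read_ppm, out, samples, interval_s=0.0):
    """Write `samples` timestamped CO2 readings to `out` as CSV rows."""
    writer = csv.writer(out)
    writer.writerow(["timestamp_utc", "co2_ppm"])
    for i in range(samples):
        # read_ppm is a hypothetical zero-argument callable returning ppm.
        writer.writerow([datetime.now(timezone.utc).isoformat(), read_ppm()])
        if i + 1 < samples:
            time.sleep(interval_s)


# Stand-in for the real sensor: three made-up readings.
fake_readings = iter([498, 503, 510])
buffer = io.StringIO()
log_readings(lambda: next(fake_readings), buffer, samples=3)
print(buffer.getvalue())
```

                                                                                          On the real machine `out` would be an open file in append mode and `interval_s` something like a minute.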

                                                                                1. 41

                                                                                  Something other than “everything is bytes”, for starters. The operating system should provide applications with a standard way of inputting and outputting structured data, be it via pipes, to files, …

                                                                                  Also, a standard mechanism for applications to send messages to each other, preferably using the above structured format when passing data around. Seriously, IPC is one of the worst parts of modern OSes today.
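
                                                                                  To make the two points above concrete, here is a small sketch of what applications do today to approximate this on top of byte pipes: agree on a convention (one JSON object per line here) and parse it on both ends. The message fields (`type`, `payload`) and the `exchange` helper are made up for illustration; the point is that the OS only sees bytes, and the structure lives entirely in the convention.

```python
import json
import subprocess
import sys

# Child process: reads one JSON message per line on stdin and replies with
# one JSON message per line on stdout. The message shape is an assumption.
CHILD = r"""
import json, sys
for line in sys.stdin:
    msg = json.loads(line)
    reply = {"type": "ack", "seen": msg["type"], "fields": sorted(msg["payload"])}
    sys.stdout.write(json.dumps(reply) + "\n")
"""


def exchange(messages):
    """Send structured messages to a child over a pipe; return its replies."""
    data = "".join(json.dumps(m) + "\n" for m in messages)
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=data,
        capture_output=True,
        text=True,
        check=True,
    )
    return [json.loads(line) for line in proc.stdout.splitlines()]


replies = exchange([{"type": "image/png", "payload": {"width": 64, "height": 64}}])
print(replies)
```

                                                                                  Every pair of programs has to reinvent a convention like this, which is roughly the complaint: a standard structured format at the OS level would remove that per-application glue.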

                                                                                  If we’re going utopic, then the operating system should only run managed code in an abstract VM via the scheduler, which can provide safety beyond what the hardware can. So basically it would be like if your entire operating system is Java and the kernel runs everything inside the JVM. (Just an example, I do not condone writing an operating system in Java).

                                                                                  I’m also liking what SerenityOS is doing with the LibCore/LibGfx/LibGui stuff. A “standard” set of stuff seems really cool because you know it will work as long as you’re on SerenityOS. While I’m all for freedom of choice having a default set of stuff is nice.

                                                                                  1. 21

                                                                                    The operating system should provide applications with a standard way of inputting and outputting structured data, be it via pipes, to files

                                                                                    I’d go so far as to say that processes should be able to share not only data structures, but closures.

                                                                                    1. 4

                                                                                      This has been tried a few times, it was super interesting. What comes to mind is Obliq, (to some extent) Modula-3, and things like Kali Scheme. Super fascinating work.

                                                                                      1. 3

                                                                                        Neat! Do you have a use-case in mind for interprocess closures?

                                                                                        1. 4

                                                                                          To me that sounds like the ultimate way to implement capabilities: a capability is just a procedure which can do certain things, which you can send to another process.

                                                                                          1. 5

                                                                                            This is one of the main things I had in mind too. In a language like Lua where closure environments are first-class, it’s a lot easier to build that kind of thing from scratch. I did this in a recent game I made where the in-game UI has access to a REPL that lets you reconfigure the controls/HUD and stuff but doesn’t let you rewrite core game data: https://git.sr.ht/~technomancy/tremendous-quest-iv

                                                                                        2. 1

                                                                                          I would be interested in seeing how the problem with CPU time stealing and DoS attacks that would arise from that could be solved.

                                                                                        3. 17

                                                                                          Digging into IPC a bit, I feel like Windows actually had some good stuff to say on the matter.

                                                                                          I think the design space looks something like:

                                                                                          • Messages vs streams (here is a cat picture vs here is a continuing generated sequence of cat pictures)
                                                                                          • Broadcast messages vs narrowcast messages (notify another app vs notify all apps)
                                                                                          • Known format vs unknown pile of bytes (the blob i’m giving you is an image/png versus lol i dunno here’s the size of the bytes and the blob, good luck!)
                                                                                          • Cancellable/TTL vs not (if this message is not handled by this time, don’t deliver it)
                                                                                          • Small messages versus big messages (here is a thumbnail of a cat versus the digitized CAT scan of a cat)

                                                                                          I’m sure there are other axes, but that’s maybe a starting point. Also, fuck POSIX signals. Not in my OS.
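
                                                                                          The axes listed above could be written down as fields on a message envelope. This is only a sketch in Python: `Envelope`, `Delivery`, `should_deliver` and the 64 KiB "large" threshold are all assumptions made up for illustration, not anything Windows or another OS actually defines.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Delivery(Enum):
    NARROWCAST = "narrowcast"  # one named recipient
    BROADCAST = "broadcast"    # every subscriber


@dataclass(frozen=True)
class Envelope:
    payload: bytes
    delivery: Delivery
    recipient: Optional[str] = None     # needed for narrowcast
    content_type: Optional[str] = None  # None means "unknown pile of bytes"
    deadline: Optional[float] = None    # absolute time; None means no TTL
    is_stream_chunk: bool = False       # True if part of an ongoing sequence

    def expired(self, now: float) -> bool:
        return self.deadline is not None and now >= self.deadline

    def is_large(self, threshold: int = 64 * 1024) -> bool:
        return len(self.payload) > threshold


def should_deliver(msg: Envelope, now: float) -> bool:
    """Drop messages whose deadline has passed and narrowcasts with no recipient."""
    if msg.expired(now):
        return False
    if msg.delivery is Delivery.NARROWCAST and msg.recipient is None:
        return False
    return True


thumbnail = Envelope(b"\x89PNG...", Delivery.NARROWCAST, recipient="viewer",
                     content_type="image/png", deadline=100.0)
print(should_deliver(thumbnail, now=50.0), should_deliver(thumbnail, now=100.0))
```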

                                                                                          1. 5

                                                                                            Is a video of cats playing a message or a stream? Does it matter whether it’s 2 MB or 2 GB (or whether the goal is to display one frame at a time vs to copy the file somewhere)?

                                                                                            1. 2

                                                                                              It would likely depend on the reason the data is being transferred. Video pretty much always fits into the ‘streaming’ category if it’s going to be decoded and played, as the encoding allows for parts of a file to be decoded independently of the other parts. Messages are for atomic chunks of data that only make sense when they’re complete. Transferring whole files over a message bus is probably a bad idea though; you’d likely want to instead pass a message that says “here’s a path to a file and some metadata, do what you want with it” and have the permissions model plug into the message bus so that applications can have temporary r/rw access to the file in question. Optionally, if you have a filesystem that supports COW and deduplication, you can efficiently and transparently copy the file for the other application’s use and it can do whatever it wants with it without affecting the “original”.
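
                                                                                              One existing mechanism that comes close to "a small message plus temporary access to the file" is passing an open file descriptor over a Unix domain socket. The sketch below swaps that technique in for the message bus and permissions model described above, so it is an analogy rather than the same design. It assumes a Unix system and Python 3.9 or later (for `socket.send_fds` and `socket.recv_fds`); the helper names and the metadata dictionary are made up for illustration.

```python
import json
import os
import socket
import tempfile


def send_file_handle(sock: socket.socket, path: str, metadata: dict) -> None:
    """Send metadata plus a read-only descriptor for `path` to the peer."""
    fd = os.open(path, os.O_RDONLY)
    try:
        socket.send_fds(sock, [json.dumps(metadata).encode()], [fd])
    finally:
        os.close(fd)  # the receiver gets its own duplicate of the descriptor


def receive_file_handle(sock: socket.socket) -> tuple:
    """Return (metadata, contents) read through the received descriptor."""
    data, fds, _flags, _addr = socket.recv_fds(sock, 4096, 1)
    with os.fdopen(fds[0], "rb") as f:
        return json.loads(data), f.read()


sender, receiver = socket.socketpair()  # a connected pair of Unix sockets
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"cat picture bytes")
send_file_handle(sender, tmp.name, {"type": "image/png"})
metadata, contents = receive_file_handle(receiver)
os.unlink(tmp.name)
sender.close()
receiver.close()
print(metadata, contents)
```

                                                                                              The receiver only ever holds a read-only handle: it needs no permission on the path itself, and the access ends when it closes the descriptor.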

                                                                                              1. 5

                                                                                                Which is why copy&paste is implemented the way it is!

                                                                                                Many people don’t realize it, but the clipboard is not actually just some storage buffer. As long as the copying program is still running when you try to paste something, the two programs can talk to each other and negotiate the format they want.

                                                                                                That is why people sometimes have odd bugs on Linux where the clipboard disappears when a program ends, or why PowerPoint sometimes asks you if you want to keep your large clipboard content when you try to exit.
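
                                                                                                The negotiation described above can be sketched roughly like this in Python. It is a toy model, not how any real clipboard API looks: `ClipboardOwner`, `paste` and the `alive` flag are hypothetical names, and a real system does this between processes rather than through a method call.

```python
from typing import Callable, Dict, List, Optional


class ClipboardOwner:
    """The copying program: advertises formats and renders one on demand."""

    def __init__(self, renderers: Dict[str, Callable[[], bytes]]) -> None:
        self._renderers = renderers
        self.alive = True  # flips to False when the program exits

    def offered_formats(self) -> List[str]:
        return list(self._renderers)

    def render(self, fmt: str) -> bytes:
        if not self.alive:
            raise RuntimeError("owner exited; clipboard contents are gone")
        return self._renderers[fmt]()


def paste(owner: ClipboardOwner, accepted: List[str]) -> Optional[bytes]:
    """Return data in the first accepted format that the owner also offers."""
    for fmt in accepted:  # `accepted` is in the pasting program's preference order
        if fmt in owner.offered_formats():
            return owner.render(fmt)
    return None


owner = ClipboardOwner({
    "text/html": lambda: b"<b>cat</b>",
    "text/plain": lambda: b"cat",
})
plain = paste(owner, ["image/png", "text/plain"])
print(plain)
```

                                                                                                Nothing is rendered until somebody pastes, which is why the data vanishes if the owner exits first unless it hands a copy to something longer-lived.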

                                                                                          2. 13

                                                                                            Something other than “everything is bytes”, for starters. The operating system should provide applications with a standard way of inputting and outputting structured data, be it via pipes, to files, …

                                                                                            It’s a shame I can agree only once.

                                                                                            Things like Records Management Services, ARexx, Messages and Ports on Amiga or OpenVMS’ Mailboxes (to say nothing of QIO), and the data structures of shared libraries on Amiga…

                                                                                            Also, the fact that things like Poplog (which is an operating environment for a few different languages but allows cross-language calls), OpenVMS’s common language environment, or even UCSD p-System aren’t more popular is sad to me.

                                                                                            Honestly, I’ve thought about this a few times, and I’d love something that is:

                                                                                            • an information utility like Multics
                                                                                            • secure like seL4 and Multics
                                                                                            • specified like seL4
                                                                                            • distributed like Plan 9/Clive
                                                                                            • with rich libraries, ports, and plumbing rules
                                                                                            • and separated like Qubes
                                                                                            • with a virtual machine that is easy to inspect like LispM’s OSes, but easy to lock down like Bitfrost on One Laptop per Child…

                                                                                            a man can dream.

                                                                                            1. 7

                                                                                              Something other than “everything is bytes”, for starters. The operating system should provide applications with a standard way of inputting and outputting structured data

                                                                                              have you tried powershell

                                                                                              1. 4

                                                                                                or https://www.nushell.sh/ for that matter

                                                                                              2. 4

                                                                                                In many ways you can’t even remove the *shells from current OSes, their IPC is so b0rked.

                                                                                                How can a shell communicate with a program it’s trying to invoke? An array of strings for options and a global key-value dictionary of strings for environment variables.

                                                                                                It should be able to introspect to find out the schema for the options (what options are available, what types they are…).

                                                                                                Environment variables are a reliability nightmare. Essentially hidden globals everywhere.

                                                                                                Pipes? The data is structured, but what is the schema? I can pipe this to that, but does it fit? Does it make sense…? Can I b0rk your ad hoc input parser? Sure I can: you scratched it together in half a day assuming only friendly inputs.

                                                                                                In many ways IPC is step zero to figure out, with all the ad hoc option parsers and ad hoc stdin/stdout parsers and formatters replaced by ones that are secure, robust and part of the OS.
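
                                                                                                As a rough sketch of what an introspectable option schema might look like, here is a Python version in which a program publishes typed options and the caller checks an invocation before running anything. `Option`, `RESIZE_SCHEMA` and `validate` are hypothetical names invented for this example; no existing shell or OS is being described.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Option:
    name: str
    type: type
    required: bool = False


# Hypothetical schema a program would publish for the shell to introspect.
RESIZE_SCHEMA: List[Option] = [
    Option("width", int, required=True),
    Option("height", int, required=True),
    Option("keep_aspect", bool),
]


def validate(schema: List[Option], args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the call is well-formed."""
    known = {opt.name: opt for opt in schema}
    problems = [f"unknown option: {key}" for key in args if key not in known]
    for opt in schema:
        if opt.name not in args:
            if opt.required:
                problems.append(f"missing option: {opt.name}")
        elif type(args[opt.name]) is not opt.type:  # exact match, so True is not an int
            problems.append(f"{opt.name}: expected {opt.type.__name__}")
    return problems


print(validate(RESIZE_SCHEMA, {"width": "640", "hieght": 480}))
```

                                                                                                With a schema like this the shell can reject a misspelled or mistyped option up front, instead of each program finding out in its own hand-written parser.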

                                                                                                1. 3

                                                                                                  I agree wholeheartedly with the first part of your comment. But then there is this:

                                                                                                  If we’re going utopic, then the operating system should only run managed code in an abstract VM via the scheduler, which can provide safety beyond what the hardware can.

                                                                                                  What sort of safety can a managed language provide from the point of view of an operating system compared to the usual abstraction of processes (virtual memory and preemptive scheduling) combined with thoughtful design of how you give programs access to resources? When something goes wrong in Java, the program may either get into a state that violates preconditions assumed by the authors or an exception will terminate some superset of the erroneous computation. When something goes wrong in a process in a system with virtual memory, again the program may reach a state violating preconditions assumed by the authors, or it may trigger a hardware exception, handled by the OS, which may terminate the program or inform it about the fault. Generally, it all gets contained within the process. The key difference is, with a managed language you seem to be sacrificing performance for an illusory feeling of safety.
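
                                                                                                  The containment described above is easy to observe from user space. In this small Python sketch a child process deliberately crashes with `os.abort()` while the parent only sees an exit status; the choice of `os.abort()` as the fault is just an assumption for the demonstration.

```python
import subprocess
import sys

# The child aborts itself; the OS tears down that process and nothing else.
child = subprocess.run(
    [sys.executable, "-c", "import os; os.abort()"],
    stderr=subprocess.DEVNULL,
)

# A non-zero return code is all the parent ever learns about the crash.
crashed = child.returncode != 0
print("child crashed:", crashed, "| parent still running")
```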

                                                                                                  There are of course other ways programs may violate safety, but that has more to do with how you give them access to resources such as special hardware components, filesystem, operating system services, etc. Nothing that can be fixed by going away from native code.

                                                                                                  No-brakes programming languages like C may be a pain for the author of the program, and there is a good reason to switch away from them to something safer in order to write more reliable software. But a language runtime can’t protect an operating system any better than the abstractions that make up a process, which are a lot more efficient. There are of course things like Spectre and Meltdown, but those are hardware bugs. Those bugs should be fixed, not papered over by another layer, lurking at the bottom.

                                                                                                  Software and hardware need to be considered together, as they together form a system. Ironically, I may conclude this comment with an Alan Kay quote:

                                                                                                  People who are really serious about software should make their own hardware.