1. 3

    I don’t see any mention of delivery semantics in the linked repo. @houqp, perhaps you can expand on this? Right now, the linked repo seems like a Kafka connector, but there’s not much in there from what I can see.

    1. 2

      Yes, it’s a native Kafka to Delta Lake connector. In short, exactly-once message delivery is accomplished by batching the messages and the Kafka offset into a single Delta Table commit so they are written to the table atomically. If a message has been written to a Delta Table, trying to write the same message again will result in a transaction conflict, because kafka-delta-ingest only allows the offset to move forward.
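
      A rough sketch of that idea, in Go for illustration only (the actual project is written in Rust, and the atomicity comes from the Delta transaction log, not from code like this):

      package main

      import (
          "errors"
          "fmt"
      )

      // Hypothetical types, just to show the shape of the idea: the Kafka offsets
      // are part of the same commit as the data files.
      type Commit struct {
          DataFiles []string         // data written for this batch
          Offsets   map[string]int64 // Kafka partition -> highest offset in this batch
      }

      type DeltaTable struct {
          Offsets map[string]int64 // partition -> highest offset already committed
      }

      // tryCommit rejects any commit whose offsets do not move strictly forward,
      // so replaying an already-written batch becomes a transaction conflict
      // instead of a duplicate write.
      func (t *DeltaTable) tryCommit(c Commit) error {
          for p, off := range c.Offsets {
              if prev, ok := t.Offsets[p]; ok && off <= prev {
                  return errors.New("transaction conflict: offset would not advance for " + p)
              }
          }
          for p, off := range c.Offsets { // in the real system this happens in one atomic Delta commit
              t.Offsets[p] = off
          }
          return nil
      }

      func main() {
          table := &DeltaTable{Offsets: map[string]int64{}}
          batch := Commit{DataFiles: []string{"part-0001.parquet"}, Offsets: map[string]int64{"topic-0": 42}}
          fmt.Println(table.tryCommit(batch)) // <nil>
          fmt.Println(table.tryCommit(batch)) // conflict: same batch replayed
      }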

    1. 5

      In this case, passing a pointer into a function is still passing by value in the strictest sense, but it’s actually the pointer’s value itself that is being copied, not the thing that the pointer refers to

      Is this not how every language works when handling pointers?

      1. 6

        I think so, but I believe the main point of the article is how there are certain types, like slices, maps, and channels, that feel as if you’re passing them by value, even though they behave like references.

        This sometimes trips people up (like me), for example: https://eli.thegreenplace.net/2018/beware-of-copying-mutexes-in-go/
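
        A tiny example of the kind of thing that bites (my own illustration): the map and the slice element are mutated through the copied headers, but growing the slice is not visible to the caller.

        package main

        import "fmt"

        // Both parameters are copies, but what gets copied is a small header
        // (a map reference; a slice's pointer/len/cap), not the underlying data.
        func mutate(m map[string]int, s []int) {
            m["hello"] = 42  // visible to the caller
            s[0] = 42        // also visible: same backing array
            s = append(s, 1) // not visible: only the local copy of the header grows
        }

        func main() {
            m := map[string]int{}
            s := []int{0}
            mutate(m, s)
            fmt.Println(m, s) // map[hello:42] [42]
        }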

        1. 7

          I learned recently that go vet will give that warning on copying any struct which implements Lock() and Unlock() methods.

          E.g.,

          package main

          type t struct{}

          func (*t) Lock()   {}
          func (*t) Unlock() {}

          func main() { a := t{}; _ = a }
          

          will trigger the vet warning.

        2. 2

          C++ references are distinct! For example in Python (and I imagine in Go as well) you can’t pass a reference to an integer. You can’t do

          x = 3
          f(x)
          # x now equals 4
          

          (in CPython you can by doing some stack trace trickery)

          This is kind of linked back to a fundamental C++-ism of built-ins being “the same as” other data types. Whereas Python/Java/Go/lots of other stuff have this distinction between builtins and aggregate types.

          Rust, being the true successor to C++ in so many ways, carries over references nicely tho…

          fn f(x: &mut i32) {
              *x += 1;
          }

          fn main() {
              let mut x: i32 = 4;
              println!("x={}", x);
              f(&mut x);
              println!("x={}", x);
              println!("Done");
          }
          

          And beyond “changing the contents of an integer”, ultimately being able to change the variable itself (even replacing it with an entirely different object) is only really an option in systems languages.

          1. 1

            The only exceptions I can think of are:

            • perl - lists are copied
            • tcl - lists are strings
            • C - structs are actually copied without explicit pointers
            • languages with explicit value types like C#
          1. 5

            It’s important to note that this protocol is specialized for data center use, i.e. high-reliability, low-latency links, and specifically for RPC scenarios with frequent, short message exchanges. An overview of the differences between Homa and TCP:

            1. No explicit acknowledgements. Instead, GRANT packets are occasionally sent to acknowledge packets.
            2. SRPT (Shortest Remaining Processing Time) based prioritization where higher-priority queues are kept specifically for packets that need quick responses.
            3. Connectionless.
            4. At most once semantics, so the receiver does not need to make RPC methods idempotent.
            1. 2

              At most once semantics, so the receiver does not need to make RPC methods idempotent.

              It’s the other way around. Homa is at-least-once:

              Homa allows RPCs to be executed more than once: in the normal case, an RPC is executed one or more times; after an error, it could have been executed any number of times (including zero). […] Duplicates must be filtered at a level above the transport layer.

              […]

              Homa assumes that higher level software will either tolerate redundant executions of RPCs or filter them out.

              1. 1

                Huh the article says:

                At-most-once delivery semantics: Other RPC protocols are designed to ensure at-most once-delivery of a complete message, but Homa targets at-least-once semantics. This means that Homa can possibly re-execute RPC requests if there are failures in the network (and an RPC ends up being retried). While at-least-once semantics put a greater burden on the receiving system (which might have to make RPCs idempotent), relaxing the messaging semantics allows Homa receivers to adapt to failures that happen in a data center environment. As an example, Homa receivers can discard state if an RPC becomes inactive, which might happen if a client exceeds a deadline and retries.

                I haven’t read through the paper yet, but if so, then this summary is incorrect.

                1. 2

                  I think the bit about “at-most-once delivery semantics” in the article should be shortened to “delivery semantics”; a quick read of the section titles might lead one to think that Homa has at-most-once semantics. If you read the actual body, though:

                  Homa targets at-least-once semantics. This means that Homa can possibly re-execute RPC requests […] While at-least-once semantics put a greater burden on the receiving system (which might have to make RPCs idempotent) relaxing the messaging semantics allows Homa receivers to adapt to failures

            1. 6

              This is the paper linked from https://lobste.rs/s/g0mflv/ribbon_filter_practically_smaller_than, so I feel like they should be merged.

              Edit: linked post is now deleted, but there’s an overview of the paper here: https://engineering.fb.com/2021/07/09/data-infrastructure/ribbon-filter/

              1. 2

                I deleted the other story. No need for 2.

              1. 5

                The current title (UK right to repair law excludes smartphones, fridges, etc) is wrong and I think editorialized, which would be against the submission guidelines. Unless I missed something, fridges don’t seem to be excluded.

                1. 5

                  Indeed, the title of the submission has now been changed to reflect that. This is what is explicitly excluded, according to the article:

                  Cookers, hobs, tumble dryers, microwaves or tech such as laptops or smartphones aren’t covered.

                  1. 2

                    According to the linked article, refrigerators are explicitly included.

                    Edit: all users with a certain karma(?) can suggest title edits. If enough of them do, it is updated automatically.

                  1. 8

                    Although the original post was tongue-in-cheek, cap-std would disallow things like totally-safe-transmute (discussed at the time), since the caller would need a root capability to access /proc/self/mem (no more sneaking filesystem calls inside libraries!)

                    Having the entire standard library work with capabilities would be a great thing. Pony (and Monte too, I think) uses capabilities extensively in the standard library, which allows users to trust third party packages: if the package doesn’t use FFI (the compiler can check this) nor requires the appropriate capabilities, it won’t be able to do much: no printing to the screen, using the filesystem, or connecting to the network.
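
                    Go’s io/fs package gives a rough flavour of the same style, though only as a convention (this is my own illustration, not how Pony, Monte, or cap-std work): a function that only receives an fs.FS value has nothing else to read from, but unlike a capability-safe language, nothing stops its body from importing os and reaching for ambient authority anyway.

                    package main

                    import (
                        "fmt"
                        "io/fs"
                        "os"
                        "testing/fstest"
                    )

                    // countLines can only read through the handle it was given.
                    func countLines(dir fs.FS, name string) (int, error) {
                        data, err := fs.ReadFile(dir, name)
                        if err != nil {
                            return 0, err
                        }
                        n := 0
                        for _, b := range data {
                            if b == '\n' {
                                n++
                            }
                        }
                        return n, nil
                    }

                    func main() {
                        // The caller decides which "capability" to hand over: a real directory...
                        fmt.Println(countLines(os.DirFS("/etc"), "hosts"))
                        // ...or an in-memory one for tests.
                        fmt.Println(countLines(fstest.MapFS{"hello.txt": {Data: []byte("one\ntwo\n")}}, "hello.txt"))
                    }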

                    1. 3

                      Yes. While Rust cannot be capability-safe (as explored in a sibling thread), this sort of change to a library is very welcome, because it prevents many common sorts of bugs from even being possible for programmers to write. This is the process of taming, and a tamed standard library is a great idea for languages which cannot guarantee capability-safety. The Monte conversation about /proc/self/mem still exists, but is only a partial compromise of security, since filesystem access is privileged by default.

                      Pony and Monte are capability-safe; they treat every object reference as a capability. Pony uses compile-time guarantees to make modules safe, while Monte uses runtime auditors to prove that modules are correct. The main effect of this, compared to Rust, is to remove the need for a tamed standard library. Instead, Pony and Monte tame the underlying operating system API directly. This is a more monolithic approach, but it removes the possibility of unsafe artifacts in standard-library code.

                      1. 3

                        Yeah, I reckon capabilities would have helped with the security issues surrounding procedural macros too. I hope more new languages take heed of this, it’s a nice approach!

                        1. 4

                          It can’t help with proc macros, unless you run the macros in a (Rust-agnostic) process-wide sandbox like WASI. Rust is not a sandbox/VM language, and has no way to enforce it itself.

                          In Rust, the programmer is always on the trusted side. Rust safety features are for protecting programs from malicious external inputs and/or programmer mistakes when the programmer is cooperating. They’re ill-suited for protecting a program from intentionally malicious parts of that same program.

                          1. 2

                            We might trust the compiler while compiling proc macros, though, yes? And the compiler could prevent calling functions that use ambient authority (along with unsafe rust). That would provide capability security, no?

                            1. 5

                              No, we can’t trust the compiler. It hasn’t been designed to be a security barrier. It also sits on top of LLVM and C linkers that also historically assumed that the programmer is trusted and in full control.

                              Rust will allow the programmer to break and bypass the language’s rules. There are obvious officially-sanctioned holes, like #[no_mangle] (this works in Rust too) and linker options. There are less obvious holes like hash collisions of TypeId, and a few known soundness bugs. Since security within the compiler was never a concern (these are bugs on the same side of the airtight hatchway), there are likely many, many more.

                              It’s like a difference between a “Do Not Enter” sign and a vault. Both keep people out, but one is for stopping cooperating people, and the other is against determined attackers. It’s not easy to upgrade a “Do Not Enter” sign to be a vault.

                              1. 3

                                You can disagree with the premise of trusting the compiler, but I think the argument is still valid. If the compiler can be trusted, then we could have capability security for proc macros.

                                Whether to trust the compiler is a risk that some might accept, others would not.

                                1. 3

                                  But this makes the situation entirely hypothetical. If Rust was a different language, with different features, and a different compiler implementation, then you could indeed trust that not-Rust compiler.

                                  The Rust language as it exists today has many features that intentionally bypass compiler’s protections if the programmer wishes so.

                                  1. 1

                                    Between “do not enter” signs and vaults, a lot of business gets done with doors, even with a known risk that the locks can be picked.

                                    You seem to argue that there is no such thing as safe rust or that there are no norms for denying unsafe rust.

                                    1. 3

                                      Rust’s safety is already often misunderstood. fs::remove_dir_all("/") is safe by Rust’s definition. I really don’t want to give people an idea that you could ban a couple of features and make Rust have safety properties of JavaScript in a browser. Rust has an entirely different threat model. The “safe” subset of Rust is not a complete language, and it’s closer to being a linter for undefined behavior than a security barrier.

                                      Security promises in computing are often binary. What does it help if a proc macro can’t access the filesystem through std::fs, but can by making a syscall directly? It’s a few lines of code extra for the attacker, and a false sense of security for users.

                                      1. 1

                                        Ok, let’s talk binary security properties. Object Capability security consists of:

                                        1. Memory safety
                                        2. Encapsulation
                                        3. No powerful globals

                                        There are plenty of formal proofs of the security properties that follow… patterns for achieving cooperation without vulnerability. See peer-reviewed articles in https://github.com/dckc/awesome-ocap

                                        This cap-std work aims to address #3. For example, with compiler support to deny ambient authority, it addresses std::fs.

                                        Safe rust, especially run on wasm, is memory safe much like JS, yes? i.e. safe modulo bugs. Making a syscall requires using asm, which is not in safe rust.

                                        Rust’s encapsulation is at the module level rather than object level, but it’s there.

                                        While this cap-std and tools to deny ambient authority are not as mature as std, I do want to give people an idea that this is a good approach to building scalable secure systems.

                                        I grant that the relevant threat model isn’t emphasized around rust the way it is around JS, but I don’t see why rust would have to be a different language to shift this emphasis.

                                        I see plenty of work on formalizing safe rust. Safety problems seem to be considered serious bugs, not intentional design decisions.

                                        1. 1

                                          In the presence of malicious code, Rust on WASM is exactly as safe as C on WASM. All of the safety is thanks to the WASM VM, not due to anything that Rust does.

                                          Safe Rust formalizations assume the programmer won’t try to exploit bugs in the compiler, and the Rust compiler has exploitable bugs. For example, symbol mangling uses a hash that has a 1 in 2⁶⁴ chance of colliding (or less, due to the birthday attack). I haven’t heard of anyone running into this by accident, but a determined attacker could easily compute a collision that makes their cap-approved innocent_foo() actually link to the code of evil_bar() and bypass whatever formally-proven safety the compiler tried to have.

                      1. 2

                        This is a very nice talk. Even as someone who’s not terribly familiar with C++, I could appreciate the comparisons and footguns that Rust will prevent.

                        1. 5

                          I have Elaho installed on my phone and it’s quite nice, thanks for all the work you’ve done, @pitr!

                          1. 4

                            Hi all, this is @nehbit here, I’m the maintainer. Happy to answer if you have any questions!

                            1. 2

                              Aether is a flood protocol. Its network topology is effectively only concerned with delivering all data to everywhere, regardless of who’s following whom.

                              is there anything stopping a person from encoding copyrighted material (or something much worse) as plaintext and then forcing every other node to receive it? I guess Usenet faces a similar problem, but they just don’t care/they have killfiles, right? If I’m hosting a node, can I opt out of hosting stuff I don’t like? Sorry if this is answered in the FAQ, but I couldn’t find it.

                              1. 3

                                We do have some equivalent of killfiles that can be shared between users and the app supplies a default one as well.

                                Hosting stuff you don’t like: this is actively being worked on. Aether being a flood network makes having ‘incomplete’ nodes (i.e. everything sans the stuff you don’t want) a little challenging because the core assumption is that every node is equivalent to each other, but it’s not an impossible problem to solve.

                                1. 1

                                  I’ve done some thinking about similar systems. My current working hypothesis is that you can keep the digest/signature metadata of a post, but not the data itself. That way your Merkle tree or graph or whatever is intact, but stuff you don’t want is just skeletal.
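
                                  Something like this, say (hypothetical types, just to sketch the idea): the digest is always kept so the containing structure still verifies, while the body is optional.

                                  package main

                                  import (
                                      "crypto/sha256"
                                      "fmt"
                                  )

                                  type Post struct {
                                      Digest [32]byte // always retained: enough to keep the tree/graph intact
                                      Body   []byte   // nil when this node declines to host the content
                                  }

                                  // Verifiable reports whether the post is either skeletal or matches its digest.
                                  func (p Post) Verifiable() bool {
                                      return p.Body == nil || sha256.Sum256(p.Body) == p.Digest
                                  }

                                  func main() {
                                      body := []byte("content this node would rather not store")
                                      full := Post{Digest: sha256.Sum256(body), Body: body}
                                      skeletal := Post{Digest: full.Digest} // body dropped, metadata kept
                                      fmt.Println(full.Verifiable(), skeletal.Verifiable()) // true true
                                  }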

                                  1. 3

                                    You can do this; in fact, we do this in Aether. It’s called the manifest, and it’s the first few pages of any node’s payload. However, having the manifest only makes you aware of what you are missing (i.e. saying ‘this node does not have these data’), but it does not tell you anything about where that data actually is.

                                    The usual solution to this is to stick a DHT in front of it, but DHTs fail at 30%+ attrition rates. That means, for it to work, out of 10 people that download and start the app, more than 7 will need to stay for the long term, so that the DHT can reliably stick indexing data into them. That is a big ask, and it is not enough resilience for a P2P network that purports to ‘just work’ and not ask its users to make sacrifices because it’s P2P.

                              2. 2

                                A while ago I checked out the docs of the Mim protocol, which Aether uses. Are there any other applications that you know of that are using Mim?

                                One thing I noticed is that, like SSB did with Node’s JSON.stringify, Mim relies on the output of Go’s json.Marshal to compute and verify fingerprints and Proof of Work.

                                I’d recommend that you introduce a canonicalization step in the protocol before it causes any headaches in the future. All attempts at building SSB clients had a really hard time w.r.t. this (including my own; I had to fork a JSON library and modify it to become compatible with Node’s). SSB is in the process of transitioning to a new feed format that uses CBOR instead of JSON, which supports a canonical ordering.
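
                                To make the problem concrete, here is a small Go illustration (not Mim’s actual code) of why hashing whatever a particular json.Marshal happens to emit is fragile without a defined canonical form:

                                package main

                                import (
                                    "crypto/sha256"
                                    "encoding/json"
                                    "fmt"
                                )

                                func main() {
                                    // Two byte-wise different encodings of the same logical object.
                                    a := []byte(`{"b":1,"a":2}`)
                                    b := []byte(`{ "a": 2, "b": 1 }`)

                                    // Fingerprinting the raw bytes gives two different hashes.
                                    fmt.Printf("%x\n%x\n", sha256.Sum256(a), sha256.Sum256(b))

                                    // Round-tripping through Go maps happens to agree here because
                                    // encoding/json sorts map keys, but whitespace, number formatting,
                                    // field order and escaping are not pinned down across languages,
                                    // which is why a spec-level canonical form (or canonical CBOR) helps.
                                    var ma, mb map[string]any
                                    json.Unmarshal(a, &ma)
                                    json.Unmarshal(b, &mb)
                                    ca, _ := json.Marshal(ma)
                                    cb, _ := json.Marshal(mb)
                                    fmt.Printf("%x\n%x\n", sha256.Sum256(ca), sha256.Sum256(cb)) // these two match
                                }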

                                1. 1

                                  A couple of suggestions:

                                  1). Add source code to the download list, as an archive and as a link to some git repo. Right now it’s rather unintuitive to find where the source code lives.

                                  2). Add a users guide that is easily visible and show people how to elect, filter, vote, block, delete, etc. from the js client since, again, it’s rather unintuitive how to do it.

                                  3). If possible post a repo of the website so people can send you UX improvements without needing to email and asking for the source code.

                                1. 6

                                  The usual problem encountered when cross-compiling from a non-macOS system to macOS is you need the macOS headers and it’s against the licence agreement to redistribute them or even use them on non-Apple hardware:

                                  You may not alter the Apple Software or Services in any way in such copy, e.g., You are expressly prohibited from separately using the Apple SDKs or attempting to run any part of the Apple Software on non-Apple-branded hardware.

                                  How does Zig handle this?

                                  Edit: having said that, this repo has existed for a long time and hasn’t been taken down yet…

                                  1. 17

                                    it’s not against the license agreement. the header files are under the APSL https://spdx.org/licenses/APSL-1.1.html

                                    1. 3

                                      Even if it was, it’s probably not enforceable. Didn’t we have a ruling a while back stating that interfaces were not eligible for copyright?

                                      1. 2

                                        That was Oracle v Google, right?

                                        1. 2

                                          That’s the one. If I recall correctly, Google originally lost, then appealed, and the ruling was basically reversed to “interfaces are not subject to copyright”.

                                          Now that was American law. I have no idea about the rest of the world. I do believe many legislations have explicit exceptions for interoperability, though.

                                          1. 5

                                            That’s the one. If I recall correctly, Google originally lost, then appealed, and the ruling was basically reversed to “interfaces are not subject to copyright”.

                                              The Supreme Court judgement said ‘assume interfaces are copyrightable, in this case Oracle still loses’; it did not make a ruling on whether interfaces are copyrightable.

                                            1. 3

                                              and the ruling was basically reversed to “interfaces are not subject to copyright”

                                              Not exactly, the ruling didn’t want to touch the “interfaces are not subject to copyright” matter since that would open a big can of worms. What it did say, however, was that Google’s specific usage of those interfaces fell into the fair use category.

                                              1. 1

                                                Ah, so in the case of Zig, it would also be fair use, but since fair use is judged on a case by case basis, there’s still some uncertainty. Not ideal, though it looks like it should work.

                                                1. 1

                                                  There’s no useful precedent. Google’s fair use was from an independent implementation of an interface for compatibility. Zig is copying header files directly and so must comply with the licenses for them. The exact licenses that apply depend on whether you got the headers from the open source code dump or by agreeing to the XCode EULA. A lot of the system headers for macOS / iOS are only available if you agree to the XCode EULA, which prohibits compilation on anything other than an Apple-branded system.

                                                  1. 1

                                                    I recall that Google did copy interface files (or code) directly, same as Zig?

                                                    1. 2

                                                        Java doesn’t have any analogue of .h files; they wrote new .java files that implemented the same methods. There is a difference between creating a new .h file that contains equivalent definitions and copying a .h file that someone else wrote. If interfaces are not copyrightable, then the specific serialisation in a text file may still be, because it may contain comments and other things that are not part of the interface.

                                        2. 1

                                            Interesting. Ok so does Zig just include the headers from the most recent SDK then?

                                          1. 10

                                            The way zig collects macos headers is still experimental. We probably need to migrate to using an SDK at some point. For now it is this project.

                                            1. 1

                                                I’d be super nervous about using this in production. This is using code under the Apple Public Source License, which explicitly prohibits using it to circumvent EULAs of Apple products. The XCode EULA under which the SDKs are distributed explicitly prohibits cross-compiling from a non-Apple machine. I have no idea what a judge would decide, but I am 100% sure that Apple can afford to hire better lawyers than I can.

                                              1. 3

                                                Zig has nothing to do with xcode. Zig does not depend on xcode or use xcode in any way. The macos headers have to do with interfacing with the Darwin kernel.

                                        3. 1

                                          Edit: having said that, this repo has existed for a long time and hasn’t been taken down yet…

                                          Apple generally doesn’t bother with small-scale infringement. They care about preventing cross compilation only insofar as it might hurt Mac sales.

                                        1. 3

                                          Spot instances are unstable by design, and can go down any time. However, in practice I have seen very few terminations. My longest uptime has been above 300 days (in region eu-west-1)

                                          In my experience, at least on us-east-1, I’ve seen multiple spot instances get terminated every day, with “no capacity” errors when trying to submit new spot requests, so YMMV.

                                          1. 1

                                                True, I’ve seen more terminations in other regions/for other instance types. However, even with a daily termination, this would mean ~5 minutes of downtime (as a new instance boots up), which for many personal projects should be acceptable. The key is to specify as many instance types as possible.

                                            1. 1

                                                  Can confirm this as well (at least for us-east-1). We used to use spot instances for builds, and even though the builds were roughly 10-15 minutes, there were cases of intermittent spot instance termination.

                                              Easier to just eat the cost of on-demand in my experience. Maybe it is region specific.

                                            1. 4

                                                Not as impressive as the linked time (2 minutes! incredible), but switching from an Intel to an ARM-based MacBook was like night and day in terms of building LLVM. What used to take 37 minutes and plenty of fan noise on my i7 now takes 13 minutes with no noise at all.

                                              1. 2

                                                It looks (though the post is a little tentative about saying it) like this bug is present in 10.15 Catalina, but that there isn’t a security patch for Catalina. At the moment, I don’t know whether that is just that there isn’t a patch yet, or whether Apple’s tacit but observable “current minus one” policy for security fixes has been re-unwritten.

                                                I don’t particularly want to update my work/PhD-writing laptop to Big Whoop any time soon if I don’t have to, though.

                                                1. 2

                                                  There should be a patch now: https://support.apple.com/en-us/HT212326

                                                  Impact: A malicious application may bypass Gatekeeper checks

                                                  Description: A logic issue was addressed with improved state management.

                                                  CVE-2021-1810: an anonymous researcher

                                                  1. 1

                                                    thanks for that, it doesn’t show up for me yet but hopefully does soon!

                                                1. 3

                                                  I generally like the runtime-level decisions Go makes, e.g. instead of object headers there are “fat pointers” (interface values) instantiated specifically when you want to do something dynamic, the use of interfaces rather than hierarchies. Despite the throughput issues vs. other GC’s, I like that Go’s allows internal pointers, avoids stopping the world for long, and doesn’t require a read barrier.

                                                  One cool thing Erlang/BEAM did that Go didn’t is separate shared and per-process local/private heaps at the runtime and language level.

                                                  In a language like Go that would probably involve a type qualifier like shared, and when something needed to be shared that wasn’t yet (or something shared needed to become private to a thread) you could copy it, with the compiler sometimes able to move where an allocation is done as an optimization. (There’s a loose analogy to creating pointers vs. values and stack/heap.)

                                                  It would help with a couple things. Most important, it’s a path towards safer concurrency. There are various approaches: Rust-like rules that accesses to the shared heap must be explicitly guarded, something more implicit (with some risk of locking not working how you meant), or at least patch up the ways race conditions cause type and memory unsafety today (e.g. write type/pointer and length/pointer pairs to the shared heap with atomics like x86’s cmpxchg128) like the JVM does. If you don’t build static concurrency safety into the language, a shared qualifier could at least possibly help static analysis and let dynamic checkers slow down fewer accesses. (If this sort of stuff sounds interesting you might like “smaller Rust” blog posts (1, 2) though it’s definitely its own idea only loosely related.)

                                                  It would also open up some options for GC. With a rule that shared data can’t point to any thread’s private data, you could revive Go’s ‘request-oriented collector’ idea (quick collection of data private to one thread when it quits), do per-thread collections that don’t have to worry about concurrent accesses, or even do moving or generational collection for local data. All that works because stopping one thread isn’t stopping the world; threads pause all the time. You could keep Go’s existing non-generational approach on the shared heap, but its performance could benefit when local allocations no longer factor into global GC rate. (Or you could go for a design like Java’s ZGC with a read barrier, but man, seems even harder than what Go does!)

                                                    I realize these things get vastly more complicated once you get into details. (How do you not get your lunch eaten by private<->shared copies and synchronization accessing the shared heap? What on earth is that “something more implicit” to semi-safely access shared stuff? etc.) And Go is Go and BEAM is BEAM and you can’t just order up a mix of the two. But the bulk of the Go runtime model with an explicit shared/private distinction tacked on seems like a neat spot in the design space that I haven’t seen explored. If there are existing examples I don’t know of, it would be neat to hear about ’em!

                                                  1. 3

                                                    I haven’t read about it in a while, but what you’re describing sounds similar to the way references work in Pony. They have lots of great papers about how their GC works, but here’s a start to the different kinds of references: http://jtfmumm.com/blog/2016/03/06/safely-sharing-data-pony-reference-capabilities/

                                                    1. 1

                                                      Thank you!

                                                    2. 3

                                                      (Pony contributor here) Like @Pentlander says, Pony checks several of these points:

                                                      • It enforces write-uniqueness (among other things) across actors using reference capabilities, which allows you to “move” data across actors without any copying–most things are pass-by-reference.

                                                      • Each actor has its own heap, so GC can happen independently.

                                                        There’s a talk comparing Pony to Erlang here: https://www.youtube.com/watch?v=_0m0_qtfzLs if you want to take a look. If you have any questions, you can post them here, or take a look at the community page, join, and ask questions there!

                                                      1. 1

                                                        Thanks!

                                                    1. 50

                                                      The paper has this to say (page 9):

                                                      Regarding potential human research concerns. This experiment studies issues with the patching process instead of individual behaviors, and we do not collect any personal information. We send the emails to the Linux community and seek their feedback. The experiment is not to blame any maintainers but to reveal issues in the process. The IRB of University of Minnesota reviewed the procedures of the experiment and determined that this is not human research. We obtained a formal IRB-exempt letter.

                                                      [..]

                                                      Honoring maintainer efforts. The OSS communities are understaffed, and maintainers are mainly volunteers. We respect OSS volunteers and honor their efforts. Unfortunately, this experiment will take certain time of maintainers in reviewing the patches. To minimize the efforts, (1) we make the minor patches as simple as possible (all of the three patches are less than 5 lines of code changes); (2) we find three real minor issues (i.e., missing an error message, a memory leak, and a refcount bug), and our patches will ultimately contribute to fixing them.

                                                          I’m not familiar with the generally accepted standards on these kinds of things, but this sounds rather iffy to me. I’m very far removed from academia, but I’ve participated in a few studies over the years, which were always just questionnaires or interviews, and even for those I had to sign a consent waiver. “It’s not human research because we don’t collect personal information” seems a bit strange.

                                                      Especially since the wording “we will have to report this, AGAIN, to your university” implies that this isn’t the first time this has happened, and that the kernel folks have explicitly objected to being subject to this research before this patch.

                                                      And trying to pass off these patches as being done in good faith with words like “slander” is an even worse look.

                                                      1. 78

                                                        They are experimenting on humans, involving these people in their research without notice or consent. As someone who is familiar with the generally accepted standards on these kinds of things, it’s pretty clear-cut abuse.

                                                        1. 18

                                                          I would agree. Consent is absolutely essential but just one of many ethical concerns when doing research. I’ve seen simple usability studies be rejected due to lesser issues.

                                                              It’s pretty clear this is abuse; the kernel team and maintainers feel strongly enough to ban the whole institution.

                                                          1. 10

                                                            Yeah, agreed. My guess is they misrepresented the research to the IRB.

                                                            1. 3

                                                              They are experimenting on humans

                                                              This project claims to be targeted at the open-source review process, and seems to be as close to human experimentation as pentesting (which, when you do social engineering, also involves interacting with humans, often without their notice or consent) - which I’ve never heard anyone claim is “human experimentation”.

                                                              1. 19

                                                                A normal penetration testing gig is not academic research though. You need to separate between the two, and also hold one of them to a higher standard.

                                                                1. 0

                                                                  A normal penetration testing gig is not academic research though. You need to separate between the two, and also hold one of them to a higher standard.

                                                                  This statement is so vague as to be almost meaningless. In what relevant ways is a professional penetration testing contract (or, more relevantly, the associated process) different from this particular research project? Which of the two should be held to a higher standard? Why? What does “held to a higher standard” even mean?

                                                                  Moreover, that claim doesn’t actually have anything to do with the comment I was replying to, which was claiming that this project was “experimenting on humans”. It doesn’t matter whether or not something is “research” or “industry” for the purposes of whether or not it’s “human experimentation” - either it is, or it isn’t.

                                                                  1. 18

                                                                        Resident pentester and ex-academia sysadmin checking in. I totally agree with @Foxboron, and their statement is neither vague nor meaningless. Generally in a penetration test I am following basic NIST 800-115 guidance for scoping and target selection, and then supplement it with contractual expectations for my clients. I can absolutely tell you that the methodologies that are used by academia should be held to a higher standard in pretty much every regard I could possibly come up with. A penetration test does not create a custom methodology attempting to deal with outputting scientific and repeatable data.

                                                                    Let’s put it in real terms, I am hired to do a security assessment in a very fixed highly focused set of targets explicitly defined in contract by my client in an extremely fixed time line (often very short… like 2 weeks maximum and 5 day average). Guess what happens if social engineering is not in my contract? I don’t do it.

                                                                    1. 1

                                                                      Resident pentester and ex-academia sysadmin checking in.

                                                                      Note: this is worded like an appeal to authority, although you probably don’t mean it that way, so I’m not going to act like you are.

                                                                      I totally agree with @Foxboron and their statement is not vague nor meaningless.

                                                                      Those are two completely separate things, and neither is implied by the other.

                                                                      their statement is not vague nor meaningless.

                                                                      Not true - their statement contained none of the information you just provided, nor any other sort of concrete or actionable information - the statement “hold to a higher standard” is both vague and meaningless by itself…and it was by itself in that comment (or, obviously, there were other words - none of them relevant) - there was no other information.

                                                                      the methodologies that are used by academia should be held to a higher standard

                                                                      Now you’re mixing definitions of “higher standard” - GP and I were talking about human experimentation and ethics, while you seem to be discussing rigorousness and reproducibility of experiments (although it’s not clear, because “A penetration test does not create a custom methodology attempting do deal with outputting scientific and repeatable data” is slightly ambiguous).

                                                                      None of the above is relevant to the question of “was this a human experiment” and the closely-related one “is penetration testing a human experiment”. Evidence suggests “no” given that the term does not appear in that document, nor have I heard of any pentest being reviewed by an ethics review board, nor have I heard any mention of “human experimenting” in the security community (including when gray-hat and black-hat hackers and associated social engineering e.g. Kevin Mitnick are mentioned), nor are other similar, closer-to-human experimentation (e.g. A/B testing, which is far closer to actually experimenting on people) processes considered to be such - up until this specific case.

                                                                    2. 5

                                                                      if you’re an employee in an industry, you’re either informed of penetration testing activity, or you’ve at the very least tacitly agreed to it along with many other things that exist in employee handbooks as a condition of your employment.

                                                                      if a company did this to their employees without any warning, they’d be shitty too, but the possibility that this kind of underhanded behavior in research could taint the results and render the whole exercise unscientific is nonzero.

                                                                      either way, the goals are different. research seeks to further the verifiability and credibility of information. industry seeks to maximize profit. their priorities are fundamentally different.

                                                                      1. 1

                                                                        you’ve at the very least tacitly agreed to it along with many other things that exist in employee handbooks as a condition of your employment

                                                                        By this logic, you’ve also agreed to everything else in a massive, hundred-page long EULA that you click “I agree” on, as well as consent to be tracked by continuing to use a site that says that in a banner at the bottom, as well as consent to Google/companies using your data for whatever they want and/or selling it to whoever will buy.

                                                                        …and that’s ignoring whether or not companies that have pentesting done on them actually explicitly include that specific warning in your contract - “implicit” is not good enough, as then anyone can claim that, as a Linux kernel patch reviewer, you’re “implicitly agreeing that you may be exposed to the risk of social engineering for the purpose of getting bad code into the kernel”.

                                                                        the possibility that this kind of underhanded behavior in research could taint the results and render the whole exercise unscientific

                                                                        Like others, you’re mixing up the issue of whether the experiment was properly-designed with the issue of whether it was human experimentation. I’m not making any attempt to argue the former (because I know very little about how to do good science aside from “double-blind experiments yes, p-hacking no”), so I don’t know why you’re arguing against it in a reply to me.

                                                                        either way, the goals are different. research seeks to further the verifiability and credibility of information. industry seeks to maximize profit. their priorities are fundamentally different.

                                                                        I completely agree that the goals are different - but again, that’s irrelevant for determining whether or not something is “human experimentation”. Doesn’t matter what the motive is, experimenting on humans is experimenting on humans.

                                                                  2. 18

                                                                    This project claims to be targeted at the open-source review process, and seems to be as close to human experimentation as pentesting (which, when you do social engineering, also involves interacting with humans, often without their notice or consent) - which I’ve never heard anyone claim is “human experimentation”.

                                                                        I had a former colleague who once bragged about getting someone fired at his previous job during a pentesting exercise. He basically walked over to this frustrated employee at a bar and bribed him with a ton of money and a job offer in return for plugging a USB key into the network. He then reported it to senior management and the employee was fired. While that is an effective demonstration of a vulnerability in their organization, what he did was unethical under many moral frameworks.

                                                                    1. 2

                                                                      First, the researchers didn’t engage in any behavior remotely like this.

                                                                      Second, while indeed an example of pentesting, most pentesting is not like this.

                                                                      Third, the fact that it was “unethical under many moral frameworks” is irrelevant to what I’m arguing, which is that the study was not “human experimentation”. You can steal money from someone, which is also “unethical under many moral frameworks”, and yet still not be doing “human experimentation”.

                                                                    2. 3

                                                                      If there is a pentest contract, then there is consent, because consent is one of the pillars of contract law.

                                                                      1. 1

                                                                        That’s not an argument that pentesting is human experimentation in the first place.

                                                                  3. 42

                                                                    The statement from the UMinn IRB is in line with what I heard from the IRB at the University of Chicago after they experimented on me, who said:

                                                                    I asked about their use of any interactions, or use of information about any individuals, and they indicated that they have not and do not use any of the data from such reporting exchanges other than tallying (just reports in aggregate of total right vs. number wrong for any answers received through the public reporting–they said that much of the time there is no response as it is a public reporting system with no expectation of response) as they are not interested in studying responses, they just want to see if their tool works and then also provide feedback that they hope is helpful to developers. We also discussed that they have some future studies planned to specifically study individuals themselves, rather than the factual workings of a tool, that have or will have formal review.

                                                                  Because they claim they’re studying the tool, it’s OK to secretly experiment on random strangers without disclosure. Somehow I doubt they test new drugs by secretly dosing people and observing their reactions, but UChicago’s IRB was 100% OK with doing so to programmers. I don’t think these IRBs literally consider programmers sub-human, but it would be very inconvenient to accept that experimenting on strangers is inappropriate, so they only want to do so in places they’ve been forced to by historical abuse. I’d guess this will continue for years until some random person is very seriously harmed by being experimented on (loss of job/schooling, pushing someone unstable into self-harm, targeting someone famous outside of programming) and then over the next decade IRBs will start taking it seriously.

                                                                    One other approach that occurs to me is that the experimenters and IRBs claim they’re not experimenting on their subjects. That’s obviously bullshit because the point of the experiment is to see how the people respond to the treatment, but if we accept the lie it leaves an open question: what is the role played by the unwitting subject? Our responses are tallied, quoted, and otherwise incorporated into the results in the papers. I’m not especially familiar with academic publishing norms, but perhaps this makes us unacknowledged co-authors. So maybe another route to stopping experimentation like this would be things like claiming copyright over the papers, asking journals for the papers to be retracted until we’re credited, or asking the universities to open academic misconduct investigations over the theft of our work. I really don’t have the spare attention for this, but if other subjects wanted to start the ball rolling I’d be happy to sign on.

                                                                    1. 23

                                                                    I can kind of see where they’re coming from. If I want to research whether car mechanics can reliably detect some fault, then sending a prepared car to 50 garages is probably okay, or at least a lot less iffy. This kind of (informal) research is actually fairly commonly done by consumer advocacy groups and the like. The difference is that the car mechanics will get paid for their work, whereas the Linux devs and you didn’t.

                                                                      I’m gonna guess the IRBs probably aren’t too familiar with the dynamics here, although the researchers definitely were and should have known better.

                                                                      1. 18

                                                                        Here it’s more like keying someone’s car to see how quick it takes them to get an insurance claim.

                                                                        1. 4

                                                                          Am I misreading? I thought the MR was a patch designed to fix a potential problem, and the issue was

                                                                          1. pushcx thought it wasn’t a good fix (making it a waste of time)
                                                                          2. they didn’t disclose that it was an auto-generated PR.

                                                                          Those are legitimate complaints, c.f. https://blog.regehr.org/archives/2037, but from the analogies employed (drugs, dehumanization, car-keying), I have to double-check that I haven’t missed an aspect of the interaction that makes it worse than it seemed to me.

                                                                          1. 2

                                                                            We were talking about Linux devs/maintainers too, I commented on that part.

                                                                            1. 1

                                                                              Gotcha. I missed that “here” was meant to refer to the Linux case, not the Lobsters case from the thread.

                                                                        2. 1

                                                                          Though there they are paying the mechanic.

                                                                        3. 18

                                                                      IRB is a regulatory board that is there to make sure that researchers follow the Common Rule (https://www.hhs.gov/ohrp/regulations-and-policy/regulations/common-rule/index.html).

                                                                      In general, any work that receives federal funding needs to comply with the federal guidelines for human subject research. All work involving human subjects (usually defined as research activities that involve interaction with humans) needs to be reviewed and approved by the institution’s IRB. These approvals fall within a continuum, from a full IRB review (which involves the researcher going to a committee and explaining their work, and usually includes continued annual reviews) to a declaration of the work being exempt from IRB supervision (usually this happens when the work meets one of the 7 exemptions listed in the federal guidelines). The whole process is a little bit more involved; see for example all the charts (https://www.hhs.gov/ohrp/regulations-and-policy/decision-charts/index.html) to figure this out.

                                                                      These rules do not cover research that doesn’t involve humans, such as research on technology tools. I think that there is currently a grey area where a researcher can claim that they are studying a tool and not the people interacting with the tool. It’s a lame excuse that probably goes around the spirit of the regulations and is probably unethical from a research standpoint. The data aggregation method or the data anonymization is usually a requirement for an exempt status and not a non-human research status.

                                                                          The response that you received from IRB is not surprising, as they probably shouldn’t have approved the study as non-human research but now they are just protecting the institution from further harm rather than protecting you as a human subject in the research (which, by the way, is not their goal at this point).

                                                                          One thing that sticks out to me about your experience is that you weren’t asked to give consent to participate in the research. That usually requires a full IRB review as informed consent is a requirement for (most) human subject research. Exempt research still needs informed consent unless it’s secondary data analysis of existing data (which your specific example doesn’t seem to be).

                                                                          One way to quickly fix it is to contact the grant officer that oversees the federal program that is funding the research. A nice email stating that you were coerced to participate in the research study by simply doing your work (i.e., review a patch submitted to a project that you lead) without being given the opportunity to provide prospective consent and without receiving compensation for your participation and that the research team/university is refusing to remove your data even after you contacted them because they claim that the research doesn’t involve human subjects can go a long way to force change and hit the researchers/university where they care the most.

                                                                          1. 7

                                                                            Thanks for explaining more of the context and norms, I appreciate the introduction. Do you know how to find the grant officer or funding program?

                                                                            1. 7

                                                                              It depends on how “stalky” you want to be.

                                                                              If NSF was the funder, they have a public search here: https://nsf.gov/awardsearch/

                                                                              Most PIs also add a line about grants received to their CVs. You should be able to match the grant title to the research project.

                                                                              If they have published a paper from that work, it should probably include an award number.

                                                                              Once you have the award number, you can search the funder website for it and you should find a page with the funding information that includes the program officer/manager contact information.

                                                                              1. 3

                                                                                If they published a paper about it they likely included the grant ID number in the acknowledgements.

                                                                                1. 1

                                                                                  You might have more luck reaching out to the sponsored programs office at their university, as opposed to first trying to contact an NSF program officer.

                                                                              2. 4

                                                                                How about something like a Computer Science External Review Board? Open source projects could sign up and include a disclaimer that their project and community ban all research that hasn’t been approved. The approval process could be as simple as a GitHub issue the researcher has to open, and anyone in the community could review it.

                                                                                It wouldn’t stop the really bad actors, but any IRB would have to explain why they allowed an experiment on subjects that explicitly refused consent.

                                                                                [Edit] I felt sufficiently motivated, so I made a quick repo for the project. Suggestions welcome.

                                                                                1. 7

                                                                                I’m in favor of building our own review boards. It seems like an important step in our profession taking its responsibility seriously.

                                                                                  The single most important thing I’d say is, be sure to get the scope of the review right. I’ve looked into this before and one of the more important limitations on IRBs is that they aren’t allowed to consider the societal consequences of the research succeeding. They’re only allowed to consider harm to experimental subjects. My best guess is that it’s like that because that’s where activists in the 20th-century peace movement ran out of steam, but it’s a wild guess.

                                                                                  1. 4

                                                                                    At least in security, there are a lot of different Hacker Codes of Ethics floating around, which pen testers are generally expected to adhere to… I don’t think any of them cover this specific scenario though.

                                                                                    1. 2

                                                                                  Any so-called “hacker code of ethics” in use by a for-profit entity places protection of that entity first and foremost, before any other ethical consideration (including human rights), and would likely not apply in a research scenario.

                                                                                2. 23

                                                                                  They are bending the rules for non-human research. One of the exceptions for non-human research is research on organizations, which my IRB defines as “Information gathering about organizations, including information about operations, budgets, etc. from organizational spokespersons or data sources. Does not include identifiable private information about individual members, employees, or staff of the organization.” Within this exception, you can talk with people about how the organization merges patches but not how they personally do that (for example). All the questions need to be about the organization and not the individual as part of the organization.

                                                                                  On the other hand, research involving human subjects is defined as any research activity that involves an “individual who is or becomes a participant in research, either:

                                                                                  • As a recipient of a test article (drug, biologic, or device); or
                                                                                  • As a control.”

                                                                                  So, this is how I interpret what they did.

                                                                                  The researchers submitted an IRB application saying that they just downloaded the kernel maintainer mailing lists and analyzed the review process. This doesn’t meet the requirements for IRB supervision because it’s either (1) secondary data analysis using publicly available data or (2) research on organizational practices of the OSS community after all identifiable information is removed.

                                                                                  Once they started emailing the list with bogus patches (as the maintainers allege), the research involved human subjects, as these people received a test article (in the form of an email) and the researchers interacted with them during the review process. The maintainers processing the patch did not do so to provide information about their organization’s processes and did so in their own personal capacity (in other words, they weren’t asked how the OSS community processes a patch; they were asked to process a patch themselves). The participants should have given consent to participate in the research and the risks of participating in it should have been disclosed, especially given the fact that missing a security bug and agreeing to merge it could be detrimental to someone’s reputation and future employability (that is, this would qualify as more than minimal risk for participants, requiring a full IRB review of the research design and process) with minimal benefits to them personally or to the organization as a whole (as it seems from the maintainers’ reaction to a new patch submission).

                                                                                  One way to design this experiment ethically would have been to email the maintainers and invite them to participate in a “lab based” patch review process where the research team would present them with “good” and “bad” patches and ask them whether they would have accepted them or not. This is after they were informed about the study and exercised their right to informed consent. I really don’t see how emailing random stuff out and seeing how people interact with it (with their full name attached to it and in full view of their peers and employers) can qualify as research with less than minimal risk that doesn’t involve human subjects.

                                                                                  The other thing that rubs me the wrong way is that they sought (and supposedly received) retroactive IRB approval for this work. That wouldn’t fly with my IRB, as my IRB person would definitely rip me a new one for seeking retroactive IRB approval for work that is already done, data that was already collected, and a paper that is already written and submitted to a conference.

                                                                                  1. 6

                                                                                    You make excellent points.

                                                                                    1. IRB review has to happen before the study is started. For NIH, the grant application has to have the IRB approval - even before a single experiment is funded, let alone actually done.
                                                                                    2. I can see the value of doing a test “in the field” so as to get the natural state of the system. In a lab setting where the participants know they are being tested, various things will happen to skew results. The volunteer reviewers might be systematically different from the actual population of reviewers, the volunteers may be much more alert during the experiment and so on.

                                                                                    The issue with this study is that there was no serious thought given to what the ethical ramifications of this are.

                                                                                    If the owners of the pen-tested system have not asked to be pen tested, then this is basically a criminal act. Otherwise all bank robbers could use the “I was just testing the security system” defense.

                                                                                    1. 8

                                                                                      The same requirement for prior IRB approval applies to NSF grants (which the authors seem to have received). From what they write in the paper and my interpretation of the circumstances, they self-certified as conducting non-human research at the time of submitting the grant and only asked their IRB for confirmation after they wrote the paper.

                                                                                      Totally agree with the importance of “field experiment” work and that, sometimes, it is not possible to get prospective consent to participate in the research activities. However, the guidelines are clear on which research activities are exempt from prior consent. The only one that I think is applicable to this case is exception 3(ii):

                                                                                      (ii) For the purpose of this provision, benign behavioral interventions are brief in duration, harmless, painless, not physically invasive, not likely to have a significant adverse lasting impact on the subjects, and the investigator has no reason to think the subjects will find the interventions offensive or embarrassing. Provided all such criteria are met, examples of such benign behavioral interventions would include having the subjects play an online game, having them solve puzzles under various noise conditions, or having them decide how to allocate a nominal amount of received cash between themselves and someone else.

                                                                                      These usually cover “simple” psychology experiments involving mini games or economics games involving money.

                                                                                      In the case of this kernel patching experiment, it is clear that the experiment doesn’t meet this requirement, as participants have found the intervention offensive or embarrassing, to the point that they are banning the researchers’ institution from pushing patches to the kernel. Also, I am not sure that reviewing a patch is a “benign game”, as this is most likely the reviewers’ job. Plus, the patch review could have an adverse lasting impact on the subjects if they get asked to stop reviewing patches because they didn’t catch the security risk (e.g., being deemed incompetent).

                                                                                      Moreover, there is this follow up stipulation:

                                                                                      (iii) If the research involves deceiving the subjects regarding the nature or purposes of the research, this exemption is not applicable unless the subject authorizes the deception through a prospective agreement to participate in research in circumstances in which the subject is informed that he or she will be unaware of or misled regarding the nature or purposes of the research.

                                                                                      As their patch submission process was deceptive in nature, as they outline in the paper, exemption 3(ii) cannot apply to this work unless they notify maintainers that they will be participating in a deceptive research study about kernel patching.

                                                                                      That leaves the authors to either pursue full IRB review for their work (as a full IRB review can approve a deceptive research project if it deems it appropriate and the risk/benefit balance is in favor to the participants) or to self-certify as non-human subjects research and fix any problems later. They decided to go with the latter.

                                                                                  2. 35

                                                                                    We believe that an effective and immediate action would be to update the code of conduct of OSS, such as adding a term like “by submitting the patch, I agree to not intend to introduce bugs.”

                                                                                      I copied this from that paper. This is not research; anyone who writes a sentence like this with a straight face is a complete moron and is just mucking about. I hope all of this will be reported to their university.

                                                                                    1. 18

                                                                                      It’s not human research because we don’t collect personal information

                                                                                      I yelled bullshit so loud at this sentence that it woke up the neighbors’ dog.

                                                                                      1. 2

                                                                                        Yeah, that came from the “clarifications”, which is garbage top to bottom. They should have apologized, accepted the consequences and left it at that. Here’s another thing they came up with in that PDF:

                                                                                        Suggestions to improving the patching process: In the paper, we provide our suggestions to improve the patching process.

                                                                                        • OSS projects would be suggested to update the code of conduct, something like “By submitting the patch, I agree to not intend to introduce bugs”

                                                                                        i.e. people should say they won’t do exactly what we did.

                                                                                        They acted in bad faith, skirted IRB through incompetence (let’s assume incompetence and not malice) and then acted surprised.

                                                                                      2. 14

                                                                                        Apparently they didn’t ask the IRB about the ethics of the research until the paper was already written: https://www-users.cs.umn.edu/~kjlu/papers/clarifications-hc.pdf

                                                                                        Throughout the study, we honestly did not think this is human research, so we did not apply for an IRB approval in the beginning. We apologize for the raised concerns. This is an important lesson we learned—Do not trust ourselves on determining human research; always refer to IRB whenever a study might be involving any human subjects in any form. We would like to thank the people who suggested us to talk to IRB after seeing the paper abstract.

                                                                                        1. 14

                                                                                          I don’t approve of researchers YOLOing IRB protocols, but I also want this research done. I’m sure many people here are cynical/realistic enough that the results of this study aren’t surprising. “Of course you can get malicious code in the kernel. What sweet summer child thought otherwise?” But the industry as a whole proceeds largely as if that’s not the case (or you could say that most actors have no ability to do anything about the problem). Heighten the contradictions!

                                                                                          There are some scary things in that thread. It sounds as if some of the malicious patches reached stable, which suggests that the author mostly failed by not being conservative enough in what they sent. Or for instance:

                                                                                          Right, my guess is that many maintainers failed in the trap when they saw respectful address @umn.edu together with commit message saying about “new static analyzer tool”.

                                                                                          1. 17

                                                                                            I agree, while this is totally unethical, it’s very important to know how good the review processes are. If one curious grad student at one university is trying it, you know every government intelligence department is trying it.

                                                                                            1. 8

                                                                                              I entirely agree that we need research on this topic. There’s better ways of doing it though. If there aren’t better ways of doing it, then it’s the researcher’s job to invent them.

                                                                                            2. 7

                                                                                              It sounds as if some of the malicious patches reached stable

                                                                                              Some patches from this university reached stable, but it’s not clear to me that those patches also introduced (intentional) vulnerabilities; the paper explicitly mentions the steps they took to ensure those patches don’t reach stable (I omitted that part, but it’s just before the part I cited)

                                                                                              All umn.edu patches are being reverted, but at this point it’s mostly a matter of “we don’t trust these patches and will need additional review” rather than “they introduced security vulnerabilities”. A number of patches already have replies from maintainers indicating they’re genuine and should not be reverted.

                                                                                              1. 5

                                                                                                Yes, whether actual security holes reached stable or not is not completely clear to me (or apparently to maintainers!). I got that impression from the thread, but it’s a little hard to say.

                                                                                                Since the supposed mechanism for keeping them from reaching stable is conscious effort on the part of the researchers to mitigate them, I think the point may still stand.

                                                                                                1. 1

                                                                                              It’s also hard to figure out what the case is, since there is no clear answer as to what the commits were, and where they are.

                                                                                              2. 4

                                                                                                The Linux review process is so slow that it’s really common for downstream folks to grab under-review patches and run with them. It’s therefore incredibly irresponsible to put patches that you know introduce security vulnerabilities into this form. Saying ‘oh, well, we were going to tell people before they were deployed’ is not an excuse and I’d expect it to be a pretty clear-cut violation of the Computer Misuse Act here and equivalent local laws elsewhere. That’s ignoring the fact that they were running experiments on people without their consent.

                                                                                              I’m pretty appalled that Oakland accepted the paper for publication. I’ve seen papers rejected from there before because they didn’t have appropriate ethics review oversight.

                                                                                            1. 28

                                                                                              “AI” proctoring software has had this as a failure mode for a long time, but it’s become especially pressing over the past year or so as so many things went remote.

                                                                                              I recall a story about some Black students taking their bar exams who kept getting flagged for leaving during the exam because of failure of facial recognition, and some were literally told to go out and buy the brightest high-beam lights they could get and point them directly into their own faces – all through the exam – in hopes of producing a light enough tone that the facial recognition would work.

                                                                                              And this is just the latest in a long long line of problems, including infamous examples like automated restroom soap dispensers that don’t register dark skin, cameras that go into “did someone blink?” mode when taking photographs of Asian people, etc. The fact that it’s been developed and tested by monoculture or nearly-monoculture teams is often painfully obvious.

                                                                                              1. 20

                                                                                                It’s worth noting that the “did someone blink” case happened on Nikon cameras, which is a Japanese company. The fact that their own cameras failed against Asian faces points to a biased dataset, not necessarily to a homogeneous team.

                                                                                                1. 9

                                                                                                Or that Nikon is really bad at developing software, which is borne out by other examples.

                                                                                                  1. 7

                                                                                                    I’ve had a very dark cat for the last half year, not quite black but close enough, and it’s surprisingly hard to take a decent picture of her with my phone camera, even in good light conditions. And with less-than-good light conditions it’s just impossible. This isn’t the best camera but those facial recognition systems probably aren’t either.

                                                                                                  It seems to me that these kinds of things are just really hard to develop in the first place; I don’t think it’s necessarily indicative of any sort of bias (although it could be). The real problem is that the current zeitgeist is to push through with automation no matter what the costs or trade-offs might be, which is a problem that extends far beyond this sort of stuff. The classic “waving your loved one goodbye from the train platform” scene is now impossible in many countries, as you need a ticket to enter the train platform, and those machines don’t have an “I just want to wave goodbye” option.

                                                                                                    As a society we seem to be like a kid with a new toy that wants to do nothing other than play with this new toy and we view this sort of (alleged) “progress” as a force of nature we have no control over.

                                                                                              1. 3

                                                                                                I have to be honest, I have no idea what I just read. It seems to be a description of an interchange binary protocol, but has several snippets that make me dubious that anything here makes any sense:

                                                                                                We claim that Behaviour = Trust(Information) can model any entity in the shared reality

                                                                                                We believe in freedom; Trust ⊂ Information × Behaviour

                                                                                                  We believe in intelligence; throughput ∼ latency⁻¹ i.e. tradeoff between efficiency and redundancy.

                                                                                                We believe in harmony; there is a convergence of knowledge, ∃polite ∈ ∨Knowledge.

                                                                                                We believe in prosperity; ∃world ∈ Freedom ∧ Harmony

                                                                                                By using datalisp to describe how to communicate in the network we can get closer to a goal of the system, which is to optimize the firing sequences in the world-wide propnet by calculating paths using the bayesian semi-naive evaluation or something else that allows us to be as lazy as possible without compromising security.

                                                                                                1. 1

                                                                                                  It’s a description of a /human readable/ data-interchange format. The human readable part is what makes it necessary to talk about politics (how do we decide what each word means?) and economics (how do we avoid the tragedy of the commons? - i.e. a noisy language where nothing is properly defined).

                                                                                                    I tried to be as honest as I could about how I came to these decisions when designing the system. While this is just a thought experiment as-is, I really do plan on building this.

                                                                                                1. 5

                                                                                                  I remember someone here who was writing about Github’s whole “fork” model being totally busted cuz the activation energy is like “fork this repo, send in one patch, then send it back” instead of just “write the patch and give it in”. I wonder if widespread adoption of this for Git would enable that kind of workflow over on GH & co.

                                                                                                  (I mean there’s always email patches too I guess… would be cool if GH supported that as well)

                                                                                                  1. 13

                                                                                                    The friction is still lower than hoping your email/list didn’t get mangled.

                                                                                                    1. 2

                                                                                                      I think it would have been better if Git made sure that sending patches as email attachments was also supported by the regular tooling. The entire issue with mangled patches stems from non-technical email clients assuming that plain text doesn’t have to maintain its structure.

                                                                                                    2. 4

                                                                                                      For simple one-file changes, Github’s web UI works fine. You click “edit” on a file, do your changes, then at the bottom there’s a button that allows you to create a PR with that change as its only commit. Github will fork the repository in the background for you.

                                                                                                      1. 2

                                                                                                        Yeah, that wasn’t around for the first few years of GitHub’s existence, but it definitely does help with the rate of casual contributions.

                                                                                                    1. 13

                                                                                                      There were some… colorful comments in the post when I read it. I’m sure there’s a better source for this than a page that also publishes transphobic comments and is filled with /g/-tier comments.

                                                                                                      1. 5

                                                                                                        My initial reaction to your comment was that LinuxReviews is a wiki that anyone can edit, so it might just be a single bad actor. A closer look revealed the author of that article to be one of the main authors of the site, and a staff member to boot. I also see a similar pattern in the comments. Yeah, things aren’t looking good.

                                                                                                        /me sighs. Guess I’ll keep my LR account for correcting errors/misinformation on popular posts but otherwise keep some distance.

                                                                                                        1. 4

                                                                                                          There’s also blatantly nationalistic hostility in the post itself, claiming that Americans don’t understand “face” or trustworthiness.

                                                                                                      1. 3

                                                                                                         I’m actually doing this right now to debug some problems, but with TCP, which basically boils down to the same thing: embed a timestamp in your application message, and compare it with a local timestamp when you receive the message.

                                                                                                         If you’re using AWS, they expose an NTP server backed by GPS and atomic clocks, which has good enough accuracy to measure millisecond-level timing information.
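
                                                                                                         Roughly, the idea looks like this (a minimal Go sketch for illustration, not my actual debugging code; it assumes both hosts sync to the same NTP source, e.g. the AWS time sync service mentioned above, otherwise the measured “delay” is mostly clock offset):

                                                                                                         package main

                                                                                                         import (
                                                                                                             "encoding/binary"
                                                                                                             "fmt"
                                                                                                             "time"
                                                                                                         )

                                                                                                         // encode prepends the sender's wall-clock time (UnixNano, 8 bytes, big-endian) to the payload.
                                                                                                         func encode(payload []byte) []byte {
                                                                                                             buf := make([]byte, 8+len(payload))
                                                                                                             binary.BigEndian.PutUint64(buf, uint64(time.Now().UnixNano()))
                                                                                                             copy(buf[8:], payload)
                                                                                                             return buf
                                                                                                         }

                                                                                                         // decode splits a message back into its payload and the apparent one-way delay relative to the local clock.
                                                                                                         func decode(msg []byte) ([]byte, time.Duration) {
                                                                                                             sent := time.Unix(0, int64(binary.BigEndian.Uint64(msg[:8])))
                                                                                                             return msg[8:], time.Since(sent)
                                                                                                         }

                                                                                                         func main() {
                                                                                                             msg := encode([]byte("hello"))
                                                                                                             // ...send msg over the TCP connection; on the receiving side:
                                                                                                             payload, delay := decode(msg)
                                                                                                             fmt.Printf("payload=%q apparent one-way delay=%v\n", payload, delay)
                                                                                                         }

                                                                                                         The 8-byte big-endian header is just an arbitrary framing choice; in practice you write the encoded bytes to the existing TCP connection and log the delay on the receiving side.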