1. 35

    Unlike say, VMs, containers have a minimal performance hit and overhead

    Ugh. I really hate it when people say things like that because it’s both wrong and a domain error:

    A container is a packaging format, a VM is an isolation mechanism. Containers can be deployed on VMs or they can be deployed with shared-kernel isolation mechanisms (such as FreeBSD Jails, Solaris Zones, Linux cgroups, namespaces, seccomp-bpf and wishful thinking), or with hybrids such as gVisor.

    Whether a VM or a shared-kernel system has more performance overhead is debatable. For example, FreeBSD Jails now support having per-jail copies of the entire network stack because using RSS in the hardware to route packets to a completely independent instance of the network stack gives better performance and scalability than sharing state owned by different jails in the same kernel data structures. Modern container-focused VM systems do aggressive page sharing and so have very little memory overhead and even without that the kernel is pretty tiny in comparison to the rest of a typical container-deployed software stack.

    Running everything as root. We never let your code run as root before, why is it now suddenly a good idea?

    This depends entirely on your threat model. We don’t run things as root because we have multiple security contexts and we want to respect the principle of least privilege. With containerised deployments, each container is a separate security context and already runs with lower privileges than the rest of the system. If your isolation mechanism works properly then the only reason to run as a non-root user in a container is if you’re running different programs in different security contexts within the container. If everything in your container is allowed to modify all state owned by the container then there’s no reason to not run it all as root. If you do have multiple security contexts inside a container then you need to think about why they’re not separate containers because now you’re in a world where you’re managing two different mechanisms for isolating different security contexts.

    1. 22

      I think you mean an image is a packaging format, whereas a container is an instance of a jail made up of various shared-kernel isolation mechanisms (including the wishful thinking) as you mentioned.

      Yes, the terminology is unfortunate. My reimplementation of Docker calls it an “instance” rather than “container”.

      1. 3

        yeah, the “never run as root in your container” thing kills me

        1. 9

          IIUC that’s all because the way Linux isolates users (with the whole UID remapping into a flat range thing) is weird and there’s way too many security bugs related to that.

          1. 1

            I don’t know if this is still true, but part of where this advice comes from is that it used to be that running as root meant running as root on the host (i.e. the mechanism you’re talking about was not used by Docker). In theory this was “fine” because you could only get at stuff on the container environment, but it meant that if there was a container breakout exploit you were unconfined root on the host. So running as non-root in the container meant that you’d have to pair a container breakout with a privilege escalation bug to get that kind of access.

            In other words: the isolation mechanism did not work properly.
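
            A rough way to tell which of the two situations described above a given container is in: on Linux, `/proc/self/uid_map` lists how UIDs inside a user namespace map to UIDs outside it, one `inside outside count` triple per line. The sketch below is illustrative only. The function names are made up for this example, and the "outside" column is relative to the namespace of whoever reads the file, so nested setups need more care than this.

            ```python
            def parse_uid_map(text):
                """Parse uid_map content into (inside, outside, count) tuples."""
                entries = []
                for line in text.splitlines():
                    fields = line.split()
                    if len(fields) != 3:
                        continue  # skip blank or malformed lines
                    inside, outside, count = (int(f) for f in fields)
                    entries.append((inside, outside, count))
                return entries


            def container_root_is_host_root(entries):
                """Return True if UID 0 inside the namespace maps to UID 0 outside it."""
                for inside, outside, count in entries:
                    if inside <= 0 < inside + count:
                        return outside + (0 - inside) == 0
                return False  # UID 0 is not mapped at all


            def check_current_process(path="/proc/self/uid_map"):
                """Hypothetical helper: apply the check to the running process (Linux only)."""
                with open(path) as f:
                    return container_root_is_host_root(parse_uid_map(f.read()))
            ```

            An identity mapping such as `0 0 4294967295` means root in the container is root on the host, so a breakout lands you there directly. A mapping such as `0 100000 65536` means a breakout would still need a separate privilege escalation.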

        2. 1

          That’s interesting. I haven’t actually bench tested the two in years. I’ll have to revisit it.

          1. 1

            You might want to have a look at NVIDIA’s enroot or Singularity for some lower-overhead alternatives. I’ve briefly looked at enroot after I saw the talk about Distributed HPC Applications with Unprivileged Containers at FOSDEM 2020, but sadly haven’t gotten a chance to use them at work yet.

            1. 2

              Have you tried https://github.com/weaveworks/ignite to just run a docker image in a VM instead of a container?

              1. 1

                No, haven’t stumbled across that before. Thanks, that looks very interesting!

                1. 1

                  That seems interesting. I wonder what benefit it provides compared to the shared-kernel isolation mechanism used by docker run <container>. Do I get stronger isolation, performance boost, or something else?

                  1. 2

                    I think there are always tradeoffs, but a VM may be easier to reason about than a container still. It’s a level of abstraction that you can apply thinking about a single computer to.

                    I do think that you get stronger isolation guarantees too. You can also more easily upgrade things, so if you have a kernel vulnerability that affects one of the containers, you can reload just that one. There are many issues that affect hypervisors only or guests only.

                    At launch we used per-customer EC2 instances to provide strong security and isolation between customers. As Lambda grew, we saw the need for technology to provide a highly secure, flexible, and efficient runtime environment for services like Lambda and Fargate. Using our experience building isolated EC2 instances with hardware virtualization technology, we started an effort to build a VMM that was tailored to run serverless functions and integrate with container ecosystems.

                    It also seems like a compromise between the user interface for a developer and an operations deep expertise. If you have invested 15 years in virtualization expertise, maybe you stick with that with ops and present a container user interface to devs?

                    For me, one of the big things about containers was not requiring special hardware to virtualize at full speed and automatic memory allocation. You’re never stuck with an 8GB VM you have to shut down to prevent your web browser from being swapped out when you’re trying to open Stack Overflow. You know 8GB was suggested, but you also see that only 512MB is actually being used.

                    Most hardware these days has hardware acceleration for virtualization, and Firecracker supports the virtio memory ballooning driver as of Dec 2020, so many of the reasons I would have used containers in 2013 are moot.

                    As an ops person myself, I find containers to often have an impedance mismatch with software defaults. Why tell a container that is limited to two cores that it has 64 cores? HAProxy will deadlock itself waiting for all 64 connection threads to get scheduled on those two cores. You look in there and you’re like ‘oh, how do I hardcode the number of threads in HAProxy now to two…’. It’s trivial with HAProxy, but it’s not the default. How many other things do you know of that use nproc+1 and will get tripped up in a container? How many different ways do you have to configure this for different runtimes and languages?
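
                    One hedged illustration of the mismatch: with cgroup v2, a CPU limit shows up in a `cpu.max` file as `<quota> <period>` (or `max <period>` when unlimited), while `os.cpu_count()` still reports every core on the host. The sketch below derives a thread count from that file. The path `/sys/fs/cgroup/cpu.max` and the function names are assumptions for this example; cgroup v1 hosts and other runtimes expose the limit differently.

                    ```python
                    import math
                    import os


                    def cpus_from_cpu_max(text, fallback):
                        """Turn a cgroup v2 cpu.max line into a usable CPU count."""
                        fields = text.split()
                        if len(fields) != 2 or fields[0] == "max":
                            return fallback  # no quota set, or unexpected format
                        quota, period = int(fields[0]), int(fields[1])
                        if quota <= 0 or period <= 0:
                            return fallback
                        # Round up so a 1.5-core quota still gets two threads,
                        # but never exceed what the machine reports.
                        return max(1, min(fallback, math.ceil(quota / period)))


                    def effective_cpus(path="/sys/fs/cgroup/cpu.max"):
                        """Best-effort CPU count that respects a cgroup v2 quota if one is visible."""
                        visible = os.cpu_count() or 1
                        try:
                            with open(path) as f:
                                return cpus_from_cpu_max(f.read(), visible)
                        except OSError:
                            return visible  # not Linux, not cgroup v2, or file not readable
                    ```

                    On the two-core container above, `cpu.max` would read something like `200000 100000` and this returns 2, which is the number you would then have to feed into each program’s own thread setting by hand.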

            2. 1

              Containers can be deployed on VMs

              OT because I agree with everything you said, but I have yet to find a satisfying non-enterprise solution (“enterprise” meaning one that requires a million other network services and solutions).

              Once upon a time, I was sure VMware was going to add “deploy container as VM instance” to ESXi but then they instead released Photon and made it clear containers would never be first-class residents on ESXi but would rather require a (non-invisible) host VM in a one-to-many mapping.

              1. 2
                1. 1

                  We use this at my work (Sourcegraph) for running semi-arbitrary code (language indexers, etc.), it works really well.

            1. 4

              As browsers expose more APIs, exposing the Unicode data they already need for their own regexp implementations, sorting, etc. seems like a reasonable thing to want. (There are already substantial i18n APIs.)

              Many wasm programs need to process text and all need the same data to do so, and sometimes following the browser’s idea of Unicode may be an added benefit–gets an old program ‘free’ Unicode version updates, or conversely helps grapheme clustering match the browser’s when the browser is not up to date.

              1. 2

                (Author here) I would love this. I think that even just improving the existing Intl.Collator API could be good enough. The problem with it currently, from what I have seen, is that its actual behavior is not formally specified. That is, it works well if you’re, say, trying to sort strings in the user’s locale - which is what the API was designed to do - but if you expect those results to be consistent across browsers you’re out of luck and will quickly run into issues like [0].

                If the behavior was consistent across browsers, I think that’d be good enough to rely on the Intl APIs in general. But yeah, differing behavior for say a regexp engine across browsers is.. well.. not desirable, to say the least.

                [0] https://stackoverflow.com/questions/33919257/sorting-strings-with-punctuation-using-intl-collator-is-inconsistent-across-brow

                1. 1

                  Oh, wow, Collator being almost right is kind of a maddening snatching-defeat-from-the-jaws-of-victory situation.

                  Zorex looks cool! A regex-like language that does things Everyone Knows Regexes Can’t Do seems like it could be useful in a lot of places.

              1. 14

                This post contains worrisome amounts of misinformation, and I am sad to see it on the front page.

                I am way too busy to take a whole day off to write a rebuttal, but have written on some of these issues before: https://writing.kemitchell.com/2018/01/06/CLAs-Are-Not-a-Sham.html

                Understand what CLAs, copyright assignments, and the DCO were meant to accomplish, and how they are supposed to accomplish those things. Start by actually reading them. The most common and important are all way shorter than GPL.

                1. 6

                  I know you are a lawyer, so you know better than me. In fact, I have read your blog extensively and trust you enough that I might purchase your services as a lawyer one day.

                  But…while your post may be right from a law perspective, it is wrong in practice. See, for example, 1 and 2.

                  In your post, you claim that most CLA’s are done wrong, and that may be true. But I looked at your Berneout Pledge, and to my untrained eye, it looks like a DCO, not a CLA. It says only that the contributions will be licensed under the license for the software that the contributions are for.

                  I know that you will protest that your Berneout Pledge is not “a” DCO because there is only one DCO: for the Linux kernel. In your post, you say that the DCO was made for the Linux kernel, and historically, that is true.

                  However, the term “Developer Certificate of Origin” has moved on from that meaning, as vocabulary often does, and in popular use, it does not always specifically refer to that DCO; it is often used, in practice, as a term for all like certificates.

                  The same thing has happened with the term “Contributor License Agreement.” That term is now used for all agreements that have a copyright assignment or an agreement to allow relicensing. While CLA’s don’t technically have to allow that, in practice they do, which is why, as you say, CLA’s are done wrong.

                  Personally, I would argue that if they are done that way in general, that’s the new meaning of the term. And as you can see in 1 and 2, such agreements cause big problems for developers who don’t want to have projects they contributed to be taken over, and it causes problems for users who have trusted the developers.

                  In short, you say that CLA’s are not a sham and present a version of one, but to most developers, they would call that a DCO, so in practice, developers will still consider CLA’s a sham. And I think they are because the meaning of the term has come to mean just that: agreements that are a sham.

                  1. 9

                    Might you link me to some examples of DCOs that aren’t the DCO? I’d be grateful to see those. I’m sure I’ll have reason to know or mention them in response to questions from others, someday soon.

                    My first thought there is that I suspect they are actually license agreements. I expect they actually grant licenses for contributions, and don’t just talk about the right to license.

                    My second thought was that for whatever the git-commit manpage section on --signoff is worth—not much, I think—it’d be worth even less for some other form different from the one named and linked there. Evidence of a Git flag specifically evidencing intent to express a particular set of legal terms could possibly beat a lawsuit in the right circumstances. The stronger the convention, the more circumstances those might be. But proliferation of different terms Signed-Off-By might refer to would tend to weaken, not strengthen, arguments about implied intent.

                    By the way, your digging up the Berneout Pledge was right on point. To be clear: The Pledge was a proof of concept and an experiment. I don’t actively recommend its use. But I was glad to see it succeeded in one respect: it “felt” more like DCO than CLA to you, as you read it. That was an explicit goal: to adopt the DCO style, but cover real CLA substance. The Berneout Pledge didn’t just clarify and document having the right to license a contribution. It actually licensed contributions:

                    Unless I specifically say otherwise, whenever I offer a contribution to a free public project, I license my contribution under the same license terms on which that project is licensed to the public at the time.

                    This idea, often called “inbound=outbound”, is one folks who dislike CLAs like to try and argue is implied just from “doing open source” in 2021. But I don’t see a strong context-independent legal argument for that under US law. And I don’t think I’m alone, because GitHub wrote something similar into the terms of service for GitHub. The difference being that with the Berneout Pledge, the developer takes explicit action to license their work. They don’t just passively ignore terms of service. IIRC, Red Hat did something similar a while back, with a tiny, pared-down CLA that basically said what I quoted above.

                    On Semantics: Meh. The trouble with calling legal forms with copyright assignments CLAs is that CLA stands for “contributor license agreement”. The word “assignment” does not start with “L”.

                    In any event, actual assignments are vanishingly rare for outside contributions to open source projects these days. Even FSF and SFLC are moving away from them. In part because there are extra formal process requirements for signing assignments that don’t apply to mere licenses, and legal theories about needing assignments and not just licenses to enforce the GPL got overtaken by experience in the courts.
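
                    For readers who have not seen the convention: `git commit --signoff` appends a `Signed-off-by: Name <email>` trailer to the commit message. A minimal sketch of checking for that trailer follows. The regex and function name are assumptions made up for this example, and finding a trailer says nothing about which legal terms, if any, the committer meant by it.

                    ```python
                    import re

                    # Matches trailer lines of the form "Signed-off-by: Name <email>".
                    SIGNOFF = re.compile(
                        r"^Signed-off-by: (?P<name>.+) <(?P<email>[^<>\s]+)>$",
                        re.MULTILINE,
                    )


                    def signoffs(message):
                        """Return (name, email) pairs for each Signed-off-by line in a commit message."""
                        return [(m["name"], m["email"]) for m in SIGNOFF.finditer(message)]
                    ```

                    A project that wants the trailer to carry any weight would at least have to run a check like this on every commit, and say somewhere which text the sign-off refers to.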

                    1. 3

                      Thank you for your reply.

                      After searching, I actually do not have any examples of another DCO, just examples where DCO is used as a generic term. For example, see the Wikipedia page for the DCO, which also refers to “DCOs” [sic] and “the DCO” in the same page.

                      My point was not that there were more DCO’s out there; my point was that the term itself has evolved beyond just the DCO, even if it’s still the only one.

                      And based on what you said, having only one is a good thing.

                      About the Berneout Pledge: good to know that you do not recommend its use. However, the fact that I would call it “a DCO” and so would (I suspect, but have no evidence) other developers, means that even though you, as a lawyer, consider it a CLA, in practice, it is a DCO because the intended audience would call it a DCO. That means there is a mismatch in terms.

                      Now, if you argued in CLAs Are Not a Sham that developers need to use CLA for DCO-like things and give us a new term for the CLA’s that are a sham, then maybe developers could get on the same page because we need something to help us refer to those poisonous things; we need a replacement for “CLA.”

                      Do you have a term for such sham CLA’s? If you do, I’ll start using it and encourage others to. That would be the only way to get rid of this mismatch in terms.

                      Also, if you have a non-sham CLA/DCO that you would suggest people use instead of the DCO, I’d love to hear it. In fact, making one might be something I need your (paid) services as a lawyer for because I am trying to build a software business right now.

                      1. 6

                        I don’t share your guess that many developers would call the Berneout Pledge “a DCO”. Any more than they might call The Blue Oak Model License “an MIT license”. I have never seen anyone call anything but the kernel form “Developer Certificate of Origin”, which was why I was so interested when you mentioned examples.

                        The Apache CLA forms are approaching de facto standard, and while they’re not perfect, there’s nothing evil or broken about them. They do the job. The only stink on them, in my opinion, comes of developers’ misconceptions.

                        The rush to “substitute” DCOs, in my opinion, largely boils down to this:

                        • Developers understand the inconvenience of signing and tracking proper CLAs perfectly well, however mild it may be.

                        • Developers don’t understand the difference between what the DCO was meant to do and what CLAs are meant to do, or how they do it.

                        • Developers have nonetheless managed to overcome objections by more experienced hands, largely by pointing to successful projects under ancient permissive licenses that “adopted the DCO”.

                        • Many of those “leading” projects did little more than tack the text of the DCO onto CONTRIBUTING.md or some other file in the GitHub repo, without adopting any of the workflow or other aspects of kernel development—many don’t even use Signed-Off-By—that help the DCO arguably do legal work in context.

                        1. 2

                          I disagree that the Apache CLA forms are approaching de facto standard, by the very fact that I, a developer who has spent a lot of time reading licenses and other IP-related stuff, have not seen them before. They may be used in a lot of places, but there are companies that are explicitly using sham CLA’s.

                          You have good points, but I still think that to get developers to talk about CLA’s the way you want us to, you have to give us some way of referring to the sham CLA’s, other than “sham CLA’s” of course.

                          Maybe Relicense Agreements (RLA’s)? I don’t know.

                          1. 5

                            I disagree that the Apache CLA forms are approaching de facto standard, by the very fact that I, a developer who has spent a lot of time reading licenses and other IP-related stuff, have not seen them before.

                            That’s on you, man. Stop bothering me and go do your homework.

                            Nobody sees all of open source. But I can say with very high confidence that every lawyer specializing in open licensing that I know, as well as most open source program office people and foundation veterans, know the Apache CLAs, and know them well.

                            I never said CLAs are a sham. I wrote a blog post saying they’re not.

                            1. 4

                              You lost yourself a future client. This was not the behavior I expected from someone who I thought was professional.

                              I’m trying to tell you that developers, in general, think they are a sham, and if you want to convince us otherwise, you need to have another way to refer to actual sham agreements that allow companies to relicense open source, such as the Audacity one. When I was referring to “sham CLA’s,” I was trying to refer to the agreements that you were not talking about in your post, but that developers are concerned about.

                              You say CLA’s are not a sham. Well, there are agreements which are called CLA’s by their authors that are a sham. So if you mean to say that all CLA’s are not a sham, you are wrong, as proven by Audacity. If you mean to say that certain CLA’s are not a sham and that only those deserve to be called CLA’s, then you must give the sham agreements masquerading as CLA’s another name. That’s what I was saying.

                              For the record, what’s my homework as a developer? To go to law school? To keep track of what every open source community uses? I can only do so much and see so much, and I hardly interact with the Apache Foundation, an entity that appears to be slowly fading away. That’s why I would want someone like you to help, but you would have to meet me in the middle. Also, you say most lawyers and office people know about the Apache CLA. But we are talking about developers.

                              On top of that, I would bet there are a lot of developers out there whose only exposure to CLA’s is to the ones that are called CLA’s but in your eyes are just sham agreements masquerading as CLA’s, the ones used in bad ways. If they read your post, then a lot of them might come away thinking that you either don’t know what you’re talking about, or that you don’t understand developers’ concerns.

                              I can now firmly put myself in the latter category as I spent time trying to help you see my concerns while acknowledging your concerns that developers are using the wrong terminology, and yet you refused.

                              That said, I might write a future blog post pointing to yours and suggesting people use the phrase “Relicense Agreements” to refer to the sham agreements masquerading as CLA’s.

                              That long phrase, “sham agreements masquerading as CLA’s,” is just to not allow you to assume that I am putting words into your mouth about CLA’s. The fact that I have to do that is sad.

                              I apologize for bothering you. I won’t bother you again, even if you reply to this last post.

                  2. 6

                    @kemitchell, your description of a CLA makes perfect sense to me, but it also is very unlike other instances where I’ve encountered CLAs. I might have a different view on some things, since I’m looking from a European perspective, I assume you have a Northern American perspective on things.

                    The CLAs I have encountered are typically intended to allow the “host” company to change the license afterwards, through constructions where ownership, copyright or intellectual property rights are transferred, and a license to use your own contribution (but not the whole product) however you like, regardless of the license of the project.

                    I am very much against these kinds of CLAs that want to retain ownership of the product in one place. I feel that the CLA “replaces” the open source license, and open source licenses currently are well known among programmers, but we don’t have standard CLAs. I think that your idea for the berneout-pledge is on the right track, where a developer states that they know about intellectual property rights and pitfalls. Ideally I’d like to be able to mark the commits somehow (like Signed-Off-By, but more explicit).

                    The term CLA may have become very negatively loaded with the Audacity and ElasticSearch incidents, most developers would be more willing to sign “a DCO”; I don’t know any other than the Linux one, but I do understand what @gavinhoward means when he says “a DCO”; The term CLA “feels” like something that you sign to give up rights in order to obtain the privilege of contributing, DCO “feels” like something that you “sign” or “use” to indicate that what you give is yours to give, and you make it available under the same terms as the original.

                    I noticed that you mentioned easier litigation against GPL violations, but do you really need ownership for that, or is it possible for me to allow you to fight GPL violations on my behalf without me giving up anything?

                    1. 5

                      You don’t need ownership of all contributions in one place to change license terms over time. You can get the same effect with licenses for all contributions on sufficiently generous (permissive) terms.

                      That goes both ways: oftentimes there’s little or no practical difference, in the short term, between owning copyright in some code or having a license to do pretty much whatever you want with it. That’s true for developers as well as organizations. What good is owning copyright in a hundred or even a thousand lines of patches to a much larger project, which is only useful as part of that broader project, which also happens to be licensed to the public on open source terms?

                      There is a popular, even de facto “standard”, set of CLA terms:

                      It’s just that other orgs don’t use these forms verbatim, because they say “Apache” specifically. Instead, they patch them, or use them as starting points. Which is exactly how the X/MIT and BSD licenses grew into de facto standards, as well. Among specialists, it’s perfectly normal to hear a conversation like:

                      — Does SuperCo use CLAs for their project?

                      — Yeah. Apache CLAs.

                      Compare the Apache CLAs to Google’s:

                      Or Facebook’s:

                      Or compare Microsoft’s form, also a license and not an assignment: https://opensource.microsoft.com/pdf/microsoft-contribution-license-agreement.pdf

                      The only meaningfully popular software contributor IP form with any assignment-like legal mechanism in it that I can think of was Oracle’s. But even then, the putative result was joint ownership—both Oracle and the developer can act as owners—not complete transfer from developer to Oracle. Oracle’s form also got reused, most notably for me by Rich Hickey, for Clojure.

                      On whether an org like FSF needs assignments to enforce a license like GPL, read my comment here carefully. I mention the legal theory behind that position has largely been overtaken by legal developments.

                    2. 5

                      I really appreciate your thoughts - especially as someone who follows your blog a bit avidly. Also appreciate your/Heather Meeker’s work on Polyform. I’ll be honest, it hurts a bit to hear from someone I look up to in many ways that I am spreading misinformation and don’t fully understand why.

                      It’s clear to me that my statements are at the very least missing nuance, as that is the overwhelming negative feedback I have received - and I suppose that is feedback I should’ve readily anticipated in writing an article bordering on legal suggestions from my, non-lawyer, developer viewpoint.

                      I have read a few corporate CLAs, copyright assignments, the DCO, and a handful of licenses in full before arriving at these conclusions and feeling comfortable enough to talk about my viewpoint publicly - and while I did know I would certainly be mixing up some terminology, I guess I didn’t think my viewpoint could be that far off.

                      I guess it’s too late to redact the article in full, so I’ll include a link to your comment at the top of it instead if that’s okay with you.

                      1. 3

                        Chin up! And never be discouraged from reading up on legal topics, especially those affecting your rights.

                        Know this: There isn’t any particular detail in software licensing you couldn’t grok in a couple days. The question is how many days you have and want to spend. Not whether you’re “smart enough”. Not whether you can gaze through the shroud over any legal mystery.

                        I have one advantage over you: a big head start. You will have to make a whole lot more mistakes, and a whole lot more serious ones, at an impressive pace, to risk catching up to me. I have regretted my share of blog posts, that’s for sure, not to mention social comments. In a really direct way, those missteps are my qualifications, insofar as I learned and recovered from them. That shouldn’t be such an awkward thing for so-called “experts” to say.

                        You should also be aware that there are other so-called “experts” out there—with and without the brain damage of legal training—who agree with me on terms, law, and practical details, but strongly support transitioning away from CLAs and toward the DCO in particular. As an example, I’d guess Bradley Kuhn might lean that way. But my (numerous) disagreements with Bradley mostly reside at a higher level, where the question is what outcome we want, or how we guess the legal fundamentals will play out in practice, now or twenty years from now. Not, I think, whether the DCO belongs in the category of “license agreements”.

                        1. 2

                          I may not be a Kyle E. Mitchell lawyer, or a lawyer at all, but I don’t think your post was full of misinformation. The reason is that you used the terms as developers generally understand them, and Mr. Mitchell is using them as lawyers understand them. (See my reply to his comment above.)

                          Since your audience is developers, I think it’s important to speak using the terms the way they will understand them. Linking to Mr. Mitchell’s comment is still good, however, because it will help developers see how a lawyer sees the issue.

                        2. 1

                          I’m not sure how the blog post is a rebuttal of the author’s points.

                          In the end, developers’ perceptions come down to:

                          • CLAs usually require copyright assignment¹
                          • DCOs explicitly don’t

                          Personally, I consider copyright assignment to a for-profit organization to always be an abusive practice, and I simply won’t contribute if a CLA is in place.

                          I have a policy to allow assignments to non-profit organizations under existing CLAs, but will not accept any new ones for those organizations either.


                          ¹ something you incorrectly dismiss with “assignment of some or all copyright in the contributions to the project steward (uncommon)” in your article

                          1. 3

                            CLAs usually require copyright assignment

                            No, they do not. Read the terms.

                            Apache’s CLAs—the most popular model for other orgs’ CLA terms—do not assign copyright. Google’s CLAs do not. Facebook’s CLAs do not. Microsoft’s CLA does not. Oracle’s CLA does, and some projects like Clojure use versions of the Oracle CLA. But the result is joint ownership, not sole ownership by Oracle.

                            The DCO doesn’t assign copyright, and it doesn’t license it, either. The DCO is therefore not a license agreement, and therefore not any contributor license agreement (CLA), either.

                        1. 6

                          I don’t get it, isn’t this just a simple, reusable CLA? Also, I actually think there’s value in allowing the project to switch licenses, as long as the new license is still philosophically compatible with the old license. For example, what if a court invalidates an important part of a popular license? Presumably projects would want to switch, or “upgrade” to a newer version of the license.

                          1. 6

                            No, it’s not. A CLA is an arbitrary agreement, often reassigning ownership of your change to the company entirely. A DCO just says “I agree it’s under the current license.”

                            If you want to be able to switch licenses, that’s fine, but it should be part of your license itself IMO (like GPLv3) and not in a contributor agreement.

                            1. 3

                              Yeah, but if your CLA just said “I agree it’s under the current license.”, wouldn’t it be the same thing? I don’t know, maybe it’s just that “Developer Certificate of Origin” seems like a weird name for what this is doing. It’s not a certificate (in the usual sense), and it has nothing to do with the origin.

                              1. 2

                                It is certifying that the change came from a good origin (you, or someone who can approve of its release).

                                The problem with CLAs is that a CLA could say anything, so when you want to submit a small change you’re going to need legal review of the CLA you are signing (or just go in blind).

                                1. 2

                                  So the DCO is just a particular implementation of a CLA that is known in advance (in the same way that the GPL v3 is a particular implementation of a software license, which, generically speaking, could say literally anything). A CLA could say anything, including exactly what a DCO says. A DCO literally binds the contributor to allowing the code to be released under the project’s license. It also certifies that the contributor CAN enter into that agreement. These are two common functions of a CLA.

                                  1. 2

                                    So yeah, from a legal, English, and purely technical sense you’re 100% correct, I agree, “a DCO is just a CLA” - and I’ll readily admit that from that viewpoint my statements were wrong.

                                    But I would argue that in a world where 95%+ of CLAs are arbitrary legal agreements written up by lawyers at companies, and where I’d bet almost every developer you asked “What is a CLA?” would respond with “some arbitrary legal agreement that is not open source”, it makes sense to try and draw a distinction between “a CLA” in the modern predominant usage of the term and “the DCO”. That is the viewpoint I wrote this from, and responded to you from.

                                    I do understand in retrospect, though, that if you do have a very nuanced view of software law / licensing that the statement the title is making would at best be a confusing one, at worst be an outlandish one (because “a DCO is just a CLA” from a purely technical view, but not from a practical lay developer’s view.) In that case, please accept my apologies and substitute “CLAs” with “95%+ of CLAs” when reading into my writing.

                                    1. 1

                                      I appreciate you humoring me, I honestly wasn’t trying to be obnoxious. It makes more sense to me now! Thank you for the explanation!

                                      1. 4

                                        The conclusion above is incorrect. The DCO is not a license agreement, and it was not designed to address the same legal problems as CLAs. Plugging a CLA-shaped hole in your process with a DCO is fine if and only if you can live with the gaps. Which probably means you’ve got a copyleft project with no foundation or company home, using a kernel-style, e-mail-based workflow.

                                        Read the DCO. https://developercertificate.org/. Read a short license like MIT or BSD. Look for overlap. You won’t find much.

                                        Then read Apache’s Individual CLA, a de facto “standard” applied to projects in many other contexts. Compare again. Do the homework.

                                        1. 2

                                          I never said the DCO was a license agreement (at least a Software License Agreement, like MIT or BSD), I said it was a particular case of a CLA, a Contributor License Agreement, because it binds the contributor in certain ways (they agree to have their code licensed in a certain way). I think you misunderstood me.

                                          1. 3

                                            I referred to slimsag’s statement that “a DCO is just a CLA”. It is not.

                                            CLA stands for Contributor License Agreement.

                                            1. 1

                                              Yes, it is, otherwise it wouldn’t be useful. A CLA binds the contributor in certain ways to protect the project. So does the DCO. What, exactly, causes the DCO to fall outside the set of possible CLAs?

                                              1. 5

                                                You’re trying to work backward from observed behavior, a lot of which is based on misconceptions repeated in slimsag’s blog post, instead of working forward from terms. Don’t take my word for it. Read the DCO. Read, say, Apache’s individual contributor license agreement. Compare.

                                                As for where the DCO belongs, see my comment above. The DCO is fit for purpose in a pretty small number of projects that, not by chance, resemble Linux in licensing situation and workflow. The DCO was written for the kernel. For why, see https://en.wikipedia.org/wiki/SCO%E2%80%93Linux_disputes, or dig up the kernel mailing list messages on DCO 1.0 from back in ’04 or whenever it was.

                                                For more details, see https://writing.kemitchell.com/2018/01/06/CLAs-Are-Not-a-Sham.html

                                                1. 2

                                                  I’ve read them both. They seem like different implementations of the same concept. They both attest that the contributor has done or will do various things, and that the contributor possesses certain legal rights. Kind of like how the GPL and MIT licenses are quite different, but they’re both obviously software licenses. In fact, clause (a) of the DCO and the first sentence of clause 5 of the Apache ICLA are basically identical. Again, the DCO is just a particular example of a CLA.

                                                  1. 6

                                                    Where is the present-tense language granting a license in the DCO? The condition requiring notice of license terms be given with copies? Any copyleft rule? Where’s the warranty disclaimer?

                                                    You can’t just compare general impressions of legal texts. These are functional documents. What “features” does the DCO provide? How about the Apache ICLA? Compare those feature sets. What’s the diff?

                                                    CLAs almost always address the question of whether a contributor has the right to license code. The DCO tries to do that, too. But CLAs go beyond the question of ability to license and grant licenses. Hence “license agreement”.

                                                    In other words: CLAs don’t just say “I can license this”. They say “I can license this and I do license it, under these terms”. CLAs document that act of licensing, usually by having the contributor electronically sign something, yielding a record the project steward can store long term.

                                                    The DCO, or more specifically the Signed-Off-By convention, was also designed to create documentation. But that documentation covers where code came from, through all the various patch reviewers and Linus lieutenants that handle a kernel patch. Not the act of licensing. There’s only one set of terms kernel code can be licensed under: GPL, and specifically GPLv2, or some “GPLv2-compatible” permissive license.

                                                    As for actually granting that license, it’s all implied. The words “submit” and “sign-off” are nowhere defined or explained in the DCO. They have particular, specific meaning in the context of kernel development. But most open source projects aren’t developed like Linux or Git. Most hackers have never read or even heard of --signoff, and the git-commit man page is not statutory law.

                                                    The problems SCO made for Linus weren’t questions about whether contributors chose GPLv2. They have to choose GPLv2. The problems were SCO’s claims that particular Linux source code, much of it likely brought in by Linus’ own patches, came from the UNIX they bought, rather than Linus’ original work or other open-licensed releases. The kernel devs didn’t have evidence on hand to document where a lot of Linux code came from. SCO filled some of those gaps with claims the code was theirs.

                                                    1. 1

                                                      This is interesting context, and I enjoyed reading it. I still think it’s accurate to say that the DCO is a particular kind of CLA.

                                                      1. 2

                                                        Suit yourself. Call it a “CLA”, but it is not a License Agreement.

                          1. 3

                            Unbeknownst to you, that person actually works at Sony. They didn’t realize this, but their contributions in their spare time are not owned by them and Sony’s lawyers now want to sue you for using their proprietary, patented technology.

                            If you didn’t get a written statement from that person saying they owned those contributions, and actually intended to release them under your open source license terms - then you could indeed be found in the wrong!

                            Legally speaking, what does having them sign a CLA do? If Sony has that line in their authors’ employment contract, then it seems like they would own the code regardless. Is the idea something like: the CLA provides some sort of protection because you put up a step asking for authorization so you can say you’re not intentionally using code that’s under others’ copyright?

                            1. 2

                              Is the idea something like: the CLA provides some sort of protection because you put up a step asking for authorization so you can say you’re not intentionally using code that’s under others’ copyright?

                              Yes, exactly that. In one scenario you may be liable for damages (‘profits lost due to using their unlicensed IP’) while in another you could argue to the court that you did due diligence and it’s the contributor’s fault, not yours.

                              1. 3

                                Lack of intent and due diligence are not defenses to copyright infringement. That’s a common misconception you’ll see on YouTube, in disclaimers people try to use when posting copyrighted content, but will not see in the Copyright Act.

                                On whether people actually have the right to license contributions: please read a few CLAs. See also Apache’s licenses page, which has CLA forms both for companies and for individuals.

                                1. 3

                                  Lack of intent and due diligence are not defenses to copyright infringement. That’s a common misconception you’ll see on YouTube, in disclaimers people try to use when posting copyrighted content, but will not see in the Copyright Act.

                                  I do think these disclaimers have value though: namely, I find them extremely funny because this is not how it works at all.

                                  In general loads of people seem confused about this. Back when I edited on RationalWiki (many years ago) I tried to clean up the copyright mess of their images a bit, but this was met with significant resistance from some people. They would take some webcomic, slap that on a page, and claim “fair use” as “parody” or “criticism”. But … you’re not actually criticising or parodying the copyrighted work, it’s just used in the course of parodying something else. That’s not how fair use works at all. This mostly fell on deaf ears though 🤷

                                  That entire site is a copyright lawsuit waiting to happen. I’m surprised it hasn’t already happened given that lots of people would just love to see the site shut down.

                                  1. 5

                                    Yeah, I used to kind of “collect” badly misconceived copyright disclaimers I’d run into online, like some folks stockpile memes. And it was fun, once or twice, to trade best-ofs with fellow copyright lawyers. But I wouldn’t have any fun hyuck-hyucking about it these days, especially in public. In the end, it’s at least partly a failure of the copyright bar that there’s so much bad information out there. It’s got to the point where we’re flooded with bunk second-order information, like those disclaimers, based on more fundamental misconceptions.

                                    Basically: We’ve really flubbed public education. We didn’t scale up good copyright information with application and enforcement of copyright online.

                                    On the flip side, there’s what the law allows and there’s what copyright holders allow. Plenty of copying, sharing, and even “remixing” we see online probably wouldn’t hold up in court, or would cost so much to defend that nobody reasonable would try. But it also falls beneath cost-benefit or concern for the copyright holders. Some of them even welcome it, so long as they don’t have to formally license it. I know some Star Wars fans who’ve spelled out the rules of the road for fan media and the like, basically a stricter noncommercial, with a few nuances. Those rules are nowhere written, but broadly understood.

                                    That’s a very normal situation in law. Theory and reality combine where they conflict. But the mismatch also creates opportunities to prey on expectations developed from experience, rather than formal study. So we have photographers running around suing people for reusing their stuff on social media, in all honesty just seeking fair comp and creative control, often against people convinced they’ve done nothing wrong. But we also have folks enforcing copyrights—or even acquiring copyrights to enforce—where the only real financial value in the IP is the power to shake others down for small dollars, even against folks who might win fair use if they could afford to fight. And every shade of grey between.

                                    I often suspect it’s the really frustrating experiences of folks on the receiving end of claims under rules that don’t match lived experience, more so than the rules themselves, that make people so cynical about the legal system. I don’t say that to absolve lawyers. There’s lawyers on every side of this.

                              2. 1

                                It shifts the burden to the developer that contributes the code. My understanding is that in the case of a copyright lawsuit, the contributor is now the only person liable to pay for damages.

                                That’s why, as a developer, you should never sign a CLA unless you are ready to pay a lawyer who can advise you on exactly what it means.