1. 73
  1. 21

    It’s nice to see a software project looking at some bit of manual work they periodically do by rote, asking themselves whether it actually makes sense to do this work, and concluding that it actually doesn’t and ceasing to do that work.

    1. 6

      This is the Forth philosophy applied to life.

    2. 19

      I find it so odd that there is still code being written where they put a copyright clause at the beginning of each source code file. It would drive me mad seeing the same thing every time I open a file to edit.

      1. 22

        I find it so odd that there is still code being written where they put a copyright clause at the beginning of each source code file.

        It’s important. Individual files are often copied between projects, without a per-file copyright notice it is incredibly difficult to comply with the license.

        1. 10

          Fair, but parts of files are also often copied. Why is the right granularity file-level rather than, say, function-level? Or module-level?

          1. 8

            Someone copying part of a file has access to the source file and so can easily copy the relevant copyright from it. Someone who has access to a file is not guaranteed to have access to the root of the repo.

            More importantly, someone copying part of a file already has made work for themselves if they want to sync with upstream, but if you copy an entire file then you can trivially update it by just copying the new one as an atomic unit. If you have to separately track copyright, that’s harder. For example, LLVM recently relicensed from UIUC to Apache 2 + GPL exemption. LLVM has per-file copyrights, so anyone pulling a file from LLVM will see the license change and can check if they need to do anything else to comply. If the license is only in the root of the repo, this is easy to miss.

            Finally, a lot of projects pull a few files under a different license. For example, we’re about to release something under an MIT license, but a couple of files come from FreeBSD and so are 2-clause BSDL (which is more or less equivalent but very subtly different). If you want to know the license of a given file, it’s a lot easier to look at the first line than to go and read a detailed third-party notices file and see if it’s one of those or whether it’s from the LICENSE file (some projects that are less concerned with audit trails just put all of the licenses in LICENSE and say ‘some parts under this other license’. Working out which files are under which license is a pain).

            1. 2

              parts of files are also often copied

              One argument here is that at the function-level, for sufficiently short functions, the code might not actually fall under copyright protection.

              1. 1

                MPL 2.0 allows you to modify and open source the single file to comply with the license rather than the entire project.

              2. 6

                If it has to be there, why not put it at the bottom of the file so the top can have something more important?

                1. 3

                  Mostly convention, some tooling. These days, the per-file license can be a two-line SPDX identifier and list of copyright holders (which can be vague: for Verona, ours says ‘Microsoft and Project Verona contributors’, which is sufficient that someone who cares can look at the git history in the canonical repo and get a complete list). That doesn’t take much space and makes auditing tools simpler.
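To make the "two-line header" concrete, here is a minimal sketch of what such a header might look like and how a simple auditing script could read it. The sample file contents, the regular expressions, and the `read_header` helper are all illustrative assumptions, not taken from any real project's tooling.

```python
import re

# Hypothetical source file with a two-line header. The holder wording follows
# the example quoted in the comment above; the rest is made up.
EXAMPLE_SOURCE = """\
// Copyright Microsoft and Project Verona contributors.
// SPDX-License-Identifier: MIT
int main() { return 0; }
"""

# Matches the SPDX tag and captures the license expression that follows it,
# tolerating a closing "*/" for block-comment styles.
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/)?\s*$")
# Matches a copyright line and captures whatever follows the word "Copyright".
COPYRIGHT_RE = re.compile(r"Copyright\s+(?:\(c\)\s*|©\s*)?(.+?)\s*$", re.IGNORECASE)


def read_header(text, max_lines=5):
    """Return (license_expression, holders) found in the first few lines."""
    license_expr = None
    holders = []
    for line in text.splitlines()[:max_lines]:
        match = SPDX_RE.search(line)
        if match:
            license_expr = match.group(1)
            continue
        match = COPYRIGHT_RE.search(line)
        if match:
            holders.append(match.group(1))
    return license_expr, holders


license_expr, holders = read_header(EXAMPLE_SOURCE)
```

Because the header sits in the first few lines, the script never has to look at a repo-level LICENSE or third-party notices file to answer "what license is this file under?".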

                2. 4

                  I don’t know if it’s still common practice but adding a file named LICENCE to the root directory of a set of files under a particular licence is/was common.

                  This seems like it would be easy to find and therefore comply with. When you say it’s incredibly difficult without a per-file copyright notice (assuming you mean licence notice), is that without the top-level LICENCE file too?

                  1. 5

                    The issue is with copying individual files. Imagine I want to copy, say, the printf implementation from FreeBSD libc and it didn’t have a per-file copyright. I now need to also grab the LICENSE file from the root of the repo. I now either add it to my own LICENSE file, or to a separate third-party-notices file. In either case, I need to specify in that file that this file, unlike the others in my repo, is 2-clause BSDL. Someone else who comes along sees the printf implementation and sees that my project is MIT licensed, so they copy it. Did they notice the thing in my third-party-notices file that says that this file is actually 2-clause BSDL? If not, they don’t copy it and are now (technically, at least) in violation of the license.

                    Now consider that example in reality, where the file has the license in it. The only thing that they need to do to comply with the license is retain the license and copyright in source form and so they can simply grab it.

                    Now imagine that FreeBSD relicenses their printf implementation (this actually happened a few years ago when UCB allowed everyone to drop the advertising clause). I pull in the new one and, with per-file copyrights, the change of license is immediately visible in the diff for the file. Someone copying the file from me will see the change. With a separate license file, I might miss it, and even if I don’t, the folks downstream from me can easily miss it, especially if I don’t see it immediately and have the update of printf.c and LICENSE in separate commits.

                    It’s good practice to do this for every file because it’s hard to work out in advance which files will be copied to other projects.

                    1. 1

                      The issue is with copying individual files. Imagine I want to copy, say, the printf implementation from FreeBSD libc and it didn’t have a per-file copyright. I now need to also grab the LICENSE file from the root of the repo. I now either add it to my own LICENSE file, or to a separate third-party-notices file. In either case, I need to specify in that file that this file, unlike the others in my repo, is 2-clause BSDL. Someone else who comes along sees the printf implementation and sees that my project is MIT licensed, so they copy it.

                      A project should not be labeled as MIT licensed if it is in fact mixed-licensed.

                      It makes sense to use per-file notices for such a project, rather than trying to point to individual files from a LICENSE or third-party-notices file. But for projects that are all one license, a single LICENSE file is easier and reduces clutter, and nothing is stopping someone who wants to use a file from that repo from adding the license at the top if their project does it that way.

                      1. 2

                        But for projects that are all one license, a single LICENSE file is easier and reduces clutter, and nothing is stopping someone who wants to use a file from that repo from adding the license at the top if their project does it that way.

                        See my other comments in this thread. If I need to add a license for your file, my copy of your file is no longer simply a copy of your file. Now I have made changes and I need to track those changes. You make more work for me if I pull in a new version because now it isn’t a straight copy, I have to do a merge, and I need to go and check that you haven’t relicensed the project by reading a completely different file. In contrast, with per-file copyrights, I just grab the file and look at the diff. If the copyright header has changed, I know the license has changed and I catch that when I check for any other incompatible changes. If your copyright header is an SPDX identifier (which it generally is these days) then I don’t even need to look, it’s an automated check.

                        I consider doing a trivial amount of work if it avoids other people having to do a lot of work to be basic politeness. You may disagree.
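The "automated check" mentioned above could be as small as comparing the SPDX identifier in the vendored copy of a file against the one in the freshly pulled copy. The following is only a sketch under that assumption; `spdx_id`, `license_changed`, and the sample file contents are hypothetical names and data, not any project's actual tooling.

```python
import re

# Captures the license expression after the SPDX tag on a single line,
# tolerating a closing "*/" for block-comment styles.
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/)?\s*$")


def spdx_id(text, max_lines=10):
    """Return the first SPDX license expression near the top of the file, or None."""
    for line in text.splitlines()[:max_lines]:
        match = SPDX_RE.search(line)
        if match:
            return match.group(1)
    return None


def license_changed(old_text, new_text):
    """True if the per-file license identifier differs between two copies."""
    return spdx_id(old_text) != spdx_id(new_text)


# Hypothetical before/after copies of a vendored file.
old_copy = "// SPDX-License-Identifier: BSD-3-Clause\nint f(void);\n"
new_copy = "// SPDX-License-Identifier: BSD-2-Clause\nint f(void);\n"
unchanged_copy = "// SPDX-License-Identifier: BSD-3-Clause\nint f(void) { return 0; }\n"

relicensed = license_changed(old_copy, new_copy)
same_license = not license_changed(old_copy, unchanged_copy)
```

A check like this only works when the license travels with the file; with a repo-level LICENSE the same script would have to know where each vendored file came from and fetch that other file too.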

                        1. 1

                          So you’re talking about repeatedly pulling a single file from another repo and merging it into yours as it is updated. That seems like a pretty uncommon practice, but maybe I’m wrong. If the author of a project wants to support that use case, then the file they expect to be used in other projects becomes a mini-project unto itself and it would make sense to add a license header, short of giving it its own repo.

                          I consider doing a trivial amount of work if it avoids other people having to do a lot of work to be basic politeness. You may disagree.

                          I would weigh the work of monitoring changes to a LICENSE file against the need to scroll past the header every time someone opens any file. The latter is a small amount of work, but it is a task done more often and by more people than the task of merging a single file from an external repo. Then there is the task of adding the LICENSE header whenever the author creates a new file.

                          1. 1

                            Copying individual files (or small groups of files) is pretty common for any popular project. The main reason that it happens is that consumers have a different notion of a reusable unit to the creator of the project. It also tends to happen after a project has grown, rather than as something that would be useful at the point files are originally added.

                            A modern SPDX copyright header is one or two lines long. If scrolling past that is a problem then you might want to consider larger editor windows.

                            1. 1

                              Copying individual files (or small groups of files) is pretty common for any popular project. The main reason that it happens is that consumers have a different notion of a reusable unit to the creator of the project. It also tends to happen after a project has grown, rather than as something that would be useful at the point files are originally added.

                              Can you think of an illustrative example?

                              A modern SPDX copyright header is one or two lines long. If scrolling past that is a problem then you might want to consider larger editor windows.

                              That seems handy but I am worried about whether those headers are as legally sound as a full copy of the license. Who certifies the binding between the license identifier and the license text?

                3. 3

                  Reading about countless GPL compliance failures and how rarely copyleft theft is held accountable makes me want to put the exact minimal amount of effort into licensing declarations. I’ll drop in a LICENSE file, and I highly doubt any other action would protect my work more than that.

                  I wonder if adding the clause in each file ever made a difference historically speaking.

                  1. 1

                    We were instructed to do it, and it doesn’t bother me when I’m in the IDE, only when I’m using cat on the CLI for instance.

                  2. 9

                    I never really understood the annual bumping of copyright years. Copyright terms are fixed from when a work is copyrighted, but to become a new version it needs to have been modified in a way that counts as creative; an automatic update of the year in the header doesn’t count. Bumping the year when you make a substantive change makes sense.

                    1. 2

                      That can be difficult to remember when you’re hacking on some files that haven’t been touched in a while: “did we touch it earlier this year?” is not a question that springs to mind, and typically people just skip the comments at the top without reading them when hacking on some code (e.g. “jump to definition”).

                      1. 5

                        It’s not hard to do a check before you commit to see whether any of the files you’re committing has a copyright year less than your current one. It’s probably fairly easy to automate into a pre-commit hook if that’s the thing that you actually want to check. It’s also something that I’ve had caught in code review. If your project tracks the copyright holders in the per-file copyright, you need to check that anyway (have I [or someone else at my employer] modified this file before?).
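As a rough illustration of the check described above, a hook script could look for a copyright line near the top of each file being committed and warn when its latest year is older than the current one. This is a sketch under assumptions: the function names, the ten-line search window, and the idea that the hook receives file paths as arguments are all hypothetical choices, not a description of any existing hook.

```python
import re
from datetime import date

# Four-digit years from 1900 to 2099, e.g. both ends of "1998-2013".
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def latest_copyright_year(text, max_lines=10):
    """Return the newest year on any copyright line near the top, or None."""
    years = []
    for line in text.splitlines()[:max_lines]:
        if re.search(r"copyright", line, re.IGNORECASE):
            years.extend(int(year) for year in YEAR_RE.findall(line))
    return max(years) if years else None


def is_stale(text, current_year):
    """True if the file has a copyright year and it is older than current_year."""
    latest = latest_copyright_year(text)
    return latest is not None and latest < current_year


def main(paths):
    """Print a warning for each stale file; return 1 if any were found."""
    this_year = date.today().year
    stale = []
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            if is_stale(handle.read(), this_year):
                stale.append(path)
    for path in stale:
        print(f"{path}: copyright year is older than {this_year}")
    return 1 if stale else 0
```

A hook would call `sys.exit(main(sys.argv[1:]))` with the paths being committed. Note that this reads the working-tree copy rather than the staged content, and it cannot tell whether a change is substantive, so it is better treated as a reminder than as a hard failure.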

                        1. 4

                          I think it is only relevant if anything substantial changes in the file, not when you’re doing some small modifications. That significantly reduces the number of times you have to check the copyright year, and if it is forgotten it can still be easily changed afterwards.

                      2. 6

                        I wonder if this is legal to do. The curl license says

                        Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

                        Which seems to imply that modifying the copyright notice, such as to remove years, would be a violation.

                        1. 8

                          In that case modifying the notice to add new years was just as wrong, no?

                          1. 2

                            Yes, unless you are the copyright owner.

                            1. 5

                              Would they even be able to get statutory damages from a court case involving that? It’s hard to imagine a court not laughing out a claim that changing:

                              Copyright (c) 1998-2013 Jane Doe
                              

                              to

                              Copyright (c) 1998-2014 Jane Doe, Alan Smithee
                              

                              while preserving the rest of the license, author names, and abiding by the rest of the terms, constitutes copyright infringement. And similarly for changing it to:

                              Copyright (c) Jane Doe
                              
                          2. 7

                            Which seems to imply that modifying the copyright notice, such as to remove years, would be a violation.

                            You may be right that this is a technical violation of the license. However, it’s not the kind of violation that invalidates the license, which means there’s no damage because all the permitted uses of the copyrighted material are still permitted. Which means nobody has the right to sue over it. A violation without a remedy is practically equivalent to no violation at all.

                            The law is unlike programming in this way: judges, lawyers etc don’t blindly follow the precise text; they look at the meaning of the text and decide how it applies to the present situation. They don’t act like computers, they act like people. (There’s obviously a lot of complexity here, and some cases where it may seem like judges do act in that way, but those situations are almost always about some judge choosing to act that way. If you’re very interested, I highly recommend Christian Turner’s Modern American Legal Theory course, available freely in podcast form).

                            Source: I’m a former intellectual property lawyer (I practiced in Australia).

                            1. 6

                              Copyright law at least in the US (would assume the same worldwide) affords the owner the exclusive right to create derivative works of their own work.

                              1. 7

                                Yes, but Daniel is not the only person who has contributed to curl.

                              2. 3

                                The license was left unchanged, though.

                                1. 3

                                  provided that the above copyright notice … appear[s] in all copies.

                                  You have to preserve the copyright notice in order to take advantage of the permissions granted by the license.

                                  1. 1

                                    The copyright notice is preserved, in COPYING or the licenses directory.

                                    1. 2

                                      Not with the years.

                              3. 3

                                As another aspect: Now that we’re talking about unnecessary bits in licenses, the copyright symbol ‘©’ is also unnecessary. Thus instead of

                                Copyright © 2021-2023 John Doe
                                

                                it is also completely sufficient to write

                                Copyright 2021-2023 John Doe
                                
                                1. 2

                                  I personally like the years because it gives a contribution history of each person involved in the project. The year is only added if a person contributes something in the year. Indeed, git also offers that (by definition), but the license is preserved even in releases.

                                  It feels like this article more shows how short-lived most projects have become. The aforementioned advantage only weighs in if the project is older than 5-10 years.

                                  1. 2

                                    Personally I find listing the authors in copyright mention very awkward. The original person who created the file puts themselves there, but what are the rules for other people to insert themselves? A large contribution, sure, but how do we track people who make a lot of small contributions over time? (Do we decide at some point to add them?) What will people think if we don’t add them, that their contribution has less merit? Very awkward.

                                    In practice, in the project I work on:

                                    • The copyright header has just the name of the person who created the file in the first place, it is never updated. It is a bit silly, I guess we could remove it.
                                    • We maintain a Changes file that explicitly lists, for each Changes item, who participated in the change (authors, reviewers, reporter, etc.).
                                    1. 1

                                      There is a so-called threshold of originality and it depends on the jurisdiction. The special case in programming is that even very minimal contributions can be the result of extensive work (e.g. a one-line-bugfix as the result of many hours of debugging), so I don’t see this as a legal question and more of one of common sense.

                                      To give an example, I wouldn’t add people to a license who merely submitted style changes. Anything that goes deeper, however, usually goes with a person becoming involved in a project with multiple commits.