1. 41

  2. 29

    This could’ve been an excellent post about engineering in the large, given GitHub’s scale and age. What did it take to translate the legal requirements into a software project, to build a list of every cookie that every line of code has set in the last 12 years, to match them to business justifications, to migrate away from or combine all the unjustified ones, and to manage multipart deploys as they remove dependencies on specific cookies? Instead the entire post is patting themselves on the back for deleting one div.

    1. 5

      Concur. When I read “So, we have removed all non-essential cookies from GitHub” my immediate first thought is “What is essential?” followed shortly by “Why is that essential?” Is it because you made a design decision that required you to use that package with a tracking cookie? Is it because you wanted an easy answer and setting a cookie was the easiest one?

      For example, GitHub right now sets a cookie “logged_in=yes” for my account. I can delete that cookie and the page loads just fine, resetting the cookie.

      GitHub also sets a cookie “dotcom_user” that is my username. But it also keeps a device ID and two separate session IDs. Do you really need my username stored in that cookie?

      This is them calling “Mission Accomplished” on the aircraft carrier of privacy to feel good, nothing more.

      1. 9

        logged_in=yes

        Good god, the horror! Someone call the police! 😒

        GitHub setting cookies that only they themselves can read isn’t a problem; they got that information anyway. The problem is cookies by third-party services, and those have, as I understand it, been removed.

        1. 6

          I hope it was clear from the rest of my comment that I am not trying to say this cookie is an abomination to mankind. I am trying to point to the issue of “essential” not being defined. To be more precise, GitHub is saying this falls within:

          Strictly necessary cookies — These cookies are essential for you to browse the website and use its features, such as accessing secure areas of the site. Cookies that allow web shops to hold your items in your cart while you are shopping online are an example of strictly necessary cookies. These cookies will generally be first-party session cookies. While it is not required to obtain consent for these cookies, what they do and why they are necessary should be explained to the user.

          My question then, as @pushcx noted, is how GitHub’s engineers/developers/policy folks decided what was “essential”. That would have been a good read. But if I can delete an arbitrary cookie and the site will still work (and immediately reset it), is it actually essential? It seems like the answer was to remove third-party cookies and then declare everything left essential, instead of digging in to understand why each cookie was set and determining whether it held value.

          Do I care that GitHub sets cookies? Not in the least. Does GitHub setting a boolean flag and storing it for a year (logged_in=yes is a persistent year-long cookie, not a session cookie) affect my privacy directly? Not likely. Does GitHub setting a string value (dotcom_user=[my username]) affect my privacy? Well, it could, depending on which computer I log in to and whether they actually destroy that cookie, since it is also a persistent cookie stored for a year and not a session cookie. Does GitHub setting a value called device_id and storing it as a persistent cookie for a year affect my privacy? I mean, it easily could, depending on what they do to aggregate that information and what computer I use to log in.

          But ultimately, this comes down to the assertion that they’ve removed everything but the essential cookies, and the glaringly obvious question “How did you decide what is essential?” It wasn’t addressed. At least the logged_in example above would seem to identify a persistent cookie, stored for a year, that is not “essential for you to browse the website and use its features, such as accessing secure areas of the site”, since they also set two other session cookies that would tie to my logged-in session.

          1. 3

            My previous comment was probably a bit more snarky than it should have been – apologies.

            The GDPR only applies to personally identifiable information, and a logged_in=yes cookie isn’t, so GDPR doesn’t really apply. You still have the ePrivacy directive (“cookie law”), under which, if you follow it to the letter, you may need to ask for consent before setting that logged_in=yes cookie, as it may not be “strictly” necessary (depending on your definition of “strictly”). However, that’s not really in the spirit of that law, and I think worrying about it is rather … pointless.

            I think far too many people are somewhat overly occupied with the letter of the law on these issues, rather than the spirit of it.

            I agree the depth of the article could have been a lot better – it’s pretty shallow as-is.

            1. 1

              Thank you for the clarification, all is forgiven! I appreciate your position much better now. I would definitely agree that logged_in=yes (or no) is not personally identifiable and not within the spirit of the law. I went too far in my comparisons of the cookies and should have stuck to things like the username.

              Thank you for clarifying GDPR v ePrivacy directive for me.

            2. 3

              I just tried opening GitHub in a private window to see what cookies it sets. There are four:

              • _gh_sess, set to what looks like a random identifier. This is a session cookie.
              • tz, set to my time zone. This is a session cookie.
              • _octo, set to GH1.1.2014279703.1608313886. This is persistent and seems to be a unique identifier (the numbers change if I clear it and start again).
              • logged_in, set to no.

              The tz and logged_in cookies are probably not technically essential (the tz one contains a copy of the time zone my browser sent in the HTTP request), but they don’t contain any personally identifiable information.

              The _gh_sess cookie is a generic ID that’s probably tied to a database entry for maintaining persistent state across navigation. Whether that’s fine or not depends on what they store in the database.

              The _octo cookie is interesting because it’s a unique identifier that tracks me across return visits. That’s on the borderline of okay, depending on what it’s used for. I’d really like to see a follow-on blog post that talks about these two cookies and what they’re used for.
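
              For anyone who wants to repeat the session-versus-persistent check without clicking through browser dev tools, here is a minimal sketch using Python’s standard http.cookies module. The function name classify_cookies and the SAMPLE_HEADERS values are made up for illustration (they are not GitHub’s real Set-Cookie headers); the only rule applied is that a cookie with an Expires or Max-Age attribute is persistent and anything else is a session cookie.

              ```python
              from http.cookies import SimpleCookie


              def classify_cookies(set_cookie_headers):
                  """Map each cookie name to 'session' or 'persistent'.

                  A cookie counts as persistent if its Set-Cookie header carries
                  an Expires or Max-Age attribute; otherwise the browser drops it
                  when the session ends.
                  """
                  result = {}
                  for header in set_cookie_headers:
                      jar = SimpleCookie()
                      jar.load(header)  # parse one Set-Cookie header value
                      for name, morsel in jar.items():
                          has_lifetime = bool(morsel["expires"] or morsel["max-age"])
                          result[name] = "persistent" if has_lifetime else "session"
                  return result


              # Hypothetical header values, loosely modelled on the cookies listed
              # above. Lifetimes and values are assumptions, not captured traffic.
              SAMPLE_HEADERS = [
                  "_gh_sess=abc123; Path=/; Secure; HttpOnly",
                  "tz=UTC; Path=/; Secure",
                  "_octo=GH1.1.0000000000.0000000000; Path=/; "
                  "Expires=Sat, 18 Dec 2021 17:51:26 GMT; Secure",
                  "logged_in=no; Path=/; Max-Age=31536000; Secure; HttpOnly",
              ]

              if __name__ == "__main__":
                  for name, kind in classify_cookies(SAMPLE_HEADERS).items():
                      print(f"{name}: {kind}")
              ```

              This only tells you how long a cookie lives, not what the server does with it, so it doesn’t answer the “is it essential?” question by itself.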

      2. 10

        This is a project announcement about complying with laws around privacy. So, should be tagged law, privacy, release. It contains no technical information about version control, so no vcs, and no information about browsers, so no browsers. web doesn’t really make sense either.

        1. 4

          This post is pretty low on (technical) content, but I think it’s worthwhile for what it signals in the bigger scheme of things: privacy matters, and having third parties snooping on a user on your site is undesirable. Combine this with Facebook’s flailing around to pretend its user spying is a good thing somehow(?) and non-creepy tech gift-giving guides (and I probably forgot some more), and to me it sounds like times are finally changing.

          1. 1

            Great, now do ICE